Identify Images Using IBM Watson’s Remote APIs

Some of the most interesting web services you can use with Xojo through remote API calls are related to Artificial Intelligence. Many different APIs are provided by the main players in the AI sector, and IBM’s Watson is among the best known.

I’m going to show you how to connect to IBM’s Watson services with REST APIs and how to use them with Xojo projects to identify images. This is just one example, of course, of the many ways to utilize Watson and AI in your Xojo apps.

IBM Watson

IBM’s Watson services have been online for several years and are continuously updated. They are paid services, but you can create a Lite user to test and evaluate them. The documentation is extensive and the variety of services is huge; it includes (among other things) image identification services.

Identify images with Watson

This service mainly offers two possibilities: identifying the faces in an image (returning the position, the probable gender and the probable age range of each subject) and classifying the image (i.e. recognizing possible tags that can be associated with it).

The key concept of this type of process is: Probability
The result is a probability, not a certainty. It’s up to your app to accept or reject the result and to decide how to use it. For example, you can set a threshold, automatically accept the results above it, and send the uncertain results to the user to confirm or reject.
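As a rough illustration of that idea, here is a minimal sketch of a helper that sorts a score into three outcomes. The function name TriageScore, its parameters and the two cut-off values are assumptions made up for this example; they are not part of Watson or of the classes described in this article.

```xojo
// Hypothetical helper: sorts a probability into "accept", "confirm" or "reject".
// The default cut-off values are assumptions chosen only for illustration.
Function TriageScore(score As Double, acceptAbove As Double = 0.8, rejectBelow As Double = 0.3) As Text
  If score >= acceptAbove Then
    // High enough to be accepted without asking anyone
    Return "accept"
  ElseIf score < rejectBelow Then
    // Too low to be worth showing
    Return "reject"
  Else
    // Uncertain: let the user confirm or reject it
    Return "confirm"
  End If
End Function
```

Where you place the two cut-offs depends on your app: the closer they are, the fewer results end up in front of the user.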

Watson’s Natural Language Classifier is a service that “applies cognitive computing techniques to return best matching predefined classes for short text inputs, such as a sentence or phrase”. It is really large (and the definitions are localized in many languages); moreover, you can create your own classifiers specific to your projects. The Lite user can create only one classifier at a time, which can be replaced but not updated. Another limit is the number and total size of the images that you can use to train your classifier.

Watson’s documentation is complete and easy to use. After creating the service in your account, you can start using it from the terminal or through the web page for the selected service.

How to use Watson with Xojo

The services use a REST API, so you’ll use Xojo.Net.HTTPSocket as the base class for the object that will consume these services. To learn more about using Xojo with a REST API before continuing, read “Cats Up: Using the HTTPSocket with the Cat REST API”, “PDF File Generation? There is an API for that” and the Web Services Video Playlist.

In this example, the base class is called “WatsonAPI”. This class will deal with the communication with the API (sending and initial interpretation of the answer) and will have some common features (such as zipping several images so they can be sent all at once). Moreover, since the interaction with the service is asynchronous, the class will have to manage the serialization of the different requests and take care to return the result to the correct call.

Next, you’ll define a delegate whose single argument is the reply (positive, negative or error), passed as a Xojo.Core.Dictionary:

WatsonReplyFunction(reply as Xojo.Core.Dictionary)

Now you’ll define two private properties: id and callback. You’ll use id to distinguish the various calls, and callback will be the function to call when there’s a result.

Private Property id as Text
Private Property callback as WatsonReplyFunction

The constructor (protected because the subclasses will call it) can be of this kind:

Protected Sub Constructor( cb as WatsonAPI.WatsonReplyFunction)
  // Register the callback
  callback=cb
  Super.Constructor
  register Me
End Sub

Register is a private shared function that assigns the identifier to the object and saves the pair id, object in a shared dictionary. There’s also a deRegister function that will delete the object whose identifier is passed.
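The article does not list these two methods, so the following is only a sketch of how they might look. The shared properties pending and lastId, the counter-based id scheme and the exact signatures are assumptions; the only things taken from the text are the names register and deRegister, the id property, and the fact that both are called with the object itself.

```xojo
// Assumed shared properties on WatsonAPI (names are hypothetical):
//   Private Shared Property pending As Xojo.Core.Dictionary
//   Private Shared Property lastId As Integer

Private Shared Sub register(w As WatsonAPI)
  // Create the shared dictionary on first use
  If pending Is Nil Then pending = New Xojo.Core.Dictionary

  // Assumed id scheme: a simple counter converted to Text
  lastId = lastId + 1
  w.id = lastId.ToText

  // Storing the object keeps a reference to it while the request is running
  pending.Value(w.id) = w
End Sub

Private Shared Sub deRegister(w As WatsonAPI)
  // Nothing to do if nothing was ever registered
  If pending Is Nil Then Return

  // Drop the reference once the reply (or the error) has arrived
  If pending.HasKey(w.id) Then pending.Remove(w.id)
End Sub
```

A likely benefit of this design is that the object created inside a shared method is not destroyed when that method returns, because the shared dictionary still refers to it until the reply comes back.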

As noted before, you’ll use a dictionary to represent the API result because it’s the format used for positive or negative API replies. Moreover, we can use the same format for network error replies. The dictionary will contain at least 3 values: success as boolean for a positive or negative reply, status as integer for the reply’s HTTP status, and result as text or dictionary to represent the current reply. Since it’s a dictionary, we can easily add more information as needed. Subclasses or consumer classes can transform this data structure into something more specific (class, record or whatever).

Now you will implement the major events:

Sub Error(err as RuntimeException) Handles Error
  //All replies will have the same structure
  //In the event of an error, the returned dictionary must have the same structure as a successful one
  Dim d As New Xojo.Core.Dictionary
  d.Value("success")=False
  d.Value("status")=0
  d.Value("result")=err.Reason
  //Delete the object from the dictionary
  deRegister(Me)
  //return the reply
  callback.Invoke(d)
End Sub


Sub PageReceived(URL as Text, HTTPStatus as Integer, Content as Xojo.Core.MemoryBlock) Handles PageReceived
  #Pragma Unused url

  //evaluate the reply
  Dim d As Xojo.Core.Dictionary
  Dim t As Text
  Try
    t=Xojo.Core.TextEncoding.UTF8.ConvertDataToText(Content)
  Catch
    t=""
  End Try
  If Not t.Empty Then
    Try
      d=Xojo.Data.ParseJSON(t)
    Catch
      d=Nil
    End Try
  End If
  Dim reply As New Xojo.Core.Dictionary
  reply.Value("success")=(HTTPStatus=200 And d<>Nil)
  reply.Value("status")=HTTPStatus
  If d=Nil Then
    reply.Value("result")=t
  Else
    reply.Value("result")=d
  End If
  //Delete the object from the dictionary
  deRegister(Me)
  //return the reply
  callback.Invoke(reply)
End Sub

Now you can create a subclass to classify the images: WatsonVisualRecognition, a subclass of WatsonAPI.

For image classification you can use one or more of your own classifiers, the default one, or even those in beta (currently Food and Explicit). So let’s define the constants related to the IBM-provided classifiers:

Public Const IBMDefault as Text =default
Public Const IBMExplicit as Text = explicit
Public Const IBMFood As Text = food

… and the ones related to the services:

//The current version of the service
Private Const version As Text = 2016-05-20

//The address actually depends on the user settings
Private Const kBaseUrl As Text = https://gateway.watsonplatform.net/visual-recognition/api/v3/

//The key to use the service
Private Const keyVision As Text = •••••••

You can now define a public shared method for analyzing an image on the web:

Public Shared Sub classifyImage(cb As WatsonAPI.WatsonReplyFunction, imageUrl As Text, threshold As Single=0.5, ParamArray classifiers As Text)
  //The method requires a method to be called to return the results,
  // the address of the image to be analyzed
  // the minimum value to be considered for recognition
  // a list of classifiers to use

  //Let's create the instance linking it to the callback
  Dim w As New WatsonVisualRecognition(cb)

  //threshold is the minimum acceptable value for classification
  //  must be between 0 and 1
  If threshold<0.0 Then threshold=0.0
  If threshold>1.0 Then threshold=1.0

  //For classifiers I can use both the ones provided and mine
  //none means the default one
  Dim useIBM As Boolean
  Dim usePersonal As Boolean
  Dim usedClassifiers() As Text
  For i As Integer=0 To classifiers.Ubound
    Select Case classifiers(i)
    Case IBMDefault
      useIBM=True
    Case IBMExplicit, IBMFood
      useIBM=True
      //These classifiers are in English only 
      w.RequestHeader("Accept-Language")="en"
    Else
      usePersonal=True
    End Select
    If usedClassifiers.IndexOf(classifiers(i))=-1 Then usedClassifiers.Append classifiers(i)
  Next
  Dim classifierIds As Text=Text.Join(usedClassifiers, ",")

  //Set the kind of the classifiers used
  Dim usedOwners() As Text
  If useIBM Then usedOwners.Append "IBM"
  If usePersonal Then usedOwners.Append "me"
  Dim owners As Text=Text.Join(usedOwners, ",")

  //Create the URL to be called
  Dim url As Text=kBaseUrl+"classify"

  //I create the list of arguments
  Dim args() As Text
  args.Append "api_key="+keyVision
  args.Append "version="+version
  args.Append "url="+imageUrl
  If Not owners.Empty Then args.Append "owners="+owners
  If Not classifierIds.Empty Then args.Append "classifier_ids="+classifierIds
  If threshold>0 Then args.Append "threshold="+threshold.ToText
  Dim parameters As Text=Text.Join(args, "&")

  w.Send "GET", url+If(parameters.Empty, "", "?"+parameters)
End Sub

Finally, you can request the classification of an image. For example, put a button in a Window and in the Action event put the following code:

WatsonVisualRecognition.classifyImage(WeakAddressOf analyzeResponse, "https://watson-developer-cloud.github.io/doc-tutorial-downloads/visual-recognition/fruitbowl.jpg", .3)

Here, analyzeResponse is the method that reads the results and translates the Dictionary into something useful, such as actions for a database, a textual list or a simple text message.
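As a starting point, analyzeResponse could look something like the sketch below, which simply writes each class and its score to the debug log. The keys success, status and result are the ones set by the WatsonAPI base class; the layout of the result itself (images, then classifiers, then classes, each with a class and a score) is an assumption about the service’s JSON reply, so check it against the reply you actually receive.

```xojo
Sub analyzeResponse(reply As Xojo.Core.Dictionary)
  // These keys are the ones filled in by the WatsonAPI base class
  Dim ok As Boolean = reply.Value("success")
  If Not ok Then
    Dim status As Integer = reply.Value("status")
    System.DebugLog("Watson request failed, status " + Str(status))
    Return
  End If

  // On success the base class stores the parsed JSON as a dictionary
  Dim result As Xojo.Core.Dictionary = reply.Value("result")

  // Assumed reply layout: images() -> classifiers() -> classes()
  If Not result.HasKey("images") Then Return
  Dim images() As Auto = result.Value("images")
  For Each imageItem As Auto In images
    Dim img As Xojo.Core.Dictionary = imageItem
    // Skip images that carry no classification
    If Not img.HasKey("classifiers") Then Continue
    Dim classifiers() As Auto = img.Value("classifiers")
    For Each classifierItem As Auto In classifiers
      Dim c As Xojo.Core.Dictionary = classifierItem
      Dim classes() As Auto = c.Value("classes")
      For Each classItem As Auto In classes
        Dim cls As Xojo.Core.Dictionary = classItem
        Dim name As Text = cls.Value("class")
        Dim score As Double = cls.Value("score")
        // Replace this line with whatever your app needs
        System.DebugLog(name + ": " + score.ToText)
      Next
    Next
  Next
End Sub
```

Because the call uses WeakAddressOf, this method has to be an instance method, for example on the Window that contains the button.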

Starting from this simple method, it is possible to create all the others by adding a few utility functions to the WatsonAPI base class.

As an example, it is possible to obtain the basic information about the people in an image and display it as an overlay, or to update a database of images with classifications so that you can later find the images of a specific type.

Create your own classifiers

Starting from this object, it is easy to create an application to generate, update, and verify a classifier that is specific to your solution. While you can do it with the provided web interface, a Xojo app can return the feedback in a much easier and more manageable way, and you can add some methods that automatically discard bad images and add new ones in order to refine your classifier.

Conclusions

Watson’s API services allow you to add a bit of artificial intelligence to your Xojo projects. The simplicity of the classes required to do this is clear proof of Xojo’s versatility.

It’s important to keep in mind that this service is not instantaneous. This is due in large part to network traffic, that is, the time needed to send the data and receive the answer.

A really interesting option if you develop for macOS would be to download your classifier in CoreML format and use it offline with the MBS Core ML plugin.

Antonio Rinaldi is a professional engineer, Xojo developer for almost twenty years, Xojo evangelist for Italy since 2014, consultant and teacher. He develops extensions for Xojo iOS that you can find in the Xojo Store, and manages XojoItaliaBlog.com. Musician, composer, lover of good food, traveler and constantly curious, he is always looking for new ideas to expand his knowledge.