Working with Semantic Expressions

As briefly mentioned in the introduction, semantically meaningful operations can be carried out on fingerprints by performing simple binary operations on the positions of the fingerprints. Semantic relationships between fingerprints can be discovered by looking at their overlapping positions in the semantic space. This allows us, for example, to subtract the meaning of one term from the meaning of another term to obtain a more specific representation. In the /expressions endpoint, we offer these binary operations on the fingerprints and, along with this, a flexible way of specifying the input data.

Input Formatting in JSON

The expression endpoints require the query data to be formatted as JSON. There are three constructs for entering data as expressions:

{
    "term": "jaguar"
}
{
    "text": "The jaguar is a big cat, a feline in the Panthera genus."
}
{
    "positions" : [ 2, 3, 44, 264, 539, 951 ]
}

Querying the /expressions endpoint will simply resolve all input elements, perform any operations (see below), and return the resulting Fingerprint object.

For example, the fingerprint of the term jaguar can be retrieved with curl like this (all on one line):

curl -k -X POST -H "api-key: yourApiKey" -H "Content-Type: application/json"
   "http://api.cortical.io/rest/expressions?retina_name=en_associative&sparsity=1.0"
      -d '{"term": "jaguar"}'
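The same request can be sketched in Python with the standard library. This is a minimal illustration, not an official client: the URL, headers, and query parameters are copied from the curl command above, while the helper name build_expression_request is hypothetical. The request is only built here; the commented lines show how it could be sent.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the curl example above.
BASE_URL = "http://api.cortical.io/rest"


def build_expression_request(expression, api_key,
                             retina_name="en_associative", sparsity=1.0):
    """Build (but do not send) a POST request for the /expressions endpoint.

    Hypothetical helper for illustration; it mirrors the curl call above.
    """
    query = urllib.parse.urlencode(
        {"retina_name": retina_name, "sparsity": sparsity}
    )
    return urllib.request.Request(
        url=f"{BASE_URL}/expressions?{query}",
        data=json.dumps(expression).encode("utf-8"),
        headers={"api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_expression_request({"term": "jaguar"}, api_key="yourApiKey")

# Sending the request needs network access and a valid API key:
# with urllib.request.urlopen(request) as response:
#     fingerprint = json.load(response)
```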

The alert reader will notice at this point that this functionality is the same as what the /terms and /text endpoints offer. Fingerprints for texts and terms can be retrieved either with expressions or with those specific endpoints. But expressions can do more than this:

Using Operators in Expressions

When querying the /expressions/similar_terms endpoint with this input (and using the en_associative retina):

{
     "term" : "apple"
}

we get the following terms in return:

  • apple, desktop, macintosh, hardware, iphone, processor, mac os, os, software, microsoft, compatible, ipod, cpu, compatibility, interface

which is perhaps not what we want if we are looking for terms similar to the fruit. To steer the result toward the fruit, we can simply subtract the fingerprint of a short text describing the company in an expression:

{
  "sub" : [
    {
      "term" : "apple"
    },
    {
      "text" : "Mac OS is a series of graphical user interface-based operating systems developed by Apple Inc. for their Macintosh line of computer systems."
    }
  ]
}

The similar terms returned for this expression are now:

  • apple, fruit, apples, fruits, pears, sweet, cherry, banana, honey, mango, juice, olive, corn, plum, palm

In other words, we can make the meaning of a fingerprint more specific by performing binary operations on the semantic space. The following operators are available for building expressions:

Operator   Input                                                                  Result
and        { "and": [ { "positions": [0, 2, 4] }, { "positions": [0, 4] } ] }     { "positions": [ 0, 4 ] }
or         { "or":  [ { "positions": [0, 2, 4] }, { "positions": [0, 4] } ] }     { "positions": [ 0, 2, 4 ] }
sub        { "sub": [ { "positions": [0, 2, 4] }, { "positions": [0, 4] } ] }     { "positions": [ 2 ] }
xor        { "xor": [ { "positions": [0, 2, 4] }, { "positions": [0, 4] } ] }     { "positions": [ 2 ] }

As you can see, the operators behave like the corresponding set operations once text and term inputs have been converted into positions. The operators can also be applied to a list of more than two arguments, in which case precedence rules apply.
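To make the operator semantics concrete, the rows of the table above can be reproduced locally with Python sets. This is only a sketch of the set logic, not the service's implementation: the function name evaluate is hypothetical, it accepts "positions" elements only (resolving "term" and "text" requires the API), and it is deliberately limited to two operands per operator because the precedence rules for longer lists are not spelled out here.

```python
def evaluate(expression):
    """Evaluate a positions-only expression locally; return sorted positions.

    Hypothetical helper: handles "positions" leaves and the operators
    and / or / sub / xor with exactly two operands each.
    """
    if len(expression) != 1:
        raise ValueError("an expression must have exactly one key")
    (name, value), = expression.items()
    if name == "positions":
        return sorted(set(value))

    operators = {
        "and": lambda a, b: a & b,  # positions present in both
        "or": lambda a, b: a | b,   # positions present in either
        "sub": lambda a, b: a - b,  # positions in the first but not the second
        "xor": lambda a, b: a ^ b,  # positions present in exactly one
    }
    if name not in operators:
        raise ValueError(f"cannot evaluate {name!r} locally")
    if len(value) != 2:
        raise ValueError("this sketch supports exactly two operands")

    left, right = (set(evaluate(operand)) for operand in value)
    return sorted(operators[name](left, right))


a = {"positions": [0, 2, 4]}
b = {"positions": [0, 4]}
results = {op: evaluate({op: [a, b]}) for op in ("and", "or", "sub", "xor")}
```

Because evaluate calls itself on each operand, nested expressions such as a "sub" whose first operand is an "or" are handled as well.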

One useful option is the sparsity parameter, which re-sparsifies a fingerprint’s positions down to a given value. For example, when OR’ing a number of fingerprints together, you may end up with a fingerprint in which a large share of the positions is set, e.g. 10%. Setting the sparsity value to 0.02 returns only 2% of the positions, which is currently the sparsity of individual terms in the Retinas.

Bulk Requests

The endpoints ending in /bulk pack a list of expressions to evaluate into a single HTTP call, which reduces the network load. For instance, from the input:

[
    {
        "text": "This is an example text to demonstrate how text elements may be used in a JSON list,"
    },
    {
        "text": "in order to make a call to the /text/bulk endpoint."
    },
    {
        "text": "Only text elements may be used with the /text/bulk endpoint."
    }
]

a list of three Fingerprint objects, corresponding to the three input texts, will be returned.
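A bulk body like the one above does not have to be written by hand. The following sketch builds it from a list of strings; the helper name build_text_bulk_payload is hypothetical, and the result is just the JSON body that would be sent to the /text/bulk endpoint.

```python
import json


def build_text_bulk_payload(texts):
    """Wrap each text in a {"text": ...} element and serialise the list as JSON.

    Hypothetical helper for illustration; it only builds the request body.
    """
    return json.dumps([{"text": text} for text in texts])


payload = build_text_bulk_payload([
    "This is an example text to demonstrate how text elements may be used in a JSON list,",
    "in order to make a call to the /text/bulk endpoint.",
    "Only text elements may be used with the /text/bulk endpoint.",
])
```

Letting json.dumps produce the body also keeps each string on a single line and escapes special characters, so the payload stays valid JSON.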

API Clients

The FullClient object available in the Java, Python, and JavaScript client libraries has the following methods for calling the expressions endpoints:

  • getFingerprintForExpression
  • getContextsForExpression
  • getSimilarTermsForExpression
  • getFingerprintsForExpressions
  • getContextsForExpressions
  • getSimilarTermsForExpressions