Defining Requests, Models & Keywords
Full definitions of all endpoints and schemas are on the API’s Swagger page.
In most cases, the values should be self-explanatory.
This is a brief description of the less obvious values and how they are used.
A Model is where you define a set of keywords. Multiple models can be passed in each request, and a score between 1 and 100 is returned for
each of them. The Covid-19 example model opposite shows that each keyword is made up of six components:
Word The word or phrase you want to match against. Phrases with multiple words are considered stronger matches and are
weighted higher than single words.
Weighting The default weighting for a word is 1. The “Weighting” value allows you to set a weighting between 1 and 5.
Setting a high weighting on important keywords allows you to boost the score for the model when that word or phrase is found.
For example, “Covid” and “Covid-19” are given a weighting of 5 in the sample opposite, as their presence in the text will
always signify that the text is to some degree about Covid. Keywords such as “mask” or “vaccine,” while indicative of a Covid-related
piece of text in the current climate, will not always be so, as they are also likely to appear in text related to other pandemics.
MustHave When you set “MustHave” to true for a keyword, you are saying that at least one of the keywords defined as MustHaves
must be present in the text. If none are present, the model will not match to the text regardless of what other lesser keywords might be found.
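The MustHave rule can be pictured as a gate that runs before any scoring. The following is a minimal sketch of that idea only, not the service's implementation: the function name passes_must_have is hypothetical, the MustHave values in the sample are illustrative, the substring test is a deliberately naive stand-in for the real matching, and treating a model with no MustHave keywords as ungated is an assumption.

```python
def passes_must_have(keywords, text):
    """Return True if at least one MustHave keyword appears in the text."""
    must_haves = [k["Word"] for k in keywords if k.get("MustHave")]
    if not must_haves:
        # Assumption: a model with no MustHave keywords is not gated at all.
        return True
    lowered = text.lower()
    # Naive case-insensitive substring test, for illustration only.
    return any(word.lower() in lowered for word in must_haves)


# Illustrative keyword list; the MustHave flags here are assumed values.
covid_keywords = [
    {"Word": "Covid", "Weighting": 5, "MustHave": True},
    {"Word": "Covid-19", "Weighting": 5, "MustHave": True},
    {"Word": "mask", "Weighting": 1, "MustHave": False},
]

print(passes_must_have(covid_keywords, "New Covid guidance was issued today"))
print(passes_must_have(covid_keywords, "Everyone wore a mask to the ball"))
```

In this sketch the second text contains the lesser keyword “mask” but no MustHave keyword, so the model would not match it.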
CaseSensitive This should be used when you are certain that the only occurrences of the word are capitalized, or where lack of
capitalization can lead to false matches. For example, references to “Trump” in the US Politics model should match Donald
Trump when the match is case sensitive. If lowercase matches were allowed, false matches could occur, because “trump” is also an
ordinary verb and noun with unrelated meanings.
Case-sensitive matches are scored slightly higher than regular matches.
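The effect of CaseSensitive can be illustrated with a small sketch. It assumes whole-word matching done with Python's standard re module; the helper find_matches is hypothetical and says nothing about how the service itself matches or scores text.

```python
import re


def find_matches(word, text, case_sensitive=False):
    """Return every whole-word occurrence of `word` in `text`."""
    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.findall(pattern, text, flags)


text = "Trump spoke on Tuesday. Hearts trump spades in this round."

# Case sensitive: only the proper noun is found.
print(find_matches("Trump", text, case_sensitive=True))

# Case insensitive: the card-game verb is picked up as well.
print(find_matches("Trump", text, case_sensitive=False))
```

The second call returns the false match on the verb, which is the situation the CaseSensitive option is meant to avoid.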
AllowPlurals Set to true by default, this lets a keyword such as “mask” also match where “masks” is found. You should always
allow plurals unless there is a domain-specific reason not to.
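As a rough illustration of what AllowPlurals changes, the sketch below accepts an optional “s” or “es” suffix on the keyword. This suffix rule is an assumption made for the example; how the service actually recognises plurals is not described here, and matches_keyword is a hypothetical helper.

```python
import re


def matches_keyword(word, text, allow_plurals=True):
    """Whole-word, case-insensitive match with an optional naive plural suffix."""
    suffix = r"(?:s|es)?" if allow_plurals else ""
    pattern = r"\b" + re.escape(word) + suffix + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


print(matches_keyword("mask", "Masks are required indoors", allow_plurals=True))
print(matches_keyword("mask", "Masks are required indoors", allow_plurals=False))
```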
AllowSynonyms Synonym matches are useful but can be unreliable. For example, applying synonyms to “mask” will correctly match “covering”
if it finds it, which would catch instances of “face covering,” a less common term for masks in the Covid context. But it would also match
“disguise” and “concealment,” words that are incorrect for the Covid model.
For this reason, synonym matches are scored considerably lower than regular matches. However, don’t let this deter you from allowing synonyms,
as their use can improve the overall quality of the matches despite the false positives.
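The six components described above can be sketched as a small Python structure. The six field names come from the descriptions above, as do the default Weighting of 1, the 1 to 5 range, and AllowPlurals defaulting to true. The remaining defaults, the option values chosen for the sample keywords, and the “Name” and “Keywords” wrapper fields are assumptions for illustration; the Swagger page remains the authority on the actual schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Keyword:
    Word: str
    Weighting: int = 1           # documented default
    MustHave: bool = False       # assumed default
    CaseSensitive: bool = False  # assumed default
    AllowPlurals: bool = True    # documented default
    AllowSynonyms: bool = False  # assumed default

    def __post_init__(self):
        # The Weighting value is described as being between 1 and 5.
        if not 1 <= self.Weighting <= 5:
            raise ValueError("Weighting must be between 1 and 5")


# Hypothetical model wrapper; "Name" and "Keywords" are assumed field names.
covid_model = {
    "Name": "Covid-19",
    "Keywords": [
        asdict(Keyword("Covid", Weighting=5, MustHave=True)),
        asdict(Keyword("Covid-19", Weighting=5, MustHave=True)),
        asdict(Keyword("mask", AllowSynonyms=True)),
        asdict(Keyword("vaccine")),
    ],
}

print(json.dumps(covid_model, indent=2))
```

Building keywords through a small class like this makes it easy to catch an out-of-range weighting before a request is sent.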
The quickest way to test all these values is to run some of the sample requests in the Postman project mentioned earlier.
Tweaking the keywords and options will allow you to see how the model score changes as keywords are added, removed or edited.