API queries using Elasticsearch
This article provides an overview of the use of Elasticsearch as part of a query within an API call.
Introduction
At Cortex we use the world-leading Elasticsearch capability to facilitate queries within API calls. This article provides an overview of its capabilities and syntax. For more detail on the depth of Elasticsearch, please refer to the Elastic website - Elasticsearch Guide.
Important note
Using Elasticsearch with wildcards and the type of complex search structures discussed below can be extremely demanding of computational resources. Elasticsearch is primarily intended for use in powering more orthodox requests, such as the search of a website or for finding a specific text string within an article.
In this respect, it is significantly more efficient to use a search limited to a specific endpoint rather than requesting a search of everything. For example, it is preferable to find articles on the basis of a category slug using the articles endpoint, like this:
GET 💻
https://article-cms.cortextech.io/v2/articles?clientId=DEMO&categorySlug=news
than it is to search for the category slug text using a generic search structure like this:
GET 💻
https://stage-article-cms.cortextech.io/v2/articles/search?clientId=DEMO&query=news
Using Elasticsearch
A query can be made against specified sort fields as part of an API call’s options, like this:
GET 💻
https://{environment-id}/v2/articles/search?clientId={clientId}&query={query_string}
In which:
`{environment-id}` is the URL for stage (test) or production (live).
- Stage: https://stage-article-cms.cortextech.io
- Production: https://article-cms.cortextech.io
- Production (US environment): https://article-cms.cortextech.us
`{clientId}` is your client ID, assigned during onboarding.
`{query_string}` is the text to search for when selecting articles to fetch.
For example, this call returns all articles belonging to the client with ID “DEMO” that contain the search string “Matchday”:
GET 💻
https://article-cms.cortextech.io/v2/articles/search?clientId=DEMO&query=Matchday
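The call above can also be assembled programmatically. The following is a minimal Python sketch, assuming only the base URLs and the `clientId` and `query` parameters described in this article; the helper name `build_search_url` and the dictionary keys are illustrative and not part of the API.

```python
from urllib.parse import urlencode

# Base URLs as listed in this article; the keys are illustrative labels only.
BASE_URLS = {
    "stage": "https://stage-article-cms.cortextech.io",
    "production": "https://article-cms.cortextech.io",
    "production-us": "https://article-cms.cortextech.us",
}


def build_search_url(environment: str, client_id: str, query: str) -> str:
    """Return the /v2/articles/search URL for the given client and query."""
    # urlencode percent-encodes the values so they are safe inside a URL.
    params = urlencode({"clientId": client_id, "query": query})
    return f"{BASE_URLS[environment]}/v2/articles/search?{params}"


print(build_search_url("production", "DEMO", "Matchday"))
# https://article-cms.cortextech.io/v2/articles/search?clientId=DEMO&query=Matchday
```

The resulting string can be passed to whichever HTTP client you already use to issue the GET request.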
However, because the `query={query_string}` parameter uses Elasticsearch, a range of options can be applied. Elasticsearch parses the provided query string and splits it on operators, such as `AND` or `NOT`. It then analyses each section of the search independently before returning matching documents.
You can also create a complex search that includes wildcard characters, searches across multiple fields, and more. While versatile, the query is strict and returns an error if the query string includes any invalid syntax.
The query string is parsed into a series of terms and operators.
A term can be a single word, for example `quick` or `brown`, or a phrase surrounded by double quotes, such as `"quick brown fox"`, which searches for all the words in the phrase, in the same order.
Operators allow you to customize the search. The key options we use are as follows.
- `status:active` returns results where the `status` field contains `active`.
- `title:(quick OR brown)` returns results where the `title` field contains a string, for example `quick` or `brown`.
- `author:"John Smith"` returns results where the `author` field contains an exact phrase, in this case "john smith".
- `book.\*:(quick OR brown)` returns results where any of the fields `book.title`, `book.content` or `book.date` contains a string, such as `quick` or `brown` (note how we need to escape the wildcard character `*` with a backslash).
- `_exists_:title` returns results where the field `title` has any non-null value.
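The field-based forms listed above contain characters such as `:`, `(`, `"` and `\`, which should be percent-encoded before they are placed in a URL. The sketch below shows one way to compose and encode them in Python. The helper `field_query` is hypothetical and written only for illustration; treating the encoded string as the value of the `query` parameter is an assumption based on the call format described in this article.

```python
from urllib.parse import quote


def field_query(field: str, *terms: str) -> str:
    """Build field:term, or field:(a OR b) when several terms are given."""
    if not terms:
        raise ValueError("at least one term is required")
    # Terms containing a space are wrapped in double quotes to form a phrase.
    rendered = [f'"{t}"' if " " in t else t for t in terms]
    if len(rendered) == 1:
        return f"{field}:{rendered[0]}"
    return f"{field}:({' OR '.join(rendered)})"


queries = [
    field_query("status", "active"),
    field_query("title", "quick", "brown"),
    field_query("author", "John Smith"),
    field_query(r"book.\*", "quick", "brown"),  # escaped wildcard in the field name
    field_query("_exists_", "title"),
]

for q in queries:
    # safe="" makes quote() encode every reserved character, including ":".
    print(q, "->", quote(q, safe=""))
```

For instance, `status:active` is encoded as `status%3Aactive`, and `author:"John Smith"` as `author%3A%22John%20Smith%22`.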
Wildcards
Wildcard searches can be run on individual terms, using `?` to replace a single character, and `*` to replace zero or more characters. For example: `qu?ck bro*`
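To get a feel for what `?` and `*` match, the behaviour can be approximated locally. This sketch uses Python's `fnmatchcase`, which happens to share the same meaning for `?` and `*` in simple patterns like these. It is only an illustration on a made-up word list, not how Elasticsearch evaluates a query, and the helper name `matching` is hypothetical.

```python
from fnmatch import fnmatchcase


def matching(pattern: str, words: list[str]) -> list[str]:
    """Return the words that match a simple ?/* wildcard pattern."""
    return [w for w in words if fnmatchcase(w, pattern)]


# Made-up sample terms, chosen only to show the difference between ? and *.
words = ["quick", "quack", "quck", "bro", "brown", "brother"]

print(matching("qu?ck", words))  # ['quick', 'quack']
print(matching("bro*", words))   # ['bro', 'brown', 'brother']
```

Note that `quck` is not matched by `qu?ck`, because `?` stands for exactly one character, while `bro` is matched by `bro*`, because `*` may stand for zero characters.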
Be aware that wildcard queries can use a large amount of memory and poorly structured queries will perform badly; for example, think how many terms need to be queried to match the query string `"a* b* c*"`.
Pure wildcards `\*` are rewritten to the `exists` query format for efficiency. As a consequence, the wildcard `"field:*"` would match documents with an empty value like the following: `{"field": ""}` ...and would **not** match if the field is missing or set with an explicit null value like the following: `{"field": null}`
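The three cases described above can be modelled in a few lines of Python. This is a local sketch of the stated behaviour only (empty string matches, missing field and explicit null do not); the function name `matches_pure_wildcard` is hypothetical and it does not attempt to cover any other value types.

```python
def matches_pure_wildcard(document: dict, field: str) -> bool:
    """Model the behaviour described above: the field is present and not null."""
    # dict.get returns None both for a missing key and for an explicit null.
    return document.get(field) is not None


print(matches_pure_wildcard({"field": ""}, "field"))    # True: empty value matches
print(matches_pure_wildcard({"field": None}, "field"))  # False: explicit null
print(matches_pure_wildcard({}, "field"))               # False: field is missing
```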