Query Predicates

Predicates provide a way for complex filter expressions when querying resources.

The queryable APIs support ad-hoc filtering of resources through flexible predicates. They do so via the where query parameter that accepts a predicate expression to determine whether a specific resource representation should be included in the result.

Please note that this syntax differs from the ones used for other predicate types, most notably the predicates used to define discount targets.

The structure of predicates and the names of the fields follow the structure and naming of the fields in the documented response representation of the respective query results.

API endpoints that support query predicates allow passing input variables as separate HTTP query parameters.

Query Predicates by Example

// Compare a field's value to a given value
name = "Peter" // For exact match to "Peter". This does not perform substring match.
name != "Peter"
age < 42
age > 42
age <= 42
age >= 42
age <> 42
// Combine any two conditional expressions in a logical conjunction / disjunction
name = "Peter" and age < 42
name = "Peter" or age < 42
// Negate any other conditional expression
not (name = "Peter" and age < 42)
// Check whether a field's value is or is not contained in
// a specified set of values.
age in (42, 43, 44)
age not in (42, 43, 44)
// to be noted: 'in' is much more efficient than several '='
// prefer:
name in ("Peter", "Barbara")
// to:
name = "Peter" or name = "Barbara"
// Check whether an array contains all or any of a set of values
tags contains all ("a", "b", "c")
tags contains any ("a", "b", "c")
// Check whether an array is empty
tags is empty
// Check whether a field exists & has a non-null value
name is defined
name is not defined
// Descend into nested objects
dog(age < 7 and name = "Beethoven")
// Descend into nested arrays of objects
cities(zip > 10000 and zip < 20000)
// Query GeoJSON field within a circle
// The two first parameters are the longitude and latitude of the circle's center.
// The third parameter is the radius of the circle in meter.
geoLocation within circle(13.37770, 52.51627, 1000)
// Query for ProductProjections with attribute values
// - to get all results add a predicate of the same form, but starting with 'masterVariant' instead of 'variants'
// - to query for Products instead enclose the examples with 'masterData(current(<example>))' or 'masterData(staged(<example>))'
// ---------------------------
// for missing attribute
// for single attribute value of TextType
variants(attributes(name="attribute-name" and value="attribute-value"))
// for multiple attribute values of TextType with same name
variants(attributes(name="attribute-name" and value in ("attribute-value-1", "attribute-value-2")))
// for single attribute value of LTextType
variants(attributes(name="attribute-name" and value(en="attribute-value")))
// for multiple attribute values of LTextType with same name
variants(attributes(name="attribute-name" and value(en="english-value" or de="german-value")))
// for EnumType or LocalizableEnumType
variants(attributes(name="attribute-name" and value(key="enum-key")))
// for MoneyType (currencyCode is required)
variants(attributes(name="attribute-name" and value(centAmount=999 and currencyCode="EUR")))
// for MoneyType with centAmount within a specific range (currencyCode is required)
variants(attributes(name="attribute-name" and value(centAmount > 999 and centAmount < 1001 and currencyCode="EUR")))
// for NumberType
variants(attributes(name="attribute-name" and value=999))
// for NumberType with value within a specific range
variants(attributes(name="attribute-name" and value > 999 and value < 1001 ))
// for DateType, TimeType, or DateTimeType
variants(attributes(name="attribute-name" and value="attribute-value"))
// for DateType, TimeType, or DateTimeType with a value within a specific range
variants(attributes(name="attribute-name" and value > "value-start" and value < "value-end"))
// for ReferenceType
variants(attributes(name="attribute-name" and value(typeId="reference-type-id" and id="reference-id")))

A query endpoint usually restricts predicates to only be allowed on a specified subset of a resource representation's fields. The documentation of the endpoint lists fields that can be used for constructing predicates.

If multiple predicates are specified via multiple where query parameters, the individual predicates are combined in a logical conjunction, just as if they had been specified in a single where query parameter and combined with and.

Note: The encoding of the predicates is UTF-8 and the predicate must be URL-encoded in the HTTP request.

Example predicate for querying products:

# decoded predicate
masterData(current(slug(en="peter-42") and name(en="Peter")))
# URL-encoded predicate

Input Variables

Query predicates support the use of input variables to simplify working with query strings that contain dynamic values. Using input variables also eases log analysis because identical query use cases have identical where query parameter values.

Inside the query predicate string, references to input variables must be prefixed with a colon :.

All input variables referenced in the query predicate must be added to the URI as separate HTTP query parameters whose names must be prefixed with var.. The same input parameter can be passed multiple times to be used as an array of values.

The actual names of the input variables must consist of alphanumeric characters only.

Input Variable Examples

HTTP query using one input variable:

# decoded:
?where=firstName = :name&var.name=Peter
# URL-encoded:

HTTP query using an array input variable:

# decoded:
?where=masterVariant(sku in :skus) or variants(sku in :skus)&var.skus=sku1&var.skus=sku2&var.skus=sku3
# URL-encoded:

Referencing input variables in query predicates:

// Compare a field's value to a given input variable value
name = :name
// Check whether a field's value is or is not contained in
// a specified set of input variable values.
age in :ages
age in (:age1, :age2, :age3)
age not in :ages
age not in (:age1, :age2, :age3)
// Check whether an array contains all or any of a set of input variable values
tags contains all :tags
tags contains all (:tag1, :tag2, :tag3)
tags contains any :tags
tags contains any (:tag1, :tag2, :tag3)
// Referencing an input variable multiple times
masterVariant(sku in :skus) or variants(sku in :skus)

Performance Considerations

Query predicates are translated to database queries whose efficiency depends on how well the database can use indexes.

Indexes are managed automatically by the commercetools platform. Some indexes are present on all projects, others are added dynamically. For example, if you add a custom field to your carts and start querying it, the commercetools platform will add an index to the project to improve performance if it meets criteria like query frequency.

The automatic index creation needs to collect a significant amount of data to not optimize for outlier scenarios. That's why it can take up to two weeks before a new index is added.

Efficient queries can be fast on extremely large datasets and inefficient queries can be fast on small datasets, too. But inefficient query patterns on large datasets cause long-running and resource intensive queries. Such queries can affect the overall performance of a project.

Inefficient Patterns

Not all query predicates can be easily supported with an index, so if possible avoid the following patterns on large datasets:

  • Fields nested inside arrays - the index becomes inefficient if it contains too many entries. For example, variants(attributes(name = "attribute-name" and value = "attribute-value")).
  • Querying for a condition that is true for the majority of resources, for example custom(state = "Done").
  • Indexes on product attributes on the Products and Product Projections endpoints are not provided. Use the Product Projection Search endpoint instead or try to narrow down your result set with an additional restriction, for example on the productType or the state field.

Efficient Patterns

The following patterns are supporting efficient query execution:

  • Non-nested fields that heavily reduce the subset of resources to filter, for example custom(state = "WaitingForExternalApproval") (assuming there are few resources waiting for external approval)
  • If possible, prefer equality over other operators. For example, (state = "Done" or state = "Canceled)" can be faster than (state != "Open") in a query that contains further expressions.
  • Queries on orders, carts, customers etc. may be fast in the beginning, but slow down over time as your project grows. Include a time range, for example lastModifiedAt > $1-week-ago and ... (replace $1-week-ago with an actual date). Try defaulting the time range to the smallest value that's acceptable for the use case. Alternatively, try filtering by a field value that naturally only occurs in recently created resources.

Sorting and query performance

Sorting can also be supported by indexes. For best performance, the same index can be used for filtering and sorting. If possible, re-use a field from the query predicate for sorting.

For example, if your filter query is lastModifiedAt > $1-week-ago, sorting on lastModifiedAt is advised since it is more performant than sorting on a different field, like,id.

Disable calculating the total

Disabling calculation of the total field in the PagedQueryResult will improve the performance of the query. Whenever the total is not needed, deactivate its calculation by using the query parameter withTotal=false.

Query predicate specification

You can view the query predicate specification here. The predicate specification is useful when building language parsers.