Query performance considerations
Content Advanced Search queries can become complex, and certain patterns may lead to performance degradation or increased resource consumption. This document highlights common pitfalls and best practices to avoid them.
Performance considerations
Wildcard field expansion
Consideration: Wildcards that expand to many fields can significantly impact query performance and resource usage.
Potential issues:
- Increased query execution time
- Higher memory consumption
- Possible query rejection if expansion is excessive
- Expanded fields may include unintended data
Best practices:
| Practice | Reason |
|---|---|
| Prefer explicit field lists over wildcards | Predictable performance and clearer intent |
| Use wildcards only when the field count is known to be reasonable | Prevents unexpected expansions |
| Test wildcard queries with explain mode to understand field expansion | Visibility into actual fields being searched |
| Consider refactoring data structure if wildcards are required for many fields | Better data modelling can eliminate need for broad wildcards |
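The first practice above can be checked mechanically before a query is sent. The following is a minimal sketch, assuming queries are held as plain Python dictionaries shaped like the JSON examples in this document; `find_wildcard_paths` is a hypothetical helper and not part of the Content Advanced Search API.

```python
from typing import Any, Iterator


def find_wildcard_paths(query: Any) -> Iterator[str]:
    """Yield every "path" value in a query that contains a wildcard (*)."""
    if isinstance(query, dict):
        for key, value in query.items():
            if key == "path":
                # "path" may be a single string or a list of strings.
                paths = value if isinstance(value, list) else [value]
                for path in paths:
                    if isinstance(path, str) and "*" in path:
                        yield path
            else:
                # Recurse into operators and clause lists.
                yield from find_wildcard_paths(value)
    elif isinstance(query, list):
        for item in query:
            yield from find_wildcard_paths(item)


query = {
    "compound": {
        "must": [{"text": {"query": "championship", "path": "content.*"}}],
        "filter": [{"equal": {"path": "tags", "value": "tag1"}}],
    }
}
wildcards = list(find_wildcard_paths(query))
print(wildcards)
```

A check like this only reports that a wildcard is present. How many fields it expands to still has to be confirmed with explain mode.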
Example: Avoid:
{
"text": {
"query": "championship",
"path": "content.*" // Unknown expansion count
}
}
Example: Preferred:
{
"text": {
"query": "championship",
"path": ["content.text", "content.altText", "heroMedia.title"] // Explicit, predictable
}
}
Query nesting depth
Consideration: Deeply nested compound queries can be difficult to understand and maintain, and may impact performance.
Potential issues:
- Increased query complexity and execution time
- Harder to debug and maintain
- Possible stack or recursion limits in implementation
Best practices:
| Practice | Reason |
|---|---|
| Keep nesting shallow (prefer 2-3 levels maximum) | Easier to understand and maintain |
| Flatten nested queries when possible | Simpler structure with same logical outcome |
| Use multiple clauses at the same level | Takes advantage of Boolean logic without deep nesting |
| Consider breaking complex queries into simpler parts | May be easier to test and optimise individually |
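One way to enforce a shallow-nesting guideline is to measure depth before sending a query. This is a rough sketch under the assumption that queries are plain dictionaries and that compound operators only nest through the `must`, `should`, `filter`, and `mustNot` clause lists; `compound_depth` is a hypothetical helper name.

```python
from typing import Any

CLAUSE_TYPES = ("must", "should", "filter", "mustNot")


def compound_depth(query: Any) -> int:
    """Return how many compound operators are nested inside one another."""
    if not isinstance(query, dict) or "compound" not in query:
        return 0  # Leaf operators such as text or equal add no nesting.
    compound = query["compound"]
    deepest_child = 0
    for clause_type in CLAUSE_TYPES:
        for clause in compound.get(clause_type, []):
            deepest_child = max(deepest_child, compound_depth(clause))
    return 1 + deepest_child


nested = {"compound": {"must": [{"compound": {"must": [{"compound": {"must": []}}]}}]}}
flat = {
    "compound": {
        "must": [{"text": {"query": "football", "path": "heroMedia.title"}}],
        "filter": [{"equal": {"path": "tags", "value": "tag1"}}],
    }
}
print(compound_depth(nested), compound_depth(flat))
```

A test suite could assert that `compound_depth(query) <= 3` for every query an application builds, matching the 2-3 level guideline in the table.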
Example: Avoid (deeply nested):
{
"compound": {
"must": [
{
"compound": {
"must": [
{
"compound": {
// Too many levels of nesting
}
}
]
}
}
]
}
}
Example: Preferred (flattened):
{
"compound": {
"must": [
{ "text": { ... } },
{ "equal": { ... } },
{ "range": { ... } }
],
"filter": [
{ "equal": { ... } },
{ "range": { ... } }
]
}
}
Number of clauses
Consideration: Queries with many clauses in a compound operator can become unwieldy and may impact performance.
Potential issues:
- Longer execution time as more conditions are evaluated
- Increased memory usage
- Difficult to maintain and debug
- May indicate the need for better data modelling
Best practices:
| Practice | Reason |
|---|---|
| Use the in operator for multiple value matching | More efficient than many equal clauses in should |
| Consider restructuring data if many clauses are needed | Better schema design can simplify queries |
| Break complex queries into multiple simpler queries | Easier to test, debug, and optimise |
| Group related conditions in separate clause types | Use must, should, filter, mustNot appropriately |
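The first practice can be applied automatically when clause lists are generated by code. The sketch below assumes `should` clauses are plain dictionaries in the shape used throughout this document; `collapse_equals_to_in` is a hypothetical helper. It treats a group of `equal` clauses on one path as interchangeable with a single `in` clause for matching purposes, and any difference in scoring between the two forms is not accounted for.

```python
from typing import Any, Dict, List


def collapse_equals_to_in(should_clauses: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Merge equal clauses that share a path into one in clause per path."""
    values_by_path: Dict[str, List[Any]] = {}
    other: List[Dict[str, Any]] = []
    for clause in should_clauses:
        equal = clause.get("equal")
        is_simple_equal = (
            set(clause) == {"equal"}
            and isinstance(equal, dict)
            and set(equal) == {"path", "value"}
        )
        if is_simple_equal:
            values_by_path.setdefault(equal["path"], []).append(equal["value"])
        else:
            other.append(clause)  # Leave every other operator untouched.

    merged: List[Dict[str, Any]] = []
    for path, values in values_by_path.items():
        if len(values) == 1:
            # A single value gains nothing from "in"; keep it as "equal".
            merged.append({"equal": {"path": path, "value": values[0]}})
        else:
            merged.append({"in": {"path": path, "values": values}})
    return merged + other


clauses = [
    {"equal": {"path": "tags", "value": "tag1"}},
    {"text": {"query": "football", "path": "heroMedia.title"}},
    {"equal": {"path": "tags", "value": "tag2"}},
]
collapsed = collapse_equals_to_in(clauses)
print(collapsed)
```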
Example: Avoid (many clauses):
{
"compound": {
"should": [
{ "equal": { "path": "tags", "value": "tag1" } },
{ "equal": { "path": "tags", "value": "tag2" } }
// Many more clauses...
]
}
}
Example: Preferred (single operator):
{
"in": {
"path": "tags",
"values": ["tag1", "tag2", "tag3", "tag4", "tag5"]
}
}
Large value arrays
Consideration: Using the in operator with very large arrays of values can impact performance and maintainability.
Potential issues:
- Query payload size increases
- More values to evaluate during query execution
- May indicate better data modelling is needed
- Difficult to maintain and test
Best practices:
| Practice | Reason |
|---|---|
| Keep in operator value arrays reasonably sized | Easier to maintain and better performance |
| Consider data model changes for very large value sets | e.g., use categories instead of individual tags |
| Use range or prefix queries when applicable | More efficient for certain types of data |
| Break into multiple queries if value set is very large | May be more maintainable |
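Where a large value set cannot be avoided, splitting it into several smaller `in` queries might look like the sketch below. The helper name `chunked_in_queries` and the chunk size are illustrative assumptions; this document does not state a recommended or maximum array size. Values are assumed to be hashable (strings or numbers) so that duplicates can be dropped.

```python
from typing import Any, Dict, List


def chunked_in_queries(path: str, values: List[Any], chunk_size: int) -> List[Dict[str, Any]]:
    """Split one large value list into several smaller in queries."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    # Drop duplicate values while preserving their original order.
    unique_values = list(dict.fromkeys(values))
    return [
        {"in": {"path": path, "values": unique_values[start:start + chunk_size]}}
        for start in range(0, len(unique_values), chunk_size)
    ]


queries = chunked_in_queries("tags", ["a", "b", "a", "c", "d", "e"], chunk_size=2)
for query in queries:
    print(query)
```

The results of the individual queries then have to be merged by the caller, which is a cost to weigh against the smaller payloads.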
Query complexity
Consideration: Complex queries should be monitored to ensure they perform well in production.
Best practices:
| Practice | Reason |
|---|---|
| Use explain mode during development | Understand query cost and performance characteristics |
| Monitor query performance in production | Identify slow queries early |
| Set reasonable timeouts | Prevent runaway queries from consuming resources |
| Test queries with production-like data volumes | Ensures performance is acceptable at scale |
| Document complex queries | Makes maintenance easier |
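Client-side timing is one simple way to spot slow queries early. The sketch below is an assumption-heavy illustration: `run_query` stands in for whatever function your application uses to send a query (it is hypothetical, not an API from this document), and the one-second threshold is arbitrary.

```python
import json
import logging
import time
from typing import Any, Callable, Dict, Tuple

logger = logging.getLogger("search.performance")


def timed_query(
    run_query: Callable[[Dict[str, Any]], Any],
    query: Dict[str, Any],
    slow_threshold_seconds: float = 1.0,
) -> Tuple[Any, float]:
    """Run a query, measure wall-clock time, and log it if it is slow."""
    start = time.perf_counter()
    result = run_query(query)
    elapsed = time.perf_counter() - start
    if elapsed > slow_threshold_seconds:
        # Log the full query so the slow case can be reproduced later.
        logger.warning("Slow query (%.3f s): %s", elapsed, json.dumps(query, sort_keys=True))
    return result, elapsed


def fake_run_query(query: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for a real client call, used only for this example."""
    return {"hits": []}


result, elapsed = timed_query(fake_run_query, {"text": {"query": "football", "path": "heroMedia.title"}})
print(result)
```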
Combining filters and searches
Consideration: Combining multiple filters and search conditions can lead to complex queries that may affect performance.
Best practices:
| Practice | Reason |
|---|---|
| Use filters for non-scoring conditions | Filters are generally more efficient than search clauses |
| Combine related conditions into single clauses | Reduces overall query complexity |
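Moving non-scoring conditions out of `must` can be done programmatically. This sketch assumes that `equal`, `range`, and `in` clauses are the ones that should not contribute to scoring; that set is an assumption made for illustration and should be adjusted to your own queries. `move_non_scoring_to_filter` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Assumption: these operators express yes/no conditions that need no score.
NON_SCORING_OPERATORS = {"equal", "range", "in"}


def move_non_scoring_to_filter(query: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a compound query with non-scoring must clauses in filter."""
    compound = query.get("compound")
    if not isinstance(compound, dict):
        return query  # Nothing to do for non-compound queries.

    scoring: List[Dict[str, Any]] = []
    non_scoring: List[Dict[str, Any]] = []
    for clause in compound.get("must", []):
        is_non_scoring = len(clause) == 1 and next(iter(clause)) in NON_SCORING_OPERATORS
        (non_scoring if is_non_scoring else scoring).append(clause)

    # Keep should, mustNot, and any other keys exactly as they were.
    rebuilt = {k: v for k, v in compound.items() if k not in ("must", "filter")}
    if scoring:
        rebuilt["must"] = scoring
    filters = list(compound.get("filter", [])) + non_scoring
    if filters:
        rebuilt["filter"] = filters
    return {**query, "compound": rebuilt}


original = {
    "compound": {
        "must": [
            {"range": {"path": "publishDate", "gte": "now-7d"}},
            {"text": {"query": "football", "path": "heroMedia.title"}},
        ]
    }
}
rewritten = move_non_scoring_to_filter(original)
print(rewritten)
```

Both `must` and `filter` clauses are required for a match, so the set of matching documents is expected to stay the same; only the contribution to scoring changes.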
Example: Avoid:
{
"compound": {
"must": [
{
"range": {
"path": "publishDate",
"gte": "now-7d"
}
},
{
"text": {
"query": "football",
"path": "heroMedia.title"
}
}
]
}
}
Example: Preferred:
{
"compound": {
"must": [
{
"text": {
"query": "football",
"path": "heroMedia.title"
}
}
],
"filter": [
{
"range": {
"path": "publishDate",
"gte": "now-7d"
}
}
]
}
}
Isolated filters allow the search engine to optimise execution.
Monitoring query performance
Use explain mode to understand how your queries perform:
{
"query": {
// your query
},
"explain": true
}
Key metrics to review:
| Metric | What to look for |
|---|---|
| _explain.cost.total | Overall query cost - lower is better |
| _explain.cost.breakdown.fields | Number of fields being searched - check wildcard expansions |
| Response time | How long the query takes to execute |
See Explain mode for detailed information on query diagnostics.
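As an illustration of reading these metrics in code, the sketch below wraps a query in an explain request and pulls out the two cost metrics named above. It assumes the response is a dictionary with `_explain` at the top level; the exact response shape is not specified here, and the numbers in the sample response are invented purely for the example. `with_explain`, `get_dotted`, and `summarise_explain` are hypothetical helper names.

```python
from typing import Any, Dict


def with_explain(query: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a query in a request body that turns explain mode on."""
    return {"query": query, "explain": True}


def get_dotted(data: Any, dotted_path: str, default: Any = None) -> Any:
    """Follow a dotted key path through nested dictionaries."""
    current = data
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def summarise_explain(response: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the cost metrics listed in the table above, or None if absent."""
    return {
        "total_cost": get_dotted(response, "_explain.cost.total"),
        "fields": get_dotted(response, "_explain.cost.breakdown.fields"),
    }


request_body = with_explain({"text": {"query": "championship", "path": "content.text"}})

# Invented sample response, assuming "_explain" sits at the top level.
sample_response = {"_explain": {"cost": {"total": 12, "breakdown": {"fields": 3}}}}
summary = summarise_explain(sample_response)
print(summary)
```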
