Performance considerations

Wildcard field expansion

Consideration: Wildcards that expand to many fields can significantly impact query performance and resource usage.

Potential issues:

Increased query execution time
Higher memory consumption
Possible query rejection if expansion is excessive
Expanded fields may include unintended data

Best practices:

Practice	Reason
Prefer explicit field lists over wildcards	Predictable performance and clearer intent
Use wildcards only when the field count is known to be reasonable	Prevents unexpected expansions
Test wildcard queries with explain mode to understand field expansion	Visibility into actual fields being searched
Consider refactoring data structure if wildcards are required for many fields	Better data modelling can eliminate need for broad wildcards

Example: Avoid:

{
  "text": {
    "query": "championship",
    "path": "content.*" // Unknown expansion count
  }
}

Example: Preferred:

{
  "text": {
    "query": "championship",
    "path": ["content.text", "content.altText", "heroMedia.title"] // Explicit, predictable
  }
}
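
To get a rough sense of how far a wildcard might expand before running it, you can match the pattern against a list of field names you already know about. The following is a minimal Python sketch, not the search engine's own expansion logic: `expand_wildcard` and the `known_fields` list are hypothetical, and shell-style matching via `fnmatchcase` is only an assumed approximation of the real wildcard semantics.

```python
from fnmatch import fnmatchcase


def expand_wildcard(path_pattern, known_fields):
    """Return the known field names that a wildcard path would match.

    Assumption: shell-style matching approximates the engine's wildcard rules.
    """
    return sorted(f for f in known_fields if fnmatchcase(f, path_pattern))


# Hypothetical field list; in practice this would come from your own schema.
known_fields = [
    "content.text",
    "content.altText",
    "content.caption",
    "heroMedia.title",
]

matched = expand_wildcard("content.*", known_fields)
print(len(matched), matched)
```

If the count is larger than expected, that is a signal to switch to an explicit field list.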

Query nesting depth

Consideration: Deeply nested compound queries can be difficult to understand and maintain, and may impact performance.

Potential issues:

Increased query complexity and execution time
Harder to debug and maintain
Possible stack or recursion limits in implementation

Best practices:

Practice	Reason
Keep nesting shallow (prefer 2-3 levels maximum)	Easier to understand and maintain
Flatten nested queries when possible	Simpler structure with same logical outcome
Use multiple clauses at the same level	Takes advantage of Boolean logic without deep nesting
Consider breaking complex queries into simpler parts	May be easier to test and optimise individually

Example: Avoid (deeply nested):

{
  "compound": {
    "must": [
      {
        "compound": {
          "must": [
            {
              "compound": {
                // Too many levels of nesting
              }
            }
          ]
        }
      }
    ]
  }
}

Example: Preferred (flattened):

{
  "compound": {
    "must": [
      { "text": { ... } },
      { "equal": { ... } },
      { "range": { ... } }
    ],
    "filter": [
      { "equal": { ... } },
      { "range": { ... } }
    ]
  }
}
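
A nesting guideline is easier to follow when it can be checked automatically, for example in a unit test. The sketch below counts `compound` levels in a query held as a Python dict. The helper name `compound_depth` is hypothetical, and it assumes clause lists sit directly under keys such as `must` or `filter`, as in the examples above.

```python
def compound_depth(query):
    """Count how many levels of `compound` are nested in a query dict."""
    if not isinstance(query, dict):
        return 0
    compound = query.get("compound")
    if not isinstance(compound, dict):
        return 0
    # Look inside every clause list (must, should, filter, mustNot, ...).
    child_depths = [
        compound_depth(clause)
        for clauses in compound.values()
        if isinstance(clauses, list)
        for clause in clauses
    ]
    return 1 + max(child_depths, default=0)


nested = {"compound": {"must": [{"compound": {"must": [{"compound": {}}]}}]}}
flat = {"compound": {"must": [{"text": {}}], "filter": [{"range": {}}]}}

print(compound_depth(nested), compound_depth(flat))
```

A test could then assert that `compound_depth(query)` stays at or below the limit your team has agreed on.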

Number of clauses

Consideration: Queries with many clauses in a compound operator can become unwieldy and may impact performance.

Potential issues:

Longer execution time as more conditions are evaluated
Increased memory usage
Difficult to maintain and debug
May indicate the need for better data modelling

Best practices:

Practice	Reason
Use the `in` operator to match multiple values	More efficient than many `equal` clauses in `should`
Consider restructuring data if many clauses are needed	Better schema design can simplify queries
Break complex queries into multiple simpler queries	Easier to test, debug, and optimise
Group related conditions in separate clause types	Use `must`, `should`, `filter`, `mustNot` appropriately

Example: Avoid (many clauses):

{
  "compound": {
    "should": [
      { "equal": { "path": "tags", "value": "tag1" } },
      { "equal": { "path": "tags", "value": "tag2" } }
      // Many more clauses...
    ]
  }
}

Example: Preferred (single operator):

{
  "in": {
    "path": "tags",
    "values": ["tag1", "tag2", "tag3", "tag4", "tag5"]
  }
}
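
If queries are built programmatically, the rewrite from many `equal` clauses to one `in` clause can be automated. The following Python sketch groups `equal` clauses by path. The function name `collapse_equals` is hypothetical, and it assumes that the two forms are interchangeable for your use case; relevance scoring may differ between them, so verify the results before relying on it.

```python
from collections import defaultdict


def collapse_equals(should_clauses):
    """Merge `equal` clauses that share a path into one `in` clause each."""
    values_by_path = defaultdict(list)
    others = []
    for clause in should_clauses:
        equal = clause.get("equal")
        if isinstance(equal, dict) and "path" in equal and "value" in equal:
            values_by_path[equal["path"]].append(equal["value"])
        else:
            others.append(clause)  # leave non-equal clauses untouched

    merged = []
    for path, values in values_by_path.items():
        if len(values) == 1:
            # A single value gains nothing from `in`; keep it as `equal`.
            merged.append({"equal": {"path": path, "value": values[0]}})
        else:
            merged.append({"in": {"path": path, "values": values}})
    return merged + others


clauses = [
    {"equal": {"path": "tags", "value": "tag1"}},
    {"equal": {"path": "tags", "value": "tag2"}},
    {"text": {"query": "championship", "path": "content.text"}},
]
collapsed = collapse_equals(clauses)
```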

Large value arrays

Consideration: Using the `in` operator with very large arrays of values can impact performance and maintainability.

Potential issues:

Query payload size increases
More values to evaluate during query execution
May indicate better data modelling is needed
Difficult to maintain and test

Best practices:

Practice	Reason
Keep `in` operator value arrays reasonably sized	Easier to maintain and faster to evaluate
Consider data model changes for very large value sets	e.g., use categories instead of individual tags
Use range or prefix queries when applicable	More efficient for certain types of data
Break into multiple queries if value set is very large	May be more maintainable
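
When a large value set cannot be avoided, one way to break it up is to deduplicate the values and split them into several smaller `in` queries. This is a minimal Python sketch: `chunked_in_queries` is a hypothetical helper, and the default of 100 values per query is an arbitrary illustration, not a documented limit.

```python
def chunked_in_queries(path, values, max_values=100):
    """Split a large value list into several smaller `in` queries."""
    if max_values < 1:
        raise ValueError("max_values must be at least 1")
    # Drop duplicates while keeping the original order.
    unique = list(dict.fromkeys(values))
    return [
        {"in": {"path": path, "values": unique[i:i + max_values]}}
        for i in range(0, len(unique), max_values)
    ]


queries = chunked_in_queries("tags", ["a", "b", "a", "c"], max_values=2)
```

Each resulting query can be run and tested separately, and the caller is responsible for merging the results.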

Query complexity

Consideration: Complex queries should be monitored to ensure they perform well in production.

Best practices:

Practice	Reason
Use `explain` mode during development	Understand query cost and performance characteristics
Monitor query performance in production	Identify slow queries early
Set reasonable timeouts	Prevent runaway queries from consuming resources
Test queries with production-like data volumes	Ensures performance is acceptable at scale
Document complex queries	Makes maintenance easier
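
Slow queries are easier to spot when every call is timed in one place. The sketch below wraps a query call and flags it when it exceeds a threshold. `run_query` is a hypothetical stand-in for whatever client function you use, and the 0.5 second threshold is an arbitrary example value.

```python
import time


def timed_query(run_query, query, slow_threshold_s=0.5, clock=time.perf_counter):
    """Run a query and report its wall-clock duration and whether it was slow."""
    start = clock()
    result = run_query(query)
    elapsed = clock() - start
    return result, {"elapsed_s": elapsed, "slow": elapsed > slow_threshold_s}


# Stand-in for a real client call; replace with your own function.
def fake_run_query(query):
    return {"hits": []}


result, stats = timed_query(fake_run_query, {"text": {"query": "football", "path": "heroMedia.title"}})
if stats["slow"]:
    print("slow query:", stats["elapsed_s"])
```

The `clock` parameter exists only so the timing can be replaced in tests. This measures client-side duration and does not replace a server-side timeout.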

Combining filters and searches

Consideration: Combining multiple filters and search conditions can lead to complex queries that may affect performance.

Best practices:

Practice	Reason
Use filters for non-scoring conditions	Filters are generally more efficient than search clauses
Combine related conditions into single clauses	Reduces overall query complexity

Example: Avoid:

{
  "compound": {
    "must": [
      {
        "range": {
          "path": "publishDate",
          "gte": "now-7d"
        }
      },
      {
        "text": {
          "query": "football",
          "path": "heroMedia.title"
        }
      }
    ]
  }
}

Example: Preferred:

{
  "compound": {
    "must": [
      {
        "text": {
          "query": "football",
          "path": "heroMedia.title"
        }
      }
    ],
    "filter": [
      {
        "range": {
          "path": "publishDate",
          "gte": "now-7d"
        }
      }
    ]
  }
}

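Placing non-scoring conditions such as the date range in `filter`, separate from the scored `text` clause in `must`, allows the search engine to optimise execution.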
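
The move from `must` to `filter` can also be applied mechanically when queries are generated in code. The following Python sketch relocates clauses whose operator is treated as non-scoring. The helper name `move_non_scoring_to_filter` is hypothetical, and treating `range` as non-scoring by default is an assumption; check which operators you actually want excluded from scoring.

```python
def move_non_scoring_to_filter(query, non_scoring_ops=("range",)):
    """Return a copy of a compound query with non-scoring `must` clauses in `filter`."""
    compound = query.get("compound")
    if not isinstance(compound, dict):
        return query

    keep, move = [], []
    for clause in compound.get("must", []):
        is_non_scoring = any(op in clause for op in non_scoring_ops)
        (move if is_non_scoring else keep).append(clause)
    if not move:
        return query  # nothing to relocate

    rewritten = {k: v for k, v in compound.items() if k not in ("must", "filter")}
    if keep:
        rewritten["must"] = keep
    # Append to any filters that were already present.
    rewritten["filter"] = list(compound.get("filter", [])) + move
    return {**query, "compound": rewritten}


before = {
    "compound": {
        "must": [
            {"range": {"path": "publishDate", "gte": "now-7d"}},
            {"text": {"query": "football", "path": "heroMedia.title"}},
        ]
    }
}
after = move_non_scoring_to_filter(before)
```

The input dict is left unmodified, so the original and rewritten queries can be compared with explain mode.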

Monitoring query performance

Use explain mode to understand how your queries perform:

{
  "query": {
    // your query
  },
  "explain": true
}

Key metrics to review:

Metric	What to look for
`_explain.cost.total`	Overall query cost - lower is better
`_explain.cost.breakdown.fields`	Number of fields being searched - check wildcard expansions
Response time	How long the query takes to execute

See Explain mode for detailed information on query diagnostics.
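
The metrics above can be pulled out of a response and compared against budgets in a test or script. This is a minimal Python sketch under stated assumptions: it assumes the response is a dict with `_explain` as a top-level nested object, and the helper `summarise_explain` and its thresholds are hypothetical.

```python
def summarise_explain(response, max_cost=None, max_fields=None):
    """Extract explain cost metrics and list any budgets that were exceeded."""
    cost = response.get("_explain", {}).get("cost", {})
    total = cost.get("total")
    fields = cost.get("breakdown", {}).get("fields")

    warnings = []
    if max_cost is not None and total is not None and total > max_cost:
        warnings.append(f"total cost {total} exceeds budget {max_cost}")
    if max_fields is not None and fields is not None and fields > max_fields:
        warnings.append(f"{fields} fields searched, budget is {max_fields}")
    return {"total": total, "fields": fields, "warnings": warnings}


# Hypothetical response shape, for illustration only.
response = {"_explain": {"cost": {"total": 120, "breakdown": {"fields": 14}}}}
summary = summarise_explain(response, max_cost=100, max_fields=10)
for warning in summary["warnings"]:
    print(warning)
```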