# Query performance considerations

Content Advanced Search queries can become complex, and certain patterns may lead to performance degradation or increased resource consumption. This document highlights common pitfalls and the best practices that help you avoid them.

# Performance considerations

## Wildcard field expansion

**Consideration:** Wildcards that expand to many fields can significantly impact query performance and resource usage.

**Potential issues:**

* Increased query execution time
* Higher memory consumption
* Possible query rejection if expansion is excessive
* Expanded fields may include unintended data

**Best practices:**

| Practice                                                                      | Reason                                                       |
| ----------------------------------------------------------------------------- | ------------------------------------------------------------ |
| Prefer explicit field lists over wildcards                                    | Predictable performance and clearer intent                   |
| Use wildcards only when the field count is known to be reasonable             | Prevents unexpected expansions                               |
| Test wildcard queries with explain mode to understand field expansion         | Visibility into actual fields being searched                 |
| Consider refactoring data structure if wildcards are required for many fields | Better data modelling can eliminate need for broad wildcards |

**Example: Avoid:**

```json
{
  "text": {
    "query": "championship",
    "path": "content.*" // Unknown expansion count
  }
}
```

**Example: Preferred:**

```json
{
  "text": {
    "query": "championship",
    "path": ["content.text", "content.altText", "heroMedia.title"] // Explicit, predictable
  }
}
```
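If you keep a list of the field paths in your content model, you can estimate how far a wildcard would expand before sending the query. The following is a minimal sketch, not part of the API: `KNOWN_FIELDS`, `expand_path`, `check_expansion`, and the `max_fields` threshold are all hypothetical, and it assumes `*` behaves like a shell-style wildcard, which may differ from how the search engine expands paths.

```python
from fnmatch import fnmatchcase

# Hypothetical list of field paths; replace with the fields in your own schema.
KNOWN_FIELDS = [
    "content.text",
    "content.altText",
    "content.caption",
    "heroMedia.title",
]


def expand_path(pattern: str, known_fields: list) -> list:
    """Return the known fields that a wildcard path pattern would match."""
    return [field for field in known_fields if fnmatchcase(field, pattern)]


def check_expansion(pattern: str, known_fields: list, max_fields: int = 10) -> list:
    """Raise if the pattern expands to more fields than max_fields (an arbitrary limit)."""
    matched = expand_path(pattern, known_fields)
    if len(matched) > max_fields:
        raise ValueError(
            f"{pattern!r} expands to {len(matched)} fields (limit {max_fields})"
        )
    return matched
```

Calling `check_expansion("content.*", KNOWN_FIELDS)` returns the matched fields, which can then be used as an explicit `path` list in place of the wildcard.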

## Query nesting depth

**Consideration:** Deeply nested compound queries can be difficult to understand and maintain, and may impact performance.

**Potential issues:**

* Increased query complexity and execution time
* Harder to debug and maintain
* Possible stack or recursion limits in implementation

**Best practices:**

| Practice                                             | Reason                                                |
| ---------------------------------------------------- | ----------------------------------------------------- |
| Keep nesting shallow (prefer 2-3 levels maximum)     | Easier to understand and maintain                     |
| Flatten nested queries when possible                 | Simpler structure with same logical outcome           |
| Use multiple clauses at the same level               | Takes advantage of Boolean logic without deep nesting |
| Consider breaking complex queries into simpler parts | May be easier to test and optimise individually       |

**Example: Avoid (deeply nested):**

```json
{
  "compound": {
    "must": [
      {
        "compound": {
          "must": [
            {
              "compound": {
                // Too many levels of nesting
              }
            }
          ]
        }
      }
    ]
  }
}
```

**Example: Preferred (flattened):**

```json
{
  "compound": {
    "must": [
      { "text": { ... } },
      { "equal": { ... } },
      { "range": { ... } }
    ],
    "filter": [
      { "equal": { ... } },
      { "range": { ... } }
    ]
  }
}
```
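To check nesting depth automatically, for example in a unit test or code review hook, a small helper can walk the query and count nested `compound` operators. This is an illustrative sketch: `compound_depth` is a hypothetical helper, and it assumes each clause type (`must`, `filter`, and so on) holds a list of operator objects, as in the examples above.

```python
def compound_depth(query: dict) -> int:
    """Count how many `compound` operators are nested inside one another."""
    if not isinstance(query, dict):
        return 0
    deepest = 0
    for operator, body in query.items():
        if operator != "compound" or not isinstance(body, dict):
            continue
        # Look inside every clause list (must, should, filter, mustNot).
        child_depths = [
            compound_depth(clause)
            for clauses in body.values()
            if isinstance(clauses, list)
            for clause in clauses
        ]
        deepest = max(deepest, 1 + max(child_depths, default=0))
    return deepest
```

A query whose depth exceeds the 2-3 levels suggested above is a candidate for flattening.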

## Number of clauses

**Consideration:** Queries with many clauses in a compound operator can become unwieldy and may impact performance.

**Potential issues:**

* Longer execution time as more conditions are evaluated
* Increased memory usage
* Difficult to maintain and debug
* May indicate the need for better data modelling

**Best practices:**

| Practice                                               | Reason                                                  |
| ------------------------------------------------------ | ------------------------------------------------------- |
| Use the `in` operator to match multiple values        | More efficient than many `equal` clauses in `should`    |
| Consider restructuring data if many clauses are needed | Better schema design can simplify queries               |
| Break complex queries into multiple simpler queries    | Easier to test, debug, and optimise                     |
| Group related conditions in separate clause types      | Use `must`, `should`, `filter`, `mustNot` appropriately |

**Example: Avoid (many clauses):**

```json
{
  "compound": {
    "should": [
      { "equal": { "path": "tags", "value": "tag1" } },
      { "equal": { "path": "tags", "value": "tag2" } }
      // Many more clauses...
    ]
  }
}
```

**Example: Preferred (single operator):**

```json
{
  "in": {
    "path": "tags",
    "values": ["tag1", "tag2", "tag3", "tag4", "tag5"]
  }
}
```
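When queries are built programmatically, the rewrite above can be applied before the query is sent. The sketch below is one possible approach, not a documented feature: `collapse_equals` is a hypothetical helper, it assumes `path` is a string, and it changes clause order. Relevance scores may also differ between many `equal` clauses and a single `in`, so compare results before adopting it.

```python
def collapse_equals(should_clauses: list) -> list:
    """Merge `equal` clauses that share a path into one `in` clause per path."""
    values_by_path = {}  # path -> values, in first-seen order
    passthrough = []     # clauses that are not simple `equal` operators
    for clause in should_clauses:
        equal = clause.get("equal") if len(clause) == 1 else None
        if isinstance(equal, dict) and set(equal) == {"path", "value"}:
            values_by_path.setdefault(equal["path"], []).append(equal["value"])
        else:
            passthrough.append(clause)

    merged = []
    for path, values in values_by_path.items():
        if len(values) == 1:
            # A single value gains nothing from `in`; keep it as `equal`.
            merged.append({"equal": {"path": path, "value": values[0]}})
        else:
            merged.append({"in": {"path": path, "values": values}})
    return merged + passthrough
```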

## Large value arrays

**Consideration:** Using the `in` operator with very large arrays of values can impact performance and maintainability.

**Potential issues:**

* Query payload size increases
* More values to evaluate during query execution
* May indicate better data modelling is needed
* Difficult to maintain and test

**Best practices:**

| Practice                                               | Reason                                          |
| ------------------------------------------------------ | ----------------------------------------------- |
| Keep `in` operator value arrays reasonably sized       | Easier to maintain and better performance       |
| Consider data model changes for very large value sets  | e.g., use categories instead of individual tags |
| Use range or prefix queries when applicable            | More efficient for certain types of data        |
| Break into multiple queries if value set is very large | May be more maintainable                        |
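If a large value set cannot be avoided, one option is to split it into several smaller `in` queries and merge the results in the calling code. The following sketch illustrates the idea under stated assumptions: `split_in_queries` is a hypothetical helper, the default of 100 values per query is an arbitrary placeholder rather than a documented limit, and values are assumed to be hashable (strings or numbers).

```python
def split_in_queries(path: str, values: list, max_values: int = 100) -> list:
    """Build one `in` operator per batch of at most max_values unique values."""
    if max_values < 1:
        raise ValueError("max_values must be at least 1")
    unique = list(dict.fromkeys(values))  # drop duplicates, keep original order
    return [
        {"in": {"path": path, "values": unique[start:start + max_values]}}
        for start in range(0, len(unique), max_values)
    ]
```

Each returned operator can be sent as its own query. Whether this is faster than one large query depends on the data, so measure both with production-like volumes.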

## Query complexity

**Consideration:** Complex queries should be monitored to ensure they perform well in production.

**Best practices:**

| Practice                                       | Reason                                                |
| ---------------------------------------------- | ----------------------------------------------------- |
| Use `explain` mode during development          | Understand query cost and performance characteristics |
| Monitor query performance in production        | Identify slow queries early                           |
| Set reasonable timeouts                        | Prevent runaway queries from consuming resources      |
| Test queries with production-like data volumes | Ensures performance is acceptable at scale            |
| Document complex queries                       | Makes maintenance easier                              |
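A simple way to start monitoring is to time each query call in the client and flag slow ones. The sketch below assumes a hypothetical `run_query` callable that sends the query and returns the response; the one-second threshold is a placeholder. It only measures elapsed time, so an actual timeout still needs to be configured on whichever HTTP client you use.

```python
import time


def timed_query(run_query, query: dict, slow_after_s: float = 1.0,
                clock=time.perf_counter):
    """Run a query and return (result, elapsed_seconds, is_slow)."""
    started = clock()
    result = run_query(query)  # run_query is supplied by the caller
    elapsed = clock() - started
    return result, elapsed, elapsed > slow_after_s
```

The `is_slow` flag can feed a log line or metric so that slow queries are noticed early.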

## Combining filters and searches

**Consideration:** Queries often combine scoring search conditions with non-scoring conditions such as date ranges or exact matches. Where each condition is placed in the query can affect performance.

**Best practices:**

| Practice                                       | Reason                                                   |
| ---------------------------------------------- | -------------------------------------------------------- |
| Use filters for non-scoring conditions         | Filters are generally more efficient than search clauses |
| Combine related conditions into single clauses | Reduces overall query complexity                         |

**Example: Avoid:**

```json
{
  "compound": {
    "must": [
      {
        "range": {
          "path": "publishDate",
          "gte": "now-7d"
        }
      },
      {
        "text": {
          "query": "football",
          "path": "heroMedia.title"
        }
      }
    ]
  }
}
```

**Example: Preferred:**

```json
{
  "compound": {
    "must": [
      {
        "text": {
          "query": "football",
          "path": "heroMedia.title"
        }
      }
    ],
    "filter": [
      {
        "range": {
          "path": "publishDate",
          "gte": "now-7d"
        }
      }
    ]
  }
}
```

The date range does not need to contribute to relevance scoring, so isolating it in `filter` allows the search engine to optimise execution.
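For programmatically built queries, the same restructuring can be sketched as a small transformation that moves non-scoring clauses from `must` to `filter`. This is illustrative only: `move_to_filter` is a hypothetical helper, and the set of operators treated as non-scoring is an assumption that you should adjust to your own queries.

```python
# Assumption: these operators are used purely for matching, not for ranking.
NON_SCORING_OPERATORS = {"range", "equal", "in"}


def move_to_filter(query: dict) -> dict:
    """Return a copy of a compound query with non-scoring `must` clauses in `filter`."""
    compound = query.get("compound")
    if not isinstance(compound, dict):
        return query
    keep, moved = [], []
    for clause in compound.get("must", []):
        if len(clause) == 1 and next(iter(clause)) in NON_SCORING_OPERATORS:
            moved.append(clause)
        else:
            keep.append(clause)
    if not moved:
        return query
    rewritten = {key: value for key, value in compound.items() if key != "must"}
    if keep:
        rewritten["must"] = keep
    rewritten["filter"] = list(compound.get("filter", [])) + moved
    return {**query, "compound": rewritten}
```

Applied to the "Avoid" query above, this produces the "Preferred" query: the `text` clause stays in `must` and the `range` clause moves to `filter`.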

# Monitoring query performance

Use explain mode to understand how your queries perform:

```json
{
  "query": {
    // your query
  },
  "explain": true
}
```

Key metrics to review:

| Metric                           | What to look for                                            |
| -------------------------------- | ----------------------------------------------------------- |
| `_explain.cost.total`            | Overall query cost - lower is better                        |
| `_explain.cost.breakdown.fields` | Number of fields being searched - check wildcard expansions |
| Response time                    | How long the query takes to execute                         |
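The metrics in the table can be checked in code, for example in a test that guards against accidental wildcard expansion. The sketch below assumes the dotted metric names map to nested objects in the response body and that the fields metric is a number. `explain_summary` and the `max_fields` threshold are hypothetical, so confirm the actual response shape in the explain mode documentation.

```python
def explain_summary(response: dict, max_fields: int = 10) -> dict:
    """Extract the key explain metrics from a response, if present."""
    cost = response.get("_explain", {}).get("cost", {})
    fields = cost.get("breakdown", {}).get("fields")
    return {
        "total_cost": cost.get("total"),
        "fields_searched": fields,
        # Flag queries that search more fields than the chosen threshold.
        "too_many_fields": fields is not None and fields > max_fields,
    }
```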

See [Explain mode](https://developer.cortextech.io/docs/explain) for detailed information on query diagnostics.