Index Aliases in Elasticsearch and OpenSearch

Aliases are one of the most useful features in Elasticsearch and OpenSearch, and a basic building block of a well-managed cluster. In this post we discuss what aliases are, how to manage them, and the common use cases where they really shine.

Aliases in Elasticsearch and OpenSearch are essentially pointers to indices. They provide an abstraction layer between your client code for indexing and searching operations and the indices in your cluster. Aliases can point to one or more indices and can even be used to segment the data within an index or set of indices.

Let’s first look at how to create and manage aliases, then discuss the common use cases and how aliases can improve the way you work with your cluster. If you prefer, skip ahead to the use cases and come back to the API part later.

Adding or Removing Aliases

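To add or remove an alias for an index, use the aliases API. Each example below includes both an add and a remove action to show the syntax; in practice, include only the actions you need.

Add or remove a single alias for an index: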

POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "test1",
        "alias": "alias1"
      }
    },
    {
      "remove": {
        "index": "test1",
        "alias": "alias1"
      }
    }
  ]
}

Add or remove a single alias for multiple indices:

POST _aliases
{
  "actions": [
    {
      "add": {
        "indices": [ "test1", "test2" ],
        "alias": "alias1"
      }
    },
    {
      "remove": {
        "indices": [ "test1", "test2" ],
        "alias": "alias1"
      }
    }
  ]
}

Add or remove multiple aliases for a single index:

POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "test1",
        "aliases": [ "alias1", "alias2" ]
      }
    },
    {
      "remove": {
        "index": "test1",
        "aliases": [ "alias1", "alias2" ]
      }
    }
  ]
}

All actions in a single _aliases request are applied atomically: either every action takes effect or none does, so clients never see an intermediate state.

Alternatively, you can add aliases when an index is created, either directly in the create index API or through index/component templates:

PUT test1
{
  "aliases": {
    "alias1": {}
  },
  "settings": {},
  "mappings": {}
}
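
For the template route, a minimal sketch using a composable index template could look like the following. The template name orders_template, the pattern orders-* and the alias orders_all are hypothetical names chosen for illustration:

PUT _index_template/orders_template
{
  "index_patterns": [ "orders-*" ],
  "template": {
    "aliases": {
      "orders_all": {}
    }
  }
}

Any index created afterwards with a name matching orders-* would then receive the orders_all alias automatically.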

Lastly, you can add or remove an alias on an existing index with PUT or DELETE on the _alias endpoint:

# Add an alias
PUT test1/_alias/alias1

# Remove an alias
DELETE test1/_alias/alias1

Filtered Aliases

Filtered aliases are a special type of alias that exposes only a subset of the data in your indices by attaching a filter query to the alias definition. The filter accepts the full query DSL, so you can filter by specific terms, date ranges, geo-locations, and so on. The syntax for the filter in the alias definition is the same as for filters in queries:

POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "test1",
        "alias": "client_a_bottoms",
        "filter": {
          "bool": {
            "filter": [
              {
                "term": {
                  "clientId": "ClientA"
                }
              },
              {
                "terms": {
                  "type": [
                    "shorts",
                    "jeans",
                    "chinos",
                    "joggers"
                  ]
                }
              }
            ]
          }
        }
      }
    }
  ]
}

This creates the alias client_a_bottoms, which exposes a subset of the data in the test1 index: all documents for ClientA whose type is one of the listed values.
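
Once the alias exists, it can be queried like any index, and the alias filter is applied on top of whatever query you send. A minimal sketch, assuming the documents also have a text field called name (a hypothetical field, not part of the example above):

GET client_a_bottoms/_search
{
  "query": {
    "match": {
      "name": "slim fit"
    }
  }
}

Only ClientA documents of the listed types that also match the query would be returned.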

Write Index of an Alias

Specifying a write index for an alias is most common in ILM/ISM setups, but it can also be done manually when needed. In fact, if an alias points to multiple indices and you try to index a document through it, the request fails unless a write index has been specified. This is by design: without a designated write index, Elasticsearch cannot know which of the indices behind the alias should receive the document. Only one index per alias can be the write index. You set it by adding the parameter "is_write_index": true when adding the alias to that index.
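
As a rough sketch, the request below points alias1 at two indices and marks test2 as the write index. The names are reused from the earlier examples, and both indices are assumed to exist:

POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "test1",
        "alias": "alias1"
      }
    },
    {
      "add": {
        "index": "test2",
        "alias": "alias1",
        "is_write_index": true
      }
    }
  ]
}

Searches through alias1 then cover both indices, while documents indexed through alias1 go to test2.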

Listing Aliases

There are multiple ways to view the aliases associated with an index or set of indices. One is the get alias API, which can return all aliases in the cluster (GET _alias), specific aliases (GET _alias/alias1,alias2), or the aliases of a target index (GET index1/_alias). All of these forms also accept wildcards.
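
For illustration, wildcard forms of these requests might look like this. The client_a* and test* patterns are only examples and assume matching names exist in your cluster:

# All aliases whose name starts with client_a
GET _alias/client_a*

# Aliases of every index whose name starts with test
GET test*/_alias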

Another option is the _cat API for aliases: GET _cat/aliases. It accepts all of the query parameters available to the _cat API. If you are only interested in the aliases and the indices they point to, you can narrow and sort the output like so: GET _cat/aliases?s=alias,index&h=alias,index. This API also accepts a target alias, including wildcards, to view a group of aliases with similar names (e.g. GET _cat/aliases/client_a*).

Common Use Cases for Aliases

Aliases give Elasticsearch administrators and developers a lot of power. They decouple cluster management from the applications that run against the cluster. With aliases, the index names that applications use for reading and writing data do not have to match the names of the indices actually present on the cluster, which makes maintenance operations easier and can often improve performance and the overall experience. Here is how:

Separating Reads and Writes

Depending on the use case, it can be helpful to separate reads from writes on an index using aliases. One example is a nightly batch job that indexes data from a database into a new index, separate from the one currently being queried. Once validations on the copied data pass, the read alias is switched from the current index to the new index created by the batch job. Because client code queries the alias, it never needs to know which index the data is served from.
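
A possible shape for that nightly switch is sketched below. The names products_read, products_v1 and products_v2 are hypothetical, and the _count call merely stands in for whatever validation your batch job performs:

# Sanity check the freshly built index before exposing it
GET products_v2/_count

# Move the read alias from the old index to the new one in a single request
POST _aliases
{
  "actions": [
    {
      "remove": {
        "index": "products_v1",
        "alias": "products_read"
      }
    },
    {
      "add": {
        "index": "products_v2",
        "alias": "products_read"
      }
    }
  ]
}

Since both actions travel in the same request, searches against products_read are served by either the old index or the new one, never by both or by neither.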

Multi-tenancy

In multi-tenant setups (storing data for multiple users or clients within the same index or cluster), aliases, and filtered aliases in particular, can play a big role. If you store data for multiple clients within the same index, you can use filtered aliases to segment the data so that a query sent through a client's alias only returns that client's documents. For example, you can define one filtered alias per client based on the clientId field, then target those aliases when generating your queries.
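
As a sketch of that setup, the request below defines one filtered alias per client on a shared index. The index name orders and the alias names client_a and client_b are hypothetical, and clientId is assumed to be mapped as a keyword field:

POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "orders",
        "alias": "client_a",
        "filter": {
          "term": {
            "clientId": "ClientA"
          }
        }
      }
    },
    {
      "add": {
        "index": "orders",
        "alias": "client_b",
        "filter": {
          "term": {
            "clientId": "ClientB"
          }
        }
      }
    }
  ]
}

Application code serving ClientA would then search client_a rather than orders, so the clientId filter does not have to be repeated in every query.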

ILM and ISM

Aliases play a big role in Index Lifecycle Management (or Index State Management for OpenSearch). Whenever a rollover event occurs, a new index is created and the alias that was used for the original index is pointed to the new index as well. The new index is designated as the write index (meaning new documents are written to it instead of the original index), while the original index can be marked read-only or remain writable depending on your use case. All of this happens automatically as part of the rollover process. Aliases make the process transparent to end users, who keep querying and indexing with the same alias name while the underlying index layout stays hidden from them.
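
To make the mechanics concrete, here is a minimal manual version of what ILM/ISM automates. The names events-000001 and events_write and the condition values are placeholders chosen for illustration:

# Bootstrap the first index and mark it as the write index for the alias
PUT events-000001
{
  "aliases": {
    "events_write": {
      "is_write_index": true
    }
  }
}

# Roll over if at least one of the conditions is met
POST events_write/_rollover
{
  "conditions": {
    "max_age": "7d",
    "max_docs": 1000000
  }
}

If a condition is met, a new index named events-000002 is created and becomes the write index for events_write, while events-000001 stays reachable through the alias for searches.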

Alias Limitations and Additional Notes

One of the big limitations of aliases is that an alias cannot point to another alias. This was likely a design choice by the Elasticsearch engineers to avoid overcomplicating the feature. Pointers to pointers would bring their own set of design challenges and could easily become a cluster management nightmare.

When deleting an index, aliases that exclusively pointed to that index automatically get deleted as well. If the alias pointed to the deleted index and other indices that still exist, the alias will not get deleted but it will no longer point to the non-existent index. If you recreate an index with the same name as the deleted index, the alias that previously pointed to that index will not automatically point to that index again (unless you have a component/index template that sets the alias automatically).

Conclusion

Aliases are a powerful feature that virtually every Elasticsearch or OpenSearch engineer should use. The abstraction they provide reduces both the complexity of cluster management and the need for client code changes. Need help architecting the best approach to using aliases within your cluster(s)? Reach out to us!