Two years after the fork from Elasticsearch and the initial release of OpenSearch, here is our detailed and updated OpenSearch vs Elasticsearch comparison.
Over the last 13 years, the team and I have been working closely with Elasticsearch teams at both Elastic and Amazon. I’ve also closely followed the events that took place in 2020 and 2021. I didn't voice an opinion about that argument back then because we work closely with both teams and have great appreciation for both. To be honest, that competition is actually good for our customers. Now, two years after the events that led to the birth of OpenSearch, it’s a good time for a retrospective. Let’s review both technologies and their trajectory.
Today, both Elasticsearch and OpenSearch are widely used for indexing and searching large volumes of data in real-time. Together, they are the industry standard for search engines, log analytics, and real-time BI. However, they have some significant differences that set them apart.
In this blog, we'll take a closer look at OpenSearch vs. Elasticsearch and help you decide which one is right for your needs.
What is Elasticsearch?
Elasticsearch is a popular search engine based on the Apache Lucene project (which is also the parent project of Apache Solr), and it has been used by many since 2010 for search and log analytics. It is designed to be highly scalable and can be used for a wide range of applications, from simple search functionality to complex data analytics. Elasticsearch and Kibana, as part of the Elastic Stack, both have a robust set of features, including support for full-text search, enterprise search, real-time analytics, and geospatial queries. It used to be fully open-source under the Apache License, until that changed in early 2021, when competitor Amazon embarked on creating its own project. Elasticsearch is often deployed self-managed, or on Elastic Cloud.
What is OpenSearch?
The OpenSearch search engine is a fork of Elasticsearch maintained by Amazon since January 2021. The two share essentially the same codebase up to the fork event, which is also when the projects started to slightly diverge. One of the key features of OpenSearch is its focus on transparency and community-driven development. Unlike Elasticsearch, which is owned by Elastic NV, OpenSearch is governed by a community-driven foundation. This means that anyone can contribute to the development of OpenSearch. While the codebase of both software products is open for inspection by anyone who wants to review it, it is easier to contribute code to OpenSearch and influence its direction than it is for Elasticsearch. It is often used as part of the Amazon OpenSearch Service (previously known as the Amazon Elasticsearch Service).
When comparing the two, we will review the codebase, as a signal of traction and development effort, and the features, so you can choose which one is a better fit for your needs.
OpenSearch vs Elasticsearch: Codebase and Releases
The OpenSearch project forked the Elasticsearch codebase when version 7.10.2 was the latest release, and then significant work occurred on the OpenSearch codebase to rename the project and remove all non-Apache-licensed code (namely, X-Pack functionality). To properly compare the work done on both, we counted commits on the master/main branch made since April 22nd 2021, which marked the first release candidate of OpenSearch following the fork a few months earlier.
The Elasticsearch repository had nearly 20k commits. About 6k of them were made to core Elasticsearch (the "server" folder), and a few more to satellite modules:
# total commits in repo since fork
➜ elasticsearch git:(master) git log --oneline --all --since='Apr 22 2021' | wc -l
19527
# total commits to the main codebase (server folder) since fork
➜ elasticsearch git:(master) git log --oneline --all --since='Apr 22 2021' -- server/ | wc -l
6130
# total commits to main modules (various surrounding functionality not under x-pack) since fork
# https://github.com/elastic/elasticsearch/tree/main/modules
➜ elasticsearch git:(master) git log --oneline --all --since='Apr 22 2021' -- modules/ | wc -l
1437
# just as means of comparison, the amount of work made on x-pack features is not negligible
➜ elasticsearch git:(master) git log --oneline --all --since='Apr 22 2021' -- x-pack/ | wc -l
7294
OpenSearch saw roughly 3 times fewer commits on core, and roughly 3 times fewer on important modules, which include, for example, the scripting languages, reindexing features, ingestion pipeline processors, and more:
➜ OpenSearch git:(main) git log --oneline --all --since='Apr 22 2021' | wc -l
3727
➜ OpenSearch git:(main) git log --oneline --all --since='Apr 22 2021' -- server/ | wc -l
1966
# total commits to main modules (surrounding functionality not under x-pack) since fork
# https://github.com/opensearch-project/OpenSearch/tree/main/modules
➜ OpenSearch git:(main) git log --oneline --all --since='Apr 22 2021' -- modules/ | wc -l
470
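For readers who want to check the scale of the gap, the short sketch below divides the Elasticsearch counts by the OpenSearch counts. The numbers are copied from the git output above; the `ratio` helper and the variable names are only illustrative.

```shell
# Commit counts copied from the git log output above (collected April 2023).
es_total=19527; es_server=6130; es_modules=1437
os_total=3727;  os_server=1966; os_modules=470

# ratio A B: print A / B rounded to one decimal place (illustrative helper).
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

echo "whole repo: $(ratio "$es_total" "$os_total")x"      # 5.2x
echo "server/:    $(ratio "$es_server" "$os_server")x"    # 3.1x
echo "modules/:   $(ratio "$es_modules" "$os_modules")x"  # 3.1x
```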
Consequently, OpenSearch has had fewer releases (major and minor) than Elasticsearch.
While the number of commits is not direct evidence of code quality or software performance, it is quite clear that the Elasticsearch project is seeing more work on core, which in all likelihood translates to better performance, more features, keeping up with the latest versions of dependencies and Lucene features, and so on - especially when the difference is on that scale.
The aforementioned stats were collected in April 2023.
OpenSearch vs Elasticsearch: Features Comparison
All basic functionality of search, analytics and dashboards is exactly the same between the two technologies. After all, OpenSearch was forked from a very mature version of Elasticsearch. For the standard use-cases, from the features perspective it doesn't matter which one you pick.
The difference in features between the projects will be for anything that was under Elastic’s X-Pack (free or paid), and all features that have been added after the fork.
The important features that go a little beyond the basics either already exist in both or eventually will. Prime examples include:
- Data Streams API is implemented by both (although Elasticsearch just released time-series data streams that are not in OpenSearch - yet)
- Index Lifecycle Management becomes Index State Management in OpenSearch
- Both have some support for alerting (although we actually recommend going with ElastAlert2 and not any built-in alerting solution).
- Cross-cluster replication is supported by both, but in Elasticsearch it's a Premium tier feature (not free).
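As a rough illustration of how Index Lifecycle Management maps to Index State Management, the sketch below expresses the same idea (roll over after 7 days, delete after 30 days) as a policy body for each engine. The policy name `logs-policy` and the thresholds are hypothetical, and the requests are only printed rather than sent, so the script runs without a cluster. Treat it as a starting point and check the policy schema against the documentation for your version.

```shell
POLICY_NAME="logs-policy"   # hypothetical policy name

# Elasticsearch ILM: a policy is organised into phases (hot, delete, ...).
ilm_policy() {
  cat <<'EOF'
{
  "policy": {
    "phases": {
      "hot":    { "actions": { "rollover": { "max_age": "7d" } } },
      "delete": { "min_age": "30d", "actions": { "delete": {} } }
    }
  }
}
EOF
}

# OpenSearch ISM: a policy is a state machine with explicit transitions.
ism_policy() {
  cat <<'EOF'
{
  "policy": {
    "description": "Roll over after 7 days, delete after 30 days",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [ { "rollover": { "min_index_age": "7d" } } ],
        "transitions": [
          { "state_name": "delete", "conditions": { "min_index_age": "30d" } }
        ]
      },
      { "name": "delete", "actions": [ { "delete": {} } ], "transitions": [] }
    ]
  }
}
EOF
}

# Print the requests instead of sending them.
echo "PUT _ilm/policy/$POLICY_NAME";           ilm_policy
echo "PUT _plugins/_ism/policies/$POLICY_NAME"; ism_policy
```

The structural difference is the main thing to notice when migrating: ILM phases are implicit and ordered, while ISM states and the transitions between them are spelled out.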
At the moment of writing this piece, some niche core features are still unique to Elasticsearch, such as geoshape and geohex grid aggregations.
Some OpenSearch features are only available on the managed Amazon OpenSearch Service. On the other hand, unlike Elastic Cloud, which is always up to date with the latest Elasticsearch version, the managed OpenSearch offering by Amazon is usually 2-3 versions behind.
Here are some highlights of the major differences between the two, which mostly revolve around the Elastic Stack solutions and X-Pack features.
Security
All of Elasticsearch’s built-in security features are part of the X-Pack Basic license, and those are limited to an Elasticsearch-based user directory. To authenticate with LDAP, OpenID, SAML and so on, a higher, paid license is required. The same goes for other security features such as IP filtering, document and field level security, and more.
OpenSearch’s Security module is provided entirely for free, without higher tiers, and it does provide pretty much everything you’d need: Active Directory and LDAP, SAML, OpenID, Access Control features including masking and field-level security, audit logs, and more.
Searchable Snapshots
The ability to create an “offline” search experience, thus significantly reducing the amount of hardware required to run Elasticsearch clusters with older, less frequently accessed data, is a true game changer for many use-cases.
Elasticsearch has had this feature implemented and in very wide use for a while now, while OpenSearch has only recently released it and it’s still marked as experimental. However, and very importantly - Elasticsearch requires a paid license on a high tier (Enterprise) to make use of this feature, while in OpenSearch Searchable Snapshots is a completely free feature.
This feature is also provided by the managed services, where it is known as “Searchable Snapshots” or “search on frozen tier” in Elastic Cloud, and “UltraWarm” on Amazon OpenSearch Service.
Machine Learning
Our advice is not to run Machine Learning and AI workloads on Elasticsearch or OpenSearch, simply because they are not purpose-built for it. It's sometimes handy to have, sure, but it doesn't come without a price tag.
Elasticsearch and OpenSearch should be thought of as serving layer engines. You should prepare data to be served easily from them with or without ML involved. As an example, you can use the vector fields (dense or sparse vectors) and use kNN / ANN algorithms to find similar documents via Vector Search.
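To make the vector search example concrete, here is a minimal sketch of an approximate kNN request body for each engine. It assumes a recent Elasticsearch 8.x release (top-level `knn` search option on a `dense_vector` field) and an OpenSearch cluster with the k-NN plugin, an index created with `index.knn: true`, and a `knn_vector` field. The index name `products`, the field name `embedding`, and the 3-dimensional vector are hypothetical, and the requests are printed rather than sent.

```shell
INDEX="products"   # hypothetical index name

# Elasticsearch 8.x: top-level "knn" option, field mapped as dense_vector.
es_knn_body() {
  cat <<'EOF'
{
  "knn": {
    "field": "embedding",
    "query_vector": [0.12, -0.45, 0.91],
    "k": 10,
    "num_candidates": 100
  }
}
EOF
}

# OpenSearch k-NN plugin: "knn" query clause, field mapped as knn_vector.
os_knn_body() {
  cat <<'EOF'
{
  "size": 10,
  "query": {
    "knn": {
      "embedding": {
        "vector": [0.12, -0.45, 0.91],
        "k": 10
      }
    }
  }
}
EOF
}

# Print the requests instead of sending them, so this runs without a cluster.
echo "POST $INDEX/_search  (Elasticsearch)"; es_knn_body
echo "POST $INDEX/_search  (OpenSearch)";    os_knn_body
```

In both cases the embeddings themselves are computed outside the cluster and indexed alongside the documents, which fits the serving-layer approach described above.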
Another approach is to use rescoring approaches, like the LTR plugin does, to improve scoring capabilities.
Both Elasticsearch and OpenSearch offer built-in solutions (or “apps”) for Machine Learning workloads and use-cases, that in some cases might come in handy (e.g. built-in SIEM in the Elastic Stack) but in our opinion - not for a general, widespread use.
The Data Prepper technology, which is part of the OpenSearch project, is meant to address that need.
Alternatively, there are dedicated connectors ready for various data streaming technologies, such as Kafka Connect for Kafka, Flink sink to be used with various sources, and so on.
Licensing and Restrictions
Of course we cannot have a post comparing the two without discussing the elephant in the room, which is the licensing model. Elasticsearch was previously released under the Apache license, which is a very permissive license. This is also the current license of OpenSearch - but Elasticsearch is now released under a different, less permissive license, that many consider to not be an open-source license.
Neither I nor the team are lawyers - we prefer remaining highly technical and this is the real value we are able to provide. But more often than not we are asked if doing X is legit or will be in violation of Elastic’s license.
The gist is that the new license forbids serving Elasticsearch APIs as a managed service. If you just use Elasticsearch as the backend for your application - you are good to go. But there is a lot of grey area, such as embedding Elasticsearch as part of a larger solution that is sold as one piece, exposing some APIs that could be seen as Elasticsearch APIs (e.g. search via API), and so on. A lot of our customers look for zero risk, and especially if they don’t need anything special from Elasticsearch - they opt to use OpenSearch and use its basic features and then some.
Support
OpenSearch is an open-source project and doesn’t offer support. Managed services for OpenSearch, such as Amazon OpenSearch Service, Aiven, and others, will take responsibility for running the hardware and software for you, but not for how you make use of it.
Elastic, the company behind Elasticsearch, does offer support via its standard subscription licenses, or on its managed offering on Elastic Cloud. But again - that support will be limited and will not always provide the best tailored advice on how to use the technology to best suit your needs.
For cases when the documentation is not enough, and when you need a true expert who can serve as your trusted advisor - we have established our name as world-leaders in providing support for Elasticsearch and OpenSearch alike. Aside from consulting and migration services, we also provide 24/7 production support to help with urgent matters, and just keep your clusters healthy at all times.
Also check out our Pulse service - an automated consultant solution for proactive monitoring and support.
To easily summarize this OpenSearch vs Elasticsearch comparison - as long as you don't serve Elasticsearch directly to customers or fall in the legal gray area of doing so, you are safe to use both Elasticsearch and OpenSearch.
For all basic and mainstream use-cases there is really no difference between Elasticsearch and OpenSearch. Those use-cases include text search, log analytics, dashboards, and so on. Both technologies will serve exactly the same purpose.
Elasticsearch will most likely be easier to integrate from anywhere due to the extensive client libraries support, and will also catch up on bugs and issues faster thanks to the very active development team.
OpenSearch on the other hand will most likely be cheaper to operate, most definitely so if you are looking for something that is beyond just basic functionality, such as a full-fledged SIEM. The Elastic Stack implementation of those solutions will most likely be much more mature, but they will also come at a significant price tag.
For self-managed - those would probably be your deciding factors. If you’re looking for a managed solution, there are many more options out there for OpenSearch, for obvious reasons.
Regardless of your choice, our team is available to provide 24/7 support, consulting services, and hands-on development and migration services. Reach out to us to discuss further.