Leaving the default node role configuration in place is highly discouraged for production clusters. Deploying dedicated master nodes can help prevent cluster failure and improve overall cluster health and stability.
By default, nodes in Elasticsearch & OpenSearch clusters have multiple roles. This is not an issue in smaller clusters that do not have a lot of load (and in fact works quite well), but it can quickly become a problem once your cluster expands and the load increases.
In this post we will discuss some reasons why you should consider changing these default node assignments and deploying dedicated master nodes.
A node can have many roles (including combinations of roles), but for the purposes of this article we will focus on two: data and master. You can think of a node role as a set of tasks the node is responsible for carrying out.
Data nodes are, as the name suggests, nodes that hold data. They act as the workhorse of your cluster by housing your data and performing indexing and search operations on it. They typically require a lot of processing power, memory, and fast storage IOPS (excluding cold/frozen tier data nodes).
Master nodes, on the other hand, have a much more leisurely role. They have nothing to do with search queries or data ingestion. Master nodes are in charge of one thing, and that is managing the cluster. They control things like shard allocation, tracking the other nodes within the cluster, creating and deleting indices, etc.
By default, the nodes that are created in Elasticsearch are both data nodes and master-eligible nodes (meaning any one of them can be elected master). As mentioned above, this is not a problem for smaller clusters that do not have a large workload. The data nodes aren’t being worked very hard, so any one of them can easily handle the workload of managing the cluster. But what happens when the data nodes which are also master-eligible nodes do become overloaded?
Remember, your master node is in charge of managing your cluster. If a data node is overloaded, you may simply fail to run a query or ingest data on that node. If the master node is overloaded, your entire cluster will cease to function well. The role of the master node is not very intensive, but when the data and master roles are combined on the same node, the resource-intensive data operations can easily crowd out the master's responsibilities.
Consider the example of a retail store manager who also works as a cashier, stocks the shelves, and cleans the floors. How well do you think this store manager can actually manage while also taking on these other responsibilities? Now consider the store manager who sits behind their desk all day. Which manager do you think will be better able to create schedules, handle customer complaints, handle inventory counts, and carry out other managerial responsibilities? The same holds true for your Elasticsearch/OpenSearch cluster. A dedicated master node (or set of dedicated master-eligible nodes) can handle the cluster management responsibilities with much more ease and reliability than a combined data/master node.
To create dedicated master-eligible nodes, update the node.roles property in the elasticsearch.yml configuration file (opensearch.yml in OpenSearch) on each node, then restart the node.
node.roles: [ master ]
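For the dedicated masters to be truly dedicated, the remaining nodes would normally give up the master role. The following is a minimal sketch, assuming those nodes should hold data and run ingest pipelines. The exact role list is an assumption and depends on what your cluster needs.

```yaml
# elasticsearch.yml on a data node (illustrative role list, adjust to your cluster)
# Leaving "master" out of the list means this node is not master-eligible.
node.roles: [ data, ingest ]
```

As with the master-eligible nodes, each node needs a restart for the change to take effect.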
The syntax for older versions of Elasticsearch (early 7.x and before) is a bit different:
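A minimal sketch of that legacy form, which uses one boolean setting per role instead of a node.roles list. It assumes a version that predates node.roles. Depending on your version and distribution, additional role settings (machine learning, for example) may also need to be disabled, so check the documentation for your release.

```yaml
# elasticsearch.yml on a dedicated master-eligible node (legacy boolean settings)
node.master: true    # eligible to be elected master
node.data: false     # holds no shards
node.ingest: false   # runs no ingest pipelines
```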
In Elastic Cloud, dedicated master-eligible nodes are added by default once your cluster grows to a minimum of 6 nodes across all zones. Amazon OpenSearch Service lets you select the number of dedicated master nodes when configuring your domain, and AWS recommends three for each production domain.
When sizing your dedicated master-eligible nodes, take into consideration that they perform a much smaller workload than the data nodes. The CPU and memory requirements for a master node are only a fraction of what you would use for a data node. OpenSearch uses the example of an m3.2xlarge instance type for data nodes compared to an m3.medium instance type for master nodes.
Leaving the default node roles, where your nodes act as a jack of all trades, is no issue for smaller clusters with small amounts of load. It can also be acceptable in lower environments that are not mission critical. But when it comes to a production environment with a medium to large workload, having dedicated master nodes is the obvious choice. By deploying dedicated master-eligible nodes on all production clusters, you make sure that cluster management is never competing with indexing and search for resources, which makes it far more likely that a healthy master is always available to manage your cluster.