Exploring Elasticsearch Hot Pods on NVMe-Based I3 Instances in Kubernetes

Managing stateful workloads in Kubernetes can be a formidable challenge. This article stems from an experience in which we deployed Elasticsearch pods on the NVMe-based local SSDs of I3 machines, a task significantly more demanding than the relatively straightforward process of running Elasticsearch pods on EBS (Elastic Block Store). Having resolved it, I decided to document what we learned. The challenge primarily revolves around the persistent problem of ephemeral disk locality in our Kubernetes cluster, an issue that had impeded our efforts to scale our infrastructure and boost operational efficiency. Despite extensive research, I found a noticeable lack of comprehensive technical guides addressing local storage provisioning in Kubernetes clusters.

The Problem We Faced

We needed to deploy Elasticsearch hot pods on NVMe-based I3 instances, a setup that demands exceptional storage performance because of Elasticsearch's read- and write-heavy nature. We had three main requirements:

  1. Data Persistence: Local data had to survive pod restarts; the data needed to stay attached to the pod to maintain data integrity.
  2. Scalability: Our future plans included running multiple hot pods on the same machine. To achieve this, we needed a solution that could bifurcate the disk into multiple volumes and serve each pod's request from one of those underlying volumes.
  3. Performance: Given Elasticsearch's read- and write-intensive nature, performance was a paramount requirement for the storage solution. To meet it, we decided to use the native NVMe storage of the I3 machines rather than EBS, a decision that in turn posed several challenges.

Solutions Evaluated

We explored three different storage provisioner options in Kubernetes to address our requirements. Here are the solutions we considered:

1. Using HostPath:

HostPath is a relatively straightforward solution for local storage in Kubernetes. It lets a pod use the local file system of the node it is running on as the storage backend, giving the container direct access to the host's file system.

The hostPath volume type is also single-node only, meaning that a pod on one node cannot access a hostPath volume on another node. One way to get around this limitation is to pin pods to the node(s) that hold the data, either by using a StatefulSet or DaemonSet, or by forcing a Deployment's pods onto a specific node via nodeAffinity / nodeSelector (see the pinned-pod sketch after the example below).

hostPath volumes also offer no volume-limit mechanism, so if one pod takes up a large amount of disk, other pods on the same node are impacted as well.

apiVersion: v1
kind: Pod
metadata:
  name: elasticsearch-pod
spec:
  volumes:
    - name: elasticsearch-data
      hostPath:
        path: /var/data/elasticsearch
  containers:
    - name: elasticsearch-container
      image: elasticsearch:7.9.3
      volumeMounts:
        - name: elasticsearch-data
          mountPath: /usr/share/elasticsearch/data
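
To work around the single-node limitation described above, the pod (or a Deployment's pod template) can be pinned to the node that holds the hostPath data. A minimal sketch of the nodeSelector approach, assuming the node has been given a hypothetical disktype=nvme label:

apiVersion: v1
kind: Pod
metadata:
  name: elasticsearch-pod-pinned
spec:
  # Pin the pod to nodes carrying the (hypothetical) label, e.g. applied with:
  #   kubectl label node <node-name> disktype=nvme
  nodeSelector:
    disktype: nvme
  volumes:
    - name: elasticsearch-data
      hostPath:
        path: /var/data/elasticsearch
        type: DirectoryOrCreate   # create the directory on the host if it is missing
  containers:
    - name: elasticsearch-container
      image: elasticsearch:7.9.3
      volumeMounts:
        - name: elasticsearch-data
          mountPath: /usr/share/elasticsearch/data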

Pros:

  • Relatively easy to set up

Cons:

  • Single-node only; pods must be pinned to the node that holds the data (via a DaemonSet / StatefulSet or nodeAffinity / nodeSelector).
  • No volume-limit mechanism: a pod that fills the disk impacts every other pod on the same node.
  • The scheduler is not storage-aware, so node pinning has to be managed manually.

2. Kubernetes Local Storage Provisioner:

The Kubernetes Local Storage Provisioner helps manage local storage on the nodes where your applications run. Here, local storage means the disks, partitions, or directories directly attached to the physical servers in your Kubernetes cluster.

Here’s what it does:

  1. Local Volumes: It deals with storage devices such as disks, partitions, or directories that are physically present on the machine where Kubernetes is running. These local volumes have to be set up in advance and exposed via PV / PVC; they cannot be provisioned on the fly (a sample PersistentVolume is shown after the StatefulSet example below).
  2. Node-Aware: Unlike some other ways of using local storage, this system knows which node in the cluster actually has the storage and takes that into account when deciding where to run your applications. For example, if a node has a 50 GB PV and a pod has a PVC claim of 40 GB, the Kubernetes control plane will schedule the pod onto the node that holds the 50 GB PV.
  3. Node Locking: The significant advantage here is that Kubernetes ensures that an application using this local storage always runs on the machine where that storage is available. This helps prevent data loss because the application is never moved to a different machine at random, and it happens because the claim is bound to the Persistent Volume on that node.

In the diagram above, we can see the following:

  • StatefulSet 1 uses an underlying partition for its storage needs.
  • Hence, after restarts, its pods remain on the same node.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
spec:
  serviceName: "elasticsearch-service"
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: elasticsearch:7.9.3
        volumeMounts:
        - name: elasticsearch-data
          mountPath: /usr/share/elasticsearch/data
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "local-storage"
      resources:
        requests:
          storage: 30Gi
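
Because the storage class above uses kubernetes.io/no-provisioner, a matching local PersistentVolume must already exist on the node before each claim can bind. A minimal sketch of such a PV, where the capacity, device path, and node name are illustrative assumptions:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-node-1
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    # Filesystem path backed by the node's NVMe disk (illustrative)
    path: /mnt/nvme0
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-1   # the node that physically holds the disk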

Pros:

  • Supports creating PVCs on local ephemeral storage.
  • Provides basic deprovisioning and disk space management.
  • Open source and relatively easy to manage.

Cons:

  • May not offer the level of control required for bifurcating volumes and fulfilling specific pod requests.
  • Capacity management (e.g. carving a disk into multiple right-sized volumes) is not baked in.

3. OpenEBS with Local LVM Provisioner:

OpenEBS is an open-source project that offers more advanced storage management capabilities and is designed for scenarios like ours, which demand data persistence, scalability, and high performance. OpenEBS lets you create storage classes that provision local volumes on nodes with local SSDs; with its Local LVM engine, each claim gets its own logical volume carved out of a volume group on the node, which provides volume bifurcation and room for performance fine-tuning. While it offers powerful capabilities, it does introduce added complexity in setup and configuration.

As one can see in the diagram:

  • StatefulSets can be assigned to different block devices or different partitions.
  • On restarts, StatefulSets are assigned to the same node due to their existing PV claims.

# Install OpenEBS with the Local LVM (lvm-localpv) engine via Helm or other methods,
# and create an LVM volume group on the node's NVMe disk(s) for it to carve volumes from.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-local-lvm
provisioner: local.csi.openebs.io
parameters:
  storage: "lvm"
  volgroup: "nvme-vg"   # name of the volume group on the node (illustrative)
volumeBindingMode: WaitForFirstConsumer

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
spec:
  serviceName: "elasticsearch-service"
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: elasticsearch:7.9.3
        volumeMounts:
        - name: elasticsearch-data
          mountPath: /usr/share/elasticsearch/data
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: openebs-local-lvm
      resources:
        requests:
          storage: 30Gi
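
Because each claim is carved out of the node's LVM volume group as its own logical volume, several hot pods on the same machine can each get an isolated, size-limited volume, which is exactly the bifurcation described in our requirements. A minimal sketch of an additional claim against the same storage class (the claim name and size are illustrative):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: elasticsearch-hot-2-data   # claim for a second hot pod on the same node (illustrative)
spec:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: openebs-local-lvm
  resources:
    requests:
      storage: 30Gi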

Pros:

  • Offers advanced storage management capabilities.
  • Allows for bifurcating volumes and fulfilling specific pod requests.
  • Open source and manageable.

Cons:

  • Introduces more moving parts (controller and node components) than the other solutions, which adds complexity.
  • Requires more setup and configuration.

Evaluation Against Requirements

Let’s evaluate these solutions against our specific requirements:

| Requirement | HostPath | Kubernetes Local Storage Provisioner | OpenEBS with Local LVM |
| --- | --- | --- | --- |
| Support for Creating PVCs on Local Ephemeral Storage | Can be achieved | Yes | Yes |
| Disk Space Management | Limited | Basic management | Strong management |
| Disk Performance | Same as the disk performance on the node | Same as the disk performance on the node | Same as the disk performance on the node |
| Open Source | Open source | Open source | Open source |
| Easy to Manage | Yes | Yes | Somewhat difficult |

Conclusion

Choosing the right storage provisioner for your Elasticsearch hot pods on NVMe-based I3 instances is a critical decision. The choice depends on your specific needs and trade-offs between complexity and control.

  • HostPath: Simple but limited scalability and volume management.
  • Kubernetes Local Storage Provisioner: Basic support with some limitations.
  • OpenEBS with Local LVM: Offers advanced control, scalability, and performance, but requires more configuration.

Considering our requirements, OpenEBS with Local LVM came out on top. It provides the flexibility and control needed for our use case, including the ability to bifurcate volumes and fulfill specific pod requests.
