
Elastic cloud clusters - Advanced setup

This topic discusses advanced scaling and configuration strategies for elastic cloud clusters to optimize workloads. These approaches involve configuring infrastructure and Kubernetes tooling outside the Boomi application, so familiarity with the tools in your environment is expected. Your environment and workload requirements will vary; treat these strategies as general leading practices to be adapted to your specific needs.

Placeholder pods for node pre-warming

Why use placeholder pods?

Placeholder pods serve as "dummy" workloads that pre-warm nodes and get evicted when higher-priority workloads need resources. They ensure nodes are ready for immediate scheduling of execution pods.

Benefits

Faster pod scheduling

  • Nodes are pre-warmed and ready for immediate workload placement
  • Eliminates cold start delays for execution pods

Resource optimization

  • Placeholder pods consume minimal resources until evicted
  • Automatic eviction when real workloads need resources
  • Maintains node availability without waste

Quality of Service (QoS) classes

Kubernetes assigns QoS classes to pods based on their resource specifications, which determines scheduling priority and eviction order.

For more information, refer to the Pod Quality of Service Classes topic on the Kubernetes website.

  • Guaranteed
    • Requirements: CPU and memory requests equal limits for all containers
    • Priority: Highest - never evicted unless exceeding limits
  • Burstable
    • Requirements: At least one container has CPU or memory request/limit set
    • Priority: Medium - evicted when node pressure occurs
  • BestEffort
    • Requirements: No CPU or memory requests or limits specified
    • Priority: Lowest - first to be evicted under resource pressure
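
To make the mapping concrete, the following sketch shows how a container's resources stanza determines its class (the CPU and memory values are placeholders):

# Guaranteed: requests equal limits for every container
resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"

# Burstable: at least one request or limit is set, and requests do not equal limits
resources:
  requests:
    cpu: "250m"
    memory: "256Mi"

# BestEffort: the resources stanza is omitted entirely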

Implementation

Placeholder pods are created with the following principles in mind: no resource requests or limits to ensure BestEffort QoS class, minimal resource usage through simple sleep containers, and proper node affinity to target specific node pools for pre-warming. This design ensures they are evicted first when resources are needed by the higher-priority execution workloads.

Create placeholder pods with no resource specifications:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: forkedexecution-placeholder
spec:
  # Adjust replica count based on expected workload patterns
  replicas: 10
  selector:
    matchLabels:
      app: forkedexecution-placeholder
  template:
    metadata:
      labels:
        app: forkedexecution-placeholder
    spec:
      # Update nodeSelector to match your cluster's labeling scheme
      nodeSelector:
        nodepool: forkedexecution
      tolerations:
        - key: "forkedexecution"
          operator: "Exists"
          effect: "NoSchedule"
      containers:
        - name: placeholder
          image: busybox:latest
          command: ["sleep", "3600"]
          # No resources specified = BestEffort QoS class

A similar placeholder deployment can be created for workerpool nodes by updating the nodeSelector and tolerations to target the workerpool taint and appropriate node labels.
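
For example, the changed portion of the pod template might look like the following, assuming your workerpool nodes carry a nodepool: workerpool label:

    spec:
      # Target the workerpool nodes instead of forkedexecution
      nodeSelector:
        nodepool: workerpool
      tolerations:
        - key: "workerpool"
          operator: "Exists"
          effect: "NoSchedule"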

For this placeholder pod strategy to work effectively, execution pods must have a higher priority than the BestEffort placeholder pods. Ensure that execution pods have resource requests specified, which automatically assigns them a higher QoS class (Burstable or Guaranteed).

Without resource requests, the execution pods would also be BestEffort and would compete equally with the placeholder pods, defeating the purpose of the pre-warming strategy.
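
For example, a minimal resources stanza like the following on the execution containers is enough to place them in the Burstable class (sizing is workload-specific; these values are placeholders):

resources:
  requests:
    cpu: "1"
    memory: "2Gi"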

Workload separation/isolation

Configuring compute nodes to achieve workload isolation - specifically, separating execution worker processes from forked execution processes while maintaining cost-effective autoscaling - lets you create different instance sizes optimized for specific workloads.

Core concepts

Taints

Taints are properties applied to compute nodes that repel workloads unless those workloads have matching tolerations. A taint consists of three components:

  • Key: Identifies the taint (e.g., forkedexecution)
  • Value: Optional value associated with the key (e.g., true)
  • Effect: Defines what happens to workloads that don't tolerate the taint

Taint effects

  • NoSchedule: Prevents new workloads from being scheduled on the node
  • PreferNoSchedule: The scheduler tries to avoid scheduling workloads on the node but doesn't guarantee it
  • NoExecute: Evicts existing workloads that don't tolerate the taint and prevents new ones
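
For example, the forkedexecution taint used later in this topic could be applied to a node with kubectl:

kubectl taint nodes <node-name> forkedexecution=true:NoSchedule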

Tolerations

Tolerations are specifications on workloads that allow them to be scheduled on nodes with matching taints. They enable workloads to tolerate specific node conditions.
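
A toleration matching the taint above looks like this in a pod spec; with operator: "Exists", the toleration matches the key regardless of its value:

tolerations:
  - key: "forkedexecution"
    operator: "Exists"
    effect: "NoSchedule"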

Node pool strategy

This configuration uses three distinct node pools to achieve workload separation.

Default node pool

These nodes will be used by the runtime cloud pods, the runtime-elastic-controller pods, and any other non-execution pods.

Key characteristics:

  • No taints, which allows any workload to be scheduled without special tolerations.
  • Intended for any pods besides execution-related pods.

ForkedExecution node pool

These nodes will be used by the forked execution pods.

Key characteristics:

  • Has a NoSchedule taint called forkedexecution.
  • Requires explicit toleration from the forked execution pods for scheduling.

WorkerPool node pool

These nodes will be used by the execution worker pods.

Key characteristics:

  • Has a NoSchedule taint called workerpool.
  • Requires explicit toleration from the execution worker pods for scheduling.
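
If you manage node groups directly rather than through a provisioner such as Karpenter (see the example later in this topic), the labels and taints for these pools can be applied with kubectl; for example, for a workerpool node:

kubectl label nodes <node-name> nodepool=workerpool
kubectl taint nodes <node-name> workerpool=true:NoSchedule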

Elastic controller configuration

To take advantage of these node pools, the elastic controller must be configured so that the execution pods are created with the appropriate tolerations and node selectors.

The elastic controller is configured primarily through its helm values. The example below uses a separate helm values file for the elastic controller called elastic-controller-helm-values.yaml. The contents of this file should be as follows:

fullnameOverride: runtime-elastic-controller
boomi:
  runner:
    # Update nodeSelector key/value to match your cluster's node labeling scheme
    nodeSelector:
      nodepool: forkedexecution
    tolerations:
      - key: "forkedexecution"
        operator: "Exists"
        effect: "NoSchedule"
  browser:
    # Update nodeSelector key/value to match your cluster's node labeling scheme
    nodeSelector:
      nodepool: forkedexecution
    tolerations:
      - key: "forkedexecution"
        operator: "Exists"
        effect: "NoSchedule"
  worker:
    # Update nodeSelector key/value to match your cluster's node labeling scheme
    nodeSelector:
      nodepool: workerpool
    tolerations:
      - key: "workerpool"
        operator: "Exists"
        effect: "NoSchedule"

Key characteristics:

  • Forked execution pods (configured under runner) and browser pods are always scheduled onto nodes in the forkedexecution node pool.
  • Execution worker pods are always scheduled onto nodes in the workerpool node pool.

By implementing the node pool configuration described above and setting the elastic controller's helm parameters appropriately, this strategy optimizes resource allocation for the various workloads, contributing to more cost-effective autoscaling within your compute environment.
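
With the values file in place, apply it when installing or upgrading the elastic controller; the chart reference and namespace below are placeholders for your environment:

helm upgrade --install runtime-elastic-controller <chart-reference> \
  -f elastic-controller-helm-values.yaml \
  -n <namespace>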

Example implementation: Karpenter

Karpenter is one example of a node provisioning system that can implement this workload isolation strategy. With Karpenter, you create NodePools that correspond to each workload type.

For instance, the workerpool NodePool configuration would look like this:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: workerpool
spec:
  template:
    spec:
      # Configure instance requirements based on your needs
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        # Additional examples:
        # - key: karpenter.k8s.aws/instance-category
        #   operator: In
        #   values: ["c", "m"]
        # - key: karpenter.k8s.aws/instance-generation
        #   operator: Gt
        #   values: ["5"]
      taints:
        - key: workerpool
          value: "true"
          effect: NoSchedule
      nodeClassRef:
        group: karpenter.k8s.aws # karpenter.sh/v1 uses group/kind/name here
        kind: EC2NodeClass
        name: default # Reference your EC2NodeClass
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 15m

Similar NodePool configurations are created for the default and forkedexecution node pools, with the forkedexecution NodePool including the forkedexecution=true:NoSchedule taint and the default NodePool having no taints.

When using Karpenter, the runtime-elastic-controller's helm values need to be updated to use Karpenter's node labeling scheme. Instead of the generic nodepool label shown in the previous example, use karpenter.sh/nodepool as the nodeSelector key, with values matching your NodePool names (e.g., `forkedexecution`, `workerpool`). The tolerations remain the same regardless of the node provisioning system used.
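
For example, the runner section of elastic-controller-helm-values.yaml would become:

boomi:
  runner:
    # Karpenter labels nodes with the name of the NodePool that provisioned them
    nodeSelector:
      karpenter.sh/nodepool: forkedexecution
    tolerations:
      - key: "forkedexecution"
        operator: "Exists"
        effect: "NoSchedule"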
