AWS EKS

Overview

The AWS EKS integration with ObserveOps (formerly known as AIOps) collects operational telemetry from Amazon Elastic Kubernetes Service clusters. It monitors cluster state, node group configuration, cluster-level performance, Kubernetes nodes, pods, and containers.

These metrics help administrators track workload health, monitor node and pod resource utilization, identify container issues, and maintain operational visibility across EKS-managed Kubernetes environments.

Prerequisites

  • The AWS account has at least one running EKS cluster.
  • The IAM role or user used for the integration has the AmazonEKSClusterPolicy policy, or an equivalent read-only policy, attached.
  • The required AWS API endpoints are reachable from ObserveOps.
  • The AWS account has been added to discovery with the correct credentials and region configuration.
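
The prerequisites do not spell out which permissions an "equivalent read-only policy" must contain. As a rough sketch only, the Python snippet below builds an IAM policy document that allows the EKS list and describe actions. The action names are real EKS IAM actions, but the choice of these four is an assumption, not a confirmed ObserveOps requirement, and the function name `build_read_only_policy` is hypothetical. Kubernetes-level data (nodes, pods, containers) may need additional access that this sketch does not cover.

```python
import json

# Assumption: read-only access to cluster and node group metadata is enough
# for the AWS API side of the integration. This document does not state the
# exact set of actions ObserveOps needs.
READ_ONLY_EKS_ACTIONS = [
    "eks:ListClusters",
    "eks:DescribeCluster",
    "eks:ListNodegroups",
    "eks:DescribeNodegroup",
]


def build_read_only_policy(actions=READ_ONLY_EKS_ACTIONS):
    """Return an IAM policy document (as a dict) that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": "*",
            }
        ],
    }


# Serialize to JSON so the document can be pasted into the IAM console or CLI.
policy_json = json.dumps(build_read_only_policy(), indent=2)
print(policy_json)
```

Confirm the required permissions with the product before attaching a policy like this one.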

List of Supported KPIs

Cluster

Metric | Description | Type
aws.eks.cluster.state | Current operational state of the EKS cluster. | String
aws.eks.cluster.cpu.percent | CPU utilization percentage across the EKS cluster. | Percent
aws.eks.cluster.memory.used.percent | Memory utilization percentage across the EKS cluster. | Percent
aws.eks.kubernetes.namespaces | Total number of namespaces in the cluster. | Count
aws.eks.kubernetes.nodes | Total number of nodes in the cluster. | Count
aws.eks.kubernetes.pods | Total number of pods across the cluster. | Count
aws.eks.kubernetes.containers | Total number of containers across the cluster. | Count
aws.eks.kubernetes.running.pods | Total number of pods in running state across the cluster. | Count
aws.eks.kubernetes.pending.pods | Total number of pods in pending state across the cluster. | Count
aws.eks.kubernetes.failed.pods | Total number of pods in failed state across the cluster. | Count
aws.eks.kubernetes.crashloopbackoff.pods | Total number of pods in CrashLoopBackOff state across the cluster. | Count
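
The pod counters above can be combined into a quick cluster health check. The following is a minimal sketch, assuming the KPIs arrive as a plain dictionary keyed by metric name; the `sample` values and the function name `pod_health` are made up for illustration. It derives the running share from the total rather than summing the problem states, because this document does not say whether those states overlap.

```python
def pod_health(sample):
    """Summarize cluster pod health from the cluster-level pod counters."""
    total = sample["aws.eks.kubernetes.pods"]
    running = sample["aws.eks.kubernetes.running.pods"]
    # Counters that usually deserve attention when they are non-zero.
    problem_metrics = [
        "aws.eks.kubernetes.pending.pods",
        "aws.eks.kubernetes.failed.pods",
        "aws.eks.kubernetes.crashloopbackoff.pods",
    ]
    flagged = {name: sample[name] for name in problem_metrics if sample[name] > 0}
    # Avoid dividing by zero on an empty cluster.
    running_percent = None if total == 0 else 100.0 * running / total
    return {"running_percent": running_percent, "flagged": flagged}


# Hypothetical values, for illustration only.
sample = {
    "aws.eks.kubernetes.pods": 40,
    "aws.eks.kubernetes.running.pods": 36,
    "aws.eks.kubernetes.pending.pods": 3,
    "aws.eks.kubernetes.failed.pods": 0,
    "aws.eks.kubernetes.crashloopbackoff.pods": 1,
}
summary = pod_health(sample)
print(summary)
```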

Node Groups

Metric | Description | Type
aws.eks.nodegroups | Total number of node groups in the cluster. | Count
aws.eks.nodegroup | Identifier of an individual node group. | String
aws.eks.nodegroup.state | Current state of the node group. | String
aws.eks.nodegroup.capacity.type | Capacity type of the node group: ON_DEMAND or SPOT. | String
aws.eks.nodegroup.instance.type | EC2 instance type used for nodes in this group. | String
aws.eks.nodegroup.desired.size | Desired number of nodes configured for this group. | Count
aws.eks.nodegroup.min.size | Minimum number of nodes configured for autoscaling. | Count
aws.eks.nodegroup.max.size | Maximum number of nodes configured for autoscaling. | Count
aws.eks.nodegroup.disk.size.bytes | Boot disk size allocated per node in this group. | Bytes
aws.eks.nodegroup.management.type | Node group management type: MANAGED or SELF_MANAGED. | String
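
The desired, minimum, and maximum sizes together show how much room a node group has left to scale. A minimal sketch, assuming the three KPIs are available in a dictionary keyed by metric name (the sample values and the function name `scaling_headroom` are hypothetical):

```python
def scaling_headroom(nodegroup):
    """Return how far a node group can still scale out or in."""
    desired = nodegroup["aws.eks.nodegroup.desired.size"]
    minimum = nodegroup["aws.eks.nodegroup.min.size"]
    maximum = nodegroup["aws.eks.nodegroup.max.size"]
    # A desired size outside [min, max] points to inconsistent input data.
    if not minimum <= desired <= maximum:
        raise ValueError("desired size must lie between min and max size")
    return {
        "can_scale_out_by": maximum - desired,
        "can_scale_in_by": desired - minimum,
        "at_max": desired == maximum,
    }


# Hypothetical node group, for illustration only.
nodegroup = {
    "aws.eks.nodegroup.desired.size": 5,
    "aws.eks.nodegroup.min.size": 2,
    "aws.eks.nodegroup.max.size": 6,
}
headroom = scaling_headroom(nodegroup)
print(headroom)
```

A group reporting `at_max` as true cannot add nodes, so pending pods on that group are unlikely to clear through autoscaling alone.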

Kubernetes Nodes

Metric | Description | Type
aws.eks.kubernetes.node | Identifier of an individual Kubernetes node. | String
aws.eks.kubernetes.node.state | Current state of the node: Ready, NotReady, or Unknown. | String
aws.eks.kubernetes.node.os | Operating system running on the node. | String
aws.eks.kubernetes.node.instance.type | EC2 instance type of this node. | String
aws.eks.kubernetes.node.nodegroup | Node group this node belongs to. | String
aws.eks.kubernetes.node.role | Role assigned to this node. | String
aws.eks.kubernetes.node.creation.time | Timestamp when this node was added to the cluster. | Timestamp
aws.eks.kubernetes.node.cpu.percent | CPU utilization percentage on this node. | Percent
aws.eks.kubernetes.node.memory.used.percent | Memory utilization percentage on this node. | Percent
aws.eks.kubernetes.node.memory.capacity.bytes | Total memory capacity of this node. | Bytes
aws.eks.kubernetes.node.memory.used.bytes | Amount of memory currently in use on this node. | Bytes
aws.eks.kubernetes.node.cpu.capacity.cores | Total CPU capacity of this node in cores. | Count
aws.eks.kubernetes.node.cpu.used.cores | Number of CPU cores currently in use on this node. | Count
aws.eks.kubernetes.node.pods | Number of pods currently scheduled on this node. | Count
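
This document does not state how the node percent metrics are computed. One plausible reading is that each is the used value divided by the matching capacity value, and the sketch below works through that arithmetic. The formula is an assumption, and the sample node and the helper name `used_percent` are hypothetical.

```python
def used_percent(used, capacity):
    """Return used/capacity as a percentage, rounded to two decimals."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return round(100.0 * used / capacity, 2)


# Hypothetical node: 16 GiB of memory with 4 GiB in use, 4 cores with 1.5 in use.
node = {
    "aws.eks.kubernetes.node.memory.capacity.bytes": 16 * 1024**3,
    "aws.eks.kubernetes.node.memory.used.bytes": 4 * 1024**3,
    "aws.eks.kubernetes.node.cpu.capacity.cores": 4,
    "aws.eks.kubernetes.node.cpu.used.cores": 1.5,
}

memory_percent = used_percent(
    node["aws.eks.kubernetes.node.memory.used.bytes"],
    node["aws.eks.kubernetes.node.memory.capacity.bytes"],
)
cpu_percent = used_percent(
    node["aws.eks.kubernetes.node.cpu.used.cores"],
    node["aws.eks.kubernetes.node.cpu.capacity.cores"],
)
print(memory_percent, cpu_percent)
```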

Kubernetes Pods

Metric | Description | Type
aws.eks.kubernetes.pod | Identifier of an individual pod. | String
aws.eks.kubernetes.pod.state | Current state of the pod. | String
aws.eks.kubernetes.pod.ip | IP address assigned to the pod. | String
aws.eks.kubernetes.pod.namespace | Namespace in which the pod is running. | String
aws.eks.kubernetes.pod.node | Node on which the pod is scheduled. | String
aws.eks.kubernetes.pod.containers | Number of containers in the pod. | Count
aws.eks.kubernetes.pod.restarts | Total number of container restarts in the pod. | Count
aws.eks.kubernetes.pod.ready | Indicates whether all containers in the pod are ready. | Boolean
aws.eks.kubernetes.pod.creation.time | Timestamp when the pod was created. | Timestamp

Kubernetes Containers

Metric | Description | Type
aws.eks.kubernetes.container | Identifier of an individual container. | String
aws.eks.kubernetes.container.state | Current state of the container. | String
aws.eks.kubernetes.container.image | Container image running in this container. | String
aws.eks.kubernetes.container.pod.ip | IP address of the pod hosting this container. | String
aws.eks.kubernetes.container.ports | Ports exposed by this container. | String
aws.eks.kubernetes.container.mount | Volume mount paths configured for this container. | String
aws.eks.kubernetes.container.creation.time | Timestamp when the container was created. | Timestamp
aws.eks.kubernetes.container.cpu.limit.cores | CPU limit allocated to this container in cores. | Count
aws.eks.kubernetes.container.cpu.request.cores | CPU requested by this container in cores. | Count
aws.eks.kubernetes.container.memory.limit.bytes | Memory limit allocated to this container. | Bytes
aws.eks.kubernetes.container.memory.request.bytes | Memory requested by this container. | Bytes
aws.eks.kubernetes.container.cpu.limit.percent | CPU limit as a percentage of node capacity. | Percent
aws.eks.kubernetes.container.cpu.request.percent | CPU request as a percentage of node capacity. | Percent
aws.eks.kubernetes.container.memory.limit.percent | Memory limit as a percentage of node capacity. | Percent
aws.eks.kubernetes.container.memory.request.percent | Memory request as a percentage of node capacity. | Percent
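
Because the request and limit percentages are expressed against node capacity, adding them up across the containers on one node gives a rough picture of how committed that node is. A minimal sketch, assuming every container in the list runs on the same node; the sample values and the function name `node_cpu_commitment` are hypothetical:

```python
def node_cpu_commitment(containers):
    """Sum CPU request and limit percentages for the containers on one node."""
    requested = sum(
        c["aws.eks.kubernetes.container.cpu.request.percent"] for c in containers
    )
    limited = sum(
        c["aws.eks.kubernetes.container.cpu.limit.percent"] for c in containers
    )
    return {
        "cpu_request_percent": requested,
        "cpu_limit_percent": limited,
        # Limits above 100% mean the node is overcommitted on CPU limits.
        "limits_overcommitted": limited > 100.0,
    }


# Hypothetical containers on a single node, for illustration only.
containers = [
    {"aws.eks.kubernetes.container.cpu.request.percent": 12.5,
     "aws.eks.kubernetes.container.cpu.limit.percent": 25.0},
    {"aws.eks.kubernetes.container.cpu.request.percent": 25.0,
     "aws.eks.kubernetes.container.cpu.limit.percent": 50.0},
    {"aws.eks.kubernetes.container.cpu.request.percent": 50.0,
     "aws.eks.kubernetes.container.cpu.limit.percent": 50.0},
]
commitment = node_cpu_commitment(containers)
print(commitment)
```

The same summation applies to the memory request and limit percentages.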