AWS EKS
Overview
The AWS EKS integration with ObserveOps (formerly known as AIOps) collects operational telemetry from Amazon Elastic Kubernetes Service (EKS) clusters. It covers cluster state and cluster-level performance, node group configuration, and the Kubernetes nodes, pods, and containers running in each cluster.
These metrics help administrators track workload health, monitor node and pod resource utilization, identify container issues, and maintain operational visibility across EKS-managed Kubernetes environments.
Prerequisites
- The AWS account has EKS clusters running.
- The IAM role or user used for integration has the AmazonEKSClusterPolicy or equivalent read-only policy attached.
- Required AWS API endpoints are reachable from ObserveOps.
- The AWS account is added to discovery with the correct credentials and region configuration.
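
As an illustration of what an "equivalent read-only policy" could look like, the sketch below builds an IAM policy document limited to EKS list and describe actions. The action list and the `build_read_only_policy` helper are assumptions made for this example, not requirements stated by ObserveOps. The policy covers only the EKS control-plane API, and collecting Kubernetes-level metrics may need additional access.

```python
import json

# Hypothetical action list: an assumption for illustration, not an
# ObserveOps requirement. All four are read-only EKS API actions.
READ_ONLY_EKS_ACTIONS = [
    "eks:ListClusters",
    "eks:DescribeCluster",
    "eks:ListNodegroups",
    "eks:DescribeNodegroup",
]


def build_read_only_policy(actions=READ_ONLY_EKS_ACTIONS):
    """Return an IAM policy document (as a dict) that allows only `actions`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": list(actions), "Resource": "*"}
        ],
    }


policy = build_read_only_policy()
print(json.dumps(policy, indent=2))
```

The printed JSON can be compared against the policy attached to the integration role to confirm that no write actions are granted.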
List of Supported KPIs
Cluster
| Metric | Description | Type |
|---|---|---|
| aws.eks.cluster.state | Current operational state of the EKS cluster. | String |
| aws.eks.cluster.cpu.percent | CPU utilization percentage across the EKS cluster. | Percent |
| aws.eks.cluster.memory.used.percent | Memory utilization percentage across the EKS cluster. | Percent |
| aws.eks.kubernetes.namespaces | Total number of namespaces in the cluster. | Count |
| aws.eks.kubernetes.nodes | Total number of nodes in the cluster. | Count |
| aws.eks.kubernetes.pods | Total number of pods across the cluster. | Count |
| aws.eks.kubernetes.containers | Total number of containers across the cluster. | Count |
| aws.eks.kubernetes.running.pods | Total number of pods in running state across the cluster. | Count |
| aws.eks.kubernetes.pending.pods | Total number of pods in pending state across the cluster. | Count |
| aws.eks.kubernetes.failed.pods | Total number of pods in failed state across the cluster. | Count |
| aws.eks.kubernetes.crashloopbackoff.pods | Total number of pods in CrashLoopBackOff state across the cluster. | Count |
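
The pod-state counters above can be combined into a single health ratio. The following is a minimal sketch that assumes one cluster sample is available as a dictionary keyed by metric name. That payload shape, the `pod_health_summary` helper, and the choice to treat pending, failed, and CrashLoopBackOff pods as unhealthy are all assumptions for illustration.

```python
# Metrics treated as "unhealthy" in this sketch (an assumption, not an
# ObserveOps definition).
UNHEALTHY_KEYS = (
    "aws.eks.kubernetes.pending.pods",
    "aws.eks.kubernetes.failed.pods",
    "aws.eks.kubernetes.crashloopbackoff.pods",
)


def pod_health_summary(sample):
    """Summarize unhealthy pods from one cluster-level metric sample."""
    total = sample.get("aws.eks.kubernetes.pods", 0)
    unhealthy = sum(sample.get(key, 0) for key in UNHEALTHY_KEYS)
    # Guard against an empty cluster to avoid dividing by zero.
    ratio = unhealthy / total if total else 0.0
    return {"total": total, "unhealthy": unhealthy, "unhealthy_ratio": ratio}


# Made-up sample values, for demonstration only.
sample = {
    "aws.eks.kubernetes.pods": 40,
    "aws.eks.kubernetes.running.pods": 35,
    "aws.eks.kubernetes.pending.pods": 2,
    "aws.eks.kubernetes.failed.pods": 1,
    "aws.eks.kubernetes.crashloopbackoff.pods": 2,
}
print(pod_health_summary(sample))
```
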
Node Groups
| Metric | Description | Type |
|---|---|---|
| aws.eks.nodegroups | Total number of node groups in the cluster. | Count |
| aws.eks.nodegroup | Identifier of an individual node group. | String |
| aws.eks.nodegroup.state | Current state of the node group. | String |
| aws.eks.nodegroup.capacity.type | Capacity type of the node group: ON_DEMAND or SPOT. | String |
| aws.eks.nodegroup.instance.type | EC2 instance type used for nodes in this group. | String |
| aws.eks.nodegroup.desired.size | Desired number of nodes configured for this group. | Count |
| aws.eks.nodegroup.min.size | Minimum number of nodes configured for autoscaling. | Count |
| aws.eks.nodegroup.max.size | Maximum number of nodes configured for autoscaling. | Count |
| aws.eks.nodegroup.disk.size.bytes | Boot disk size allocated per node in this group. | Bytes |
| aws.eks.nodegroup.management.type | Node group management type: MANAGED or SELF_MANAGED. | String |
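
The three sizing metrics are most useful when read together. As a rough sketch, the check below flags a node group whose desired size falls outside its min/max bounds or already sits at the maximum. The dictionary layout and the `check_nodegroup_sizing` helper are hypothetical and used only for this example.

```python
def check_nodegroup_sizing(nodegroup):
    """Return a list of human-readable sizing issues for one node group."""
    min_size = nodegroup["aws.eks.nodegroup.min.size"]
    desired = nodegroup["aws.eks.nodegroup.desired.size"]
    max_size = nodegroup["aws.eks.nodegroup.max.size"]

    issues = []
    if not (min_size <= desired <= max_size):
        issues.append("desired size is outside the min/max bounds")
    elif desired == max_size:
        # No headroom is left for scaling out.
        issues.append("desired size equals max size; no room to scale out")
    return issues


# Made-up node group values, for demonstration only.
nodegroup = {
    "aws.eks.nodegroup": "general-purpose",
    "aws.eks.nodegroup.min.size": 2,
    "aws.eks.nodegroup.desired.size": 6,
    "aws.eks.nodegroup.max.size": 6,
}
print(check_nodegroup_sizing(nodegroup))
```
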
Kubernetes Nodes
| Metric | Description | Type |
|---|---|---|
| aws.eks.kubernetes.node | Identifier of an individual Kubernetes node. | String |
| aws.eks.kubernetes.node.state | Current state of the node: Ready, NotReady, or Unknown. | String |
| aws.eks.kubernetes.node.os | Operating system running on the node. | String |
| aws.eks.kubernetes.node.instance.type | EC2 instance type of this node. | String |
| aws.eks.kubernetes.node.nodegroup | Node group this node belongs to. | String |
| aws.eks.kubernetes.node.role | Role assigned to this node. | String |
| aws.eks.kubernetes.node.creation.time | Timestamp when this node was added to the cluster. | Timestamp |
| aws.eks.kubernetes.node.cpu.percent | CPU utilization percentage on this node. | Percent |
| aws.eks.kubernetes.node.memory.used.percent | Memory utilization percentage on this node. | Percent |
| aws.eks.kubernetes.node.memory.capacity.bytes | Total memory capacity of this node. | Bytes |
| aws.eks.kubernetes.node.memory.used.bytes | Amount of memory currently in use on this node. | Bytes |
| aws.eks.kubernetes.node.cpu.capacity.cores | Total CPU capacity of this node in cores. | Count |
| aws.eks.kubernetes.node.cpu.used.cores | Number of CPU cores currently in use on this node. | Count |
| aws.eks.kubernetes.node.pods | Number of pods currently scheduled on this node. | Count |
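
The used and capacity metrics let you cross-check the reported percentages. The sketch below assumes each percentage is simply used divided by capacity, which this page does not confirm, so treat it as a sanity check and not as the integration's exact calculation. The `utilization_percent` helper and the sample values are hypothetical.

```python
def utilization_percent(used, capacity):
    """Return used/capacity as a percentage, or None if capacity is not positive."""
    if capacity <= 0:
        return None
    return round(100.0 * used / capacity, 2)


GIB = 1024 ** 3

# Made-up node sample, for demonstration only.
node = {
    "aws.eks.kubernetes.node.memory.used.bytes": 12 * GIB,
    "aws.eks.kubernetes.node.memory.capacity.bytes": 16 * GIB,
    "aws.eks.kubernetes.node.cpu.used.cores": 1.5,
    "aws.eks.kubernetes.node.cpu.capacity.cores": 4,
}

memory_percent = utilization_percent(
    node["aws.eks.kubernetes.node.memory.used.bytes"],
    node["aws.eks.kubernetes.node.memory.capacity.bytes"],
)
cpu_percent = utilization_percent(
    node["aws.eks.kubernetes.node.cpu.used.cores"],
    node["aws.eks.kubernetes.node.cpu.capacity.cores"],
)
print(memory_percent, cpu_percent)
```
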
Kubernetes Pods
| Metric | Description | Type |
|---|---|---|
| aws.eks.kubernetes.pod | Identifier of an individual pod. | String |
| aws.eks.kubernetes.pod.state | Current state of the pod. | String |
| aws.eks.kubernetes.pod.ip | IP address assigned to the pod. | String |
| aws.eks.kubernetes.pod.namespace | Namespace in which the pod is running. | String |
| aws.eks.kubernetes.pod.node | Node on which the pod is scheduled. | String |
| aws.eks.kubernetes.pod.containers | Number of containers in the pod. | Count |
| aws.eks.kubernetes.pod.restarts | Total number of container restarts in the pod. | Count |
| aws.eks.kubernetes.pod.ready | Indicates whether all containers in the pod are ready. | Boolean |
| aws.eks.kubernetes.pod.creation.time | Timestamp when the pod was created. | Timestamp |
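
Readiness and restart counts together give a quick way to shortlist pods worth investigating. The following is a minimal sketch in which the record layout, the `pods_needing_attention` helper, and the restart threshold of 5 are assumptions chosen for illustration.

```python
RESTART_THRESHOLD = 5  # hypothetical cutoff; tune for your workloads


def pods_needing_attention(pods, restart_threshold=RESTART_THRESHOLD):
    """Return identifiers of pods that are not ready or restart too often."""
    flagged = []
    for pod in pods:
        not_ready = not pod.get("aws.eks.kubernetes.pod.ready", False)
        restarts = pod.get("aws.eks.kubernetes.pod.restarts", 0)
        if not_ready or restarts >= restart_threshold:
            flagged.append(pod["aws.eks.kubernetes.pod"])
    return flagged


# Made-up pod records, for demonstration only.
pods = [
    {"aws.eks.kubernetes.pod": "api-7d9f",
     "aws.eks.kubernetes.pod.ready": True,
     "aws.eks.kubernetes.pod.restarts": 0},
    {"aws.eks.kubernetes.pod": "worker-5c2a",
     "aws.eks.kubernetes.pod.ready": False,
     "aws.eks.kubernetes.pod.restarts": 1},
    {"aws.eks.kubernetes.pod": "cache-9b1e",
     "aws.eks.kubernetes.pod.ready": True,
     "aws.eks.kubernetes.pod.restarts": 12},
]
print(pods_needing_attention(pods))
```
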
Kubernetes Containers
| Metric | Description | Type |
|---|---|---|
| aws.eks.kubernetes.container | Identifier of an individual container. | String |
| aws.eks.kubernetes.container.state | Current state of the container. | String |
| aws.eks.kubernetes.container.image | Container image running in this container. | String |
| aws.eks.kubernetes.container.pod.ip | IP address of the pod hosting this container. | String |
| aws.eks.kubernetes.container.ports | Ports exposed by this container. | String |
| aws.eks.kubernetes.container.mount | Volume mount paths configured for this container. | String |
| aws.eks.kubernetes.container.creation.time | Timestamp when the container was created. | Timestamp |
| aws.eks.kubernetes.container.cpu.limit.cores | CPU limit allocated to this container in cores. | Count |
| aws.eks.kubernetes.container.cpu.request.cores | CPU requested by this container in cores. | Count |
| aws.eks.kubernetes.container.memory.limit.bytes | Memory limit allocated to this container. | Bytes |
| aws.eks.kubernetes.container.memory.request.bytes | Memory requested by this container. | Bytes |
| aws.eks.kubernetes.container.cpu.limit.percent | CPU limit as a percentage of node capacity. | Percent |
| aws.eks.kubernetes.container.cpu.request.percent | CPU request as a percentage of node capacity. | Percent |
| aws.eks.kubernetes.container.memory.limit.percent | Memory limit as a percentage of node capacity. | Percent |
| aws.eks.kubernetes.container.memory.request.percent | Memory request as a percentage of node capacity. | Percent |
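
The percent metrics express a container's limit or request relative to the capacity of its node. The sketch below shows that arithmetic and a simple overcommit check. The `percent_of_node` and `limits_overcommitted` helpers are hypothetical, and the assumption that a node is overcommitted when its containers' limits sum above 100% is an illustrative rule, not an ObserveOps alert definition.

```python
def percent_of_node(amount, node_capacity):
    """Express a container limit or request as a percentage of node capacity."""
    if node_capacity <= 0:
        raise ValueError("node capacity must be positive")
    return 100.0 * amount / node_capacity


def limits_overcommitted(limit_percents):
    """Return True if the limits of containers on one node sum above 100%."""
    return sum(limit_percents) > 100.0


MIB = 1024 ** 2
GIB = 1024 ** 3

# A container limited to 0.5 cores and 512 MiB on a 4-core, 16 GiB node
# (made-up values, for demonstration only).
cpu_limit_percent = percent_of_node(0.5, 4)
memory_limit_percent = percent_of_node(512 * MIB, 16 * GIB)
print(cpu_limit_percent, memory_limit_percent)

# Three containers on the same node with CPU limits of 50%, 40%, and 25%.
print(limits_overcommitted([50.0, 40.0, 25.0]))
```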