ALB metrics publishing

AWS Application Load Balancers have hard limits on the number of rules and target groups they can hold. When multiple scopes share the same ALB through a common group.name, these limits can be reached silently — causing deployment failures with no prior warning.

ALB metrics publishing solves this by reporting the current rule count and target group count to CloudWatch or Datadog after every deployment. With these metrics in place, you can set up alerts and catch capacity problems before they break a deploy.

Why this matters

Neither CloudWatch nor Datadog tracks ALB rule or target group counts natively. The only metrics AWS publishes for ALBs are operational (request count, latency, healthy hosts), not resource usage.

Resource                Limit   Adjustable
Rules per ALB           100     Yes (via AWS Support)
Target groups per ALB   100     No

The target group limit is especially dangerous because it can't be increased. Once you hit it, the only option is to split traffic across multiple ALBs.

How it works

A publish_alb_metrics script runs as a post-deploy step, right after ingress reconciliation, in every deployment workflow (initial, switch traffic, and finalize). It:

  1. Queries the ALB via the AWS API to count the non-default rules across all listeners and the attached target groups.
  2. Publishes both counts as custom metrics to your chosen backend (see the sketch after this list).
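
A minimal sketch of the counting step, assuming boto3 and a hypothetical ALB name (pagination and error handling omitted; the actual script may differ):

import boto3

# Hypothetical ALB name, for illustration only
ALB_NAME = "k8s-nullplatform-internet-facing"

elbv2 = boto3.client("elbv2")

# Resolve the ALB ARN from its name
lb = elbv2.describe_load_balancers(Names=[ALB_NAME])["LoadBalancers"][0]
alb_arn = lb["LoadBalancerArn"]

# Count non-default rules across every listener on the ALB
rule_count = 0
for listener in elbv2.describe_listeners(LoadBalancerArn=alb_arn)["Listeners"]:
    rules = elbv2.describe_rules(ListenerArn=listener["ListenerArn"])["Rules"]
    rule_count += sum(1 for rule in rules if not rule["IsDefault"])

# Count target groups attached to the ALB
target_groups = elbv2.describe_target_groups(LoadBalancerArn=alb_arn)["TargetGroups"]
target_group_count = len(target_groups)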

Configuration

Add these settings to your scope's values.yaml:

ALB_METRICS_PUBLISH_ENABLED: true
ALB_METRICS_PUBLISH_TARGET: cloudwatch # cloudwatch | datadog

CloudWatch

When the target is cloudwatch, the script publishes two metrics to the nullplatform/ApplicationELB namespace:

Metric             Unit    Dimension
RuleCount          Count   ALBName
TargetGroupCount   Count   ALBName
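
Continuing the sketch above, the CloudWatch path reduces to a single put_metric_data call using the namespace, names, unit, and dimension from the table (rule_count, target_group_count, and ALB_NAME come from the earlier sketch):

import boto3

cloudwatch = boto3.client("cloudwatch")

# Publish both counts under the custom namespace, keyed by ALB name
cloudwatch.put_metric_data(
    Namespace="nullplatform/ApplicationELB",
    MetricData=[
        {
            "MetricName": "RuleCount",
            "Value": rule_count,
            "Unit": "Count",
            "Dimensions": [{"Name": "ALBName", "Value": ALB_NAME}],
        },
        {
            "MetricName": "TargetGroupCount",
            "Value": target_group_count,
            "Unit": "Count",
            "Dimensions": [{"Name": "ALBName", "Value": ALB_NAME}],
        },
    ],
)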

IAM permissions required:

{
  "Effect": "Allow",
  "Action": [
    "elasticloadbalancing:DescribeLoadBalancers",
    "elasticloadbalancing:DescribeListeners",
    "elasticloadbalancing:DescribeRules",
    "elasticloadbalancing:DescribeTargetGroups",
    "cloudwatch:PutMetricData"
  ],
  "Resource": "*"
}

Datadog

When the target is datadog, the script publishes two metrics via the Datadog API:

Metric                                           Tags
nullplatform.applicationelb.rule_count           alb_name, region
nullplatform.applicationelb.target_group_count   alb_name, region

The script uses the DATADOG_API_KEY and DATADOG_SITE environment variables. These are available automatically if you have a Datadog metrics provider configured in nullplatform.
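
A minimal sketch of the Datadog path, assuming the v1 series HTTP endpoint and reusing the counts from the earlier sketch (the region tag value here is a placeholder):

import os
import time

import requests

site = os.environ.get("DATADOG_SITE", "datadoghq.com")
now = int(time.time())

def gauge(metric, value):
    # One gauge point, tagged so monitors can group by alb_name
    return {
        "metric": metric,
        "points": [[now, value]],
        "type": "gauge",
        "tags": [f"alb_name:{ALB_NAME}", "region:us-east-1"],  # region is a placeholder
    }

requests.post(
    f"https://api.{site}/api/v1/series",
    headers={"DD-API-KEY": os.environ["DATADOG_API_KEY"]},
    json={
        "series": [
            gauge("nullplatform.applicationelb.rule_count", rule_count),
            gauge("nullplatform.applicationelb.target_group_count", target_group_count),
        ]
    },
)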

Setting up alerts

CloudWatch alarms

CloudWatch alarms require one alarm per ALB per metric. You can automate this with OpenTofu; the example below discovers all ALBs in the account:

data "aws_lbs" "all" {}

data "aws_lb" "details" {
for_each = toset(data.aws_lbs.all.arns)
arn = each.value
}

locals {
alb_names = [
for lb in data.aws_lb.details : lb.name
if lb.load_balancer_type == "application"
]
}

resource "aws_sns_topic" "cloudwatch_alarms" {
name = "Cloudwatch-Alarms"
}

resource "aws_sns_topic_subscription" "email" {
topic_arn = aws_sns_topic.cloudwatch_alarms.arn
protocol = "email"
endpoint = "your-team@company.com"
}

resource "aws_cloudwatch_metric_alarm" "alb_rule_count" {
for_each = toset(local.alb_names)

alarm_name = "alb-rule-count-${each.value}"
comparison_operator = "GreaterThanThreshold"
evaluation_periods = 1
metric_name = "RuleCount"
namespace = "nullplatform/ApplicationELB"
period = 60
statistic = "Maximum"
threshold = 80
treat_missing_data = "notBreaching"
alarm_actions = [aws_sns_topic.cloudwatch_alarms.arn]
dimensions = { ALBName = each.value }
}

resource "aws_cloudwatch_metric_alarm" "alb_target_group_count" {
for_each = toset(local.alb_names)

alarm_name = "alb-target-group-count-${each.value}"
comparison_operator = "GreaterThanThreshold"
evaluation_periods = 1
metric_name = "TargetGroupCount"
namespace = "nullplatform/ApplicationELB"
period = 60
statistic = "Maximum"
threshold = 80
treat_missing_data = "notBreaching"
alarm_actions = [aws_sns_topic.cloudwatch_alarms.arn]
dimensions = { ALBName = each.value }
}

After applying, AWS sends a subscription confirmation email; accept it to start receiving alerts.

note

New ALBs added after the initial apply need another tofu apply to get their alarms.

Datadog monitors

Datadog monitors can cover all ALBs with just two resources, since they group by the alb_name tag. New ALBs are detected automatically without re-applying.

resource "datadog_monitor" "alb_rule_count" {
name = "ALB Rule Count High"
type = "metric alert"
message = "ALB {{alb_name.name}} has {{value}} rules. @your-team@company.com"
query = "max(last_5m):max:nullplatform.applicationelb.rule_count{*} by {alb_name} > 80"
notify_no_data = false

monitor_thresholds {
critical = 80
}
}

resource "datadog_monitor" "alb_target_group_count" {
name = "ALB Target Group Count High"
type = "metric alert"
message = "ALB {{alb_name.name}} has {{value}} target groups. @your-team@company.com"
query = "max(last_5m):max:nullplatform.applicationelb.target_group_count{*} by {alb_name} > 80"
notify_no_data = false

monitor_thresholds {
critical = 80
}
}
tip

Datadog monitors support flexible notification targets — you can use @slack-channel, @pagerduty-service, or email addresses.

Deployment logs

On a successful publish, you'll see a single log line:

✓ ALB metrics published to CloudWatch (rules: 23, target_groups: 11)

When the feature is disabled, there's no output at all. Errors appear as warnings and don't block the deployment:

⚠ ALB metrics: could not find ALB [k8s-nullplatform-internet-facing]
⚠ ALB metrics: failed to publish to CloudWatch
⚠ ALB metrics: DATADOG_API_KEY not set

AWS ALB limits reference

Resource                   Default limit   Adjustable
ALBs per region            50              Yes
Listeners per ALB          50              Yes
Rules per ALB              100             Yes
Target groups per ALB      100             No
Targets per ALB            1,000           Yes
Target groups per region   3,000           Yes