Skip to main content

Troubleshooting Ingress and ALB reconciliation (AWS)

This guide explains how the nullplatform agent detects and prevents Ingress and ALB reconciliation issues in agent-backed Kubernetes scopes on AWS, what problems can occur, and how to troubleshoot them when they do.

To do this, the agent runs networking validations during deployment. These validations catch AWS ALB and Ingress reconciliation failures early, before they result in unreachable applications or a blocked Load Balancer.

When enabled, the agent verifies that:

  • Your Ingress hostname can be reconciled to a valid ALB rule.
  • Required SSL/TLS certificates exist in ACM.
  • ALB limits and traffic configuration (such as blue/green weights) match expectations.

If a mismatch or invalid state is found, the deployment fails early with a clear error message and troubleshooting guidance in the logs.

The problem: failed Ingress and ALB reconciliation

When deploying an application, Kubernetes relies on the AWS ALB Ingress Controller to reconcile Ingress resources with an Application Load Balancer (ALB).

In some situations, this reconciliation fails due to invalid or constrained AWS or Ingress configuration, such as:

  • An Ingress hostname without a matching SSL/TLS certificate in AWS Certificate Manager (ACM).
  • The ALB reaching its maximum number of rules (commonly 100).
  • Insufficient IP capacity or other AWS-side limitations.

When reconciliation fails, it can lead to two critical outcomes:

  1. The deployed application is not reachable through its expected domain.
  2. The ALB enters an invalid or blocked state, preventing subsequent Ingresses from being reconciled on the same Load Balancer.

These failures are often difficult to diagnose after the fact and may impact unrelated services sharing the ALB.

What the networking validation checks

To reduce the risk of these failures, the agent includes a networking validation step that runs during deployment.

At a high level, the validation:

  1. Observes Ingress reconciliation events in the cluster.
  2. Identifies the ALB rule associated with the scope’s hostname.
  3. Compares the expected traffic configuration (for example, blue/green traffic weights) with the actual ALB configuration in AWS.
  4. Fails the deployment early if a mismatch or invalid state is detected.

By stopping the deployment at this stage, the validation prevents broken rollouts from progressing and avoids leaving the ALB in a bad or locked state.

How validation errors surface

If a networking validation fails, the deployment itself fails and the error is surfaced directly in the deployment logs.

A common example is a certificate mismatch:

  • The Ingress hostname does not match any valid SSL/TLS certificate in ACM.
  • The ALB listener rule cannot be created or updated.

In these cases, the logs clearly report:

  • The detected root cause.
  • The specific hostname, listener, or rule that failed validation.
  • Suggested next steps to correct the configuration.

Troubleshooting a failed networking validation

When a deployment fails due to a networking validation error, use the following steps to diagnose and resolve the issue:

  1. Review the deployment logs first
    The validation error message points directly to the failing condition (for example, a missing certificate or rule limit).

  2. Verify the Ingress hostname configuration
    Ensure the hostname defined in the Ingress resource matches a valid ACM certificate in the target AWS account and region.

  3. Check ALB rule limits and capacity
    Confirm that the ALB has not reached its maximum number of listener rules and that there are no AWS-side capacity constraints.

  4. Confirm traffic configuration expectations
    For blue/green or weighted deployments, verify that the expected traffic split matches what is supported by the ALB.

After correcting the issue, re-run the deployment. No manual cleanup is typically required unless the ALB was left in a partially configured state.

Requirements

Because the validation inspects ALB configuration directly, the agent’s IAM role must include read-only permissions for Elastic Load Balancing.

Attach the following policy to the agent role:

{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AllowALBValidation",
"Effect": "Allow",
"Action": [
"elasticloadbalancing:DescribeLoadBalancers",
"elasticloadbalancing:DescribeListeners",
"elasticloadbalancing:DescribeRules"
],
"Resource": "*"
}
]
}

These permissions allow the agent to list Load Balancers and inspect listeners and rules required for validation.

Enabling networking validations

Networking validations are disabled by default and must be explicitly enabled.

If you installed the agent using Helm, enable the validation by setting the following value:

helm upgrade nullplatform-agent nullplatform/nullplatform-agent \
--reuse-values \
--set configuration.values.ALB_RECONCILIATION_ENABLED=true

📖 Check our Agent docs for for info.

Option 2: Enable via override values

If you manage configuration through a repository, add the following to your values.yaml:

configuration:
ALB_RECONCILIATION_ENABLED: true

📖 Check our Override workflows docs for info.

Once enabled, the validation runs automatically on subsequent deployments.