Add Diagnose to your scopes
Diagnose runs automated checks on your scopes and deployments. If your scope doesn't have the Diagnose tool yet, or you want to build your own diagnose flow, you need to register the right action specifications on your scope specification.
How Diagnose connects to your scope
Diagnose relies on three components working together:
- Action specifications register the Diagnose actions so they appear in the UI.
- Workflows define the execution flow when an action is triggered.
- The diagnose module contains the context builder and individual checks.
Action specifications
The agent registers two action specifications that make Diagnose available in the UI:
Diagnose Scope
Runs diagnostics against a scope. This action appears on the manage and performance tabs.
```json
{
  "name": "Diagnose Scope",
  "slug": "diagnose-scope",
  "type": "diagnose",
  "retryable": true,
  "parameters": {
    "schema": {
      "type": "object",
      "required": ["scope_id"],
      "properties": {
        "scope_id": {
          "type": "number",
          "readOnly": true,
          "visibleOn": ["read"]
        }
      }
    },
    "values": {}
  },
  "results": {
    "schema": {
      "type": "object",
      "required": [],
      "properties": {}
    },
    "values": {}
  },
  "annotations": {
    "show_on": ["manage", "performance"],
    "runs_over": "scope"
  }
}
```
Diagnose Deployment
Runs diagnostics against a specific deployment. This action appears on the deployment tab and includes both scope_id and deployment_id.
```json
{
  "name": "Diagnose Deployment",
  "slug": "diagnose-deployment",
  "type": "diagnose",
  "retryable": true,
  "parameters": {
    "schema": {
      "type": "object",
      "required": ["scope_id", "deployment_id"],
      "properties": {
        "scope_id": {
          "type": "number",
          "readOnly": true,
          "visibleOn": ["read"]
        },
        "deployment_id": {
          "type": "number",
          "readOnly": true,
          "visibleOn": ["read"]
        }
      }
    },
    "values": {}
  },
  "results": {
    "schema": {
      "type": "object",
      "required": [],
      "properties": {}
    },
    "values": {}
  },
  "annotations": {
    "show_on": ["deployment"],
    "runs_over": "deployment"
  }
}
```
Annotations: show_on and runs_over
Diagnose action specifications use annotations to tell the UI where to display the action button and what entity the action targets. Without annotations, the UI won't know where to render the Diagnose tool or what data to pass to the workflow.
show_on
Controls which sections of the UI display the Diagnose tool. The value is an array, so a single action can appear in multiple places.
| Value | Where the button appears |
|---|---|
| manage | The Manage tab of the scope |
| performance | The Performance tab of the scope |
| deployment | The Deployment detail view |
For example, "show_on": ["manage", "performance"] makes the Diagnose Scope button visible in both the Manage and Performance tabs of the scope.
runs_over
Tells the UI what entity the action runs against. This determines which parameters the UI passes to the workflow when a user clicks the button.
| Value | What the action targets | Parameters passed |
|---|---|---|
| scope | The scope as a whole | scope_id |
| deployment | A specific deployment | scope_id and deployment_id |
A scope-level action uses the scope's label selector to gather resources, while a deployment-level action narrows the scope further with the deployment ID.
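To make the narrowing concrete, here is a minimal sketch of how the two selectors could be built. The label keys (nullplatform.com/scope-id, nullplatform.com/deployment-id) are illustrative assumptions, not necessarily the agent's actual labels:

```shell
# Hypothetical label keys -- check your agent's actual labels before relying on these.
scope_id=1234
deployment_id=5678

# Scope-level action: select everything the scope owns.
scope_selector="nullplatform.com/scope-id=${scope_id}"

# Deployment-level action: narrow the same selector with the deployment ID.
deployment_selector="${scope_selector},nullplatform.com/deployment-id=${deployment_id}"

echo "$scope_selector"
echo "$deployment_selector"
```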
How they work together
The combination of show_on and runs_over gives you precise control over the Diagnose experience:
```json
"annotations": {
  "show_on": ["manage", "performance"],
  "runs_over": "scope"
}
```
This registers a button on the Manage and Performance tabs that runs diagnostics at the scope level.
```json
"annotations": {
  "show_on": ["deployment"],
  "runs_over": "deployment"
}
```
This registers a button on the Deployment view that runs diagnostics scoped to that specific deployment.
Use runs_over with the right parameters

If runs_over is "deployment", make sure deployment_id is in the required array of your parameters schema. The UI will pass it automatically, but the workflow will fail if the schema doesn't expect it.
Register the action specifications
If Diagnose isn't available on your scope yet, you need to register these action specifications manually. Use the CLI or API.
Here's an example for the Diagnose Scope action:
- CLI
- cURL
```shell
np service-specification action-specification create \
  --serviceSpecificationId $service-spec-id \
  --body '{
    "name": "Diagnose Scope",
    "slug": "diagnose-scope",
    "type": "diagnose",
    "retryable": true,
    "parameters": {
      "schema": {
        "type": "object",
        "required": ["scope_id"],
        "properties": {
          "scope_id": {
            "type": "number",
            "readOnly": true,
            "visibleOn": ["read"]
          }
        }
      },
      "values": {}
    },
    "results": {
      "schema": {
        "type": "object",
        "required": [],
        "properties": {}
      },
      "values": {}
    },
    "annotations": {
      "show_on": ["manage", "performance"],
      "runs_over": "scope"
    }
  }'
```
```shell
curl -L -X POST 'https://api.nullplatform.com/service_specification/$service-spec-id/action_specification' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <token>' \
  -d '{
    "name": "Diagnose Scope",
    "slug": "diagnose-scope",
    "type": "diagnose",
    "retryable": true,
    "parameters": {
      "schema": {
        "type": "object",
        "required": ["scope_id"],
        "properties": {
          "scope_id": {
            "type": "number",
            "readOnly": true,
            "visibleOn": ["read"]
          }
        }
      },
      "values": {}
    },
    "results": {
      "schema": {
        "type": "object",
        "required": [],
        "properties": {}
      },
      "values": {}
    },
    "annotations": {
      "show_on": ["manage", "performance"],
      "runs_over": "scope"
    }
  }'
```
The $service-spec-id is the ID of your scope specification. You can find it in the response when you create the specification, or by listing your existing ones.
Follow the same pattern to register the Diagnose Deployment action specification. See the lifecycle action specifications guide for more details on how action specs work.
The diagnose workflow
When you trigger a Diagnose action, the agent runs a workflow with three steps:
1. Load utility functions
The workflow starts by sourcing helper functions that every check relies on. These include output formatting (print_success, print_error, print_warning), resource validation (require_pods, require_services, require_ingresses), and the update_check_result function that reports check outcomes.
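As a rough sketch, an update_check_result-style helper could look like the following. The function body and file layout are assumptions for illustration; the real helper in diagnose_utils also handles status transitions and notifies the UI:

```shell
# Illustrative sketch only -- the real update_check_result in diagnose_utils does more.
update_check_result() {
  local check_name="$1" status="$2" message="$3"
  mkdir -p results
  # Persist the check outcome as a small JSON file for later aggregation.
  printf '{"name":"%s","status":"%s","logs":["%s"]}\n' \
    "$check_name" "$status" "$message" > "results/${check_name}.json"
}

update_check_result "endpoint-health" "success" "All endpoints healthy"
cat "results/endpoint-health.json"
```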
2. Build context
Before any checks run, the build_context script collects a point-in-time snapshot of the Kubernetes cluster. This snapshot ensures every check evaluates the same data, even if the cluster changes during the run.
The context includes:
| Resource | What it collects |
|---|---|
| Pods | All pods matching the scope labels |
| Services | Services associated with the deployment |
| Endpoints | Service endpoint information |
| Ingresses | Ingress resources for the scope |
| Secrets | Metadata only, no secret data |
| IngressClasses | Available ingress classes in the cluster |
| Events | Recent Kubernetes events |
| ALB controller | Controller pods and logs, if applicable |
All data is stored as JSON files in a data/ directory. Checks read from these files instead of making direct API calls, which keeps the run fast and consistent.
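For example, a check can count running pods from the snapshot file alone, with no cluster access. The snapshot content below is fabricated for illustration, and the grep-based counting is a simplification (a real check would use a proper JSON parser):

```shell
mkdir -p data
# Pretend build_context already wrote this snapshot (fabricated sample data).
cat > data/pods.json <<'EOF'
{"items":[{"status":{"phase":"Running"}},{"status":{"phase":"Pending"}}]}
EOF

# The check reads the file, never the live cluster.
running=$(grep -o '"phase":"Running"' data/pods.json | wc -l | tr -d ' ')
echo "running pods: $running"
```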
3. Execute checks
The workflow uses an executor step that discovers and runs all check scripts in parallel from three folders:
- diagnose/service/ -- Kubernetes Service checks
- diagnose/scope/ -- pod and workload checks
- diagnose/networking/ -- ingress and routing checks
For each check, the executor:
- Runs a before_each hook that sets the check status to "running" and notifies the UI.
- Executes the check script.
- Runs an after_each hook that collects the result and sends it to the UI.
This is what makes the Diagnose UI update in real time as each check completes.
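The executor loop can be sketched roughly as follows. It is sequential here for clarity (the real executor runs checks in parallel), and the hooks are reduced to plain echo statements where the real before_each/after_each scripts notify the UI:

```shell
# Simplified executor sketch: run every script in a folder, with hooks around each.
run_checks() {
  local folder="$1"
  for check in "$folder"/*; do
    [ -f "$check" ] || continue
    echo "running: $(basename "$check")"            # before_each: mark as running
    if bash "$check"; then status=success; else status=failed; fi
    echo "result: $(basename "$check") -> $status"  # after_each: report the outcome
  done
}

# Demo folder with one passing and one failing check.
mkdir -p demo_checks
printf 'exit 0\n' > demo_checks/ok_check
printf 'exit 1\n' > demo_checks/bad_check
run_checks demo_checks
```

Note that a failing check only marks its own result as failed; the loop moves on to the next script, mirroring the continue_on_error behavior described below.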
What the workflow looks like
Both the scope and deployment workflows share the same structure:
```yaml
continue_on_error: true
include:
  - "$SERVICE_PATH/values.yaml"
steps:
  - name: load_functions
    type: script
    file: "$SERVICE_PATH/diagnose/utils/diagnose_utils"
    output:
      - name: update_check_result
        type: function
      - name: notify_results
        type: function
  - name: build_context
    type: script
    file: "$SERVICE_PATH/diagnose/build_context"
    output:
      - name: CONTEXT
        type: environment
      - name: LABEL_SELECTOR
        type: environment
  - name: diagnose
    type: executor
    before_each:
      name: notify_check_running
      type: script
      file: "$SERVICE_PATH/diagnose/notify_check_running"
    after_each:
      name: notify_check_results
      type: script
      file: "$SERVICE_PATH/diagnose/notify_diagnose_results"
    folders:
      - "$SERVICE_PATH/diagnose/service"
      - "$SERVICE_PATH/diagnose/scope"
      - "$SERVICE_PATH/diagnose/networking"
```
The continue_on_error: true flag ensures that if one check fails, the remaining checks still run. This is important because you want to see the full picture, not just the first failure.
What you need to enable Diagnose
If you're adding Diagnose to your scope or verifying that it's properly configured:
- Action specifications must be registered for the scope. The agent handles this automatically during setup, but you can also register them manually.
- Workflow files must exist at k8s/scope/workflows/diagnose.yaml and k8s/deployment/workflows/diagnose.yaml.
- The diagnose module must be present at k8s/diagnose/ with its full directory structure.
- kubectl access must be available inside the agent, since build_context uses it to collect cluster data.
For standard Kubernetes agents, all of this is included out of the box. You only need to think about these components if you're customizing the agent or building support for a new runtime.
Build your own diagnose flow
If the built-in Diagnose doesn't fit your scope — or your scope runs on a runtime other than Kubernetes — you can build your own diagnose flow from scratch. This means writing your own data collection, checks, and result reporting, while following the contract that the UI expects.
What the UI needs from your workflow
The Diagnose UI renders results based on a specific structure. Your workflow must produce results that match this contract, regardless of what your checks actually inspect.
Each check must produce a JSON result with these fields:
```json
{
  "name": "My Check",
  "description": "What this check validates",
  "category": "Networking",
  "status": "success",
  "evidence": { "pods_checked": 3 },
  "logs": ["✓ All endpoints healthy", "✓ Port 8080 open"],
  "start_at": "2025-01-15T10:30:00Z",
  "end_at": "2025-01-15T10:30:02Z"
}
```
| Field | Description |
|---|---|
| name | Display name shown in the UI |
| description | Short explanation of what the check validates |
| category | Groups checks in the UI. Use any string (e.g., Networking, Scope, Database) |
| status | One of success, failed, warning, or skipped |
| evidence | Arbitrary JSON with data that helps explain the result |
| logs | Array of strings shown as check logs in the UI |
| start_at / end_at | ISO 8601 timestamps for the check execution window |
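A check script might assemble this result like so. The file name check_result.json is an arbitrary choice for this sketch; where the result is stored depends on your flow:

```shell
# Record the execution window around the check logic.
start_at=$(date -u +%Y-%m-%dT%H:%M:%SZ)
# ...the actual check logic would run here...
end_at=$(date -u +%Y-%m-%dT%H:%M:%SZ)

# Emit a result matching the contract the UI expects.
cat > check_result.json <<EOF
{
  "name": "My Check",
  "description": "What this check validates",
  "category": "Networking",
  "status": "success",
  "evidence": {},
  "logs": ["check completed"],
  "start_at": "$start_at",
  "end_at": "$end_at"
}
EOF
```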
How results reach the UI
At the end of the workflow, results must be sent to the nullplatform API using the service action update endpoint. The payload groups checks by category:
```json
{
  "results": {
    "categories": [
      {
        "category": "Networking",
        "summary": {
          "pending": 0,
          "running": 0,
          "success": 2,
          "failed": 1,
          "warning": 0,
          "skipped": 0
        },
        "checks": [
          { "name": "...", "status": "...", "...": "..." }
        ]
      }
    ]
  }
}
```
The built-in diagnose flow handles this automatically through the notify_results function, which reads all check result files and groups them. If you're building your own flow, you need to build and send this payload yourself.
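A custom flow has to do that aggregation itself. As a minimal sketch (fabricated result files, grep-based counting for brevity, and only two of the six summary counters), it might look like:

```shell
# Fabricated check results for illustration.
mkdir -p results
printf '{"name":"a","category":"Networking","status":"success"}\n' > results/a.json
printf '{"name":"b","category":"Networking","status":"failed"}\n'  > results/b.json

# Count outcomes across result files (a real flow would also group by category).
success=$(grep -l '"status":"success"' results/*.json | wc -l | tr -d ' ')
failed=$(grep -l '"status":"failed"' results/*.json | wc -l | tr -d ' ')

# Assemble the payload to send to the service action update endpoint.
printf '{"results":{"categories":[{"category":"Networking","summary":{"success":%s,"failed":%s}}]}}\n' \
  "$success" "$failed" > payload.json
cat payload.json
```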
Wiring it up
To create a complete custom diagnose flow:
1. Register the action specifications with "type": "diagnose" as shown above. Use annotations to control where the action button appears in the UI and what entity it targets.
2. Create the workflow file at the expected path. The filename must match the action type:
   - k8s/scope/workflows/diagnose.yaml for scope-level diagnosis
   - k8s/deployment/workflows/diagnose.yaml for deployment-level diagnosis
3. Collect your data. Replace build_context with whatever data collection makes sense for your runtime. The built-in flow uses kubectl to snapshot Kubernetes resources, but yours could query a database, call an API, or read metrics.
4. Write your checks. Each check should analyze the collected data and report a result with status and evidence. See Create a custom check for the check structure and conventions.
5. Report results by sending the grouped result payload to the nullplatform API at the end of the workflow.
The standard Kubernetes diagnose workflow is a good template for your own. It follows the same collect → check → report pattern described here. You can browse the source at nullplatform/scopes — k8s/diagnose.