Measure pipeline reliability

Understanding how your CI/CD pipelines perform is essential for maintaining delivery velocity and software quality.
Without pipeline-level visibility, engineering leaders cannot identify flaky workflows, detect reliability regressions, or measure the impact of infrastructure changes on build stability.

This guide demonstrates how to build a comprehensive pipeline reliability dashboard in Port that answers critical questions at both the service level and team level:

  • Failure rate: What percentage of pipeline runs are failing weekly and monthly?
  • Failure trend: Is pipeline reliability improving or degrading over time?
  • Failing pipelines: Which specific workflows or builds are failing most frequently?
  • Team impact: Which teams and services are most affected by pipeline failures?

By the end of this guide, you will have a dashboard that provides full visibility into pipeline reliability health across services and teams, helping you identify problem areas and track improvement over time.

[Figure: Pipeline reliability dashboard showing failure rates, trends, and team reliability overview]

Supported integrations

This guide supports GitHub, GitLab, and Azure DevOps.

Common use cases

  • Track pipeline failure rate trends to detect reliability regressions early.
  • Identify the most frequently failing workflows or builds to prioritize fixes.
  • Monitor failure rates per team and service to understand organizational reliability health.
  • Surface services with degrading pipeline reliability before they impact delivery.
  • Compare reliability metrics across teams to identify best practices and improvement areas.

Prerequisites

This guide assumes the following:

  • Port's GitHub integration is installed in your account.
  • The githubWorkflow, githubWorkflowRun, and githubRepository blueprints already exist (these are created automatically when you install the GitHub integration).
  • A service blueprint exists with a team relation to _team (see the PR delivery metrics guide for setup instructions).

Key metrics overview

This dashboard tracks pipeline reliability metrics across two levels - individual services and teams:

Metric | What it measures | Why it matters
--- | --- | ---
Overall failure rate | Percentage of workflow runs that failed in the last month | Shows the organization-wide reliability baseline
Failing workflows | Count of workflows currently in a failure state | Highlights active reliability problems that need attention
Failure trend | Weekly count of failed workflow runs over time | Reveals whether pipeline reliability is improving or degrading
Run result distribution | Breakdown of workflow run conclusions (success, failure, cancelled, etc.) | Gives a complete picture of pipeline health beyond just failures
Top failing workflows | Workflows with the most failed runs | Identifies the specific workflows to prioritize for fixing
Team failure distribution | Failed workflow runs broken down by team | Shows which teams carry the most reliability burden
Weekly failure rate (%) | Percentage of workflow runs that failed in the last 7 days | Provides a short-term reliability signal
Monthly failure rate (%) | Percentage of workflow runs that failed in the last 30 days | Provides a longer-term reliability baseline
Failure rate trend | Whether weekly failure rate is better or worse than monthly average | Signals whether reliability is improving, stable, or degrading
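
To make the "Failure trend" row concrete, here is a minimal Python sketch of bucketing failed runs by ISO week, which is roughly what a weekly line chart over runStartedAt does. The function name and the sample runs are hypothetical and for illustration only; the dashboard performs this grouping for you.

```python
from collections import Counter
from datetime import datetime, timezone


def weekly_failure_counts(runs):
    """Count failed runs per ISO week, keyed by (iso_year, iso_week)."""
    counts = Counter()
    for run in runs:
        if run.get("conclusion") != "failure":
            continue  # only failed runs contribute to the trend
        started = datetime.strptime(run["runStartedAt"], "%Y-%m-%dT%H:%M:%SZ")
        iso = started.replace(tzinfo=timezone.utc).isocalendar()
        counts[(iso[0], iso[1])] += 1
    return dict(counts)


# Hypothetical sample data, for illustration only.
sample_runs = [
    {"conclusion": "failure", "runStartedAt": "2023-12-31T23:00:00Z"},
    {"conclusion": "failure", "runStartedAt": "2024-01-01T08:00:00Z"},
    {"conclusion": "success", "runStartedAt": "2024-01-02T08:00:00Z"},
    {"conclusion": "failure", "runStartedAt": "2024-01-07T08:00:00Z"},
    {"conclusion": "failure", "runStartedAt": "2024-01-08T08:00:00Z"},
]
print(weekly_failure_counts(sample_runs))
# {(2023, 52): 1, (2024, 1): 2, (2024, 2): 1}
```

ISO weeks start on Monday, so a run on Sunday 31 December 2023 lands in week 52 of 2023 rather than in the first week of 2024.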

Set up data model

The GitHub integration automatically creates the githubWorkflow and githubWorkflowRun blueprints with default properties. We need to ensure the workflow run blueprint has the right properties for reliability tracking, then add aggregation and calculation properties to the service and Team blueprints to surface failure metrics at those levels.

Default branch only

The integration mapping in this guide filters workflow runs to the default branch (main, master, or production) and the last 90 days. This ensures reliability metrics reflect production-relevant pipeline health rather than feature branch noise. Adjust the integration mapping selector if you need to include other branches.

Verify the GitHub workflow blueprint

The githubWorkflow blueprint should already have the following properties from the default integration. Verify they exist:

Expected workflow properties:
"path": {
"title": "Path",
"type": "string"
},
"status": {
"title": "Status",
"type": "string",
"enum": ["active", "deleted", "disabled_fork", "disabled_inactivity", "disabled_manually"],
"enumColors": {
"active": "green",
"deleted": "red"
}
},
"createdAt": {
"title": "Created At",
"type": "string",
"format": "date-time"
},
"updatedAt": {
"title": "Updated At",
"type": "string",
"format": "date-time"
},
"link": {
"title": "Link",
"type": "string",
"format": "url"
}

We need to add two additional properties that will be populated via the integration mapping to track the latest run result:

  1. Go to the Builder page of your portal.

  2. Find the GitHub Workflow blueprint and click on it.

  3. Click on the {...} button in the top right corner, and choose Edit JSON.

  4. Add the following properties to the properties section:

    Additional workflow properties:
    "result": {
    "title": "Result",
    "description": "Latest run conclusion (merged from workflow-run data)",
    "type": "string",
    "enum": ["success", "failure", "cancelled", "skipped", "timed_out", "action_required", "neutral", "stale", "startup_failure"],
    "enumColors": {
    "success": "green",
    "failure": "red",
    "cancelled": "lightGray",
    "skipped": "lightGray",
    "timed_out": "orange",
    "action_required": "yellow",
    "neutral": "lightGray",
    "stale": "darkGray",
    "startup_failure": "red"
    }
    },
    "last_triggered_at": {
    "title": "Last Triggered At",
    "type": "string",
    "format": "date-time"
    }
  5. Add the following mirror properties to the mirrorProperties section (create it if it doesn't exist). These surface the owning team and service on each workflow:

    Workflow mirror properties:
    "team_name": {
    "title": "Team",
    "path": "repository.service.team.$title"
    },
    "service_name": {
    "title": "Service",
    "path": "repository.service.$title"
    }
  6. Verify the relations section includes a relation to githubRepository:

    "repository": {
    "title": "Repository",
    "target": "githubRepository",
    "required": false,
    "many": false
    }
  7. Click Save to update the blueprint.
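
The mirror property paths in step 5 follow a chain of relations and then read a property on the last entity. The Python sketch below models that traversal over a small in-memory catalog. It is a simplified illustration, not Port's implementation: the resolve_mirror helper, the flat catalog dictionary, and all entity identifiers are hypothetical.

```python
def resolve_mirror(catalog, entity, path):
    """Follow each relation in a dotted mirror path, then read the final property."""
    *relations, prop = path.split(".")
    current = entity
    for relation in relations:
        target_id = current.get("relations", {}).get(relation)
        current = catalog.get(target_id) if target_id is not None else None
        if current is None:
            return None  # a missing link leaves the mirror property empty
    if prop == "$title":
        return current.get("title")
    return current.get("properties", {}).get(prop)


# Hypothetical entities keyed by identifier (a flat dict is a simplification).
catalog = {
    "checkout-ci": {"title": "CI", "relations": {"repository": "checkout-repo"}},
    "checkout-repo": {"title": "checkout", "relations": {"service": "checkout-svc"}},
    "checkout-svc": {"title": "Checkout", "relations": {"team": "payments"}},
    "payments": {"title": "Payments", "relations": {}},
}
workflow = catalog["checkout-ci"]
print(resolve_mirror(catalog, workflow, "repository.service.team.$title"))  # Payments
print(resolve_mirror(catalog, workflow, "repository.service.$title"))       # Checkout
```

In this model, a repository that is not related to a service leaves both team_name and service_name empty on its workflows.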

Verify the GitHub workflow run blueprint

The githubWorkflowRun blueprint should already have the following properties. Verify they exist:

Expected workflow run properties:
"name": {
"title": "Name",
"type": "string"
},
"triggeringActor": {
"title": "Triggering Actor",
"type": "string"
},
"status": {
"title": "Status",
"type": "string",
"enum": ["queued", "in_progress", "completed", "waiting", "requested", "pending"],
"enumColors": {
"completed": "green",
"in_progress": "blue",
"queued": "lightGray",
"waiting": "yellow",
"requested": "orange",
"pending": "yellow"
}
},
"conclusion": {
"title": "Conclusion",
"type": "string",
"enum": ["success", "failure", "cancelled", "skipped", "timed_out", "action_required", "neutral", "stale", "startup_failure"],
"enumColors": {
"success": "green",
"failure": "red",
"cancelled": "lightGray",
"skipped": "lightGray",
"timed_out": "orange",
"action_required": "yellow",
"neutral": "lightGray",
"stale": "darkGray",
"startup_failure": "red"
}
},
"createdAt": {
"title": "Created At",
"type": "string",
"format": "date-time"
},
"runStartedAt": {
"title": "Run Started At",
"type": "string",
"format": "date-time"
},
"updatedAt": {
"title": "Updated At",
"type": "string",
"format": "date-time"
},
"runNumber": {
"title": "Run Number",
"type": "number"
},
"runAttempt": {
"title": "Run Attempt",
"type": "number"
},
"link": {
"title": "Link",
"type": "string",
"format": "url"
},
"headBranch": {
"title": "Head Branch",
"type": "string"
}

Add the following mirror properties to surface team, service, and workflow context on each run:

  1. Go to the Builder page.

  2. Find the GitHub Workflow Run blueprint and click on it.

  3. Click on the {...} button in the top right corner, and choose Edit JSON.

  4. Add or verify the following mirrorProperties:

    Workflow run mirror properties:
    "team_name": {
    "title": "Team",
    "path": "repository.service.team.$title"
    },
    "service_name": {
    "title": "Service",
    "path": "repository.service.$title"
    },
    "workflow_name": {
    "title": "Workflow Name",
    "path": "workflow.path"
    },
    "workflow_current_result": {
    "title": "Workflow Current Result",
    "path": "workflow.result"
    }
  5. Verify the relations section includes:

    "workflow": {
    "title": "Workflow",
    "target": "githubWorkflow",
    "required": false,
    "many": false
    },
    "repository": {
    "title": "Repository",
    "target": "githubRepository",
    "required": false,
    "many": false
    }
  6. Click Save to update the blueprint.

Update the service blueprint

Add aggregation and calculation properties to the service blueprint to surface pipeline reliability metrics for each service.

  1. Go to your Builder page.
  2. Find the Service blueprint and click on it.
  3. Click on the {...} button in the top right corner, and choose Edit JSON.
  4. Add the following entries to the aggregationProperties section of the blueprint:
Service reliability aggregation properties:
"workflow_runs_7d": {
"title": "Weekly Workflow Runs",
"description": "Total workflow runs in the last 7 days",
"type": "number",
"target": "githubWorkflowRun",
"query": {
"combinator": "and",
"rules": [
{
"property": "createdAt",
"operator": "between",
"value": {
"preset": "lastWeek"
}
}
]
},
"calculationSpec": {
"func": "count",
"calculationBy": "entities"
}
},
"failed_workflow_runs_7d": {
"title": "Weekly Failed Workflow Runs",
"description": "Workflow runs that ended in failure in the last 7 days",
"type": "number",
"target": "githubWorkflowRun",
"query": {
"combinator": "and",
"rules": [
{
"property": "createdAt",
"operator": "between",
"value": {
"preset": "lastWeek"
}
},
{
"property": "conclusion",
"operator": "=",
"value": "failure"
}
]
},
"calculationSpec": {
"func": "count",
"calculationBy": "entities"
}
},
"workflow_runs_30d": {
"title": "Monthly Workflow Runs",
"description": "Total workflow runs in the last 30 days",
"type": "number",
"target": "githubWorkflowRun",
"query": {
"combinator": "and",
"rules": [
{
"property": "createdAt",
"operator": "between",
"value": {
"preset": "lastMonth"
}
}
]
},
"calculationSpec": {
"func": "count",
"calculationBy": "entities"
}
},
"failed_workflow_runs_30d": {
"title": "Monthly Failed Workflow Runs",
"description": "Workflow runs that ended in failure in the last 30 days",
"type": "number",
"target": "githubWorkflowRun",
"query": {
"combinator": "and",
"rules": [
{
"property": "createdAt",
"operator": "between",
"value": {
"preset": "lastMonth"
}
},
{
"property": "conclusion",
"operator": "=",
"value": "failure"
}
]
},
"calculationSpec": {
"func": "count",
"calculationBy": "entities"
}
}
  5. Add the following entries to the calculationProperties section of the blueprint (these are the same regardless of your SCM provider, since the aggregation property names are identical):

    Service reliability calculation properties:
    "weekly_workflow_failure_rate": {
    "title": "Weekly Failure Rate (%)",
    "description": "Percentage of pipeline runs that failed in the last 7 days",
    "type": "number",
    "calculation": "if (.properties.workflow_runs_7d != null and .properties.workflow_runs_7d != 0) then ((.properties.failed_workflow_runs_7d // 0) / .properties.workflow_runs_7d) * 100 | floor else 0 end"
    },
    "monthly_workflow_failure_rate": {
    "title": "Monthly Failure Rate (%)",
    "description": "Percentage of pipeline runs that failed in the last 30 days",
    "type": "number",
    "calculation": "if (.properties.workflow_runs_30d != null and .properties.workflow_runs_30d != 0) then ((.properties.failed_workflow_runs_30d // 0) / .properties.workflow_runs_30d) * 100 | floor else 0 end"
    },
    "failure_rate_trend": {
    "title": "Failure Rate Trend",
    "description": "Weekly failure rate vs monthly average — Improving, Stable, or Degrading",
    "type": "string",
    "colorized": true,
    "colors": {
    "Improving": "green",
    "Stable": "blue",
    "Degrading": "red"
    },
    "calculation": "((if (.properties.workflow_runs_30d != null and .properties.workflow_runs_30d != 0) then ((.properties.failed_workflow_runs_30d // 0) / .properties.workflow_runs_30d) * 100 | floor else 0 end) - (if (.properties.workflow_runs_7d != null and .properties.workflow_runs_7d != 0) then ((.properties.failed_workflow_runs_7d // 0) / .properties.workflow_runs_7d) * 100 | floor else 0 end)) as $diff | if $diff > 0 then \"Improving\" elif $diff < 0 then \"Degrading\" else \"Stable\" end"
    }
  6. Click Save to update the blueprint.
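
The calculation properties above are jq expressions. As a reading aid, the following Python sketch mirrors their logic: a floored percentage guarded against empty or zero totals, and a trend derived from the monthly rate minus the weekly rate. The function names are hypothetical; only the arithmetic is taken from the expressions above.

```python
import math


def failure_rate(failed, total):
    """Floored percentage of failed runs; 0 when there are no runs."""
    if not total:  # covers both None and 0, like the jq guard
        return 0
    return math.floor(((failed or 0) / total) * 100)


def failure_rate_trend(failed_7d, total_7d, failed_30d, total_30d):
    """Compare the weekly failure rate with the monthly failure rate."""
    diff = failure_rate(failed_30d, total_30d) - failure_rate(failed_7d, total_7d)
    if diff > 0:
        return "Improving"  # weekly rate is lower than the monthly rate
    if diff < 0:
        return "Degrading"  # weekly rate is higher than the monthly rate
    return "Stable"


print(failure_rate(1, 3))                   # 33
print(failure_rate_trend(1, 3, 10, 40))     # Degrading (33% weekly vs 25% monthly)
print(failure_rate_trend(0, 5, 10, 40))     # Improving (0% weekly vs 25% monthly)
print(failure_rate_trend(None, 0, None, 0)) # Stable
```

One consequence worth knowing: a service with no runs in the last 7 days gets a weekly rate of 0, so it reads as "Improving" whenever its monthly rate is above zero.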

Update the team blueprint

Add aggregation and calculation properties to the Team blueprint to aggregate reliability metrics across all services owned by each team.

  1. Go to your Builder page.
  2. Find the Team blueprint and click on it.
  3. Click on the {...} button in the top right corner, and choose Edit JSON.
  4. Add the following entries to the aggregationProperties section of the blueprint:
Team reliability aggregation properties:
"workflow_runs_7d": {
"title": "Weekly Workflow Runs",
"description": "Total workflow runs in the last 7 days",
"type": "number",
"target": "githubWorkflowRun",
"query": {
"combinator": "and",
"rules": [
{
"property": "createdAt",
"operator": "between",
"value": {
"preset": "lastWeek"
}
}
]
},
"calculationSpec": {
"func": "count",
"calculationBy": "entities"
}
},
"failed_workflow_runs_7d": {
"title": "Weekly Failed Workflow Runs",
"description": "Workflow runs that ended in failure in the last 7 days",
"type": "number",
"target": "githubWorkflowRun",
"query": {
"combinator": "and",
"rules": [
{
"property": "createdAt",
"operator": "between",
"value": {
"preset": "lastWeek"
}
},
{
"property": "conclusion",
"operator": "=",
"value": "failure"
}
]
},
"calculationSpec": {
"func": "count",
"calculationBy": "entities"
}
},
"workflow_runs_30d": {
"title": "Monthly Workflow Runs",
"description": "Total workflow runs in the last 30 days",
"type": "number",
"target": "githubWorkflowRun",
"query": {
"combinator": "and",
"rules": [
{
"property": "createdAt",
"operator": "between",
"value": {
"preset": "lastMonth"
}
}
]
},
"calculationSpec": {
"func": "count",
"calculationBy": "entities"
}
},
"failed_workflow_runs_30d": {
"title": "Monthly Failed Workflow Runs",
"description": "Workflow runs that ended in failure in the last 30 days",
"type": "number",
"target": "githubWorkflowRun",
"query": {
"combinator": "and",
"rules": [
{
"property": "createdAt",
"operator": "between",
"value": {
"preset": "lastMonth"
}
},
{
"property": "conclusion",
"operator": "=",
"value": "failure"
}
]
},
"calculationSpec": {
"func": "count",
"calculationBy": "entities"
}
}
  5. Add the following entries to the calculationProperties section of the blueprint (these are the same regardless of your SCM provider):

    Team reliability calculation properties:
    "weekly_workflow_failure_rate": {
    "title": "Weekly Failure Rate (%)",
    "description": "Percentage of pipeline runs that failed in the last 7 days",
    "type": "number",
    "calculation": "if (.properties.workflow_runs_7d != null and .properties.workflow_runs_7d != 0) then ((.properties.failed_workflow_runs_7d // 0) / .properties.workflow_runs_7d) * 100 | floor else 0 end"
    },
    "monthly_workflow_failure_rate": {
    "title": "Monthly Failure Rate (%)",
    "description": "Percentage of pipeline runs that failed in the last 30 days",
    "type": "number",
    "calculation": "if (.properties.workflow_runs_30d != null and .properties.workflow_runs_30d != 0) then ((.properties.failed_workflow_runs_30d // 0) / .properties.workflow_runs_30d) * 100 | floor else 0 end"
    },
    "failure_rate_trend": {
    "title": "Failure Rate Trend",
    "description": "Weekly failure rate vs monthly average — Improving, Stable, or Degrading",
    "type": "string",
    "colorized": true,
    "colors": {
    "Improving": "green",
    "Stable": "blue",
    "Degrading": "red"
    },
    "calculation": "((if (.properties.workflow_runs_30d != null and .properties.workflow_runs_30d != 0) then ((.properties.failed_workflow_runs_30d // 0) / .properties.workflow_runs_30d) * 100 | floor else 0 end) - (if (.properties.workflow_runs_7d != null and .properties.workflow_runs_7d != 0) then ((.properties.failed_workflow_runs_7d // 0) / .properties.workflow_runs_7d) * 100 | floor else 0 end)) as $diff | if $diff > 0 then \"Improving\" elif $diff < 0 then \"Degrading\" else \"Stable\" end"
    }
  6. Click Save to update the blueprint.

Update integration mapping

Now we'll update the GitHub integration mapping to populate the workflow and workflow run blueprints with the properties needed for reliability tracking. The default mapping may already handle basic workflow data, but we need to ensure the workflow run mapping filters to the default branch and includes the conclusion property.

  1. Go to your Data Source page.

  2. Select the GitHub integration.

  3. Find or add the workflow resource block in the mapping:

    Workflow mapping:
      - kind: workflow
        selector:
          query: 'true'
        port:
          entity:
            mappings:
              identifier: >-
                (.url | capture("repos/(?<repo>[^/]+/[^/]+)/") | .repo) +
                (.id|tostring)
              title: .name
              blueprint: '"githubWorkflow"'
              properties:
                path: .path
                status: .state
                createdAt: .created_at
                updatedAt: .updated_at
                link: .html_url
              relations:
                repository: >-
                  .url | capture("repos/[^/]+/(?<repo>[^/]+)/") | .repo
  4. Find or add the workflow-run resource block. This mapping creates workflow run entities filtered to the default branch and last 90 days:

    Workflow run mapping:
      - kind: workflow-run
        selector:
          query: >-
            (.head_branch | IN("main", "master", "production")) and
            ((.created_at | fromdateiso8601) > (now - 7776000))
        port:
          entity:
            mappings:
              identifier: .repository.full_name + (.id|tostring)
              title: .display_title
              blueprint: '"githubWorkflowRun"'
              properties:
                name: .name
                triggeringActor: .triggering_actor.login
                status: .status
                conclusion: .conclusion
                createdAt: .created_at
                runStartedAt: .run_started_at
                updatedAt: .updated_at
                runNumber: .run_number
                runAttempt: .run_attempt
                link: .html_url
                headBranch: .head_branch
              relations:
                workflow: .repository.full_name + (.workflow_id|tostring)
                repository: .repository.name
  5. Add a second workflow-run resource block that updates the workflow blueprint with the latest run result. This "merges" the most recent conclusion back onto the workflow entity:

    Workflow result update mapping:
      - kind: workflow-run
        selector:
          query: >-
            (.head_branch | IN("main", "master", "production")) and
            ((.created_at | fromdateiso8601) > (now - 7776000))
        port:
          entity:
            mappings:
              identifier: .repository.full_name + (.workflow_id|tostring)
              title: .repository.full_name + (.workflow_id|tostring)
              blueprint: '"githubWorkflow"'
              properties:
                result: .conclusion
                last_triggered_at: .run_started_at
  6. Click Save & Resync to apply the mapping.
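
The mappings above only link runs to workflows if both sides produce the same identifier string. The Python sketch below re-implements the relevant jq expressions with regular expressions to show that they line up. The payload fragments are hypothetical and merely shaped like GitHub API responses; the function names are illustrative.

```python
import re


def workflow_identifier(workflow):
    """Mirror of the workflow mapping identifier: '<owner>/<repo>' + workflow id."""
    repo = re.search(r"repos/([^/]+/[^/]+)/", workflow["url"]).group(1)
    return repo + str(workflow["id"])


def workflow_repository(workflow):
    """Mirror of the workflow mapping's repository relation: repo name only."""
    return re.search(r"repos/[^/]+/([^/]+)/", workflow["url"]).group(1)


def run_workflow_relation(run):
    """Mirror of the workflow-run mapping's workflow relation."""
    return run["repository"]["full_name"] + str(run["workflow_id"])


# Hypothetical payload fragments, for illustration only.
workflow = {
    "id": 161335,
    "url": "https://api.github.com/repos/octo-org/octo-repo/actions/workflows/161335",
}
run = {
    "id": 30433642,
    "workflow_id": 161335,
    "repository": {"name": "octo-repo", "full_name": "octo-org/octo-repo"},
}
print(workflow_identifier(workflow))                                 # octo-org/octo-repo161335
print(workflow_repository(workflow))                                 # octo-repo
print(run_workflow_relation(run) == workflow_identifier(workflow))   # True
```

The same string is also the identifier used by the result update mapping, which is why that block writes onto the existing workflow entity instead of creating a new one.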

Branch filter

The workflow run selector filters to main, master, and production branches only. This ensures reliability metrics reflect production-relevant pipeline health. If your repositories use different default branch names, update the IN(...) list accordingly.
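
For readers less familiar with jq, this is a minimal Python sketch of what the selector checks: the branch must be in the allowed list and the run must be younger than 7776000 seconds (90 days). The in_scope function and the sample runs are hypothetical; adjust DEFAULT_BRANCHES in the same way you would adjust the IN(...) list.

```python
from datetime import datetime, timezone

DEFAULT_BRANCHES = {"main", "master", "production"}
NINETY_DAYS_SECONDS = 90 * 24 * 60 * 60  # 7776000, the constant in the selector


def in_scope(run, now):
    """Return True when the run would pass the selector at time `now`."""
    created = datetime.strptime(run["created_at"], "%Y-%m-%dT%H:%M:%SZ")
    created = created.replace(tzinfo=timezone.utc)
    age_seconds = (now - created).total_seconds()
    return run["head_branch"] in DEFAULT_BRANCHES and age_seconds < NINETY_DAYS_SECONDS


# Hypothetical reference time and runs, for illustration only.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(in_scope({"head_branch": "main", "created_at": "2024-05-01T00:00:00Z"}, now))       # True
print(in_scope({"head_branch": "feature/x", "created_at": "2024-05-01T00:00:00Z"}, now))  # False
print(in_scope({"head_branch": "main", "created_at": "2024-01-01T00:00:00Z"}, now))       # False
```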

Visualize metrics

We will create a dedicated dashboard to monitor pipeline reliability metrics using Port's customizable widgets.

Create the dashboard

First, let's create an Engineering Intelligence folder (if it doesn't already exist) to organize your dashboards, then add the Pipeline Reliability dashboard inside it:

  1. Navigate to your software catalog.
  2. Click on the + button in the left sidebar.
  3. Select New folder.
  4. Name the folder Engineering Intelligence and click Create.
  5. Inside the Engineering Intelligence folder, click + again.
  6. Select New dashboard.
  7. Name the dashboard Pipeline Reliability and click Create.

Add widgets

You can populate the dashboard either with an API script or by creating each widget manually through the UI.

The fastest way to set up the dashboard is by using Port's API to create all widgets at once.

Get your Port API token

  1. In your Port portal, click on your profile picture in the top right corner.

  2. Select Credentials.

  3. Click Generate API token.

  4. Copy the generated token and store it as an environment variable:

    export PORT_ACCESS_TOKEN="YOUR_GENERATED_TOKEN"
EU region

If your portal is hosted in the EU region, replace api.getport.io with api.port-eu.io in the dashboard creation command below.

Create the dashboard with widgets

Save the following JSON to a file named reliability_dashboard.json:

Dashboard JSON payload:
{
"identifier": "pipeline_reliability",
"title": "Pipeline Reliability",
"icon": "Apps",
"type": "dashboard",
"description": "Based on default branch (main/master/production) workflow runs only. Adjust the integration mapping selector to include other branches if needed.",
"parent": "engineering_intelligence",
"widgets": [
{
"id": "reliabilityDashboardWidget",
"type": "dashboard-widget",
"layout": [
{
"height": 512,
"columns": [
{"id": "overallFailureRate", "size": 3},
{"id": "totalFailingWorkflows", "size": 3},
{"id": "failureTrend", "size": 3},
{"id": "workflowRunResults", "size": 3}
]
},
{
"height": 482,
"columns": [
{"id": "topFailingWorkflows", "size": 4},
{"id": "teamFailureBar", "size": 4},
{"id": "serviceFailureBar", "size": 4}
]
},
{
"height": 400,
"columns": [
{"id": "teamReliabilityTable", "size": 12}
]
},
{
"height": 400,
"columns": [
{"id": "serviceReliabilityTable", "size": 12}
]
},
{
"height": 488,
"columns": [
{"id": "failedRunsTable", "size": 12}
]
}
],
"widgets": [
{
"id": "overallFailureRate",
"type": "entities-number-chart",
"title": "Overall Failure Rate % (Last Month)",
"icon": "Metric",
"description": "Percentage of workflow runs that failed across all services in the organisation",
"blueprint": "_team",
"chartType": "aggregateByProperty",
"calculationBy": "property",
"func": "average",
"property": "monthly_workflow_failure_rate",
"averageOf": "total",
"displayFormatting": "round",
"unit": "%",
"unitAlignment": "right",
"dataset": {"combinator": "and", "rules": []}
},
{
"id": "totalFailingWorkflows",
"type": "entities-number-chart",
"title": "Total Failing Workflows",
"icon": "Metric",
"description": "Workflows whose current result is in failure state",
"blueprint": "githubWorkflow",
"chartType": "countEntities",
"calculationBy": "entities",
"func": "count",
"unit": "none",
"dataset": {
"combinator": "and",
"rules": [
{"property": "result", "operator": "=", "value": "failure"}
]
}
},
{
"id": "failureTrend",
"type": "line-chart",
"title": "Workflow Runs Failure Trend (Weekly)",
"icon": "LineChart",
"description": "Weekly trend of failed workflow runs over the past 6 months",
"blueprint": "githubWorkflowRun",
"chartType": "countEntities",
"func": "count",
"measureTimeBy": "runStartedAt",
"timeInterval": "isoWeek",
"timeRange": {"preset": "last6Months"},
"xAxisTitle": "",
"yAxisTitle": "# Failed Workflows",
"dataset": {
"combinator": "and",
"rules": [
{"property": "conclusion", "operator": "=", "value": "failure"}
]
}
},
{
"id": "workflowRunResults",
"type": "entities-pie-chart",
"title": "Workflow Run Results (Last Month)",
"icon": "Pie",
"description": "Breakdown of all workflow runs by conclusion status across the organisation",
"blueprint": "githubWorkflowRun",
"property": "property#conclusion",
"dataset": {
"combinator": "and",
"rules": [
{"property": "createdAt", "operator": "between", "value": {"preset": "lastMonth"}}
]
}
},
{
"id": "topFailingWorkflows",
"type": "bar-chart",
"title": "Top Failing Workflows (Last 30 Days)",
"icon": "Bar",
"description": "Workflows with the most failed runs (last 30 days)",
"blueprint": "githubWorkflowRun",
"property": "mirror-property#workflow_name",
"dataset": {
"combinator": "and",
"rules": [
{"property": "runStartedAt", "operator": "between", "value": {"preset": "lastMonth"}},
{"property": "conclusion", "operator": "=", "value": "failure"}
]
}
},
{
"id": "teamFailureBar",
"type": "bar-chart",
"title": "Teams with Most Workflow Failures (Last 30 Days)",
"icon": "Bar",
"description": "Distribution of failed workflow runs by team",
"blueprint": "githubWorkflowRun",
"property": "mirror-property#team_name",
"dataset": {
"combinator": "and",
"rules": [
{"property": "runStartedAt", "operator": "between", "value": {"preset": "lastMonth"}},
{"property": "conclusion", "operator": "=", "value": "failure"}
]
}
},
{
"id": "serviceFailureBar",
"type": "bar-chart",
"title": "Services with Most Workflow Failures (Last 30 Days)",
"icon": "Bar",
"description": "Distribution of failed workflow runs by service",
"blueprint": "githubWorkflowRun",
"property": "mirror-property#service_name",
"dataset": {
"combinator": "and",
"rules": [
{"property": "conclusion", "operator": "=", "value": "failure"},
{"property": "runStartedAt", "operator": "between", "value": {"preset": "lastMonth"}}
]
}
},
{
"id": "teamReliabilityTable",
"type": "table-entities-explorer",
"displayMode": "widget",
"title": "Team Reliability Overview",
"icon": "Table",
"description": "Workflow failure rates per team",
"blueprint": "_team",
"dataset": {"combinator": "and", "rules": [{"property": "type", "operator": "=", "value": "team"}]},
"excludedFields": [],
"blueprintConfig": {
"_team": {
"groupSettings": {"groupBy": []},
"propertiesSettings": {
"order": ["$title", "workflow_runs_30d", "failed_workflow_runs_30d", "monthly_workflow_failure_rate", "weekly_workflow_failure_rate", "failure_rate_trend", "workflow_runs_7d", "failed_workflow_runs_7d"],
"shown": ["$title", "workflow_runs_30d", "failed_workflow_runs_30d", "monthly_workflow_failure_rate", "weekly_workflow_failure_rate", "failure_rate_trend"]
},
"filterSettings": {"filterBy": {"combinator": "and", "rules": []}},
"sortSettings": {"sortBy": []}
}
}
},
{
"id": "serviceReliabilityTable",
"type": "table-entities-explorer",
"displayMode": "widget",
"title": "Service Reliability Overview",
"icon": "Table",
"description": "Workflow failure rates per service",
"blueprint": "service",
"dataset": {"combinator": "and", "rules": []},
"excludedFields": [],
"blueprintConfig": {
"service": {
"groupSettings": {"groupBy": ["parent_team_name", "team"]},
"propertiesSettings": {
"order": ["$title", "workflow_runs_30d", "failed_workflow_runs_30d", "monthly_workflow_failure_rate", "weekly_workflow_failure_rate", "failure_rate_trend", "workflow_runs_7d", "failed_workflow_runs_7d"],
"shown": ["$title", "workflow_runs_30d", "failed_workflow_runs_30d", "monthly_workflow_failure_rate", "weekly_workflow_failure_rate", "failure_rate_trend"]
},
"filterSettings": {"filterBy": {"combinator": "and", "rules": []}},
"sortSettings": {"sortBy": []}
}
}
},
{
"id": "failedRunsTable",
"type": "table-entities-explorer",
"displayMode": "widget",
"title": "Failed Workflow Runs (Last Month)",
"icon": "Table",
"description": "Runs from the last month that belong to workflows whose latest result is failure",
"blueprint": "githubWorkflowRun",
"dataset": {
"combinator": "and",
"rules": [
{"property": "workflow_current_result", "operator": "=", "value": "failure"},
{"property": "createdAt", "operator": "between", "value": {"preset": "lastMonth"}}
]
},
"excludedFields": [],
"blueprintConfig": {
"githubWorkflowRun": {
"groupSettings": {"groupBy": ["team_name", "service_name", "repository"]},
"propertiesSettings": {
"order": ["repository", "service_name", "team_name", "workflow_name", "link", "$title", "conclusion", "runStartedAt"],
"shown": ["$title", "conclusion", "runStartedAt", "link", "workflow_name"]
},
"filterSettings": {"filterBy": {"combinator": "and", "rules": []}},
"sortSettings": {"sortBy": []}
}
}
}
]
}
]
}
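
Because the payload above references each widget twice (once in layout and once in the inner widgets list), a typo in an id is easy to miss. The following Python sketch is an optional sanity check that compares the two sets of ids. The check_layout helper and the trimmed-down sample page are hypothetical; the nesting follows the payload above.

```python
def check_layout(page):
    """Return (layout ids with no widget, widget ids missing from the layout)."""
    placed, defined = set(), set()
    for container in page.get("widgets", []):
        for row in container.get("layout", []):
            placed.update(column["id"] for column in row.get("columns", []))
        defined.update(widget["id"] for widget in container.get("widgets", []))
    return sorted(placed - defined), sorted(defined - placed)


# Trimmed-down, hypothetical page that reuses the nesting of the payload above.
page = {
    "identifier": "pipeline_reliability",
    "widgets": [
        {
            "id": "reliabilityDashboardWidget",
            "type": "dashboard-widget",
            "layout": [
                {"height": 512, "columns": [
                    {"id": "overallFailureRate", "size": 6},
                    {"id": "failureTrend", "size": 6},
                ]}
            ],
            "widgets": [
                {"id": "overallFailureRate", "type": "entities-number-chart"},
                {"id": "workflowRunResults", "type": "entities-pie-chart"},
            ],
        }
    ],
}
print(check_layout(page))  # (['failureTrend'], ['workflowRunResults'])
```

To check the real file, load reliability_dashboard.json with json.load and pass the result to check_layout; two empty lists mean every layout slot has a widget and every widget is placed.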

Then run the following command to create the dashboard with all widgets:

curl -s -X POST "https://api.getport.io/v1/pages" \
-H "Authorization: Bearer $PORT_ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d @reliability_dashboard.json | python3 -m json.tool
Engineering Intelligence folder

The script assumes that an engineering_intelligence folder already exists in your catalog. If you haven't created it yet, follow steps 1-4 in the "Create the dashboard" section first; otherwise the script will return an error.
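
If you prefer Python over curl, the same request can be sketched with the standard library. This is a minimal sketch under the assumptions already stated in this guide: the endpoint is /v1/pages, the token is in PORT_ACCESS_TOKEN, and the payload is in reliability_dashboard.json. The function names are hypothetical, and the api_host parameter is where an EU-hosted portal would pass api.port-eu.io.

```python
import json
import os
import urllib.request


def build_create_page_request(payload, token, api_host="api.getport.io"):
    """Build (but do not send) the POST request that creates the dashboard page."""
    return urllib.request.Request(
        url=f"https://{api_host}/v1/pages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_dashboard(path="reliability_dashboard.json", api_host="api.getport.io"):
    """Read the payload file and send it, using PORT_ACCESS_TOKEN from the environment."""
    with open(path, encoding="utf-8") as handle:
        payload = json.load(handle)
    request = build_create_page_request(payload, os.environ["PORT_ACCESS_TOKEN"], api_host)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling create_dashboard() sends the request; build_create_page_request is kept separate so the URL and headers can be inspected without touching the network.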

Next steps

Once your Pipeline Reliability dashboard is in place, consider these additional improvements:

  • Create automations to send Slack notifications when a service's failure rate exceeds a threshold or when the failure rate trend changes to "Degrading".
  • Add an AI agent to provide natural language insights into your reliability data, helping identify patterns in failures and recommending fixes.
  • Combine with delivery metrics by adding reliability columns to your existing Delivery Performance dashboard for a unified engineering intelligence view.