Hud natively supports GitHub Agentic Workflows
A real shift in how software gets built
Continuous integration changed how we build software. Continuous deployment changed how we ship it. GitHub Agentic Workflows could change how we create it, and it’s worth pausing to reflect on why.
The idea is deceptively simple. You describe an outcome in plain Markdown, commit it to your repo as a workflow, and a coding agent executes it as part of GitHub Actions. GitHub calls this Continuous AI: the same automation discipline that gave us CI and CD, now pointed at the work that used to require a human in the loop. Triage an issue. Keep the docs in sync. Find a regression and open a PR for it. The workflow is version-controlled, reviewable, and runs on the infrastructure your team already trusts.
That foundation is what we wanted to build on. So we made sure Hud works natively with GitHub Agentic Workflows.
Why agents need production context
An agentic workflow is only as good as what the agent can see. A coding agent reads your repo’s source code, your diff, and your tests, but it cannot see what your code actually does once it is running live in production. It does not know that the function in the diff is on the hottest path in the system, or that the endpoint it touched already errors for 2% of requests. Without that, the agent might optimize for the wrong thing, or open a PR that compiles and passes tests but that nobody can confidently merge.
“Agentic Workflows give an agent a safe, governed place to run inside the pipeline. Hud gives it the production context to actually be right. Pairing the two is exactly the kind of integration we hoped the ecosystem would build.”
– Idan Gazit, Head of GitHub Next
This is the gap GitHub’s controls are built to manage and the gap Hud is built to fill. GitHub gives the agent a safe place to run and a gated way to ship. Hud gives it the production signal to be right: live invocations, latency percentiles, error rates, and the map from functions to the endpoints they serve, all queryable while the workflow runs. Together that is an agent that runs safely and ships something worth merging.

How the integration works
Hud is exposed as an MCP server, which GitHub Agentic Workflows can consume directly. A gh-aw workflow is a single Markdown file: a YAML frontmatter block for configuration, and a body that serves as the agent’s prompt. Hud connects in the frontmatter, with one server declaration and one secret:
mcp-servers:
  hud-mcp:
    command: "npx"
    args: ["-y", "hud-mcp@v2"]
    env:
      HUD_MCP_KEY: "${{ secrets.HUD_MCP_KEY }}"

If the agent plans to open a PR, you pair it with GitHub’s safe-outputs so the change is gated through review, and add Hud’s domains to the network allow-list:
engine: claude
safe-outputs:
  create-pull-request:
    draft: true
network:
  allowed: [defaults, node, github, api.hud.io, cdn.hud.io]
Run gh aw compile and GitHub turns the Markdown into a standard Actions lock file. Commit both, and the workflow runs like any other Action, except now the agent has production context. There is no SDK to wire in and no per-runner adapter to maintain. Because Hud is just an MCP server, the same integration works across GitHub Actions, gh-aw, and other runners. The production signal never changes shape.
What makes a good flow
Not every task makes a good agentic workflow. The ones that work share three traits.
A clear trigger. The flow fires on a well-defined event: a PR is created, a webhook is hit, a scheduled job runs, an issue is labeled. The trigger should map cleanly to the work, so the agent runs when there is something real to do and not otherwise.
A well-defined compute task. The agent needs a bounded job with a known shape: specific inputs, a clear question to answer, and the data and tools to answer it. “Score the blast radius of this diff” is a good task. “Make the codebase better” is not. The tighter the task, the more reliable the result.
A well-structured outlet. The output has to land somewhere with structure and a gate: a draft PR, a scored comment, a Slack message tagged to the right owner. A good outlet makes the result reviewable and actionable, and keeps a human in the loop on anything that ships.
Get those three right and the agent has a clear reason to run, a bounded job to do, and a safe place to put the result. That is the pattern every recipe below follows.
What it looks like in practice

Take blast-radius scoring, one of the recipes we publish. Triggered on a PR, the agent extracts the changed function signatures from the diff, resolves them to Hud function IDs, and pulls production metrics over a lookback window: invocations, p90 and p99 latency, standard deviation, and error rate. It maps those functions to their endpoints and pulls the same metrics there. From that it computes a low-medium-high score, weighting traffic, latency sensitivity, fan-out, and the inherent risk of the diff. A comment-only change is capped at 15. An auth change on a high-traffic endpoint scores Critical. The reviewer sees exactly where to look before they merge.
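To make the weighting concrete, here is a minimal Python sketch of how a score like this could be computed. It is not the published recipe: the field names, the weights, the saturation points, the 0-100 scale, the band cut-offs, and the high-traffic threshold are all assumptions chosen for illustration. Only the comment-only cap of 15 and the Critical result for an auth change on a high-traffic endpoint come from the description above.

```python
import math
from dataclasses import dataclass


@dataclass
class FunctionMetrics:
    """Production metrics for one changed function (hypothetical field names)."""
    invocations: int  # calls during the lookback window
    p99_ms: float     # 99th-percentile latency in milliseconds
    endpoints: int    # number of endpoints the function serves (fan-out)


@dataclass
class DiffRisk:
    """Inherent risk of the diff itself (hypothetical flags)."""
    comment_only: bool
    touches_auth: bool


COMMENT_ONLY_CAP = 15         # cap described in the article
HIGH_TRAFFIC_CALLS = 100_000  # assumed threshold for "high-traffic"


def band(score: int) -> str:
    """Map a 0-100 score to a label; the cut-offs are assumptions."""
    if score >= 90:
        return "Critical"
    if score >= 67:
        return "High"
    if score >= 34:
        return "Medium"
    return "Low"


def blast_radius(m: FunctionMetrics, d: DiffRisk) -> tuple[int, str]:
    """Return (score, label) for one changed function."""
    # Normalise each signal to 0..1; the saturation points are assumptions.
    traffic = min(math.log10(m.invocations + 1) / 6, 1.0)  # saturates near 1M calls
    latency = min(m.p99_ms / 1000, 1.0)                    # saturates at 1 second
    fan_out = min(m.endpoints / 10, 1.0)                   # saturates at 10 endpoints
    diff = 1.0 if d.touches_auth else 0.3

    # Assumed weights; they sum to 1.0 so the raw score stays within 0..100.
    score = round(100 * (0.35 * traffic + 0.20 * latency
                         + 0.20 * fan_out + 0.25 * diff))

    if d.comment_only:
        # A comment-only change cannot exceed the cap, whatever the traffic.
        score = min(score, COMMENT_ONLY_CAP)
    elif d.touches_auth and m.invocations >= HIGH_TRAFFIC_CALLS:
        # An auth change on a high-traffic path is always in the top band.
        score = max(score, 90)
    return score, band(score)
```

In this sketch the overrides run after the weighted sum, so a comment-only edit on the busiest function in the system still lands at 15, while an auth change on a high-traffic function is lifted into the Critical band even if its latency and fan-out are modest.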
The weekly-report recipe takes it further. On a schedule it analyzes the past week of production for regressions, spins up parallel sub-agents to investigate root causes and propose fixes, runs git blame to tag the right authors in Slack, scores and dedupes the findings, and for the top fix opens a draft PR through safe-outputs. The team gets one message on Monday: here is what got worse, here is why, here is the fix, and a draft PR is already open.
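The "scores and dedupes" step can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the recipe's actual logic: the `Finding` fields are hypothetical, and deduplicating by function ID while keeping the highest-scoring entry is one plausible choice among several.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One regression reported by a sub-agent (hypothetical shape)."""
    function_id: str  # assumed dedupe key
    summary: str
    score: float      # higher means a worse regression
    author: str       # owner resolved via git blame


def dedupe_and_rank(findings: list[Finding]) -> list[Finding]:
    """Keep the highest-scoring finding per function, worst first."""
    best: dict[str, Finding] = {}
    for f in findings:
        current = best.get(f.function_id)
        if current is None or f.score > current.score:
            best[f.function_id] = f
    return sorted(best.values(), key=lambda f: f.score, reverse=True)
```

The first element of the returned list would be the candidate for the draft PR, and the remaining entries would feed the Slack summary with their authors tagged.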
Every step depends on signals the agent could not get from source code alone. That is the whole point of the partnership. GitHub made the agent a safe, first-class part of the pipeline. Hud makes sure it works knowing what production is actually doing.
Try it
Our recipes repo has install-ready examples of GitHub Agentic Workflows, plus a mix-and-match guide for pairing any prompt with any runner. If you are already running GitHub Agentic Workflows, adding Hud is one block of frontmatter and one secret away.
Docs link is right here.