Introducing Hud - The Runtime Code Sensor for Taming Code-Generating AI
TL;DR: Coding agents are incredible - majestic and powerful - but in complex, enterprise-scale production environments they can still wreak havoc. Hud’s Runtime Code Sensor streams live production data into their context, so the code they generate runs safely in the real world. Here’s our story.
---
Hello world, May and Roee here, and we are so excited to share with you what we’ve been building together with our fabulous team.
We are software engineers. We are excited by the craft of building software. We believe coding agents represent the biggest change to that craft in decades.
Coding agents are amazing. Trained on massive bodies of code, they’re really awesome at many things, especially building new things from scratch.
But when it comes to changing or adding to existing systems that operate at scale, they struggle. And how could they not? They’re missing half the picture: how the code behaves in production, where issues are sometimes caused by bad code, sometimes by a failing 3rd party, but most often by some frustratingly intricate combination of the two. There’s a gap between coding agents and reality.
This is where Hud comes in - we are the bridge between coding agents and reality.
We developed a powerful Runtime Code Sensor: you install it in your service in a minute, it runs with your code in production, and it brings back to the coding agent the context it needs to build production-safe code. We think this is what the future looks like, and we call it "production-aware development".
A new software stack is emerging. Yesterday’s development and observability tools weren’t designed for AI; the era of agentic code generation calls for a new way of thinking. Just as AI has the code "right there" to reason over, it needs production data "right there" to reason over.
How can agents be expected to perform without production awareness? Without visibility into how the entire codebase behaves in production, right now, right here at their artificial fingertips? How can they be expected to write robust code without this data?
We’d love to share a bit of Hud's origin story.
Hud was founded by Roee Adler (CEO) and May Walter (CTO) together with Shai Wininger (non-executive co-founder). Among us, we’ve built multiple startups, managed teams across many countries, worked together on building big things, and been through 4 acquisitions and 3 IPOs.
We started brainstorming Hud in early 2023, immediately after the LLM revolution began. We looked at how software is built and what its future looks like, and got curious about why, despite amazing observability tools, most engineers operate without knowing how their code behaves in production. We believed that for engineers to write high-quality code in scalable systems, they must know how the code behaves in production - and so we asked the simple question: why don’t they?
Our realization was that while in theory you could put logs and traces everywhere automatically, doing so would create three problems:
1. It would slow production down
2. It would be very expensive
3. The user experience isn’t built for consuming such data in a helpful way
Simply put, the contemporary observability stack is not built for ubiquity. You either have to tell it in advance what you want and then wait, or use on-demand access that slows production down.
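To put rough numbers on "very expensive" - these figures are assumptions for illustration, not measurements from any real system:

```typescript
// Back-of-envelope cost of logging every function call, with made-up
// but plausible numbers for a single busy service:
const reqPerSec = 5_000;     // requests per second
const fnCallsPerReq = 50;    // functions touched per request
const bytesPerLogLine = 200; // one log line per function call

const bytesPerDay = reqPerSec * fnCallsPerReq * bytesPerLogLine * 86_400;
console.log(`${(bytesPerDay / 1e12).toFixed(1)} TB/day`); // ≈ 4.3 TB/day, per service
```

Multiply that by hundreds of services and the ingest bill alone makes "log everything" a non-starter.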
Part of the challenge comes from the fact that code in production has multiple layers, among them (illustrated in the sketch after this list):
1. A business layer: e.g. endpoints, queue consumers
2. A code layer: the functions themselves
3. External dependencies: e.g. databases, IO, other services
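Here’s a toy Express + Postgres handler where all three layers meet in one request path. This is purely illustrative - not Hud code, and all names are made up:

```typescript
import express from "express";
import { Pool } from "pg";

const app = express();
const db = new Pool(); // external dependency: a Postgres database

// Code layer: a plain function whose production behavior
// (latency, error rate, real input shapes) is invisible from the IDE.
async function getOrderTotal(orderId: string): Promise<number> {
  const { rows } = await db.query(
    "SELECT SUM(price) AS total FROM order_items WHERE order_id = $1",
    [orderId],
  );
  return Number(rows[0]?.total ?? 0);
}

// Business layer: the endpoint an APM knows about.
app.get("/orders/:id/total", async (req, res) => {
  try {
    res.json({ total: await getOrderTotal(req.params.id) });
  } catch {
    // Was this bad code, a failing dependency, or a combination of the two?
    res.status(500).json({ error: "internal error" });
  }
});

app.listen(3000);
```

An APM sees the route, an error tracker sees the exception, the database team sees a slow query - and no single view tells you what actually went wrong.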
Some systems know about the business layer (e.g. APMs), some know about the code layer (e.g. loggers or error trackers), and some about 3rd parties. Sadly, each layer’s data on its own is often not only insufficient, but misleading, or a waste of time. To make matters worse, the code engineers see in their IDEs is not the same code that runs in production, especially in dynamic languages. Between transpilation and V8 optimizations, correlating something in production with something in the IDE at scale is pretty difficult.
We realized that to create the next generation of tools and systems for the future of high-quality code, we needed to think from first principles, fundamentally differently. No logs or spans or traces, but a Runtime Code Sensor that sees everything, does sophisticated processing at the edge, and has a negligible footprint. To achieve that, we needed a different technical approach - so we assembled a team of low-level cybersecurity researchers and engineers, bringing a wealth of experience in operating-system and runtime reverse engineering, to build something completely different:
- An SDK sensor that runs with the code and constantly understands its behavior
- The SDK sends mostly aggregated statistical data - i.e. a tiny fraction of what logs would send - reducing both performance overhead and egress costs (see the sketch after this list)
- Because this is function-level data by design, we can embed it alongside the code in the IDE itself, serving this context exactly where code is written
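To give a feel for what aggregated, function-level data could look like, here’s a hypothetical shape. The field names are ours, for illustration - not Hud’s actual schema or wire format:

```typescript
// Hypothetical sketch of aggregated function-level telemetry.
interface FunctionStats {
  functionId: string;  // stable ID correlating runtime code to source
  file: string;        // source location, e.g. resolved through source maps
  line: number;
  invocations: number; // counts over the aggregation window
  errors: number;
  p50Ms: number;       // latency percentiles instead of per-call spans
  p95Ms: number;
  p99Ms: number;
  callers: string[];   // which business-layer entry points reach this function
}

// One small batch per window, instead of a log line per invocation:
interface StatsBatch {
  serviceId: string;
  windowStart: string; // ISO timestamp
  windowEnd: string;
  functions: FunctionStats[];
}
```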
When we started the company in July 2023, we had two hypotheses:
1. If reality is right there in the context where code is written, then different code will be written - changing software engineering itself.
2. Context that helps engineers write better code will be even more impactful when used by code-generating AI.
We started by productizing the Runtime Code Sensor for enterprise observability. Our sensor could find problems in production much earlier than other systems, and when it found an issue it "knew" the root cause in the code, without needing any configuration or guidance, and served this data in the IDE right where the engineer was working. Super cool. Our users ran our sensor in hundreds of millions of pods, and we developed confidence in its uniqueness and robustness.
We kept trying to feed the data to code-generating LLMs, but in our first year the results weren't very exciting.
Then, around March 2025, things started to change. Better models, IDEs with agentic modes, and the arrival of MCP - suddenly, feeding our data to a modern IDE in agentic mode completely changed how it reasoned over and wrote code.
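For a rough sense of the mechanics, here’s a minimal sketch of exposing runtime stats over MCP using the public MCP TypeScript SDK (@modelcontextprotocol/sdk). The tool name and payload are hypothetical - this is not Hud’s actual MCP server:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "runtime-stats-demo", version: "0.1.0" });

// Hypothetical tool: an agentic IDE can call this while editing a function
// to pull that function's production behavior into its context.
server.tool(
  "get_function_runtime_stats",
  { file: z.string(), functionName: z.string() },
  async ({ file, functionName }) => {
    // A real server would query the sensor's backend here;
    // we return canned numbers just to show the plumbing.
    const stats = { invocations: 128_400, errorRate: 0.0031, p95Ms: 42 };
    return {
      content: [
        { type: "text", text: JSON.stringify({ file, functionName, ...stats }) },
      ],
    };
  },
);

await server.connect(new StdioServerTransport());
```

Once an IDE is configured with a server like this, the agent can check how a function actually behaves in production right before it rewrites it.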
We felt we were staring at something big. We decided everything else is boring. The real revolution is happening where enterprise companies are trying to adopt agentic code generation - and where newcomers born into agentic AI are starting to grow and experience production pains. That’s the biggest problem AI has in code generation, and the coolest place to innovate.
We decided to go all in on becoming the new layer between production runtime and agentic code generation - giving LLMs the context they need to build production-safe code.
We don't know what the future holds, but we are certain a new stack is emerging and are excited to play a key role in it. LFG!
The End
Some housekeeping stuff:
- Hud was built to be enterprise grade from day 1: we are SOC 2, ISO 27001, and GDPR compliant
- Our sensor is publicly available for Node/JavaScript/TypeScript, in beta for Python, and in the works for C# and Java
- Our sensor runs on AWS, Azure, GCP and on-prem
- Our sensor doesn’t interfere with other observability products - we are deployed alongside Datadog, Dynatrace, Sentry, and others
- Hud doesn’t send your code to our servers
- While Hud does add overhead to the production environment, it adds much less than standard APMs (and we have the benchmarks to show it)