Datadog DASH 2026: What Enterprise IT Teams Need to Know About the AI + Observability Agenda

Datadog DASH 2026 is themed around AI and observability. Here is what enterprise SREs and IT ops managers should expect from Bits AI SRE, LLM Observability, and more.


TLDR: Datadog DASH 2026 (June 9-10, NYC) is the clearest signal yet that Datadog is positioning itself as an agentic observability platform, not just a dashboards-and-metrics vendor. The headliners — Bits AI SRE with deeper reasoning, LLM Observability for monitoring your AI agents, and Workflow Automation for closing the loop — form a cohesive stack that competitors are still assembling piecemeal. If your team already runs Datadog, DASH is where you will see the roadmap that determines your renewal negotiation. If you are evaluating AIOps platforms, this event sets the benchmark everyone else will be measured against.

Why This Matters Now

Your SRE team is now responsible for two things at once: keeping production infrastructure reliable and keeping the AI agents your organization just deployed from silently failing in ways that don’t trigger traditional alerts. The operations teams handling that dual mandate are being asked to do more with the same headcount.

Datadog’s response is to build the AI that watches your AI. DASH 2025 introduced Bits AI SRE as a concept. In the twelve months since, it went GA (December 2025), shipped a major reasoning upgrade (March 2026), and accumulated enough production usage that Datadog is dedicating an entire conference session — “Evals in Production: Lessons From Building Bits AI SRE” — to what they learned building it. That progression from announcement to production learnings in under a year is unusually fast for enterprise tooling, and it signals that Datadog is treating AI-assisted operations as a core platform capability, not a marketing checkbox.

What Datadog Is Shipping: The AI + Observability Stack

Capability | Status | What It Does | Who Cares
Bits AI SRE | GA (Dec 2025), major update Mar 2026 | Autonomous alert investigation, root-cause analysis, triage actions | SRE leads, on-call engineers
LLM Observability | GA | Monitors LLM-powered apps: latency, token usage, hallucination detection, prompt injection scanning | Platform teams deploying AI agents
AI Agents Console | GA | Visualizes agent execution flows, tool usage, retrieval steps | AI/ML engineers in production
LLM Experiments | GA | A/B testing and evaluation for model parameters and prompts | ML ops teams
Workflow Automation | GA | Low-code automation triggered by alerts or investigations | IT ops managers
AI App Builder | GA | Low-code app creation within Datadog | DevOps teams building internal tools
GPU Monitoring | GA | Real-time GPU fleet health, utilization, cost tracking | Infra teams running AI training/inference

Bits AI SRE: The Headline Feature

Bits AI SRE is Datadog’s autonomous on-call agent. It triggers the moment an alert fires, reads the same telemetry your team would, follows your runbooks, and delivers a root-cause hypothesis before a human opens a laptop.

What Works

The March 2026 update doubled investigation speed to approximately 3-4 minutes per incident and expanded the data sources Bits can access. It now pulls from metrics, logs, traces, dashboards, change events, source code, RUM, Database Monitoring, Network Path, and Continuous Profiler. That full-stack access means Bits can trace a user-facing latency spike from a frontend RUM session through backend service dependencies, into a database query, and across network paths — all in a single automated investigation.

The new Agent Trace view is the feature that matters most for enterprise adoption. It shows every step the agent took, which tools it called, what data it queried, and how it formed and eliminated hypotheses. For teams in regulated industries — financial services, healthcare, anyone with SOC2 or change-management requirements — this transparency is the difference between “interesting demo” and “approved for production use.”

Bits also now supports seven direct triage actions, including Slack messages, Teams messages, Datadog Incident Response creation, engineer paging, Case Management entries, and Jira ticket generation. Each action is pre-populated with investigation context, which eliminates the copy-paste handoff that stalls most incident workflows.

Where the Pricing Model Gets Complicated

Bits AI SRE is priced per investigation: $500 for 20 conclusive investigations per month (annual billing), or $600 month-to-month. That works out to $25 per conclusive investigation on annual billing, or $30 month-to-month. Inconclusive investigations are free, but conclusive ones bill regardless of whether you expected them. In environments with frequent alert storms (noisy monitoring configurations, cascading failures, or chatty synthetic checks), costs can escalate quickly with no ceiling unless you negotiate a cap.
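To make the arithmetic concrete, here is a minimal sketch of a monthly cost estimate. The block prices come from the figures above. The billing behavior is an assumption for illustration, not Datadog's published logic: the sketch assumes you pay for at least one 20-investigation block and that anything beyond it bills pro rata. The function name `bits_monthly_cost` is hypothetical.

```python
# Rates quoted in this article. The minimum-one-block and pro-rata overage
# behavior below is an assumption, not Datadog's documented billing logic.
ANNUAL_BLOCK_PRICE = 500.0     # USD per block, annual billing
ON_DEMAND_BLOCK_PRICE = 600.0  # USD per block, month-to-month
BLOCK_SIZE = 20                # conclusive investigations per block


def bits_monthly_cost(conclusive: int, annual: bool = True) -> float:
    """Estimate monthly spend for a number of conclusive investigations."""
    if conclusive < 0:
        raise ValueError("conclusive must be non-negative")
    block_price = ANNUAL_BLOCK_PRICE if annual else ON_DEMAND_BLOCK_PRICE
    per_investigation = block_price / BLOCK_SIZE
    # Assumed: pay for at least one block, then pro rata per investigation.
    return max(conclusive, BLOCK_SIZE) * per_investigation


if __name__ == "__main__":
    for count in (20, 60, 180):
        print(count, bits_monthly_cost(count), bits_monthly_cost(count, annual=False))
```

Under these assumptions, 180 conclusive investigations in a month comes to $4,500 on annual billing. Swap in your own contract terms before relying on the numbers.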

The other limitation is ecosystem lock-in. Bits AI SRE’s strength is its native access to Datadog telemetry across every product. If your observability stack is split — say, Datadog for APM but Grafana for infrastructure — Bits only sees the Datadog side. Its investigation quality degrades proportionally to the telemetry gaps.

Earned insight: In environments where I have seen teams trial Bits AI SRE, the investigation-based pricing model creates an unexpected behavioral change: SRE teams start tuning their alert configurations more aggressively before enabling Bits, because every noisy alert that triggers a conclusive investigation costs $25. The net effect is often positive — cleaner alerts, lower noise — but the initial sticker shock of 200+ conclusive investigations in month one catches most teams off guard.

Bits AI SRE Strengths:

  • Full-stack investigation across 12+ Datadog data sources in 3-4 minutes
  • Agent Trace view provides audit-grade transparency into reasoning steps
  • Seven built-in triage actions eliminate copy-paste handoffs to Slack, Jira, and incident tools
  • Hypothesis tree visualization helps teams validate or challenge AI conclusions

Bits AI SRE Weaknesses:

  • Per-investigation pricing ($25/conclusive) can spike unpredictably during alert storms
  • Requires full Datadog ecosystem investment for optimal results — partial stacks reduce investigation quality
  • No self-hosted or air-gapped deployment option for classified environments
  • Investigation quality depends on existing Datadog instrumentation depth

LLM Observability: Watching Your AI Watch Your Stack

If Bits AI SRE is Datadog’s AI for operations, LLM Observability is Datadog’s operations for AI. As enterprises deploy internal agents, RAG pipelines, and customer-facing AI features, they need a monitoring layer that treats LLM behavior as a first-class operational concern — not just an ML experiment.

Datadog’s LLM Observability tracks inputs and outputs, traces requests through model chains, monitors latency and token usage, and includes built-in evaluations for hallucination detection and prompt injection scanning. The Sensitive Data Scanner prevents data leakage through model interactions.
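For readers new to this category, the sketch below shows the kind of per-call record such a monitoring layer collects: input, output, latency, and token counts. It is a generic, standard-library illustration and not Datadog's SDK. The names `LLMCallRecord` and `LLMCallRecorder` are hypothetical, and whitespace splitting is a crude stand-in for a real tokenizer.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LLMCallRecord:
    prompt: str
    completion: str
    latency_s: float
    prompt_tokens: int
    completion_tokens: int


@dataclass
class LLMCallRecorder:
    """Collects one record per model call (hypothetical helper, not a Datadog API)."""
    records: List[LLMCallRecord] = field(default_factory=list)

    def call(self, model_fn: Callable[[str], str], prompt: str) -> str:
        start = time.perf_counter()
        completion = model_fn(prompt)
        latency = time.perf_counter() - start
        # Whitespace splitting approximates token counts for illustration only.
        self.records.append(LLMCallRecord(
            prompt=prompt,
            completion=completion,
            latency_s=latency,
            prompt_tokens=len(prompt.split()),
            completion_tokens=len(completion.split()),
        ))
        return completion

    def total_tokens(self) -> int:
        return sum(r.prompt_tokens + r.completion_tokens for r in self.records)


if __name__ == "__main__":
    recorder = LLMCallRecorder()
    # A stub stands in for a real model client.
    recorder.call(lambda p: "pong pong", "ping")
    print(recorder.total_tokens(), recorder.records[0].latency_s)
```

A production tool adds trace correlation and evaluations on top of records like these; the point here is only what gets captured per request.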

The AI Agents Console, introduced alongside the LLM tooling, visualizes the execution flow of any AI agent in your stack, showing tool usage, retrieval steps, and decision paths. For enterprise teams deploying Salesforce Agentforce, Microsoft Copilot, or custom agents built on Bedrock or Azure OpenAI, this console provides the production visibility that those platforms’ native monitoring lacks.

Warning: Datadog LLM Observability is strongest as an extension of infrastructure monitoring — correlating LLM behavior with your existing APM, logs, and infrastructure metrics. If you need purpose-built AI evaluation workflows, simulation capabilities, or deep LLM-specific tracing (think Galileo, Braintrust, or Maxim), Datadog may not replace a dedicated evaluation platform. Most enterprise teams will need both: Datadog for production monitoring, a specialized tool for pre-production evaluation.

The Agentic Ops Stack: How the Pieces Connect

The real story at DASH 2026 isn’t any single product — it’s how Datadog is connecting Bits AI SRE, LLM Observability, Workflow Automation, and AI App Builder into an integrated agentic operations stack.

The workflow looks like this: an alert fires. Bits AI SRE investigates autonomously and identifies a root cause. It proposes a remediation action. Workflow Automation executes that action — scaling a deployment, rolling back a config change, restarting a service. The entire chain is logged, auditable, and interruptible by a human at any step.
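The chain described above can be sketched as a small pipeline with an approval gate and an audit log. This is a conceptual model only: it does not call any Datadog API, and every name in it (`Investigation`, `AuditedPipeline`, the callback fields) is hypothetical. The three stages are passed in as plain functions so the human gate is explicit.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Investigation:
    alert: str
    root_cause: Optional[str]        # None means the investigation was inconclusive
    proposed_action: Optional[str]


@dataclass
class AuditedPipeline:
    investigate: Callable[[str], Investigation]
    approve: Callable[[Investigation], bool]   # the human interruption point
    execute: Callable[[str], None]
    audit_log: List[str] = field(default_factory=list)

    def handle(self, alert: str) -> bool:
        """Return True only if a remediation was actually executed."""
        self.audit_log.append(f"alert fired: {alert}")
        result = self.investigate(alert)
        self.audit_log.append(f"root cause: {result.root_cause}")
        if result.root_cause is None or result.proposed_action is None:
            self.audit_log.append("inconclusive: handed to on-call")
            return False
        self.audit_log.append(f"proposed: {result.proposed_action}")
        if not self.approve(result):
            self.audit_log.append("rejected by human reviewer")
            return False
        self.execute(result.proposed_action)
        self.audit_log.append(f"executed: {result.proposed_action}")
        return True


if __name__ == "__main__":
    executed: List[str] = []
    pipeline = AuditedPipeline(
        investigate=lambda a: Investigation(a, "bad config push", "rollback config"),
        approve=lambda inv: True,
        execute=executed.append,
    )
    pipeline.handle("checkout latency spike")
    print("\n".join(pipeline.audit_log))
```

Modeling approval as a required callback means nothing executes by default, which is the property compliance reviewers tend to ask about first.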

Meanwhile, LLM Observability monitors the AI agents your team deploys (both Datadog’s and third-party), GPU Monitoring tracks the infrastructure running your models, and LLM Experiments lets you test changes before they hit production.

Dynatrace and New Relic are building similar capabilities, but neither has shipped an equivalent to Bits AI SRE’s autonomous investigation with full-stack data access, and neither has an integrated LLM monitoring suite at the same depth. That’s the competitive gap Datadog is betting won’t close before enterprises lock in their observability contracts — and it’s why AI isn’t just a feature in Datadog’s 2026 positioning, it’s the operating model.

Tip: If your team is evaluating whether to attend DASH in person versus streaming, the hands-on workshops are the differentiator. Datadog is offering 20+ workshops across AI observability and security use cases, and the “Evals in Production” session on Bits AI SRE will likely contain implementation details that don’t make it into the blog post. For teams actively deploying or trialing Bits, the in-person workshop access alone justifies the trip.

Pricing Reality

Datadog’s modular pricing means there’s no single “AI observability” price tag. Here’s what the AI-relevant components cost as of May 2026:

Component | Pricing Model | Annual Rate | On-Demand Rate
Bits AI SRE | Per 20 conclusive investigations/mo | $500/mo | $600/mo
Infrastructure Monitoring (Enterprise) | Per host/mo | $23/host | $27/host
APM Enterprise | Per host/mo | $40/host (paired w/ Infra) or $47 standalone | $47/host standalone
Log Management | Per GB indexed + per million events | $0.10/GB + $1.70/M events | Higher
LLM Observability | Token-based + spans | Varies by volume | Varies
GPU Monitoring | Per host/mo | Contact sales | Contact sales

Pricing verified from datadoghq.com/pricing on 2026-05-21.

The real cost driver for enterprise teams isn’t any single line item — it’s the compounding effect of running multiple Datadog products on the same hosts. A typical enterprise running Infrastructure Monitoring (Enterprise) + APM Enterprise + Log Management + Bits AI SRE can expect $80-120 per host per month before log volume and investigation overages. Industry data suggests the average enterprise Datadog spend is approximately $700K annually, though this varies dramatically by host count and data volume.
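A rough way to see the compounding effect is to add up the list prices from the table for your own host count and log volume. The sketch below uses only the annual rates quoted above for Infrastructure Monitoring (Enterprise), APM Enterprise (paired), and Log Management. It ignores discounts, other products, and Bits AI SRE, and the function name and example inputs are hypothetical.

```python
# Annual-billing list prices quoted in this article (USD).
INFRA_ENTERPRISE_PER_HOST = 23.00  # per host per month
APM_ENTERPRISE_PER_HOST = 40.00    # per host per month, when paired with Infra
LOG_PER_GB = 0.10                  # per GB
LOG_PER_MILLION_EVENTS = 1.70      # per million events


def monthly_platform_cost(hosts: int, log_gb: float, log_events_millions: float) -> float:
    """Sum per-host and log list prices; excludes discounts and other products."""
    host_cost = hosts * (INFRA_ENTERPRISE_PER_HOST + APM_ENTERPRISE_PER_HOST)
    log_cost = log_gb * LOG_PER_GB + log_events_millions * LOG_PER_MILLION_EVENTS
    return round(host_cost + log_cost, 2)


if __name__ == "__main__":
    # Illustrative inputs only: 500 hosts, 10,000 GB of logs, 2,000M events.
    total = monthly_platform_cost(hosts=500, log_gb=10_000, log_events_millions=2_000)
    print(f"monthly: ${total:,.2f}  per host: ${total / 500:,.2f}")
```

The two per-host line items alone come to $63 per host per month; log volume and any additional products push the blended per-host figure up from there.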

Earned insight: The $25-per-conclusive-investigation pricing for Bits AI SRE sounds manageable until you calculate what “conclusive” means at scale. In one enterprise environment running 400+ monitors, enabling Bits on all critical alerts generated 180 conclusive investigations in the first month — $4,500 on top of existing spend. The fix was scoping Bits to only P1/P2 alerts and tuning noisy monitors first. Budget $2,000-5,000/month for Bits in a mid-to-large Datadog deployment and negotiate a cap during contract renewal.

Who Should Care About DASH 2026

Good fit:

  • SRE and IT ops teams already running Datadog who want to understand the Bits AI SRE roadmap before their next contract renewal
  • Platform engineering teams deploying LLM-powered agents internally and needing production monitoring
  • AIOps evaluators comparing Datadog’s agentic stack against Dynatrace Davis AI and New Relic AI
  • Enterprise architects building an AI observability strategy who need to see all the pieces together

Not a good fit:

  • Teams running Prometheus/Grafana stacks with no Datadog footprint — DASH content assumes Datadog ecosystem familiarity
  • Organizations in early AI experimentation that don’t yet have LLM workloads in production
  • Budget-constrained teams — Datadog’s pricing model rewards full-platform commitment, and the AI features add $3,000-10,000+ per month on top of existing monitoring spend

What to Watch for at DASH

Based on Datadog’s product trajectory and the conference session list, expect announcements in these areas:

  1. Expanded Bits AI SRE capabilities — Likely deeper integration with Workflow Automation for autonomous remediation, possibly multi-cloud investigation support
  2. AI governance and security tooling — The press release specifically calls out “governance and security controls for AI systems,” suggesting new features for enterprise compliance teams
  3. Pricing or packaging changes — Datadog has been signaling a consumption-based pricing evolution; DASH is the natural venue to announce bundled AI packages
  4. Customer case studies at scale — Sessions from financial services, retail, and cybersecurity companies will reveal real-world Bits AI SRE adoption patterns

The conference streams for free, but the 20+ hands-on workshops require in-person attendance. Regular-price tickets are available now (early bird pricing ended April 30).

Bottom Line

DASH 2026 is the moment Datadog formally consolidates its position as an agentic observability platform — one where AI isn’t just a feature but the operating model. Bits AI SRE’s progression from announcement to GA to production-hardened in twelve months is the fastest execution in the AIOps space right now. LLM Observability and the AI Agents Console fill a gap that every enterprise deploying AI agents will need covered within the next 12 months.

The cost is real. Pull your last 90 days of Datadog alert volume, identify your P1/P2 alert count, and calculate what Bits AI SRE would cost scoped only to those alerts. If that number fits your budget, bring it to your next contract renewal — Datadog’s sales team will negotiate if you arrive with a specific scope and a cap number already in hand.
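That exercise can be sketched in a few lines once you have exported the priority of each alert from the last 90 days. The sketch assumes the $25 annual-billing rate quoted earlier and that each scoped alert triggers one investigation. The `conclusive_rate` parameter and the function name are hypothetical, and how you export alert history depends on your own setup.

```python
from collections import Counter
from typing import FrozenSet, Iterable

PRICE_PER_CONCLUSIVE = 25.0  # USD, annual-billing rate quoted in this article


def estimate_scoped_monthly_cost(
    alert_priorities: Iterable[str],
    window_days: int = 90,
    scoped: FrozenSet[str] = frozenset({"P1", "P2"}),
    conclusive_rate: float = 1.0,
) -> float:
    """Estimate monthly Bits spend if only alerts in `scoped` trigger investigations."""
    counts = Counter(alert_priorities)
    scoped_alerts = sum(counts[p] for p in scoped)
    # Normalize the observation window to a 30-day month.
    monthly_alerts = scoped_alerts * 30 / window_days
    # conclusive_rate = 1.0 is the worst case: every investigation bills.
    return round(monthly_alerts * conclusive_rate * PRICE_PER_CONCLUSIVE, 2)


if __name__ == "__main__":
    # Illustrative history: 30 P1, 150 P2, and 900 P3 alerts over 90 days.
    history = ["P1"] * 30 + ["P2"] * 150 + ["P3"] * 900
    print(estimate_scoped_monthly_cost(history))
    print(estimate_scoped_monthly_cost(history, conclusive_rate=0.5))
```

With the illustrative history above, the worst case is 60 scoped alerts per month, or $1,500. Running it with a lower `conclusive_rate` gives you the range to bring to the negotiation.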

For SRE leads and IT ops managers: watch the keynote stream on June 9 at minimum. For teams in active Datadog contract negotiations: attend in person and use the workshop sessions to pressure-test whether the AI features justify the price increase your rep is about to propose.

This is a pre-event preview based on publicly available information as of May 2026. StackScout will publish a post-DASH update with actual announcements by June 12.

Does Bits AI SRE replace human on-call engineers?

No. Bits AI SRE is designed to accelerate investigation, not eliminate humans. It autonomously investigates alerts and proposes root causes, but triage actions — paging engineers, creating incidents, sending Slack messages — require human approval through the chatbot interface. Think of it as an AI teammate that completes the first 80% of investigation before your on-call engineer even opens a laptop. Teams typically see 40-60% MTTR reduction, but still need humans for remediation decisions and edge cases that Bits can’t resolve with available telemetry.

How much does Bits AI SRE cost per month?

Bits AI SRE is priced at $500 per 20 conclusive investigations per month with annual billing ($600 month-to-month). That is $25 per conclusive investigation on annual billing, or $30 month-to-month. Inconclusive investigations, where Bits can’t determine a root cause, are free. In practice, mid-size Datadog deployments (200-500 monitors) should budget $2,000-5,000 per month for Bits, depending on alert volume and noise levels. Negotiate an investigation cap during contract renewal to avoid surprise overages during alert storms.

Can Datadog monitor LLM applications from OpenAI, Bedrock, and Azure?

Yes. Datadog LLM Observability supports multiple LLM providers including OpenAI, Amazon Bedrock, and Azure OpenAI. It traces requests through model chains, monitors latency and token usage, and includes built-in evaluations for hallucination detection and prompt injection. The AI Agents Console visualizes execution flows for any AI agent in your stack, regardless of the underlying model provider. However, the deepest integrations are with Datadog’s own APM and infrastructure data — third-party LLM platforms get monitoring, but not the full-stack correlation that native Datadog services enjoy.

Is DASH 2026 worth attending in person or should I just stream it?

The keynote and main sessions will be streamable, but the 20+ hands-on workshops require in-person attendance. If your team is actively evaluating or deploying Bits AI SRE, the “Evals in Production” workshop and product deep dives contain implementation details that rarely make it into blog recaps. Regular-price tickets are available now (early bird ended April 30). For passive interest, the stream is sufficient. For teams in active Datadog contract negotiations, attending in person gives you direct access to product managers and customer success for deal discussions.

How does Datadog’s AI compare to Dynatrace Davis AI and New Relic AI?

Datadog’s Bits AI SRE is the most autonomous of the three: it investigates alerts without human prompting and proposes root causes across 12+ data sources. Dynatrace Davis AI is strong on automatic root-cause analysis within the Dynatrace ecosystem, but it lacks the chatbot triage actions and is not priced per investigation. New Relic AI focuses on natural-language querying (NRQL AI) but doesn’t offer an equivalent autonomous investigation agent. For a detailed feature-by-feature breakdown, see our full Datadog AI vs Dynatrace AI vs New Relic AI comparison.

What is Datadog LLM Observability and do I need it?

Datadog LLM Observability is a monitoring layer for applications powered by large language models. It tracks model inputs/outputs, latency, token costs, and includes evaluations for hallucinations and prompt injection attacks. You need it if your team is running LLM-powered features in production — chatbots, AI agents, RAG pipelines, or any application making API calls to models. If your AI usage is still in experimentation or development, a dedicated evaluation tool (Galileo, Braintrust) may be more appropriate until you reach production deployment.


Marcus Webb — Principal Site Reliability Engineer

Marcus brings 22 years of infrastructure and observability experience, having built SRE practices from the ground up at organizations ranging from 500 to 50,000 employees. He has run head-to-head evaluations of Datadog, Dynatrace, and New Relic in production environments, designed AIOps-driven incident response workflows, and led platform migrations that most vendors say are impossible. His reviews focus on what breaks at 2 AM, not what looks good in a demo.
