Roadmap: Agent-Native Debugging
Django Orbit should evolve from a Django debugging dashboard into an agent-native observability layer: a local, privacy-conscious system that gives humans and AI agents enough structured context to move from a ticket or error to a root-cause hypothesis and a proposed fix.
Direction
The future debugging workflow is likely to be hybrid:
- humans use the dashboard for inspection, trust-building and quick triage;
- agents use MCP tools, incident bundles and structured traces for investigation;
- coding agents consume the same context to propose patches, tests and pull requests;
- OpenTelemetry compatibility keeps Orbit interoperable with the wider observability ecosystem.
Orbit's differentiator should be Django-native context: request lifecycle, ORM behavior, middleware, templates, cache, storage, mail, auth/gates, jobs, signals and settings-aware safety boundaries.
Track A: High-Level MCP Tools
Current MCP tools expose useful raw slices. The next generation should expose investigation primitives that answer developer and agent questions directly.
Request and Endpoint Investigation
| Tool | Purpose |
|---|---|
| `investigate_request(family_hash)` | Return a complete diagnosis for one request: request summary, child events, slow spans, duplicate queries, exceptions, logs and likely causes. |
| `investigate_endpoint(path, method=None, hours=24)` | Summarize health for an endpoint across recent traffic: latency, error rate, slow queries, top exception groups and regressions. |
| `compare_endpoint_windows(path, baseline_hours=24, current_hours=1)` | Compare current endpoint behavior against a prior window to detect regressions. |
| `find_slowest_endpoints(hours=24, limit=10)` | Rank endpoints by p95 or average duration with query and exception context. |
| `find_erroring_endpoints(hours=24, limit=10)` | Rank endpoints by error rate and show representative requests. |
| `summarize_request_family(family_hash)` | Compact, agent-friendly summary of one family without full payload noise. |
| `get_request_timeline(family_hash)` | Return ordered spans/events suitable for RCA and timeline rendering. |
| `find_related_requests(family_hash, limit=10)` | Find similar requests by path, exception fingerprint, tags or duplicate-query signature. |
| `explain_status_code_spike(status_code, hours=24)` | Identify paths and fingerprints driving a spike in 4xx/5xx responses. |
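
As an illustration of the ranking behind `find_slowest_endpoints`, here is a minimal sketch that groups request durations by path and sorts by p95. The `(path, duration_ms)` sample shape, the nearest-rank percentile and the returned row keys are assumptions for this example, not Orbit's actual storage or return format.

```python
import math
from collections import defaultdict


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]


def find_slowest_endpoints(samples, limit=10):
    """Rank paths by p95 duration, slowest first.

    `samples` is a hypothetical iterable of (path, duration_ms) pairs.
    """
    by_path = defaultdict(list)
    for path, duration_ms in samples:
        by_path[path].append(duration_ms)
    ranked = [
        {"path": path, "p95_ms": p95(durations), "requests": len(durations)}
        for path, durations in by_path.items()
    ]
    ranked.sort(key=lambda row: row["p95_ms"], reverse=True)
    return ranked[:limit]
```

Reporting the request count next to the p95 lets a reader discount endpoints whose percentile rests on only a handful of samples.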
Exception and Error Intelligence
| Tool | Purpose |
|---|---|
| `investigate_exception_group(fingerprint)` | Return representative traceback, frequency, affected paths, first/last seen and likely owner surface. |
| `summarize_exception_groups(hours=24)` | Group exceptions by fingerprint and rank by recency, frequency and blast radius. |
| `find_new_exception_groups(hours=24, baseline_hours=168)` | Detect exceptions not seen in the baseline period. |
| `find_regressed_exception_groups(hours=24, baseline_hours=168)` | Detect exception groups whose frequency increased materially. |
| `get_exception_repro_context(entry_id)` | Return request method/path, payload shape, user/auth hints, related logs and DB/cache events needed to reproduce. |
| `trace_exception_to_request(entry_id)` | Walk from exception to parent request and adjacent logs/queries/jobs. |
| `classify_exception(entry_id)` | Classify likely category: validation, auth, database, migration, external service, timeout, template, settings, code bug. |
| `suggest_exception_fix(entry_id)` | Produce a bounded hypothesis and candidate code areas, not an automatic patch. |
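
One way to read `find_new_exception_groups` is as a set difference between two time windows. The sketch below assumes a fingerprint built from the exception type plus `(module, function)` frames, assumes the baseline is the period immediately before the current window, and takes events as hypothetical `(fingerprint, seen_at)` pairs; none of these details are confirmed by the roadmap.

```python
import hashlib
from datetime import datetime, timedelta


def fingerprint(exc_type, frames):
    """Stable fingerprint from exception type and (module, function) frames.

    Line numbers are left out so small edits do not split a group.
    """
    raw = exc_type + "|" + "|".join(f"{mod}:{func}" for mod, func in frames)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


def find_new_exception_groups(events, now: datetime, hours=24, baseline_hours=168):
    """Return fingerprints seen in the last `hours` but not in the
    `baseline_hours` before that window (assumed interpretation)."""
    current_start = now - timedelta(hours=hours)
    baseline_start = current_start - timedelta(hours=baseline_hours)
    current = {fp for fp, seen in events if seen >= current_start}
    baseline = {
        fp for fp, seen in events if baseline_start <= seen < current_start
    }
    return sorted(current - baseline)
```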
Database and ORM Analysis
| Tool | Purpose |
|---|---|
| `investigate_slow_query(entry_id)` | Explain one slow query with request context, duplicates, stack caller and optional EXPLAIN summary. |
| `find_n_plus_one_candidates(hours=24)` | Rank endpoints/requests with duplicate-query evidence. |
| `explain_n_plus_one(family_hash)` | Identify repeated SQL shapes, likely model relation and suggested `select_related` / `prefetch_related`. |
| `find_duplicate_query_signatures(hours=24)` | Group repeated SQL fingerprints globally. |
| `find_query_regressions(hours=24, baseline_hours=168)` | Detect query count or duration increases by endpoint. |
| `find_missing_indexes_candidates(hours=24)` | Use slow query patterns and EXPLAIN output to identify likely missing indexes. |
| `summarize_db_load(hours=24)` | Return query count, slow count, duplicate count, top tables if inferable and worst endpoints. |
| `get_query_callsite(entry_id)` | Return captured Python stack/caller context for the query. |
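
Duplicate-query evidence can be approximated by reducing each SQL statement to a shape and counting repeats within one request. The following is a rough sketch under that assumption; `sql_signature`, `n_plus_one_candidates` and the threshold are hypothetical names and values, and the regex-based normalization is a simplification rather than a full SQL parser.

```python
import re
from collections import Counter

# Single-quoted string literals (with '' escapes) and standalone integers.
_LITERALS = re.compile(r"'(?:[^']|'')*'|\b\d+\b")
_WHITESPACE = re.compile(r"\s+")


def sql_signature(sql):
    """Replace literals with '?' and collapse whitespace so queries that
    differ only in parameter values share one shape."""
    return _WHITESPACE.sub(" ", _LITERALS.sub("?", sql)).strip().lower()


def n_plus_one_candidates(queries, threshold=3):
    """Return (signature, count) pairs repeated at least `threshold` times
    within one request's list of SQL strings, most frequent first."""
    counts = Counter(sql_signature(q) for q in queries)
    return [(sig, n) for sig, n in counts.most_common() if n >= threshold]
```

A repeated shape that filters on a foreign key column is the usual hint that `select_related` or `prefetch_related` would collapse the loop into one or two queries.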
Logs, Cache, Storage, Mail and Jobs
| Tool | Purpose |
|---|---|
| `investigate_log(entry_id)` | Tie a warning/error log to request/job context and nearby events. |
| `find_warning_clusters(hours=24)` | Group warnings by logger/message shape and affected paths. |
| `analyze_cache_efficiency(hours=24)` | Cache hit/miss ratios by operation/key prefix if safe. |
| `find_cache_miss_spikes(hours=24)` | Identify endpoints or code paths with elevated misses. |
| `investigate_job_failure(entry_id)` | Summarize failed job, exception/log context and related DB/cache/external calls. |
| `summarize_jobs(hours=24)` | Job success/failure counts, slow jobs and top failure fingerprints. |
| `investigate_storage_error(entry_id)` | Explain storage failures with backend, operation and safe path metadata. |
| `investigate_mail_failure(entry_id)` | Summarize mail send failures, backend and safe recipient metadata. |
| `find_external_service_failures(hours=24)` | Group HTTP client failures/timeouts by host and endpoint. |
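
To make the "by key prefix if safe" idea concrete, the sketch below aggregates hits and misses per key prefix and never returns full cache keys. The `(key, hit)` event shape and the `:` separator are assumptions for illustration.

```python
from collections import defaultdict


def cache_efficiency(events, separator=":"):
    """Aggregate hit/miss counts per key prefix.

    `events` is a hypothetical iterable of (key, hit) pairs where `hit`
    is a bool. Only the prefix before the first separator is reported,
    so identifiers embedded in keys are not exposed.
    """
    stats = defaultdict(lambda: {"hits": 0, "misses": 0})
    for key, hit in events:
        prefix = key.split(separator, 1)[0]
        stats[prefix]["hits" if hit else "misses"] += 1
    return {
        prefix: {**c, "hit_ratio": c["hits"] / (c["hits"] + c["misses"])}
        for prefix, c in stats.items()
    }
```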
Security and Privacy-Aware Debugging
| Tool | Purpose |
|---|---|
| `find_authz_denials(hours=24)` | Summarize gate/permission denials by permission, user class and endpoint. |
| `investigate_permission_denial(entry_id)` | Tie a denial to request context and related logs. |
| `audit_mcp_exposure()` | Report MCP config, masking config and which tools can expose what classes of data. |
| `preview_masked_entry(entry_id)` | Show exactly what an agent would receive after masking. |
| `find_sensitive_payload_risks(limit=20)` | Identify entries whose keys look sensitive and confirm masking behavior. |
| `list_agent_safe_fields(entry_type)` | Return the allowlisted fields exported to MCP/incident bundles. |
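
A masking preview and a sensitive-key scan can share one notion of "looks sensitive". The sketch below assumes substring matching on key names and a fixed mask string; `mask_payload`, `sensitive_key_paths` and the hint list are hypothetical and do not describe Orbit's existing masking code.

```python
SENSITIVE_HINTS = ("password", "token", "secret")  # assumed hint list
MASK = "***"


def _looks_sensitive(key, hints):
    return any(hint in str(key).lower() for hint in hints)


def mask_payload(value, hints=SENSITIVE_HINTS):
    """Return a copy of a nested dict/list with sensitive values masked."""
    if isinstance(value, dict):
        return {
            key: MASK if _looks_sensitive(key, hints) else mask_payload(item, hints)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [mask_payload(item, hints) for item in value]
    return value


def sensitive_key_paths(value, hints=SENSITIVE_HINTS, prefix=""):
    """List dotted paths of keys that look sensitive."""
    paths = []
    if isinstance(value, dict):
        for key, item in value.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            if _looks_sensitive(key, hints):
                paths.append(path)
            else:
                paths.extend(sensitive_key_paths(item, hints, path))
    elif isinstance(value, list):
        for index, item in enumerate(value):
            paths.extend(sensitive_key_paths(item, hints, f"{prefix}[{index}]"))
    return paths
```

Running both functions on the same entry shows which keys were flagged and what an agent would actually see after masking.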
Ticket-to-Diagnosis Tools
| Tool | Purpose |
|---|---|
| `investigate_ticket(text, hours=72)` | Parse a ticket/error report and search Orbit for matching paths, messages, fingerprints and tags. |
| `match_error_text(text, hours=72)` | Match pasted stack traces or user reports to exception groups and logs. |
| `build_debug_brief(query, hours=72)` | Generate a concise brief for a human or coding agent from a natural-language problem. |
| `find_recent_changes_context(path=None)` | Return Orbit-side evidence useful to compare against recent code changes, without reading git. |
| `propose_reproduction_steps(entry_id)` | Convert request/error context into likely repro steps and test targets. |
| `propose_test_plan(entry_id)` | Suggest unit/integration/E2E tests that would cover the failure. |
| `propose_fix_hypotheses(entry_id)` | Produce ranked hypotheses with confidence, evidence and files/surfaces likely involved. |
| `create_agent_handoff_bundle(entry_id_or_query)` | Produce a compact JSON/Markdown bundle for Codex/Cursor/Claude. |
Daily Developer Workflows
| Tool | Purpose |
|---|---|
| `daily_health_brief(hours=24)` | Morning digest: error groups, slow endpoints, N+1s, job failures, new warnings. |
| `what_changed_in_orbit(hours=24)` | Human-friendly summary of notable runtime behavior changes. |
| `triage_top_issues(hours=24, limit=10)` | Rank issues by severity, recency, frequency and blast radius. |
| `find_flaky_failures(days=7)` | Detect intermittent exception/job/log patterns. |
| `list_open_debug_threads(hours=24)` | Return unresolved-looking issue clusters with latest evidence. |
| `suggest_next_debug_action(issue_id_or_entry_id)` | Recommend the next investigation step based on missing evidence. |
| `generate_pr_context(entry_id_or_group)` | Create PR-ready context: problem, evidence, suspected cause, test plan. |
| `generate_release_risk_brief(hours=24)` | Before release: current errors, slow paths, new exception groups and safety warnings. |
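
Ranking by severity, recency, frequency and blast radius needs some way to combine the four signals. The scoring formula below is purely an assumption to show the shape of such a ranking: the severity weights, the six-hour half-life and the issue dict keys are all hypothetical.

```python
import math

SEVERITY_WEIGHT = {"warning": 1.0, "error": 2.0, "critical": 3.0}  # assumed


def triage_score(issue, half_life_hours=6.0):
    """Combine the four signals into one number (higher is more urgent).

    Recency decays exponentially; frequency and blast radius are
    log-damped so one noisy issue does not drown out everything else.
    """
    recency = 0.5 ** (issue["hours_since_last_seen"] / half_life_hours)
    frequency = math.log1p(issue["count"])
    blast_radius = math.log1p(issue["affected_paths"])
    return SEVERITY_WEIGHT[issue["severity"]] * recency * (frequency + blast_radius)


def triage_top_issues(issues, limit=10):
    """Return the `limit` highest-scoring issues, most urgent first."""
    return sorted(issues, key=triage_score, reverse=True)[:limit]
```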
Track B: Incident Bundles
Incident Bundles are portable evidence packages generated from a request, exception group, endpoint, ticket text or natural-language query. They should be small enough for agents, structured enough for automation and readable enough for humans.
Bundle Sources
- `family_hash` for a single request lifecycle;
- exception `fingerprint` for grouped errors;
- endpoint path/method for route-level behavior;
- raw ticket text or pasted traceback;
- tag, query or time window.
Bundle Contents
Each bundle should include:
- metadata: bundle id, generated at, Orbit version, time window, source type;
- primary evidence: request summary, exception summary, endpoint stats or matched ticket terms;
- timeline: ordered events with relative offsets;
- query analysis: slow queries, duplicate signatures, callsites and EXPLAIN summaries when available;
- logs: warning/error logs near the event, grouped by logger/message shape;
- related systems: cache, jobs, mail, storage, Redis, HTTP client, gates;
- safety report: masking status, omitted fields, payload truncation and MCP exposure policy;
- hypotheses: ranked causes with supporting evidence and confidence;
- suggested next actions: reproduce, inspect file/surface, add test, check config, add index, etc.;
- agent handoff: compact JSON plus Markdown brief.
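The contents listed above could be modelled as a small, JSON-serializable structure. This is a sketch only: the class names, field names and the subset of sections shown are assumptions, not a committed bundle schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Hypothesis:
    cause: str
    confidence: float  # 0.0 to 1.0, assumed scale
    evidence: list = field(default_factory=list)


@dataclass
class IncidentBundle:
    # Metadata
    bundle_id: str
    source_type: str
    generated_at: str
    orbit_version: str
    window_hours: int
    # Evidence and analysis sections (a subset of the list above)
    primary_evidence: dict = field(default_factory=dict)
    timeline: list = field(default_factory=list)
    safety_report: dict = field(default_factory=dict)
    hypotheses: list = field(default_factory=list)
    next_actions: list = field(default_factory=list)

    def to_json(self):
        """Serialize for agents; sorted keys keep output deterministic."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

Because `asdict` recurses into nested dataclasses, `Hypothesis` items inside `hypotheses` serialize without extra code.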
Bundle Formats
- `incident_bundle.json` for agents and automation;
- `incident_bundle.md` for tickets, PRs and humans;
- optional zipped export for sharing with maintainers;
- future: OTLP trace export for OpenTelemetry-compatible backends.
MCP Tools for Bundles
- `create_incident_bundle(source_type, source_id_or_text, hours=72)`
- `get_incident_bundle(bundle_id)`
- `list_recent_incident_bundles(limit=20)`
- `export_incident_bundle(bundle_id, format="json|markdown")`
- `redact_incident_bundle(bundle_id, policy="agent_safe")`
Storage Model
Start without a new table: generate bundles on demand from OrbitEntry and return them directly. Add persisted bundles later only if users need sharing, comments or issue tracking.
Track C: OpenTelemetry Bridge
Add OpenTelemetry as an interoperability lane, not as a replacement for Orbit's Django-native storage.
Possible milestones:
- `orbit.exporters.opentelemetry` module that converts `OrbitEntry` objects into OTEL-like spans/events.
- Config keys: `OTEL_EXPORT_ENABLED`, `OTEL_EXPORT_ENDPOINT`, `OTEL_SERVICE_NAME`, `OTEL_HEADERS`, `OTEL_SAMPLE_RATE`.
- Map `family_hash` to `trace_id` and child entries to spans/events.
- Map SQL/cache/http_client/job/mail/storage events to semantic attributes where stable.
- Add GenAI/LLM attributes when the AI watcher ships.
- Provide `python manage.py orbit_export_otel --since ...` for batch export.
- Optional live export hook with strict fail-silent behavior.
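For the `family_hash` to `trace_id` mapping, OpenTelemetry trace IDs are 16 bytes and span IDs are 8 bytes, usually written as 32 and 16 lowercase hex characters. A minimal sketch of a deterministic mapping follows, assuming `family_hash` is a string and that hashing it is acceptable; the function names are hypothetical.

```python
import hashlib


def trace_id_from_family_hash(family_hash: str) -> str:
    """Derive a 128-bit trace ID (32 hex chars) from a family hash.

    Hashing keeps the mapping deterministic, so re-exporting the same
    request family always yields the same trace.
    """
    digest = hashlib.sha256(family_hash.encode("utf-8")).digest()
    return digest[:16].hex()


def span_id_from_entry(family_hash: str, entry_id) -> str:
    """Derive a 64-bit span ID (16 hex chars) for one child entry."""
    digest = hashlib.sha256(f"{family_hash}:{entry_id}".encode("utf-8")).digest()
    return digest[:8].hex()
```

All-zero IDs are invalid in OpenTelemetry; a truncated SHA-256 digest producing one is not a practical concern, but an exporter could still guard against it.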
Track D: AI and LLM Watcher
Capture AI application behavior for Django apps that call LLM providers or agent frameworks.
Initial surfaces:
- OpenAI Python SDK;
- Anthropic Python SDK;
- LangChain / LangGraph callbacks;
- LiteLLM if present;
- HTTP fallback for known provider hosts.
Events should capture only safe metadata by default:
- provider, model, operation, status;
- latency, retries, timeout/error type;
- token counts and estimated cost when available;
- tool call names, not raw arguments by default;
- prompt/completion capture disabled by default and guarded by explicit config;
- request family correlation so LLM calls appear inside the Django request timeline.
Possible config:
```python
ORBIT_CONFIG = {
    "RECORD_AI": True,
    "AI_CAPTURE_PROMPTS": False,
    "AI_CAPTURE_COMPLETIONS": False,
    "AI_CAPTURE_TOOL_ARGS": False,
    "AI_MASK_KEYS": ["password", "token", "secret"],
}
```
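To show how those flags might gate what is recorded, here is a hedged sketch of building an event from one LLM call. The `call` dict shape and `build_ai_event` are hypothetical; only the three `AI_CAPTURE_*` keys come from the config above.

```python
CAPTURE_DEFAULTS = {
    "AI_CAPTURE_PROMPTS": False,
    "AI_CAPTURE_COMPLETIONS": False,
    "AI_CAPTURE_TOOL_ARGS": False,
}


def build_ai_event(call, config=None):
    """Build a safe-by-default event from a hypothetical `call` dict.

    Prompts, completions and tool arguments are included only when the
    matching flag is explicitly enabled.
    """
    cfg = {**CAPTURE_DEFAULTS, **(config or {})}
    tool_calls = []
    for tool in call.get("tool_calls", []):
        item = {"name": tool["name"]}
        if cfg["AI_CAPTURE_TOOL_ARGS"]:
            item["arguments"] = tool.get("arguments")
        tool_calls.append(item)
    event = {
        "provider": call["provider"],
        "model": call["model"],
        "status": call["status"],
        "latency_ms": call["latency_ms"],
        "tool_calls": tool_calls,
    }
    if cfg["AI_CAPTURE_PROMPTS"]:
        event["prompt"] = call.get("prompt")
    if cfg["AI_CAPTURE_COMPLETIONS"]:
        event["completion"] = call.get("completion")
    return event
```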
Track E: Agent Safety Layer
Before Orbit becomes deeply agentic, it needs explicit data boundaries.
Milestones:
- Central `agent_safe_serialize_entry(entry)` used by MCP and bundles.
- Default field allowlists by entry type.
- Payload size caps and deterministic truncation metadata.
- MCP-level config: `MCP_ENABLED`, `MCP_ALLOWED_TOOLS`, `MCP_DENIED_TOOLS`, `MCP_MAX_LIMIT`, `MCP_INCLUDE_PAYLOADS`.
- `audit_mcp_exposure()` tool.
- Masking preview in dashboard and MCP.
- Optional per-environment profiles: `local`, `staging`, `production`.
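The first three milestones can be pictured together: an allowlist per entry type, a size cap, and metadata recording what was cut. The sketch below treats an entry as a plain dict with `type` and `payload` keys, which is an assumption; the allowlists and the 200-character cap are illustrative values only.

```python
ALLOWLISTS = {  # hypothetical per-type allowlists
    "request": ("method", "path", "status_code", "duration_ms"),
    "query": ("sql", "duration_ms", "is_slow"),
}
MAX_STRING_LENGTH = 200  # assumed cap


def agent_safe_serialize_entry(entry):
    """Export only allowlisted fields, truncating long strings.

    Unknown entry types export no payload fields (deny by default).
    Truncation is a plain prefix cut, so the same input always yields
    the same output, and the original length is reported.
    """
    allowed = ALLOWLISTS.get(entry["type"], ())
    payload = entry.get("payload", {})
    fields, truncated = {}, {}
    for name in allowed:
        if name not in payload:
            continue
        value = payload[name]
        if isinstance(value, str) and len(value) > MAX_STRING_LENGTH:
            truncated[name] = {"original_length": len(value)}
            value = value[:MAX_STRING_LENGTH]
        fields[name] = value
    return {
        "type": entry["type"],
        "fields": fields,
        "omitted_fields": sorted(set(payload) - set(allowed)),
        "truncated_fields": truncated,
    }
```

Routing every MCP tool and bundle through one function like this keeps the data boundary in a single place that `audit_mcp_exposure()` can describe.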
Track F: From Error to Fix Hypothesis
Orbit should not directly edit code. It should produce evidence-rich handoffs that coding agents can use.
Workflow:
- Developer pastes a ticket, traceback or endpoint complaint.
- Orbit matches it to entries, exception groups, endpoints or related logs.
- Orbit creates an Incident Bundle.
- Orbit ranks hypotheses and suggests repro/test plan.
- A coding agent consumes the bundle, inspects the repo, writes tests, proposes a fix and references the evidence.
- Orbit can later verify whether the runtime symptom disappeared.
MCP sequence example:
```text
investigate_ticket("Users get 500 on checkout")
create_incident_bundle("ticket", "Users get 500 on checkout")
propose_fix_hypotheses(bundle_id)
propose_test_plan(bundle_id)
generate_pr_context(bundle_id)
```
Near-Term Priority
The v0.11.0 release ships endpoint investigation, daily health briefs and release risk briefs from the roadmap. The next pragmatic PR sequence:
- Add `agent_safe_serialize_entry()` and make existing MCP tools use it.
- Add `investigate_request(family_hash)`.
- Add `investigate_exception_group(fingerprint)`.
- Add `create_incident_bundle(...)` as on-demand JSON.
- Add `build_debug_brief(query, hours=72)`.
- Add OpenTelemetry export design doc and config stub.
- Add AI watcher behind conservative defaults.