Reliai Demo
Explore a realistic AI reliability workflow in under two minutes.
Simulated failure
Hallucination spike detected
A prompt update introduced hallucinated responses. Reliai detected the regression, opened an incident, and recommended a guardrail before users noticed.
Demo scenario
AI Support Copilot serves production support traffic.
The "Hallucination spike detected" alert appeared after a prompt rollout and surfaced before end-user complaints arrived.
This walkthrough shows the full operator loop: detect, explain, mitigate, and decide on rollout safety.
01
System health
Start at the control panel.
Reliai system status page
AI reliability control panel
AI Support Copilot
The default status page for this AI system. It answers what is happening, whether the system is safe, and where an operator should click next.
Is this system safe right now?
Answer: MAYBE
This AI system needs review before the next change.
The system is stable enough to operate, but current signals show elevated reliability risk.
System Health
Reliability score
92
Active incidents
1
Guardrails protecting
17
Traffic
Traces analyzed (24h)
2.3M
Throughput
27
traces/sec · 1m avg
Active services
6
System status
What needs attention next
Latest deployment
Today
Risk score 0.24
Incident pressure
1 incident / 24h
Latest: Hallucination spike after retriever prompt rollout
Guardrail pressure
17 triggers / 24h
Top policy: structured_output
Deployment risk
Safety before the next rollout
Guardrail activity
Runtime protection coverage
Policy compliance
Organization guardrail coverage
structured output
Mode: enforce
98.0%
cost budget
Mode: warn
96.0%
latency retry
Mode: enforce
94.0%
Recommended next step
Operator guidance
Add hallucination guardrail to retriever prompt
structured_output -> Add hallucination guardrail to the retriever prompt for gpt-4.1
Add retry policy for retrieval failures
latency_retry -> Retry retrieval failures once before fallback
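The guardrail policies listed above could be declared as simple configuration records. A minimal sketch, assuming a hypothetical schema (the field names are illustrative, not Reliai's actual API; the policy names and modes come from this demo):

```python
# Hypothetical guardrail policy declarations -- field names are
# illustrative, not Reliai's actual configuration schema.
GUARDRAIL_POLICIES = [
    {"name": "structured_output", "mode": "enforce", "target": "response_path"},
    {"name": "latency_retry", "mode": "enforce", "target": "retrieval", "max_retries": 1},
    {"name": "cost_budget", "mode": "warn", "target": "llm_call"},
]

def active_enforcements(policies):
    """Return the names of policies that block traffic rather than only warn."""
    return [p["name"] for p in policies if p["mode"] == "enforce"]
```

The enforce/warn split matters for the compliance numbers shown above: only enforcing policies change runtime behavior, while warn-mode policies (like cost_budget here) only emit signals.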
02
Incident
Responses referencing non-existent documentation
Reliai incident command center
Incident command center
Retrieval latency regression after prompt rollout
AI Support Copilot · retrieval_latency_ms · opened Today
Likely root cause
What probably broke
Prompt rollout changed retrieval behavior
Confidence 71%
Graph related patterns
Prompt rollout expanded retrieval context from 4 to 6 chunks
Retrieval timeout under long-context prompts
Global platform failures
Prompt rollout changed retrieval behavior · 71%
Retrieval stage is exceeding expected latency · 62%
Similar long-context retrieval failures seen elsewhere · 48%
Deployment changes
gpt-4.1-mini
support/refund-v42
82 min before incident
Guardrail triggers
structured_output · 11 triggers
latency_retry · 4 triggers
Recommended mitigation
What the operator should do next
Enable latency retry on retrieval and compare v42 against the previous prompt before expanding the rollout.
Recommended guardrails
Enable structured output validation on the full response path
Add retry policy for retrieval failures before fallback
Pause rollout of prompt version support/refund-v42
Rollback suggestion
A linked deployment was active near the incident start. Verify the rollout and consider a rollback if the trace comparison confirms the regression.
Model change risk
gpt-4.1-mini is part of the deployment context and should be treated as a candidate change vector.
Roll back prompt v42 after enabling retrieval retry
The likely change vector is the prompt rollout. Restore the previous prompt, enable latency retry, then compare baseline vs failing traces before redeploying.
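The "retry retrieval failures once before fallback" mitigation recommended here can be sketched as follows. `retrieve` and `fallback_answer` are hypothetical stand-ins, not Reliai SDK functions:

```python
# Sketch of the recommended mitigation: retry a failed retrieval once
# before falling back. `retrieve` and `fallback_answer` are hypothetical
# callables supplied by the application, not part of any real SDK.
def retrieve_with_retry(retrieve, fallback_answer, query, max_retries=1):
    for _ in range(max_retries + 1):
        try:
            return retrieve(query)
        except TimeoutError:
            continue  # transient retrieval failure: retry
    # All attempts exhausted: serve the degraded fallback answer.
    return fallback_answer(query)
```

A single retry cushions transient timeouts without masking a persistent regression, which is why the incident still recommends comparing v42 traces against the previous prompt.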
Incident status
Scope · environment:production
Window · 30 min
Owner · owner@acme.test
Ack · Today
Deployment context
Environment · production
Deployed · Today
Time to incident · 82 min
Prompt · support/refund-v42
Model · gpt-4.1-mini
Guardrail activity
structured_output
11 triggers
Last trigger · Today
latency_retry
4 triggers
Last trigger · Today
03
Trace graph
Slowest span: retrieve_context · Token-heavy span: answer_customer
Reliai trace debugger
Execution graph
trace_demo_agent_94f3
AI system debugging view for one request across retrieval, prompt construction, model execution, tool calls, guardrails, and post-processing.
Spans
6
Edges
5
Environment
production
Trace analysis
What stands out in this request
Slowest step
retrieve_context
980 ms
Largest token consumer
answer_customer
3840 tokens
Guardrail retries
structured_output
1 retry
Estimated cost surface
6940 tokens
Across all recorded spans in this trace graph
Span legend
Execution breakdown
Span tree
ai_request
gpt-4.1-mini
span span_root · root span
retrieve_context
pgvector
span span_retrieval · parent span_root
build_prompt
prompt-assembler
span span_prompt · parent span_root
answer_customer
gpt-4.1-mini
span span_llm · parent span_root
postprocess_answer
response-normalizer
span span_post · parent span_root
lookup_order_status
order-service
span span_tool · parent span_llm
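The span tree above is just parent pointers over span ids. A minimal sketch of rebuilding it (span ids come from this trace; the helper itself is illustrative):

```python
# (span_id, parent_id) pairs as shown in the span tree above.
SPANS = [
    ("span_root", None),
    ("span_retrieval", "span_root"),
    ("span_prompt", "span_root"),
    ("span_llm", "span_root"),
    ("span_post", "span_root"),
    ("span_tool", "span_llm"),
]

def children_of(spans, parent):
    """Return child span ids of `parent`, in recorded order."""
    return [sid for sid, pid in spans if pid == parent]
```

Six spans and five parent edges match the Spans/Edges counters shown above; the tool call hangs off the LLM span rather than the root, which is why `lookup_order_status` appears nested under `answer_customer`.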
04
Root cause explanation
Prompt update deployed earlier today
Likely cause
Prompt update deployed earlier today
The failing trace shows retrieval pressure first, then a model response that required guardrail retry. The deployment window lines up with the incident start.
Failure surface
Responses referencing non-existent documentation
Linked signal
Prompt update deployed 82 minutes before incident start.
05
Guardrail recommendation
Recommended runtime protections before the blast radius expands.
enforce
Structured output policy
Schema validation catches malformed tool responses before they reach users.
warn
Latency retry policy
Retrieval retry cushions transient upstream failures during traffic spikes.
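The structured output policy amounts to validating the model's response against a schema and retrying on failure. A minimal sketch, assuming a hypothetical `call_model` client and an illustrative two-key schema (neither is Reliai's actual API):

```python
import json

# Illustrative schema: the real policy would validate a full JSON Schema.
REQUIRED_KEYS = {"answer", "sources"}

def guarded_call(call_model, prompt, max_retries=1):
    """Call the model, enforce structured output, retry once on violation.

    `call_model` is a hypothetical stand-in for the real model client.
    """
    for _ in range(max_retries + 1):
        raw = call_model(prompt)
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed output: guardrail trigger, retry
        if REQUIRED_KEYS.issubset(payload):
            return payload
    raise ValueError("structured_output guardrail: retries exhausted")
```

This is the loop behind the "1 retry" counter in the trace above: one malformed response triggered the guardrail, and the retry produced a valid payload.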
06
Deployment safety
Finish at the rollout gate.
Reliai deployment gate
Deployment detail
support/refund-v42 · gpt-4.1-mini
production · current release window
Deployment metadata
Change record
Prompt version
support/refund-v42
Model version
gpt-4.1-mini
{
  "rollout": "progressive 20%",
  "release_reason": "refund handling improvement",
  "linked_simulation": "sim_demo_v42"
}
Deployment safety check
Gate decision
Safety state
WARNING
Risk score: 72/100
Deployment risk factors
- High regression probability in retrieval latency
- Similar failures seen across organizations
- Insufficient guardrail coverage for retrieval retries
- Recent incident correlation detected during rollout
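A deployment gate like this one maps a 0–100 risk score to a decision. A sketch under assumed thresholds (the cutoffs are hypothetical, not Reliai's actual rules; only the 72 → WARNING pairing comes from this demo):

```python
# Hypothetical gate thresholds -- only 72 -> WARNING is taken from the
# demo; the 50 and 85 cutoffs are assumptions for illustration.
def gate_decision(risk_score):
    if risk_score >= 85:
        return "BLOCK"
    if risk_score >= 50:
        return "WARNING"
    return "PASS"
```

A WARNING state lets the rollout continue at its current 20% stage while blocking further expansion until the listed risk factors are addressed.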
AI reliability insights
Known failure patterns
Known reliability patterns detected
- Similar retrieval latency failures have been seen across other deployments with expanded context windows.
- Current rollout has insufficient retry coverage for retrieval failures.
Retrieval timeout under long-context prompts
high risk · 142 traces
Recommended guardrails
Deployment risk factors
Why this change is safe or risky
High regression probability
Current deployment signals indicate elevated regression risk.
Cross-organization failures
Similar failure patterns have been seen in the reliability graph.
Guardrail coverage
Additional guardrail coverage is recommended before rollout.
Recent incident correlation
1 linked incident already references this deployment.