
Incidents

Incidents group related alerts together and track their lifecycle from detection to resolution.

How Incidents Work

  1. Alert fires - A rule condition is met
  2. Incident created - New incident or grouped with existing
  3. Notifications sent - Team is alerted
  4. Investigation - Team acknowledges and investigates
  5. Resolution - Issue is fixed, incident closed
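The lifecycle above can be sketched as a small state machine. The state names match the Incident States table below; the exact transition rules (e.g. whether an incident may be resolved without being acknowledged first) are an illustrative assumption:

```python
# Illustrative sketch of incident lifecycle transitions.
# State names come from the docs; the transition table is an assumption.
ALLOWED_TRANSITIONS = {
    "open": {"acknowledged", "resolved"},   # assumes direct resolve is allowed
    "acknowledged": {"resolved"},
    "resolved": set(),                      # terminal state
}

def can_transition(current: str, target: str) -> bool:
    """Return True if moving from `current` to `target` is allowed."""
    return target in ALLOWED_TRANSITIONS.get(current, set())
```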

Incident Grouping

Related alerts are automatically grouped:

Incident: API Degradation
├── Alert: High P95 Latency (10:30)
├── Alert: Error Rate Spike (10:32)
└── Alert: Database Slow Queries (10:35)

Grouping rules:
  • Same service/source
  • Within time window (default: 5 minutes)
  • Similar tags
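The first two grouping rules can be sketched as a predicate over a pair of alerts (the field names `service` and `fired_at` are assumptions, and the tag-similarity rule is omitted for brevity):

```python
from datetime import datetime, timedelta

# Illustrative grouping check: same service/source and fired within the
# time window (default: 5 minutes). Field names are assumed.
def should_group(alert_a: dict, alert_b: dict,
                 window: timedelta = timedelta(minutes=5)) -> bool:
    same_source = alert_a["service"] == alert_b["service"]
    t_a = datetime.fromisoformat(alert_a["fired_at"])
    t_b = datetime.fromisoformat(alert_b["fired_at"])
    return same_source and abs(t_a - t_b) <= window
```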

Incident States

State         Description
open          New incident, not yet acknowledged
acknowledged  Team is aware and investigating
resolved      Issue fixed, incident closed

Managing Incidents

List Incidents

GET /api/v1/incidents
GET /api/v1/incidents?status=open
GET /api/v1/incidents?severity=critical
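A minimal sketch of building these filtered list URLs on the client side (the `https://example.com` base is a placeholder, not the real API host):

```python
from urllib.parse import urlencode

BASE_URL = "https://example.com"  # placeholder host

def incidents_url(**filters) -> str:
    """Build a list URL, e.g. /api/v1/incidents?status=open."""
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{BASE_URL}/api/v1/incidents" + (f"?{query}" if query else "")
```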

Get Incident

GET /api/v1/incidents/:id

Response includes:
{
  "incident": {
    "id": "inc_abc123",
    "title": "High Error Rate on API",
    "status": "acknowledged",
    "severity": "critical",
    "created_at": "2024-01-15T10:30:00Z",
    "acknowledged_at": "2024-01-15T10:32:00Z",
    "acknowledged_by": "user_xyz",
    "alerts": [
      {
        "id": "alert_1",
        "rule_name": "High Error Rate",
        "fired_at": "2024-01-15T10:30:00Z"
      }
    ],
    "timeline": [
      {
        "event": "incident_created",
        "timestamp": "2024-01-15T10:30:00Z"
      },
      {
        "event": "alert_added",
        "alert_id": "alert_1",
        "timestamp": "2024-01-15T10:30:00Z"
      },
      {
        "event": "acknowledged",
        "user": "user_xyz",
        "timestamp": "2024-01-15T10:32:00Z"
      }
    ]
  }
}

Acknowledge Incident

POST /api/v1/incidents/:id/acknowledge
{
  "note": "Investigating the issue"
}

Resolve Incident

POST /api/v1/incidents/:id/resolve
{
  "note": "Deployed fix in v1.2.3",
  "root_cause": "Memory leak in cache service"
}

Add Note

POST /api/v1/incidents/:id/notes
{
  "content": "Identified root cause - scaling up instances"
}

Incident Timeline

Every incident maintains a timeline:

Event             Description
incident_created  Incident opened
alert_added       New alert added to the incident
alert_resolved    An alert in the incident resolved
acknowledged      Team acknowledged the incident
note_added        Note added
escalated         Escalated to the next level
resolved          Incident resolved

Escalation

Incidents can escalate if not acknowledged:
{
  "escalation_policy": {
    "steps": [
      { "delay": "5m", "channels": ["slack-ops"] },
      { "delay": "15m", "channels": ["pagerduty-oncall"] },
      { "delay": "30m", "channels": ["email-management"] }
    ]
  }
}
See Escalation Policies for configuration.
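A sketch of how a client might interpret a policy like the one above: given the minutes elapsed without acknowledgment, return every channel that should have been notified so far. The delay parser handles only the `"5m"` minutes form shown in the example:

```python
def parse_delay_minutes(delay: str) -> int:
    """Parse delays like "5m" into minutes (only the form shown above)."""
    if not delay.endswith("m"):
        raise ValueError(f"unsupported delay format: {delay}")
    return int(delay[:-1])

def channels_due(policy: dict, elapsed_minutes: int) -> list[str]:
    """Channels whose escalation step has already been reached."""
    due = []
    for step in policy["steps"]:
        if elapsed_minutes >= parse_delay_minutes(step["delay"]):
            due.extend(step["channels"])
    return due
```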

Incident Metrics

Track incident performance:
GET /api/v1/incidents/stats
{
  "stats": {
    "total": 156,
    "by_status": {
      "open": 3,
      "acknowledged": 2,
      "resolved": 151
    },
    "mttr": 1800,
    "mtta": 300
  }
}
Metric  Description
MTTA    Mean Time to Acknowledge, in seconds
MTTR    Mean Time to Resolve, in seconds
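Both metrics can be computed from incident timestamps. A sketch using the field names from the Get Incident response; `resolved_at` is an assumed analogous field not shown in that example:

```python
from datetime import datetime

def _ts(s: str) -> datetime:
    return datetime.fromisoformat(s.replace("Z", "+00:00"))

def _mean_seconds(incidents, start_field, end_field):
    """Mean gap in seconds over incidents that have both timestamps."""
    deltas = [
        (_ts(i[end_field]) - _ts(i[start_field])).total_seconds()
        for i in incidents
        if i.get(start_field) and i.get(end_field)
    ]
    return sum(deltas) / len(deltas) if deltas else None

def mtta(incidents):
    return _mean_seconds(incidents, "created_at", "acknowledged_at")

def mttr(incidents):
    return _mean_seconds(incidents, "created_at", "resolved_at")
```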

Best Practices

Acknowledge Quickly

Set targets for acknowledgment time.

Add Notes

Document investigation steps in the incident timeline.

Include Root Cause

Record the root cause when resolving an incident.

Review Metrics

Track MTTA and MTTR to improve response times.