Preventing Alert Fatigue in Website Monitoring
Too Many Alerts Means No One Listens
Alert fatigue is one of the most common operational problems teams encounter after setting up monitoring. When false positives, planned-downtime notifications, and minor blips all fire with the same urgency as real outages, engineers start ignoring alerts. By the time a critical incident occurs, your team has been conditioned not to act — and the blast radius grows.
For web agencies managing dozens of client sites, the problem compounds quickly. One noisy monitor on a low-priority staging environment can desensitize the team to every alert across the account.
This article covers four practical approaches in Miterl that, used together, let you design a monitoring setup where every alert that arrives is worth acting on.
What Causes Alert Fatigue?
Alert fatigue has four main causes, each requiring a different fix:
| Cause | Example |
|---|---|
| False positives from transient errors | A single network glitch triggers an alert |
| Planned downtime not suppressed | Maintenance floods Slack with notifications |
| Off-hours notifications for low-severity events | A minor slowdown at 3 AM wakes the on-call engineer |
| Thresholds set too aggressively | Any response time variance triggers an alert |
Addressing each cause independently, then combining them, brings alert volume down to a manageable and trustworthy level.
Fix 1: Confirmation Count — Filter Out Single-Check Failures
The fastest win is setting a confirmation count (confirmation_count): Miterl only fires an alert after N consecutive failed checks. A single-check failure from a transient network issue gets filtered out before it reaches anyone's Slack.
# Create a monitor with confirmation_count set
curl -X POST https://miterl.com/api/v1/monitors \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Client A - Production",
    "url": "https://client-site.example.com",
    "type": "http",
    "interval_seconds": 60,
    "confirmation_count": 2,
    "alert_contact_ids": [1]
  }'
With confirmation_count: 2, Miterl only alerts after two consecutive failures. A one-time blip is silently ignored.
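The filtering rule can be illustrated with a small simulation. This is a minimal Python sketch of the consecutive-failure logic as described here, not Miterl's actual implementation; the function name `first_alert_index` and the list of boolean check results are assumptions made for illustration.

```python
from typing import Optional, Sequence


def first_alert_index(results: Sequence[bool], confirmation_count: int) -> Optional[int]:
    """Return the index of the check that would trigger an alert, or None.

    `results` holds one entry per check: True = passed, False = failed.
    An alert fires once `confirmation_count` failures occur in a row.
    """
    if confirmation_count < 1:
        raise ValueError("confirmation_count must be at least 1")
    streak = 0
    for index, passed in enumerate(results):
        # Any passing check resets the run of consecutive failures.
        streak = 0 if passed else streak + 1
        if streak >= confirmation_count:
            return index
    return None


# A single blip between healthy checks never reaches a threshold of 2.
blip = [True, False, True, True]
# A sustained outage keeps failing, so the second failed check triggers the alert.
outage = [True, False, False, False]

print(first_alert_index(blip, 2))
print(first_alert_index(outage, 2))
```

With a count of 1, the same blip would alert on its first failed check, which is why the setting matters most on noisy, low-priority monitors.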
Recommended values by priority:
| Site Priority | confirmation_count | Effective detection time (60-second interval) |
|---|---|---|
| Revenue-critical (high) | 1–2 | 1–2 minutes |
| Corporate / informational (medium) | 2–3 | 2–3 minutes |
| Staging / development (low) | 3–5 | 3–5 minutes |
High-priority sites get fast detection, while lower-priority sites tolerate more transient failures before alerting. This reduces overall alert volume without sacrificing coverage on what matters.
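The detection times in the table follow from simple arithmetic when checks run every 60 seconds. The sketch below assumes the worst case, where an outage begins just after a passing check and each confirmation costs one more full interval. The helper name is hypothetical, and the real timing may differ if the service retries failed checks on a different schedule.

```python
def worst_case_detection_seconds(interval_seconds: int, confirmation_count: int) -> int:
    """Upper bound on the time from outage start to alert.

    Assumes the outage starts right after a passing check, so the first
    failure is observed one full interval later, and every additional
    confirmation adds one more interval.
    """
    if interval_seconds <= 0 or confirmation_count < 1:
        raise ValueError("interval and confirmation_count must be positive")
    return interval_seconds * confirmation_count


for count in (1, 2, 3, 5):
    minutes = worst_case_detection_seconds(60, count) / 60
    print(f"confirmation_count={count}: up to {minutes:.0f} min")
```

Under the same assumption, a longer interval stretches the delay: a 300-second interval with `confirmation_count: 5` could take up to 25 minutes to alert, which is usually acceptable only for staging or development sites.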
Fix 2: Maintenance Windows — Silence Planned Downtime
Server migrations, CMS updates, and database maintenance all cause legitimate downtime. Without suppression, each one floods your alerting channels and trains your team to expect noise. Miterl's maintenance window feature solves this by suppressing notifications during a registered window — while keeping checks running in the background.
# Open a maintenance window via webhook before a deployment
curl -X POST https://miterl.com/api/v1/webhooks/maintenance/YOUR_TOKEN/start \
  -H "Content-Type: application/json" \
  -d '{
    "duration_hours": 2,
    "name": "CMS Update"
  }'

# Close the window when work is complete
curl -X POST https://miterl.com/api/v1/webhooks/maintenance/YOUR_TOKEN/end
Adding these calls to your CI/CD pipeline or deploy script eliminates the "forgot to set the maintenance window" mistake entirely. If maintenance overruns and the site is still down when the window closes, an alert fires immediately.
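For deploy scripts written in Python, the start and end calls can be wrapped so the window is always closed, even when the deployment fails. The following is a sketch under stated assumptions: it reuses only the two webhook URLs shown in the curl example, while `maintenance_window`, `http_post`, and `fake_post` are hypothetical helper names, and response handling is left out.

```python
import json
import urllib.request
from contextlib import contextmanager
from typing import Callable, Iterator, Optional

BASE_URL = "https://miterl.com/api/v1/webhooks/maintenance"


def http_post(url: str, payload: Optional[dict] = None) -> None:
    """Send a POST request (JSON body if given); raises on HTTP or network errors."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {} if payload is None else {"Content-Type": "application/json"}
    request = urllib.request.Request(url, data=data, headers=headers, method="POST")
    with urllib.request.urlopen(request, timeout=10):
        pass


@contextmanager
def maintenance_window(
    token: str,
    name: str,
    duration_hours: int,
    post: Callable[[str, Optional[dict]], None] = http_post,
) -> Iterator[None]:
    """Open a maintenance window before the wrapped work and close it afterwards."""
    post(f"{BASE_URL}/{token}/start", {"duration_hours": duration_hours, "name": name})
    try:
        yield
    finally:
        # Runs even if the deployment raises, so alerting is always restored.
        post(f"{BASE_URL}/{token}/end", None)


def fake_post(url: str, payload: Optional[dict] = None) -> None:
    """Stand-in for http_post that only prints, for a dry run."""
    print("POST", url, payload)


# Dry run: drop the post= argument to send real requests.
with maintenance_window("YOUR_TOKEN", "CMS Update", 2, post=fake_post):
    print("deploying...")
```

Closing the window in `finally` is a deliberate choice: if the deployment breaks the site, alerting resumes right away instead of staying silent for the rest of the registered duration.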
For full configuration options including recurring schedules, see "Maintenance Window Best Practices."
Fix 3: Quiet Hours — Block Off-Hours Low-Severity Noise
Even without planned maintenance, monitoring generates off-hours noise: a brief slowdown at 3 AM, a minor DNS hiccup on a Sunday morning. These wake up on-call engineers for incidents that resolve on their own — and erode the trust that a 3 AM alert means something important.
Miterl's quiet hours feature lets you suppress alert notifications on a per-company schedule. Checks continue to run and all incidents are logged; only the outbound notifications are held.
Example quiet hours configuration (typical for a Japanese agency):
{
  "enabled": true,
  "timezone": "Asia/Tokyo",
  "weekdays": {
    "enabled": true,
    "start": "22:00",
    "end": "07:00"
  },
  "weekends": {
    "enabled": true,
    "all_day": true
  }
}
With this configuration, alerts are suppressed weeknights from 22:00 to 07:00 and all day on weekends. Overnight incidents are logged and visible in the dashboard when the team starts work in the morning. If the site is still down when quiet hours end, an alert fires immediately.
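To make the schedule concrete, here is a rough Python sketch of how a configuration like this could be evaluated. It is one plausible interpretation, not Miterl's documented behavior: it assumes the weekday window applies to Monday through Friday and wraps past midnight, that weekends only support the all-day case, and it uses a fixed UTC+9 offset in place of the "Asia/Tokyo" zone (Japan has no daylight saving time). The function names are hypothetical.

```python
from datetime import datetime, time, timedelta, timezone, tzinfo

# Fixed UTC+9 offset standing in for "Asia/Tokyo".
JST = timezone(timedelta(hours=9), "JST")

QUIET_HOURS = {
    "enabled": True,
    "timezone": "Asia/Tokyo",  # informational here; the tz argument is what is used
    "weekdays": {"enabled": True, "start": "22:00", "end": "07:00"},
    "weekends": {"enabled": True, "all_day": True},
}


def parse_hhmm(value: str) -> time:
    """Convert an "HH:MM" string into a time object."""
    hours, minutes = value.split(":")
    return time(int(hours), int(minutes))


def in_quiet_hours(moment: datetime, config: dict, tz: tzinfo = JST) -> bool:
    """Return True if the timezone-aware `moment` falls inside quiet hours."""
    if not config["enabled"]:
        return False
    local = moment.astimezone(tz)
    if local.weekday() >= 5:  # Saturday = 5, Sunday = 6
        weekend = config["weekends"]
        return bool(weekend["enabled"] and weekend.get("all_day", False))
    weekday = config["weekdays"]
    if not weekday["enabled"]:
        return False
    start, end = parse_hhmm(weekday["start"]), parse_hhmm(weekday["end"])
    now = local.time()
    if start <= end:
        return start <= now < end
    # The window wraps past midnight, e.g. 22:00 to 07:00.
    return now >= start or now < end


tuesday_night = datetime(2024, 6, 4, 23, 30, tzinfo=JST)
tuesday_noon = datetime(2024, 6, 4, 12, 0, tzinfo=JST)
saturday_noon = datetime(2024, 6, 8, 12, 0, tzinfo=JST)

print(in_quiet_hours(tuesday_night, QUIET_HOURS))  # inside the 22:00-07:00 window
print(in_quiet_hours(tuesday_noon, QUIET_HOURS))   # normal working hours
print(in_quiet_hours(saturday_noon, QUIET_HOURS))  # weekend, all day
```

Because the comparison is done in the company's local time, a timestamp stored in UTC is converted first, so 14:00 UTC on a Tuesday counts as 23:00 in Tokyo and falls inside the window.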
Handling critical sites during quiet hours:
If some monitors should always page regardless of quiet hours, manage it through alert contacts. Assign critical sites to an alert contact group that bypasses quiet hours, and route lower-priority monitors through a group that respects the schedule. This gives you silence on the noise without missing genuine emergencies.
Quiet hours are configured under the "Quiet Hours" section in the dashboard and respect each company's timezone setting.
Fix 4: Thresholds and Alert Routing
The final layer is designing what triggers an alert and who receives it.
Response time thresholds
Setting a response time threshold too low generates constant noise from normal variance. A practical starting point is 3–5x your site's typical response time:
# Monitor with a generous response time threshold
curl -X POST https://miterl.com/api/v1/monitors \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Client A - HTTP",
    "url": "https://client-site.example.com",
    "type": "http",
    "interval_seconds": 60,
    "response_time_threshold_ms": 5000,
    "confirmation_count": 3,
    "alert_contact_ids": [1]
  }'
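response_time_threshold_ms: 5000 means the check only fails if the site takes more than 5 seconds to respond, which is clearly abnormal rather than ordinary variance. Combined with confirmation_count: 3, the site has to be that slow on three consecutive checks before anyone is alerted.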
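One way to pick a starting value is to derive it from recent measurements. The sketch below treats the median of a handful of response time samples as the "typical" value and multiplies it by a factor in the 3-5x range. Using the median is an assumption made here (it is less sensitive to a single slow outlier than the mean), and the function name and sample data are illustrative only.

```python
from statistics import median
from typing import Sequence


def suggest_threshold_ms(samples_ms: Sequence[float], multiplier: float = 4.0) -> int:
    """Suggest a response time threshold as a multiple of the median sample."""
    if not samples_ms:
        raise ValueError("need at least one response time sample")
    if not 3.0 <= multiplier <= 5.0:
        raise ValueError("multiplier is expected to be within the 3-5x range")
    return round(median(samples_ms) * multiplier)


# Hypothetical measurements in milliseconds, including one slow outlier.
recent = [820, 910, 1050, 990, 1200, 880, 4300]
print(suggest_threshold_ms(recent))
```

For these samples the median is 990 ms, so the default multiplier suggests a threshold of 3960 ms; the 4300 ms outlier barely shifts the result.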
Routing alerts by severity
Sending all alerts to a single Slack channel means critical incidents get buried. Structuring alert contacts by severity prevents this:
| Alert Contact Group | Assigned Monitors | Destination |
|---|---|---|
| Critical (always on) | E-commerce, booking systems | Slack #critical + SMS/phone |
| Standard (quiet hours apply) | Corporate sites | Slack #monitoring |
| Log only | Staging and dev | Slack #dev-monitoring |
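The routing table can be read as a small decision rule: each group has its own destinations, and only some groups are held back during quiet hours. The Python sketch below models that rule. The destination strings, the group keys, and the choice to treat the "Log only" group as respecting quiet hours are assumptions for illustration, not settings taken from Miterl.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ContactGroup:
    name: str
    destinations: Tuple[str, ...]
    respects_quiet_hours: bool


GROUPS = {
    "critical": ContactGroup("Critical (always on)", ("slack:#critical", "sms"), False),
    "standard": ContactGroup("Standard (quiet hours apply)", ("slack:#monitoring",), True),
    "dev": ContactGroup("Log only", ("slack:#dev-monitoring",), True),
}


def destinations_for(group_key: str, in_quiet_hours: bool) -> List[str]:
    """Return where an alert should go right now; an empty list means hold it."""
    group = GROUPS[group_key]
    if in_quiet_hours and group.respects_quiet_hours:
        return []
    return list(group.destinations)


# At 3 AM on a weeknight, only the critical group still notifies anyone.
for key in GROUPS:
    print(key, destinations_for(key, in_quiet_hours=True))
```

Keeping the bypass flag on the group rather than on each monitor means a site's paging behavior changes simply by moving it between groups.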
For Slack and Chatwork alert configuration, see "How to Set Up Slack and Chatwork Alerts."
Putting It All Together
A practical setup for an agency managing a mixed portfolio:
# Production site: tight confirmation, always-on alert contact
curl -X POST https://miterl.com/api/v1/monitors \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "[ClientA] Production - HTTP",
    "url": "https://client-a.example.com",
    "type": "http",
    "interval_seconds": 60,
    "confirmation_count": 2,
    "alert_contact_ids": [1]
  }'

# Staging site: loose confirmation, quiet-hours-aware alert contact
curl -X POST https://miterl.com/api/v1/monitors \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "[ClientA] Staging - HTTP",
    "url": "https://staging.client-a.example.com",
    "type": "http",
    "interval_seconds": 300,
    "confirmation_count": 5,
    "alert_contact_ids": [2]
  }'
alert_contact_ids: [1] is the always-on critical group; [2] is the standard group with quiet hours applied.
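When the same pattern repeats across dozens of clients, generating the request bodies from a per-tier template keeps the settings consistent. This is a minimal sketch that only builds the JSON payloads; the field names and values mirror the two monitors in this section, while the `TIERS` table and `monitor_payload` helper are hypothetical and sending the request is left to whatever HTTP client the deploy tooling already uses.

```python
import json

# Per-tier defaults mirroring the production and staging monitors in this section.
# The contact IDs are placeholders for the always-on and quiet-hours-aware groups.
TIERS = {
    "production": {"interval_seconds": 60, "confirmation_count": 2, "alert_contact_ids": [1]},
    "staging": {"interval_seconds": 300, "confirmation_count": 5, "alert_contact_ids": [2]},
}


def monitor_payload(client: str, environment: str, url: str) -> dict:
    """Build the request body for one HTTP monitor from its tier defaults."""
    tier = TIERS[environment]
    return {
        "name": f"[{client}] {environment.capitalize()} - HTTP",
        "url": url,
        "type": "http",
        "interval_seconds": tier["interval_seconds"],
        "confirmation_count": tier["confirmation_count"],
        # Copy the list so callers cannot mutate the shared tier defaults.
        "alert_contact_ids": list(tier["alert_contact_ids"]),
    }


payloads = [
    monitor_payload("ClientA", "production", "https://client-a.example.com"),
    monitor_payload("ClientA", "staging", "https://staging.client-a.example.com"),
]
print(json.dumps(payloads, indent=2))
```

Changing a tier default in one place then updates every monitor generated from it, which avoids the drift that tends to creep in when payloads are hand-edited per client.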
FAQ
If a site goes down during quiet hours, will anyone know?
Incidents are logged normally during quiet hours — checks keep running. The team sees what happened when they open the dashboard in the morning. If the site is still down when quiet hours end, an alert fires immediately so the situation is not missed.
Can I set different confirmation counts for different monitors?
Yes. confirmation_count is configured per monitor. You can set it low for revenue-critical sites (fast detection) and high for staging environments (noise tolerance), independently.
Does closing a maintenance window immediately restore alerting?
Yes, alerting resumes the instant the window closes. If the site is still down at that moment, an alert fires right away.
Summary
Alert fatigue is not inevitable — it is a configuration problem with practical solutions.
- Confirmation count: Filter out single-check false positives
- Maintenance windows: Suppress planned-downtime noise automatically
- Quiet hours: Block off-hours notifications for low-severity events
- Threshold and routing design: Segment alerts by severity and destination
Used together, these four controls turn alert volume from noise into signal. When an alert arrives, your team knows it is worth acting on — and responds accordingly.
For full setup documentation, see the docs. Try it yourself by signing up for free. For incident response once an alert does fire, see "Incident Response Playbook." For monitoring type fundamentals, see "Server Monitoring Basics: HTTP, Ping, DNS, and SSL." See how agencies use Miterl in the use cases section.