Check Intervals & Thresholds

Fine-tune how often PingBase checks your services and when it should trigger incidents.

Check Intervals

The check interval determines how frequently PingBase sends a request to your service. Shorter intervals detect issues faster but generate more requests, as the checks-per-hour figures below show.

Interval      Checks/Hour   Best For
30 seconds    120           Mission-critical APIs, payment systems
60 seconds    60            Production services (recommended default)
5 minutes     12            Internal tools, staging environments
15 minutes    4             Low-priority services, documentation sites
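
The Checks/Hour column follows directly from the interval: 3600 seconds divided by the interval in seconds. A small sketch for working out the request volume of an interval not listed in the table (the function name checks_per_hour is illustrative, not part of PingBase):

```python
def checks_per_hour(interval_seconds: int) -> float:
    """Number of checks sent in one hour at a fixed interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return 3600 / interval_seconds


# Reproduce the rows of the table above.
for label, seconds in [
    ("30 seconds", 30),
    ("60 seconds", 60),
    ("5 minutes", 300),
    ("15 minutes", 900),
]:
    print(f"{label}: {checks_per_hour(seconds):.0f} checks/hour")
```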

Failure Thresholds

The failure threshold prevents false positives by requiring multiple consecutive failures before triggering an incident. This accounts for transient network issues and brief service restarts.

Threshold = 1

Incident created on the first failure. Use only for critical services where any downtime matters.

Threshold = 3 (Default)

Incident created after 3 consecutive failures. Good balance between speed and accuracy.

Threshold = 5

Incident created after 5 consecutive failures. More conservative; use for services that occasionally have brief hiccups.
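
The threshold behavior described above amounts to counting consecutive failed checks. The following is a minimal sketch, not PingBase's implementation: it assumes that any successful check resets the count (the text says "consecutive" but does not spell out the reset), and FailureTracker and record are hypothetical names.

```python
class FailureTracker:
    """Counts consecutive failed checks against a failure threshold."""

    def __init__(self, threshold: int = 3) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.consecutive_failures = 0

    def record(self, check_ok: bool) -> bool:
        """Record one check result; return True when an incident should open."""
        if check_ok:
            # Assumption: one success clears the streak.
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        # Fire exactly once, at the moment the threshold is reached.
        return self.consecutive_failures == self.threshold


tracker = FailureTracker(threshold=3)
history = [False, False, True, False, False, False]
opened = [tracker.record(ok) for ok in history]
print(opened)  # [False, False, False, False, False, True]
```

In this run the two early failures do not open an incident because the success in between resets the streak; only the third failure in a row does.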

Retry Logic

Before marking a check as failed, PingBase retries the request according to your configuration:

  • Retries — Number of retry attempts (default: 3)
  • Retry Interval — Seconds between retries (default: 30)

A check is only marked as failed after all retries are exhausted. With the default settings (3 retries, 30-second interval), a check takes up to 90 seconds (3 × 30 seconds of retry waits) to confirm a failure.
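
The retry behavior can be sketched as a loop that stops at the first success and reports failure only after the last retry. This is an illustration under assumptions: run_check, probe, and sleep are hypothetical names, and the sketch assumes the wait happens between attempts, not after the final one.

```python
import time
from typing import Callable


def run_check(
    probe: Callable[[], bool],
    retries: int = 3,
    retry_interval: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if any attempt succeeds, False once all retries fail."""
    for attempt in range(1 + retries):  # initial attempt plus retries
        if probe():
            return True
        if attempt < retries:
            sleep(retry_interval)  # wait before the next retry
    return False


# Record the waits instead of actually sleeping.
waits = []
confirmed_up = run_check(lambda: False, sleep=waits.append)
print(confirmed_up, sum(waits))  # False 90.0
```

Passing sleep as a parameter keeps the sketch testable without real delays; with the defaults, the recorded waits sum to the 90 seconds mentioned above.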

Timeout Configuration

The timeout value (default: 10 seconds) determines how long PingBase waits for a response before considering the check failed. If your service has known slow endpoints, increase this value to avoid false positives. For health check endpoints, 5-10 seconds is typically sufficient.
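
When choosing a timeout, it can help to estimate how it combines with retries. The sketch below assumes the worst case where every attempt runs until the timeout and the retry interval is counted from the end of each failed attempt; the text does not state how PingBase measures this, so treat the result as a rough upper bound. The function name is hypothetical.

```python
def worst_case_check_seconds(
    timeout: float = 10.0,
    retries: int = 3,
    retry_interval: float = 30.0,
) -> float:
    """Upper bound on the time one check needs to confirm a failure."""
    attempts = 1 + retries  # initial attempt plus retries
    # Assumption: each attempt times out, then waits retry_interval.
    return attempts * timeout + retries * retry_interval


print(worst_case_check_seconds())           # 130.0
print(worst_case_check_seconds(timeout=5))  # 110.0
```

Under these assumptions, raising the timeout adds to the confirmation time once per attempt, so a larger timeout on a monitor with several retries delays failure detection noticeably.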

Example Configuration

{
  "name": "Production API",
  "target": "https://api.example.com/health",
  "interval": 60,
  "timeout": 10,
  "retries": 3,
  "retryInterval": 30,
  "failureThreshold": 3,
  "expectedStatus": [200]
}

With this configuration, PingBase checks every 60 seconds. If a check fails, it retries 3 times at 30-second intervals. After 3 consecutive failed checks (each with 3 retries), an incident is automatically created.
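
Before submitting a configuration like the one above, it can be checked locally for obvious mistakes. The sketch below parses the example JSON into a typed object; the field names come from the example, while MonitorConfig, parse_config, and the validation rules are illustrative assumptions, not PingBase's documented limits.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitorConfig:
    name: str
    target: str
    interval: int           # seconds between checks
    timeout: int            # seconds to wait for a response
    retries: int            # retry attempts per check
    retry_interval: int     # seconds between retries
    failure_threshold: int  # consecutive failed checks before an incident
    expected_status: tuple[int, ...]


def parse_config(raw: str) -> MonitorConfig:
    """Parse a JSON monitor configuration and apply basic sanity checks."""
    data = json.loads(raw)
    config = MonitorConfig(
        name=data["name"],
        target=data["target"],
        interval=data["interval"],
        timeout=data["timeout"],
        retries=data["retries"],
        retry_interval=data["retryInterval"],
        failure_threshold=data["failureThreshold"],
        expected_status=tuple(data["expectedStatus"]),
    )
    # Assumed rules, chosen for illustration only.
    if config.interval <= 0 or config.timeout <= 0:
        raise ValueError("interval and timeout must be positive")
    if config.retries < 0 or config.retry_interval < 0:
        raise ValueError("retries and retryInterval must not be negative")
    if config.failure_threshold < 1:
        raise ValueError("failureThreshold must be at least 1")
    return config


EXAMPLE = """
{
  "name": "Production API",
  "target": "https://api.example.com/health",
  "interval": 60,
  "timeout": 10,
  "retries": 3,
  "retryInterval": 30,
  "failureThreshold": 3,
  "expectedStatus": [200]
}
"""

config = parse_config(EXAMPLE)
print(config.name, config.interval, config.failure_threshold)
```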