Server monitoring

Monitor a wide range of server performance metrics and incidents:

Performance

Server performance
  • High CPU or memory utilization
  • Network bandwidth usage
  • Packet loss rate
  • Interface error rate
  • Number of TCP connections is anomalously high for this day of the week
  • Aggregate throughput of core routers is low
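
As an illustration of the "anomalously high for this day of the week" check, the sketch below compares the current value with past readings taken on the same weekday. It is a minimal example under stated assumptions, not the product's own detection method: the function name, the z-score approach, and the default thresholds are all hypothetical.

```python
from statistics import mean, stdev


def is_anomalously_high(history, current, min_samples=4, z_threshold=3.0):
    """Return True if `current` is far above past same-weekday values.

    history: past TCP connection counts observed on the same day of the week.
    min_samples and z_threshold are illustrative defaults (assumptions).
    """
    if len(history) < min_samples:
        return False  # too little data to form a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # All past values were identical, so any increase stands out.
        return current > baseline
    # Flag values more than z_threshold standard deviations above the baseline.
    return (current - baseline) / spread > z_threshold


# Hypothetical connection counts from the last five Mondays.
mondays = [410, 395, 420, 405, 415]
print(is_anomalously_high(mondays, 430))  # slightly above the usual range
print(is_anomalously_high(mondays, 900))  # far above the usual range
```

Keeping a separate baseline per weekday avoids flagging a busy Monday just because it is busier than a quiet Sunday.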

Availability

Server availability
  • Free disk space is low
  • System status is in warning/critical state
  • Device temperature is too high / too low
  • Power supply is in critical state
  • Fan is in critical state
  • No SNMP data collection
  • Network connection is down
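
A check such as "Free disk space is low" can be approximated with a simple percentage threshold. The sketch below uses Python's standard `shutil.disk_usage`; the helper names and the 10% limit are assumptions chosen for illustration, not values taken from the product.

```python
import shutil


def free_space_percent(total_bytes, free_bytes):
    """Return free space as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * free_bytes / total_bytes


def disk_space_is_low(path="/", min_free_percent=10.0):
    """Return True if the filesystem holding `path` has too little free space.

    min_free_percent is an illustrative threshold (assumption).
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return free_space_percent(usage.total, usage.free) < min_free_percent


print(disk_space_is_low("."))
```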

Configuration

Configuration changes
  • New components added or removed
  • Network module is added, removed or replaced
  • Firmware has been upgraded
  • Device serial number has changed
  • Interface has changed to lower speed or half-duplex mode
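
Configuration changes like those listed above can be found by comparing two inventory snapshots of the same device. The following is a minimal sketch that assumes each snapshot is a flat dictionary; the key names (`firmware`, `serial`, `eth0.speed_mbps`, `module.2`) and the message format are hypothetical.

```python
def detect_changes(previous, current):
    """Compare two inventory snapshots and describe what differs."""
    changes = []
    # Walk the union of keys so additions and removals are both reported.
    for key in sorted(previous.keys() | current.keys()):
        old, new = previous.get(key), current.get(key)
        if key not in previous:
            changes.append(f"{key} added: {new}")
        elif key not in current:
            changes.append(f"{key} removed (was {old})")
        elif old != new:
            changes.append(f"{key} changed: {old} -> {new}")
    return changes


# Hypothetical snapshots taken before and after a maintenance window.
before = {"firmware": "1.2.0", "serial": "ABC123", "eth0.speed_mbps": 1000}
after = {"firmware": "1.3.0", "serial": "ABC123", "eth0.speed_mbps": 100,
         "module.2": "SFP"}

for line in detect_changes(before, after):
    print(line)
```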

Problems

Problem Detection

Define smart thresholds

Detect problem states within the incoming metric flow automatically. There is no need to watch incoming metrics continuously.

  • Highly flexible definition options
  • Separate problem conditions and resolution conditions
  • Multiple severity levels
  • Root cause analysis
  • Anomaly detection
  • Trend prediction
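
Separate problem and resolution conditions can be pictured as a threshold with hysteresis: a problem opens when the metric rises above one limit and closes only once it falls below a lower one, which prevents flapping around a single value. The sketch below is illustrative only; the `Trigger` class, its parameter names, and the 90/70 limits are assumptions, not the product's configuration syntax.

```python
class Trigger:
    """Threshold with distinct problem and resolution conditions."""

    def __init__(self, problem_above, recover_below):
        if recover_below > problem_above:
            raise ValueError("recover_below must not exceed problem_above")
        self.problem_above = problem_above
        self.recover_below = recover_below
        self.in_problem = False

    def update(self, value):
        """Feed one metric sample and return the current problem state."""
        if not self.in_problem and value > self.problem_above:
            self.in_problem = True   # problem condition met
        elif self.in_problem and value < self.recover_below:
            self.in_problem = False  # resolution condition met
        return self.in_problem


# Hypothetical CPU utilization samples, in percent.
cpu = Trigger(problem_above=90, recover_below=70)
states = [cpu.update(v) for v in [50, 95, 85, 75, 65, 80]]
print(states)  # [False, True, True, True, False, False]
```

The samples 85 and 75 keep the problem open even though they are below 90, because the resolution condition (below 70) has not yet been met.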
