Splunk – CPU Alert

If you use Splunk for Infrastructure, this CPU alert is highly customizable. Unlike a typical alert, where every value is static, it lets each host have its own threshold and its own waiting time before the alert fires. The query below hard-codes both values for debugging; you can load them from a lookup table instead to customize them per host.

| mstats avg(cpu_metric.pctIdle) as curr_avg_idle_cpu WHERE earliest=-9h latest=-0h index=unix_metrics AND CPU=all BY host span=1h
| reverse ``` by default the oldest bucket comes first; reverse so the latest hour is at the top of the list below ```
| stats list(*) as * by host
| rename curr_avg_idle_cpu as last_8_hour_range
| eval hour_ago_1=mvindex(last_8_hour_range,0), hour_ago_2=mvindex(last_8_hour_range,1), hour_ago_3=mvindex(last_8_hour_range,2), hour_ago_4=mvindex(last_8_hour_range,3),
       hour_ago_5=mvindex(last_8_hour_range,4), hour_ago_6=mvindex(last_8_hour_range,5), hour_ago_7=mvindex(last_8_hour_range,6), hour_ago_8=mvindex(last_8_hour_range,7)

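``` Debugging values, customize these: threshold is the minimum average idle CPU (%), hour_delay is how many of the most recent hours must all be below it ```
| eval threshold=5, hour_delay=3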

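``` Initialise explicitly so the concatenation never starts from a null field ```
| eval threshold_result=""
``` Match only hour_ago_1..8 so that hour_delay itself is not evaluated; each hour appends a 3-character token, "NO," or "OK," ```
| foreach hour_ago_* [eval threshold_result=threshold_result + if('<<FIELD>>' < threshold,"NO","OK") + ","]
``` Keep only the tokens for the most recent hour_delay hours ```
| eval user_threshold_result=substr(threshold_result,1,hour_delay*3)
``` Alert only if none of those hours was OK ```
| eval critical_alert=if(like(user_threshold_result,"%OK%"),"NO","YES")
| search critical_alert="YES"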
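
The last few steps can be checked in isolation with makeresults. This is only a sketch with made-up values: the host was under the idle threshold for the two most recent hours but fine in the third, so with hour_delay=3 it should not alert.

| makeresults
| eval threshold_result="NO,NO,OK,NO,NO,NO,NO,NO,", hour_delay=3
| eval user_threshold_result=substr(threshold_result,1,hour_delay*3) ``` first 9 characters: "NO,NO,OK," ```
| eval critical_alert=if(like(user_threshold_result,"%OK%"),"NO","YES") ``` "NO": one of the last three hours was OK ```

Changing hour_delay to 2 in the same search leaves "NO,NO," and critical_alert becomes "YES".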

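For per-host values, the step that sets threshold and hour_delay could be replaced with a lookup. A minimal sketch, assuming a hypothetical lookup named cpu_alert_thresholds (the name and its columns host, threshold and hour_delay are illustrative, not part of the query above); hosts missing from the lookup fall back to the debugging defaults:

| lookup cpu_alert_thresholds host OUTPUT threshold hour_delay ``` hypothetical lookup: one row per host ```
| eval threshold=coalesce(tonumber(threshold),5), hour_delay=coalesce(tonumber(hour_delay),3) ``` defaults for hosts not listed ```

With this in place, a noisy batch server could be given a low threshold and a long hour_delay, while a latency-sensitive host gets a higher threshold and alerts after a single hour.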