I love Graphite. It’s the most robust, flexible, kick-ass monitoring tool out there. But when I say monitoring, I’m actually not describing what graphite really does. In fact, it does almost anything but monitoring. It collects metrics via carbon, stores them using whisper, and provides a front-end (both API and web-based) via graphite-web. It does not, however, monitor anything, and it certainly does not alert when certain things happen (or fail to happen).
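To make the "collects metrics via carbon" part concrete, here is a minimal sketch of pushing one data point to carbon over its plaintext protocol (one line per data point: path, value, Unix timestamp). The function names, the metric path and the host are my own placeholders, and I'm assuming carbon listens on its usual plaintext port, 2003.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    # One line of carbon's plaintext protocol: "<path> <value> <timestamp>",
    # terminated by a newline.
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(path, value, host="localhost", port=2003, timestamp=None):
    # host and port are assumptions; adjust them to your carbon setup.
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
```

Calling something like send_metric("web.errors.500", 3) would record the value 3 under that (made-up) metric path. Nothing here watches the value or alerts on it, which is exactly the gap this post is about.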
So graphite is great for collecting, viewing and analyzing data, particularly with the multitude of dashboard front-ends, my favourite being giraffe ;-). But what can you do when you want to get an email or a text message when, say, carbon throws some errors, or your web server starts to bleed 500s like there’s no tomorrow? Even better: what if you want to get an email when your signup conversion rate drops below a certain mark?
Monitoring graphite
So what can you use if you want to monitor stuff using graphite? And what kind of stuff can you monitor? I’ve come across a really great approach using nagios. In fact, I ‘borrowed’ the method its author used for alerting on 500 errors for my own approach. I wanted to do something very similar, but I really didn’t want nagios. It’s overkill for me if all I want is to get an email (or run a script) when something goes wrong.
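The basic idea can be sketched without nagios at all: ask graphite-web's render API for recent data as JSON, take the latest non-null value of each series, and compare it to a threshold. This is only an illustration under my own assumptions, not the method from the nagios post; the function names, the base URL and the threshold are placeholders.

```python
import json
import urllib.parse
import urllib.request


def fetch_series(base_url, target, since="-5min"):
    # Query the render API with format=json. The response is a list of
    # series, each with a "target" name and "datapoints" as [value, timestamp].
    query = urllib.parse.urlencode(
        {"target": target, "from": since, "format": "json"}
    )
    with urllib.request.urlopen(f"{base_url}/render?{query}", timeout=10) as resp:
        return json.load(resp)


def latest_value(series):
    # The most recent buckets are often null, so walk backwards
    # until we find an actual value.
    for value, _timestamp in reversed(series["datapoints"]):
        if value is not None:
            return value
    return None


def breaches(series_list, threshold):
    # Return (target, value) for every series whose latest value
    # is above the threshold.
    result = []
    for series in series_list:
        value = latest_value(series)
        if value is not None and value > threshold:
            result.append((series["target"], value))
    return result
```

Run from cron, something like breaches(fetch_series("http://graphite.example.com", "web.errors.500"), 10) would give you the list of offending series, and a non-empty list is the cue to send that email or run that script. For a metric that should stay above a mark, like a signup rate, you would flip the comparison.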