Alerta started at The Guardian out of necessity, as a replacement for a legacy monitoring tool, but only after all credible alternatives had been exhaustively evaluated.
Initially, all we wanted was to be able to create alert thresholds against the hundreds of thousands of Ganglia metrics collected for the website and view the alerts in a web console, i.e. a Ganglia “alerter”. As we had no proper name for this metrics and monitoring system, the working name of “an alerter” stuck, and a simple homophone was chosen to aid future Google searches.
In the end, the thresholding of metrics proved very difficult to scale, so we split the project in two: metric thresholding was given to Riemann (see riemann-config), and alert correlation, de-duplication and visualisation became the “Alerta” project.
Over the years, the project has evolved to meet the constantly changing needs of the Guardian development teams as they moved to a more agile, dynamic, “swimlaned” architecture. For the operations team, this has meant a shift from static, self-hosted infrastructure to an internal OpenStack cloud and finally to an external cloud service.
In that time, certain key features of Alerta have been deprecated as requirements changed (e.g. the message bus, Ganglia, Riemann) and others have been added (e.g. OAuth2 login and CloudWatch, Pingdom and PagerDuty integration). In the process, it has been slimmed down to fewer core components, making it easier to understand, deploy and manage.
As such, Alerta is now quite different to the somewhat “over-engineered” initial solution, but the core concept of a flexible, easy-to-use tool remains, and it is now a “cloud-ready” solution adapted to the challenges of a fast-changing environment.