I've just finished reviewing the book, 'Effective Monitoring and Alerting' by Slawek Ligus, published by O'Reilly Media.
The book deals with system operations and administration. Maintaining production software matters more than ever these days, and developers and operations are converging thanks to the DevOps movement, which has gained a fair amount of traction, so the importance of this text should not be underestimated.
The author presents the concepts in a technology-agnostic manner and focuses on the latest trends in operations.
The book gives you a short, high-level overview of the topic in 150 pages and covers the common tasks involved in monitoring and alerting for operations.
It starts with an introduction to monitoring and alerting and the issues surrounding them. Over the next few chapters, both topics are discussed in greater detail, with explanations of how to interpret monitoring data and how to understand the nuances of alarms.
After these topics, the book discusses the implementation, challenges, and implications of running monitoring and alerting at scale, supported by sound advice drawn from real-world projects. It then closes by listing the principles that capture the essence of the topic:
Get in the habit of measuring
Draw conclusions reliably
Monitor extensively
Alarm selectively
Work smart, not hard
Learn from the experiences of others
Have a tactic
Run a bank of cases
Enjoy the process
While these might come across as common words of advice, they are presented in a practical yet succinct manner.
The book focuses on a niche where people who are not system administrators often rely on guesswork and intuition, so it provides the kind of insight that senior administrators would otherwise impart. While every chapter focuses on theory, setting up OpenTSDB is covered in an appendix. I've gone through it, and it differs from what is generally available in blogs and online tutorials: it covers the setup from the perspective of an actual deployment rather than focusing on a specific technology (like HBase or Nagios) used in the exercise.
Overall, this was a power-packed guide that covered a wide range of concepts, but it started from an advanced level in several of them, leaving me lost in a few areas where I have no prior experience.
Disclaimer: I received a copy of the book under the Blogger Review Program - O'Reilly Media.