Observability is now a vital component of our architectures: it is what allows us to manage a system properly, determine whether it is working correctly, and decide what needs to be fixed, changed, or improved. It has long been an area of interest in the cloud-native community, supported by the Cloud Native Computing Foundation (CNCF), and many projects and products are being developed to make systems and applications observable. The creation of a Special Interest Group (SIG) for Observability and the development of a framework (i.e., the OpenTelemetry project) show the importance of this concept and the desire to standardize it.
This article deals with observability in the context of GitOps. As mentioned in the previous article, the key function of GitOps is to compare the desired state of the system, stored in Git, with the current state of the system and to apply the changes required to make the two converge. GitOps relies on a controller that manages and controls the remote resources. This means GitOps relies on the observability of the controller to identify the actions that need to be performed. But GitOps is also a system that must itself provide observability.
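The comparison between desired and current state can be illustrated with a small sketch. This is not how any particular GitOps controller is implemented; it is a minimal model that assumes both states are plain dictionaries keyed by resource name, and the names `Action`, `plan`, and `reconcile` are hypothetical, chosen for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One change needed to move the current state toward the desired state."""
    kind: str  # "create", "update", or "delete"
    name: str  # resource name (hypothetical identifier)


def plan(desired: dict, current: dict) -> list:
    """Compare the desired state (as stored in Git) with the observed state."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in current:
            actions.append(Action("create", name))
        elif current[name] != spec:
            actions.append(Action("update", name))
    for name in sorted(current):
        if name not in desired:
            actions.append(Action("delete", name))
    return actions


def reconcile(desired: dict, current: dict):
    """Apply the planned actions to a copy of the current state."""
    actions = plan(desired, current)
    state = dict(current)
    for action in actions:
        if action.kind == "delete":
            del state[action.name]
        else:
            state[action.name] = desired[action.name]
    return state, actions


# Illustrative data only: resource names and specs are made up.
desired = {"web": {"replicas": 3}, "db": {"replicas": 1}}
current = {"web": {"replicas": 2}, "cache": {"replicas": 1}}
new_state, actions = reconcile(desired, current)
```

In this sketch the controller can only plan correct actions if `current` accurately reflects the remote resources, which is why the quality of what the controller observes matters. Once the states have converged, a second call to `plan` returns an empty list.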
What is observability?
Traditional supervision methods have reached their limits when applied to modern application architectures. Managing highly scalable and portable microservices requires tools adapted to make debugging and diagnosis possible at all times, which in turn requires systems to be observable.
Monitoring and observability are often confused. The idea of a monitoring system is to capture the state of the system through a predefined set of metrics in order to detect a known set of issues. According to Google's SRE book, a monitoring system needs to answer two simple questions: “What’s broken, and why?” Analyzing an application over the long term makes it possible to profile it, to better understand how it behaves in response to external events, and thus to manage it proactively.
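To make the "predefined set of metrics, known set of issues" idea concrete, here is a minimal sketch of a threshold-based check. The metric names and limits in `RULES` are hypothetical and not taken from any real monitoring tool.

```python
# Hypothetical rules: each known issue is tied to one metric and one limit.
RULES = {
    "cpu_usage_percent": 90.0,
    "error_rate_percent": 5.0,
}


def check(metrics: dict) -> list:
    """Return one alert per predefined rule whose limit is exceeded."""
    alerts = []
    for name, limit in RULES.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            alerts.append(f"{name}={value} exceeds {limit}")
    return alerts


# "queue_depth" has no rule, so it is ignored however large it gets.
alerts = check({
    "cpu_usage_percent": 97.5,
    "error_rate_percent": 1.0,
    "queue_depth": 10000,
})
```

The check only reports problems that someone anticipated when writing the rules; a metric without a rule never raises an alert.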
Observability, on the other hand, is a measure of how well the state of a system can be understood from its multiple outputs. This means observability is a system capability, like reliability, scalability, or security, that must be planned for and built in during system design…