Istio: Distributed Tracing with Jaeger
About Jaeger
In the last post we deployed Istio and managed traffic for a book review microservice application. In this session we dive deeper into Istio and its Jaeger add-on for microservice tracing.
Jaeger is an open-source end-to-end distributed tracing tool to monitor and troubleshoot the performance of microservices-based distributed systems by providing insights into the latency and other performance metrics.
- Trace
A trace represents the entire journey of a request or transaction as it propagates through various services and components of a distributed system. It captures the path the request takes, including all the microservices it interacts with, from start to finish. A trace is composed of multiple spans.
- Span
A span is a single unit of work within a trace. It represents an individual operation within a microservice, such as a function call, database query, or external API request. Each span contains metadata such as its operation name and its start and end timestamps.
Preparation for Hands-on
Here we will use the Fleetman GPS simulator microservice application as an example to explore Jaeger and its capabilities.
- Enable Istio sidecar injection for the existing deployment
- Validate that the pods have the Istio sidecar injected, and check the service status in Kiali
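One common way to enable injection is to label the namespace so that Istio adds the sidecar automatically. The manifest below is a minimal sketch, assuming the Fleetman workloads run in the `default` namespace (the namespace is an assumption, not something stated in this post).

```yaml
# Assumption: the Fleetman deployments live in the "default" namespace.
apiVersion: v1
kind: Namespace
metadata:
  name: default
  labels:
    # Tells Istio's sidecar injector to add the Envoy proxy to new pods
    # created in this namespace.
    istio-injection: enabled
```

The label only affects pods created after it is set, so existing pods have to be restarted. Once they are, each pod should list an `istio-proxy` container next to the application container.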
How Jaeger Works
When a request enters a microservice (e.g., a user making a request to a frontend service), the tracing library creates a span and assigns it a trace ID. As the request propagates through other services, additional spans are created and linked to the same trace ID. Each span is recorded with its respective start and end timestamps, operation name, and other metadata.
The Jaeger UI provides a way to visualize traces. Users can search for traces based on various criteria (e.g., service name, operation name, duration) and view the detailed structure of individual traces, like durations of time spent between microservices.
As the request flows through different services, each service creates additional child spans. For example, if the frontend service calls an authentication service, and the authentication service then calls a user service, the trace will contain two child spans.
Latency and Performance Analysis
By examining the duration of each span, we can identify which part of the request takes the most time and investigate further to optimize performance. If a particular span has a long duration, that service might be a bottleneck. If there are significant gaps between spans, network latency or queuing delays might be the issue.
Manage routing in each service from Kiali
Routing in Istio can be managed either through the Kiali console or by defining VirtualServices and DestinationRules in Kubernetes YAML manifests. Here we use the Kiali console, which visualizes each service's traffic flow, metrics, and dependencies between services in real time.
When we create weighted routing or suspend traffic, Kiali creates its own VirtualServices and DestinationRules to manage the traffic.
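For comparison, a weighted route written by hand might look roughly like the sketch below. The host name, subset names, `version` labels, and the 90/10 split are all illustrative assumptions. The resources Kiali generates will use their own names and may differ in detail.

```yaml
# Hypothetical example: host, subset names, labels, and weights are
# assumptions for illustration, not taken from the Fleetman manifests.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: staff-service
  namespace: default
spec:
  hosts:
    - fleetman-staff-service.default.svc.cluster.local
  http:
    - route:
        # 90% of requests go to subset v1
        - destination:
            host: fleetman-staff-service.default.svc.cluster.local
            subset: v1
          weight: 90
        # the remaining 10% go to subset v2
        - destination:
            host: fleetman-staff-service.default.svc.cluster.local
            subset: v2
          weight: 10
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: staff-service
  namespace: default
spec:
  host: fleetman-staff-service.default.svc.cluster.local
  subsets:
    # Each subset selects pods by label; the labels here are assumed.
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2
```

The weights in a single route should add up to 100, and every subset referenced by the VirtualService must be defined in the DestinationRule.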
Add a timeout in the Istio VirtualService YAML
We can add a timeout to the Istio VirtualService YAML and use Jaeger to confirm that it works, which gives better visibility and efficiency in the microservice architecture.
By setting the timeout to 3s on the “api-gateway” virtual service, any request to api-gateway that takes longer than 3s returns an HTTP timeout error instead of producing a long-running trace. The failed request shows up in the Jaeger UI, which makes it easy to tell whether the request succeeded.
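A VirtualService with such a timeout could look like the following sketch. The `timeout` field on an HTTP route is part of the Istio VirtualService API. The resource name, namespace, and host are assumptions about how the api-gateway service is exposed in the cluster.

```yaml
# Assumption: the api-gateway service is reachable as
# fleetman-api-gateway in the "default" namespace.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: api-gateway
  namespace: default
spec:
  hosts:
    - fleetman-api-gateway.default.svc.cluster.local
  http:
    - route:
        - destination:
            host: fleetman-api-gateway.default.svc.cluster.local
      # Requests that take longer than 3 seconds are aborted by the
      # sidecar proxy and the caller receives a timeout response.
      timeout: 3s
```

With this in place, a slow call should appear in Jaeger as a span of about 3 seconds that is marked as an error, rather than as a trace that runs for as long as the downstream service takes to answer.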
Conclusion
In this session we took a deep dive into the Istio add-on Jaeger for distributed tracing, which tracks requests as they flow through the various services and components of an application. This helps identify bottlenecks, understand service dependencies, and improve overall performance.
In the next post we will see how to use Istio and Kiali to run canary releases, blue-green deployments, rolling updates, and A/B testing.