So, you want to distribute your system, but wonder how to stay in control of your flow, troubleshoot potential latency problems, and see what is actually going on when something breaks down? Tracing comes to the rescue, giving you the ability to correlate your requests and responses in the depths of network traffic. If your stack is based on Spring Boot, then creating a basic setup is as easy as catching jellyfish in Bikini Bottom! 🙃
Assuming you have a Spring Boot application, you only need to have Sleuth on the classpath to have it do all the tracing for you. It marks particular parts of your flow with a Trace Id and a Span Id, which get included (among others) in message headers. Other information is also provided, e.g. for putting requests in the proper order, timing details, etc.
A span refers to a basic unit of work, e.g. a single exchange of messages (like an HTTP request and response), and a trace refers to a set of spans forming a tree-like structure, e.g. all messages exchanged as part of the execution of a flow initiated by a single message.
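To make the tree idea a bit more tangible, here is a minimal sketch in plain Java. The `SpanNode` and `TraceTreeSketch` classes are hypothetical names made up for this illustration; they are not Sleuth's own types. The sketch also simplifies things by assuming one span per service, and the ids are simply borrowed from the sample log further down in this post.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration only; not Sleuth's actual Span class.
final class SpanNode {
    final String traceId; // shared by every span that belongs to the same trace
    final String spanId;  // identifies this particular unit of work
    final String name;
    final List<SpanNode> children = new ArrayList<>();

    SpanNode(String traceId, String spanId, String name) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.name = name;
    }

    // Starts a child span: it keeps the parent's trace id but gets its own span id.
    SpanNode child(String childSpanId, String childName) {
        SpanNode child = new SpanNode(traceId, childSpanId, childName);
        children.add(child);
        return child;
    }

    // Counts this span plus all of its descendants.
    int size() {
        int count = 1;
        for (SpanNode child : children) {
            count += child.size();
        }
        return count;
    }
}

final class TraceTreeSketch {
    // Builds a simplified trace: service1 calls service2, which calls service3.
    static SpanNode sampleTrace() {
        SpanNode root = new SpanNode("d5b291496f21b36e", "c362f06faa315849", "service1");
        SpanNode service2 = root.child("9ef980eb860bbd28", "service2");
        service2.child("10b8d941e19f0848", "service3");
        return root;
    }

    public static void main(String[] args) {
        SpanNode root = sampleTrace();
        System.out.println(root.size() + " spans in trace " + root.traceId);
    }
}
```

The point to take away is that every span carries the same trace id, which is what lets you stitch the whole flow back together later.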
This visualization from the Spring Cloud Sleuth documentation shows very well what traces and spans are:
Trace Id and Span Id appear in relevant logs (if you do log things of course 🙂), e.g.
service1.log:2017-01-19 15:12:02.545 INFO [service1,d5b291496f21b36e,c362f06faa315849,true] 36161 --- [nio-8081-exec-1] tech.viacom.service1.SampleApp : Service1 called, calling service2
service2.log:2017-01-19 15:12:02.689 INFO [service2,d5b291496f21b36e,9ef980eb860bbd28,true] 36162 --- [nio-8082-exec-2] tech.viacom.service2.SampleApp : Service2 called, calling service3
service3.log:2017-01-19 15:12:02.779 INFO [service3,d5b291496f21b36e,10b8d941e19f0848,true] 36163 --- [nio-8083-exec-3] tech.viacom.service3.SampleApp : Service3 called
service2.log:2017-01-19 15:12:02.864 INFO [service2,d5b291496f21b36e,9ef980eb860bbd28,true] 36162 --- [nio-8082-exec-2] tech.viacom.service2.SampleApp : Service2 got response from service3
service1.log:2017-01-19 15:12:02.977 INFO [service1,d5b291496f21b36e,c362f06faa315849,true] 36161 --- [nio-8081-exec-1] tech.viacom.service1.SampleApp : Service1 got response from service2
d5b291496f21b36e is the Trace Id, c362f06faa315849 is the Span Id and true states whether the log should be exported to Zipkin or not (this is customizable). When using org.springframework.cloud:spring-cloud-starter-zipkin, traces are recorded by Sleuth and exported with io.zipkin.reporter:zipkin-reporter to your Zipkin server.
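If you ever want to pull those values out of a log line yourself (say, to group lines by trace), a small regex is enough. The sketch below is only an illustration: `SleuthLogTag` is a hypothetical helper, not something Sleuth or Zipkin provides, and the pattern assumes the bracketed block looks exactly like the one in the sample lines above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Sleuth or Zipkin.
final class SleuthLogTag {
    // Matches "[app,traceId,spanId,exportable]" as seen in the sample log lines.
    // The thread-name block (e.g. "[nio-8081-exec-1]") has no commas, so it is skipped.
    private static final Pattern TAG = Pattern.compile(
            "\\[([^,\\]]*),([0-9a-f]+),([0-9a-f]+),(true|false)\\]");

    final String app;
    final String traceId;
    final String spanId;
    final boolean exportable;

    private SleuthLogTag(String app, String traceId, String spanId, boolean exportable) {
        this.app = app;
        this.traceId = traceId;
        this.spanId = spanId;
        this.exportable = exportable;
    }

    // Returns the first tracing block found in the line, or empty if there is none.
    static Optional<SleuthLogTag> parse(String line) {
        Matcher matcher = TAG.matcher(line);
        if (!matcher.find()) {
            return Optional.empty();
        }
        return Optional.of(new SleuthLogTag(
                matcher.group(1),
                matcher.group(2),
                matcher.group(3),
                Boolean.parseBoolean(matcher.group(4))));
    }
}

final class SleuthLogTagDemo {
    public static void main(String[] args) {
        String line = "2017-01-19 15:12:02.689 INFO "
                + "[service2,d5b291496f21b36e,9ef980eb860bbd28,true] 36162 --- "
                + "[nio-8082-exec-2] tech.viacom.service2.SampleApp : Service2 called, calling service3";
        SleuthLogTag.parse(line).ifPresent(tag ->
                System.out.println(tag.app + " trace=" + tag.traceId + " span=" + tag.spanId));
    }
}
```

In practice you would more likely let your log aggregation tool index these fields, but the same four values are what it would be looking for.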
Zipkin manages collection and lookup of reported (exported) data. It runs as a self-contained server with a web GUI. You can install and run one locally like this:
wget -O zipkin.jar 'https://search.maven.org/remote_content?g=io.zipkin.java&a=zipkin-server&v=LATEST&c=exec'
java -jar zipkin.jar
or run the latest Docker image:
docker run -d -p 9411:9411 openzipkin/zipkin
Now, you only need to tell your Spring Boot app where to report traces. You can do it by setting the spring.zipkin.baseUrl property. Then, you will have a very basic setup in place. Zipkin runs on port 9411 by default. Some other defaults:
- threshold of traced data being sampled and reported in spring-cloud-starter-zipkin is 10%. This setting can be overridden by
- Zipkin keeps data in memory but can be configured to use MySQL, Cassandra or Elasticsearch for storage (more info here).
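To get a feel for what a 10% sampling threshold means, here is a rough sketch of a probability-based sampler. `RateSamplerSketch` is a hypothetical class written for this post; Sleuth ships its own sampler implementations, which may well work differently, so treat this purely as an illustration of the idea.

```java
import java.util.Random;

// Hypothetical sampler for illustration; not Sleuth's actual implementation.
final class RateSamplerSketch {
    private final double rate;   // fraction of traces to export, from 0.0 to 1.0
    private final Random random;

    RateSamplerSketch(double rate, Random random) {
        if (rate < 0.0 || rate > 1.0) {
            throw new IllegalArgumentException("rate must be between 0.0 and 1.0");
        }
        this.rate = rate;
        this.random = random;
    }

    // Decides whether the current trace should be exported.
    // nextDouble() returns a value in [0.0, 1.0), so a rate of 0.0 never samples
    // and a rate of 1.0 always samples.
    boolean isSampled() {
        return random.nextDouble() < rate;
    }

    public static void main(String[] args) {
        RateSamplerSketch sampler = new RateSamplerSketch(0.1, new Random());
        int exported = 0;
        for (int i = 0; i < 10_000; i++) {
            if (sampler.isSampled()) {
                exported++;
            }
        }
        System.out.println(exported + " of 10000 traces marked for export");
    }
}
```

With a rate of 0.1, roughly one trace in ten ends up in Zipkin, which is worth remembering when a request you are hunting for does not show up in the UI.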
Examples below show how Zipkin visualises received data:
Want to know more?