Distributed tracing with springboot (2.1.0) + springcloud (Greenwich.M1)

Joonas, published 2019-08-16 13:43 / 3750 reads

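Abstract: The main problem: the newer versions changed how distributed tracing is implemented, so I hit quite a few pitfalls along the way. Some scenarios also need the tracing data to be kept for later inspection and comparison, so a db is needed to store it. Method 1: modify the config file and then start the server.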

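Main problem

springboot 2.1.0 and springcloud Greenwich.M1 changed how sleuth+zipkin tracing is set up, and these "new features" caused me quite a few pitfalls while implementing sleuth+zipkin.

Back in the springboot 1.X days, you had to implement both the client and the server of the tracing service yourself, and the server usually needed a pile of dependencies (spring-cloud-sleuth-stream, plus the packages that add zipkin support, and so on).

In the newer spring cloud versions you no longer build your own server (one that integrates sleuth+zipkin). Instead, zipkin officially provides a ready-made zipkin-server.jar, as well as a docker image, which you download and start from the command line. A few configuration settings decide whether the data sleuth collects is sent to zipkin over http or through rabbit/kafka. With the new versions you only need to care about which transport the sleuth client uses, http or mq (rabbit/kafka): for http, set base-url in the configuration; for mq, set the message broker's details (host/port/username/password...). Storage of the tracing data is the responsibility of zipkin-server and is controlled by the parameters zipkin-server.jar is started with. (More on this below.)

ps: This is not a tutorial. It is mainly a collection of fixes for specific problems, so there is no detailed walkthrough, but I will include some code for clarity.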

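Background

I recently started an internship and my lead asked me to teach myself sc (spring cloud). Fine, that's not hard. After I had gone through the whole spring cloud family, my lead told me to focus on its distributed tracing service, because I would be given tasks in that area later. So I concentrated on that and decided to build a demo for practice. After reading some tutorials online I got overconfident and picked the newest version, and that is how I fell into the pit...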

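Versions

As usual, first the springboot and spring cloud versions:
springboot: 2.1.0
springcloud: Greenwich.M1
My advice to beginners is not to chase the newest version; older versions are good enough. For example, springboot 2.0.6 with springcloud Finchley SR2 is quite stable. If you really want to explore new versions, you will find more pitfalls than you can count, each one costing a day or two to climb out of, and hitting them this often makes it easy for a beginner to give up.
ps: Don't ask me how I know...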

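Main topic

Enough chit-chat, on to the main topic.
There are four services in total:
eureka-server
zipkin-server: the new-style zipkin server. It receives the data sent by sleuth, processes, stores and indexes it, and provides a visual ui for analysing it.
If you need it, you can download it directly from github: https://github.com/openzipkin...

So those are the first two.
The remaining two are ordinary business services, product and order.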

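eureka-server is the service registry. I won't cover its implementation: there are plenty of guides online, and it is basically the same across versions, without any big jumps between releases. I packaged it as a jar and start it with java -jar XXX.jar whenever I need it.

Then there are product and order (standing in for the various services A, B, C... of a real system).

The order service has a single endpoint, /test, which calls product's endpoint.

Its productclient uses feign to call product's /product/list endpoint.

product has a single endpoint, /product/list, which returns the list of all products.

Put simply, the scenario is: order service --(calls)--> product service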
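To make the order --> product chain concrete in tracing terms, here is a minimal sketch of the trace context that travels with the call. It assumes the B3 header names (X-B3-TraceId, X-B3-SpanId, X-B3-ParentSpanId, X-B3-Sampled) that sleuth and zipkin use for propagation; the class and method names are hypothetical and this is not how sleuth is implemented internally.

```java
import java.security.SecureRandom;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of sleuth: shows which B3 headers travel from order to product.
final class B3PropagationSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    // A 64-bit id rendered as 16 lowercase hex characters.
    static String newId() {
        return String.format("%016x", RANDOM.nextLong());
    }

    // Headers for a brand-new trace, e.g. a request arriving at order's /test.
    static Map<String, String> rootHeaders() {
        Map<String, String> headers = new LinkedHashMap<>();
        String id = newId();
        headers.put("X-B3-TraceId", id);
        headers.put("X-B3-SpanId", id);
        headers.put("X-B3-Sampled", "1");
        return headers;
    }

    // Headers for the outgoing call to product:
    // same trace id, a new span id, and the caller's span as parent.
    static Map<String, String> childHeaders(Map<String, String> parent) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("X-B3-TraceId", parent.get("X-B3-TraceId"));
        headers.put("X-B3-ParentSpanId", parent.get("X-B3-SpanId"));
        headers.put("X-B3-SpanId", newId());
        headers.put("X-B3-Sampled", parent.get("X-B3-Sampled"));
        return headers;
    }

    public static void main(String[] args) {
        Map<String, String> order = rootHeaders();
        Map<String, String> product = childHeaders(order);
        System.out.println("order   -> " + order);
        System.out.println("product -> " + product);
    }
}
```

Because both services report spans with the same trace id, zipkin can stitch the /test and /product/list spans into one trace.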

With the scenario covered, here is the configuration of these two services (the order and product configurations are basically the same).
application.yml

spring:
  application:
    # service name
    name: product
  # the business logic needs a database, so the mysql settings go here
  datasource:
    driver-class-name: com.mysql.jdbc.Driver
    username: root
    password: 123456
    url: jdbc:mysql://127.0.0.1:3306/sc_sell?characterEncoding=utf-8&useSSL=false&serverTimezone=Asia/Shanghai
  jpa:
    show-sql: true
  # the important part
  zipkin:
    # base-url: needed when the sleuth client sends the collected data to zipkin-server over http
    base-url: http://localhost:9411
    enabled: true
  sleuth:
    sampler:
      # fraction of traces to record: 0.1 records only 10% of the tracing data; set it to 1 to trace everything (not recommended in production because of the noticeable performance cost)
      probability: 1
eureka:
  client:
    service-url:
      # registry address
      defaultZone: http://localhost:8999/eureka/
logging:
  level:
    # log level for feign, set as key: value
    org.springframework.cloud.openfeign: debug
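To illustrate what the sampler probability setting means, here is a minimal sketch of a probability sampler. It is an assumption-laden simplification (a bucket check on the trace id) with hypothetical names; sleuth's own sampler is implemented differently, and real trace ids are random rather than sequential.

```java
// Hypothetical sketch of a probability sampler; not sleuth's implementation.
final class ProbabilitySamplerSketch {
    private static final long BUCKETS = 10_000;
    private final long boundary;

    ProbabilitySamplerSketch(double probability) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be between 0 and 1");
        }
        // Number of buckets (out of 10,000) whose traces are kept.
        this.boundary = Math.round(probability * BUCKETS);
    }

    // A trace is kept when its id falls into one of the first `boundary` buckets.
    boolean isSampled(long traceId) {
        return Math.floorMod(traceId, BUCKETS) < boundary;
    }

    // Counts how many of the trace ids 0..traces-1 would be kept.
    static long countSampled(double probability, long traces) {
        ProbabilitySamplerSketch sampler = new ProbabilitySamplerSketch(probability);
        long kept = 0;
        for (long id = 0; id < traces; id++) {
            if (sampler.isSampled(id)) {
                kept++;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println("0.1 keeps " + countSampled(0.1, 10_000) + " of 10000 traces");
        System.out.println("1.0 keeps " + countSampled(1.0, 10_000) + " of 10000 traces");
    }
}
```

With probability 1 every trace is reported, which is convenient for a demo like this one but costs more in a busy system.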

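That's it for the configuration. Next, the dependency. It's simple: to enable tracing on the client you only need to add one dependency, spring-cloud-starter-zipkin. Here it is:

        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-zipkin</artifactId>
        </dependency>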

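These are all basics, right? So let's move on to something more advanced.
The example above still leaves a few problems to think about.

Anyone with some development experience will notice that, first, the data is sent over http. The downside of http here is that connections are short-lived: unless they are reused, each report pays for a new three-way handshake, which adds up to a noticeable cost when many services are traced.

The second problem: with direct transmission, if the receiver drops off unexpectedly, the data in flight is lost, and if that data matters, the consequences can be serious. In addition, some scenarios need the tracing data to be kept for later inspection and comparison, so a db is needed to store it.

These problems do need to be addressed. Fortunately, zipkin has a nice answer for both, and each takes only a little configuration:

Put a message broker, rabbitmq/kafka, between the sleuth client and zipkin-server. My example only uses rabbitmq.

Store the tracing data in a DB. Besides the default in-memory storage, zipkin supports mysql, cassandra and elasticsearch; I use mysql here.

If you are new to sc and are asked to implement this, you will surely open a browser and start searching for tutorials.
You will find that most blog posts describe how it was done in earlier versions, and the older ones have you implement a zipkin-server yourself (I suspect they are on 1.x). That is frustrating, because it is not what you expected.
Keep searching, and eventually, among countless posts, you find a tracing tutorial for springboot 2.0.X. You get excited: finally something more reliable! But it's not over yet. It tells you to add the following dependencies on the client: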
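The benefit of putting a queue between the services and the collector can be sketched in a few lines. This is only an in-process stand-in, assuming a BlockingQueue in place of rabbitmq and a list in place of mysql; all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical in-process stand-in for the rabbitmq queue between the services and zipkin-server.
final class QueueBufferSketch {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> storage = new ArrayList<>();

    // Service side: publishing succeeds whether or not the collector is currently running.
    void report(String span) {
        queue.add(span);
    }

    // Collector side: move everything waiting in the queue into storage.
    int collect() {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch);
        storage.addAll(batch);
        return batch.size();
    }

    int pending() {
        return queue.size();
    }

    List<String> stored() {
        return storage;
    }

    public static void main(String[] args) {
        QueueBufferSketch pipeline = new QueueBufferSketch();
        // The collector is "down": spans pile up in the queue instead of being lost.
        pipeline.report("order:/test");
        pipeline.report("product:/product/list");
        System.out.println("pending while collector is down: " + pipeline.pending());
        // The collector comes back and catches up.
        System.out.println("collected after restart: " + pipeline.collect());
    }
}
```

With direct http reporting there is no equivalent of the pending state: a span sent while the receiver is down simply fails to arrive.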

        
            org.springframework.cloud
            spring-cloud-sleuth-zipkin-stream
        
         
            org.springframework.cloud
            spring-cloud-sleuth-stream

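Then you find that these dependencies cannot actually be resolved. Why? Because of the version. What, you say your pom file shows no errors? Open the maven panel on the right side of idea and take a look.

This really is a huge pit. I could not work out what was going on until I happened to open that panel, and I spent a whole day trying to find out why the rabbitmq integration failed. In the end I concluded that this road is a dead end.

With no leads left, I went back to searching for springboot 2.x tracing tutorials. After a whole afternoon of searching it suddenly hit me: wait, I should just read the official documentation. It's all in English, but at worst I can run it through chrome's built-in translator. So I opened the spring site, picked the newest version, looked around, and actually found it. And it is really simple!
Link: https://cloud.spring.io/sprin...
This is what the official documentation says.

Roughly: if you want to use rabbitmq or kafka instead of http, add the spring-rabbit or spring-kafka dependency. The default destination name (the queue name) is zipkin. If you use kafka, you must set the property accordingly: spring.zipkin.sender.type=kafka
In other words, you only need the following two dependencies!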

 
    org.springframework.cloud
    spring-cloud-starter-zipkin

 
    org.springframework.amqp
    spring-rabbit

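Read a little further and you will see a note:

spring-cloud-sleuth-stream is deprecated and no longer compatible with the new version...
Looking back now, you can see why adding spring-cloud-sleuth-stream in the earlier attempt had no effect.

Then update application.yml: comment out base-url, set zipkin.sender.type=rabbit, add the rabbitmq connection details, and you are done.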

spring:
  zipkin:
    # base-url is only used for http transport, so it can be left out here
    # base-url: http://localhost:9411/
    sender:
      type: rabbit
  rabbitmq:
    host: localhost
    port: 5672
    username: guest
    password: guest

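At this point the tracing part of order/product is done.

As mentioned above, sleuth collects the data, and zipkin receives the tracing data sleuth sends, then processes, stores and indexes it and provides the ui. So the next step is to get zipkin-server to take the tracing data off the rabbitmq queue and store it in a mysql database.

As for how to implement zipkin-server: zipkin has already built these features in, and you only need to set parameters at startup. Let's go through that.

I won't explain which parameters to set for which scenario, because I have only recently started with sc and am not familiar with every scenario. What I will show is how to find the parameters you need.

Method 1: modify the config file and then start the server.

First, unpack zipkin-server.jar with an archive tool. You get three folders, mostly containing .class files.

Then go into the BOOT-INF/classes directory. You will find two .yml files, and yes, these are the yml configuration files.

zipkin-server.yml is the main configuration file of zipkin-server, but when you open it you will see it contains just one line, spring.profiles.include: shared, which pulls in the shared yml file. So the file to look at is zipkin-server-shared.yml.
Open zipkin-server-shared.yml: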

zipkin:
  self-tracing:
    # Set to true to enable self-tracing.
    enabled: ${SELF_TRACING_ENABLED:false}
    # percentage to self-traces to retain
    sample-rate: ${SELF_TRACING_SAMPLE_RATE:1.0}
    # Timeout in seconds to flush self-tracing data to storage.
    message-timeout: ${SELF_TRACING_FLUSH_INTERVAL:1}
  collector:
    # percentage to traces to retain
    sample-rate: ${COLLECTOR_SAMPLE_RATE:1.0}
    http:
      # Set to false to disable creation of spans via HTTP collector API
      enabled: ${HTTP_COLLECTOR_ENABLED:true}
    kafka:
      # Kafka bootstrap broker list, comma-separated host:port values. Setting this activates the
      # Kafka 0.10+ collector.
      bootstrap-servers: ${KAFKA_BOOTSTRAP_SERVERS:}
      # Name of topic to poll for spans
      topic: ${KAFKA_TOPIC:zipkin}
      # Consumer group this process is consuming on behalf of.
      group-id: ${KAFKA_GROUP_ID:zipkin}
      # Count of consumer threads consuming the topic
      streams: ${KAFKA_STREAMS:1}
    rabbitmq:
      # RabbitMQ server address list (comma-separated list of host:port)
      addresses: ${RABBIT_ADDRESSES:}
      concurrency: ${RABBIT_CONCURRENCY:1}
      # TCP connection timeout in milliseconds
      connection-timeout: ${RABBIT_CONNECTION_TIMEOUT:60000}
      password: ${RABBIT_PASSWORD:guest}
      queue: ${RABBIT_QUEUE:zipkin}
      username: ${RABBIT_USER:guest}
      virtual-host: ${RABBIT_VIRTUAL_HOST:/}
      useSsl: ${RABBIT_USE_SSL:false}
      uri: ${RABBIT_URI:}
  query:
    enabled: ${QUERY_ENABLED:true}
    # 1 day in millis
    lookback: ${QUERY_LOOKBACK:86400000}
    # The Cache-Control max-age (seconds) for /api/v2/services and /api/v2/spans
    names-max-age: 300
    # CORS allowed-origins.
    allowed-origins: "*"

  storage:
    strict-trace-id: ${STRICT_TRACE_ID:true}
    search-enabled: ${SEARCH_ENABLED:true}
    type: ${STORAGE_TYPE:mem}
    mem:
      # Maximum number of spans to keep in memory.  When exceeded, oldest traces (and their spans) will be purged.
      # A safe estimate is 1K of memory per span (each span with 2 annotations + 1 binary annotation), plus
      # 100 MB for a safety buffer.  You'll need to verify in your own environment.
      # Experimentally, it works with: max-spans of 500000 with JRE argument -Xmx600m.
      max-spans: 500000
    cassandra:
      # Comma separated list of host addresses part of Cassandra cluster. Ports default to 9042 but you can also specify a custom port with "host:port".
      contact-points: ${CASSANDRA_CONTACT_POINTS:localhost}
      # Name of the datacenter that will be considered "local" for latency load balancing. When unset, load-balancing is round-robin.
      local-dc: ${CASSANDRA_LOCAL_DC:}
      # Will throw an exception on startup if authentication fails.
      username: ${CASSANDRA_USERNAME:}
      password: ${CASSANDRA_PASSWORD:}
      keyspace: ${CASSANDRA_KEYSPACE:zipkin}
      # Max pooled connections per datacenter-local host.
      max-connections: ${CASSANDRA_MAX_CONNECTIONS:8}
      # Ensuring that schema exists, if enabled tries to execute script /zipkin-cassandra-core/resources/cassandra-schema-cql3.txt.
      ensure-schema: ${CASSANDRA_ENSURE_SCHEMA:true}
      # 7 days in seconds
      span-ttl: ${CASSANDRA_SPAN_TTL:604800}
      # 3 days in seconds
      index-ttl: ${CASSANDRA_INDEX_TTL:259200}
      # the maximum trace index metadata entries to cache
      index-cache-max: ${CASSANDRA_INDEX_CACHE_MAX:100000}
      # how long to cache index metadata about a trace. 1 minute in seconds
      index-cache-ttl: ${CASSANDRA_INDEX_CACHE_TTL:60}
      # how many more index rows to fetch than the user-supplied query limit
      index-fetch-multiplier: ${CASSANDRA_INDEX_FETCH_MULTIPLIER:3}
      # Using ssl for connection, rely on Keystore
      use-ssl: ${CASSANDRA_USE_SSL:false}
    cassandra3:
      # Comma separated list of host addresses part of Cassandra cluster. Ports default to 9042 but you can also specify a custom port with "host:port".
      contact-points: ${CASSANDRA_CONTACT_POINTS:localhost}
      # Name of the datacenter that will be considered "local" for latency load balancing. When unset, load-balancing is round-robin.
      local-dc: ${CASSANDRA_LOCAL_DC:}
      # Will throw an exception on startup if authentication fails.
      username: ${CASSANDRA_USERNAME:}
      password: ${CASSANDRA_PASSWORD:}
      keyspace: ${CASSANDRA_KEYSPACE:zipkin2}
      # Max pooled connections per datacenter-local host.
      max-connections: ${CASSANDRA_MAX_CONNECTIONS:8}
      # Ensuring that schema exists, if enabled tries to execute script /zipkin2-schema.cql
      ensure-schema: ${CASSANDRA_ENSURE_SCHEMA:true}
      # how many more index rows to fetch than the user-supplied query limit
      index-fetch-multiplier: ${CASSANDRA_INDEX_FETCH_MULTIPLIER:3}
      # Using ssl for connection, rely on Keystore
      use-ssl: ${CASSANDRA_USE_SSL:false}
    elasticsearch:
      # host is left unset intentionally, to defer the decision
      hosts: ${ES_HOSTS:}
      pipeline: ${ES_PIPELINE:}
      max-requests: ${ES_MAX_REQUESTS:64}
      timeout: ${ES_TIMEOUT:10000}
      index: ${ES_INDEX:zipkin}
      date-separator: ${ES_DATE_SEPARATOR:-}
      index-shards: ${ES_INDEX_SHARDS:5}
      index-replicas: ${ES_INDEX_REPLICAS:1}
      username: ${ES_USERNAME:}
      password: ${ES_PASSWORD:}
      http-logging: ${ES_HTTP_LOGGING:}
      legacy-reads-enabled: ${ES_LEGACY_READS_ENABLED:true}
    mysql:
      jdbc-url: ${MYSQL_JDBC_URL:}
      host: ${MYSQL_HOST:localhost}
      port: ${MYSQL_TCP_PORT:3306}
      username: ${MYSQL_USER:}
      password: ${MYSQL_PASS:}
      db: ${MYSQL_DB:zipkin}
      max-active: ${MYSQL_MAX_CONNECTIONS:10}
      use-ssl: ${MYSQL_USE_SSL:false}
  ui:
    enabled: ${QUERY_ENABLED:true}
    ## Values below here are mapped to ZipkinUiProperties, served as /config.json
    # Default limit for Find Traces
    query-limit: 10
    # The value here becomes a label in the top-right corner
    environment:
    # Default duration to look back when finding traces.
    # Affects the "Start time" element in the UI. 1 hour in millis
    default-lookback: 3600000
    # When false, disables the "find a trace" screen
    search-enabled: ${SEARCH_ENABLED:true}
    # Which sites this Zipkin UI covers. Regex syntax. (e.g. http://example.com/.*)
    # Multiple sites can be specified, e.g.
    # - .*example1.com
    # - .*example2.com
    # Default is "match all websites"
    instrumented: .*
    # URL placed into the <base> tag in the HTML
    base-path: /zipkin

server:
  port: ${QUERY_PORT:9411}
  use-forward-headers: true
  compression:
    enabled: true
    # compresses any response over min-response-size (default is 2KiB)
    # Includes dynamic json content and large static assets from zipkin-ui
    mime-types: application/json,application/javascript,text/css,image/svg

spring:
  jmx:
     # reduce startup time by excluding unexposed JMX service
     enabled: false
  mvc:
    favicon:
      # zipkin has its own favicon
      enabled: false
  autoconfigure:
    exclude:
      # otherwise we might initialize even when not needed (ex when storage type is cassandra)
      - org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration
info:
  zipkin:
    version: "2.11.8"

logging:
  pattern:
    level: "%clr(%5p) %clr([%X{traceId}/%X{spanId}]){yellow}"
  level:
    # Silence Invalid method name: "__can__finagle__trace__v3__"
    com.facebook.swift.service.ThriftServiceProcessor: "OFF"
#     # investigate /api/v2/dependencies
#     zipkin2.internal.DependencyLinker: "DEBUG"
#     # log cassandra queries (DEBUG is without values)
#     com.datastax.driver.core.QueryLogger: "TRACE"
#     # log cassandra trace propagation
#     com.datastax.driver.core.Message: "TRACE"
#     # log reason behind http collector dropped messages
#     zipkin2.server.ZipkinHttpCollector: "DEBUG"
#     zipkin2.collector.kafka.KafkaCollector: "DEBUG"
#     zipkin2.collector.kafka08.KafkaCollector: "DEBUG"
#     zipkin2.collector.rabbitmq.RabbitMQCollector: "DEBUG"
#     zipkin2.collector.scribe.ScribeCollector: "DEBUG"

management:
  endpoints:
    web:
      exposure:
        include: "*"
  endpoint:
    health:
      show-details: always
# Disabling auto time http requests since it is added in Undertow HttpHandler in Zipkin autoconfigure
# Prometheus module. In Zipkin we use different naming for the http requests duration
  metrics:
    web:
      server:
        auto-time-requests: false

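This is simply the configuration file. For any component you want to use, you only change its section. For example, to use storage and have the tracing data saved to mysql, I only need to change the corresponding settings: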

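  storage:
    # other settings need no changes and are omitted, apart from the storage type
    type: mysql
    mysql:
      # a mysql jdbc url; replace XXX=xxx with your own connection parameters
      jdbc-url: jdbc:mysql://localhost:3306/zipkin?XXX=xxx
      host: localhost
      port: 3306
      username: root
      password: 123456
      db: zipkin
      # maximum number of connections
      max-active: ${MYSQL_MAX_CONNECTIONS:10}
      # whether to use ssl
      use-ssl: ${MYSQL_USE_SSL:false}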

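After changing the configuration, repack everything into a jar and start it directly.

Method 2: pass the settings as startup parameters when launching zipkin-server.jar.

Just run java -jar zipkin-server.jar --zipkin.storage.mysql.username=root --zipkin.storage.mysql.password=123456 --zipkin.storage.mysql.host=localhost --zipkin.storage.mysql.port=3306 ...
Each --key=value after the jar name overrides the property of the same name. To see which properties exist, look at the yml file from method 1; they correspond one to one. The names inside ${...} in that file, such as MYSQL_USER, are the environment variables that set the same values. The advantage of this method is that the jar does not need to be modified; the drawback is the long string of parameters on the command line.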
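The ${NAME:default} entries in the yml file follow a simple rule: use the environment variable NAME if it is set, otherwise the text after the colon. Here is a minimal sketch of that rule, assuming only this simple form of placeholder; the class is hypothetical and is not Spring's actual resolver.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of ${NAME:default} resolution; Spring's real resolver handles more cases.
final class PlaceholderSketch {
    // Matches ${NAME:default}; the default part may be empty.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^:}]+):([^}]*)\\}");

    static String resolve(String value, Map<String, String> env) {
        Matcher matcher = PLACEHOLDER.matcher(value);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            // Prefer the environment variable, fall back to the default from the yml.
            String replacement = env.getOrDefault(matcher.group(1), matcher.group(2));
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Resolved against the real environment of this process.
        System.out.println("storage type: " + resolve("${STORAGE_TYPE:mem}", System.getenv()));
        // Resolved against an explicit map, as if STORAGE_TYPE=mysql had been exported.
        System.out.println("storage type: " + resolve("${STORAGE_TYPE:mem}", Map.of("STORAGE_TYPE", "mysql")));
    }
}
```

So zipkin.storage.type stays mem unless STORAGE_TYPE is set in the environment or the property is overridden on the command line.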

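That's basically it. All the other settings work on the same principle, so once you can generalize from this example, the remaining configuration should not be a problem.

Summary

I was busy while writing this and published it before it was finished, so parts were missing. I finally used the weekend to fill them in.
This is my first article. It mainly records the pitfalls I ran into; I'm not a professional tutorial writer, and most of it is casual notes. If anything is unclear, feel free to leave a question, and if I got something wrong, please point it out. Thanks.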


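Copyright belongs to the author. Do not repost without permission. If this article violates any rules, you can contact the administrator to have it removed.

When reposting, please cite this article's address: https://www.ucloud.cn/yun/72205.html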
