• Case study: gradually migrating production to a k8s cluster - registering pods with consul

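    #Project background

    • There are several business systems; every node registers with the consul cluster so they can be managed in one place
    • consul's DNS interface is used, so every node's hostname resolves and can be pinged
    • consul health checks are used; an instance is added to a service only after its check passes
    • Some services call each other directly through the consul service address, i.e.:
      service-name.service.datacenter.consul
    • prometheus monitoring uses consul-template to add nodes automatically
    • The environment runs on Alibaba Cloud; container IPs in the k8s cluster and cloud host IPs are mutually reachable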
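The DNS names mentioned above follow consul's lookup scheme: `[tag.]<service>.service[.<datacenter>].<domain>` for services and `<node>.node[.<datacenter>].<domain>` for nodes. Below is a minimal Python sketch of how such names are assembled. The helper names `consul_service_dns` and `consul_node_dns` are hypothetical, and the node name `web-01` is made up for illustration; neither comes from the original setup.

```python
from typing import Optional


def consul_service_dns(service: str, datacenter: Optional[str] = None,
                       tag: Optional[str] = None, domain: str = "consul") -> str:
    """Build a service lookup name: [tag.]<service>.service[.<datacenter>].<domain>"""
    parts = []
    if tag:
        parts.append(tag)
    parts += [service, "service"]
    if datacenter:
        parts.append(datacenter)
    parts.append(domain)
    return ".".join(parts)


def consul_node_dns(node: str, datacenter: Optional[str] = None,
                    domain: str = "consul") -> str:
    """Build a node lookup name: <node>.node[.<datacenter>].<domain>"""
    parts = [node, "node"]
    if datacenter:
        parts.append(datacenter)
    parts.append(domain)
    return ".".join(parts)


if __name__ == "__main__":
    # Service and datacenter names taken from the demo config in this article
    print(consul_service_dns("qa-consul-demo", "qa"))
    # Hypothetical node name
    print(consul_node_dns("web-01", "qa"))
```

With the demo values used later in this article, the service name resolves to `qa-consul-demo.service.qa.consul`.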

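    #1.1 Problem to solve

    • After some services move to the k8s cluster, services outside the cluster still need to reach the pods directly by pod IP

    #1.2 Solution

    • Add a consul-agent container to each pod so the pod registers with the consul cluster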

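    #2.1 New problems caused by registering pods with consul

    • When a pod exits or is deleted, the consul cluster should remove it
    • The prometheus monitoring template (consul-template) must exclude pods

    #2.2 Solution

    • The consul container uses a preStop hook that runs consul leave before exit, so the agent leaves the consul cluster on its own
    • Exclude pods in consul-template
      • Register pods with the consul cluster under a node-name prefix such as k8s-
      • In consul-template, use regexMatch to skip nodes whose names start with k8s-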
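For the first point above, a graceful leave can also be requested over the agent's HTTP API (`PUT /v1/agent/leave`) instead of the `consul leave` CLI. The following Python sketch only illustrates that request; it assumes the agent listens on `127.0.0.1:8500` without ACL tokens, and the function names are hypothetical. The demo in this article uses the CLI, not this code.

```python
import urllib.request


def build_leave_request(agent_addr: str = "127.0.0.1:8500") -> urllib.request.Request:
    """Build the HTTP request that asks the local agent to leave the cluster."""
    url = f"http://{agent_addr}/v1/agent/leave"
    return urllib.request.Request(url, method="PUT")


def leave_cluster(agent_addr: str = "127.0.0.1:8500", timeout: float = 5.0) -> bool:
    """Send the leave request; return True if the agent answered with HTTP 200."""
    try:
        with urllib.request.urlopen(build_leave_request(agent_addr), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection errors and HTTP errors (URLError is an OSError subclass)
        return False


if __name__ == "__main__":
    req = build_leave_request()
    print(req.get_method(), req.full_url)
```

A node that leaves this way is marked as left rather than failed, which is why the pod disappears from the cluster promptly after shutdown.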

    #Demo

    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: consul-demo-config
      namespace: default
    data:
      consul.json: |-
        {
          "datacenter": "qa",
          "acl_datacenter": "qa",
          "data_dir": "/tmp/consul",
          "bind_addr": "0.0.0.0",
          "client_addr": "0.0.0.0",
          "start_join": ["10.10.100.100"],
          "retry_join": ["10.10.100.100"],
          "retry_interval": "5s",
          "disable_host_node_id": true,
          "enable_script_checks": true,
          "disable_update_check": true,
          "leave_on_terminate": true,
          "log_level": "WARN",
          "server": false,
          "service": {
            "name": "qa-consul-demo",
            "port": 80,
            "tags": ["k8s", "qa", "consul-demo"],
            "checks": [
              {
                "id": "consul-demo-HealthCheck",
                "name": "Health Check",
                "notes": "Health Check",
                "args": [ "sh", "-c", "[ $(curl -s 127.0.0.1 -I |grep 'nginx' |wc -l) -eq 1 ] && { echo 'Health check successful'; exit 0 ; } || { echo 'check error' ; exit 2 ; }" ],
                "interval": "10s"
              }
            ]
          }
        }
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: consul-demo
      namespace: default
    spec:
      selector:
        matchLabels:
          app: consul-demo
      replicas: 2
      template:
        metadata:
          labels:
            app: consul-demo
        spec:
          imagePullSecrets:
          - name: docker-image-key
          containers:
          # Sidecar agent: registers the pod as node k8s-qa-<pod hostname>
          - name: consul-agent
            image: consul:1.0.8
            imagePullPolicy: IfNotPresent
            command:
            - sh
            - -c
            - |
              consul agent -config-dir=/opt/consul -node=k8s-qa-$HOSTNAME -rejoin
            lifecycle:
              # Leave the consul cluster gracefully before the container stops
              preStop:
                exec:
                  command:
                  - sh
                  - -c
                  - |
                    consul leave
            resources:
              requests:
                cpu: 10m
                memory: 16Mi
              limits:
                cpu: 50m
                memory: 32Mi
            readinessProbe:
              tcpSocket:
                port: 8500
            livenessProbe:
              tcpSocket:
                port: 8500
            volumeMounts:
            # Mount consul.json into the directory passed to -config-dir
            - name: consul-config
              mountPath: "/opt/consul"
          # Application container checked by the consul health check on port 80
          - name: nginx-node
            image: alivv/nginx:node
            imagePullPolicy: IfNotPresent
          volumes:
          - name: consul-config
            configMap:
              name: consul-demo-config
              items:
              - key: consul.json
                path: consul.json
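The script check in the ConfigMap is dense as a one-liner. Its logic: count the response-header lines that contain `nginx`, and exit 0 when exactly one is found, otherwise exit 2. consul treats exit code 0 as passing, 1 as warning, and any other code as critical. Here is a rough Python equivalent of that decision, assuming the header text from `curl -s 127.0.0.1 -I` is passed in as a string; the function name `check_headers` and the sample headers are made up for illustration.

```python
from typing import Tuple


def check_headers(headers: str) -> Tuple[int, str]:
    """Mirror the demo's shell check: exactly one header line mentioning nginx passes."""
    matches = sum(1 for line in headers.splitlines() if "nginx" in line)
    if matches == 1:
        return 0, "Health check successful"  # exit 0: consul marks the check passing
    return 2, "check error"                  # exit 2: consul marks the check critical


if __name__ == "__main__":
    # Hypothetical response headers from the nginx container
    sample = "HTTP/1.1 200 OK\r\nServer: nginx\r\nContent-Type: text/html\r\n"
    code, message = check_headers(sample)
    print(message)
    raise SystemExit(code)
```

An empty response (for example when nginx is not up yet) yields zero matches, so the pod stays out of the service until the check passes.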

    The consul-template monitoring template looks like this:

    - job_name: 'node'
      static_configs:
      {{range nodes}}
      - targets: ['{{.Node}}:9100']
        labels:
          instance: {{.Node}}{{end}}

    After the change it looks like this; regexMatch excludes node names that start with k8s-:

    - job_name: 'node'
      static_configs:
      {{range nodes}}{{if .Node | regexMatch "^k8s-.*" }}{{else}}
      - targets: ['{{.Node}}:9100']
        labels:
          instance: {{.Node}}{{end}}{{end}}
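To see what the modified template produces, the same filter can be simulated outside consul-template. This Python sketch renders the `node` job for a list of node names and skips those matching `^k8s-.*`. The function `render_node_job` and the sample node names are hypothetical; in the real setup consul-template reads the node list from the consul catalog.

```python
import re
from typing import Iterable

# Same pattern as the regexMatch call in the template
K8S_NODE = re.compile(r"^k8s-.*")


def render_node_job(nodes: Iterable[str], port: int = 9100) -> str:
    """Render the prometheus 'node' job, skipping nodes registered by k8s pods."""
    lines = ["- job_name: 'node'", "  static_configs:"]
    for node in nodes:
        if K8S_NODE.search(node):
            continue  # pod sidecar agents carry the k8s- prefix and are not scraped
        lines.append(f"  - targets: ['{node}:{port}']")
        lines.append("    labels:")
        lines.append(f"      instance: {node}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical catalog: two cloud hosts and one pod
    print(render_node_job(["web-01", "k8s-qa-consul-demo-abc", "db-01"]))
```

Only `web-01` and `db-01` appear as scrape targets; the pod node is dropped because its name begins with `k8s-`.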
  • Original article: https://www.cnblogs.com/elvi/p/16732694.html