
Problems upgrading calico node in a kubeadm cluster

  •  0
  • horcle_buzz  · Tech Community  · 6 years ago

    I am going to upgrade Calico node and cni as per this link for "Upgrading Components Individually".

    The directions are quite clear (I will cordon each node and upgrade calico/cni and calico/node in turn), but I do not know what this means:

    Update the image in your process management to reference the new version

    WRT upgrading the calico/node container.

    Otherwise, I don't see any other issues with the instructions. Our environment is a K8s kubeadm cluster.

    I guess the real question is: where do I tell K8s to use the new calico/node image?
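    In a kubeadm cluster, calico/node typically runs as a DaemonSet in kube-system, so "update the image in your process management" presumably means the DaemonSet's pod template. A minimal sketch of inspecting and bumping the image, assuming the stock manifest's names (calico-node for both the DaemonSet and its container) and v3.3.2 as the target tag:

    # Show the image currently referenced by the DaemonSet
    # (names assume the stock calico.yaml manifest):
    kubectl -n kube-system get daemonset calico-node \
      -o jsonpath='{.spec.template.spec.containers[0].image}'

    # Point it at the new version; the DaemonSet then rolls the pods:
    kubectl -n kube-system set image daemonset/calico-node \
      calico-node=quay.io/calico/node:v3.3.2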

    EDIT

    To answer the question above:

    I simply did a kubectl delete -f on both calico.yaml and rbac-kdd.yaml, and then did a kubectl create -f with the latest versions of these files.
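    In command form, that amounts to roughly the following sketch (the files are the local copies just mentioned; the pod label in the check assumes the stock manifest):

    # Remove the old manifests, then apply the freshly downloaded v3.3.2 ones:
    kubectl delete -f calico.yaml
    kubectl delete -f rbac-kdd.yaml
    kubectl create -f rbac-kdd.yaml   # new version of the file
    kubectl create -f calico.yaml     # new version of the file

    # Verify which image is now running:
    kubectl -n kube-system get pods -l k8s-app=calico-node \
      -o jsonpath='{.items[*].spec.containers[0].image}'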

    Everything now appears to be at version 3.3.2, but I am now getting this error on all the calico-node pods:

    Warning Unhealthy 84s (x181 over 31m) kubelet, thalia4 Readiness probe failed: calico/node is not ready: BIRD is not ready: BGP not established with <node IP addresses here

    I ran calicoctl node status and got:

    Calico process is running.
    
    IPv4 BGP status
    +---------------+-------------------+-------+----------+--------------------------------+
    | PEER ADDRESS  |     PEER TYPE     | STATE |  SINCE   |              INFO              |
    +---------------+-------------------+-------+----------+--------------------------------+
    | 134.x.x.163   | node-to-node mesh | start | 02:36:29 | Connect                        |
    | 134.x.x.164   | node-to-node mesh | start | 02:36:29 | Connect                        |
    | 134.x.x.165   | node-to-node mesh | start | 02:36:29 | Connect                        |
    | 134.x.x.168   | node-to-node mesh | start | 02:36:29 | Active Socket: Host is         |
    |               |                   |       |          | unreachable                    |
    +---------------+-------------------+-------+----------+--------------------------------+
    
    IPv6 BGP status
    No IPv6 peers found.
    

    I assume 134.x.x.168 is unreachable, which is why I am getting the readiness-probe warning above.

    Not sure what to do about it, though. The node is available in the K8s cluster (this is node thalia4; a quick reachability check for the BGP port is sketched after the listing):

    [gms@thalia0 calico]$ kubectl get nodes
    NAME                  STATUS   ROLES    AGE   VERSION
    thalia0               Ready    master   87d   v1.13.1
    thalia1               Ready    <none>   48d   v1.13.1
    thalia2               Ready    <none>   30d   v1.13.1
    thalia3               Ready    <none>   87d   v1.13.1
    thalia4               Ready    <none>   48d   v1.13.1
    
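    To confirm whether BIRD's BGP port is actually reachable, a plain TCP probe from one of the peers is enough; a sketch using bash's /dev/tcp, so no extra tools are needed:

    # Run from any other node; 134.x.x.168 is the address reported unreachable:
    timeout 3 bash -c '</dev/tcp/134.x.x.168/179' \
      && echo "port 179 open" \
      || echo "port 179 blocked or host down"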

    EDIT 2

    calicoctl node status on thalia4 gave:

    [sudo] password for gms:
    Calico process is running.
    
    IPv4 BGP status
    +---------------+-------------------+-------+----------+---------+
    | PEER ADDRESS  |     PEER TYPE     | STATE |  SINCE   |  INFO   |
    +---------------+-------------------+-------+----------+---------+
    | 134.xx.xx.162 | node-to-node mesh | start | 02:36:29 | Connect |
    | 134.xx.xx.163 | node-to-node mesh | start | 02:36:29 | Connect |
    | 134.xx.xx.164 | node-to-node mesh | start | 02:36:29 | Connect |
    | 134.xx.xx.165 | node-to-node mesh | start | 02:36:29 | Connect |
    +---------------+-------------------+-------+----------+---------+
    

    Meanwhile, kubectl describe node thalia4 gave:

    Name:               thalia4.domain
    Roles:              <none>
    Labels:             beta.kubernetes.io/arch=amd64
                        beta.kubernetes.io/os=linux
                        dns=dns4
                        kubernetes.io/hostname=thalia4
                        node_name=thalia4
    Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                        node.alpha.kubernetes.io/ttl: 0
                        projectcalico.org/IPv4Address: 134.xx.xx.168/26
                        volumes.kubernetes.io/controller-managed-attach-detach: true
    CreationTimestamp:  Mon, 03 Dec 2018 14:17:07 -0600
    Taints:             <none>
    Unschedulable:      false
    Conditions:
      Type             Status    LastHeartbeatTime                 LastTransitionTime                Reason                       Message
      ----             ------    -----------------                 ------------------                ------                       -------
      OutOfDisk        Unknown   Fri, 21 Dec 2018 11:58:38 -0600   Sat, 12 Jan 2019 16:44:10 -0600   NodeStatusUnknown            Kubelet stopped posting node status.
      MemoryPressure   False     Mon, 21 Jan 2019 20:54:38 -0600   Sat, 12 Jan 2019 16:50:18 -0600   KubeletHasSufficientMemory   kubelet has sufficient memory available
      DiskPressure     False     Mon, 21 Jan 2019 20:54:38 -0600   Sat, 12 Jan 2019 16:50:18 -0600   KubeletHasNoDiskPressure     kubelet has no disk pressure
      PIDPressure      False     Mon, 21 Jan 2019 20:54:38 -0600   Sat, 12 Jan 2019 16:50:18 -0600   KubeletHasSufficientPID      kubelet has sufficient PID available
      Ready            True      Mon, 21 Jan 2019 20:54:38 -0600   Sun, 20 Jan 2019 20:27:10 -0600   KubeletReady                 kubelet is posting ready status
    Addresses:
      InternalIP:  134.xx.xx.168
      Hostname:    thalia4
    Capacity:
     cpu:                4
     ephemeral-storage:  6878Mi
     hugepages-1Gi:      0
     hugepages-2Mi:      0
     memory:             8009268Ki
     pods:               110
    Allocatable:
     cpu:                4
     ephemeral-storage:  6490895145
     hugepages-1Gi:      0
     hugepages-2Mi:      0
     memory:             7906868Ki
     pods:               110
    System Info:
     Machine ID:                 c011569a40b740a88a672a5cc526b3ba
     System UUID:                42093037-F27E-CA90-01E1-3B253813B904
     Boot ID:                    ffa5170e-da2b-4c09-bd8a-032ce9fca2ee
     Kernel Version:             3.10.0-957.1.3.el7.x86_64
     OS Image:                   Red Hat Enterprise Linux
     Operating System:           linux
     Architecture:               amd64
     Container Runtime Version:  docker://1.13.1
     Kubelet Version:            v1.13.1
     Kube-Proxy Version:         v1.13.1
    PodCIDR:                     192.168.4.0/24
    Non-terminated Pods:         (3 in total)
      Namespace                  Name                        CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
      ---------                  ----                        ------------  ----------  ---------------  -------------  ---
      kube-system                calico-node-8xqbs           250m (6%)     0 (0%)      0 (0%)           0 (0%)         24h
      kube-system                coredns-786f4c87c8-sbks2    100m (2%)     0 (0%)      70Mi (0%)        170Mi (2%)     47h
      kube-system                kube-proxy-zp4fk            0 (0%)        0 (0%)      0 (0%)           0 (0%)         31d
    Allocated resources:
      (Total limits may be over 100 percent, i.e., overcommitted.)
      Resource           Requests   Limits
      --------           --------   ------
      cpu                350m (8%)  0 (0%)
      memory             70Mi (0%)  170Mi (2%)
      ephemeral-storage  0 (0%)     0 (0%)
    Events:              <none>
    

    I figured this was a firewall issue, but I was told in the Slack channel, "If you're not using host endpoints, then we don't mess with your host's connectivity. It sounds like something is blocking port 179 on that host."

    Not sure where that would be, though. The iptables rules look the same on all the nodes.
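    One way to hunt for the blocking rule on each host (a sketch; the chain name is Calico's default, and the firewalld check only applies if that service is running):

    # Dump Calico's failsafe chain with packet counters:
    sudo iptables -L cali-failsafe-in -n -v --line-numbers

    # Look for any rule that mentions the BGP port anywhere:
    sudo iptables-save | grep -w -- 179

    # Check whether firewalld is active and what it allows:
    sudo firewall-cmd --state && sudo firewall-cmd --list-all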

    2 Replies  |  6 years ago
        1
  •  1
  •   horcle_buzz    6 years ago

    I figured out the problem. I had to add a rule to the cali-failsafe-in chain via sudo iptables -A cali-failsafe-in -p tcp --match multiport --dport 179 -j ACCEPT on all the nodes.
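    Note that a rule appended with iptables -A does not survive a reboot, and the cali-* chains are managed by Calico's Felix, which may rewrite them on resync. If the real blocker is the host firewall, a more durable fix may be to open the port there instead; a sketch assuming firewalld on these RHEL 7 hosts:

    # Open BGP permanently in the host firewall (if firewalld is in use):
    sudo firewall-cmd --permanent --add-port=179/tcp
    sudo firewall-cmd --reload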

    Now everything appears to be functional across all the nodes:

    IPv4 BGP status
    +---------------+-------------------+-------+----------+-------------+
    | PEER ADDRESS  |     PEER TYPE     | STATE |  SINCE   |    INFO     |
    +---------------+-------------------+-------+----------+-------------+
    | 134.xx.xx.163 | node-to-node mesh | up    | 19:33:58 | Established |
    | 134.xx.xx.164 | node-to-node mesh | up    | 19:33:40 | Established |
    | 134.xx.xx.165 | node-to-node mesh | up    | 19:35:07 | Established |
    | 134.xx.xx.168 | node-to-node mesh | up    | 19:35:01 | Established |
    +---------------+-------------------+-------+----------+-------------+
    
        2
  •  0
  •   baozhenli    6 years ago

    --network-plugin=cni specifies that the CNI network plugin is used; the actual CNI plugin binaries live in --cni-bin-dir (default /opt/cni/bin), and the CNI plugin configuration lives in --cni-conf-dir (default /etc/cni/net.d).

    For example:

    --network-plugin=cni

    --cni-bin-dir=/opt/cni/bin may contain several CNI binaries (calico, weave, ...); you can show the calico version with the command '/opt/cni/bin/calico -v'

    --cni-conf-dir=/etc/cni/net.d holds the detailed CNI plugin configuration, like the following (a combined kubelet invocation is sketched after the config):

    {
      "name": "calico-network",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "calico",
          "mtu": 8950,
          "policy": {
            "type": "k8s"
          },
          "ipam": {
            "type": "calico-ipam",
            "assign_ipv6": "false",
            "assign_ipv4": "true"
          },
          "etcd_endpoints": "https://172.16.1.5:2379,https://172.16.1.9:2379,https://172.16.1.15:2379",
          "etcd_key_file": "/etc/etcd/ssl/etcd-client-key.pem",
          "etcd_cert_file": "/etc/etcd/ssl/etcd-client.pem",
          "etcd_ca_cert_file": "/etc/etcd/ssl/ca.pem",
          "kubernetes": {
            "kubeconfig": "/etc/kubernetes/cluster-admin.kubeconfig"
          }
        }
      ]
    }
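
    Putting the three flags together, the kubelet command line might look like the sketch below (all other kubelet flags omitted for brevity; paths are the defaults named above):

    kubelet --network-plugin=cni \
            --cni-bin-dir=/opt/cni/bin \
            --cni-conf-dir=/etc/cni/net.d

    # Show which calico CNI binary version is installed:
    /opt/cni/bin/calico -v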