Консул в Кубернетесе: капсулы Консула работают, но не готовы

Я создаю 3-узловый кластер внутри виртуальной машины Ubuntu, работающей на моем Mac с использованием Kind . Они работают так, как должны:

NAME                 STATUS   ROLES    AGE   VERSION
kind-control-plane   Ready    master   20h   v1.17.0
kind-worker          Ready    <none>   20h   v1.17.0
kind-worker2         Ready    <none>   20h   v1.17.0

Я установил Консул, используя официальный учебник с таблицей Хелма по умолчанию. Теперь проблема в том, что консольные модули либо работают, либо ожидают, и ни одна из них не готова:

NAME                        READY   STATUS    RESTARTS   AGE
busybox-6cd57fd969-9tzmf    1/1     Running   0          17h
hashicorp-consul-hgxdr      0/1     Running   0          18h
hashicorp-consul-server-0   0/1     Running   0          18h
hashicorp-consul-server-1   0/1     Running   0          18h
hashicorp-consul-server-2   0/1     Pending   0          18h
hashicorp-consul-vmsmt      0/1     Running   0          18h

Вот полное описание стручков:

Name:         busybox-6cd57fd969-9tzmf
Namespace:    default
Priority:     0
Node:         kind-worker2/172.17.0.4
Start Time:   Tue, 14 Jan 2020 17:45:03 +0800
Labels:       pod-template-hash=6cd57fd969
              run=busybox
Annotations:  <none>
Status:       Running
IP:           10.244.2.11
IPs:
  IP:           10.244.2.11
Controlled By:  ReplicaSet/busybox-6cd57fd969
Containers:
  busybox:
    Container ID:  containerd://710eba6a12607021098e3c376637476cd85faf86ac9abcf10f191126dc37026b
    Image:         busybox
    Image ID:      docker.io/library/busybox@sha256:6915be4043561d64e0ab0f8f098dc2ac48e077fe23f488ac24b665166898115a
    Port:          <none>
    Host Port:     <none>
    Args:
      sh
    State:          Running
      Started:      Tue, 14 Jan 2020 21:00:50 +0800
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zszqr (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  default-token-zszqr:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-zszqr
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>


Name:         hashicorp-consul-hgxdr
Namespace:    default
Priority:     0
Node:         kind-worker2/172.17.0.4
Start Time:   Tue, 14 Jan 2020 17:13:57 +0800
Labels:       app=consul
              chart=consul-helm
              component=client
              controller-revision-hash=6bc54657b6
              hasDNS=true
              pod-template-generation=1
              release=hashicorp
Annotations:  consul.hashicorp.com/connect-inject: false
Status:       Running
IP:           10.244.2.10
IPs:
  IP:           10.244.2.10
Controlled By:  DaemonSet/hashicorp-consul
Containers:
  consul:
    Container ID:  containerd://2209cfeaa740e3565213de6d0653dabbe9a8cbf1ffe085352a8e9d3a2d0452ec
    Image:         consul:1.6.2
    Image ID:      docker.io/library/consul@sha256:a167e7222c84687c3e7f392f13b23d9f391cac80b6b839052e58617dab714805
    Ports:         8500/TCP, 8502/TCP, 8301/TCP, 8301/UDP, 8302/TCP, 8300/TCP, 8600/TCP, 8600/UDP
    Host Ports:    8500/TCP, 8502/TCP, 0/TCP, 0/UDP, 0/TCP, 0/TCP, 0/TCP, 0/UDP
    Command:
      /bin/sh
      -ec
      CONSUL_FULLNAME="hashicorp-consul"

      exec /bin/consul agent 
        -node="${NODE}" 
        -advertise="${ADVERTISE_IP}" 
        -bind=0.0.0.0 
        -client=0.0.0.0 
        -node-meta=pod-name:${HOSTNAME} 
        -hcl="ports { grpc = 8502 }" 
        -config-dir=/consul/config 
        -datacenter=dc1 
        -data-dir=/consul/data 
        -retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc 
        -retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc 
        -retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc 
        -domain=consul

    State:          Running
      Started:      Tue, 14 Jan 2020 20:58:29 +0800
    Ready:          False
    Restart Count:  0
    Readiness:      exec [/bin/sh -ec curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | 
grep -E '".+"'
] delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      ADVERTISE_IP:   (v1:status.podIP)
      NAMESPACE:     default (v1:metadata.namespace)
      NODE:           (v1:spec.nodeName)
    Mounts:
      /consul/config from config (rw)
      /consul/data from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from hashicorp-consul-client-token-4r5tv (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  data:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      hashicorp-consul-client-config
    Optional:  false
  hashicorp-consul-client-token-4r5tv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hashicorp-consul-client-token-4r5tv
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/pid-pressure:NoSchedule
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
Events:
  Type     Reason     Age                   From                   Message
  ----     ------     ----                  ----                   -------
  Warning  Unhealthy  96s (x3206 over 14h)  kubelet, kind-worker2  Readiness probe failed:


Name:         hashicorp-consul-server-0
Namespace:    default
Priority:     0
Node:         kind-worker2/172.17.0.4
Start Time:   Tue, 14 Jan 2020 17:13:57 +0800
Labels:       app=consul
              chart=consul-helm
              component=server
              controller-revision-hash=hashicorp-consul-server-98f4fc994
              hasDNS=true
              release=hashicorp
              statefulset.kubernetes.io/pod-name=hashicorp-consul-server-0
Annotations:  consul.hashicorp.com/connect-inject: false
Status:       Running
IP:           10.244.2.9
IPs:
  IP:           10.244.2.9
Controlled By:  StatefulSet/hashicorp-consul-server
Containers:
  consul:
    Container ID:  containerd://72b7bf0e81d3ed477f76b357743e9429325da0f38ccf741f53c9587082cdfcd0
    Image:         consul:1.6.2
    Image ID:      docker.io/library/consul@sha256:a167e7222c84687c3e7f392f13b23d9f391cac80b6b839052e58617dab714805
    Ports:         8500/TCP, 8301/TCP, 8302/TCP, 8300/TCP, 8600/TCP, 8600/UDP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/UDP
    Command:
      /bin/sh
      -ec
      CONSUL_FULLNAME="hashicorp-consul"

      exec /bin/consul agent 
        -advertise="${POD_IP}" 
        -bind=0.0.0.0 
        -bootstrap-expect=3 
        -client=0.0.0.0 
        -config-dir=/consul/config 
        -datacenter=dc1 
        -data-dir=/consul/data 
        -domain=consul 
        -hcl="connect { enabled = true }" 
        -ui 
        -retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc 
        -retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc 
        -retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc 
        -server

    State:          Running
      Started:      Tue, 14 Jan 2020 20:58:27 +0800
    Ready:          False
    Restart Count:  0
    Readiness:      exec [/bin/sh -ec curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | 
grep -E '".+"'
] delay=5s timeout=5s period=3s #success=1 #failure=2
    Environment:
      POD_IP:      (v1:status.podIP)
      NAMESPACE:  default (v1:metadata.namespace)
    Mounts:
      /consul/config from config (rw)
      /consul/data from data-default (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from hashicorp-consul-server-token-hhdxc (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  data-default:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-default-hashicorp-consul-server-0
    ReadOnly:   false
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      hashicorp-consul-server-config
    Optional:  false
  hashicorp-consul-server-token-hhdxc:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hashicorp-consul-server-token-hhdxc
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                    From                   Message
  ----     ------     ----                   ----                   -------
  Warning  Unhealthy  97s (x10686 over 14h)  kubelet, kind-worker2  Readiness probe failed:


Name:         hashicorp-consul-server-1
Namespace:    default
Priority:     0
Node:         kind-worker/172.17.0.3
Start Time:   Tue, 14 Jan 2020 17:13:57 +0800
Labels:       app=consul
              chart=consul-helm
              component=server
              controller-revision-hash=hashicorp-consul-server-98f4fc994
              hasDNS=true
              release=hashicorp
              statefulset.kubernetes.io/pod-name=hashicorp-consul-server-1
Annotations:  consul.hashicorp.com/connect-inject: false
Status:       Running
IP:           10.244.1.8
IPs:
  IP:           10.244.1.8
Controlled By:  StatefulSet/hashicorp-consul-server
Containers:
  consul:
    Container ID:  containerd://c1f5a88e30e545c75e58a730be5003cee93c823c21ebb29b22b79cd151164a15
    Image:         consul:1.6.2
    Image ID:      docker.io/library/consul@sha256:a167e7222c84687c3e7f392f13b23d9f391cac80b6b839052e58617dab714805
    Ports:         8500/TCP, 8301/TCP, 8302/TCP, 8300/TCP, 8600/TCP, 8600/UDP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/UDP
    Command:
      /bin/sh
      -ec
      CONSUL_FULLNAME="hashicorp-consul"

      exec /bin/consul agent 
        -advertise="${POD_IP}" 
        -bind=0.0.0.0 
        -bootstrap-expect=3 
        -client=0.0.0.0 
        -config-dir=/consul/config 
        -datacenter=dc1 
        -data-dir=/consul/data 
        -domain=consul 
        -hcl="connect { enabled = true }" 
        -ui 
        -retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc 
        -retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc 
        -retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc 
        -server

    State:          Running
      Started:      Tue, 14 Jan 2020 20:58:36 +0800
    Ready:          False
    Restart Count:  0
    Readiness:      exec [/bin/sh -ec curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | 
grep -E '".+"'
] delay=5s timeout=5s period=3s #success=1 #failure=2
    Environment:
      POD_IP:      (v1:status.podIP)
      NAMESPACE:  default (v1:metadata.namespace)
    Mounts:
      /consul/config from config (rw)
      /consul/data from data-default (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from hashicorp-consul-server-token-hhdxc (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  data-default:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-default-hashicorp-consul-server-1
    ReadOnly:   false
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      hashicorp-consul-server-config
    Optional:  false
  hashicorp-consul-server-token-hhdxc:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hashicorp-consul-server-token-hhdxc
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                    From                  Message
  ----     ------     ----                   ----                  -------
  Warning  Unhealthy  95s (x10683 over 14h)  kubelet, kind-worker  Readiness probe failed:


Name:           hashicorp-consul-server-2
Namespace:      default
Priority:       0
Node:           <none>
Labels:         app=consul
                chart=consul-helm
                component=server
                controller-revision-hash=hashicorp-consul-server-98f4fc994
                hasDNS=true
                release=hashicorp
                statefulset.kubernetes.io/pod-name=hashicorp-consul-server-2
Annotations:    consul.hashicorp.com/connect-inject: false
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  StatefulSet/hashicorp-consul-server
Containers:
  consul:
    Image:       consul:1.6.2
    Ports:       8500/TCP, 8301/TCP, 8302/TCP, 8300/TCP, 8600/TCP, 8600/UDP
    Host Ports:  0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/UDP
    Command:
      /bin/sh
      -ec
      CONSUL_FULLNAME="hashicorp-consul"

      exec /bin/consul agent 
        -advertise="${POD_IP}" 
        -bind=0.0.0.0 
        -bootstrap-expect=3 
        -client=0.0.0.0 
        -config-dir=/consul/config 
        -datacenter=dc1 
        -data-dir=/consul/data 
        -domain=consul 
        -hcl="connect { enabled = true }" 
        -ui 
        -retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc 
        -retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc 
        -retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc 
        -server

    Readiness:  exec [/bin/sh -ec curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | 
grep -E '".+"'
] delay=5s timeout=5s period=3s #success=1 #failure=2
    Environment:
      POD_IP:      (v1:status.podIP)
      NAMESPACE:  default (v1:metadata.namespace)
    Mounts:
      /consul/config from config (rw)
      /consul/data from data-default (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from hashicorp-consul-server-token-hhdxc (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  data-default:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-default-hashicorp-consul-server-2
    ReadOnly:   false
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      hashicorp-consul-server-config
    Optional:  false
  hashicorp-consul-server-token-hhdxc:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hashicorp-consul-server-token-hhdxc
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  63s (x434 over 18h)  default-scheduler  0/3 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 2 node(s) didn't match pod affinity/anti-affinity.


Name:         hashicorp-consul-vmsmt
Namespace:    default
Priority:     0
Node:         kind-worker/172.17.0.3
Start Time:   Tue, 14 Jan 2020 17:13:57 +0800
Labels:       app=consul
              chart=consul-helm
              component=client
              controller-revision-hash=6bc54657b6
              hasDNS=true
              pod-template-generation=1
              release=hashicorp
Annotations:  consul.hashicorp.com/connect-inject: false
Status:       Running
IP:           10.244.1.9
IPs:
  IP:           10.244.1.9
Controlled By:  DaemonSet/hashicorp-consul
Containers:
  consul:
    Container ID:  containerd://d502870f3476ea074b059361bc52a2c68ced551f5743b8448926bdaa319aabb0
    Image:         consul:1.6.2
    Image ID:      docker.io/library/consul@sha256:a167e7222c84687c3e7f392f13b23d9f391cac80b6b839052e58617dab714805
    Ports:         8500/TCP, 8502/TCP, 8301/TCP, 8301/UDP, 8302/TCP, 8300/TCP, 8600/TCP, 8600/UDP
    Host Ports:    8500/TCP, 8502/TCP, 0/TCP, 0/UDP, 0/TCP, 0/TCP, 0/TCP, 0/UDP
    Command:
      /bin/sh
      -ec
      CONSUL_FULLNAME="hashicorp-consul"

      exec /bin/consul agent 
        -node="${NODE}" 
        -advertise="${ADVERTISE_IP}" 
        -bind=0.0.0.0 
        -client=0.0.0.0 
        -node-meta=pod-name:${HOSTNAME} 
        -hcl="ports { grpc = 8502 }" 
        -config-dir=/consul/config 
        -datacenter=dc1 
        -data-dir=/consul/data 
        -retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc 
        -retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc 
        -retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc 
        -domain=consul

    State:          Running
      Started:      Tue, 14 Jan 2020 20:58:35 +0800
    Ready:          False
    Restart Count:  0
    Readiness:      exec [/bin/sh -ec curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | 
grep -E '".+"'
] delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      ADVERTISE_IP:   (v1:status.podIP)
      NAMESPACE:     default (v1:metadata.namespace)
      NODE:           (v1:spec.nodeName)
    Mounts:
      /consul/config from config (rw)
      /consul/data from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from hashicorp-consul-client-token-4r5tv (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  data:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      hashicorp-consul-client-config
    Optional:  false
  hashicorp-consul-client-token-4r5tv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hashicorp-consul-client-token-4r5tv
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/pid-pressure:NoSchedule
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
Events:
  Type     Reason     Age                   From                  Message
  ----     ------     ----                  ----                  -------
  Warning  Unhealthy  88s (x3207 over 14h)  kubelet, kind-worker  Readiness probe failed:

Ради полноты вот мой статус kubelet :

   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Wed 2020-01-15 10:59:06 +08; 1h 5min ago
     Docs: https://kubernetes.io/docs/home/
 Main PID: 11910 (kubelet)
    Tasks: 17
   Memory: 50.3M
      CPU: 1min 16.431s
   CGroup: /system.slice/kubelet.service
           └─11910 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml 

Jan 15 12:04:41 ubuntu kubelet[11910]: E0115 12:04:41.610779   11910 kubelet.go:2183] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message
Jan 15 12:04:42 ubuntu kubelet[11910]: W0115 12:04:42.370702   11910 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Jan 15 12:04:46 ubuntu kubelet[11910]: E0115 12:04:46.612639   11910 kubelet.go:2183] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message
Jan 15 12:04:47 ubuntu kubelet[11910]: W0115 12:04:47.371621   11910 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Jan 15 12:04:51 ubuntu kubelet[11910]: E0115 12:04:51.614925   11910 kubelet.go:2183] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message
Jan 15 12:04:52 ubuntu kubelet[11910]: W0115 12:04:52.372164   11910 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Jan 15 12:04:56 ubuntu kubelet[11910]: E0115 12:04:56.616201   11910 kubelet.go:2183] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message
Jan 15 12:04:57 ubuntu kubelet[11910]: W0115 12:04:57.372364   11910 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Jan 15 12:05:01 ubuntu kubelet[11910]: E0115 12:05:01.617916   11910 kubelet.go:2183] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message
Jan 15 12:05:02 ubuntu kubelet[11910]: W0115 12:05:02.372698   11910 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d

Любая помощь очень ценится.

Всего 2 ответа


Для отказоустойчивости консольного кластера рекомендуемый размер кворума равен 3 или 5, см. Таблицу развертывания.

Значение по умолчанию для кворума в Хелм-чартах равно 3

replicas (integer: 3) -The number of server agents to run.

`affinity (string) - This value defines the affinity for server pods. It defaults to allowing only a single pod on each node, which minimizes risk of the cluster becoming unusable if a node is lost`

Привязать сродство If you need to run more pods per node set this value to null.

Таким образом, для установки консула со значением кворума, равным 3, вам потребуется не менее 3 рабочих по расписанию рабочих узлов, удовлетворяющих требованию соответствия, для установки консула [фактически, вам нужно увеличить рабочие узлы до 5, если выбрано значение, которое будет обновлено до 5].

Это четко задокументировано в values.yaml на рулевой диаграмме, какие значения использовать для запуска в системе с меньшим количеством узлов

Сокращая количество реплик

~/test/consul-helm$ cat values.yaml | grep -i replica
  replicas: 3
  bootstrapExpect: 3 # Should <= replicas count
    # replicas. If you'd like a custom value, you can specify an override here.

Отключив Affinity

~/test/consul-helm$ cat values.yaml | grep -i -A 8 affinity

  # Affinity Settings
  # Commenting out or setting as empty the affinity variable, will allow
  # deployment to single node services such as Minikube
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: {{ template "consul.name" . }}
              release: "{{ .Release.Name }}"
              component: server
          topologyKey: kubernetes.io/hostname

Я скопировал вашу настройку, создав кластер из 3 узлов (1 мастер и 2 рабочих), и развернул консул со штурвалом и увидел то же, что вы видите. Все стручки бежали рядом с тем, который ожидал.

В объекте statefulset вы можете видеть, что есть podAntiAffinity, который запрещает планирование 2 или более серверных модулей на одном узле. Вот почему вы видите один пакет в состоянии ожидания.

Есть 4 способа, которыми я могу придумать, как заставить это работать.

  1. Главный узел имеет taint: node-role.kubernetes.io/master:NoSchedule который запрещает планирование любых модулей на главном узле. Вы можете удалить эту порчу, запустив: kubectl taint node kind-control-plane node-role.kubernetes.io/master:NoSchedule- (обратите внимание на знак минус, он говорит k8s удалить порчу), так что теперь планировщик сможет планировать один модуль консул-сервера , оставленный этому узлу.

  2. Вы можете добавить еще один рабочий узел.

  3. Вы можете удалить podAntiAffinity из объекта statfulset консул-сервера, чтобы планировщик не заботился о том, где запланированы модули.

  4. Измените requiredDuringSchedulingIgnoredDuringExecution на preferredDuringSchedulingIgnoredDuringExecution чтобы не выполнять это правило сходства, а только предпочтительнее .

Дайте мне знать, если это помогло.


Есть идеи?

10000