This page contains general FAQ for the GCE Ingress controller.
- How do I deploy an Ingress controller?
- I created an Ingress and nothing happens, now what?
- What are the cloud resources created for a single Ingress?
- The Ingress controller events complain about quota, how do I increase it?
- Why does the Ingress need a different instance group than the GKE cluster?
- Why does the cloud console show 0/N healthy instances?
- Can I configure GCE health checks through the Ingress?
- Why does my Ingress have an ephemeral ip?
- Can I pre-allocate a static-ip?
- Does updating a Kubernetes secret update the GCE TLS certs?
- Can I tune the loadbalancing algorithm?
- Is there a maximum number of Endpoints I can add to the Ingress?
- How do I match GCE resources to Kubernetes Services?
- Can I change the cluster UID?
- Why do I need a default backend?
- How does Ingress work across 2 GCE clusters?
- I shut down a cluster without deleting all Ingresses, how do I manually clean up?
- How do I disable the GCE Ingress controller?
- What GCE resources are shared between Ingresses?
- How do I debug a controller spin loop?
- Creating an Internal Load Balancer without existing ingress
- Can I use websockets?
On GCP (either GCE or GKE), every Kubernetes cluster has an Ingress controller running on the master, no deployment necessary.
Please check the following:
- Output of `kubectl describe`, as shown here
- Do your Services all have a `NodePort`?
- Do your Services either serve an HTTP status code 200 on `/`, or have a readiness probe as described in this section?
- Do you have enough GCP quota?
Terminology:
- Global Forwarding Rule: Manages the Ingress VIP
- TargetHttpProxy: Manages SSL certs and proxies between the VIP and backend
- URL Map: Routing rules
- Backend Service: Bridges various Instance Groups on a given Service NodePort
- Instance Group: Collection of Kubernetes nodes
The pipeline is as follows:
    Global Forwarding Rule -> TargetHTTPProxy
            |                                  \                               Instance Group (us-east1)
        Static IP                               URL Map - Backend Service(s) - Instance Group (us-central1)
            |                                  /                               ...
    Global Forwarding Rule -> TargetHTTPSProxy
                                SSL cert
In addition to this pipeline:
- Each Backend Service requires an HTTP or HTTPS health check to the NodePort of the Service
- Each port on the Backend Service has a matching port on the Instance Group
- Each port on the Backend Service is exposed through a firewall-rule open to the GCE LB IP ranges (`130.211.0.0/22` and `35.191.0.0/16`)
GLBC is not aware of your GCE quota. As of this writing, users get 3 GCE Backend Services by default. If you plan on creating Ingresses for multiple Kubernetes Services, remember that each one requires a backend service, and request quota. If you do not, the controller will poll periodically and grab the first free backend service slot it finds. You can view your quota:
$ gcloud compute project-info describe --project myproject
See GCE documentation for how to request more.
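To pull a single quota out of that output, a small filter can help. The sketch below is not part of GLBC. It assumes the quota section of the describe output prints each entry as a `limit`, `metric`, `usage` triple in that order, and the function name `quota_for` is hypothetical.

```shell
# quota_for METRIC: read project-info describe output on stdin and print
# "usage/limit" for the named quota metric.
# Assumption: each quota entry is printed as "- limit:", "metric:", "usage:".
quota_for() {
  awk -v want="$1" '
    $1 == "-" && $2 == "limit:"      { limit = $3 }
    $1 == "metric:"                  { metric = $2 }
    $1 == "usage:" && metric == want { print $2 "/" limit }
  '
}
```

It could be used as `gcloud compute project-info describe --project myproject | quota_for BACKEND_SERVICES`. If the output layout differs in your gcloud version, the field positions in the `awk` program need adjusting.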
The controller adds/removes Kubernetes nodes that are `NotReady` from the lb instance group. We cannot simply rely on health checks to achieve this for a few reasons.
First, older Kubernetes versions (<=1.3) did not mark endpoints on unreachable nodes as NotReady. If the Kubelet didn't heartbeat for 10s, the node was marked NotReady, but there was no other signal at the Service level to stop routing requests to endpoints on that node. Later Kubernetes versions handle this a little better: if the Kubelet doesn't heartbeat for 10s the node is marked NotReady, and if it stays in NotReady for 40s all of its endpoints are marked NotReady. So it is still advantageous to pull the node out of the GCE LB Instance Group in 10s, because we save 30s of bad requests.
Second, continuing to send requests to NotReady nodes is not a great idea. The NotReady condition is an aggregate of various factors. For example, a NotReady node might still pass health checks but have the wrong nodePort to endpoint mappings. The health check will pass as long as something returns an HTTP 200.
Some nodes are reporting negatively on the GCE HTTP health check. Please check the following:
- Try to access any node-ip:node-port/health-check-url
- Try to access any public-ip:node-port/health-check-url
- Make sure you have a firewall-rule allowing access to the GCE LB IP range (created by the Ingress controller on your behalf)
- Make sure the right NodePort is opened in the Backend Service, and consequently, plugged into the lb instance group
Currently health checks are not exposed through the Ingress resource; they're handled at the node level by Kubernetes daemons (kube-proxy and the kubelet). However, the GCE L7 lb still requires an HTTP(S) health check to measure node health. By default, this health check points at `/` on the nodePort associated with a given backend. Note that the purpose of this health check is NOT to determine when endpoint pods are overloaded, but rather to detect when a given node is incapable of proxying requests for the Service:nodePort altogether. Overloaded endpoints are removed from the working set of a Service via readiness probes conducted by the kubelet.
If `/` doesn't work for your application, you can have the Ingress controller program the GCE health check to point at a readiness probe as shown in this example.
We plan to surface health checks through the API soon.
GCE has a concept of ephemeral and static IPs. A production website would always want a static IP, while ephemeral IPs are cheaper (both in terms of quota and cost) and are therefore better suited for experimentation.
- Creating an HTTP Ingress (i.e. an Ingress without a TLS section) allocates an ephemeral IP for 2 reasons:
  - we want to encourage secure defaults
  - static-ips have limited quota and pure HTTP ingress is often used for testing
- Creating an Ingress with a TLS section allocates a static IP
- Modifying an Ingress and adding a TLS section allocates a static IP, but the IP will change.
- You can promote an ephemeral to a static IP by hand, if required.
Yes, please see this example.
Yes, expect O(30s) delay.
The controller should create a second SSL certificate suffixed with `-1` and atomically swap it with the SSL certificate in your target proxy, then delete the obsolete SSL certificate.
Right now, a kube-proxy NodePort service is a necessary condition for Ingress on GCP. This is because the cloud LB doesn't understand how to route directly to your pods. Incorporating kube-proxy and cloud lb algorithms so they cooperate toward a common goal is still a work in progress. If you really want fine-grained control over the algorithm, you should deploy the nginx controller.
This limit is directly related to the maximum number of endpoints allowed in a Kubernetes cluster, not the HTTP LB configuration, since the HTTP LB sends packets to VMs. Ingress is not yet supported on single zone clusters of size > 1000 nodes (issue). If you'd like to use Ingress on a large cluster, spread it across 2 or more zones such that no single zone contains more than 1000 nodes. This is because there is a limit to the number of instances one can add to a single GCE Instance Group. In a multi-zone cluster, each zone gets its own instance group.
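As a rough planning aid, the minimum number of zones is the node count divided by the per-zone limit, rounded up. This is a minimal sketch, assuming the limit of 1000 nodes per zone quoted above; `min_zones` is a hypothetical helper name.

```shell
# min_zones NODES [LIMIT]: smallest number of zones such that an even spread
# keeps every zone at or below LIMIT nodes (default 1000, from the text above).
min_zones() {
  nodes=$1
  limit=${2:-1000}
  # Integer ceiling division: (nodes + limit - 1) / limit
  echo $(( (nodes + limit - 1) / limit ))
}

min_zones 2500   # prints 3
```

The result is only a lower bound: it assumes nodes are spread evenly, so an unbalanced cluster may still need more zones.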
The format followed for creating resources in the cloud is: `k8s-<resource-name>-<nodeport>-<cluster-hash>`, where `nodeport` is the output of
$ kubectl get svc <svcname> --template '{{range $i, $e := .spec.ports}}{{$e.nodePort}},{{end}}'
`cluster-hash` is the output of:
$ kubectl get configmap -o yaml --namespace=kube-system | grep -i " data:" -A 1
  data:
    uid: cad4ee813812f808
and `resource-name` is a short prefix for one of the resources mentioned here (eg: `be` for backends, `hc` for health checks). If a given resource is not tied to a single `node-port`, its name will not include the same.
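The naming scheme can be sketched as a small shell function. This illustrates the format stated above and is not the controller's own code; `gce_resource_name` is a hypothetical helper, and the node port `30301` in the example is made up.

```shell
# gce_resource_name PREFIX NODEPORT HASH
# Builds k8s-<resource-name>-<nodeport>-<cluster-hash>. Pass an empty NODEPORT
# for resources that are not tied to a single node port.
gce_resource_name() {
  if [ -n "$2" ]; then
    echo "k8s-$1-$2-$3"
  else
    echo "k8s-$1-$3"
  fi
}

gce_resource_name be 30301 cad4ee813812f808   # prints k8s-be-30301-cad4ee813812f808
```

Comparing the generated name against what the cloud console shows is a quick way to find the Backend Service or health check that belongs to a given Service.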
The Ingress controller configures itself to add the UID it stores in a configmap in the `kube-system` namespace.
$ kubectl --namespace=kube-system get configmaps
NAME DATA AGE
ingress-uid 1 12d
$ kubectl --namespace=kube-system get configmaps -o yaml
apiVersion: v1
items:
- apiVersion: v1
  data:
    uid: UID
  kind: ConfigMap
...
You can pick a different UID, but this requires you to:
1. Delete existing Ingresses
2. Edit the configmap using `kubectl edit`
3. Recreate the same Ingress
After step 3 the Ingress should come up using the new UID as the suffix of all cloud resources. You can't simply change the UID if you have existing Ingresses, because renaming a cloud resource requires a delete/create cycle that the Ingress controller does not currently automate. Note that the UID in step 1 might be an empty string, if you had a working Ingress before upgrading to Kubernetes 1.3.
A note on setting the UID: The Ingress controller uses the token `--` to split a machine generated prefix from the UID itself. If the user supplied UID is found to contain `--` the controller will take the token after the last `--`, and use an empty string if it ends with `--`. For example, if you insert `foo--bar` as the UID, the controller will assume `bar` is the UID. You can either edit the configmap and set the UID to `bar` to match the controller, or delete existing Ingresses as described above, and reset it to a string bereft of `--`.
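The splitting rule can be reproduced with POSIX parameter expansion, which is handy for predicting what the controller will treat as the UID before you edit the configmap. This is a minimal sketch of the rule as described above, not the controller's implementation; `effective_uid` is a hypothetical name.

```shell
# effective_uid UID: print the part of UID after the last "--".
# A UID without "--" is returned unchanged; one ending in "--" yields "".
effective_uid() {
  printf '%s\n' "${1##*--}"
}

effective_uid 'foo--bar'   # prints bar
```

The `##*--` expansion strips the longest prefix ending in `--`, which is what "take the token after the last `--`" amounts to.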
All GCE URL maps require at least one default backend, which handles all requests that don't match a host/path. In Ingress, the default backend is optional, since the resource is cross-platform and not all platforms require a default backend. If you don't specify one in your yaml, the GCE ingress controller will inject the default-http-backend Service that runs in the `kube-system` namespace as the default backend for the GCE HTTP lb allocated for that Ingress resource.
Some caveats concerning the default backend:
- It is the only Backend Service that doesn't directly map to a user specified NodePort Service
- It's created when the first Ingress is created, and deleted when the last Ingress is deleted, since we don't want to waste quota if the user is not going to need L7 loadbalancing through Ingress
- It has an HTTP health check pointing at `/healthz`, not the default `/`, because `/` serves a 404 by design
See kubemci documentation.
If you kill a cluster without first deleting Ingresses, the resources will leak. If you find yourself in such a situation, you can delete the resources by hand:
- Navigate to the cloud console and click on the "Networking" tab, then choose "LoadBalancing"
- Find the loadbalancer you'd like to delete, it should have a name formatted as: `k8s-um-ns-name--UUID`
- Delete it, check the boxes to also cascade the deletion down to associated resources (eg: backend-services)
- Switch to the "Compute Engine" tab, then choose "Instance Groups"
- Delete the Instance Group allocated for the leaked Ingress, it should have a name formatted as: `k8s-ig-UUID`
We plan to fix this soon.
As of Kubernetes 1.3, GLBC runs as a static pod on the master. If you want to disable it, you have 3 options:
Option 1. Have it no-op for an Ingress resource based on the `ingress.class` annotation as shown here.
This can also be used to use one of the other Ingress controllers at the same time as the GCE controller.
Option 2. SSH into the GCE master node and delete the GLBC manifest file found at `/etc/kubernetes/manifests/glbc.manifest`.
Option 3. Disable the addon in GKE via `gcloud`:
Disable the addon in GKE at cluster bring-up time through the `disable-addons` flag:
gcloud container clusters create mycluster --network "default" --num-nodes 1 \
--machine-type n1-standard-2 \
--zone $ZONE \
--disk-size 50 \
--scopes storage-full \
--disable-addons HttpLoadBalancing
Disable the addon in GKE for an existing cluster through the `update-addons` flag:
gcloud container clusters update mycluster --update-addons HttpLoadBalancing=DISABLED
Every Ingress creates a pipeline of GCE cloud resources behind an IP. Some of these are shared between Ingresses out of necessity, while some are shared because there was no perceived need for duplication (all resources consume quota and usually cost money).
Shared:
- Backend Services: because of low quota and high reuse. A single Service in a Kubernetes cluster has one NodePort, common throughout the cluster. GCE has a hard limit on the number of allowed Backend Services, so if multiple Ingresses all point to a single Service, that creates a single Backend Service in GCE pointing to that Service's NodePort.
- Instance Group: since an instance can only be part of a single loadbalanced Instance Group, these must be shared. There is 1 Ingress Instance Group per zone containing Kubernetes nodes.
- Health Checks: currently the health checks point at the NodePort of a Backend Service. They don't need to be shared, but they are since Backend Services are shared.
- Firewall rule: There is a single firewall rule that covers health check traffic from the range of GCE loadbalancer IPs to the entire NodePort range.
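Because Backend Services are shared per NodePort, the number you need follows the count of distinct Services referenced, not the number of Ingresses. The sketch below assumes you have already collected one NodePort per Ingress backend reference, one per line; `count_backend_services` is a hypothetical helper and the port numbers are made up.

```shell
# Count distinct NodePorts read one per line from stdin. Each distinct
# NodePort corresponds to one shared Backend Service.
count_backend_services() {
  sort -u | wc -l | tr -d ' '
}

# Three Ingress references, two of them to the same Service NodePort.
printf '%s\n' 30301 30301 30412 | count_backend_services   # prints 2
```

The default backend is not tied to a user specified NodePort Service, so allow for one more Backend Service on top of this count when comparing against quota.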
Unique:
Currently, a single Ingress on GCE creates a unique IP and URL Map. In this model the following resources cannot be shared:
- URL Map
- Target HTTP(S) Proxies
- SSL Certificates
- Static-ip
- Forwarding rules
The most likely cause of a controller spin loop is some form of GCE validation failure, eg:
- It's trying to delete a Backend Service already in use, say in a URL Map
- It's trying to add an Instance to more than 1 loadbalanced Instance Group
- It's trying to flip the loadbalancing algorithm on a Backend Service to RATE, when some other Backend Service is pointing at the same Instance Group and asking for UTILIZATION
In all such cases, the work queue will put a single key (ingress namespace/name) that's getting continuously re-queued into exponential backoff. However, currently the Informers that watch the Kubernetes API are set up to periodically resync, so even though a particular key is in backoff, we might end up syncing all other keys every, say, 10m, which might trigger the same validation-error-condition when syncing a shared resource.
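To make the backoff behaviour concrete, the sketch below computes an exponential backoff delay for a given retry. It is an illustration only: the base delay, the cap, and the helper name `backoff_delay` are assumptions and do not reflect the controller's actual work queue settings.

```shell
# backoff_delay ATTEMPT BASE CAP: delay in seconds before retry number
# ATTEMPT (starting at 0), doubling from BASE and never exceeding CAP.
backoff_delay() {
  delay=$2
  i=0
  while [ "$i" -lt "$1" ] && [ "$delay" -lt "$3" ]; do
    delay=$(( delay * 2 ))
    i=$(( i + 1 ))
  done
  if [ "$delay" -gt "$3" ]; then
    delay=$3
  fi
  echo "$delay"
}

backoff_delay 3 1 1000   # prints 8
```

However long the delay for the failing key grows, the periodic resync still processes every other key on its own schedule, which is why an error on a shared resource can keep reappearing in the events.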
How the GCE ingress controller Works
To assemble an L7 Load Balancer, the ingress controller creates an unmanaged instance-group named `k8s-ig--{UID}` and adds every known minion node to the group. For every service specified in all ingresses, a Backend Service is created to point to that instance group.
How the Internal Load Balancer Works
K8s does not yet assemble ILBs for you, but you can manually create one via the GCP Console. The ILB is composed of a regional forwarding rule and a regional Backend Service. Similar to the L7 LB, the Backend Service points to an unmanaged instance-group containing your K8s nodes.
The Complication
GCP will only allow one load balanced unmanaged instance-group for a given instance.
If you manually created an instance group named something like `my-kubernetes-group` containing all your nodes and put an ILB in front of it, then you will probably encounter a GCP error when setting up an ingress resource. The controller doesn't know to use your `my-kubernetes-group` group and will create its own. Unfortunately, it won't be able to add any nodes to that group because they already belong to the ILB group.
As mentioned before, the instance group name is composed of a hard-coded prefix `k8s-ig--` and a cluster-specific UID. The ingress controller will check the K8s configmap for an existing UID value at process start. If it doesn't exist, the controller will create one randomly and update the configmap.
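The start-up behaviour described above amounts to "reuse the stored UID, otherwise generate one". The following is a minimal sketch of that logic only. How the real controller generates its random UID is not stated here, so the 16 hex characters read from `/dev/urandom` are purely an assumption, and `resolve_uid` is a hypothetical name.

```shell
# resolve_uid STORED: print STORED if it is non-empty, otherwise print a
# freshly generated random UID (16 hex characters; the format is an assumption).
# Simplification: an empty stored value is treated the same as a missing one.
resolve_uid() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"
  else
    od -An -N8 -tx1 /dev/urandom | tr -d ' \n'
    echo
  fi
}
```

The instance group name would then be the `k8s-ig--` prefix followed by the value this prints.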
Want an ILB and Ingress?
If you plan on creating both ingresses and internal load balancers, simply create the ingress resource first then use the GCP Console to create an ILB pointing to the existing instance group.
Want just an ILB for now, ingress maybe later?
Retrieve the UID via configmap, create an instance-group per used zone, then add all respective nodes to the group.
# Fetch the instance group name from the configmap
GROUPNAME=$(kubectl get configmaps ingress-uid -o jsonpath='k8s-ig--{.data.uid}' --namespace=kube-system)
# Create an instance group in every zone where you have nodes. If you use GKE, this is probably a single zone.
gcloud compute instance-groups unmanaged create "$GROUPNAME" --zone {ZONE}
# List your nodes
kubectl get nodes
# Add minion nodes that exist in zone X to the instance group in zone X. (Do not add the master!)
gcloud compute instance-groups unmanaged add-instances "$GROUPNAME" --zone {ZONE} --instances=A,B,C...
You can now follow the GCP Console wizard for creating an internal load balancer and point to the `k8s-ig--{UID}` instance group.
Yes!
View the example.