Learn Kubernetes - from the beginning, part II, Pods, Nodes and Services

Follow me on Twitter, happy to take your suggestions on topics or improvements /Chris

This second part aims to give additional context to Nodes, Pods and introduce the concept Service. There is some theory and some practice with Kubectl.

Welcome back to this second part. By now concepts such kubectl, nodes and pods shouldn't be completely new to you but we will mention a little bit more what they are and of course, we will keep on learning about Kubernetes using our tool kubectl. We will also introduce the new concept of Services and how they are a better choice than a proxy to expose your apps.

This is part of a series of articles on Kubernetes:

In this part we will cover the following:

  • Deepen our knowledge on Pods and Nodes
  • Introduce Services and labeling
  • Perform an exercise, involving setting labels on Pods and use labels to query our artifacts

Resources

Concepts revisited

When we create a Deployment on Kubernetes, that Deployment creates Pods with containers inside them. So Pods are tied to Nodes and will continue to exist until terminated or deleted. Let's try to educate ourselves a bit more on Pods, Nodes and let's also introduce a new topic Services.

Pods

Pods are the atomic unit on the Kubernetes platform, i.e smallest possible deployable unit

We've stated the above before but it's worth mentioning again.

What else is there to know?

A Pod is an abstraction that represents a group of one or more containers, for example, Docker or rkt, and some shared resources for those containers. Those resources include:

  • Shared storage, as Volumes
  • Networking, as a unique cluster IP address
  • Information about how to run each container, such as the container image version or specific ports to use

A Pod can have more than one container. If it does contain more than one container it is so the other containers can support the primary application. Typical examples of helper applications are data pullers, data pushers, and proxies. You can read more on that use case here

  1. The containers in a Pod share an IP Address and port space and are:
  2. Always co-located
  3. Co-scheduled

Let me show you an image to make it easier to visualize:

As we can see above a Pod can have a lot of different artifacts in them that are able to communicate and support the app in some way.

Nodes

A Pod always runs on a Node

So Node is the Pods parent?

Yes.

A Node is a worker machine and may be either a virtual or a physical machine, depending on the cluster

Each Node is managed by the Master. A Node can have multiple pods.

So it's a one to many relationship

The Kubernetes master automatically handles scheduling the pods across the Nodes in the cluster

Every Kubernetes Node runs at least a:

  • Kubelet, is responsible for the pod spec and talks to the cri interface

  • Kube proxy, is the main interface for coms between nodes

  • A container runtime, (like Docker, rkt) responsible for pulling the container image from a registry, unpacking the container, and running the application.

Ok so a Node contains a Kubelet and container runtime and one to many Pods. I think I got it.

Let's show an image to make this info stick, cause it's quite important that we know what goes on, at least at a high level:

Services

Pods are mortal, they can die. Pods, in fact, have a lifecycle.

When a worker node dies, the Pods running on the Node are also lost.

What happens to our apps? 😦

You might think them and their data are lost but not so. The whole point with Kubernetes is to not let that happen. We normally deploy something like a ReplicaSet.

A ReplicaSet, what do you mean?

A ReplicaSet is a high-level artifact that can drive the cluster back to desired state via the creation of new Pods to keep your application running.

Ok so if a Pod goes down the ReplicaSet just creates a new Pod/s in its place?

Yes, exactly that. If you focus on defining a desired state the rest is up to Kubernetes.

Phew sounds really great then.

This concept of desired state is a very important one. You need to specify how many containers you want of each kind, at all times.

Oh so 4 database containers, 3 services etc?

Yes exactly.

So you don't have to care about the details just tell Kubernetes what state you want and it does the rest. If something goes up, Kubernetes ensures it comes back up again to desired state.

Each Pod in a Kubernetes cluster has a unique IP address, even Pods on the same Node, so there needs to be a way of automatically reconciling changes among Pods so that your applications continue to function.

Ok?

Yea, think like this. If a Pod containing your app goes down and another Pod is created in its place, running your app. Users should still be able to use your app after that.

Ok I got it. Makes me think...

The motivation for a Service

You should never refer to a Pod by it's IP address, just think what happens when a Pod goes down and comes back up again but this time with a different IP. It is for that reason a Service exists.

A Service in Kubernetes is an abstraction which defines a logical set of Pods and a policy by which to access them.

Makes me think of a routers and subnets

Yea I guess you can say there is a resemblance in there somewhere.

Services enable a loose coupling between dependent Pods and are defined using YAML or JSON file, just like all Kubernetes objects.

That's handy, just JSON and YAML 😃

Services and Labels

The set of Pods targeted by a Service is usually determined by a LabelSelector.

Although each Pod has a unique IP address, those IPs are not exposed outside the cluster without a Service. We can expose them through a proxy though as we showed in part I.

Wait, go back a second here, you said LabelSelector. I wasn't quite following?

Remember how we couldn't refer to Pods by IP, cause Pods might go down and a new Pod could come back in its place?

Yes

Well, labels are the answer to how Services and Pods are able to communicate. This is what we mean by loose coupling. By applying labels like for example frontend, backend, release and so on to Pods, we are able to refer to Pods by their logical name rather than their specifics, i.e IP number.

Oh I get it, so it's a high-level domain language

Mm, kind of.

Services and Traffic

Services allow your applications to receive traffic.

Services can be exposed in different ways by specifying a type in ServiceSpec, service specification.

  • ClusterIP (default), Exposes the Service on an internal IP in the cluster. This type makes the Service only reachable from within the cluster.
  • NodePort, Exposes the Service on the same port of each selected Node in the cluster using NAT. Makes a Service accessible from outside the cluster using <NodeIP>:<NodePort>. Superset of ClusterIP.
  • LoadBalancer, Creates an external load balancer in the current cloud (if supported) and assigns a fixed, external IP to the Service. Superset of NodePort.
  • ExternalName, Exposes the Service using an arbitrary name (specified by externalName in the spec) by returning a CNAME record with the name. No proxy is used. This type requires v1.7 or higher of kube-dns.

Ok I think I get it. Ensure I'm speaking externally to a Service instead of specific Pods. Depending on what I expose the Service as, that leads to different behavior?

Yea that's correct.

You said something about labels though, how do we create and apply them to Pods?

Yea lets talk about that next.

Labels

As we just mentioned, Services are the abstraction that allows pods to die and replicate in Kubernetes without impacting your application.

Now, Services match a set of Pods using labels and selectors, it allows us to operate on Pods like a group.

Labels are key/value pairs attached to objects and can be used in any number of ways:

  • Designate objects for development, test, and production
  • Embed version tags
  • Classify an object using tags

Labels can be attached to objects at creation time or later on. They can be modified at any time.

Lab - Fun with Labels and kubectl

It's a good idea to have read the first part of this series where we create a deployment. If you haven't you need to first create a deployment like so:

kubectl run kubernetes-first-app --image=gcr.io/google-samples/kubernetes-bootcamp:v1 --port=8080
1

Now we should be good to go.

Ok. I know you are probably all tired from all theory by now.

I bet you are just itching to learn more hands on Kubernetes with kubectl.

Well, the time for that has come 😃. We will do two things:

  1. Create a Service and learn how we can expose our app using said Service
  2. Learn about Labeling and how we can improve our querying game by having appropriate labels on our artifacts.

Let's create a new service.

We will get acquainted with the expose command.

Let's check for existing pods,

kubectl get pods
1

Next let's see what services we have:

kubectl get services
1

Next lets create a Service like so:

kubectl expose deployment/kubernetes-first-app --type="NodePort" --port 8080
1

As you can see above we are just targeting one of our deployments kubernetes-first-app and referring to it with [type]/[deployment name] and type being deployment.

We expose it as service of type NodePort and finally, we choose to expose it at port 8080.

Now run kubectl get services again and see the results:

As you can see we now have two services in use, our basic kubernetes service and our newly created kubernetes-first-app.

Next up we need to grab the port of our service and assign that to a variable:

export NODE_PORT=$(kubectl get services/kubernetes-first-app -o go-template='{{(index .spec.ports 0).nodePort}}')
echo NODE_PORT=$NODE_PORT
1
2

We now have a our port stored on environment variable NODE_PORT and we are ready to start communicating with our service like so:

curl $(minikube ip):$NODE_PORT
1

Which leads to the following output:

Creating and applying Labels

When we created our deployment and our Pod, it was automatically assigned with a label.

By typing

kubectl describe deployment
1

we can see the name of said label.

Next up we can query the pods by that same label

kubectl get pods -l run=kubernetes-first-app
1

Above we are using -l to query for a specific label and kubernetes-bootcamp as the name of the label. This gives us the following result:

You can do a similar query to your services:

kubectl get services -l run=kubernetes-first-app
1

That just shows that you can query on different levels, for specific Pods or Services that have Pods with that label.

Next up we will look at how to change the label

First let's get the name of the pod, like so:

POD_NAME=kubernetes-first-app-669789f4f8-6glpx
1

Above I'm just assigning what my Pod is called to a variable POD_NAME. Check with a kubectl getpods what your Pod is called.

Then we can add/apply the new label like so:

kubectl label pod $POD_NAME app=v1
1

Verify that the new label have been set, like so:

kubectl describe pod
1

or

kubectl describe pods $POD_NAME
1

As you can see from the result our new label app=v1 has been appended to existing labels.

Now we can query like so:

kubectl get pods -l app=v1
1

That's pretty much how labeling works, how to get available labels, apply them and use them in a query. Ensure to give them a descriptive name like an app version, a certain environment or a name like frontend or backend, something that makes sense to your situation.

Clean up

Ok, so we created a service. We should learn how to clean up after ourselves. Run the following command to remove our service:

kubectl delete service -l run=kubernetes-bootcamp
1

Verify the service is no longer there with:

kubectl get services
1

also, ensure our exposed IP and port can no longer be reached:

curl $(minikube ip):$NODE_PORT
1

Just because the service is gone doesn't mean the app is gone. The app should still be reachable on:

kubectl exec -ti $POD_NAME curl localhost:8080
1

Summary

So what did we learn? We learned a bit more on Pods and Nodes. Furthermore, we learned that we shouldn't speak directly to Pods but rather use a high-level abstraction such as Services. Services use labels as a way to define a domain language and apply those to different Pods.

Ok, so we understand a bit more on Kubernetes and how different concepts relate. We mentioned something called desired state a number of times but we didn't go into detail on how to set such a state. That's our next part in this series where we will cover how to set the desired state and how Kubernetes maintains it, so stay tuned.

Click here to go to the next part on Scaling.