Running an application in Kubernetes

To run an application in Kubernetes, we first need to package it up into one or more container images, push those images to an image registry, and then post a description of our app to the Kubernetes API server.

The description includes information such as the container image or images that contain our application components, how those components are related to each other, and which ones need to run co-located (together on the same node) and which don’t. For each component, we can also specify how many copies (or replicas) we want to run. The description also states which of those components provide a service to either internal or external clients and should therefore be exposed through a single IP address and made discoverable to the other components.
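
To make the idea of a description more concrete, here is a minimal sketch written as Python dictionaries and serialized to JSON, one of the formats the API server accepts. The field layout follows the standard Deployment and Service object kinds, but the names "web" and "log-agent", the image references under registry.example.com, and the port numbers are hypothetical placeholders, not values from any real application.

```python
import json

# Hypothetical application: a web server plus a co-located logging helper.
# Listing both containers in one pod template asks for them to run together
# on the same node.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 5,  # how many copies of the component we want
        "selector": {"matchLabels": {"app": "web"}},
        "template": {
            "metadata": {"labels": {"app": "web"}},
            "spec": {
                "containers": [
                    {
                        "name": "web",
                        "image": "registry.example.com/web:1.0",  # placeholder image
                        "ports": [{"containerPort": 8080}],
                    },
                    {
                        "name": "log-agent",
                        "image": "registry.example.com/log-agent:1.0",  # placeholder image
                    },
                ]
            },
        },
    },
}

# Exposes every replica labeled app=web behind a single, discoverable address.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "web"},
    "spec": {
        "selector": {"app": "web"},
        "ports": [{"port": 80, "targetPort": 8080}],
    },
}


def render(obj):
    """Serialize one object description to JSON text."""
    return json.dumps(obj, indent=2)


if __name__ == "__main__":
    print(render(deployment))
    print(render(service))
```

The Service finds its backing containers through the label selector, which is why the selector in the second object repeats the labels from the first object's template.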

UNDERSTANDING HOW THE DESCRIPTION RESULTS IN A RUNNING CONTAINER

When the API server processes our app’s description, the Scheduler schedules the specified groups of containers onto the available worker nodes based on the computational resources required by each group and the unallocated resources on each node.
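
The placement decision can be illustrated with a toy model. The sketch below is not the real Scheduler's algorithm; it only assumes that each group of containers requests some CPU and memory, and that a group may be placed on any node whose unallocated resources cover the request. The Node class, the schedule function, and the "most free CPU wins" tie-breaking rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_millicores: int        # total CPU capacity of the node
    memory_mib: int            # total memory capacity of the node
    cpu_allocated: int = 0     # CPU already promised to placed groups
    memory_allocated: int = 0  # memory already promised to placed groups

    def free_cpu(self):
        return self.cpu_millicores - self.cpu_allocated

    def free_memory(self):
        return self.memory_mib - self.memory_allocated

    def fits(self, cpu, memory):
        """True if the node's unallocated resources cover the request."""
        return self.free_cpu() >= cpu and self.free_memory() >= memory


def schedule(groups, nodes):
    """Place each (name, cpu, memory) group on a node, or None if nothing fits."""
    placement = {}
    for name, cpu, memory in groups:
        candidates = [n for n in nodes if n.fits(cpu, memory)]
        if not candidates:
            placement[name] = None  # no node has room; the group stays unplaced
            continue
        # Illustrative rule: prefer the node with the most unallocated CPU.
        best = max(candidates, key=Node.free_cpu)
        best.cpu_allocated += cpu
        best.memory_allocated += memory
        placement[name] = best.name
    return placement


if __name__ == "__main__":
    nodes = [Node("node-a", 2000, 4096), Node("node-b", 1000, 2048)]
    groups = [("web-1", 1500, 1024), ("web-2", 800, 1024), ("web-3", 800, 1024)]
    print(schedule(groups, nodes))
```

In this model a group that fits nowhere is simply reported as unplaced, which mirrors the idea that scheduling depends on both the request and what each node still has available.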

KEEPING THE CONTAINERS RUNNING

Once the application is running, Kubernetes continuously makes sure that the deployed state of the application matches the description we provided. For example, if we specify that we always want five instances of a web server running, Kubernetes will keep exactly five instances running. If one of those instances stops working properly, such as when its process crashes or stops responding, Kubernetes will restart it automatically. Similarly, if a whole worker node dies or becomes inaccessible, Kubernetes will select new nodes for all the containers that were running on that node and run them there.
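
The behavior described above amounts to comparing the desired state with the observed state and correcting any difference. The following sketch shows one pass of such a comparison under simplifying assumptions: instances are plain dictionaries, a restart is modeled by flipping a "healthy" flag, and replacement nodes are picked round-robin. The reconcile function and the data layout are hypothetical and do not reflect how Kubernetes is implemented internally.

```python
import itertools


def reconcile(desired, instances, reachable_nodes):
    """Return the instance list after one corrective pass.

    instances: list of dicts with keys "name", "node", "healthy".
    reachable_nodes: set of node names that are currently accessible.
    """
    if not reachable_nodes:
        raise ValueError("no reachable nodes to run instances on")
    pick_node = itertools.cycle(sorted(reachable_nodes))

    result = []
    for inst in instances:
        if inst["node"] not in reachable_nodes:
            # The node died or became inaccessible: run the instance elsewhere.
            result.append({"name": inst["name"], "node": next(pick_node), "healthy": True})
        elif not inst["healthy"]:
            # The process crashed or stopped responding: restart it in place.
            result.append({**inst, "healthy": True})
        else:
            result.append(dict(inst))

    # Too few instances: start new ones under names that are not yet in use.
    used = {inst["name"] for inst in result}
    counter = 0
    while len(result) < desired:
        name = f"web-{counter}"
        counter += 1
        if name in used:
            continue
        result.append({"name": name, "node": next(pick_node), "healthy": True})

    # Too many instances: stop the excess ones.
    del result[desired:]
    return result


if __name__ == "__main__":
    observed = [
        {"name": "web-0", "node": "node-a", "healthy": True},
        {"name": "web-1", "node": "node-a", "healthy": False},
        {"name": "web-2", "node": "node-b", "healthy": True},
    ]
    for inst in reconcile(5, observed, {"node-a", "node-c"}):
        print(inst)
```

Running the same function repeatedly against fresh observations is what turns a one-off correction into a loop that keeps the deployed state in line with the description.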

SCALING THE NUMBER OF COPIES

While the application is running, we can decide to increase or decrease the number of copies, and Kubernetes will spin up additional ones or stop the excess ones, respectively. We can even leave the job of deciding the optimal number of copies to Kubernetes. It can automatically keep adjusting the number based on real-time metrics, such as CPU load, memory consumption, queries per second, or any other metric our app exposes.
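
As a rough illustration of metric-based scaling, the sketch below applies the proportional rule documented for the Horizontal Pod Autoscaler: the desired count is the current count multiplied by the ratio of the observed metric to its target, rounded up. The function name, the default bounds of 1 and 10, and the sample numbers are assumptions for this example; the real autoscaler also applies tolerances and stabilization delays that are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Return the replica count suggested by the proportional scaling rule.

    current_metric and target_metric must use the same unit, for example
    average CPU utilization in percent or queries per second per replica.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds (hypothetical defaults in this sketch).
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Five replicas at 90% average CPU with a 60% target: scale up.
    print(desired_replicas(5, 90, 60))
    # Five replicas at 30% average CPU with a 60% target: scale down.
    print(desired_replicas(5, 30, 60))
```

Rounding up means the rule errs on the side of running one copy too many rather than leaving the application short of capacity.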
