
News

Posted over 6 years ago by Kirill
In a previous tutorial, you learned how to configure and deploy Prometheus to monitor your Kubernetes applications. Configuring Prometheus is not a trivial task, because you need domain-specific knowledge, including the Prometheus configuration format and Kubernetes auto-discovery settings. Acquiring this knowledge takes time and effort. However, as we show in this tutorial, you can dramatically simplify the deployment and management of your Prometheus instances with the Prometheus Operator developed by CoreOS. We discuss how the Prometheus Operator can benefit your monitoring pipeline, and then we walk you through setting up a working Prometheus Operator to collect Prometheus-format metrics from your applications. Let's get started!

What are Operators?

The concept of software operators was introduced by CoreOS back in 2016. In a nutshell, an operator is an application-specific or domain-specific controller that extends the Kubernetes API to simplify deployment, configuration, and management of complex stateful applications on behalf of Kubernetes users. Under the hood, operators abstract basic Kubernetes APIs and controllers and automate common tasks for specific applications (e.g., Prometheus). Thanks to this abstraction, users can easily configure complex applications even with little knowledge of their domain-specific configuration or language. In addition, operators are useful for a broad array of other tasks, including safe coordination of app upgrades, service discovery, TLS certificate configuration, disaster recovery, and backup management.

Prometheus Operator

Building on the definition above, the Prometheus Operator is a piece of software on top of Kubernetes that enables simpler management of Prometheus instances, including configuration and service discovery. It lets you easily launch multiple instances of Prometheus, pin Prometheus versions, and manage retention policies, persistence, and replicas. In addition, the Prometheus Operator can automatically generate monitoring target settings based on Kubernetes label queries. Users can simply refer to the services and pods they want to monitor in the Prometheus Operator's manifests, and the Operator takes care of inserting the appropriate Prometheus configuration for Kubernetes auto-discovery.

To implement this functionality, the Prometheus Operator introduces additional resources and abstractions designed as Custom Resource Definitions (CRDs). These include:

- The Prometheus resource, which describes the desired state of a Prometheus deployment.
- ServiceMonitors, which describe and manage the monitoring targets to be scraped by Prometheus. The Prometheus resource connects to ServiceMonitors via the serviceMonitorSelector field; this is how Prometheus learns which targets (apps) have to be scraped.
- The Alertmanager resource, which defines, configures, and manages the Prometheus Alertmanager.

In this article, we explore only the Prometheus resource and ServiceMonitors -- the minimum needed to configure the Prometheus Operator to monitor your Kubernetes cluster.

To complete the examples below, you'll need the following prerequisites:

- A running Kubernetes cluster. See the Supergiant documentation for more information about deploying a Kubernetes cluster with Supergiant. As an alternative, you can install a single-node Kubernetes cluster on a local system using Minikube.
- The kubectl command line tool installed and configured to communicate with the cluster. See how to install kubectl here. (A quick connectivity check is sketched just below.)
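Before moving on, it may be worth a quick sanity check that kubectl can actually reach the cluster. These are standard kubectl commands; the node name and version in the sample output are illustrative values from a Minikube setup and will differ in your environment:

kubectl cluster-info
kubectl get nodes
NAME       STATUS    ROLES     AGE       VERSION
minikube   Ready     master    1d        v1.10.0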
With this environment set, we are going to monitor a simple web application exporting Prometheus-format metrics. Let's get started!

Step 1: Create a Prometheus Operator

A Prometheus Operator has to access the Kubernetes API, nodes, and cluster components, so we should grant it some permissions. We can do this via the ClusterRole resource that defines an RBAC policy. The ClusterRole contains rules that represent a set of permissions. These permissions are additive, so we should list them all. We will be using the ClusterRole resource, which can grant permissions on resources across the entire cluster, as opposed to a Role, which is namespace-scoped.

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus-operator
rules:
- apiGroups:
  - extensions
  resources:
  - thirdpartyresources
  verbs:
  - "*"
- apiGroups:
  - apiextensions.k8s.io
  resources:
  - customresourcedefinitions
  verbs:
  - "*"
- apiGroups:
  - monitoring.coreos.com
  resources:
  - alertmanagers
  - prometheuses
  - prometheuses/finalizers
  - servicemonitors
  verbs:
  - "*"
- apiGroups:
  - apps
  resources:
  - statefulsets
  verbs: ["*"]
- apiGroups: [""]
  resources:
  - configmaps
  - secrets
  verbs: ["*"]
- apiGroups: [""]
  resources:
  - pods
  verbs: ["list", "delete"]
- apiGroups: [""]
  resources:
  - services
  - endpoints
  verbs: ["get", "create", "update"]
- apiGroups: [""]
  resources:
  - nodes
  verbs: ["list", "watch"]
- apiGroups: [""]
  resources:
  - namespaces
  verbs: ["list"]

The above manifest grants the Prometheus Operator the following cluster-wide permissions:

- read access to pods, nodes, and namespaces.
- read/write access to services and their endpoints.
- full access to Secrets, ConfigMaps, StatefulSets, Prometheus-related resources (Alertmanagers, ServiceMonitors, etc.), and other third-party resources.

Next, we need to provide an identity for our Prometheus Operator. This can be done with a service account.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-operator

Now that we have a ClusterRole and a ServiceAccount, we need to bind the list of permissions defined in the ClusterRole to the Prometheus Operator. A ClusterRoleBinding associates a list of users, groups, or service accounts with a specific role. We are going to bind our ClusterRole to the Prometheus Operator's ServiceAccount.

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-operator
subjects:
- kind: ServiceAccount
  name: prometheus-operator
  namespace: default

Note that roleRef.name should match the name of the ClusterRole created in the first step, and subjects.name should match the name of the ServiceAccount created in the second step. We are going to create these resources in bulk, so put the above manifests into one file (e.g., authorize.yml), separating the manifests with a --- delimiter. Then run:

kubectl create -f authorize.yml
clusterrolebinding.rbac.authorization.k8s.io "prometheus-operator" created
clusterrole.rbac.authorization.k8s.io "prometheus-operator" created
serviceaccount "prometheus-operator" created

Great! Now we have all permissions required by the Prometheus Operator to manage Prometheus instances and monitor applications.
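If you'd like to confirm that all three RBAC objects were actually created before moving on, you can list them by name (these are standard kubectl commands; only the AGE values will differ in your cluster):

kubectl get clusterrole prometheus-operator
kubectl get clusterrolebinding prometheus-operator
kubectl get serviceaccount prometheus-operator

Each command should return a single line with the resource's name; a "NotFound" error means the corresponding manifest in authorize.yml was not applied.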
Let's create a one-replica Deployment for the Prometheus Operator:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    k8s-app: prometheus-operator
  name: prometheus-operator
spec:
  replicas: 1
  template:
    metadata:
      labels:
        k8s-app: prometheus-operator
    spec:
      containers:
      - args:
        - --kubelet-service=kube-system/kubelet
        - --config-reloader-image=quay.io/coreos/configmap-reload:v0.0.1
        image: quay.io/coreos/prometheus-operator:v0.17.0
        name: prometheus-operator
        ports:
        - containerPort: 8080
          name: http
        resources:
          limits:
            cpu: 300m
            memory: 200Mi
          requests:
            cpu: 200m
            memory: 70Mi
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccountName: prometheus-operator

There are a few important things this manifest does:

- Defines several arguments for the prometheus-operator container to run with. In particular, we load the configmap-reload image so that Prometheus ConfigMaps can be updated dynamically, and we specify kube-system/kubelet in the --kubelet-service flag.
- Runs the Prometheus Operator as a non-root user with the user ID 65534.
- Associates the Deployment with the service account created in the step above.

Now, let's save this spec in prometheus-deployment.yml and create the Deployment:

kubectl create -f prometheus-deployment.yml
deployment.extensions "prometheus-operator" created

Verify that the Deployment's pods are running:

kubectl get pods
NAME                                   READY     STATUS    RESTARTS   AGE
prometheus-operator-77648fb66c-skjqp   1/1       Running   0          1m

Step 2: Deploy the App Shipping Prometheus-format Metrics

At this point, the Prometheus Operator has no apps to monitor. Thus, before defining ServiceMonitors and the Prometheus custom resource, we need to deploy an app shipping Prometheus-format metrics. For this purpose, we used an example application from the Go client library that exports fictional RPC latencies of some service. To deploy the application in the Kubernetes cluster, we containerized it with Docker and pushed it to the Docker Hub repository.

Let's deploy this example app serving metrics at the /metrics endpoint, which Prometheus scrapes by default. Below is the deployment manifest we used:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rpc-app-deployment
spec:
  selector:
    matchLabels:
      app: rpc-app
  replicas: 2
  template:
    metadata:
      labels:
        app: rpc-app
    spec:
      containers:
      - name: rpc-app-cont
        image: supergiantkir/prometheus-test-app
        ports:
        - name: web
          containerPort: 8081

Please note the containerPort 8081, which is the port defined in the application code. Save this manifest in rpc-app-deployment.yml and create the deployment:

kubectl create -f rpc-app-deployment.yml
deployment.apps "rpc-app-deployment" created

Let's verify that our deployment successfully launched two pod replicas of our app:

kubectl get pods -l app=rpc-app
NAME                                  READY     STATUS    RESTARTS   AGE
rpc-app-deployment-698bd8658d-glj6f   1/1       Running   0          1m
rpc-app-deployment-698bd8658d-xsdd4   1/1       Running   0          1m

To let the Prometheus Operator access this deployment, we need to expose a service. This service can then be discovered by a ServiceMonitor using label selectors. We need a service that selects pods by their app label with the value rpc-app. Let's take a look at this service manifest:

apiVersion: v1
kind: Service
metadata:
  name: rpc-app-service
  labels:
    app: rpc-app
spec:
  ports:
  - name: web
    port: 8081
    targetPort: 8081
    protocol: TCP
  selector:
    app: rpc-app

Also, notice that we specified a targetPort for this service, which refers to the port on the backend pods of the service.
If the targetPort value is not specified, Kubernetes automatically sets it to the value of containerPort, but we included the field explicitly to highlight its importance. Let's save the spec above in some file (e.g., rpc-app-service.yml) and create the service:

kubectl create -f rpc-app-service.yml
service "rpc-app-service" created

You can now verify that the service successfully discovered the deployment's endpoints and configured the right ports:

kubectl describe svc rpc-app-service
Name:              rpc-app-service
Namespace:         default
Labels:            app=rpc-app
Annotations:
Selector:          app=rpc-app
Type:              ClusterIP
IP:                10.105.163.103
Port:              web  8081/TCP
TargetPort:        8081/TCP
Endpoints:         172.17.0.7:8081,172.17.0.8:8081
Session Affinity:  None
Events:

Step 3: Create a ServiceMonitor

The Prometheus Operator uses ServiceMonitors to auto-detect target pods based on label selectors and associate them with Prometheus instances. Let's take a look at the manifest below:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: rpc-app
  labels:
    env: production
spec:
  selector:
    matchLabels:
      app: rpc-app
  endpoints:
  - port: web

The ServiceMonitor defined above selects pods labeled app: rpc-app using the spec.selector.matchLabels field. Note that spec.selector.matchLabels must match the app: rpc-app label so that the ServiceMonitor finds the corresponding endpoints of the deployment. Also, we defined the env: production label on the ServiceMonitor itself; this label will be used by the Prometheus resource to find the ServiceMonitor. Finally, because we deployed our rpc-app-cont container with the named port "web," we can simply refer to that name in the ServiceMonitor without specifying the port number. This lets us change the port number later without affecting other resources. Let's create the ServiceMonitor:

kubectl create -f service-monitor.yml
servicemonitor.monitoring.coreos.com "rpc-app" created

Step 4: Create a Prometheus Resource

The next step is to create a Prometheus resource. Its manifest defines the serviceMonitorSelector that associates ServiceMonitors with the Prometheus instance. The value of this field should match the env: production label specified in the ServiceMonitor manifest above. Using ServiceMonitor labels makes it easy to dynamically reconfigure Prometheus.

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      env: production
  resources:
    requests:
      memory: 400Mi

Also, notice the serviceAccountName field: the Prometheus pods themselves need a service account (here named prometheus) with sufficient permissions; without it, Prometheus won't be permitted to access cluster resources and APIs. This tiny detail was addressed in issue #1272 on GitHub. If RBAC authorization is enabled in your cluster, you must create RBAC rules (including the prometheus service account referenced here) for both Prometheus and the Prometheus Operator. Refer to the chapter "Enable RBAC rules for Prometheus Pods" of the official CoreOS documentation to find the required RBAC resource definitions.

Now, let's save this manifest in prometheus-resource.yml and create the Prometheus resource:

kubectl create -f prometheus-resource.yml
prometheus.monitoring.coreos.com "prometheus" created

Finally, we need to create a Prometheus service of the NodePort type to expose Prometheus to the external world, so that we can access the Prometheus web interface.
apiVersion: v1
kind: Service
metadata:
  name: prometheus
spec:
  type: NodePort
  ports:
  - name: web
    nodePort: 30900
    port: 9090
    protocol: TCP
    targetPort: web
  selector:
    prometheus: prometheus

Save this spec in prometheus-service.yml and create the service:

kubectl create -f prometheus-service.yml
service "prometheus" created

You can now access the Prometheus dashboard from your browser. If you are running your cluster with Minikube, you can find the Prometheus IP and port with the following command:

minikube service prometheus --url
http://192.168.99.100:30900

You can then access the Prometheus dashboard in your browser by entering this address. If you go to the /targets endpoint, you'll see the list of current Prometheus targets. Each deployment replica is treated as a separate target, so you'll see two targets in your dashboard. You can also find the targets' labels and the time of the last scrape.

The Prometheus Operator automatically created a working Prometheus configuration with kubernetes_sd_configs for the auto-discovery of Kubernetes service endpoints. This is a really cool feature, because it frees you from having to learn the Prometheus-specific configuration language. You can see the automatically generated Prometheus configuration under the Status -> Configuration tab.

Finally, we can visualize the RPC time series generated by our example app. To do this, go to the Graph tab, where you can select the metrics to visualize. In the example above, we visualized the rpc_durations_histogram_seconds metrics. As you see, we used the "stacked" option for the time series visualization, but you can of course opt for simple lines. You can play around with other RPC metrics and native Prometheus metrics as well. The web interface also supports the Prometheus query language, PromQL, to select and aggregate the metrics you need. PromQL has rich functional semantics that allow you to work with time series instant and range vectors, scalars, and strings. To learn more about PromQL, check out the official documentation.

Conclusion

As you've now learned, the Prometheus Operator for Kubernetes offers useful abstractions for configuring and managing your Prometheus monitoring pipeline. Using the operator means you no longer need to manually configure Kubernetes auto-discovery settings, which involves a steep learning curve. All you need to define is a ServiceMonitor with a list of pods from which to scrape metrics, and a Prometheus resource that automates configuration and links ServiceMonitors to running Prometheus instances. Along with these features, the Prometheus Operator supports fast configuration of Prometheus Alertmanagers (a minimal example is sketched right after this post). All these features dramatically simplify the management of your Prometheus monitoring pipeline while retaining flexibility and control where needed.
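As a pointer for further exploration, here is a minimal sketch of what an Alertmanager resource managed by the operator might look like. Alerting is not set up in this post, so the resource name and replica count below are purely illustrative; only the apiVersion and kind come from the monitoring.coreos.com/v1 API already used above:

apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: example-alertmanager   # illustrative name
spec:
  replicas: 1                  # illustrative; use 3+ replicas for high availability

Note that a working Alertmanager also needs its own configuration, which the operator expects to be supplied separately (by convention, in a Secret named after the Alertmanager resource); see the official Prometheus Operator documentation for the details.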
Posted over 6 years ago by Kirill
In the first part of the Kubernetes monitoring series, we discussed how the Kubernetes monitoring architecture is divided into the core metrics pipeline for system components and the monitoring pipeline based on the Custom Metrics API. Full monitoring pipelines based on the Custom Metrics API can process diverse types of metrics (both core and non-core), which makes them a good fit for monitoring both cluster components and user applications running in your cluster(s).

Plenty of solutions exist for monitoring your Kubernetes clusters. Some of the most popular are Heapster, Prometheus, and a number of proprietary Application Performance Management (APM) vendors like Sysdig, Datadog, or Dynatrace. In this article, we discuss Prometheus because it is open source software with native support for Kubernetes. Monitoring Kubernetes clusters with Prometheus is a natural choice because many Kubernetes components ship Prometheus-format metrics by default and, therefore, can be easily discovered by Prometheus. In this post, we'll overview the Prometheus architecture and walk you through configuring and deploying it to monitor an example application shipping Prometheus-format metrics. Let's get started!

What Is Prometheus?

Prometheus is an open source monitoring and alerting toolkit originally developed at SoundCloud in 2012, and the platform has since attracted a vibrant developer and user community. Prometheus is now closely integrated into the cloud-native ecosystem and has native support for containers and Kubernetes. When you deploy Prometheus in production, you get the following features and benefits:

A multi-dimensional data model. Prometheus stores all data as time series identified by a metric name and key/value label pairs. The data format looks like this:

<metric name>{<label name>=<label value>, ...}

For example, using this format we can represent the total number of HTTP POST requests to the /messages endpoint like this:

api_http_requests_total{method="POST", handler="/messages"}

This approach resembles the way Kubernetes organizes data with labels. The Prometheus data model facilitates flexible and accurate time series and is great if your data is highly dimensional.

A Flexible Query Language. Prometheus ships with PromQL, a functional query language that leverages the high dimensionality of the data. It allows users to select, query, and aggregate metrics collected by Prometheus, preparing them for subsequent analysis and visualization. PromQL is powerful in dealing with time series due to its native support for complex data types such as instant vectors and range vectors, as well as simple scalar and string data types.

An Efficient Pull Model for Metrics Collection. Prometheus collects metrics via a pull model over HTTP. This approach makes shipping application metrics to Prometheus very simple. In particular, you don't need to push metrics to Prometheus explicitly. All you need to do is expose a web port in your application and design a REST API endpoint that exposes the Prometheus-format metrics. If your application does not emit Prometheus-format metrics, there are several metrics exporters that will help you convert them to the native Prometheus format. Once the /metrics endpoint is created, Prometheus will use its powerful auto-discovery plugins to collect, filter, and aggregate the metrics. Prometheus has good support for a number of metrics providers including Kubernetes, OpenStack, GCE, AWS EC2, Zookeeper Serverset, and more.

A Developed Ecosystem.
Prometheus has a developed ecosystem of components and tools, including various client libraries for instrumenting application code, special-purpose exporters to convert data into the Prometheus format, Alertmanager, a web UI, and more.

Efficient auto-discovery and excellent support for containers and Kubernetes make Prometheus a perfect choice for monitoring Kubernetes applications and cluster components. For this tutorial, we will monitor a simple web application exporting Prometheus-format metrics. We used an example application from the Go client library that exports fictional RPC latencies of some service. To deploy the application in the Kubernetes cluster, we containerized it using Docker and pushed it to the Docker Hub repository.

To complete the examples below, you'll need the following prerequisites:

- A running Kubernetes cluster. See the Supergiant documentation for more information about deploying a Kubernetes cluster with Supergiant. As an alternative, you can install a single-node Kubernetes cluster on a local system using Minikube.
- The kubectl command line tool installed and configured to communicate with the cluster. See how to install kubectl here.

Step 1: Enabling RBAC for Prometheus

We need to grant Prometheus some permissions to access pods, endpoints, and services running in your cluster, and we can do this via the ClusterRole resource that defines an RBAC policy. In the ClusterRole manifest, we list the permissions Prometheus needs on various cluster resources. Let's look at the manifest below:

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]

The above manifest grants Prometheus the following cluster-wide permissions:

- read and watch access to pods, nodes, services, and endpoints.
- read access to ConfigMaps.
- read access to non-resource URLs such as the /metrics URLs shipping Prometheus-format metrics.

In addition to the ClusterRole, we need to create a ServiceAccount for Prometheus to represent its identity in the cluster.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus

Finally, we need to bind the ServiceAccount and the ClusterRole using a ClusterRoleBinding resource. A ClusterRoleBinding associates a list of users, groups, or service accounts with a specific role.

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default

Note that roleRef.name should match the name of the ClusterRole created in the first step, and subjects.name should match the name of the ServiceAccount created in the second step. We are going to create these resources in bulk, so put the above manifests into one file (e.g., rbac.yml), separating the manifests with a --- delimiter. Then run:

kubectl create -f rbac.yml
clusterrolebinding.rbac.authorization.k8s.io "prometheus" created
clusterrole.rbac.authorization.k8s.io "prometheus" created
serviceaccount "prometheus" created

Step 2: Deploy Prometheus

The next step is configuring Prometheus. The configuration will contain a list of scrape targets and Kubernetes auto-discovery settings that allow Prometheus to automatically detect applications that ship metrics.
global:
  scrape_interval: 15s # By default, scrape targets every 15 seconds.
  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# Scraping Prometheus itself
scrape_configs:
- job_name: 'prometheus'
  scrape_interval: 5s
  static_configs:
  - targets: ['localhost:9090']

- job_name: 'kubernetes-service-endpoints'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    target_label: kubernetes_name

As you see, the configuration contains two main sections: the global configuration and the scrape configuration. The global section includes parameters that are valid in all configuration contexts. In this section, we define a 15-second scrape interval and external labels. In turn, the scrape_configs section defines the jobs/targets for Prometheus to watch. Here, you can override global values such as the scrape interval. In each job section, you can also provide a target endpoint for Prometheus to listen to. As you understand, Kubernetes services and deployments are dynamic, so we can't know their URLs before running them. Fortunately, Prometheus auto-discovery features can address this problem. Prometheus ships with the Kubernetes auto-discovery plugin named kubernetes_sd_configs, which we use in the second job definition. We set kubernetes_sd_configs to watch only service endpoints shipping Prometheus-format metrics. We have also included some relabeling rules for replacing lengthy Kubernetes names and labels with custom values to simplify monitoring. For this tutorial, we targeted only service endpoints, but you can configure kubernetes_sd_configs to watch nodes, pods, and any other resource in your Kubernetes cluster.

So far, we've mentioned just a few configuration parameters supported by Prometheus. You may also be interested in some others, such as:

- scrape_timeout -- how long it takes until a scrape request times out.
- basic_auth for setting the "Authorization" header for each scrape request.
- service-specific auto-discovery configurations for Consul, Amazon EC2, GCE, etc.

For a full list of available configuration options, see the official Prometheus documentation. Let's save the configuration above in the prometheus.yml file and create a ConfigMap with the following command:

kubectl create configmap prometheus-config --from-file prometheus.yml
configmap "prometheus-config" created

Next, we will deploy Prometheus using the container image from the Docker Hub repository. Our deployment manifest looks like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
      - name: prometheus-cont
        image: prom/prometheus
        volumeMounts:
        - name: config-volume
          mountPath: /etc/prometheus/prometheus.yml
          subPath: prometheus.yml
        ports:
        - containerPort: 9090
      volumes:
      - name: config-volume
        configMap:
          name: prometheus-config
      serviceAccountName: prometheus

To summarize what this manifest does:

- Launches two Prometheus replicas listening on port 9090.
- Mounts the ConfigMap created previously at the default Prometheus config path of /etc/prometheus/prometheus.yml.
- Associates the Prometheus ServiceAccount with the deployment to grant the needed permissions.

Let's save the manifest in prometheus-deployment.yaml and create the deployment:

kubectl create -f prometheus-deployment.yaml
deployment.extensions "prometheus-deployment" created

To access the Prometheus web interface, we also need to expose the deployment as a service. We used the NodePort service type:

kind: Service
apiVersion: v1
metadata:
  name: prometheus-service
spec:
  selector:
    app: prometheus
  ports:
  - name: promui
    nodePort: 30900
    protocol: TCP
    port: 9090
    targetPort: 9090
  type: NodePort

Let's create the service, saving the manifest in prometheus-service.yaml and running the command below:

kubectl create -f prometheus-service.yaml
service "prometheus-service" created

Alternatively, you can expose the deployment from your terminal. This way, you don't need to define a Service manifest:

kubectl expose deployment prometheus-deployment --type=NodePort --name=prometheus-service
service "prometheus-service" exposed

Once the deployment is exposed, you can access the Prometheus web interface. If you are using Minikube, you can find the Prometheus UI URL by running minikube service with the --url flag:

minikube service prometheus-service --url
http://192.168.99.100:30900

Take note of the URL to access the Prometheus UI a little bit later, when our test metrics app is deployed.

Step 3: Deploy an Example App Shipping RPC Latency Metrics

Prometheus is now deployed, so we are ready to make it consume some metrics. Let's deploy our example app serving metrics at the /metrics REST endpoint. Below is the deployment manifest we used:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rpc-app-deployment
  labels:
    app: rpc-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: rpc-app
  template:
    metadata:
      labels:
        app: rpc-app
    spec:
      containers:
      - name: rpc-app-cont
        image: supergiantkir/prometheus-test-app
        ports:
        - name: web
          containerPort: 8081

This deployment manifest is quite self-explanatory. Once deployed, the app will ship random RPC latency data to the /metrics endpoint. Please make sure that all the labels and label selectors match each other if you prefer to use your own names. Go ahead and create the deployment:

kubectl create -f rpc-app-deployment.yaml
deployment.apps "rpc-app-deployment" created

As you remember, we configured Prometheus to watch service endpoints. That's why we need to expose our app's deployment as a service.

apiVersion: v1
kind: Service
metadata:
  name: rpc-app-service
  labels:
    app: rpc-app
spec:
  ports:
  - name: web
    port: 8081
    targetPort: 8081
    protocol: TCP
  selector:
    app: rpc-app
  type: NodePort

For clarity, we set the value of spec.ports[].targetPort to be the same as spec.ports[].port, although Kubernetes does this automatically if no value is provided for targetPort. As with the Prometheus service, you can either create the service from a manifest or expose the deployment inline in your terminal.
If you opt for the manifest, run:

kubectl create -f rpc-app-service.yaml
service "rpc-app-service" created

If you prefer the quick inline way, run:

kubectl expose deployment rpc-app-deployment --type=NodePort --name=rpc-app-service
service "rpc-app-service" exposed

Let's verify that the service was successfully created:

kubectl describe svc rpc-app-service
Name:                     rpc-app-service
Namespace:                default
Labels:                   app=rpc-app
Annotations:
Selector:                 app=rpc-app
Type:                     NodePort
IP:                       10.110.41.174
Port:                     web  8081/TCP
TargetPort:               8081/TCP
NodePort:                 web  32618/TCP
Endpoints:                172.17.0.10:8081,172.17.0.9:8081
Session Affinity:         None
External Traffic Policy:  Cluster
Events:

As you see, the NodePort was assigned and the deployment's endpoints were successfully added to the service. We can now access the app's metrics endpoint on the specified IP and port. If you are using Minikube, you'll first need to get the service's IP with the following command:

minikube service rpc-app-service --url
http://192.168.99.100:30658

Now, let's use curl to GET some metrics from that endpoint:

curl http://192.168.99.100:30658/metrics
# TYPE promhttp_metric_handler_requests_total counter
promhttp_metric_handler_requests_total{code="200"} 0
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0
# HELP rpc_durations_histogram_seconds RPC latency distributions.
# TYPE rpc_durations_histogram_seconds histogram
rpc_durations_histogram_seconds_bucket{le="-0.00099"} 0
rpc_durations_histogram_seconds_bucket{le="-0.00089"} 0
rpc_durations_histogram_seconds_bucket{le="-0.0007899999999999999"} 0
rpc_durations_histogram_seconds_bucket{le="-0.0006899999999999999"} 2
rpc_durations_histogram_seconds_bucket{le="-0.0005899999999999998"} 18
rpc_durations_histogram_seconds_bucket{le="-0.0004899999999999998"} 59
rpc_durations_histogram_seconds_bucket{le="-0.0003899999999999998"} 236
rpc_durations_histogram_seconds_bucket{le="-0.0002899999999999998"} 669
rpc_durations_histogram_seconds_bucket{le="-0.0001899999999999998"} 1514
rpc_durations_histogram_seconds_bucket{le="-8.999999999999979e-05"} 2959
rpc_durations_histogram_seconds_bucket{le="1.0000000000000216e-05"} 4727
rpc_durations_histogram_seconds_bucket{le="0.00011000000000000022"} 6562
rpc_durations_histogram_seconds_bucket{le="0.00021000000000000023"} 8059
rpc_durations_histogram_seconds_bucket{le="0.0003100000000000002"} 8908
rpc_durations_histogram_seconds_bucket{le="0.0004100000000000002"} 9350
rpc_durations_histogram_seconds_bucket{le="0.0005100000000000003"} 9514
rpc_durations_histogram_seconds_bucket{le="0.0006100000000000003"} 9570
rpc_durations_histogram_seconds_bucket{le="0.0007100000000000003"} 9585
rpc_durations_histogram_seconds_bucket{le="0.0008100000000000004"} 9588
rpc_durations_histogram_seconds_bucket{le="0.0009100000000000004"} 9588
rpc_durations_histogram_seconds_bucket{le="+Inf"} 9588
rpc_durations_histogram_seconds_sum 0.11031870987310399
rpc_durations_histogram_seconds_count 9588
# HELP rpc_durations_seconds RPC latency distributions.
# TYPE rpc_durations_seconds summary
rpc_durations_seconds{service="exponential",quantile="0.5"} 6.43688069700151e-07
rpc_durations_seconds{service="exponential",quantile="0.9"} 2.3703557539528334e-06
rpc_durations_seconds{service="exponential",quantile="0.99"} 4.491775587389532e-06
rpc_durations_seconds_sum{service="exponential"} 0.014317520025277117
rpc_durations_seconds_count{service="exponential"} 14369
rpc_durations_seconds{service="normal",quantile="0.5"} 5.97571029483546e-06
rpc_durations_seconds{service="normal",quantile="0.9"} 0.0002795950678545625
rpc_durations_seconds{service="normal",quantile="0.99"} 0.0004838671111576318
rpc_durations_seconds_sum{service="normal"} 0.11031870987310399
rpc_durations_seconds_count{service="normal"} 9588
rpc_durations_seconds{service="uniform",quantile="0.5"} 8.961255876119688e-05
rpc_durations_seconds{service="uniform",quantile="0.9"} 0.0001764412468147929
rpc_durations_seconds{service="uniform",quantile="0.99"} 0.00019807911315607854
rpc_durations_seconds_sum{service="uniform"} 0.715789691590982
rpc_durations_seconds_count{service="uniform"} 7195

As you see, the request returned a number of Prometheus-format RPC latency metrics. Each metric is formatted as <metric name>{<label name>=<label value>, ...} and has a unique value. Thanks to the Prometheus Kubernetes auto-discovery feature, we can expect that Prometheus has automatically discovered the app and has begun pulling these metrics. Let's access the Prometheus web interface to verify this. Use your Prometheus service IP and the NodePort obtained in Step 2 to access the Prometheus UI. If you go to the /targets endpoint, you'll see the list of current Prometheus targets. There might be a lot of targets, because we've configured Prometheus to watch all service endpoints. Among them, you'll find a target labeled app="rpc-app". That's our app. You can also find other labels and see the time of the last scrape. In addition, you can see the current Prometheus configuration under the Status -> Configuration tab.

Finally, we can visualize the RPC time series generated by our example app. To do this, go to the Graph tab, where you can select the metrics to visualize. In the image above, we visualized the rpc_durations_histogram_seconds_bucket metrics. You can play around with other RPC metrics and native Prometheus metrics as well. The web interface also supports the Prometheus query language, PromQL, to select and aggregate the metrics you need (a short example follows at the end of this post). PromQL has rich functional semantics that allow working with time series instant and range vectors, scalars, and strings. To learn more about PromQL, check out the official documentation.

Conclusion

That's it! We've learned how to configure Prometheus to monitor applications serving Prometheus-format metrics. Prometheus has a complex configuration language and many settings, so we've just scratched the surface. Although Prometheus is a powerful tool, it might be challenging to configure and run it without a good knowledge of its domain-specific language and configuration. To fill this gap, in the next tutorial we'll look into configuring and managing your Prometheus instances with the Prometheus Operator -- a useful software management tool designed to simplify monitoring of your apps with Prometheus. Stay tuned to our blog to find out more soon!
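As a small PromQL illustration (not part of the original walkthrough), the following queries use the metric names from the curl output above; histogram_quantile, rate, and sum by are standard PromQL constructs, and the 5m/1m ranges are arbitrary choices:

histogram_quantile(0.9, sum by (le) (rate(rpc_durations_histogram_seconds_bucket[5m])))

rate(rpc_durations_seconds_count{service="normal"}[1m])

The first expression estimates the 90th-percentile RPC latency over the last five minutes across both app replicas; the second shows the per-second rate of "normal" RPC calls. You can paste either expression into the Graph tab's query field.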
Posted over 6 years ago by Kirill
As you know from our earlier blog posts, Kubernetes (or K8s) is an open source platform for deploying and managing containerized applications at scale. It is designed to automate container management tasks such as deployment, scaling, updating, scheduling, storage provisioning, networking, and service discovery, among others. With Kubernetes, it is simple to group multiple hosts running Linux containers and turn them into a working computer cluster controlled and managed by an intelligent control plane that maintains the desired state of your deployments.

Although Kubernetes dramatically simplifies the deployment of containerized applications, its multi-level architecture and multiple abstraction layers (e.g., pods, services) introduce new complexities to the daily tasks of application monitoring. There are at least two reasons why traditional approaches to monitoring don't work well with Kubernetes:

Containers in Kubernetes are intricately entangled with Kubernetes orchestration services. To provide orchestration services, Kubernetes encapsulates application containers in abstractions known as pods. Pods provide shared network and storage interfaces for containers and access to various orchestration services provided by the platform. Pods can be further abstracted into services that act as load balancers, distributing traffic across the backend pods. Pods and services are typically managed by controllers that maintain the desired state of your apps and facilitate scaling and updating them. When you think of such a complex architecture (we've only touched the tip of the iceberg), you see that any viable monitoring solution should be able to pave its way through the maze of abstractions to the actual containers that provide application metrics. At the same time, a monitoring agent should not miss the bigger picture provided by the various abstraction layers and the entire cluster.

Kubernetes is a distributed and fluid environment. Kubernetes is a distributed environment where numerous applications and services are spread across multiple nodes of your cluster. From this it follows that monitoring agents should be deployed on each node and be connected to a centralized monitoring pipeline. Second, Kubernetes is a fluid environment driven and defined by orchestration tasks. Containers are scaled up and down depending on the load and moved to other nodes all the time. Moreover, the physical infrastructure can itself be scaled horizontally, with new nodes added or removed automatically. That being said, static monitoring solutions designed for standalone desktop applications are not a good fit for Kubernetes. The platform requires monitoring tools that can dynamically capture container events and be tightly integrated with Kubernetes schedulers and controllers. These challenges lead us to a radically new approach to monitoring containerized applications.

Kubernetes Monitoring Approach

The approach taken by Kubernetes to monitoring directly stems from the platform's design. According to the Kubernetes documentation, "Kubernetes is not a traditional, all-inclusive PaaS (Platform as a Service) system. Since Kubernetes operates at the container level rather than at the hardware level, it provides some generally applicable features common to PaaS offerings. ... However, Kubernetes is not monolithic, and these default solutions are optional and pluggable. Kubernetes provides the building blocks for building developer platforms, but preserves user choice and flexibility where it is important."
In other words, Kubernetes is designed to be extensible and pluggable. You can design any plugin, extension, or API compatible with the Kubernetes network and storage standards and add it to gain visibility into Kubernetes. At the same time, Kubernetes is not a barebones platform. When it comes to monitoring, "It provides some integrations as proof of concept, and mechanisms to collect and export metrics." These mechanisms are themselves implemented as extensions and plugins running on top of Kubernetes cluster components. One of the most important of them is cAdvisor -- a built-in Kubernetes monitoring tool.

To complete the examples below, you'll need the following:

- A running Kubernetes cluster. See the Supergiant documentation for more information about deploying a Kubernetes cluster with Supergiant. As an alternative, you can install a single-node Kubernetes cluster on a local system using Minikube.
- The kubectl command line tool installed and configured to communicate with the cluster. See how to install kubectl here.

cAdvisor

In order to perform various orchestration tasks (e.g., scheduling, managing container resource requests and limits, Horizontal Pod Autoscaling (HPA), managing namespace resource constraints and quotas), cluster components such as the scheduler, the kubelet, and controllers need to collect and analyze cluster metrics. Specifically, they need to know how much CPU, RAM, and network resources are currently consumed by various applications and components of the cluster. Kubernetes ships with a built-in monitoring solution that provides such metrics -- cAdvisor. This tool is a running daemon that collects, processes, and exports information about containers. In particular, cAdvisor keeps information about containers' historical resource usage, resource isolation parameters, network statistics, etc.

cAdvisor is part of the kubelet binary, which makes it tightly coupled with the kubelet. As you remember, the kubelet is the primary "node agent" responsible for running and managing containers on nodes in accordance with PodSpecs (see the image below). The kubelet pulls container resource usage metrics directly from cAdvisor and makes decisions based on them. It also exposes the aggregated pod resource usage stats via a REST API. Kubernetes users can access cAdvisor via a simple UI that typically listens on port 4194 on most Kubernetes clusters. For example, if you are using Minikube, you can start cAdvisor with:

minikube start --extra-config=kubelet.CAdvisorPort=4194

To get access to the cAdvisor web UI, you can then run:

open http://$(minikube ip):4194

Native Monitoring since 1.6 and 1.8: Metrics API and Metrics Server

cAdvisor is an in-tree monitoring product provided as part of the Kubernetes binary. It was a necessary component in earlier Kubernetes versions, but the overall tendency is to move towards detaching add-on monitoring components from upstream Kubernetes components. The extensibility mindset for which Kubernetes is known was embodied in recent changes to the Kubernetes monitoring architecture. In particular, with the 1.6 and 1.8 releases, Kubernetes moved to a new monitoring architecture based on the Resource Metrics API and the Custom Metrics API. These APIs are part of the effort to standardize the way Kubernetes cluster components and third-party monitoring solutions access the Kubernetes metrics generated by cAdvisor and other low-level metrics components.
The new monitoring architecture outlined in the Kubernetes proposal consists of the core metrics pipeline exposed by the Resource Metrics API and the monitoring pipeline for third-party monitoring solutions. The goal of this architecture is to provide a stable, versioned API that core Kubernetes components can use, along with a set of abstractions that let custom/add-on monitoring applications easily access Kubernetes core and custom metrics. Let's briefly discuss the design of these two pipelines.

Core Metrics Pipeline

The core metrics pipeline consists of the following components:

- Kubelet. Provides node/pod/container resource usage information (cAdvisor will be slimmed down to provide only core system metrics). The kubelet acts as a node-level and application-level metrics collector, as opposed to cAdvisor, which is responsible for cluster-wide metrics.
- Resource estimator. Runs as a DaemonSet that turns raw usage values collected from the kubelet into resource estimates ready for use by schedulers or the HPA to maintain the desired state of the cluster.
- Metrics Server. A mini-version of Heapster (Heapster is now deprecated), which was previously used as the main monitoring solution on top of cAdvisor for collecting Prometheus-format metrics. The Metrics Server stores only the latest metric values scraped from the kubelet and cAdvisor locally and has no sinks (i.e., it does not store historical data).
- Master Metrics API. The Metrics Server exposes the master metrics API to external clients via the Discovery summarizer.
- The API server. The server responsible for serving the master metrics API.

The core metrics pipeline is designed for use by core system components such as the scheduler, the Horizontal Pod Autoscaler, and simple UI components (e.g., kubectl top). Although the core metrics pipeline does not include long-term metrics collection and storage, it provides a useful interface for implementing it using the Prometheus exposition format.

Monitoring Pipeline

The core metrics pipeline does not support third-party integrations. However, Kubernetes defines a monitoring pipeline interface specifically designed for full monitoring solutions (e.g., Prometheus). This approach removes the need to maintain Heapster as the integration endpoint for every metrics source and feature. The key part of the monitoring pipeline architecture in Kubernetes is the Custom Metrics API. With this API, clients can access both core system metrics and application metrics. This API should be implemented by monitoring pipeline vendors on top of their metrics storage solutions. Data collected by a monitoring pipeline can contain the following metric types:

- core system metrics
- non-core system metrics
- service metrics from application containers
- service metrics from Kubernetes infrastructure containers

Among the popular monitoring solutions that can implement the Custom Metrics API, one should mention Prometheus, Heapster, and various proprietary APM solutions like Datadog and Dynatrace. Full monitoring pipelines are the subject of the second part of this monitoring series.

Deploying the Metrics Server

To get an idea of how the Kubernetes core metrics pipeline works, let's try to run the add-on Metrics Server. To get the server up and running, you'll first need to configure the aggregation layer. This layer is a new feature in Kubernetes 1.7 that allows the Kubernetes apiserver to be extended with additional non-core APIs.
Once the API and your add-on server are registered with the aggregation layer, the aggregator will be able to proxy relevant requests to these add-on API servers, allowing you to serve custom API resources. To configure the aggregation layer for your Metrics Server, you'll need to set a number of flags on the kube-apiserver:

--requestheader-client-ca-file=
--requestheader-allowed-names=aggregator
--requestheader-extra-headers-prefix=X-Remote-Extra-
--requestheader-group-headers=X-Remote-Group
--requestheader-username-headers=X-Remote-User
--proxy-client-cert-file=
--proxy-client-key-file=

Notice that for the first flag you'll need to obtain a CA certificate if your cluster administrator has not provided you with one. You can find more information about these flags in the official kube-apiserver documentation.

After you've configured the kube-apiserver, the next step is to deploy the metrics-server mentioned above. This can be done with the deployment manifests from the metrics-server GitHub repository. For more information about metrics-server, check out the metrics-server repository. You can deploy the Metrics Server directly from the metrics-server-master/deploy/1.8+/ directory of the downloaded repository:

kubectl create -f metrics-server-master/deploy/1.8+/
clusterrolebinding.rbac.authorization.k8s.io "metrics-server:system:auth-delegator" created
rolebinding.rbac.authorization.k8s.io "metrics-server-auth-reader" created
apiservice.apiregistration.k8s.io "v1beta1.metrics.k8s.io" created
serviceaccount "metrics-server" created
deployment.extensions "metrics-server" created
service "metrics-server" created
clusterrole.rbac.authorization.k8s.io "system:metrics-server" created
clusterrolebinding.rbac.authorization.k8s.io "system:metrics-server" created

Once the Metrics Server is deployed, we can easily access the resource metrics API with kubectl get --raw. For example, the following command returns the resource usage metrics for all nodes in your cluster:

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq
{
  "kind": "NodeMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes"
  },
  "items": [
    {
      "metadata": {
        "name": "node1.test.com",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node1.test.com",
        "creationTimestamp": "2018-08-29T13:21:06Z"
      },
      "timestamp": "2018-08-29T13:21:06Z",
      "window": "1m0s",
      "usage": {
        "cpu": "157m",
        "memory": "3845432Ki"
      }
    },
    {
      "metadata": {
        "name": "node2.test.com",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node2.test.com",
        "creationTimestamp": "2018-08-25T13:21:06Z"
      },
      "timestamp": "2018-08-23T13:21:06Z",
      "window": "1m0s",
      "usage": {
        "cpu": "431m",
        "memory": "4559560Ki"
      }
    }
  ]
}

If you are using Minikube, you can easily enable metrics-server with a few commands. First, check which add-ons are enabled:

minikube addons list
- addon-manager: enabled
- coredns: disabled
- dashboard: enabled
- default-storageclass: enabled
- efk: disabled
- freshpod: disabled
- heapster: disabled
- ingress: disabled
- kube-dns: enabled
- metrics-server: disabled
- registry: disabled
- registry-creds: disabled
- storage-provisioner: enabled

As you see, the metrics-server add-on is currently disabled. To enable it, run:

minikube addons enable metrics-server
metrics-server was successfully enabled

Now, you can verify that the metrics-server is running in the kube-system namespace; a couple of quick checks are sketched below.
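A minimal sketch of such a check follows. The pod name suffix and the resource numbers in the sample output are illustrative, and kubectl top only starts returning data once the Metrics Server has collected its first samples:

kubectl get pods -n kube-system | grep metrics-server
metrics-server-6fbfb84cdd-72vzf   1/1       Running   0          1m

kubectl top nodes
NAME       CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%
minikube   171m         8%        1580Mi          41%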
Conclusion

In this article, we've discussed the key components of the Kubernetes monitoring architecture. To sum it up, we've learned that the purpose of this architecture is to create an extensible and versioned metrics API for Kubernetes system components (e.g., the scheduler, the HPA) and custom monitoring pipelines. The first task is accomplished by the core metrics pipeline that feeds metrics to the HPA, the scheduler, and other resource management systems in Kubernetes. The second task is performed by the Custom Metrics API, which allows third-party monitoring adapters and pipelines to access both system metrics and custom metrics. Along with the Metrics API, Kubernetes continues to ship with built-in monitoring solutions like cAdvisor, although the overall trend is to gradually remove built-in tools from upstream components, leaving it to third-party vendors to create monitoring solutions based on the Custom Metrics API. In the second part of this series, we'll discuss several third-party monitoring pipelines like Heapster, Prometheus, and Metricbeat that can be used to provide full monitoring services for your Kubernetes clusters and the applications running on them.
Posted over 6 years ago by Kirill
Container technology and container orchestration are revolutionizing the deployment and management of applications in multi-node distributed environments at scale. Since Google open sourced Kubernetes in 2014, a number of reputable tech companies have decided to move their container workloads to the platform, thereby contributing to its growing popularity and recognition in the community. In 2018 we see a broad consensus that containers powered by container orchestration frameworks make application CI/CD, deployment, and management at scale much more efficient and productive. However, the real benefits of containerization are often hidden beneath a layer of complex terminology that is known only to a few experts in the field. In this article, we are going to educate business leaders and IT managers about the actual cost-saving potential of container technology and lift the veil on the complexity of container orchestration.

The article is organized as follows. In the first part, we explain container architecture and the advantages of containers over virtual machines. In the second part, we focus on the economic benefits of container orchestration and present case studies of two IT companies that have successfully adopted container technologies and appreciated their benefits. Let's get started!

What Are Containers and Why Are They so Efficient?

Linux containers are technologies that allow packaging and isolating applications with their entire runtime environment (e.g., binaries, files, dependencies). This makes it easy to move the containerized applications between various environments while retaining full functionality. Sounds familiar? Don't virtual machines (VMs) offer the same functionality? The answer is "yes" and "no." To make a long story short, container runtimes were developed as an alternative to immutable virtual machine (VM) images. VM images are heavier (i.e., consume more resources) than containers because they require a full OS to operate. Because of that, VMs are slower to start up, and just a few of them can occupy an entire server. In contrast, containers do not require a full OS packaged into them to work. Since they use OS-level virtualization, as opposed to the hardware-level virtualization used by VMs, multiple containers can share a single host OS. Containers typically include a small snapshot of the host filesystem and the dependencies they need; beyond that, containers can request additional resources and services from the host OS when needed. Thanks to this self-contained design and flexibility, containers allow disentangling applications from the underlying infrastructure and isolating them from the host environment, thereby making them portable and environment-agnostic.

But how does this translate to cost savings? That's not so difficult to understand. Let's first look at the image below to get an idea of how VMs and containers differ. As you see, a server hosting 3 applications in 3 virtual machines in the standard VM approach would require three copies of a guest OS running on the server. To clarify, a guest OS is a virtual OS managed by the hypervisor which, in most cases, is different from the host OS on which it runs. Virtualization of the OS typically requires a lot of resources. This means that the VMs running our three apps will be very heavy and will consume a lot of memory, disk, and CPU resources. How is the container approach different?
In a container world, all 3 applications could be managed by the container engine (e.g., Docker) and share one single version of the host OS. Now you get an idea of the basic advantage here: with containers, more applications can run on the same hardware because we avoid duplicating heavy OS images. So instead of requiring multiple hosts to deploy your apps, you can use a single host. Sounds like an immense cost saving? It really is!

Note: There is widespread confusion that Linux containers are, in essence, mini versions of VMs. Indeed, we might erroneously assume that when we look inside a Linux container: there we will find the familiar filesystem structure, devices, and software used in any Linux distribution. However, the contents of the container's filesystem and its runtime environment are not a full OS but a small representation of the target OS needed for the container to work. The kernel and underlying resources are still provided by the host OS, whereas the system devices and software are provided by the image. A host Linux OS is, therefore, able to run a container even though it appears to be an entirely different Linux distribution.

Are Containers Flawless, and Do We Still Need VMs?

It is important to understand that both containers and VMs have their unique place in the IT world. The arrival of containers does not mean that VMs became obsolete and that we don't need them anymore. There are a number of scenarios in which you should consider using VMs. For example:

- VMs are a better choice for apps that require all of the operating system's resources and functionality.
- VMs are better if you have a wide variety of operating systems to manage. VMs can run pretty much any operating system, whereas containers lock you into Linux distributions.
- VMs are better if you want the flexibility of running multiple applications. If you want to run multiple copies of one application (e.g., a database), you'll be better off with containers.

Users should also remember that containers are not flawless. The most widely cited problem with containers is security. Contrary to popular misconception, containers are not fully self-contained. This means that containers can have access to the OS kernel, all devices, SELinux, cgroups, and all system files. If the container has superuser privileges, the host security might be compromised. This means that you can't run random container applications on your system as root. Recently, however, Kubernetes has done a very good job of providing lots of security tools (SELinux, AppArmor, highly configurable policies and networking options, etc.) to users, but they just aren't set up by default, and it takes time and training to do it properly. All things considered, containers have certain limitations, but all of them are ultimately solvable. In what follows, we continue with our analysis of container cost-saving benefits that immediately appear when the caveats discussed in this section are addressed.

Container Design Diversity

You can leverage various container design patterns to regulate how many resources your container requires and consumes. There are three basic patterns to choose from: "scratch" containers, container OS images, and full OS containers. Popular container runtimes (like Docker) allow creating very lightweight and fast boot-up containers known as "scratch" containers. They are based on a "scratch" environment that includes only minimal resources and dependencies.
By default, scratch containers have no access to SSH or higher-level OS functions, which means they are somewhat limited. However, if you need just a super-small Linux kernel and minimal functionality to perform some tasks without a full access to the host OS, a scratch container is a way to go. They will dramatically reduce your infrastructure costs. If you want more exposure to the host OS, you can opt for a container OS. These container images provide a package manager to install dependencies. A good example of the container OS is Alpine OS container available in the official Docker Hub repository. These containers are also very small: usually no more than 5-8 megabytes in size (e.g., Docker Alpine OS image is only 5Mb in size). However, since you can install dependencies, the size of a container OS can increase quite fast if it's not kept in check. Finally, major container systems like Docker allow creating containers with a full OS. For instance, you can create a container based on Ubuntu or any other Linux distribution. Full OS containers will be significantly larger than container OS and "scratch" containers, so they are less desirable if you want to save on infrastructure resources. Leveraging these three container design patterns, you can create a perfect mix of containers in your deployment, minimizing unnecessary infrastructure costs. Cost-Saving Benefits of Containers At this point, you know how container architecture secures more efficient utilization of resources than VMs. However, containerization cost-saving benefits do not end there. Let's list the most important of them: Containers are open source and free. Most popular container platforms like Docker are open source and free for anyone to use. They are built around open-source Linux distributions and standardized technologies like cgroups, systemd, user namespaces, and other Linux concepts and libraries that enable container isolation and OS-level virtualization. In contrast, even though some virtual machine platforms like KVM are free, others, like VMware cost quite a lot of money when used at large scale. Lower infrastructure costs. We have already seen that containers can be more lightweight and faster than immutable VMs. OS-level virtualization used in containers allows fitting multiple containers on a single host. Containers do not require heavy duplicate OS images to run. These features lead to double-digit resource savings even with quite simple implementations (see case studies below). Lower configuration and management costs. In the pre-container era, infrastructure / operations teams spent much of their time configuring servers, which was an error-prone and tedious work. However, the rules of the game have changed with the arrival of containers. Because containers are very portable, self-contained, and have all dependencies packaged to run, they are less dependent on the specific server environments and configuration. In other words, they are no longer entangled with the host environment. This implies you can save time and money on configuring servers with dependencies, environmental variables, support and system libraries, networking, etc. Most of what you need to run your application in any environment can be packaged inside the container at the development/build time rather than at the deployment time. Better synergies between developers and infrastructure engineers. This benefit is directly connected to the previous point. 
Container technologies make cooperation and coordination between application developers and infrastructure teams much easier. Developers can build the application and package it into the container with all needed dependencies and settings and then hand the container over to engineering team that needs only to know what network and storage requirements the container has. Engineers no longer need to manually install all these dependencies and tune the server for the app to work. The container will simply use those self-contained settings and internal environment to communicate with the host kernel and use resources it needs. The up sides of this are evident: faster time to market, lower engineering costs, and smaller teams specialized in very narrow tasks. Low maintenance costs. Thanks to their self-contained design, containers introduce environment parity of development, testing, and production environments. In other words, all these environments are consistent when you use containers. This translates to immense cost savings because consistent environments are much easier to maintain with a smaller team. There are also fewer support tickets opened, which frees up support's time for building stronger relationships with clients. Cost-Saving Benefits of Kubernetes Containers are great, but they are just "units of deployments" that should be efficiently managed if you run them at scale. When you run multiple containers distributed across multiple hosts, manual updates and scaling are the error-prone and non-trivial tasks. Without automation, companies using containers at scale run into the risk of longer downtimes, slower update cycles, and a growing gap between development, test, and production environments. Being aware of these risks, medium and large companies alike are incorporating container orchestration in their application management. Kubernetes (or K8s) is widely regarded as one of the best container management platforms in the market. Kubernetes is an open source platform for the deployment and management of containerized applications at scale. It automates deployment, scaling, scheduling, update, and networking of containerized applications. The platform simplifies grouping multiple hosts with containerized applications on them into a homogenous cluster managed by the orchestration engine. Since 2014, when Google open sourced Kubernetes, a number of companies and developers have contributed to the project, building dozens of integrations with popular cloud providers, storage systems, and networking infrastructures, etc. Kubernetes is supported by the growing ecosystem and community and is currently the most popular container orchestration tool around. Kubernetes is a mature platform that ships with all features for running containers in the public, private, hybrid clouds, multi-cloud, and on-premises ranging from networking, support for stateful apps and various storage systems, DNS, service discovery, and microservices, etc. If you follow best practices, you can expect Kubernetes to become a significant cost-savings component of your business. We compiled a list of cost benefits that can be achieved with Kubernetes: Reduction of administration and operations burden. Applications deployed in Kubernetes clusters are extremely cheap to maintain. Once the cluster is set up and properly configured, you can expect your applications to run with extremely low downtime and great performance without frequent support intervention. 
If your company does not use container orchestration, infrastructure and operations teams will often have to fix things manually in case of a node or pod failure. With Kubernetes, you no longer have support and maintenance overheads for your applications. Kubernetes' Control Plane regularly monitors the health of nodes and pods, intervening when the desired state is not achieved by launching new pods or rescheduling them to a healthy node. Thanks to Kubernetes' internal cluster monitoring system, there is a dramatic decrease in support issues, and companies are able to better allocate ops time to building an even higher relationship standard with customers. Ops teams can be smaller and more efficient too. Faster deployment times. Kubernetes' declarative syntax makes it easy to specify the desired state of your deployments, leaving it upon the controller to deploy and maintain it. Powered by the underlying container technology, Kubernetes also ensures fast image pulls and startup of your applications. Efficient resource management. Kubernetes uses an efficient resource management model implemented at the container, pod, and cluster levels. At the container level, you can assign resource requests and limits specified in raw CPU and RAM values to containers. These parameters control the minimum and maximum amount of resources available to the container at any time of its lifecycle. By setting various request/limit ratios, you can create diverse classes of pods -- best-effort, guaranteed, and burstable --depending on your application's needs. Thus, resource requests and limits ensure full control of cluster administrator over the resource utilization in the cluster. In addition, Kubernetes supports namespace-wide resource management (note: namespaces may be described as virtual areas of the cluster assigned to specific users). With Kubernetes, you can define default resource request and limits automatically applied to containers, resource constraints (minimum and maximum resource requests and limits), and resource quota for all containers running in a given namespace. With all these features, you can ensure that your cluster always has available resources for running applications and dramatically decrease cloud infrastructure costs. Less churn due to high availability clusters. Modern web applications are expected to be up all the time. Any downtime can undermine trust and customer confidence and result in customer churn. To avoid downtime, Kubernetes is designed with high availability requirements in mind. The platform ships with these features that enable high availability by default:  Automatic maintenance of the desired state. Kubernetes allows running multiple pod replicas (redundancy) and maintains the number of application instances you need (desired state). Therefore, you can always expect your application to be up and running. Efficient system of cluster leader election. Recent versions of Kubernetes ensure that if the current master node fails for some reason, a new cluster leader is elected. In this way, Kubernetes always maintains the integrity of master functions and ensures continuity of cluster services. Liveliness and readiness probes for applications. Kubernetes allows integrating various probes to check if your pods are healthy, running, and serving traffic. Kubernetes will emit events if something is wrong with your pods. Node health checks. Kubernetes regularly monitors node health, and if a node fails, it reschedules pods running on it to a healthy node. Rolling updates. 
You can incrementally update multiple instances of your applications with zero downtime. Kubernetes will ensure that old versions of your app are not deleted before new ones are started. Decrease costs with autoscaling. Kubernetes ships with native autoscaling functionality implemented in the horizontal pod autoscaler (HPA). It allows scaling pods up and down, depending on the ever-changing application load (CPU) and traffic or any other custom metrics. The HPA is very useful as it comes to managing your infrastructure costs: normally, you don't require more applications instances running than the real-time demand for your app's services. HPA will address that by scaling the number of pods in a replication controller, deployment, or replica set based on metrics you provide. Our Supergiant Kubernetes-as-a-Service toolkit extends this auto-scaling functionality even further. Its efficient cost-saving algorithm based on machine learning and real-time analysis of computer resources ensures that your apps consume the exact amount of resources (CPU and memory) they actually need. Supergiant can select nodes with the right amount of resources so that your pods are always tightly packed and you don't pay for unused resources. Need More Evidence? A number of companies appreciate the immense cost benefits of using Kubernetes for their containerized workloads. Let's briefly discuss two case studies of companies that benefited from moving their workloads to containers and Kubernetes: Qbox and Pinterest. Qbox Qbox Inc. provides a hosted Elasticsearch service that simplifies deployment and management of Elasticsearch clusters with major cloud providers (e.g., AWS). Initially, the company was single-tenant with each ES node being its own dedicated machine on AWS (which itself is a VM). This approach was based on hand-picking certain instance types optimized for Elasticsearch and leaving it up to users to configure single-tenant, multi-node clusters running on isolated VMs in any region. Qbox added a markup on the per-compute-hour price for the DevOps support and monitoring. However, Qbox AWS bills quickly get out of hand when the company grew to thousands of clusters. In addition to that, support began spending most of their time on replacing dead nodes and answering support tickets. To make things worse, the company faced the problem of more resources allocated to clusters compared to the usage. Qbox had thousands of servers with a collective CPU utilization under 5%. VMs turned out to be extremely unproductive when deployed at scale. Facing the problem of inefficient resource usage, squeezing profit margins, and fierce competition from cloud-hosted Elasticsearch providers (Google and AWS), Qbox decided to adopt the container-first approach based on Kubernetes, Docker, and Supergiant, the tool developed by the company to manage its Kubernetes deployments. The transition to a containerized architecture was worth the effort. Performance improvement came almost immediately. With Kubernetes and Supergiant, Qbox could "pack" more applications on a single host, which translated to more efficient use of its infrastructure and reduction of cloud costs.  To provide granular control over resource sharing while avoiding the problem of "noisy neighbors," Qbox also took advantage of Kubernetes requests and limits. 
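To make the requests-and-limits mechanism concrete, here is a minimal sketch of what such per-container settings look like in a pod spec. The names, image, and values are purely illustrative assumptions, not Qbox's actual configuration:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: es-data-node            # hypothetical pod name
spec:
  containers:
  - name: elasticsearch
    image: docker.elastic.co/elasticsearch/elasticsearch:6.2.4   # illustrative image
    resources:
      requests:                 # the scheduler reserves at least this much for the container
        cpu: "500m"
        memory: "2Gi"
      limits:                   # the container is capped at this ceiling
        cpu: "2"                # CPU-only limit, mirroring the approach described below
```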
By setting container-specific requests and limits, Qbox achieved more fine-grained control over resource utilization in its clusters and successfully moved to more practical, performant, and cost-effective multi-tenancy. Thus, Kubernetes solved both the utilization and the noisy neighbor problem. Although Qbox is multi-tenant, everyone gets what they pay for without any interference from other users (and they often get more, because Qbox sets only CPU (not RAM) limits, so they can use more than they paid for if no one else is on the server). Qbox's transition to Kubernetes gave birth to the Supergiant Kubernetes-as-a-Service platform, which was originally used by the company to simplify the deployment of containers on Kubernetes. As a major component of Supergiant, Qbox developed a cost-reduction packing algorithm that packs containers onto nodes efficiently, avoiding under-utilization of resources, and spins up new nodes or removes old ones depending on the load. Using Supergiant resulted in an immediate 25% drop in Qbox's infrastructure footprint. Overall, the company saved 50% (about $600k per year). Pinterest Pinterest is a web application that operates a system designed to discover and share information on the web, mostly using images, GIFs, and videos. Pinterest had 200 million monthly active users in September 2017. The challenge Pinterest faced in 2015 was managing over 1000 microservices, multiple layers of infrastructure, and diverse setup tools. Back in 2015, the company's deployment process looked as follows. It had one base Amazon Machine Image (AMI) with an OS, common shared packages, and installed tools. For some services, mostly large and complex ones, the company also had a service-specific AMI built on the base AMI with all service dependency packages in it. In addition to that, the company used two deployment tools: Puppet for provisioning cron jobs and infra components, and Teletraan for deploying production service code and some ML models. Using this architecture at scale resulted in several hard challenges: Engineers had to be involved in every part of the system. In particular, they had to make AMI builds and learn numerous configuration languages, deployment tools, and features of different cloud environments. Since there was no clear separation of applications from their hosting environment, environments on the hosts gradually diverged and caused operational issues. In response to these challenges, in early 2016, Pinterest decided to move its microservices to Docker containers and chose Kubernetes as the orchestration system. The immediate impact of this decision was: Simplified deployment and management of various pieces of software and infrastructure. Reduced build times due to the lightweight nature of containers and the ability to automate deployments with Kubernetes. Reclaimed 80% of hardware capacity during peak hours. For example, the company's Jenkins Kubernetes cluster was using 30% fewer instance-hours per day compared to the previous static cluster. Increased speed to market as a result of automated scheduling, rolling updates, and successful usage of other orchestration features of the Kubernetes platform. Conclusion As the evidence demonstrates, containers and Kubernetes have an immense cost-savings potential appreciated by a number of major companies. In particular, containers can significantly decrease infrastructure costs because they are more lightweight than VMs and can share a single OS.
Other benefits of containers include faster CI/CD pipelines, better coordination between development and engineering teams, and low maintenance costs. If you add Kubernetes to the equation, you can save even more with autoscaling, efficient cluster-level resource management, rolling updates, and efficient application scheduling. Thanks to containerized applications and container orchestration, you can expect double-digit cost savings and more productive utilization of your development, operations, and support teams' time. What's Next? Read some articles that introduce key concepts of Kubernetes: Kubernetes Pods Kubernetes Networking Kubernetes Storage Find out more about Supergiant: Supergiant documentation Read about the Supergiant packing algorithm
Posted over 6 years ago by Kirill
Kubernetes (or K8s) is an open source platform for the deployment and management of containerized applications at scale. It is designed to automate many processes and tasks such as deployment, scaling, updating, scheduling, and communication of containerized applications. With Kubernetes, it is simple to group multiple hosts running Linux containers and turn them into a working computer cluster controlled and managed by an intelligent control plane that maintains the desired state of your deployments. Kubernetes was originally designed and developed by Google for the management of its huge computer clusters and application deployments. Google generated more than 2 billion container deployments per week in 2014, meaning that every second Google was launching an average of 3,300 containers. All these containers were managed by its internal orchestration platform named Borg, which became the predecessor of Kubernetes. Google open-sourced Kubernetes in 2014. This move continued Google's track record of successful contributions to the container ecosystem and the cloud-native movement, including the development of cgroups, which ultimately made Docker possible, and Borgmon, which became the inspiration for Prometheus.  To make a long story short, since 2014 a number of companies and developers have contributed to the Kubernetes project, building dozens of integrations with popular cloud providers, storage systems, networking infrastructures, etc. Kubernetes is supported by a growing ecosystem and community and is currently the most popular container orchestration tool around. Why Do Companies Need Kubernetes? Container runtimes were designed as an alternative to running immutable virtual machine (VM) images. VM images are much heavier (require more resources) than containers and, thus, need more servers to be deployed. In contrast, modern container technologies simplify running thousands of lightweight containers on a single host, which leads to radical savings of computing resources. Also, containers allow you to disentangle applications from the underlying infrastructure and isolate them from the host environment using an autonomous filesystem, virtual networks, and packaged dependencies. This isolation makes containers much more portable and easier to deploy than VMs. By themselves, however, container technologies like Docker do not fully address the challenge of running containerized applications in production. Docker did offer Swarm (and still does) for orchestration, but Swarm, ultimately, didn't offer as much as K8s. Think about this for a moment: real production applications include multiple containers deployed across many server hosts. When you manage hundreds or thousands of containers across multiple nodes in production, you'll need the ability to scale them depending on the application's load, enable communication and external access (e.g., via microservices), manage storage, run regular health checks, manage updates, and handle many other tasks. These are all orchestration tasks that do not come out of the box in container runtimes. Developing your own orchestration framework to perform these tasks would be an unnecessary overhead for your business.  Fortunately, though, Google open-sourced Kubernetes, so you can get all these handy features and tools without writing a single line of code! With Kubernetes, you get such features as manual and automatic scaling, multi-level networking and service discovery, native support for co-located and tightly coupled applications, and many more.
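To give a taste of this declarative style before diving into individual features, here is a minimal Deployment sketch; the name, labels, and image are assumptions chosen for illustration:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                    # desired state: keep three copies of the app running
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.15        # illustrative image
        ports:
        - containerPort: 80
```

Once such a manifest is applied (e.g., with kubectl apply -f), Kubernetes keeps reconciling the actual number of running pods with the declared replicas, restarting or rescheduling pods as needed.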
Let's discuss some of these features. Scalability Is Not a Problem  Scalability is among the major concerns of modern production-grade applications exposed to millions of potential users. When it comes to running applications in clusters of hundreds or even thousands of servers, scaling becomes a complex administration task that involves many prerequisites and caveats. Here are some of the questions that might arise along the way: Is the desired number of application instances running? How many of those instances are healthy and ready to serve traffic? What is the current application load? How many replicas are needed to service that load? How many nodes are currently available for scheduling? How many resources are available in the cluster? These are just a few of the questions that could make a cluster administrator without appropriate tools go crazy. The obvious solution is automation, and this is where Kubernetes truly shines.  For example, the platform's controllers like deployments can monitor how many application instances are running in your cluster, and if, for some reason, this number is different from the desired state, Kubernetes will scale the deployment up or down to reach it. Under the hood, Kubernetes will also keep a record of available nodes and resources to intelligently schedule your applications during the scaling process. At the same time, you can always do manual scaling using the Kubernetes command line tool (kubectl). How about auto-scaling? This feature is critical for managing the ever-changing application load. It's also important because it comes down to managing your infrastructure costs: normally, you don't need more application instances running than the current demand for your services. Kubernetes ships with a Horizontal Pod Autoscaler that scales the number of pods in a replication controller, deployment, or replica set based on observed CPU utilization.  Our new Supergiant Kubernetes-as-a-Service platform extends this auto-scaling feature even further. Its efficient cost-reduction algorithm based on machine learning and real-time analysis of traffic and resources ensures that you utilize precisely the storage and memory that are needed for your applications to work properly. In this way, Supergiant complements Kubernetes' native horizontal pod autoscaling with node-level scaling and packing, introducing immense cost savings. Efficient Multi-Layer Networking and Service Discovery The purpose of Kubernetes networking is to turn containers into bona fide "virtual hosts" that can communicate with each other across nodes, thereby combining the benefits of VMs, containers, and microservices. Kubernetes networking is based on several layers, all serving this final goal: Container-to-container communication on localhost, using a pod's network namespace. This networking layer enables the container network interfaces for tightly coupled containers that can communicate with each other on specified ports much like conventional applications. Pod-to-pod communication that enables communication of pods across nodes. Kubernetes can turn the cluster into a virtual network where all pods can communicate with each other no matter what nodes they land on. Services. A Service abstraction defines a policy (microservice) for accessing pods by other applications. Services act as load-balancers that distribute requests across the backend pods managed by the service (a minimal Service manifest is sketched below). Kubernetes is very flexible about networking solutions that you can use.
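For orientation, a minimal Service manifest could look like the sketch below; the name, label selector, and ports are illustrative assumptions:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-svc
spec:
  selector:
    app: web             # send traffic to pods carrying this label
  ports:
  - port: 80             # port exposed by the Service inside the cluster
    targetPort: 8080     # port the backend containers actually listen on
```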
However, the platform imposes the following principles for the cluster-level network interfaces: All containers can communicate with each other without NAT. All nodes can communicate with all containers (and vice versa) without NAT. The IP seen by one container is the same IP seen by other containers. There are a number of powerful networking implementations of this model including Cilium, Contriv, Flannel, and others. Enabling Co-Located and Tightly Coupled Applications Kubernetes packages containers in abstractions called pods that provide Kubernetes infrastructure resources and services to containers. Pods work as wrappers for the containers that provide interfaces for sharing resources and communication between them. In particular, containers can communicate via localhost on the pod network and share resources via volumes assigned to pods. This allows implementing various tightly coupled application designs where one container serves as the main application and another container (sidecar) helps it process data or consume logs. Containers in a pod also can have shared fate and behavior, which dramatically simplifies the deployment of co-located applications. Fast, Zero Downtime Updates Users expect your applications to be up all the time. However, if your app runs thousands of containers in production, doing manual updates can introduce a lot of risks and cause downtimes. The last thing you want is to make your application unavailable when it's being updated.  Fortunately, Kubernetes ships with a native support for rolling updates which allow updates to pass with zero downtime. Kubernetes controllers will incrementally update running pods according to parameters you specified, ensuring that the old versions of your application are running before the new replicas are launched. Also, with Kubernetes API, you can have fine-grained control over the maximum number of pods unavailable and the maximum surge of new pods above the desired number specified in your deployment. As an added benefit, Kubernetes will store all updates you make in the revision history, so you can always roll back the deployment to any point in time if your update went wrong for some reason (e.g., image pull error). Efficient Management of Hardware Resources in your Cluster When you run thousands of applications in production and have multiple teams working on dozens of projects at the same time, cluster resource management becomes essential for the efficient distribution of your limited cloud budget.  Kubernetes was built with this practical concern in mind. It is based on the efficient resource management model implemented from the lowest level of containers and pods to the highest level of your cluster. At the container level, Kubernetes allows assigning resource requests and limits that control how many resources are requested by the containers and set the upper boundary of resource usage. By setting different request/limit ratios, you can create diverse classes of pods.-- best-effort, guaranteed, and burstable --depending on your application's needs. Also, Kubernetes allows efficiently managing resources at the namespace level. For example, you can define default resource requests and limits automatically applied to containers, resource constraints (minimum and maximum resource requests and limits), and resource quota for all containers running in a given namespace. 
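As a rough sketch of these namespace-wide controls (the namespace name and all values are assumptions for illustration), a LimitRange and a ResourceQuota might look like this:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: dev               # hypothetical namespace
spec:
  limits:
  - type: Container
    defaultRequest:            # applied when a container specifies no request
      cpu: "250m"
      memory: "256Mi"
    default:                   # applied when a container specifies no limit
      cpu: "500m"
      memory: "512Mi"
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-quota
  namespace: dev
spec:
  hard:
    requests.cpu: "4"          # total CPU all pods in the namespace may request
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
```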
These features enable efficient resource utilization by applications in your cluster, and they help divide resources productively between different teams. For example, with the namespace resource constraints you can control the share of cluster resources assigned to production and development workloads, ensuring efficient distribution of budget across different workload types. With all these features, you can ensure that your cluster always has available resources for running your applications and dramatically decrease cloud infrastructure costs. Native Support for Stateful Applications Kubernetes comes with a native support for stateful applications like databases and key-value stores. In particular, its persistent volumes subsystem provides an API that abstracts details of the underlying storage infrastructure (e.g., AWS EBS, Azure Disk, etc.), allowing users and administrators to focus on storage capacity and storage types that their applications will consume, rather than the subtle details of each storage provider's API. Persistent volumes allow reserving the needed amount of resources using persistent volume claims. The claim is automatically bound to volumes that match storage type, capacity, and other requirements specified in the claim. Claims are automatically unbound, too, if a matching volume does not exist. This feature allows efficiently reserving storage by applications running in your Kubernetes cluster.  Another great feature for stateful applications is dynamic storage provisioning. In Kubernetes, administrators can describe various storage types available in the cluster and their specific reclaim, mounting, and backup policies. After such storage classes are defined, Kubernetes can automatically provision the requested amount of resources from the underlying storage provider like AWS EBS or Azure disk. Also, one of the best statefulness features in Kubernetes is stateful sets, which are relevant for applications that require stable and unique network identifiers, stable and persistent storage, ordered deployment and scaling, and ordered and automated rolling updates. Using described APIs you can deploy production-grade stateful apps of any complexity in Kubernetes. Seamless Integration with Cloud and Container Ecosystem Kubernetes works well with all major container runtimes, cloud environments, and cloud native applications. In particular, Kubernetes has: Support for popular container runtimes. Kubernetes 1.5 release came out with the Container Runtime Interface (CRI), a plugin interface that allows using a wide variety of container runtimes without the need to recompile. Since that release, Kubernetes has simplified the usage of various container runtimes compatible with the CRI. Support for multiple volume types. Kubernetes allows creating custom volume plugins that help abstract any external volume infrastructure and use it inside the Kubernetes cluster. Kubernetes currently supports over 25 volume plugins, which include volumes of cloud service providers (AWS EBS, GCE Persistent Disk), object storage systems (CephFS), network filesystems (NFS, Gluster), data center filesystems (Quobyte), and more. Easy integration with cloud providers and use of their native services. Kubernetes makes it easy to deploy clusters on popular cloud platforms and to use their native infrastructure and networking tools. For example, Kubernetes supports external load balancers provided by major cloud providers. 
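As a hedged sketch, requesting such an external load balancer is usually just a matter of setting the Service type; the name, selector, and ports here are illustrative:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-public
spec:
  type: LoadBalancer     # ask the cloud provider to provision an external load balancer
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080
```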
These load balancers give externally-accessible IP addresses that send traffic to the specified ports on your cluster nodes. Extensibility and Pluggability Kubernetes emphasizes the philosophy of extensibility and pluggability, which means that the platform preserves user choice and flexibility where those matter. Kubernetes aims to support the widest variety of workloads and application types possible and to be easy to integrate with any environment and tool. Some plugin frameworks supported by Kubernetes include: Container Network Interface (CNI) plugins: these implement the CNI networking model and are designed for interoperability. Out-of-tree volume plugins such as the Container Storage Interface (CSI) and FlexVolume: they enable storage vendors to create custom storage plugins without adding them to the Kubernetes repository. How Does Supergiant Add to Kubernetes? Supergiant simplifies the deployment and management of applications in Kubernetes for developers and administrators. In addition to easing the configuration and deployment of Helm charts, Supergiant facilitates running clusters on multiple cloud providers, striving for truly agnostic infrastructure. It achieves this with an autoscaling system designed to increase efficiency and reduce costs. Autoscaling ensures that you don't pay for unutilized infrastructure by downscaling unused nodes and packing resources tightly. In addition to that, Supergiant implements various abstraction layers for load balancing, application deployment, basic monitoring, and node deployment and destruction, all behind a highly usable UI. What's Next? Read some articles that introduce key concepts of Kubernetes: Kubernetes Pods Kubernetes Networking Kubernetes Storage Find out more about Supergiant: Supergiant documentation Read about the Supergiant packing algorithm
Posted over 6 years ago by Kirill
We previously discussed how to use the Secrets API to populate containers and pods with sensitive data and enhance the security of your Kubernetes application. Secrets are handy in detaching passwords and other credentials from pod manifests and in preventing bad actors from ever getting them.  Kubernetes ConfigMaps apply the same approach to configuration data, which can be easily detached from pod specs using a simple API.  In this tutorial, we'll discuss various ways to create and expose ConfigMaps to your pods and containers. By the end of this article, you'll be able to inject configuration details into pods and containers without actually exposing them as literal values. This pattern enables better isolation and extensibility of your Kubernetes applications and allows easier maintainability of your deployment code. You'll see this for yourself soon. Let's get started! Definition of a ConfigMap In a nutshell, a ConfigMap is a Kubernetes API object designed to detach configuration from container images. The basic rationale behind using ConfigMaps is to make Kubernetes applications portable, maintainable, and extensible. A ConfigMap API resource stores configuration data as key-value pairs. This data can be easily converted to files and environmental variables accessible inside the container runtime and/or volumes mounted to the containers.  Unlike Kubernetes Secrets, however, ConfigMap data is not obfuscated using base64 and is consumed as plaintext. This is because you are not supposed to store sensitive information in your ConfigMaps. If you need to inject credentials into your Kubernetes application, use the Secrets API instead. There are several ways to create ConfigMaps in Kubernetes: from literal values, from files, and using ConfigMap manifests. A general pattern for creating ConfigMaps using kubectl looks like this: kubectl create configmap <map-name> <data-source>, where <map-name> is the name of the ConfigMap and <data-source> corresponds to a key-value pair in the ConfigMap: the key is the file name or the key you provided on the CLI, and the value is the file contents or the literal value you provided on the CLI. In what follows, we'll see how to use this pattern to create ConfigMaps in Kubernetes. Tutorial To complete the examples in this tutorial, you'll need: A running Kubernetes cluster. See Supergiant documentation for more information about deploying a Kubernetes cluster with Supergiant. As an alternative, you can install a single-node Kubernetes cluster on a local system using Minikube. A kubectl command line tool installed and configured to communicate with the cluster. See how to install kubectl here. Creating ConfigMaps from Files If you have a lot of configuration settings, the most viable option is storing them in files and creating ConfigMaps from those files. To illustrate this, let's first create a text file with some configuration.
We came up with a simple front-end configuration that contains the following settings: color.primary=purple color.brand=yellow font.size.default = 14px Save this configuration in some file (e.g., front-end) and use --from-file argument to ship this file to the ConfigMap: kubectl create configmap front-end-config --from-file=front-end configmap "front-end-config" created To see a detailed information about a new ConfigMap, you can run: kubectl describe configmap front-end-config You should see the following output: Name: front-end-config Namespace: default Labels: Annotations: Data ==== front-end: ---- color.primary=purple color.brand=yellow font.size.default = 14px Events: As you see, the name of the file with configuration settings turned into the ConfigMap's key and the contents of that file (all configuration fields) became the value of that key. Alternatively, if you want to provide a key name different from the filename, you can use the following pattern: kubectl create configmap front-end-config --from-file=ui-config=front-end In this case, the key of the ConfigMap would be ui-config instead of front-end. Creating ConfigMaps from Literal Values If you plan to use a few configuration settings, you can create a ConfigMap from the literal value using --from-literal argument. For example, kubectl create configmap some-config --from-literal=font.size=14px --from-literal=color.default=green configmap "some-config" created will create a ConfigMap with two key-value pairs specified in the --from-literal arguments. As you see, using this argument, we can pass in multiple key-value pairs. You can get a detailed description of the new ConfigMap using the following command: kubectl get configmap some-config -o yaml which will output something like this: apiVersion: v1 data: color.default: green font.size: 14px kind: ConfigMap metadata: creationTimestamp: 2018-07-23T08:21:02Z name: some-config namespace: default resourceVersion: "414488" selfLink: /api/v1/namespaces/default/configmaps/some-config uid: 56360207-8e51-11e8-9c6c-0800270c281a As you see, each key-value pair of your configuration is represented as a separate entry in the data section of the ConfigMap. Creating ConfigMaps With a ConfigMap Manifest Defining a ConfigMap manifest is useful when you want to create multiple configuration key-value pairs that can be accessed as environmental variables or files in the volumes mounted to the container/s in your pod. ConfigMap manifests look similar to API resources we've already discussed but have their distinct fields. Below is an example of a simple ConfigMap that stores three key-value pairs: apiVersion: v1 kind: ConfigMap metadata: name: trading-strategy namespace: default data: strategy.type: HFT strategy.maxVolume: "5000" strategy.risk: high As you see, the main difference between this manifest and other API resources is the ConfigMap kind and special data field that stores key-value pairs. Save this spec in the trading-strategy.yaml and create a ConfigMap running the following command: kubectl create -f trading-strategy.yaml configmap "trading-strategy" created Check if the ConfigMap was successfully created: kubectl get configmap trading-strategy -o yaml apiVersion: v1 data: strategy.maxVolume: "5000" strategy.risk: high strategy.type: HFT kind: ConfigMap metadata: creationTimestamp: 2018-07-20T09:27:44Z name: trading-strategy namespace: default resourceVersion: "395247" selfLink: /api/v1/namespaces/default/configmaps/trading-strategy uid: 283cacd6-8bff-11e8-a2b0-0800270c281a That's it! 
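If you prefer starting from the command line, kubectl can also print the equivalent manifest without creating anything, which is a handy way to bootstrap a ConfigMap spec. A minimal sketch (the exact --dry-run spelling may vary slightly between kubectl versions):

```bash
# Generate a ConfigMap manifest from literal values without creating the object
kubectl create configmap trading-strategy \
  --from-literal=strategy.type=HFT \
  --from-literal=strategy.maxVolume=5000 \
  --from-literal=strategy.risk=high \
  --dry-run -o yaml > trading-strategy.yaml
```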
A new ConfigMap can be now used in pods. For example, one option is to expose configuration data in the container's environmental variables: apiVersion: v1 kind: Pod metadata: name: demo-envr spec: containers: - name: envtest image: supergiantkir/k8s-liveliness ports: - containerPort: 8080 env: - name: STRATEGY_RISK valueFrom: configMapKeyRef: name: trading-strategy key: strategy.risk - name: STRATEGY_TYPE valueFrom: configMapKeyRef: name: trading-strategy key: strategy.type Let's briefly discuss key configuration fields of this pod spec: spec.containers[].env[].name -- the name of the environmental variable to map ConfigMap key to. spec.containers[].env[].valueFrom.configMapKeyRef.name -- the name of ConfigMap to use for this environmental variable. spec.containers[].env[].valueFrom.configMapKeyRef.key -- a ConfigMap key to use for this environmental variable. Save this spec in the demo-envr.yaml and create the pod: kubectl create -f demo-envr.yaml pod "demo-envr" created Once the pod is ready and running, get a shell to the container: kubectl exec -it demo-envr -- /bin/bash From within the container, you can access configuration as environmental variables by using printenv command. printenv STRATEGY_TYPE HFT printenv STRATEGY_RISK high Awesome, isn't it? Now, you can access environmental variables inside your container. For example, the container's scripts could use environmental variables defined in the ConfigMap to set up your application. If you have many ConfigMap keys, it might be more viable to define those keys formatted as POSIX environmental variables and expose them to the pod using envFrom field of the spec. Let's create a new ConfigMap to see how it works: apiVersion: v1 kind: ConfigMap metadata: name: ui-config namespace: default data: FONT_DEFAULT_COLOR: green FONT_DEFAULT_SIZE: "14px" Notice that ConfigMap keys are now formatted as POSIX environmental variable names. Now, you can save the spec in the ui-config.yaml and create a ConfigMap running the following command: kubectl create -f ui-config.yaml configmap "ui-config" created Next, let's create a new pod that will use this ConfigMap: apiVersion: v1 kind: Pod metadata: name: demo-from-env spec: containers: - name: envtest image: supergiantkir/k8s-liveliness ports: - containerPort: 8080 envFrom: - configMapRef: name: ui-config Notice that spec.containers[].envFrom[].configMapRef field takes only the name of our ConfigMap (i.e., we need not specify all key-value pairs). Save this spec in the demo-from-env.yaml and create the pod running the following command: kubectl create -f demo-from-env.yaml pod "demo-from-env" created Check if the pod was created: kubectl get pod demo-from-env NAME READY STATUS RESTARTS AGE demo-from-env 1/1 Running 0 2m Once the pod is up and running, get a shell to the active container kubectl exec -it demo-from-env -- /bin/bash And print the environmental variables using env command from the bash: env FONT_DEFAULT_COLOR=green FONT_DEFAULT_SIZE=14px SHLVL=1 HOME=/root YARN_VERSION=1.6.0 .... As you see, the configuration variables defined in the ConfigMap were successfully populated into the environmental variables of the container. Using envFrom is less verbose because you don't define individual environmental variables. 
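If you are worried about name collisions with existing variables, envFrom also accepts an optional prefix field; in the sketch below the UI_ prefix is an illustrative assumption:

```yaml
    envFrom:
    - prefix: UI_           # every ConfigMap key is exposed as UI_<KEY>
      configMapRef:
        name: ui-config
```

With this prefix in place, the container would see UI_FONT_DEFAULT_COLOR and UI_FONT_DEFAULT_SIZE instead of the bare key names.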
This benefit, however, comes with a requirement: variable names must be properly formatted (see the note below). Note: If you are using envFrom instead of env to create environmental variables in the container, the variable names will be created from the ConfigMap's keys. If a ConfigMap key has an invalid environment variable name, it will be skipped, but the pod will be allowed to start. Kubernetes uses the same conventions as POSIX for checking the validity of environmental variables, but that might change. According to POSIX: Environment variable names used by the utilities in the Shell and Utilities volume of IEEE Std 1003.1-2001 consist solely of uppercase letters, digits, and the '_' (underscore) from the characters defined in Portable Character Set and do not begin with a digit. Other characters may be permitted by an implementation; applications shall tolerate the presence of such names. If an environmental variable name does not pass the check, an InvalidVariableNames event will be fired, and a message listing the invalid keys that were skipped will be generated. Injecting ConfigMaps into the Container's Volume As you remember from the previous tutorial, Kubernetes supports a configMap volume type that can be used to inject configuration defined in the ConfigMap object for use by the containers in your pod. This option is useful when you want to populate configuration files inside the container with configuration key-value pairs defined in your ConfigMap. To illustrate this use case, let's populate the container's volume with the ConfigMap data defined in the example above. apiVersion: v1 kind: Pod metadata: name: demo-config-volume spec: containers: - name: demo-cont image: supergiantkir/k8s-liveliness volumeMounts: - name: config-volume mountPath: /etc/config volumes: - name: config-volume configMap: name: trading-strategy restartPolicy: Never In this pod spec, we define a configMap volume type for the pod and mount it to the /etc/config path inside the container. Save this spec in the demo-config-volume.yaml and create the pod running the following command: kubectl create -f demo-config-volume.yaml pod "demo-config-volume" created Once the pod is ready and running, get a shell to the container kubectl exec -it demo-config-volume -- /bin/bash and check the /etc/config folder: ls /etc/config/ strategy.maxVolume strategy.risk strategy.type As you see, Kubernetes created three files in that folder. Each file's name is derived from the key name, and the file contents are the key's value. You can verify this easily: cat /etc/config/strategy.risk high If you want to map the ConfigMap keys to different file names, you can slightly adjust the pod spec above using the volumes.configMap.items field. apiVersion: v1 kind: Pod metadata: name: demo-config-volume spec: containers: - name: demo-cont image: supergiantkir/k8s-liveliness volumeMounts: - name: config-volume mountPath: /etc/config volumes: - name: config-volume configMap: name: trading-strategy items: - key: strategy.risk path: risk restartPolicy: Never Now, the strategy.risk configuration will be stored under the path /etc/config/risk instead of /etc/config/strategy.risk as in the example above. Please note that the default paths won't be used if items is specified; each piece of the ConfigMap you want must be reflected there.  Also, take note that if a wrong ConfigMap key is specified, the volume will not be created.
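If you would rather have the pod start even when the referenced ConfigMap or key is missing, reasonably recent Kubernetes versions let you mark the reference as optional; a minimal sketch of the volume variant:

```yaml
  volumes:
  - name: config-volume
    configMap:
      name: trading-strategy
      optional: true        # a missing ConfigMap no longer blocks pod startup
```

The same optional flag also exists on configMapKeyRef when you map individual keys to environmental variables.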
A note on updates: once our ConfigMap is consumed by a volume, Kubernetes will run periodic checks on the configuration. If the ConfigMap is updated, Kubernetes will ensure that the projected keys are updated as well. The update may take some time, depending on the kubelet sync period. However, if your container uses a ConfigMap as a subPath volume mount, the configuration won't be updated. Cleaning Up Let's delete all assets and objects created during this tutorial. Delete the ConfigMaps: kubectl delete configmap trading-strategy configmap "trading-strategy" deleted kubectl delete configmap ui-config configmap "ui-config" deleted Delete the pods: kubectl delete pod demo-config-volume pod "demo-config-volume" deleted kubectl delete pod demo-envr pod "demo-envr" deleted kubectl delete pod demo-from-env pod "demo-from-env" deleted Finally, delete all files with resource manifests if you don't need them anymore. Conclusion One of the main rules of good application development and containerized application deployment is to separate configuration from the rest of your application. This allows deployments to be easily maintainable and extensible by different developer teams. Keeping configuration in ConfigMaps and exposing it to your containers when needed embodies this vision for your Kubernetes applications. Instead of injecting configuration directly into the container image, you can leverage the power of ConfigMaps and pods to mount configuration key-value pairs to specific volume paths or environmental variables inside the container's runtime. This approach will dramatically simplify the management of configuration in your Kubernetes applications, ensuring their maintainability and extensibility.
Posted almost 7 years ago by Kirill
In a recent tutorial, we discussed the Secrets API, designed to encode sensitive data and expose it to pods in a controlled way, enabling secrets encapsulation and sharing between containers.  However, Secrets are only one component of pod- and container-level security in Kubernetes. Another important dimension is the security context, which facilitates management of access rights, privileges, and permissions for processes and filesystems in Kubernetes.  In this tutorial, we'll discuss how to set access rights and privileges for container processes within a pod using discretionary access control (DAC) and how to ensure proper isolation of container processes from the host using Linux capabilities. By the end of this tutorial, you'll know how to limit the ability of containers to negatively impact your infrastructure and other containers and how to limit access of users to sensitive data and mission-critical programs in your Kubernetes environment. Let's get started! Defining Security Context A security context can be defined as a set of constraints applied to a container in order to achieve the following goals: Enable distinct isolation between a container and the host/node it runs on. Many users of containers underestimate this task and think that containers are properly isolated from hosts like virtual machines (VMs). The reality is different, though. Privileged processes (e.g., running as root) running in the container are identical to privileged processes that run on the host. Therefore, running an application in the container does not isolate it from the host. Running containers as root can cause serious problems if Docker images from untrusted sources are used. Prevent containers from negatively impacting the infrastructure or other containers. These basic goals necessitate the following best practices for using security contexts in Kubernetes: Drop process privileges in containers as quickly as possible or be aware of them. Run services as non-root whenever possible. Don't use random Docker images in your system. Security contexts in Kubernetes facilitate the implementation of these practices and help protect your system against various security risks. We'll discuss below how to achieve the goals outlined above by using PodSecurityContext and SecurityContext in your pods and containers. Tutorial To complete the examples in this tutorial, you'll need: A running Kubernetes cluster. See Supergiant documentation for more information about deploying a Kubernetes cluster with Supergiant. As an alternative, you can install a single-node Kubernetes cluster on a local system using Minikube. A kubectl command line tool installed and configured to communicate with the cluster. See how to install kubectl here. Using Security Contexts in Pods and Containers Security context settings implement the basic philosophy of discretionary access control (DAC). This is a type of access control in which a given user has complete control over all programs it owns and executes. This user can also determine the permissions of other users for accessing and modifying these files or programs. DAC contrasts with mandatory access control (MAC), by which the operating system (OS) constrains the ability of a subject (e.g., a process) or initiator to access or perform some operations on computing objects (e.g., files). In Kubernetes, using DAC implies that you, as a user or administrator, can set access and permission constraints on files and processes run in your pods and containers.
Security contexts can be specified for the entire pods and/or for individual containers.  Let's first start with the pod-level security context. To specify security settings for a pod, you need to include the securityContext field in the pod manifest. This field is a PodSecurityContext object that saves security context in the Kubernetes API. Let's create a pod with a security context using the example below. This is a pod that runs a simple Node.js application that we wrote and saved in the public Docker Hub repository. apiVersion: v1 kind: Pod metadata: name: security-context-pod spec: securityContext: runAsUser: 2500 fsGroup: 2000 volumes: - name: security-context-vol emptyDir: {} containers: - name: security-context-cont image: supergiantkir/k8s-liveliness volumeMounts: - name: security-context-vol mountPath: /data/test securityContext: allowPrivilegeEscalation: false As you can see, we have two security contexts in this pod. The first one is a pod-level security context defined by the PodSecurityContext object, and the second one is a SecurityContext defined for the individual container. Pod-level security context works for all individual containers in the pod, but, field values of container.securityContext take precedence over field values of PodSecurityContext. In other words, if the container-level security context is defined, it overrides the pod-level security context. You now have a basic understanding of how security contexts work, so let's discuss key settings available for the PodSecurityContext: .spec.securityContext.runAsUser -- This field specifies the User ID (UID) with which to run the Entrypoint (default executable of the image) of the container process. If the field value is not specified, it defaults to the UID defined in the image metadata. The discussed field can be also used in the spec.containers[].securityContext , in which case it takes precedence over the same field in the PodSecurityContext. In our example, the field specifies that for any containers in the pod, the container process runs with user ID 2500. .spec.securityContext.fsGroup -- The field defines a special supplemental group that assigns a group ID (GID) for all containers in the pod. Also, this group ID is associated with the emptyDir volume mounted at /data/test and with any files created in that volume. You should remember that only certain volume types allow the kubelet to change the ownership of a volume to be owned by the pod. If the volume type allows this (as emptyDir volume type) the owning GID will be the fsGroup. .spec.securityContext.runAsGroup -- This field is useful in cases when you want to run the entrypoint of the container process by a group rather than a user. In this case, you can specify a GID for that group using this field. If the field is not set, the image default will be used. If the field is set both in SecurityContext and PodSecurityContext, the value specified in the container's SecurityContext takes precedence over the one specified in the PodSecurityContext. .spec.securityContext.runAsNonRoot -- The field determines whether the pod's container should run as a non-root user. If set to true, the kubelet will validate the image at runtime to make sure that it does not run as UID 0 (root) and won't start the container if it does. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. The discussed field is very important for preventing privileged processes in containers from accessing the system and the host. 
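If you are ever unsure which fields are available at each level, kubectl can print the API documentation for these objects directly from the cluster:

```bash
# Show the documented fields of the pod-level and container-level security contexts
kubectl explain pod.spec.securityContext
kubectl explain pod.spec.containers.securityContext
```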
Now, as you understand key options for PodSecurityContext, save the spec above in security-context-demo.yaml and create the Pod: kubectl create -f security-context-demo.yaml pod "security-context-pod" created Now, verify that the pod is running: kubectl get pod security-context-pod NAME READY STATUS RESTARTS AGE security-context-pod 1/1 Running 0 16s Next, we will check the ownership of processes run within the Node.js container. First, get a shell to the running container: kubectl exec -it security-context-pod -- /bin/bash Inside the container, list all running processes: ps aux USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND 2500 1 0.7 2.0 983564 41352 ? Ssl 11:24 0:00 npm 2500 16 0.0 0.0 4340 736 ? S 11:24 0:00 sh -c node serv 2500 17 0.4 1.7 882368 35848 ? Sl 11:24 0:00 node server.js 2500 23 0.0 0.1 20252 3252 pts/0 Ss 11:24 0:00 /bin/bash 2500 28 0.0 0.1 17500 2056 pts/0 R+ 11:24 0:00 ps aux Awesome! The output above shows that all processes in the container are run by the UID 2500 as we expected. Remember that we set the GID for all containers and volumes in our Pod? Let's check how it worked out. Go to the /data directory in the container's filesystem root and list the permissions of the /test directory inside it: cd /data ls -l You should see something like this: drwxrwsrwx 2 root 2000 4096 Jul 19 11:23 test The output shows that the /data/demo directory has group ID 2000, which is the value of fsGroup. Hypothetically, all new files and directories will also receive the GID defined by the fsGroup. Let's check if this is true: cd test echo This file has the same GID as the parent directory > demofile Now, check the file's ownership: ls -l -rw-r--r-- 1 2500 2000 51 Jul 19 11:30 demofile As you see, the demofile has a group ID 2000, which is the value of fsGroup. As simple as that! Overriding Pod Security Context in the Container As we've already mentioned, a container's SecurityContext takes precedence over the PodSecurityContext. Therefore, you can set a pod-level security context for all containers in the pod and override it if needed by modifying a SecurityContext for individual containers. Let's create a new pod to see how this works: apiVersion: v1 kind: Pod metadata: name: override-security-demo spec: securityContext: runAsUser: 3000 containers: - name: override-security-cont image: supergiantkir/k8s-liveliness securityContext: runAsUser: 2000 allowPrivilegeEscalation: false This pod runs the container with the same Docker image as in the example above, but this time UID to run the process with is specified both for the pod and the container inside it. Before creating this Pod, let's discuss key options available in the container's SecurityContext: .spec.containers[]securityContext.runAsUser -- The same as in the PodSecurityContext .spec.containers[]securityContext.runAsGroup -- The same as in the PodSecurityContext .spec.containers[]securityContext.runAsNonRoot -- The same as in the PodSecurityContext .spec.containers[].securityContext.allowPrivilegeEscalation -- This field controls whether a process can get more privileges than its parent process. More specifically, it controls whether the no_new_privs flag will be set on the container process. AllowPrivilegeEscalation is always true when the container is: (1) run as Privileged (2) has a CAP_SYS_ADMIN Linux capability enabled. .spec.containers[].securityContext.privileged -- The field tells kubelet to run the container in the privileged mode. 
Processes in privileged containers are essentially identical to root processes on the host. The default value is false. .spec.containers[].securityContext.readOnlyRootFilesystem -- Defines whether a container has a read-only root filesystem. The default value is false. .spec.containers[].securityContext.seLinuxOptions -- The SELinux context to be applied to the container. If the value is unspecified, the container runtime (e.g., Docker) will assign a random SELinux context for each container in a pod. If the value is set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Save this spec in the override-security-demo.yaml and create the pod running the following command: kubectl create -f override-security-demo.yaml pod "override-security-demo" created Next, verify that the pod is running: kubectl get pod override-security-demo NAME READY STATUS RESTARTS AGE override-security-demo 1/1 Running 0 45s Then, as in the first example, get a shell to the running container to check the ownership of container processes: kubectl exec -it override-security-demo -- /bin/bash Inside the container, show the list of running processes: ps aux USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND 2000 1 0.2 2.0 983532 40992 ? Ssl 10:14 0:00 npm 2000 16 0.0 0.0 4340 800 ? S 10:14 0:00 sh -c node serv 2000 17 0.1 1.7 883392 36084 ? Sl 10:14 0:00 node server.js 2000 23 0.0 0.1 20252 3232 pts/0 Ss 10:16 0:00 /bin/bash 2000 28 0.0 0.1 17500 2060 pts/0 R+ 10:16 0:00 ps aux As you see, all the processes are run with the UID 2000, which is the value of runAsUser specified for the container. It overrides the UID value 3000 specified for the pod. Using Linux Capabilities If you want fine-grained control over process privileges, you can use Linux capabilities. To understand how they work, we need a basic introduction to Unix/Linux processes. In a nutshell, traditional Unix implementations have two classes of processes: (1) privileged processes (whose user ID is 0, referred to as root or superuser) and (2) unprivileged processes (that have a non-zero UID).  In contrast to privileged processes, which bypass all kernel permission checks, unprivileged processes have to pass full permission checking based on the process's credentials (such as the effective UID, GID, and supplementary group list). Starting with kernel 2.2, Linux has divided the privileges of privileged processes into distinct units, known as capabilities. These units can be independently assigned to and enabled for unprivileged processes, granting them a subset of root privileges. Kubernetes users can use Linux capabilities to grant certain privileges to a process without giving it all the privileges of the root user. This is helpful for improving container isolation from the host, since containers no longer need to run as root -- you can just grant the specific root privileges they require and that's it. To add or remove Linux capabilities for a container, you can include the capabilities field in the securityContext section of the container manifest. Let's see an example:

apiVersion: v1
kind: Pod
metadata:
  name: linux-cpb-demo
spec:
  securityContext:
    runAsUser: 3000
  containers:
  - name: linux-cpb-cont
    image: supergiantkir/k8s-liveliness
    securityContext:
      capabilities:
        add: ["NET_ADMIN"]

In this example, we assigned the CAP_NET_ADMIN capability to the container; a short sketch below also shows how capabilities can be dropped.
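Capabilities can be removed as well as added. A minimal, hypothetical variation (not part of the walkthrough) that drops every capability and adds back only NET_ADMIN could look like this:

  containers:
  - name: linux-cpb-cont
    image: supergiantkir/k8s-liveliness
    securityContext:
      capabilities:
        drop: ["ALL"]        # start from an empty capability set
        add: ["NET_ADMIN"]   # then grant only the network-administration capability

Dropping everything you do not explicitly need is a common hardening pattern, because it limits what a compromised process inside the container can do.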
The CAP_NET_ADMIN capability allows a process to perform various network-related operations such as interface configuration, administration of the IP firewall, modifying routing tables, enabling multicasting, etc. For the full list of available capabilities, see the official Linux documentation. Note: Linux capabilities have the form CAP_XXX. However, when you list capabilities in your container manifest, you must omit the CAP_ part of the constant. For example, to add the CAP_NET_ADMIN capability, include NET_ADMIN in your list of capabilities. Cleaning Up Now that this tutorial is over, let's clean up after ourselves. Don't forget to delete all pods: kubectl delete pod security-context-pod pod "security-context-pod" deleted kubectl delete pod override-security-demo pod "override-security-demo" deleted kubectl delete pod linux-cpb-demo pod "linux-cpb-demo" deleted Also, you may wish to delete all the files with the pod manifests if you don't need them anymore. Conclusion In this article, we have discussed how to use Kubernetes security contexts in your pods and containers. Security contexts are a powerful tool for controlling access rights and privileges of processes running in the pod's containers. Kubernetes allows setting a pod-level security context for all containers and overriding it for individual containers using the container-level SecurityContext.  Kubernetes security contexts are also helpful if you want to isolate container processes from the host. In particular, you learned how to use Linux capabilities to grant specific root privileges to processes, allowing containers to run as non-root while still receiving the privileges they need to work. All these features make Kubernetes security contexts a powerful addition to Kubernetes secrets, allowing you to improve the security of your Kubernetes application and properly isolate the container environment from other users and the underlying nodes. [Less]
Posted almost 7 years ago by Kirill
Do your Kubernetes pods or deployments sometimes crash or begin to behave in a way not expected?  Without knowledge of how to inspect and debug them, Kubernetes developers and administrators will struggle to identify the reasons ... [More] for the application failure. Fortunately, Kubernetes ships with powerful built-in debugging tools that allow inspecting cluster-level, node-level, and application-level issues.  In this article, we focus on several application-level issues you might face when you create your pods and deployments. We'll show several examples of using kubectl CLI to debug pending, waiting, or terminated pods in your Kubernetes cluster. By the end of this tutorial, you'll be able to identify the causes of pod failures at a fraction of time, making debugging Kubernetes applications much easier. Let's get started! Tutorial To complete examples in this tutorial, you need the following prerequisites: A running Kubernetes cluster. See Supergiant GitHub wiki for more information about deploying a Kubernetes cluster with Supergiant. As an alternative, you can install a single-node Kubernetes cluster on a local system using Minikube. A kubectl command line tool installed and configured to communicate with the cluster. See how to install kubectl here. One of the most common reasons for a pod being unable to start is that Kubernetes can't find a node on which to schedule the pod. The scheduling failure might be because of the excessive resource request by pod containers. If for some reason you lost track of how many resources are available in your cluster, the pod's failure to start might confuse and puzzle you. Kubernetes built-in pod inspection and debugging functionality come to the rescue, though. Let's see how. Below is the Deployment spec that creates 5 replicas of Apache HTTP Server (5 Pods) requesting 0.3 CPU and 500 Mi each (Check this article to learn more about Kubernetes resource model): apiVersion: apps/v1 kind: Deployment metadata: name: httpd-deployment labels: app: httpd spec: replicas: 5 selector: matchLabels: app: httpd template: metadata: labels: app: httpd spec: containers: - name: httpd image: httpd:latest ports: - containerPort: 80 resources: requests: cpu: "0.3" memory: "500Mi" Let's save this spec in the httpd-deployment.yaml and create the deployment with the following command: kubectl create -f httpd-deployment.yaml deployment.extensions "httpd-deployment" created Now, if we check the replicas created we'll see the following output: kubectl get pods NAME READY STATUS RESTARTS AGE httpd-deployment-b644c8654-54fpq 0/1 Pending 0 38s httpd-deployment-b644c8654-82brr 1/1 Running 0 38s httpd-deployment-b644c8654-h9cj2 1/1 Running 0 38s httpd-deployment-b644c8654-jsl85 0/1 Pending 0 38s httpd-deployment-b644c8654-wkqqx 1/1 Running 0 38s As you see, only 3 replicas of 5 are Ready and Running and 2 others are in the Pending state. If you are a Kubernetes newbie, you'll probably wonder what all these statuses mean. Some of them are quite easy to understand (e.g., Running ) while others are not. Just to remind the readers, a pod's life cycle includes a number of phases defined in the PodStatus object. Possible values for phase include the following: Pending: Pods with a pending status have been already accepted by the system, but one or several container images have not been yet downloaded or installed. Running: The pod has been scheduled to a specific node, and all its containers are already running. 
Succeeded: All containers in the pod were successfully terminated and will not be restarted. Failed: At least one container in the pod was terminated with a failure. This means that one of the containers in the pod either exited with a non-zero status or was terminated by the system. Unknown: The state of the pod cannot be obtained for some reason, typically due to a communication error. Two replicas of our deployment are Pending, which means that the pods have not yet been scheduled by the system. The next logical question why that is the case? Let's use the main inspection tool at our disposal --kubectl describe. Run this command with one of the pods that have a Pending status: kubectl describe pod httpd-deployment-b644c8654-54fpq Name: httpd-deployment-b644c8654-54fpq Namespace: default Node: Labels: app=httpd pod-template-hash=620074210 Annotations: Status: Pending IP: Controlled By: ReplicaSet/httpd-deployment-b644c8654 Containers: httpd: Image: httpd:latest Port: 80/TCP Host Port: 0/TCP Requests: cpu: 300m memory: 500Mi Environment: Mounts: /var/run/secrets/kubernetes.io/serviceaccount from default-token-9wdtd (ro) Conditions: Type Status PodScheduled False Volumes: default-token-9wdtd: Type: Secret (a volume populated by a Secret) SecretName: default-token-9wdtd Optional: false QoS Class: Burstable Node-Selectors: Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s node.kubernetes.io/unreachable:NoExecute for 300s Events: Type Reason Age From Message ---- ------ ---- ---- ------- Warning FailedScheduling 4m (x37 over 15m) default-scheduler 0/1 nodes are available: 1 Insufficient cpu, 1 Insufficient memory. Let's discuss some fields of this description that are useful for debugging: Namespace -- A Kubernetes namespace in which the pod was created. You might sometimes forget the namespace in which the deployment and pod were created and then be surprised to find no pods when running kubectl get pods. In this case, check all available namespaces by running kubectl get namespaces and access pods in the needed namespace by running kubectl get pods --namespace . Status -- A pod's lifecycle phase defined in the PodStatus object (see the discussion above). Conditions: PodScheduled -- A Boolean value that tells if the pod was scheduled. The value of this field indicates that our pod was not scheduled. QoS Class -- Resource guarantees for the pod defined by the quality of service (QoS) Class. In accordance with QoS, pods can be Guaranteed, Burstable and Best-Effort (see the image below). Events -- pod events emitted by the system. Events are very informative about the potential reasons for the pod's issues. In this example, you can find the event with a FailedScheduling Reason and the informative message indicating that the pod was not scheduled due to insufficient CPU and insufficient memory. Events such as these are stored in etcd to provide high-level information on what is going on in the cluster. To list all events, we can use the following command: kubectl get events LAST SEEN FIRST SEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE 3h 3h 1 apache-server-558f6f49f6-8bjnc.1541c8c4e84a9d6c Pod Normal SuccessfulMountVolume kubelet, minikube MountVolume.SetUp succeeded for volume "default-token-9wdtd" 3h 3h 1 apache-server-558f6f49f6-8bjnc.1541c8c4f1c64e87 Pod Normal SandboxChanged kubelet, minikube Pod sandbox changed, it will be killed and re-created. 
3h 3h 1 apache-server-558f6f49f6-8bjnc.1541c8c500f534e9 Pod spec.containers{httpd} Normal Pulled kubelet, minikube Container image "httpd:2-alpine" already present on machine 3h 3h 1 apache-server-558f6f49f6-8bjnc.1541c8c503df3cb5 Pod spec.containers{httpd} Normal Created kubelet, minikube Created container 3h 3h 1 apache-server-558f6f49f6-8bjnc.1541c8c50a061e37 Pod spec.containers{httpd} Normal Started kubelet, minikube Started container 3h 3h 1 apache-server-558f6f49f6-p7mkl.1541c8c4711915e3 Pod Normal SuccessfulMountVolume kubelet, minikube MountVolume.SetUp succeeded for volume "default-token-9wdtd" 3h 3h 1 apache-server-558f6f49f6-p7mkl.1541c8c475d37603 Pod Normal SandboxChanged kubelet, minikube Pod sandbox changed, it will be killed and re-created. Please, remember that all events are namespaced so you should indicate the namespace you are searching by typing: kubectl get events --namespace=my-namespace As you see, kubectl describe pod function is very powerful in identifying pod issues. It allowed us to find out that the pod was not created due to insufficient memory and CPU. Another way to retrieve extra information about a pod is passing the -o yaml format flag to kubectl get pod: kubectl get pod httpd-deployment-b644c8654-54fpq -o yaml apiVersion: v1 kind: Pod metadata: creationTimestamp: 2018-07-13T10:33:58Z generateName: httpd-deployment-b644c8654- labels: app: httpd pod-template-hash: "620074210" name: httpd-deployment-b644c8654-54fpq namespace: default ownerReferences: - apiVersion: extensions/v1beta1 blockOwnerDeletion: true controller: true kind: ReplicaSet name: httpd-deployment-b644c8654 uid: 40329209-8688-11e8-bf09-0800270c281a resourceVersion: "297148" selfLink: /api/v1/namespaces/default/pods/httpd-deployment-b644c8654-54fpq uid: 40383a20-8688-11e8-bf09-0800270c281a spec: containers: - image: httpd:latest imagePullPolicy: Always name: httpd ports: - containerPort: 80 protocol: TCP resources: requests: cpu: 300m memory: 500Mi terminationMessagePath: /dev/termination-log terminationMessagePolicy: File volumeMounts: - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: default-token-9wdtd readOnly: true dnsPolicy: ClusterFirst restartPolicy: Always schedulerName: default-scheduler securityContext: {} serviceAccount: default serviceAccountName: default terminationGracePeriodSeconds: 30 tolerations: - effect: NoExecute key: node.kubernetes.io/not-ready operator: Exists tolerationSeconds: 300 - effect: NoExecute key: node.kubernetes.io/unreachable operator: Exists tolerationSeconds: 300 volumes: - name: default-token-9wdtd secret: defaultMode: 420 secretName: default-token-9wdtd status: conditions: - lastProbeTime: null lastTransitionTime: 2018-07-13T10:33:58Z message: '0/1 nodes are available: 1 Insufficient cpu, 1 Insufficient memory.' reason: Unschedulable status: "False" type: PodScheduled phase: Pending qosClass: Burstable This command will output all information that Kubernetes has about this pod. It will contain the description of all spec options and fields you specified including any annotations, restart policy, statuses, phases, and more. The abundance of pod-related data makes this command one of the best tools for debugging pods in Kubernetes. That's it! To fix the scheduling issue, you'll need to request the appropriate amount of CPU and memory. While doing so, please, keep in mind that Kubernetes starts with some default daemons and services like kube-proxy. Therefore, you can't request 1.0 of CPU for your apps. 
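As a hedged illustration (the exact numbers depend on your node sizes), you could first check what the node can actually allocate and then lower the requests in httpd-deployment.yaml so that all five replicas fit:

kubectl describe node minikube     # look at the Allocatable and Allocated resources sections

# in the container section of httpd-deployment.yaml:
        resources:
          requests:
            cpu: "0.1"       # reduced from "0.3"; illustrative value only
            memory: "128Mi"  # reduced from "500Mi"; illustrative value only

After editing the file, recreating the deployment (kubectl delete -f httpd-deployment.yaml && kubectl create -f httpd-deployment.yaml) should let the remaining replicas leave the Pending state.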
Scheduling failures are only one of the common reasons for pods getting stuck in the Pending state. Let's create another deployment to illustrate other potential scenarios:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: apache-server
  labels:
    app: httpd
spec:
  replicas: 3
  selector:
    matchLabels:
      app: httpd
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 40%
      maxUnavailable: 40%
  template:
    metadata:
      labels:
        app: httpd
    spec:
      containers:
      - name: httpd
        image: httpd:23-alpine
        ports:
        - containerPort: 80

All we need to know about this deployment is that it creates 3 replicas of the Apache HTTP server and specifies a custom RollingUpdate strategy. Let's save this spec in the httpd-deployment-2.yaml and create the deployment running the following command: kubectl create -f httpd-deployment-2.yaml deployment.apps "apache-server" created Let's check whether all replicas were successfully created: kubectl get pods NAME READY STATUS RESTARTS AGE apache-server-dc9bf8469-bblb4 0/1 ImagePullBackOff 0 53s apache-server-dc9bf8469-x2wwq 0/1 ErrImagePull 0 53s apache-server-dc9bf8469-xhmm7 0/1 ImagePullBackOff 0 53s Oops! As you see, all three pods are not Ready and have ImagePullBackOff and ErrImagePull statuses. These statuses indicate that something went wrong while pulling the httpd image from the Docker Hub repository. Let's describe one of the pods in the deployment to find out more information: kubectl describe pod apache-server-dc9bf8469-bblb4 Name: apache-server-dc9bf8469-bblb4 Namespace: default Node: minikube/10.0.2.15 Start Time: Fri, 13 Jul 2018 15:25:02 +0300 Labels: app=httpd pod-template-hash=875694025 Annotations: Status: Pending IP: 172.17.0.6 Controlled By: ReplicaSet/apache-server-dc9bf8469 Containers: httpd: Container ID: Image: httpd:23-alpine Image ID: Port: 80/TCP Host Port: 0/TCP State: Waiting Reason: ImagePullBackOff Ready: False Restart Count: 0 Environment: Mounts: /var/run/secrets/kubernetes.io/serviceaccount from default-token-9wdtd (ro) Conditions: Type Status Initialized True Ready False PodScheduled True Volumes: default-token-9wdtd: Type: Secret (a volume populated by a Secret) SecretName: default-token-9wdtd Optional: false QoS Class: BestEffort Node-Selectors: Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s node.kubernetes.io/unreachable:NoExecute for 300s Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 3m default-scheduler Successfully assigned apache-server-dc9bf8469-bblb4 to minikube Normal SuccessfulMountVolume 3m kubelet, minikube MountVolume.SetUp succeeded for volume "default-token-9wdtd" Normal Pulling 2m (x4 over 3m) kubelet, minikube pulling image "httpd:23-alpine" Warning Failed 2m (x4 over 3m) kubelet, minikube Failed to pull image "httpd:23-alpine": rpc error: code = Unknown desc = Error response from daemon: manifest for httpd:23-alpine not found Warning Failed 2m (x4 over 3m) kubelet, minikube Error: ErrImagePull Normal BackOff 1m (x6 over 3m) kubelet, minikube Back-off pulling image "httpd:23-alpine" Warning Failed 1m (x6 over 3m) kubelet, minikube Error: ImagePullBackOff If you scroll down this description to the bottom, you'll see details on why the pod failed: "Failed to pull image "httpd:23-alpine": rpc error: code = Unknown desc = Error response from daemon: manifest for httpd:23-alpine not found". This means that we specified a container image tag that does not exist.
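One way out is declarative: correct the image tag in the manifest and re-apply it. A hedged sketch only (the walkthrough below instead uses the imperative kubectl set image command):

# in httpd-deployment-2.yaml, point the container at a tag that exists:
      containers:
      - name: httpd
        image: httpd:2-alpine   # 2-alpine exists on Docker Hub; 23-alpine does not

kubectl apply -f httpd-deployment-2.yaml

Either approach triggers a new rollout with an image that can actually be pulled.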
Let's update our deployment with the right httpd container image version to fix the issue: kubectl set image deployment/apache-server httpd=httpd:2-alpine deployment.apps "apache-server" image updated Then, let's check the deployment's pods again: kubectl get pods NAME READY STATUS RESTARTS AGE apache-server-558f6f49f6-8bjnc 1/1 Running 0 36s apache-server-558f6f49f6-p7mkl 1/1 Running 0 36s apache-server-558f6f49f6-q8gj5 1/1 Running 0 36s Awesome! The Deployment controller has managed to pull the new image and all Pod replicas are now Running. Finding the Reasons your Pod Crashed Sometimes, your pod might crash due to some syntax errors in commands and arguments for the container. In this case, kubectl describe pod will provide you only with the error name but not the explanation of its cause. Let's create a new pod to illustrate this scenario: apiVersion: v1 kind: Pod metadata: name: pod-crash labels: app: demo spec: containers: - name: busybox image: busybox command: ['sh'] args: ['-c', 'MIN=5 SEC=45; echo "$(( MIN*60 + SEC + ; ))"'] This pod uses BusyBox sh command to calculate the arithmetic value of two variables. Let's save the spec in the pod-crash.yaml and create the pod running the following command: kubectl create -f pod-crash.yaml pod "pod-crash" created Now, if you check the pod, you'll see the following output: kubectl get pods NAME READY STATUS RESTARTS AGE pod-crash 0/1 CrashLoopBackOff 1 11s The CrashLoopBackOff status means that you have a pod starting, crashing, starting again, and then crashing again. Kubernetes attempts to restart this pod because the default restartPolicy: Always is enabled. If we had set the policy to Never, the pod would not be restarted. The status above, however, does not indicate the precise reason for the pod's crash. 
Let's try to find more details: kubectl describe pod pod-crash Containers: busybox: Container ID: docker://f9a67ec6e37281ff16b114e9e5a1f1c0adcd027bd1b63678ac8d09920a25c0ed Image: busybox Image ID: docker-pullable://busybox@sha256:141c253bc4c3fd0a201d32dc1f493bcf3fff003b6df416dea4f41046e0f37d47 Port: Host Port: Command: sh Args: -c MIN=5 SEC=45; echo "$(( MIN*60 + SEC + ; ))" State: Waiting Reason: CrashLoopBackOff Last State: Terminated Reason: Error Exit Code: 2 Started: Mon, 16 Jul 2018 10:32:35 +0300 Finished: Mon, 16 Jul 2018 10:32:35 +0300 Ready: False Restart Count: 5 Environment: Mounts: /var/run/secrets/kubernetes.io/serviceaccount from default-token-9wdtd (ro) Conditions: Type Status Initialized True Ready False PodScheduled True Volumes: default-token-9wdtd: Type: Secret (a volume populated by a Secret) SecretName: default-token-9wdtd Optional: false QoS Class: BestEffort Node-Selectors: Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s node.kubernetes.io/unreachable:NoExecute for 300s Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 5m default-scheduler Successfully assigned pod-crash to minikube Normal SuccessfulMountVolume 5m kubelet, minikube MountVolume.SetUp succeeded for volume "default-token-9wdtd" Normal Pulled 4m (x4 over 5m) kubelet, minikube Successfully pulled image "busybox" Normal Created 4m (x4 over 5m) kubelet, minikube Created container Normal Started 4m (x4 over 5m) kubelet, minikube Started container Normal Pulling 3m (x5 over 5m) kubelet, minikube pulling image "busybox" Warning BackOff 2s (x23 over 4m) kubelet, minikube Back-off restarting failed container The description above indicates that the pod is not Ready and that it was terminated because of the Back-Off Error. However, the description does not provide any further explanation of why the error occurred. Where should we search then? We are most likely to find the reason for the pod's crash in the BusyBox container logs. You can check them by running kubectl logs ${POD_NAME} ${CONTAINER_NAME}. Note that ${CONTAINER_NAME} can be omitted for pods that only contain a single container (as in our case) kubectl logs pod-crash sh: arithmetic syntax error Awesome! There must be something wrong with our command or arguments syntax. Indeed, we made a typo inserting ; into the expression echo "$(( MIN*60 + SEC + ; ))"'. Just fix that typo and you are good to go! Pod Fails Due to the 'Unknown Field' Error In the earlier versions of Kubernetes, a pod could be created even if the error was made in the spec's field name or value. In this case, the error would be silently ignored if the pod was created with the --validate flag set to false. In the newer Kubernetes versions (we are using Kubernetes 1.10.0), the --validate option is always set to true by default so the error for the unknown field is always printed. Therefore, the debugging becomes much easier. 
Let's create a pod with a wrong field name to illustrate this:

apiVersion: v1
kind: Pod
metadata:
  name: pod-field-error
  labels:
    app: demo
spec:
  containers:
  - name: busybox
    image: busybox
    comand: ['sh']
    args: ['-c', 'MIN=5 SEC=45; echo "$(( MIN*60 + SEC))"']

Let's save this spec in the pod-field-error.yaml and create the pod with the following command: kubectl create -f pod-field-error.yaml error: error validating "pod-field-error.yaml": error validating data: ValidationError(Pod.spec.containers[0]): unknown field "comand" in io.k8s.api.core.v1.Container; if you choose to ignore these errors, turn validation off with --validate=false As you see, the pod start was blocked because of the unknown 'comand' field (we made a typo in this field intentionally). If you are using an older version of Kubernetes and the pod is created with the error silently ignored, delete the pod and run it with kubectl create --validate -f pod-field-error.yaml. This command will help you find the reason for the error: kubectl create --validate -f pod-field-error.yaml I0805 10:43:25.129850 46757 schema.go:126] unknown field: comand I0805 10:43:25.129973 46757 schema.go:129] this may be a false alarm, see https://github.com/kubernetes/kubernetes/issues/6842 pods/pod-field-error Cleaning Up This tutorial is over, so let's clean up after ourselves. Delete Deployments: kubectl delete deployment httpd-deployment deployment.extensions "httpd-deployment" deleted kubectl delete deployment apache-server deployment.extensions "apache-server" deleted Delete Pods: kubectl delete pod pod-crash pod "pod-crash" deleted kubectl delete pod pod-field-error pod "pod-field-error" deleted You may also want to delete files with spec definitions if you don't need them anymore. Conclusion As this tutorial demonstrated, Kubernetes ships with great debugging tools that help identify the reasons for pod failure or unexpected behavior in a fraction of the time. The rule of thumb for Kubernetes debugging is first to find out the pod status and pod events and then check event messages by using kubectl describe or kubectl get events. If your pod crashes and the detailed error message is not available, you can check the containers' logs to find container-level errors and exceptions. These simple tools will dramatically increase your debugging speed and efficiency, freeing up time for more productive work.  Stay tuned to upcoming blogs to learn more about node-level and cluster-level debugging in Kubernetes. [Less]
Posted almost 7 years ago by Kirill
In a previous tutorial, you learned how to use Kubernetes jobs to perform some tasks sequentially or in parallel. However, Kubernetes goes even further with task automation by letting you create cron jobs that spawn Jobs performing ... [More] finite, time-related tasks repeatedly at any time you specify. Cron jobs can be used to automate a wide variety of common computing tasks such as creating database backups and snapshots, sending emails, or upgrading Kubernetes applications. Before you learn how to run cron jobs, make sure to consult our earlier tutorial about Kubernetes Jobs. If you are ready, let's delve into the basics of cron jobs, where we'll show you how they work and how to create and manage them. Let's get started! Definition of Cron Jobs Cron (the name originates from the Greek word for time, χρόνος) was initially a time-based job scheduler utility in Unix-like operating systems. At the OS level, cron files are used to schedule jobs (commands or shell scripts) to run periodically at fixed times, dates, or intervals. They are useful for automating system maintenance, administration, or scheduled interaction with remote services (software and repository updates, emails, etc.). First used in Unix-like operating systems, cron job implementations have become ubiquitous today. The CronJob API became available by default in Kubernetes 1.8 and is widely supported by the Kubernetes ecosystem for automated backups, synchronization with remote services, system and application maintenance (upgrades, updates, cleaning the cache), and more. Read on because we will show you a basic example of a cron job used to perform a mathematical operation. Tutorial To complete the examples in this tutorial, you need the following prerequisites: A running Kubernetes cluster at version >= 1.8 (for the CronJob resource). For previous versions of Kubernetes (< 1.8) you need to explicitly turn on the batch/v2alpha1 API by passing --runtime-config=batch/v2alpha1=true to the API server (see how to do this in this tutorial), and then restart both the API server and the controller manager component. See Supergiant GitHub wiki for more information about deploying a Kubernetes cluster with Supergiant. As an alternative, you can install a single-node Kubernetes cluster on a local system using Minikube. A kubectl command line tool installed and configured to communicate with the cluster. See how to install kubectl here. Let's assume we have a simple Kubernetes job that calculates π to 3000 places using perl and prints the result to stdout.

apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(3000)"]
      restartPolicy: Never
  backoffLimit: 4

We can easily turn this simple job into a cron job. In essence, a cron job is an API resource that creates standard Kubernetes jobs executed at a specified date or interval. The following template can be used to turn our π job into a full-fledged cron job:

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: pi-cron
spec:
  schedule: "*/1 * * * *"
  startingDeadlineSeconds: 20
  successfulJobsHistoryLimit: 5
  jobTemplate:
    spec:
      completions: 2
      template:
        metadata:
          name: pi
        spec:
          containers:
          - name: pi
            image: perl
            command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(3000)"]
          restartPolicy: Never

Let's look closely at the key fields of this spec: .spec.schedule -- the schedule on which the cron job's jobs are created and executed. The field takes a cron format string, such as 0 * * * * or @hourly.
The cron format string uses the format of the standard crontab (cron table) file -- a configuration file that specifies shell commands to run periodically on a given schedule. See the format in the example below: # ┌───────────── minute (0 - 59) # │ ┌───────────── hour (0 - 23) # │ │ ┌───────────── day of month (1 - 31) # │ │ │ ┌───────────── month (1 - 12) # │ │ │ │ ┌───────────── day of week (0 - 6) (Sunday to Saturday; # │ │ │ │ │ 7 is also Sunday on some systems) # │ │ │ │ │ # │ │ │ │ │ # * * * * * command to execute Each asterisk from the left to the right corresponds to a minute, an hour, a day of month, a month, a day of week on which to perform the cron job and the command to execute for it. In this example, we combined a slash (/) with a 1-minute range to specify a step/interval at which to perform the job. For example, */5 written in the minutes field would cause the cron job to calculate π every 5 minutes. Correspondingly, if we wanted to perform the cron job hourly, we could write 0 */1 * * * to accomplish that.  Format Note: The question mark (?) in the schedule field has the same meaning as an asterisk *. That is, it stands for any of available value for a given field. .spec.jobTemplate -- a cron job's template. It has exactly the same schema as a job but is nested into a cron job and does not require an apiVersion or kind. .spec.startingDeadlineSeconds -- a deadline in seconds for starting the cron job if it misses its schedule for some reason (e.g., node unavailability). A cron job that does not meet its deadline is regarded as failed. Cron jobs do not have any deadlines by default. .spec.concurrencyPolicy --  specifies how to treat concurrent executions of a Job created by the cron job. The following concurrency policies are allowed: Allow (default): the cron job supports concurrently running jobs. Forbid: the cron job does not allow concurrent job runs. If the current job has not finished yet, a new job run will be skipped. Replace: if the previous job has not finished yet and the time for a new job run has come, the previous job will be replaced by a new one. In this example, we are using the default allow policy. Computing π to 3000 places and printing out will take more than a minute. Therefore, we expect our cron job to run a new job even if the previous one has not yet completed. .spec.suspend -- if the field is set to true, all subsequent job executions are suspended. This setting does not apply to executions which already began. The default value is false. .spec.successfulJobsHistoryLimit -- the field specifies how many successfully completed jobs should be kept in job history. The default value is 3. .spec.failedJobsHistoryLimit -- the field specifies how many failed jobs should be kept in job history. The default value is 1. Setting this limit to 0 means that no jobs will be kept after completion. That's it! Now you have a basic understanding of available cron job settings and options.  Let's continue with the tutorial. Open two terminal windows. 
In the first one, you are going to watch the jobs created by the cron job: kubectl get jobs --watch Let's save the spec above in the cron-job.yaml and create a cron job running the following command in the second terminal: kubectl create -f cron-job.yaml cronjob.batch "pi-cron" created In a minute, you should see that two π jobs (as per the Completions value) were successfully created in the first terminal window: kubectl get jobs --watch NAME DESIRED SUCCESSFUL AGE pi-cron-1531219740 2 0 0s pi-cron-1531219740 2 0 0s You can also check that the cron job was successfully created by running: kubectl get cronjobs NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE pi-cron */1 * * * * False 1 57s 1m Computing π to 3000 places is computationally intensive and takes more time than our cron job schedule (1 minute). Since we used the default concurrency policy ("allow"), you'll see that the cron job will start new jobs even though the previous ones have not yet completed: kubectl get jobs --watch NAME DESIRED SUCCESSFUL AGE pi-cron-1531219740 2 0 0s pi-cron-1531219740 2 0 0s pi-cron-1531219800 2 0 0s pi-cron-1531219800 2 0 0s pi-cron-1531219860 2 0 0s pi-cron-1531219860 2 0 0s pi-cron-1531219740 2 1 2m pi-cron-1531219800 2 1 1m pi-cron-1531219860 2 1 57s pi-cron-1531219920 2 0 0s pi-cron-1531219920 2 0 0s pi-cron-1531219740 2 2 3m pi-cron-1531219800 2 2 2m pi-cron-1531219860 2 2 1m pi-cron-1531219920 2 1 20s pi-cron-1531219920 2 2 35s pi-cron-1531219740 2 2 3m pi-cron-1531219740 2 2 3m pi-cron-1531219740 2 2 3m pi-cron-1531219980 2 0 0s As you see, some old jobs are still in process and new ones are created without waiting for them to finish. That's how Allow concurrency policy works! Now, let's check if these jobs are computing the π correctly. To do this, simply find one pod created by the job: kubectl get pods NAME READY STATUS RESTARTS AGE pi-cron-1531220100-sbqrx 0/1 Completed 0 3m pi-cron-1531220100-t8l2v 0/1 Completed 0 3m pi-cron-1531220160-bqcqf 0/1 Completed 0 2m pi-cron-1531220160-mqg7t 0/1 Completed 0 2m pi-cron-1531220220-dzmfp 0/1 Completed 0 1m pi-cron-1531220220-zrh85 0/1 Completed 0 1m pi-cron-1531220280-k2ttw 0/1 Completed 0 23s Next, select one pod from the list and check its logs: kubectl logs pi-cron-1531220220-dzmfp You'll see a pi number calculated to the 3000 place after the comma (that's pretty impressive): 
3.14159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566593344612847564823378678316527120190914564856692346034861045432664821339360726024914127372458700660631558817488152092096282925409171536436789259036001133053054882046652138414695194151160943305727036575959195309218611738193261179310511854807446237996274956735188575272489122793818301194912983367336244065664308602139494639522473719070217986094370277053921717629317675238467481846766940513200056812714526356082778577134275778960917363717872146844090122495343014654958537105079227968925892354201995611212902196086403441815981362977477130996051870721134999999837297804995105973173281609631859502445945534690830264252230825334468503526193118817101000313783875288658753320838142061717766914730359825349042875546873115956286388235378759375195778185778053217122680661300192787661119590921642019893809525720106548586327886593615338182796823030195203530185296899577362259941389124972177528347913151557485724245415069595082953311686172785588907509838175463746493931925506040092770167113900984882401285836160356370766010471018194295559619894676783744944825537977472684710404753464620804668425906949129331367702898915210475216205696602405803815019351125338243003558764024749647326391419927260426992279678235478163600934172164121992458631503028618297455570674983850549458858692699569092721079750930295532116534498720275596023648066549911988183479775356636980742654252786255181841757467289097777279380008164706001614524919217321721477235014144197356854816136115735255213347574184946843852332390739414333454776241686251898356948556209921922218427255025425688767179049460165346680498862723279178608578438382796797668145410095388378636095068006422512520511739298489608412848862694560424196528502221066118630674427862203919494504712371378696095636437191728746776465757396241389086583264599581339047802759009946576407895126946839835259570982582262052248940772671947826848260147699090264013639443745530506820349625245174939965143142980919065925093722169646151570985838741059788595977297549893016175392846813826868386894277415599185592524595395943104997252468084598727364469584865383673622262609912460805124388439045124413654976278079771569143599770012961608944169486855584840635342207222582848864815845602850601684273945226746767889525213852254995466672782398645659611635488623057745649803559363456817432411251507606947945109659609402522887971089314566913686722874894056010150330861792868092087476091782493858900971490967598526136554978189312978482168299894872265880485756401427047755513237964145152374623436454285844479526586782105114135473573952311342716610213596953623144295248493718711014576540359027993440374200731057853906219838744780847848968332144571386875194350643021845319104848100537061468067491927819119793995206141966342875444064374512371819217999839101591956181467514269123974894090718649423196 Awesome! Our cron job works as expected. You can imagine how this functionality might be useful for making regular backups of your database, application upgrades and any other task. As it comes to automation, cron jobs are gold! Cleaning Up If you don’t need a cron Job anymore, delete it with kubectl delete cronjob: $ kubectl delete cronjob pi-cron cronjob "pi-cron" deleted Deleting the cron job will remove all the jobs and pods it created and stop it from spawning additional jobs. 
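If you only want to pause a cron job rather than delete it, the .spec.suspend field discussed earlier can be toggled in place. A short, hedged sketch (the patch syntax is standard kubectl; pi-cron is the cron job from this tutorial):

kubectl patch cronjob pi-cron -p '{"spec":{"suspend":true}}'

Executions that have already started keep running, but no new jobs are spawned until you patch suspend back to false.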
Conclusion Hopefully, you now have a better understanding of how cron jobs can help you automate tasks in your Kubernetes application. We used a simple example that can kickstart your thought process. However, when working with real-world Kubernetes cron jobs, please be aware of the following limitations. A cron job creates a job object approximately once per execution time of its schedule. There are certain scenarios where two jobs are created or no job is created at all. Therefore, to avoid side effects, jobs should be idempotent, which means they should not change the data consumed by other scheduled jobs. If .spec.startingDeadlineSeconds is set to a large value or left unset (the default) and if .spec.concurrencyPolicy is set to Allow, the jobs will always run at least once. If starting your job late is better than not starting it at all, set a longer .spec.startingDeadlineSeconds so the job can still start despite the delay. If you keep these limitations and best practices in mind, your cron jobs will never let your application down. [Less]