**The original author of this page is** [**Jorge**](https://www.linkedin.com/in/jorge-belmonte-a924b616b/) **\(read his original post** [**here**](https://sickrov.github.io/)**\)**
* **Pod**: Wrapper around a container or multiple containers with. A pod should only contain one application \(so usually, a pod run just 1 container\). The pod is the way kubernetes abstracts the container technology running.
* **Service**: Each pod has 1 internal **IP address** from the internal range of the node. However, it can be also exposed via a service. The **service has also an IP address** and its goal is to maintain the communication between pods so if one dies the **new replacement** \(with a different internal IP\) **will be accessible** exposed in the **same IP of the service**. It can be configured as internal or external. The service also actuates as a **load balancer when 2 pods are connected** to the same service. When a **service** is **created** you can find the endpoints of each service running `kubectl get endpoints`
* **Kubelet**: Primary node agent. The component that establishes communication between node and kubectl, and only can run pods \(through API server\). The kubelet doesn’t manage containers that were not created by Kubernetes.
* **Kube-proxy**: is the service in charge of the communications \(services\) between the apiserver and the node. The base is an IPtables for nodes. Most experienced users could install other kube-proxies from other vendors.
* **Sidecar container**: Sidecar containers are the containers that should run along with the main container in the pod. This sidecar pattern extends and enhances the functionality of current containers without changing them. Nowadays, We know that we use container technology to wrap all the dependencies for the application to run anywhere. A container does only one thing and does that thing very well.
* **Api Server:** Is the way the users and the pods use to communicate with the master process. Only authenticated request should be allowed.
* **Scheduler**: Scheduling refers to making sure that Pods are matched to Nodes so that Kubelet can run them. It has enough intelligence to decide which node has more available resources the assign the new pod to it. Note that the scheduler doesn't start new pods, it just communicate with the Kubelet process running inside the node, which will launch the new pod.
* **Kube Controller manager**: It checks resources like replica sets or deployments to check if, for example, the correct number of pods or nodes are running. In case a pod is missing, it will communicate with the scheduler to start a new one. It controls replication, tokens, and account services to the API.
* **etcd**: Data storage, persistent, consistent, and distributed. Is Kubernetes’s database and the key-value storage where it keeps the complete state of the clusters \(each change is logged here\). Components like the Scheduler or the Controller manager depends on this date to know which changes have occurred \(available resourced of the nodes, number of pods running...\)
Note that as the might be several nodes \(running several pods\), there might also be several master processes which their access to the Api server load balanced and their etcd synchronized.
When a pod creates data that shouldn't be lost when the pod disappear it should be stored in a physical volume. **Kubernetes allow to attach a volume to a pod to persist the data**. The volume can be in the local machine or in a **remote storage**. If you are running pods in different physical nodes you should use a remote storage so all the pods can access it.
* **ConfigMap**: You can configure **URLs** to access services. The pod will obtain data from here to know how to communicate with the rest of the services \(pods\). Note that this is not the recommended place to save credentials!
* **Secret**: This is the place to **store secret data** like passwords, API keys... encoded in B64. The pod will be able to access this data to use the required credentials.
* **Deployments**: This is where the components to be run by kubernetes are indicated. A user usually won't work directly with pods, pods are abstracted in **ReplicaSets** \(number of same pods replicated\), which are run via deployments. Note that deployments are for **stateless** applications. The minimum configuration for a deployment is the name and the image to run.
* **Ingress**: This is the configuration that is use to **expose the application publicly with an URL**. Note that this can also be done using external services, but this is the correct way to expose the application.
* If you implement an Ingress you will need to create **Ingress Controllers**. The Ingress Controller is a **pod** that will be the endpoint that will receive the requests and check and will load balance them to the services. the ingress controller will **send the request based on the ingress rules configured**. Note that the ingress rules can point to different paths or even subdomains to different internal kubernetes services.
* A better security practice would be to use a cloud load balancer or a proxy server as entrypoint to don't have any part of the Kubernetes cluster exposed.
* When request that doesn't match any ingress rule is received, the ingress controller will direct it to the "**Default backend**". You can `describe` the ingress controller to get the address of this parameter.
**Minikube** can be used to perform some **quick tests** on kubernetes without needing to deploy a whole kubernetes environment. It will run the **master and node processes in one machine**. Minikube will use virtualbox to run the node. See [**here how to install it**](https://minikube.sigs.k8s.io/docs/start/).
```text
$ minikube start
😄 minikube v1.19.0 on Ubuntu 20.04
✨ Automatically selected the virtualbox driver. Other choices: none, ssh
💿 Downloading VM boot image ...
> minikube-v1.19.0.iso.sha256: 65 B / 65 B [-------------] 100.00% ? p/s 0s
**`Kubectl`** is the command line tool fro kubernetes clusters. It communicates with the Api server of the master process to perform actions in kubernetes or to ask for data.
Each configuration file has 3 parts: **metadata**, **specification** \(what need to be launch\), **status** \(desired state\).
Inside the specification of the deployment configuration file you can find the template defined with a new configuration structure defining the image to run:
#### Example of Deployment + Service declared in the same configuration file \(from [here](https://gitlab.com/nanuchi/youtube-tutorial-series/-/blob/master/demo-kubernetes-components/mongo.yaml)\)
As a service usually is related to one deployment it's possible to declare both in the same configuration file \(the service declared in this config is only accessible internally\):
A **ConfigMap** is the configuration that is given to the pods so they know how to locate and access other services. In this case, each pod will know that the name `mongodb-service` is the address of a pod that they can communicate with \(this pod will be executing a mongodb\):
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: mongodb-configmap
data:
database_url: mongodb-service
```
Then, inside a **deployment config** this address can be specified in the following way so it's loaded inside the env of the pod:
You can find different example of storage configuration yaml files in [https://gitlab.com/nanuchi/youtube-tutorial-series/-/tree/master/kubernetes-volumes](https://gitlab.com/nanuchi/youtube-tutorial-series/-/tree/master/kubernetes-volumes).
Kubernetes supports **multiple virtual clusters** backed by the same physical cluster. These virtual clusters are called **namespaces**. These are intended for use in environments with many users spread across multiple teams, or projects. For clusters with a few to tens of users, you should not need to create or think about namespaces at all. You only should start using namespaces to have a better control and organization of each part of the application deployed in kubernetes.
Namespaces provide a scope for names. Names of resources need to be unique within a namespace, but not across namespaces. Namespaces cannot be nested inside one another and **each** Kubernetes **resource** can only be **in****one****namespace**.
There are 4 namespaces by default if you are using minikube:
```text
kubectl get namespace
NAME STATUS AGE
default Active 1d
kube-node-lease Active 1d
kube-public Active 1d
kube-system Active 1d
```
* **kube-system**: It's not meant or the users use and you shouldn't touch it. It's for master and kubectl processes.
* **kube-public**: Publicly accessible date. Contains a configmap which contains cluster information
* **kube-node-lease**: Determines the availability of a node
* **default**: The namespace the user will use to create resources
Note that most Kubernetes resources \(e.g. pods, services, replication controllers, and others\) are in some namespaces. However, other resources like namespace resources and low-level resources, such as nodes and persistenVolumes are not in a namespace. To see which Kubernetes resources are and aren’t in a namespace:
```bash
kubectl api-resources --namespaced=true #In a namespace
kubectl api-resources --namespaced=false #Not in a namespace
```
{% endhint %}
You can save the namespace for all subsequent kubectl commands in that context.
Helm is the **package manager** for Kubernetes. It allows to package YAML files and distribute them in public and private repositories. These packages are called **Helm Charts**.
```text
helm search <keyword>
```
Helm is also a template engine that allows to generate config files with variables:
A **Secret** is an object that **contains sensitive data** such as a password, a token or a key. Such information might otherwise be put in a Pod specification or in an image. Users can create Secrets and the system also creates Secrets. The name of a Secret object must be a valid **DNS subdomain name**. Read here [the official documentation](https://kubernetes.io/docs/concepts/configuration/secret/).
The following configuration file defines a **secret** called `mysecret` with 2 key-value pairs `username: YWRtaW4=` and `password: MWYyZDFlMmU2N2Rm`. It also defines a **pod** called `secretpod` that will have the `username` and `password` defined in `mysecret` exposed in the **environment variables**`SECRET_USERNAME`__and __`SECRET_PASSWOR`. It will also **mount** the `username` secret inside `mysecret` in the path `/etc/foo/my-group/my-username` with `0640` permissions.
**etcd** is a consistent and highly-available **key-value store** used as Kubernetes backing store for all cluster data. Let’s access to the secrets stored in etcd:
By default all the secrets are **stored in plain** text inside etcd unless you apply an encryption layer. The following example is based on [https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/](https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/)
After that, you need to set the `--encryption-provider-config` flag on the `kube-apiserver` to point to the location of the created config file. You can modify `/etc/kubernetes/manifest/kube-apiserver.yaml` and add the following lines:
Data is encrypted when written to etcd. After restarting your `kube-apiserver`, any newly created or updated secret should be encrypted when stored. To check, you can use the `etcdctl` command line program to retrieve the contents of your secret.
should match `mykey: bXlkYXRh`, mydata is encoded, check [decoding a secret](https://kubernetes.io/docs/concepts/configuration/secret#decoding-a-secret) to completely decode the secret.
### Vulnerabilities - OS <a id="vulnerabilities---os"></a>
Is mandatory to keep in mind to define privilege and access control for container / pod:
* userID’s and groupID’s.
* Privileged or unprivileged escalation runs.
* Linux.
More info at: [https://kubernetes.io/docs/tasks/configure-pod-container/security-context/](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/)
#### userID and groupID <a id="userid-and-groupid"></a>
Pod security policies control the security policies about how a pod has to run. More info at: [https://kubernetes.io/docs/concepts/policy/pod-security-policy/](https://kubernetes.io/docs/concepts/policy/pod-security-policy/)
Kubernetes has an **authorization module named Role-Based Access Control** \([**RBAC**](https://kubernetes.io/docs/reference/access-authn-authz/rbac/)\) that helps to set utilization permissions to the API server.
The RBAC table is constructed from “**Roles**” and “**ClusterRoles**.” The difference between them is just where the role will be applied – a “**Role**” will grant access to only **one****specific****namespace**, while a “**ClusterRole**” can be used in **all namespaces** in the cluster. Moreover, ClusterRoles can also grant access to:
A **role binding****grants the permissions defined in a role to a user or set of users**. It holds a list of subjects \(users, groups, or service accounts\), and a reference to the role being granted. A **RoleBinding** grants permissions within a specific **namespace** whereas a **ClusterRoleBinding** grants that access **cluster-wide**.
**Permissions are additive** so if you have a clusterRole with “list” and “delete” secrets you can add it with a Role with “get”. So be aware and test always your roles and permissions and **specify what is ALLOWED, because everything is DENIED by default.**
1.**Role\ClusterRole –** The actual permission. It contains _**rules**_ that represent a set of permissions. Each rule contains [resources](https://kubernetes.io/docs/reference/kubectl/overview/#resource-types) and [verbs](https://kubernetes.io/docs/reference/access-authn-authz/authorization/#determine-the-request-verb). The verb is the action that will apply on the resource.
2.**Subject \(User, Group or ServiceAccount\) –** The object that will receive the permissions.
3.**RoleBinding\ClusterRoleBinding –** The connection between Role\ClusterRole and the subject.
It's very important to **protect the access to the Kubernetes Api Server** as a malicious actor with enough privileges could be able to abuse it and damage in a lot of way the environment.
It's important to secure both the **access** \(**whitelist** origins to access the API Server and deny any otehr connection\) and the [**authentication**](https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-authentication-authorization/) \(following the principle of **least****privilege**\). And definitely **never****allow****anonymous****requests**.
* Basically prevents kubelets from adding/removing/updating labels with a node-restriction.kubernetes.io/ prefix. This label prefix is reserved for administrators to label their Node objects for workload isolation purposes, and kubelets will not be allowed to modify labels with that prefix.
* And also, allows kubelets to add/remove/update these labels and label prefixes.
* Ensure with labels the secure workload isolation.
* Avoid specific pods from API access.
* Avoid ApiServer exposure to the internet.
* Avoid unauthorized access RBAC.
* ApiServer port with firewall and IP whitelisting.
\*\*\*\*[**Release cycles**](https://kubernetes.io/docs/setup/release/version-skew-policy/): Each 3 months there is a new minor release -- 1.20.3 = 1\(Major\).20\(Minor\).3\(patch\)