Current location: MCM Design Book.
(🚧 Please see Change Log for new additions/corrections. Please check on 8th Oct for v3.1 release! 🏗)
Introduction
A Kubernetes Controller is a program that watches for lifecycle events on specific resources and triggers one or more reconcile functions in response. A reconcile function is called with the Namespace and Name of an object corresponding to the resource and its job is to make the object Status match the declared state in the object Spec.
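As a rough illustration of the reconcile idea described above, here is a minimal, self-contained Go sketch. It is not MCM code: the `Spec`, `Status`, `Object` and `store` types and the `reconcile` function are hypothetical stand-ins, and an in-memory map plays the role of the API server.

```go
package main

import "fmt"

// Spec is the declared (desired) state; Status is the observed state.
// Both are hypothetical types for this sketch, not Kubernetes API types.
type Spec struct{ Image string }
type Status struct{ RunningImage string }

type Object struct {
	Spec   Spec
	Status Status
}

// store is an in-memory stand-in for the API server, keyed by "namespace/name".
type store map[string]*Object

// reconcile is called with the namespace and name of an object. It looks the
// object up and makes its Status match its Spec. It reports whether it had
// to change anything.
func reconcile(s store, namespace, name string) bool {
	obj, ok := s[namespace+"/"+name]
	if !ok {
		return false // object no longer exists; nothing to do
	}
	if obj.Status.RunningImage == obj.Spec.Image {
		return false // already in the declared state
	}
	// A real controller would act on the outside world here before
	// recording the new observed state.
	obj.Status.RunningImage = obj.Spec.Image
	return true
}

func main() {
	s := store{"default/demo": &Object{Spec: Spec{Image: "v2"}}}
	fmt.Println(reconcile(s, "default", "demo")) // prints "true"
	fmt.Println(reconcile(s, "default", "demo")) // prints "false"
}
```

The second call is a no-op, which is the property a reconcile function is expected to have: calling it again on an object that already matches its Spec changes nothing.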
Machine Controller Manager, aka MCM, is a group of cooperative controllers that manage the lifecycle of worker machines, machine-classes, machine-sets and machine-deployments. All these objects are custom resources.
- A worker Machine is a provider-specific VM/instance that corresponds to a k8s Node. (k8s doesn't bring up nodes on its own; the MCM does so by using cloud provider APIs, abstracted by the Driver facade, to bring up machines and map them to nodes.)
- A MachineClass represents a template that contains cloud provider specific details used to create machines.
- A MachineSet ensures that the specified number of Machine replicas are running at a given point of time. Analogous to k8s ReplicaSets.
- A MachineDeployment provides a declarative update for MachineSet and Machines. Analogous to k8s Deployments.
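To make the MachineSet idea concrete, the following Go sketch computes how many machines would have to be created or deleted to reach the specified replica count. The function name `replicaDiff` is hypothetical, and the real MachineSet controller does considerably more (for example, choosing which machines to delete).

```go
package main

import "fmt"

// replicaDiff returns how many machines a MachineSet-like controller would
// need to create or delete so that the running count matches the desired one.
func replicaDiff(desired, running int) (toCreate, toDelete int) {
	switch {
	case running < desired:
		return desired - running, 0
	case running > desired:
		return 0, running - desired
	default:
		return 0, 0
	}
}

func main() {
	create, del := replicaDiff(3, 1)
	fmt.Println(create, del) // prints "2 0"
}
```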
All the custom resources (Machine-* objects) mentioned above are stored in the K8s control cluster. The nodes corresponding to the machines are created and registered in the target cluster.
For production Gardener deployments, the control cluster is the control plane of the shoot cluster. Since the MCM runs in the shoot's control plane, the kubeconfig for the control cluster is generally specified as the In-Cluster Config. The target cluster is the shoot cluster, and hence the target cluster config is the shoot kubeconfig.
Project Structure
%%{init: {'themeVariables': { 'fontSize': '10px'}, "flowchart": {"useMaxWidth": false }}}%%
flowchart TB
    subgraph MCM
        mcm["machine-controller-manager (Common MC Code, MachineSet, MachineDeploy controllers)"]
        mcmlo["machine-controller-manager-provider-local (Machine Controller Local atop K8s Kind)"]
        mcmaws["machine-controller-manager-provider-aws (Machine Controller for AWS)"]
        mcmazure["machine-controller-manager-provider-azure (Machine Controller for Azure)"]
        mcmgcp["machine-controller-manager-provider-gcp (Machine Controller for GCP)"]
        mcmx["machine-controller-manager-provider-X (Machine Controller for equinix/openstack/etc)"]
    end
    mcmlo--uses-->mcm
    mcmaws--uses-->mcm
    mcmazure--uses-->mcm
    mcmgcp--uses-->mcm
    mcmx-->mcm
The MCM project is divided into:
- The MCM Module. This contains:
  - The MCM Controller Type and MCM Controller Factory Method. The MCM Controller is responsible for reconciling the MachineDeployment and MachineSet custom resources.
  - MCM Main, which creates and starts the MCM Controller.
  - The MC Controller Type and MC Controller Factory Method.
    - The MC Controller implements the reconciliation loop for MachineClass and Machine objects but delegates creation/update/deletion/status-retrieval of Machines to the Driver facade.
  - The Driver facade that abstracts away the lifecycle operations on Machines and obtaining Machine status.
  - Utility Code leveraged by provider modules.
- The provider specific modules named as machine-controller-manager-provider-<providerName>.
  - Each contains a main file located at cmd/machine-controller/main.go that instantiates a Driver implementation (Ex: AWSDriver) and then creates and starts an MC Controller using the MC Controller Factory Method, passing the Driver impl. In other words, each provider module starts its independent machine controller.
  - See MCM README for list of provider modules.
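The relationship between a provider main, the Driver facade and the MC Controller Factory Method can be sketched in Go as follows. This is a simplified illustration under stated assumptions, not the actual MCM API: the `Driver` interface here has reduced, hypothetical method signatures, and `fakeDriver`, `Controller`, `NewController` and `ReconcileMachine` are names invented for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// Driver is a simplified, hypothetical facade over provider-specific
// machine lifecycle operations and status retrieval.
type Driver interface {
	CreateMachine(name string) (providerID string, err error)
	DeleteMachine(name string) error
	GetMachineStatus(name string) (providerID string, err error)
}

var errNotFound = errors.New("machine not found")

// fakeDriver is an in-memory provider used only for this sketch.
type fakeDriver struct{ vms map[string]string }

func (d *fakeDriver) CreateMachine(name string) (string, error) {
	id := "fake:///" + name
	d.vms[name] = id
	return id, nil
}

func (d *fakeDriver) DeleteMachine(name string) error {
	delete(d.vms, name)
	return nil
}

func (d *fakeDriver) GetMachineStatus(name string) (string, error) {
	id, ok := d.vms[name]
	if !ok {
		return "", errNotFound
	}
	return id, nil
}

// Controller stands in for the machine controller; it only talks to the
// provider through the Driver interface.
type Controller struct{ driver Driver }

// NewController plays the role of the factory method: the provider main
// passes in its Driver implementation.
func NewController(d Driver) *Controller { return &Controller{driver: d} }

// ReconcileMachine ensures a VM exists for the named machine and returns
// its provider ID, delegating all provider work to the Driver.
func (c *Controller) ReconcileMachine(name string) (string, error) {
	id, err := c.driver.GetMachineStatus(name)
	if errors.Is(err, errNotFound) {
		return c.driver.CreateMachine(name)
	}
	return id, err
}

func main() {
	// What a provider-specific main would do: build a driver, hand it to
	// the factory, and start reconciling.
	c := NewController(&fakeDriver{vms: map[string]string{}})
	id, err := c.ReconcileMachine("worker-1")
	fmt.Println(id, err) // prints "fake:///worker-1 <nil>"
}
```

Because the controller depends only on the interface, swapping `fakeDriver` for another implementation requires no change to the reconciliation logic, which is the point of keeping provider code behind a facade.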
The MCM leverages the old-school technique of writing controllers directly using client-go. Skeleton code for client types is generated using client-gen. A barebones example is illustrated in the sample controller.
The Modern Way of writing controllers is by leveraging the Controller Runtime and generating skeletal code for custom controllers using the Kubebuilder Tool.
The MCM has a planned backlog to port the project to the controller runtime. The details of this will be documented in a separate proposal. (TODO: link me in future).
This book describes the current design of the MCM in order to aid code comprehension for development, enhancement and migration/port activities.
Deployment Structure
The MCM Pods are part of the Deployment named machine-controller-manager that resides in the shoot control plane. After logging into the shoot control plane (use gardenctl), you can view the deployment details using k get deploy machine-controller-manager -o yaml. The MCM deployment has two containers:
- machine-controller-manager-provider-<provider>. Ex: machine-controller-manager-provider-aws. This container name is a bit misleading as it starts the provider specific machine controller main program responsible for reconciling machine-classes and machines. See Machine Controller. (Ideally the -manager should have been removed.)
Container command configured on AWS:
./machine-controller
--control-kubeconfig=inClusterConfig
--machine-creation-timeout=20m
--machine-drain-timeout=2h
--machine-health-timeout=10m
--namespace=shoot--i034796--tre
--port=10259
--target-kubeconfig=/var/run/secrets/gardener.cloud/shoot/generic-kubeconfig/kubeconfig
- <provider>-machine-controller-manager. Ex: aws-machine-controller-manager. This container name is a bit misleading as it starts the machine deployment controller main program responsible for reconciling machine-deployments and machine-sets. (See: TODO: link me). Ideally it should have been called simply machine-deployment-controller as it is provider independent.
Container command configured on AWS:
./machine-controller-manager
--control-kubeconfig=inClusterConfig
--delete-migrated-machine-class=true
--machine-safety-apiserver-statuscheck-timeout=30s
--machine-safety-apiserver-statuscheck-period=1m
--machine-safety-orphan-vms-period=30m
--machine-safety-overshooting-period=1m
--namespace=shoot--i034796--tre
--port=10258
--safety-up=2
--safety-down=1
--target-kubeconfig=/var/run/secrets/gardener.cloud/shoot/generic-kubeconfig/kubeconfig
Local Development Tips
First read Local Dev MCM
Running MCM Locally
After setting up a shoot cluster in the dev landscape, you can run your local copy of MCM and MC to manage machines in the shoot cluster.
Example for AWS Shoot Cluster:
- Check out https://github.com/gardener/machine-controller-manager and https://github.com/gardener/machine-controller-manager-provider-aws/
- cd machine-controller-manager and run ./hack/gardener_local_setup.sh --seed <seedManagingShoot> --shoot <shootName> --project <userId> --provider aws
  - Ex: ./hack/gardener_local_setup.sh --seed aws-ha --shoot aw2 --project i034796 --provider aws
  - The above will set the replica count of the machine-controller-manager deployment in the shoot control plane to 0 and also set the annotation dependency-watchdog.gardener.cloud/ignore-scaling to prevent DWD from scaling it back up. Now, you can run your local dev copy.
- Inside the MCM directory, run make start
  - MCM controller should start without errors. Last line should look like:
    I0920 14:11:28.615699 84778 deployment.go:433] Processing the machinedeployment "shoot--i034796--aw2-a-z1" (with replicas 1)
- Change to the provider directory. Ex: cd <checkoutPath>/machine-controller-manager-provider-aws and run make start
  - MC controller should start without errors. Last lines should look like:
    I0920 14:14:37.793684 86169 core.go:482] List machines request has been processed successfully
    I0920 14:14:37.896720 86169 machine_safety.go:59] reconcileClusterMachineSafetyOrphanVMs: End, reSync-Period: 5m0s
Change Log
- WIP Draft of Orphan/Safety
- WIP for machine set controller.