Reconcile Cluster Machine Set
A MachineSet is to a Machine what a ReplicaSet is to a Pod. A MachineSet ensures that the specified number of Machines are running at any given time.
A MachineSet is rarely created directly. It is generally owned by its parent MachineDeployment, and its ObjectMeta.OwnerReferences slice holds a reference to the parent deployment.
The MCM controller method reconcileClusterMachineSet is invoked for keys retrieved from the machineSetQueue, as shown below.
worker.Run(c.machineSetQueue,
"ClusterMachineSet",
worker.DefaultMaxRetries, true, c.reconcileClusterMachineSet, stopCh, &waitGroup)
The following is the flow diagram for func (c *controller) reconcileClusterMachineSet(key string) error. As can be observed, there is room for optimization. On any error in the flow below, the MachineSet key is added back to the machineSetQueue according to the default rate limiting.
%%{init: { 'themeVariables': { 'fontSize': '11px'},"flowchart": {"defaultRenderer": "elk"}} }%%
flowchart TD
Begin((" "))
-->GetMachineSet["machineSet=Get MS From Lister"]
-->ValidateMS["validation.ValidateMachineSet(machineSet)"]
-->ChkDeltimestamp1{"machineSet.DeletionTimestamp?"}
ChkDeltimestamp1-->|no| AddFinalizersIMissing["addFinalizersIfMissing(machineSet)"]
ChkDeltimestamp1-->|yes| GetAllMS["allMachines = list all machines"]
-->GetMSSelector["selector = LabelSelectorAsSelector(machineSet.Spec.Selector)"]
-->ClaimMachines["claimedMachines=claimMachines(machineSet, selector, allMachines)"]
-->SyncNT["synchronizeMachineNodeTemplates(claimedMachines, machineSet)"]
-->SyncMC["syncMachinesConfig(claimedMachines, machineSet)"]
-->SyncMCK["syncMachinesClassKind(claimedMachines, machineSet)"]
-->ChkDeltimestamp2{"machineSet.DeletionTimestamp?"}-->|no| ScaleUpDown
ChkDeltimestamp2-->|yes| ChkClaimedMachinesLen{"len(claimedMachines) == 0?"}
ChkClaimedMachinesLen-->|yes| DelMSFinalizers["delFinalizers(machineSet)"]
ChkClaimedMachinesLen-->|no| TermMachines["terminateMachines(claimedMachines,machineSet)"]-->CalcMSStatus
DelMSFinalizers-->CalcMSStatus
ScaleUpDown["manageReplicas(claimedMachines) // scale out/in machines"]
-->CalcMSStatus["calculateMachineSetStatus(claimedMachines, machineSet, errors)"]
-->UpdateMSStatus["updateMachineSetStatus(...)"]
-->enqueueMachineSetAfter["machineSetQueue.AddAfter(msKey, 10m)"]
AddFinalizersIMissing-->GetAllMS
claimMachines
claimMachines tries to take ownership of a machine: it associates a Machine with a MachineSet by setting machine.metadata.OwnerReferences, and releases a Machine it already owns when the MachineSet's selector no longer matches that Machine, unless the MachineSet's deletion timestamp has been set.
- Initialize an empty claimedMachines []Machine slice
- Initialize an empty errlist []error
- Iterate through allMachines and get the ownerRef (the first element in the OwnerReferences slice)
- If the ownerRef is not nil
  - If the ownerRef.UID differs from the machineSet's UID, skip the claim and continue (since the machine belongs to another machine set)
  - If the machineSet's selector matches the labels of the machine, add it to claimedMachines and continue
  - If the machineSet.DeletionTimestamp is set, skip and continue
  - Release the Machine by removing its ownerReference
- If the ownerRef is nil
  - If the machineSet.DeletionTimestamp is set or if the machineSet's selector does not match the machine, skip and continue.
  - If the machine.DeletionTimestamp is set, skip and continue.
  - Adopt the machine, i.e. set the ownerReference to the machineSet and add it to claimedMachines
      ownerReferences:
      - apiVersion: machine.sapcloud.io/v1alpha1
        blockOwnerDeletion: true
        controller: true
        kind: MachineSet
        name: shoot--i034796--aw2-a-z1-8c99f
        uid: 20bc03c5-e95b-4df5-9faf-68be38cb8e1b
- Return claimedMachines.
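The per-machine decision described above can be sketched as a small pure function. This is a minimal illustration and not the MCM implementation: the Machine and MachineSet structs, the action type, and the decide and matches helpers are hypothetical stand-ins, and the selector is reduced to an equality-based label map.

```go
package main

import "fmt"

// Machine and MachineSet are simplified, hypothetical stand-ins for the API types.
type Machine struct {
	Name     string
	Labels   map[string]string
	OwnerUID string // empty string models a nil ownerRef
	Deleting bool   // models a non-nil DeletionTimestamp
}

type MachineSet struct {
	UID      string
	Selector map[string]string // assumption: equality-based selector only
	Deleting bool
}

type action int

const (
	skip action = iota
	claim
	release
	adopt
)

func (a action) String() string {
	return [...]string{"skip", "claim", "release", "adopt"}[a]
}

// matches reports whether every selector key/value is present in labels.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// decide mirrors the branch order of the list above for a single machine.
func decide(ms MachineSet, m Machine) action {
	if m.OwnerUID != "" {
		if m.OwnerUID != ms.UID {
			return skip // owned by another machine set
		}
		if matches(ms.Selector, m.Labels) {
			return claim
		}
		if ms.Deleting {
			return skip // do not release while the machine set is being deleted
		}
		return release
	}
	if ms.Deleting || !matches(ms.Selector, m.Labels) {
		return skip
	}
	if m.Deleting {
		return skip
	}
	return adopt
}

func main() {
	ms := MachineSet{UID: "ms-1", Selector: map[string]string{"pool": "a"}}
	machines := []Machine{
		{Name: "owned-match", OwnerUID: "ms-1", Labels: map[string]string{"pool": "a"}},
		{Name: "owned-stale", OwnerUID: "ms-1", Labels: map[string]string{"pool": "b"}},
		{Name: "orphan", Labels: map[string]string{"pool": "a"}},
		{Name: "foreign", OwnerUID: "ms-2", Labels: map[string]string{"pool": "a"}},
	}
	for _, m := range machines {
		fmt.Printf("%s: %s\n", m.Name, decide(ms, m))
	}
}
```

Keeping the decision separate from the API calls that adopt or release makes the branch order easy to check in isolation.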
synchronizeMachineNodeTemplates
func (c *controller) syncMachinesNodeTemplates(ctx context.Context,
claimedMachines []*Machine, machineSet *MachineSet) error
- This iterates through the claimedMachines and copies the machineset.Spec.Template.Spec.NodeTemplateSpec to the machine.Spec.NodeTemplateSpec
- NOTE: Seems useless IO busy-work to me. When MC launches the Machine, it might as well access the owning MachineSet and get the NodeSpec.
- The only reason to do this is to support independent Machines without owning MachineSets. We will need to see whether such a use-case is truly needed.
NOTE: NodeTemplate describes common resource capabilities like cpu, gpu, memory, etc. in terms of k8s.io/api/core/v1.ResourceList. This is used by the cluster-autoscaler for scaling decisions.
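The copy step can be sketched as follows. This is a simplified illustration under stated assumptions: ResourceList here is a string map standing in for k8s.io/api/core/v1.ResourceList, the Machine struct and the syncNodeTemplates and equalResources helpers are hypothetical, and skipping machines whose template already matches is an assumption of this sketch rather than something the text above states.

```go
package main

import "fmt"

// ResourceList is a stand-in for k8s.io/api/core/v1.ResourceList,
// with quantities simplified to strings.
type ResourceList map[string]string

// Machine is a hypothetical, reduced view of the Machine object.
type Machine struct {
	Name     string
	Capacity ResourceList
}

// equalResources reports whether both lists hold the same keys and values.
func equalResources(a, b ResourceList) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if bv, ok := b[k]; !ok || bv != v {
			return false
		}
	}
	return true
}

// syncNodeTemplates copies tmpl into every machine whose capacity differs
// and returns how many machines were changed.
func syncNodeTemplates(machines []*Machine, tmpl ResourceList) int {
	updated := 0
	for _, m := range machines {
		if equalResources(m.Capacity, tmpl) {
			continue // assumption: unchanged machines need no update
		}
		c := make(ResourceList, len(tmpl))
		for k, v := range tmpl {
			c[k] = v // copy entries so machines do not share the template map
		}
		m.Capacity = c
		updated++
	}
	return updated
}

func main() {
	tmpl := ResourceList{"cpu": "4", "memory": "16Gi"}
	machines := []*Machine{
		{Name: "m-1", Capacity: ResourceList{"cpu": "4", "memory": "16Gi"}},
		{Name: "m-2", Capacity: ResourceList{"cpu": "2", "memory": "8Gi"}},
	}
	fmt.Println("machines updated:", syncNodeTemplates(machines, tmpl))
}
```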
syncMachinesConfig
Copies machineset.Spec.Template.Spec.MachineConfiguration to machine.Spec.MachineConfiguration for all claimedMachines.
See MachineConfiguration inside MachineSpec
syncMachinesClassKind
NOTE: This is useless and should be removed since we only have ONE kind of MachineClass. TODO: Discuss with Himanshu/Rishabh.
func (c *controller) syncMachinesClassKind(ctx context.Context,
claimedMachines []*Machine, machineSet *MachineSet) error
Iterates through claimedMachines and sets machine.Spec.Class.Kind = machineset.Spec.Template.Spec.Class.Kind if not already set.
manageReplicas (scale-out / scale-in)
func (c *controller) manageReplicas(ctx context.Context,
claimedMachines []*Machine, machineSet *MachineSet) error
%%{init: { 'themeVariables': { 'fontSize': '11px'},"flowchart": {"defaultRenderer": "elk"}} }%%
flowchart TD
Begin((" "))
-->Init["activeMachines :=[], staleMachines:=[]"]
-->IterCLaimed["machine := range claimedMachines"]
--loop-->IsActiveOrFailed{"IsMachineActiveOrFailed(machine)"}
IsActiveOrFailed-->|active| AppendActive["append(activeMachines,machine)"]
IsActiveOrFailed-->|failed| AppendFailed["append(staleMachines,machine)"]
IterCLaimed--done-->TermStaleMachines["terminateMachines(staleMachines,machineSet)"]
TermStaleMachines-->Delta["diff := len(activeMachines) - machineSet.Spec.Replicas"]
Delta-->ChkDelta{"diff < 0?"}
ChkDelta-->|yes| ScaleOut["numCreated:=slowStartBatch(-diff,..) // scale out"]
ScaleOut-->Log["Log numCreated/skipped/deleted"]
ChkDelta-->|no| GetMachinesToDelete["machinesToDel := getMachinesToDelete(activeMachines, diff)"]
GetMachinesToDelete-->TermMachines["terminateMachines(machinesToDel, machineSet)"]
-->Log-->ReturnErr["return err"]
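The partition-and-diff logic in the diagram can be sketched as a pure planning function. This is an illustration only: the Machine struct with a Failed flag, the plan type, and planReplicas are hypothetical names, and the sketch returns counts rather than performing the create and delete calls.

```go
package main

import "fmt"

// Machine is a hypothetical, reduced view; Failed stands in for the
// outcome of the active-or-failed check in the diagram.
type Machine struct {
	Name   string
	Failed bool
}

// plan describes what one reconcile pass would do.
type plan struct {
	stale    []Machine // failed machines, always terminated
	toCreate int       // scale-out count when diff < 0
	toDelete int       // scale-in count when diff > 0
}

// planReplicas splits claimed machines into active and stale, then
// compares the active count against the desired replica count.
func planReplicas(claimed []Machine, desired int) plan {
	var active, stale []Machine
	for _, m := range claimed {
		if m.Failed {
			stale = append(stale, m)
		} else {
			active = append(active, m)
		}
	}
	p := plan{stale: stale}
	diff := len(active) - desired
	switch {
	case diff < 0:
		p.toCreate = -diff
	case diff > 0:
		p.toDelete = diff
	}
	return p
}

func main() {
	claimed := []Machine{
		{Name: "m-1"}, {Name: "m-2"}, {Name: "m-3"},
		{Name: "m-4", Failed: true},
	}
	p := planReplicas(claimed, 5)
	fmt.Printf("stale=%d create=%d delete=%d\n", len(p.stale), p.toCreate, p.toDelete)
}
```

Note that failed machines do not count toward the active total, so a failed machine is terminated and a replacement is created in the same pass.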
terminateMachines
func (c *controller) terminateMachines(ctx context.Context,
inactiveMachines []*Machine, machineSet *MachineSet) error {
- Invokes controlMachineClient.Machines(namespace).Delete(ctx, machineID,..) for each Machine in inactiveMachines and records an event.
- The machine.Status.Phase is also set to Terminating.
- This is done in parallel using goroutines and a WaitGroup sized to the length of inactiveMachines.
slowStartBatch
func slowStartBatch(count int, initialBatchSize int, createFn func() error) (int, error)
- Initializes remaining to count and successes as 0.
- Method executes fn (the createFn argument, which creates a Machine object) in parallel, with the number of goroutines starting at batchSize := initialBatchSize and batchSize doubling after each batch of calls to fn.
  - For each batch iteration, a wg sync.WaitGroup is constructed with batchSize. Each batch execution waits for the batch to complete using wg.Wait()
  - For each batch iteration, an errCh is constructed with size batchSize
  - batchSize goroutines execute fn concurrently, sending errors on errCh and invoking wg.Done() when complete.
  - numErrorsInBatch = len(errCh)
  - successes is incremented by batchSize minus numErrorsInBatch
  - If numErrorsInBatch > 0, abort, returning successes and the first error from errCh
  - remaining is decremented by the batchSize
  - Compute batchSize as Min(remaining, 2*batchSize)
  - Continue iteration while batchSize is greater than 0.
  - Return successes, nil when done.
- fn is a lambda that creates a new Machine, in which we do the below:
  - Create an ownerRef with the machineSet.Name and machineSet.UID
  - Get the machine spec template using machineSet.Spec.Template
  - Then create a Machine obj setting the machine spec and ownerRef. Use the machineSet name as the prefix for GenerateName in the ObjectMeta.
  - If any err occurs, return it; otherwise return nil.
  - New Machine objects are persisted using controlMachineClient.Machines(namespace).Create(ctx, machine, createOpts)