OperationJob
The OperationJob Workload is responsible for performing one-shot operational tasks on a batch of Pods, providing scaffolding for Pod operation scenarios to reduce user development costs.
OperationJob offers an abstract interface layer for Pod operational capabilities, supporting developers to implement operational functions as plugins.
Each operational plugin will be presented as a type of Action API in OperationJob, such as Replace.
Additionally, it optionally facilitates seamless integration with the PodOpsLifecycle to ensure lossless traffic changes during operations.
Example
Following docs will guide you to play with OperationJob, and to implement OperationJob action plugin.
Replace
Prepare Pods
Given that a CollaSet with more than 2 replicas is presented in your kubernetes cluster.
$ kubectl get cls
NAME     DESIRED   CURRENT   AVAILABLE   UPDATED   UPDATED_READY   UPDATED_AVAILABLE   CURRENT_REVISION   UPDATED_REVISION   AGE
foo      2         2         2           2         2               2                   foo-7bdb974bc7     foo-7bdb974bc7     7s
$ kubectl get pod
NAME           READY   STATUS    RESTARTS      AGE
foo-752sz      1/1     Running   0             41s
foo-jttd5      1/1     Running   0             41s
Create OperationJob
The following operationjob.yaml file describes a Replace OperationJob, which will replace pods in targets.
For each replace operation, a new pod will be created to replace the target pod, which will not be deleted until new pod is ServiceAvailable.  
apiVersion: apps.kusionstack.io/v1alpha1
kind: OperationJob
metadata:
  name: opj-replace
  namespace: default
spec:
  action: Replace # Operation type is replace
  activeDeadlineSeconds: 3600 # job will be forced failed after 3600s since startTime
  TTLSecondsAfterFinished: 18000 # job will be deleted after 18000s since job failed or succeeded
  partition: 1 # replace 1 pod at this time
  targets:
  - name: foo-jttd5
  - name: foo-752sz
Create OperationJob opj-replace to replace target pods.
$ kubectl apply -f operationjob.yaml
operationjobs.apps.kusionstack.io/opj-replace created
Replace Pods
The status of OperationJob is updated, and target pod foo-jttd5 is replaced by foo-mpl7n.
$ kubectl get opj
NAME          PROGRESS     AGE
opj-replace   Processing   11s
$ kubectl get pod
NAME        READY   STATUS              RESTARTS   AGE
foo-752sz   1/1     Running             0          92s
foo-jttd5   1/1     Running             0          92s
foo-mpl7n   0/1     ContainerCreating   0          4s
$ kubectl get opj opj-replace -o yaml | grep -A20 status
status:
  observedGeneration: 1
  progress: Processing # job is processing
  succeededPodCount: 1
  targetDetails:
  - extraInfo:
      NewPod: foo-mpl7n 
    name: foo-jttd5
    progress: Succeeded # foo-jttd5 is replaced by foo-mpl7n suceeded
  - name: foo-752sz
    progress: Pending # replace is pending
  totalPodCount: 2
The status.progress can be:
- Pending: operationJob is waiting to be processed
- Processing: operationJob is being processed
- Failed: some target pods have failed to operate
- Succeeded: all target pods have succeeded to operate
Note that if a target pod has failed to operate, status.targetDetails[x].error will show the reason and message for failure.
And if it has succeeded to operate, the error status will be cleared.
The status.targetDetails[x].progress can be:
- Pending: target pod is waiting to be operated
- Processing: target pod is being operated
- Failed: target pod has failed to operate
- Succeeded: target pod has succeeded to operate
Edit opj-replace to replace the other target pod.
$ kubectl edit opj opj-replace
# operationjob.yaml
# Edit partition to 2 to replace all pods
...
spec:
  ...
  partition: 2
All pods replaced.
$ kubectl get pod
NAME        READY   STATUS              RESTARTS   AGE
foo-752sz   1/1     Running             0          8m5s
foo-mpl7n   1/1     Running             0          6m37s
foo-rgxbl   0/1     ContainerCreating   0          5s
$ kubectl get opj opj-replace -o yaml | grep -A20 status
status:
  endTimestamp: "2024-09-13T08:47:43Z"
  observedGeneration: 2
  progress: Succeeded # all pods are replaced, job is suceeded
  succeededPodCount: 2
  targetDetails:
  - extraInfo:
      NewPod: foo-mpl7n
    name: foo-jttd5
    progress: Succeeded
  - extraInfo:
      NewPod: foo-rgxbl
    name: foo-752sz
    progress: Succeeded # foo-752sz is replaced by foo-rgxbl suceeded
  totalPodCount: 2
$ kubectl get opj
NAME          PROGRESS    AGE
opj-replace   Succeeded   6m42s
The status.targetDetails[x].extraInfo is a key-value string map, which is used to store operate information for target.
Developers can define and utilize specified extraInfos for their action plugins.
Tutorial
Action Plugin
Developers implement and register action plugin, then OperationJob controller is responsible for running it:

Action plugin is formulated as golang adapter ActionHandler, which consists 4 idempotent functions:
- Setupsets up action in- AddToMgr, i.e., watch resources for action, register cache index
- OperateTargetoperates what you want to the target pod
- GetOpsProgressgets current operation status of target pod, i.e., "Processing", "Failed" and "Succeeded"
- ReleaseTargetcleans up target pod and operating environment when operation finished or job deleted
...
type ActionHandler interface {
    // Setup sets up action with manager in AddToMgr, i.e., watch, cache...
    Setup(controller.Controller, *mixin.ReconcilerMixin) error
    // OperateTarget do real operation to target
    OperateTarget(context.Context, *OpsCandidate, *appsv1alpha1.OperationJob) error
    // GetOpsProgress returns target's current opsStatus, e.g., progress, reason, message
    GetOpsProgress(context.Context, *OpsCandidate, *appsv1alpha1.OperationJob) (progress ActionProgress, err error)
    // ReleaseTarget releases the target from operation when the operationJob is deleted
    ReleaseTarget(context.Context, *OpsCandidate, *appsv1alpha1.OperationJob) error
}
Register Action
Developers can register implemented action plugins by calling RegisterAction before OperationJob controller AddToMgr is called.
The register function consists 3 parameters: 
- action: string, name of action plugin, showed in- spec.action
- hander: ActionHandler, the implemented adapter
- enablePodOpsLifecycle: bool, if true, target pods will be operated in the manner of PodOpsLifecycle
...
// RegisterAction will register an operationJob action with handler and lifecycleAdapter
// Note: if enablePodOpsLifecycle=false, this operation will be done directly, ignoring podOpsLifecycle
func RegisterAction(action string, handler ActionHandler, enablePodOpsLifecycle bool) {...}
Example
As an example, OperationJob natively supports Replace action. The Replace ActionHandler is implemented and registered before OperationJob controller added in main function.