diff --git a/docs/design/proposals/multi-version/images/Kubernetes-Admission-controllers-01-flow-diagram.jpg b/docs/design/proposals/multi-version/images/Kubernetes-Admission-controllers-01-flow-diagram.jpg new file mode 100644 index 0000000000..e7f95983af Binary files /dev/null and b/docs/design/proposals/multi-version/images/Kubernetes-Admission-controllers-01-flow-diagram.jpg differ diff --git a/docs/design/proposals/multi-version/multi-version.md b/docs/design/proposals/multi-version/multi-version.md new file mode 100644 index 0000000000..55cb3b72f7 --- /dev/null +++ b/docs/design/proposals/multi-version/multi-version.md @@ -0,0 +1,361 @@ +# ACK Multi-Version CRD support + +## Background + +### ACK and upgrade issues + +AWS Controllers for Kubernetes (ACK) is a set of open-source controllers for +managing AWS resources in Kubernetes. It's designed to simplify the management +of AWS resources in Kubernetes by taking advantage of the native kubernetes +management system known as the Control loop or Reconcilliating systems. + +Currently, ACK uses a code-genrator to generate everything that is need to +build and run a controller: API definitions, CRDs, AWSResourceManager, helm +charts etc... 98% of the controllers code is generated by the ACK community +members. When a new service team introduces a new field, resource or +operation all we do on the ACK side is bumping the aws-sdk-go version and +regenerate the controller. :magic: + +However, when service teams/ack developers have to rename a field or make +a field a kubernetes secret reference, they make significant changes to +the CRD that are backward incompatible. In kubernetes world, CRD schemas +should not be modified but rather introduced in a new CRD version. + +### Kubernetes and multi-version CRDs + +In a multi-version CRDs, each version of the resource is identified by a +distinct version string (v1alpha1, v1beta2, v1 etc...). Kubernetes uses the +version string to route API requests to the appropriate version of the +resource. + +Even when dealing with multiple versions, we still need one controller handling +the main "Hub" version, and use some "mutating" mechanisms to transform the +non-main "Spoke" version to the hub version. This mechanism is known as the +mutating webhooks. + +Mutating webhooks are a type of admission controller that allows you to modify +incoming requests to Kubernetes API server before they are persisted. They can +be used to modify or validate the content of a Kubernetes object before it's +created or updated. There are three known type of admission controllers: +- Mutating Webhooks +- Defaulting Webhooks +- Validating Webhooks + +Mutating webhooks work by intercepting incoming requests to the Kubernetes +API server and forwarding them to a webhook endpoint. The webhook endpoint +can then modify the request payload, add, remove or change fields or validate +the request payload before it's persisted in etcd storage. Once the webhook +endpoint modifies the request payload, the admission controller sends the +updated payload to the API server for storage. + +In the context of multi-version CRDs, mutating webhooks can be used to +automatically update older versions of custom resources to the latest version. +The mutating webhook can modify the payload of an older version of the +resource to match the new version of the resource. This can help you maintain +backwards compatibility when you need to change the schema of a custom resource. + +[TODO Add schema explaining the admission webhooks + api-server + controller +interactions] + +## Motifivation + +The purpose of this design document is to outline a plan for adding support +for multi-versioning and generating mutating webhooks for ACK Controllers. +Multi-versioning and mutating webhooks are important features that can help +improve the reliability and maintainability of custom resources in Kubernetes. + +This document will provide an overview of the current state of ACK code-generator +and existing controllers, and the need for multi-versioning and generating +mutating webhooks. +It will also propose a solution for adding these features to ACK and outline +the technical challenges involved in implementing them. The document will then +provide a high-level plan for how to implement multi-versioning and mutating +webhooks in ACK and discuss best practices for testing and validation. + +## Goals + +- Provide a clear understanding of the need for multi-versioning and + generating mutating webhooks for ACK Controllers. +- Propose a solution for adding these features to ACK +- Outline the technical challenges involved in implementing multi-versioning + and generating mutating webhooks in ACK. +- Provide a high-level plan for how to implement multi-versioning and + generating mutating webhooks in ACK +- Discuss best practices for testing and validation of the new features + + +## Non-Goals + +- Provide a comprehensive overview of all features of ACK +- Provide detailed instructions for using ACK +- Discuss implementation details outside of the scope of multi-versioning +and generating mutating webhooks in ACK + +## Prior work + +Prior to this design, several components were built to support +multi-versioning in Amazon Controllers for Kubernetes (ACK). The focus +of these efforts was mainly on modifying the ACK/code-generator tool +to generate API definitions for different API versions and enabling +the webhook server through controller flags. A key contribution was the +generalization of webhook generation beyond just mutating webhooks, +as described in the webhook PR. + +Github [issue][mv-gh-issue] and +code-generator PRs: +- https://github.com/aws-controllers-k8s/code-generator/pull/108 +- https://github.com/aws-controllers-k8s/code-generator/pull/111 +- https://github.com/aws-controllers-k8s/code-generator/pull/121 +- https://github.com/aws-controllers-k8s/code-generator/pull/122 + +### Issue with prior work + +However, one issue that prevented completion of the webhook PR was the +difference between how ACK views CRD versions and how they are handled +in Kubernetes. In ACK, all resources are versioned with the same APIVersion, +while in Kubernetes, resources can have multiple versions and can deprecate +at various stages and levels. This makes it challenging to implement +webhooks that can handle multiple versions and deprecations effectively. + +## Overview of the design + +The proposed design aims to address the challenges of multi-versioning +and generating mutating webhooks in Amazon Controllers for Kubernetes (ACK). +It involves modifications to the ACK/code-generator tool and the implementation +of a webhook server to handle multiple versions and deprecations of Custom +Resource Definitions (CRDs). The solution uses a combination of Kubernetes API +machinery and controller-runtime libraries, and the document outlines a +high-level plan for implementation, including updating the code-generator +tool and modifying ACK's controllers. Additionally, the document discusses +the testing strategies that will be employed to ensure backward compatibility. + +### Rethinking the ACK versioning + +Before jumping to any special topic, we need to answer two important questions: +- Do we still want to bump all the CRD versions all at once? meaning that + if we rename a field in CRD A, CRD B C D and E will also need a version bump + (and a mutating webhook handler) even if they are not modified? +- How do we deprecate/remove versions in a programatical way? + +#### CRDs Mono Version vs Group Version + +The answer to the questions above will impact the design of [`metadata.yaml`] +[iam-metadata.yaml]. And how we think about ACK CRD versioning and how +version deprecation happens. Do we deprecate a set of CRD versions? or only +one specific CRD version? unfortunetaly picking one of those solutions is kinda +a hard one way door solution to say the least. + +CRD Group versioning: + +- Pros: +1. Current ACK code-generator works this way +2. Easier code generation (import paths, packaging etc...) +3. Easier CRD Version difference calculation +4. Doesn't enormously impact the rest of the features (references etc...) +5. No need to change `metadata.yaml` + +- Cons: +1. Confusing for users +2. Little CRD Changes means we will bump all the CRDs versions +3. Not very kubernetes style. + +CRD Mono versioning: + +- Pros: +1. Similar to k8s style +2. Easier deployments (risk reduced when upgrading CRDs) +3. Less error zone as we will target resources one by one + +- Cons: +1. Very tricky in code generation +2. Needs a rethinking of generator.yaml + metadata.yaml +3. We don't know the impact on existing features (references etc...) + +#### CRDs version deprecation + +Curently we only have `metadata.yaml` as a tool to deprecate versions, see +APIInfoStatuses contants in [code-generator/pkg/metadata][code-gen/pkg/md]. Which +works very well with `pkg/model/multiversion` package. In case we want to go with +CRD Mono Version way, we will have to change this model to be Resource specific +rather than version specific. + +### Webhook server + +Controller runtime already offers webhook server setup as a builtin mechanisms, and +it was ported to ACK runtime/code-generator earlier in [this PR][ack-webhook-pr]. +The missing part here is a certificate issuer like Cert Manager - behind the scenes +the webhook server needs to authenticate the controllers in order to allow them to +mutate/validate CRs. +The Cert Manager part will be discuss bellow. + +### Code Generation + +```go +// APIVersion represents an API version of the generated CRD. +type APIVersion struct { + // Name of the API version, e.g. v1beta1 + Name string `json:"name"` + // Served whether this version is enabled or not + Served *bool `json:"served,omitempty"` + // Storage whether this version is the storage version. + // One and only one version can be set as the storage version. + Storage *bool `json:"storage,omitempty"` +} +``` + +https://github.com/aws-controllers-k8s/code-generator/pull/122 +[TODO AMINE EXPAND THIS SECTION] + +### Cert manager + +Cert-manager plays a critical role in securing and authenticating communication between +different components of the Cluster. It is essential to provide valid certificates for the +webhook server to ensure secure communication between the webhook server and the Controllers +API server. Without valid certificates, the webhook server cannot be authenticated, which +can result in a failure to register the webhooks. Therefore, it is imperative to use +cert-manager to ensure that the webhook server has a valid certificate before registering +any webhooks. + +Cert manager will probably be added as an optional install to the helm charts, and will +be installed by default when running multi-version e2e tests. + +# Implementation details + +Goal is whatever generated code it should keep things backward compatible + +## How to use multi-version with code-gen + +Scenarios: +Renaming a field +```yaml +resources: + FunctionURLConfig: + apiVersions: + # introduce a new version for FunctionURLConfigs + - name: v1alpha2 + served: true # api is still served by the controller + storage: true # api is storage version of the CRD (usually it should be the last one) + # Renaming FunctionName to Function + renames: + operations: + CreateFunctionURLConfig: + input_fields: + FunctionName: Function + ... +``` + +How it should work. +1. Pick field from old name + +Generated code +```go +``` + + +Removing a field +```yaml +``` +generated code + +Adding fields: +1. Going-up: Look up the field values from annotations +2. Going-down: Store new value in annotation + +Making a field a secret +```yaml +``` + +spec.Password + +1. Going-up: Look up the field values from annotations +2. Going down: Store secret name/namespace in annotations + +lambda.services.k8s.aws/field-name:{ + "v1alpha2-v1alpha1": { + "name": "***" + "namespace": "***" + "key": "***" + }, + "v1alpha2-v1alpha1": { + "name": "***" + "namespace": "***" + "key": "***" + } +} + +lambda.services.k8s.aws/multi-version:{ + "field-name": { + "v1alpha2-v1alpha1": { + "name": "***" + "namespace": "***" + "key": "***" + }, + }, + { + "field-2": + "v1alpha2-v1alpha1": { + "name": "***" + "namespace": "***" + "key": "***" + } + } +} + +backward compatibility + +Making a field a reference +```yaml +``` + + +## Testing + +### Unit tests + +To ensure the correctness of the code generated by `ack-generate`, unit tests will be +written to validate the generated codes behavior. These tests will ensure that the +code generated for each version of the API is correct and functional. It should not target +all the CRDs of one controller, but try to cover all the possible struct/shape combinations, +like ApiGateway, DynamoDB, RDS, etc... + +These tests will be run as part of the ack codegen prow job. + +### E2E tests + +The E2E tests will ensure that the generated webhooks are registered correctly and +function as expected. The e2e tests should be able to test multiple stages from the +controller startup to the resources being handled in multiple versions. + +To test the multi-version feature, we will create multiple CRDs with different versions +and test the controller's behavior when handling them. The E2E tests will also test the +webhook behavior when a new version of the CRD is introduced, ensuring that the webhook +correctly converts the old version to the new version. + +To test the generated mutating webhooks, we will create test cases where the webhook is +expected to mutate the incoming request. These tests will ensure that the generated +webhook correctly mutates the request and that the resulting object is valid and conforms +to the new version's schema. + +#### Requirements + +The process of spining a controller, upgrading it and it's CRDs require a good rework of +our existing prow e2e testing framework, part of that work is described in this [design proposal] +[test-infra-proposal]. + +## Alternatives + +Discuss why cuelang can be of help in the future. + +--- + +[aws-sdk-go]: +[aws-sdk-go-version]: +[kubernetes-docs]: +[kubernetes-versioning-docs]: +[ack-webhook-pr]: https://github.com/aws-controllers-k8s/code-generator/pull/111 +[ack-webhook-pr-rt]: https://github.com/aws-controllers-k8s/runtime/pull/29 +[mv-gh-issue]: https://github.com/aws-controllers-k8s/community/issues/835 +[webhook-pr]: https://github.com/aws-controllers-k8s/code-generator/pull/122 +[iam-metadata.yaml]: https://github.com/aws-controllers-k8s/iam-controller/blob/main/metadata.yaml#L6-L8 +[code-gen/pkg/md]: https://github.com/aws-controllers-k8s/code-generator/blob/main/pkg/metadata/api_info.go#L18-L22 +[test-infra-proposal]: https://github.com/aws-controllers-k8s/test-infra/pull/310 \ No newline at end of file