
Automation of GCP carbon transfer config is currently not possible #10

Open
JohannesRudolph opened this issue Sep 5, 2022 · 0 comments


JohannesRudolph commented Sep 5, 2022

We added experimental code in 376df06 to create the GCP Carbon Footprint Export dataset and a data transfer job to pull the data into a BigQuery dataset.
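For context, here is a minimal sketch of what such a setup looks like in terraform. This is not the actual module code from 376df06; resource names, the dataset ID/location and the variable are illustrative assumptions, and the carbon footprint data source ID should be verified against GCP's current docs:

```hcl
variable "billing_account_id" {
  type = string
}

# Hedged sketch, not the actual module code: a dataset plus a transfer config
# pulling the Carbon Footprint export into it.
resource "google_bigquery_dataset" "carbon_footprint" {
  dataset_id = "carbon_footprint" # illustrative
  location   = "EU"               # illustrative
}

resource "google_bigquery_data_transfer_config" "carbon_footprint_transfer_config" {
  display_name           = "Carbon Footprint Export"               # illustrative
  data_source_id         = "61cede5a-0000-2440-ad42-883d24f8f7b8"  # Carbon Footprint data source ID; verify against GCP docs
  destination_dataset_id = google_bigquery_dataset.carbon_footprint.dataset_id

  params = {
    billing_accounts = var.billing_account_id
  }
}
```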

Unfortunately, the permission model used by GCP BigQuery transfers is a bit twisted. Here's what I could gather from GCP's documentation:

  • GCP uses a managed service SA to run BigQuery Data Transfers, e.g. `service-${projectnumber}@gcp-sa-bigquerydatatransfer.iam.gserviceaccount.com`
  • this managed service SA does not naturally have permission to access the target dataset, so it needs to be authorized to act as either a user or a service account that has access (see the sketch after this list)
    • by default the GCP Console will choose user authorization. This means that users are prompted to allow BigQuery access to their account (which works only in the browser). The BigQuery transfer will then run as this user
    • some data sources support running as a service account; however, the Carbon Footprint export does not
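For illustration, this is how the managed service agent's email can be derived in terraform, and what authorizing it against a dedicated transfer SA would look like for data sources that do support service-account-based transfers (again, the Carbon Footprint export does not). The account name is made up and the exact IAM role is an assumption to verify against GCP's docs:

```hcl
# Sketch only: derive the BigQuery Data Transfer service agent's email
# for the current project.
data "google_project" "project" {}

locals {
  bq_dts_service_agent = "service-${data.google_project.project.number}@gcp-sa-bigquerydatatransfer.iam.gserviceaccount.com"
}

# For data sources that support SA-based transfers (not the carbon export),
# the service agent must be allowed to obtain tokens for the transfer SA.
resource "google_service_account" "transfer" {
  account_id = "carbon-transfer" # hypothetical name
}

resource "google_service_account_iam_member" "allow_dts_token_minting" {
  service_account_id = google_service_account.transfer.name
  role               = "roles/iam.serviceAccountShortTermTokenMinter" # assumption; verify in GCP docs
  member             = "serviceAccount:${local.bq_dts_service_agent}"
}
```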

With terraform we can control how the transfer config gets created:

  • by default, the resource `google_bigquery_data_transfer_config` uses the authenticated principal of the google provider
  • when the `service_account_name` attribute that `google_bigquery_data_transfer_config` offers is set, the transfer config uses the explicitly configured service account (sketched below)
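A sketch of the second variant, reusing the hypothetical transfer SA from the previous sketch. The scheduled-query data source stands in here for any source that supports SA-based transfers, which the carbon export does not:

```hcl
# Sketch: pin the transfer to an explicit service account instead of the
# provider's authenticated principal. This only works for data sources that
# support service_account_name; the Carbon Footprint export is not one of them.
resource "google_bigquery_data_transfer_config" "example" {
  display_name           = "Example scheduled query" # illustrative
  data_source_id         = "scheduled_query"
  destination_dataset_id = google_bigquery_dataset.carbon_footprint.dataset_id
  service_account_name   = google_service_account.transfer.email

  params = {
    query = "SELECT 1 AS x" # illustrative
  }
}
```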

Now, when we tested this module we always ran it as users that had already performed a manual setup of the BigQuery data transfer, i.e. who had already authorized the BigQuery service to access their account. In this case, provisioning from terraform succeeded. hashicorp/terraform-provider-google#4449 describes a similar case.

However, the module now has important failure modes that are bad for the first-time experience, as they add unpredictability:

  • as an operator of the module, I'm not made explicitly aware that the provisioned code will keep using my user account as part of the transfer config
  • provisioning the transfer config may fail with nondescript errors like

    ```text
    module.meshplatform.module.carbon_export[0].google_bigquery_data_transfer_config.carbon_footprint_transfer_config: Creating...
    │ Error: Error creating Config: googleapi: Error 400: Request contains an invalid argument.
    ```
  • we cannot deploy terraform-gcp-meshplatform when the google provider is configured to use a service account, which is common for customers deploying this module from CI/CD
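For illustration, that third failure mode shows up with a provider configuration like the following, which is typical for CI/CD pipelines (project and SA email are placeholders):

```hcl
# Sketch: when the google provider authenticates as (or impersonates) a
# service account, creating the carbon transfer config fails with the
# Error 400 shown above, because the carbon data source only supports
# user authorization.
provider "google" {
  project                     = "my-project"                                     # placeholder
  impersonate_service_account = "ci-deployer@my-project.iam.gserviceaccount.com" # placeholder
}
```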

I looked into the alternative of providing a separate google provider via configuration_aliases (sketched below), but ultimately that adds complexity for every consumer of the terraform-gcp-meshplatform, even if they're not using the carbon footprint export. (Sidenote: I could not figure out how to make an "optional provider", which could solve this problem.) Furthermore, there's no terraform support for setting up the GCP billing export either, so operators have to set that up manually anyway. I thus feel it's best to keep this consistent and require manual steps to set up both the billing and carbon footprint exports.
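For reference, a sketch of what the rejected configuration_aliases approach would have looked like; the alias name is made up:

```hcl
# Inside terraform-gcp-meshplatform: declare that callers must pass an extra,
# user-authenticated google provider for the carbon export.
terraform {
  required_providers {
    google = {
      source                = "hashicorp/google"
      configuration_aliases = [google.carbon_export_user] # hypothetical alias
    }
  }
}

# The transfer config would then be pinned to that provider:
resource "google_bigquery_data_transfer_config" "carbon_footprint_transfer_config" {
  provider = google.carbon_export_user
  # ... (arguments as in the sketch further up)
}
```

Every caller would have to declare and wire up `google.carbon_export_user` in their root module, even with the carbon export disabled, which is exactly the complexity mentioned above.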

JohannesRudolph added a commit that referenced this issue Sep 5, 2022
ghost pushed a commit that referenced this issue Feb 16, 2023