
Containerize test generation and run #257

Open
mradbourne opened this issue Jul 25, 2024 · 6 comments

mradbourne (Contributor) commented Jul 25, 2024

Challenges

  • Setup scripts target Ubuntu, which creates setup overhead for local development on macOS and Windows.
  • Conformance testing requires configuring multiple platforms. The lack of isolation around this adds overhead for local development, especially if the developer uses the same tooling for multiple projects.
  • The dependency specification is not centralized. For example, dependency installation commands are embedded in the test generation scripts, which makes it harder to get a clear overview of the dependencies.

Proposed solution

Use Docker to containerize the test generation and run environment for consistency between local development and the GitHub Actions pipeline.
Specify all dependency installation and setup in the Docker image definition.
Provide a Python CLI as the main entry point for the project, abstracting away the Docker complexity.

Containerization can be introduced incrementally:

  1. Provide a Dockerfile that matches the environment specified in .github/workflows/e2e.yaml and setup.sh.
  2. Create a simple Python script to easily run existing test scripts in a Docker container. For example:
$ python conformance.py run genData100.sh
  3. Move all existing installation commands into the Dockerfile.
  4. Allow run configuration files to be passed to the new Python CLI (i.e. run_config.json support).
  5. Use the same Docker image and Python CLI in the GitHub Actions pipeline.
  6. Move all existing test scripts into the Python CLI and add params or flags for executors and ICU versions. For example:
$ python conformance.py test --icu_versions=icu74 --test_types=rust
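The thin wrapper in step 2 could be sketched roughly like this. The image tag (`conformance-env`), the mount point (`/app`), and the CLI shape are assumptions for illustration, not names from the repository:

```python
# Hypothetical sketch of a thin conformance.py wrapper around `docker run`.
# The image tag (conformance-env) and mount point (/app) are placeholders.
import subprocess
import sys
from pathlib import Path

IMAGE = "conformance-env"  # assumed tag for an image built from the proposed Dockerfile

def container_command(script: str) -> list[str]:
    """Build the `docker run` invocation that executes one test script."""
    repo_root = Path.cwd()
    return [
        "docker", "run", "--rm",
        "-v", f"{repo_root}:/app",  # mount the checkout into the container
        "-w", "/app",
        IMAGE,
        "bash", script,             # e.g. genData100.sh
    ]

def main() -> int:
    if len(sys.argv) == 3 and sys.argv[1] == "run":
        return subprocess.call(container_command(sys.argv[2]))
    print("usage: python conformance.py run <script.sh>", file=sys.stderr)
    return 2

if __name__ == "__main__":
    main()
```

Later steps would replace the pass-through shell script argument with proper subcommands and flags, but the `docker run` plumbing stays the same.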

Impact

This solution would introduce Docker as a dependency for the project.
Consolidating the dependency list and setup into the Dockerfile would simplify the project, but it would also make Docker an integral part of the setup. There is room in this proposal to keep an uncontainerized version of the setup, at the expense of some added complexity.

mradbourne (Contributor, Author):

If this seems like a good approach, I would be happy to work on implementing the solution.
All thoughts and feedback welcome.
cc @echeran

echeran (Collaborator) commented Jul 25, 2024

That sounds great to me! I like the listing of the separate incremental steps to make this change (easier to review & adapt to). Overall, this ought to make the project easier to use and maintain.
FYI @sffc @sven-oly

sffc (Member) commented Jul 26, 2024

Things that are very architecture-specific, such as the C++ executor, would be good to containerize. Maybe some of the other executors like Rust or Dart, especially if they depend on very specific compiler setups.

However, I don't think it's a good idea to shove the entire runtime into a container (I'm not sure exactly the scope of what you were thinking). I do still think it is good to have our Python scripts (or the thin .sh files around them) be the entrypoints. But, it makes sense if Python invokes a container as an executor and treats it like a native executor (interacting with stdin/stdout).
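That interaction could look something like the sketch below, where a containerized executor is launched exactly the same way as a native one and spoken to over stdin/stdout. The executor paths and image names in the comments are illustrative placeholders:

```python
# Sketch: treat a containerized executor like a native one by communicating
# over stdin/stdout. Command lines in the comments are placeholders.
import subprocess

def run_executor(command: list[str], test_input: str) -> str:
    """Send one line of test input to an executor process and return its reply."""
    result = subprocess.run(
        command,
        input=test_input + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()

# A native executor might be invoked as:
#   run_executor(["./executors/rust/target/release/executor"], line)
# while a containerized one only changes the command, not the protocol:
#   run_executor(["docker", "run", "-i", "--rm", "cpp-executor"], line)
```

The `-i` flag keeps the container's stdin open, which is what lets Python drive it like any other subprocess.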

mradbourne changed the title from "Add containerization" to "Containerize test generation and run" on Jul 31, 2024
mradbourne (Contributor, Author):

Thanks for the feedback @sffc. Yes, I'm proposing containerizing the Python scripts etc. for test generation and running, not just the executors. I was planning for the executors to run in the same container as the scripts (at least initially), similar to how GitHub Actions runs them currently. Containerizing individual executors sounds interesting, though, and could be decided separately. My proposal just takes the existing test generation and run setup (including its single-environment execution) and makes it more transparent and easily reproducible - on a developer's machine, for example.

I do still think it is good to have our Python scripts (or the thin .sh files around them) be the entrypoints.

Yes, totally agree. I think that a simple entry point to test generation and running would make sense, especially for those without Docker experience. I also think a way to opt out of containerization would be a good idea for developers who have all dependencies locally and don't use Docker, e.g.:

$ python conformance.py test --no-container
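An opt-out like that might be wired up roughly as follows. The flag name, image tag, and the local entry point (`run_tests.py`) are hypothetical placeholders:

```python
# Sketch of a --no-container opt-out: the CLI builds either a local command
# or the same command wrapped in `docker run`. Names here are placeholders.
import argparse

def build_invocation(no_container: bool, extra_args: list[str]) -> list[str]:
    """Return the command to run: local Python scripts, or the same thing in Docker."""
    local = ["python", "run_tests.py", *extra_args]
    if no_container:
        return local
    return ["docker", "run", "--rm", "conformance-env", *local]

def parse(argv: list[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="conformance.py")
    parser.add_argument("command", choices=["test"])
    parser.add_argument("--no-container", action="store_true")
    # Unknown flags (e.g. --icu_versions) would be forwarded to the scripts.
    return parser.parse_known_args(argv)[0]
```

Because both paths run the same underlying scripts, the containerized and local flows stay behaviorally identical.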

This is the basic scope of what I was thinking:
[image: diagram of the proposed container scope]

(Your suggestion of containerizing the individual executors would mean that the 'Executor deps' could be moved out of this container.)

Does this address any of your concerns about putting the run into the container?

sffc (Member) commented Aug 26, 2024

@echeran, @sven-oly and I discussed this.

My main concern is that I want to avoid a container in the code-build-debug loop for writing and testing executors. However, I acknowledge that the current scripts we have are very Ubuntu-specific, which isn't great. It would be very nice to remove this barrier to contributors to the project.

So what I think we should do is:

  1. Make a Setup Containerfile that installs the dependencies and sets up the environment. Replace the Linux-specific installation commands with the Containerfile and point docs to say that you can follow those steps to install the tools locally.
  2. Try to make the Python scripts and executors work in a cross-platform way. In Python and most executors, this should be easy. Might not be feasible in C++ and that's okay for now. We could eventually move toward docker-in-docker for this (Python spawning a container when invoked directly) but that doesn't need to be done right away.
  3. Make a Runner Containerfile that extends the Setup Containerfile but also interfaces with the Python scripts. Make CI use this same Containerfile.
  4. Build a Python CLI that either invokes the Containerfile or runs the Python scripts locally (outside of the container).

This is mostly consistent with what @mradbourne proposed in the OP except:

  • Let's keep the Python scripts runnable locally and support it as a first-class feature if you're in an environment that supports them.
  • Let's get rid of the Python Linux setup script. Pointing to a Containerfile is cleaner.

We're also happy to schedule a meeting to discuss this further.

mradbourne (Contributor, Author):

That's all clear. Thanks for discussing.
I hear your concern around some container-build step slowing down executor development - let's avoid that 👍
I can raise an initial PR to capture the items listed above and act as a prompt for the more detailed discussion.

sffc added this to the Priority ⟨P2⟩ milestone on Nov 11, 2024