Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix SDK Docker build scripts, update Emscripten #172

Merged
merged 4 commits into from
Oct 10, 2024

Conversation

krinkinmu
Copy link
Contributor

The way the script is currently it does not work (anymore). I'm aware of two issues with the current setup:

  1. emsdk underneath installs and uses Node JS and the version of Node JS is not pinned by the script and isn't tied to the version of the SDK used.

    As a result, emsdk moved to a newer version of Node JS than the one that was probably used when the scripts were created. This new version (it seems, the version currently being used is 18.20.3) requires a relatively fresh version of glibc which just isn't available in Ubuntu Bionic used as a base image. That results in obscure errors from CMake (see docker build fails with "Could NOT find Threads (missing: Threads_FOUND)" #170)

  2. for whatever reason, the version of the protobuf static libraries currently in the repo, don't seem to work. I don't really know how the issue happened, but there were at least 2 users that faced the problem (see Error building building example WASM filter  #161).

To resolve the first issue, we have a bunch of options:

  1. Fix the version of emsdk
  2. Fix the version of Node JS
  3. Update the version of glibc

I figured I can combine options 1 and 3 together for the following reasons:

  1. Fixing the version of emsdk fixes the problem
  2. The problem arised from the fact that the versions of software used in the build script are a bit old, so an update might be in order, even though we have other solutions to the problem.

NOTE: It's my understanding that fixing emsdk version should also
pin Node JS version, so if we deploy option 1, option 2 seem
redundant.

For the second problem, I think there is only one ultimate solution

  • not store binary artifacts in the repository and instead build them from the sources in the repo.

It seems that in the past there was a concern that building protobuf libraries takes a long time - it's still certainly the case. However, I think we can compensate for that in two ways:

  1. Drop WAVM - it does not seem like WAVM is still needed (the project itself appears to be dead and hasn't had any updates for at least 2 years), but also one of the comments in Expand C++ SDK documentation #158, that removed the WAVM from the docs, also suggests that three doesn't seem to be a good reason to keep WAVM in the SDK build script.
  2. Take advantage of potential hardware threads in the system when calling make - most laptops or servers these days have multiple cores, so if we have to build protobuf libraries twice, we can speed it up by using more cores.

With all those changes made, the build time adds up to something like 32 minutes on my laptop, compared to the 48m without building protobuf libraries from sources (and without adding -j option to make). So I think the additional time spent on building protobuf library is compensated by other changes.

@martijneken
Copy link
Contributor

martijneken commented Oct 1, 2024

Thanks for the PR! Few thoughts from my end:

@krinkinmu
Copy link
Contributor Author

Thanks for the PR! Few thoughts from my end:

Thank you! I'm going to take a look at the Bazel toolchain update, I will post in the PR, once it's ready for review.

@krinkinmu
Copy link
Contributor Author

krinkinmu commented Oct 2, 2024

@martijneken I think I have a working change that updates Bazel build files to the newer version of emsdk, but I hit a bit of an issue.

I can see that currently Bazel configuration patches emsdk. Digging through the history, it seems like the patch was added to work around some build issues (see #149). Reading through the issue it seems like it manifested when building some proxy-wasm-cpp-host tests (#149 (comment)).

With that in mind, I tried to reproduce the issue, so that I can generate and verify the refreshed version of the patch. However, I don't seem to be able to reproduce the issue. It's a long shot, but by any chance do you remeber under what circumstances #149 manifested?

NOTE: I will still try to reproduce it myself, I'm just throwing this question in in case you might remember the context.

UPDATE: Didn't reproduce it yet, but I think the build failure happened when building from Envoy, so to reproduce the error we need Envoy with external dependency on proxy-wasm-cpp-host that does not contain this patch, working on reproducing the issue through Envoy now.

@krinkinmu
Copy link
Contributor Author

@martijneken I added emsdk update for Bazel build scripts as well now. PTAL.

One thing that I want to call out. I tried to reproduce the problem reported in #149 locally to test a new emsdk patch, however, I wasn't able to do that.

I don't think that the patch is needed since emscripten-core/emsdk#1045 which removes the npm modules as a dependency for the toolchain (among other things) and that should uktimately achieve the same effect. That being said, given that I couldn't reproduce the original issue, I wasn't able to explicitly confirm that.

The way the script is currently it does not work (anymore). I'm aware
of two issues with the current setup:

1. emsdk underneath installs and uses Node JS and the version of Node JS
   is not pinned by the script and isn't tied to the version of the SDK
   used.

   As a result, emsdk moved to a newer version of Node JS than the one
   that was probably used when the scripts were created. This new
   version (it seems, the version currently being used is 18.20.3)
   requires a relatively fresh version of glibc which just isn't
   available in Ubuntu Bionic used as a base image. That results in
   obscure errors from CMake (see
   proxy-wasm#170)
2. for whatever reason, the version of the protobuf static libraries
   currently in the repo, don't seem to work. I don't really know
   how the issue happened, but there were at least 2 users that faced
   the problem (see
   proxy-wasm#161).

To resolve the first issue, we have a bunch of options:

1. Fix the version of emsdk
2. Fix the version of Node JS
3. Update the version of glibc

I figured I can combine options 1 and 3 together for the following
reasons:

1. Fixing the version of emsdk fixes the problem
2. The problem arised from the fact that the versions of software
   used in the build script are a bit old, so an update might be
   in order, even though we have other solutions to the problem.

> NOTE: It's my understanding that fixing emsdk version should also
> pin Node JS version, so if we deploy option 1, option 2 seem
> redundant.

For the second problem, I think there is only one ultimate solution
- not store binary artifacts in the repository and instead build
them from the sources in the repo.

It seems that in the past there was a concern that building protobuf
libraries takes a long time - it's still certainly the case. However,
I think we can compensate for that in two ways:

1. Drop WAVM - it does not seem like WAVM is still needed (the project
   itself appears to be dead and hasn't had any updates for at least 2
   years), but also one of the comments in
   proxy-wasm#158, that
   removed the WAVM from the docs, also suggests that three doesn't
   seem to be a good reason to keep WAVM in the SDK build script.
2. Take advantage of potential hardware threads in the system when
   calling make - most laptops or servers these days have multiple
   cores, so if we have to build protobuf libraries twice, we can
   speed it up by using more cores.

With all those changes made, the build time adds up to something like
32 minutes on my laptop, compared to the 48m without building
protobuf libraries from sources (and without adding -j option to make).
So I think the additional time spent on building protobuf library is
compensated by other changes.

Signed-off-by: Mikhail Krinkin <[email protected]>
This CL mostly follows the instructions in
https://github.com/emscripten-core/emsdk/blob/main/bazel/README.md.
Even so, there are a few things that are worth mentioning:

1. I tested the changes locally and in my tests
   incompatible_enable_cc_toolchain_resolution bazel flag made no
   difference (e.g. everything builds with or without the flag);
   I still keep it though because instructions say that it should
   be there and it does not cause any harm.
2. I don't think that we still need to patch emsdk to exclude npm
   modules from the toolchain, because it appears that upstream
   already removed those as a toolchain dependency in
   emscripten-core/emsdk#1045.

It's worth noting, that even though I don't think that emsdk patch
is still needed, I actually wasn't able to reproduce the problem
reported in proxy-wasm#149
locally without the emsdk.patch even with the current version of
emsdk used by C++ SDK.

Signed-off-by: Mikhail Krinkin <[email protected]>
@martijneken
Copy link
Contributor

martijneken commented Oct 9, 2024

SDK changes LGTM. I am running some integration tests:

any chance do you remeber under what circumstances #149 manifested?

Don't recall; it might have been gcc specific.

@krinkinmu
Copy link
Contributor Author

@martijneken if you note somewhere what tests you're running manually, I will make sure to run them myself going forward (and include the results) and maybe automate them as well.

@martijneken
Copy link
Contributor

martijneken commented Oct 9, 2024

what tests you're running manually

More or less trying to replicate the CI builds and tests in 1 or 2 local configurations:

  • for CppHost I did: bazelisk test --config=clang --define engine=null //test/... and engine=v8 for real wasm
  • for Envoy I did: bazelisk build --config=libc++ -c opt //source/exe:envoy-static and bazelisk test --config=libc++ -c opt //test/extensions/...

Envoy and v8 builds take a long time, still running. Will send the Envoy PR if/when this succeeds.

@martijneken
Copy link
Contributor

martijneken commented Oct 10, 2024

All looks good locally. CI tests:

Looks like you've done it! I'm doing some final integration testing with Abseil and RE2. Stay tuned.

@martijneken martijneken changed the title Refresh and fix SDK Docker image build scripts Fix SDK Docker build scripts, update Emscripten Oct 10, 2024
Copy link
Contributor

@martijneken martijneken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update: https://github.com/proxy-wasm/proxy-wasm-cpp-sdk/blob/main/docs/building.md

  • update the emsdk version
  • remove the libprotobuf.a section

I will send a follow-up PR to add full support for Abseil and RE2.

martijneken added a commit to martijneken/proxy-wasm-cpp-sdk that referenced this pull request Oct 10, 2024
Depends on Emscripten update in proxy-wasm#172.

Signed-off-by: Martijn Stevenson <[email protected]>
@martijneken
Copy link
Contributor

LGTM after docs fix.

I have a follow-up PR to get full Abseil/RE2 support \o/ #173

This is a followup to the previous commits that refreshed the build
scripts and emsdk. This change reconciles the docs with the current
state of the build scripts.

Signed-off-by: Mikhail Krinkin <[email protected]>
@krinkinmu
Copy link
Contributor Author

Can you update: https://github.com/proxy-wasm/proxy-wasm-cpp-sdk/blob/main/docs/building.md

  • update the emsdk version
  • remove the libprotobuf.a section

I will send a follow-up PR to add full support for Abseil and RE2.

I updated the docs with the new emsdk version. I didn't completely remove protobuf section though. I might be missing something, but the way I understand the docs it tries to explain how to prepare all the dependencies without using Docker container or relying on Bazel. So I updated the wording in the docs, but kept the section about protobuf libraries in it.

Copy link
Contributor

@mpwarres mpwarres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great--thanks for the contribution!

@martijneken
Copy link
Contributor

@krinkinmu I don't recall how the repo permissions are set. If you need us to merge this PR, give a shout when ready.

It's a followup for the previous changes that updated emsdk version
and refreshed the build scripts. Testing confirms that patches for
the protobuf library sources are not necessary anymore, so I'm
updating this Makefile to remove the patching logic that was already
removed from all other places.

Signed-off-by: Mikhail Krinkin <[email protected]>
@krinkinmu
Copy link
Contributor Author

@martijneken it seems that I cannot merge it myself. Changing protection rules is welcome (to allow merging approved PRs), but in the meantime would you mind pushing a button for me?

@martijneken
Copy link
Contributor

ACK, I will look at the repo rules. Thanks for your PR, this is awesome!

@martijneken martijneken merged commit 6b3dc93 into proxy-wasm:main Oct 10, 2024
5 checks passed
martijneken added a commit to martijneken/proxy-wasm-cpp-sdk that referenced this pull request Oct 10, 2024
Depends on Emscripten update in proxy-wasm#172.

Signed-off-by: Martijn Stevenson <[email protected]>
@martijneken
Copy link
Contributor

Something we forgot about in review: there have been some changes to Emscripten memory management:

  • default stack size is now 64 KiB (STACK_SIZE)
  • default heap size is now 16 MiB (INITIAL_HEAP) and does not grow by default (ALLOW_MEMORY_GROWTH)

We will send a follow-up PR to tweak these values a bit for server environments.

@mpwarres
Copy link
Contributor

We will send a follow-up PR to tweak these values a bit for server environments.

Sent #174, PTAL

@krinkinmu
Copy link
Contributor Author

@martijneken sorry my bad, I didn't check it - I can send a follow up PR shortly.

I've attempted to trace what the previous values were and here is what I found so far:

  1. STACK_SIZE - 5MiB (changed in emscripten-core/emscripten@157fcd4)
  2. INITIAL_HEAP - this seem to be a relatively new option (introduced in emscripten-core/emscripten@22a1c04) and before that INITIAL_MEMORY was used and was 16MiB for a long time
  3. ALLOW_MEMORY_GROWTH seem to have been disabled since the time it was introduced.

It looks like STACK_SIZE is the only thing that changed, unless I'm missing something.
It does seem like allowing memory to grow is a more sensible default for the servers.

As for INITIAL_HEAP, I'm not certain what value it should be and how harsh the penalty for setting it too low and growing the heap later. Would it make sense to leave it as-is for now, assuming my understanding that it didn't change is correct?

It looks like these options are not currently configurable from proxy_wasm_cc_binary rule. I can expose those as configuration options along the way?

@PiotrSikora
Copy link
Member

Regarding removed patches for node_modules (see: envoyproxy/envoy#21487), the reason for them was two-fold:

  1. Downloading them completely broke Envoy CI at the time due to flaky origin for those modules.
  2. Emscripten SDK tarball already contains those modules, so reinstalling them is completely unnecessarily.

Separately, you should either cc or add @kyessenov to CODEOWNERS as a gatekeepr (assuming it's fine with him), since he knows a lot of history behind those changes, and because Istio is probably still the biggest consumer of this SDK.

@krinkinmu krinkinmu mentioned this pull request Oct 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants