GLIBC_2.27 not found #78

Open
samhh opened this issue Feb 7, 2021 · 35 comments

Comments

@samhh

samhh commented Feb 7, 2021

Have I misinterpreted the instructions regarding viability of local invocation or is this a bug?:

$ sam local invoke
<home>/.local/lib/python3.9/site-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.3) or chardet (3.0.4) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
Invoking NOT_USED (provided)
Skip pulling image and use local one: amazon/aws-sam-cli-emulation-image-provided:rapid-1.17.0.

Mounting <project>/.stack-work/docker/_home/.local/bin as /var/task:ro,delegated inside runtime container
START RequestId: 5495b4de-5771-4cf8-b15c-47aa02989eb9 Version: $LATEST
/var/task/bootstrap: /lib64/libm.so.6: version `GLIBC_2.27' not found (required by /var/task/bootstrap)
time="2021-02-07T17:22:47.318" level=error msg="Init failed" InvokeID= error="Runtime exited with error: exit status 1"
time="2021-02-07T17:22:47.318" level=error msg="INIT DONE failed: Runtime.ExitError"

This is on Arch with everything up to date. The build appears to succeed and stack run gives no such error (just missing AWS environment as you'd expect).

@IamfromSpace
Collaborator

You should be able to run locally, just like you did with “sam local invoke.”

My shot in the dark guess is that the build environment has a later version of GLIBC than sam local’s emulator image based on this SO answer:

In your case, you built on a newer (GLIBC-2.28) system, and are trying to run on an older one (GLIBC-2.27). That is not guaranteed to work (although it might for sufficiently simple programs).

Can you tell me more about how you built the bootstrap executable? Which stack LTS (or otherwise) are you using? Are you building inside a docker image? If so which one?

Hopefully we can track it down quickly.

@samhh
Author

samhh commented Feb 7, 2021

Here's my stack.yaml:

resolver: lts-16.31

extra-deps:
  - envy-1.5.1.0
  - hal-0.4.6
  - megaparsec-9.0.1

docker:
  enable: true

Built with stack build --copy-bins. Doing that, I can see .stack-work/docker/_home/.local/bin/bootstrap, which my template points to.

I'm not interacting with Docker at all beyond the configuration flag to Stack above.

If it helps, some versions on my host machine:

  • SAM: 1.17.0
  • Stack: 2.5.1
  • glibc: 2.33

@IamfromSpace
Collaborator

All that makes sense. In that case the build image should be fpco/stack-build:lts-16.31, since I think Stack defaults to that unless you explicitly specify one (I think it prints to the console when building, so you can sanity-check it if you’d like).

I think the next question is what version of glibc is on that image. The host machine shouldn’t matter based on the config you’ve shown.

Can you try:

docker run fpco/stack-build:lts-16.31 ldd --version

I can give it a go myself later, but I’m restricted to my phone for the moment.

@samhh
Author

samhh commented Feb 7, 2021

$ docker run fpco/stack-build:lts-16.31 ldd --version
ldd (Ubuntu GLIBC 2.27-3ubuntu1.2) 2.27
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

@samhh
Author

samhh commented Feb 7, 2021

As a sanity check I've deleted dist-newstyle and .stack-work and rerun, no change.

Is this part of the sam error relevant?

Skip pulling image and use local one: amazon/aws-sam-cli-emulation-image-provided:rapid-1.17.0

@samhh
Author

samhh commented Feb 7, 2021

stack build --verbose printout includes the string "fpco/stack-build:lts-16.31" so that's probably being used as expected.

@IamfromSpace
Collaborator

Is this part of the sam error relevant?

Ah, yes, so that’s the image that the lambda is then running in, which sam local is selecting automatically. We should then run the same command for that image and see if it’s a mismatch—specifically if it’s a lower version.
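
A minimal sketch of that comparison, assuming the version is the last field on the first line of `ldd --version` output. The function names are made up for illustration. Note that this only compares the versions each image reports; a binary can still run on an older glibc if it happens to use only older symbols.

```shell
# Extract the glibc version from the first line of `ldd --version` (read on stdin).
glibc_version_from_ldd() {
  awk 'NR == 1 { print $NF }'
}

# Succeed (exit 0) when version $1 is strictly older than version $2.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" |
         sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)" = "$1" ]
}

# $1 = glibc reported by the build image, $2 = glibc reported by the runtime image.
check_glibc_pair() {
  if version_lt "$2" "$1"; then
    echo "mismatch: runtime glibc $2 is older than build glibc $1"
  else
    echo "ok: runtime glibc $2 is at least build glibc $1"
  fi
}
```

To use it, pipe `docker run <image> ldd --version` through `glibc_version_from_ldd` once for the build image and once for the runtime image, then pass both results to `check_glibc_pair`.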

@samhh
Author

samhh commented Feb 7, 2021

$ docker run amazon/aws-sam-cli-emulation-image-provided:rapid-1.17.0 ldd --version
ldd (GNU libc) 2.17
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

@samhh
Author

samhh commented Feb 7, 2021

Tried this just in case:

$ docker pull amazon/aws-sam-cli-emulation-image-provided
Using default tag: latest
latest: Pulling from amazon/aws-sam-cli-emulation-image-provided
Digest: sha256:8784f36ae6f73a78479711556b4dac67c3c956df3707a46d45483a2d701ef4d2
Status: Image is up to date for amazon/aws-sam-cli-emulation-image-provided:latest
docker.io/amazon/aws-sam-cli-emulation-image-provided:latest

@IamfromSpace
Collaborator

Ah, yeah, that’s very likely our culprit then. Thank you for checking all this btw!

For next steps, we’ll need to see about setting up an image that’s a better match. First, to validate the issue, and second to get some guidance and docs on how to resolve it. That’s unfortunate though, definitely an unfun wrinkle.

I think the best bet is to use the amazon image as a base and then add stack. Trying to downgrade the fpco image probably just leaves the door open for the next mismatch.

I’ve been steadily working to get hal more broadly compatible and get it into current stack LTSes, so definitely a problem I’ll be prioritizing. If you want to continue working on it and tag team a branch, the help would of course be welcome!

@IamfromSpace
Collaborator

Also worth noting that if you just need a quick fix and you can use an older stack LTS, you might try downgrading. I haven’t seen this issue before with older LTS images.

Not a recommendation I’m happy with, haha, but if it unblocks you, it might be worth it until there’s a cleaner solve.

@IamfromSpace
Collaborator

Much downloading later I've (unsurprisingly) replicated this issue. I was able to create a Dockerfile that seems to build with the same version of glibc. This seems to get beyond that error and execute the binary. I'm getting some new error in decoding Context, but I'm not totally sure yet why.

You can try it out on this branch #fix/sam-local-glibc-version-error now.

  1. build the image via docker build -t "${NAME}:${TAG}" . from the root
  2. Add image: ${NAME}:${TAG} to your docker settings in stack.yaml
  3. stack build
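
The three steps above could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, the image name and tag are whatever you choose, and step 2 (merging the printed section into your existing stack.yaml) is left manual because the file already contains other settings.

```shell
# Step 2 helper: print the docker section that points Stack at a custom image.
# $1 = image name, $2 = image tag (both chosen by you).
stack_docker_section() {
  printf 'docker:\n  enable: true\n  image: "%s:%s"\n' "$1" "$2"
}

# Steps 1 and 3: build the image from the repository root, then build the project.
# Run this only after stack.yaml has been updated with the section above.
build_with_custom_image() {
  docker build -t "$1:$2" . &&
    stack build
}
```

For example, `stack_docker_section my-build-image my-tag` prints a section to paste into stack.yaml, after which `build_with_custom_image my-build-image my-tag` runs the two build commands.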

Still needs some digging and then quite a bit of polish, but some progress.

@samhh
Author

samhh commented Feb 8, 2021

Really appreciate your support on this mate, cheers.

Yeah, I'd be perfectly content to downgrade my Stack LTS for now, do you have a known good version I can target?

@IamfromSpace
Collaborator

I’m beginning to think this is more an issue with sam/its version rather than the stack image.

My Linux host has glibc 2.23 and runs just fine on actual lambda when compiled without docker.

On MacOS, it seems to work without issue with lts-13.22, where the stack build also uses glibc 2.23. I seem to be using an older lambda image, back from lambci/lambda:provided.

Since the current Amazon images only use 2.17, it doesn’t seem like a downgrade does the trick. I’ll dig more into sam though and see what I find. Definitely a priority to support its newest versions.

@IamfromSpace
Collaborator

Hmm, so unfortunately there seems to be a string of issues. I don't think sam has been supported since 1.0, though I'm not sure of the exact timeline. Even if the glibc issue were resolved (which looks a bit more complicated, due to the docker images being built somewhat dynamically), sam currently doesn't pass in all the seemingly required values to build the context.

Notably, the officially supported Rust runtime seems to have the same issues and not support sam after 1.0, so there seems to be a bit of the right hand not knowing what the left is doing. Other users have a similar glibc issue #awslabs/aws-lambda-rust-runtime#17. And both this and the Rust runtime expect that the traceId, functionArn, and deadline are passed in as headers, but sam does not send them. I'm not sure they've even realized that shortcoming yet. For both, it would be a breaking change or special branching when running locally.

Notably, Rust's docs don't mention sam, and steer users instead to directly use the image that sam<1.0 used to use. I think for the time being, this probably needs to recommend the same while chipping away at some of the above issues. Here's their README section on docker usage: https://github.com/awslabs/aws-lambda-rust-runtime#docker

You should be able to use the following as a local invocation:

echo $MY_EVENT_JSON | docker run -i \
    -e DOCKER_LAMBDA_USE_STDIN=1 \
    --rm \
    -v ${PATH_TO_UNZIPPED_BOOTSTRAP_DIR}:/var/task \
    lambci/lambda:provided

@samhh
Author

samhh commented Feb 8, 2021

Running that command gives me:

/var/task/bootstrap: /lib64/libm.so.6: version `GLIBC_2.27' not found (required by /var/task/bootstrap)

The container seems to hang at this point. It did on the first run pull the image anew (sha256:cb4cf37c22d7ae7017193db7fed18dcb9418ddff3af14cd494a1c30637c69875).

@IamfromSpace
Collaborator

Huh, I'm a bit stumped. Can you try building the binary with LTS 13.22?

I checked the glibc version in lambci/lambda:provided and I'm getting 2.17. But that doesn't make sense to me because I know none of the images I've used to build go that low. Also, that image is supposed to be an exact mirror, and I'm absolutely running real lambdas built on a host machine with 2.23. Maybe a binary built against 2.23 works despite the version that ldd reports?

(I'm downloading the LTS 13.22 image now, but it's quite large, so I'll also post the result of building with it if my DL completes first).

@samhh
Author

samhh commented Feb 9, 2021

Success! LTS 13.22 successfully runs locally with the above docker run command. 🎊

Curiously sam local invoke now fails with:

{"stackTrace":[],"errorType":"User","errorMessage":"Runtime Error: Unable to decode Context from event response."}

(Midnight where I am, catch you tomorrow 🙂)

@IamfromSpace
Collaborator

IamfromSpace commented Feb 9, 2021

Awesome! I had just arrived at the same result myself 🥳

I feel a bit silly, I did know that ldd --version was just a proxy, but still got burned.
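
A more direct check than `ldd --version` is to look at which glibc symbol versions the binary itself requires. The sketch below is illustrative only: it assumes binutils' `objdump` is installed, and the function name is made up. It reads `objdump -T` output on stdin and prints the highest `GLIBC_x.y` version referenced.

```shell
# Print the highest GLIBC symbol version mentioned in `objdump -T` output (stdin).
# Prints nothing if no versioned glibc symbols are found.
max_glibc_required() {
  grep -o 'GLIBC_[0-9][0-9.]*' |
    sed 's/^GLIBC_//' |
    sort -u -t. -k1,1n -k2,2n -k3,3n |
    tail -n 1
}
```

Usage would look like `objdump -T path/to/bootstrap | max_glibc_required`. If the result is newer than the glibc in the runtime image, the loader will refuse to start the binary with a "version not found" error like the one at the top of this issue.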

Interestingly enough, the error you got locally (the GLIBC version error) also likely is meaningful, in that I'd expect you'd see the same error if you tried to run it on Lambda.

The issue you see with sam local makes sense from some of the other avenues of investigation. Current sam doesn't supply expected headers to the runtime (but the lambci/lambda image does), so the runtime client fails to construct the Context.
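
To see which of those headers an environment actually supplies, something like the following could help. It is a sketch under assumptions: the header names are taken from the AWS Lambda Runtime API (they do not appear in this thread), the function name is hypothetical, and the header dump is expected on stdin however you capture it.

```shell
# Read raw HTTP response headers on stdin and print (lowercased) each
# Context-related header that is absent: trace ID, function ARN, deadline.
missing_context_headers() {
  # Normalise: strip carriage returns and lowercase everything for matching.
  headers=$(tr -d '\r' | tr '[:upper:]' '[:lower:]')
  for name in lambda-runtime-trace-id \
              lambda-runtime-invoked-function-arn \
              lambda-runtime-deadline-ms; do
    printf '%s\n' "$headers" | grep -q "^$name:" || printf '%s\n' "$name"
  done
}
```

An empty result means all three headers were present; any names printed are the ones the runtime would be missing when it tries to build the Context.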

Summary:

  • Newer stack LTS docker images aren't compatible with lambda (probably entirely, TBD). Need documentation on how to build a compatible binary with docker when using a newer LTS
  • sam local is currently unsupported; documentation needs to be updated to reflect that and point to a mechanism for local invocation (via docker run)
  • Need a path to support for sam local:
    1. sam local provides these values
    2. the runtime fills in these values if it notices you're using sam local 😢
    3. a breaking change makes these fields a Maybe so the values are optional 😢
  • Better error message on failed Context construction wouldn't hurt

Glad we've got you running locally for the moment at least! Thanks much for the patience and troubleshooting with me :)

@IamfromSpace
Collaborator

Alright, got one README update about gotchas done, and #81 adds a Dockerfile and some docs to use for building out the latest LTSes. I'd be curious to get your review and see if that image works for you!

@samhh
Author

samhh commented Feb 16, 2021

I believe there's an issue with the documentation update where docker -t <etc> should be docker build -t <etc>.

Aside from that, the good news: I was able to successfully build and invoke with 8.6.5 (LTS 13.22).

The bad news: 8.10.4 (LTS 17.4) gives me the following build error:

cryptonite       >
cryptonite       > /tmp/stack-1f53bb49ebf9de41/cryptonite-0.27/In file included from cbits/aes/x86ni.h:38:0: error:
cryptonite       >     0,
cryptonite       >                      from cbits/aes/gf.c:35:
cryptonite       >
cryptonite       > /tmp/stack-1f53bb49ebf9de41/cryptonite-0.27//usr/lib/gcc/x86_64-amazon-linux/4.8.5/include/wmmintrin.h:34:3: error:
cryptonite       >      error: #error "AES/PCLMUL instructions not enabled"
cryptonite       >      # error "AES/PCLMUL instructions not enabled"
cryptonite       >        ^
cryptonite       >    |
cryptonite       > 34 | # error "AES/PCLMUL instructions not enabled"
cryptonite       >    |   ^
cryptonite       >
cryptonite       > /tmp/stack-1f53bb49ebf9de41/cryptonite-0.27/In file included from cbits/aes/x86ni.h:39:0: error:
cryptonite       >     0,
cryptonite       >                      from cbits/aes/gf.c:35:
cryptonite       >
cryptonite       > /tmp/stack-1f53bb49ebf9de41/cryptonite-0.27//usr/lib/gcc/x86_64-amazon-linux/4.8.5/include/tmmintrin.h:31:3: error:
cryptonite       >      error: #error "SSSE3 instruction set not enabled"
cryptonite       >      # error "SSSE3 instruction set not enabled"
cryptonite       >        ^
cryptonite       >    |
cryptonite       > 31 | # error "SSSE3 instruction set not enabled"
cryptonite       >    |   ^
cryptonite       > `gcc' failed in phase `C Compiler'. (Exit code: 1)cryptonite       >
cryptonite       > /tmp/stack-1f53bb49ebf9de41/cryptonite-0.27/In file included from cbits/aes/x86ni.h:38:0: error:
cryptonite       >     0,
cryptonite       >                      from cbits/aes/gf.c:35:
cryptonite       >
cryptonite       > /tmp/stack-1f53bb49ebf9de41/cryptonite-0.27//usr/lib/gcc/x86_64-amazon-linux/4.8.5/include/wmmintrin.h:34:3: error:
cryptonite       >      error: #error "AES/PCLMUL instructions not enabled"
cryptonite       >      # error "AES/PCLMUL instructions not enabled"
cryptonite       >        ^
cryptonite       >    |
cryptonite       > 34 | # error "AES/PCLMUL instructions not enabled"
cryptonite       >    |   ^
cryptonite       >
cryptonite       > /tmp/stack-1f53bb49ebf9de41/cryptonite-0.27/In file included from cbits/aes/x86ni.h:39:0: error:
cryptonite       >     0,
cryptonite       >                      from cbits/aes/gf.c:35:
cryptonite       >
cryptonite       > /tmp/stack-1f53bb49ebf9de41/cryptonite-0.27//usr/lib/gcc/x86_64-amazon-linux/4.8.5/include/tmmintrin.h:31:3: error:
cryptonite       >      error: #error "SSSE3 instruction set not enabled"
cryptonite       >      # error "SSSE3 instruction set not enabled"
cryptonite       >        ^
cryptonite       >    |
cryptonite       > 31 | # error "SSSE3 instruction set not enabled"
cryptonite       >    |   ^
cryptonite       > `gcc' failed in phase `C Compiler'. (Exit code: 1)

[...]

--  While building package cryptonite-0.27 (scroll up to its section to see the error) using:
      $HOME/.stack/setup-exe-cache/x86_64-linux-dkbd094c7ae28d0fb17a939b9c1227ee27/Cabal-simple_mPHDZzAJ_3.2.1.0_ghc-8.10.4 --builddir=.stack-work/dist/x86_64-linux-dkbd094c7ae28d0fb17a939b9c1227ee27/Cabal-3.2.1.0 build --ghc-options " -fdiagnostics-color=always"
    Process exited with code: ExitFailure 1

@IamfromSpace
Collaborator

Ah, good catch on the missing build argument. I've added a comment to make sure that gets fixed.

I did see the cryptonite error with 17.2 as well. And there's some guidance around this in cryptonite's README (and it's popped up in a bunch of their issues #haskell-crypto/cryptonite#324, #haskell-crypto/cryptonite#326, #haskell-crypto/cryptonite#332) so it appears like it's had a fairly broad effect.

Seems like the docker image either needs gcc >4.9 (which at least is a build-only dependency), or we need to pass the flag to cabal. It seems like stack supports this: https://docs.haskellstack.org/en/stable/nonstandard_project_init/#passing-flags-to-cabal

Let me know if this helps uncover anything; I'll do some more investigation shortly too.

@IamfromSpace
Collaborator

I was able to get 17.4 to work by adding the cryptonite flags to my stack.yaml:

flags:
  cryptonite:
    use_target_attributes: false

However, this isn't ideal, because it's yet another thing for a user to have to do themselves.

I'm also planning to look into getting the Dockerfile to use gcc 4.9.x (vs 4.8.5 that it gets via yum), which seems like it would be less work overall.


Then while writing this I realized that cryptonite really shouldn't be needed by hal at all, since all runtime interactions use plain HTTP (not HTTPS) because the connection is totally local. It appears that cryptonite comes in through http-conduit, and I think at this point only silly convenience methods are left over from the original work to dislodge the manager with the default timeout.

So the easiest answer yet may be to completely remove cryptonite as a dependency. However, this won't help anyone who actually does need it. Hmm.

@samhh
Author

samhh commented Feb 17, 2021

Can confirm that the cryptonite flag succeeds as a workaround on my end, successfully built and invoked on 17.4! 😃

@samhh samhh closed this as completed Feb 17, 2021
@samhh samhh reopened this Feb 17, 2021
@samhh
Author

samhh commented Feb 17, 2021

(Didn't mean to close 😅)

@samhh
Author

samhh commented Feb 17, 2021

Would you have any interest in pushing that Dockerfile as a built image to Docker Hub? I've realised I need to use it in CI as well, so I could push it myself but I haven't actually needed to make any changes to it, and others may also find it useful.

@IamfromSpace
Collaborator

Just opened #83 which should also help a bit (and is just generally a good idea).

It's not really going to be possible to predict how to make any and every possible package build in a docker container, so I'm not sure how tractable it is to try and keep up with it in this project. Though, something like cryptonite will be broadly used, so it still may make sense. It also may just make sense to add a section in the README with tricks like adding flags for common packages with build trouble.

I'm not sure I'm ready to go much further than simply providing an example Dockerfile at this point. I think that's clearly pretty important at this point, so people are steered towards success, but I'm not sure I want to commit to maintaining such an effort until I have a better idea of what kind of commitment it would require to do well.

@MarcCoquand

Hi,

I'm trying to apply the use_target_attributes: false using lts-16.12 (latest fpco/stack docker image) and it's giving me the error

- Package 'cryptonite' does not define the following flags (specified in stack.yaml):

Any hints on how to solve this 😓 ?

@IamfromSpace
Collaborator

Hi @MarcCoquand in Stack LTS 16.12 cryptonite is version 0.26, which shouldn’t need the flag to build correctly (and apparently can’t be included at all).

Do you see an error if you omit the flag?

@MarcCoquand

MarcCoquand commented Mar 15, 2021

Ah no, I scrolled through this thread too quickly without reading and thought this was a fix for glibc issue. I am having the error with GLIBC_2.27 not working so I downgraded to LTS-13.22 and then it worked again. Sadly, this broke haskell-language-server so I was looking for ways to change to the latest version again.

I guess in order to make HAL work with the latest version of GHC, I'll need to publish a dockerfile with a distribution that comes with glibc2.27 that I can then point stack to?

@samhh
Author

samhh commented Mar 15, 2021

As an aside, does HLS work for you with a Dockerised Stack setup? I was under the impression there wasn't support for that (yet).

@IamfromSpace
Collaborator

Understood! Lots of troubleshooting in this thread, so the conclusion is sort of hard to draw. The README/etc. should hopefully capture everything learned, but it's probably best to eventually close this with a short summary for other folks who end up here after finding the issue.

I'll need to use a normal dockerfile with a distribution that comes with glibc2.27?

Interestingly enough, it’s sort of the opposite. The lambda env doesn’t have 2.27, so we need a docker file that also doesn’t (unlike the default stack build images).

There’s an example Dockerfile in progress on #81 that should do the trick.

@MarcCoquand

Awesome, can confirm that with the new dockerfile it built without a problem.

Now my issue with haskell-language-server and stack docker still persists though... But that's an issue with HLS.

@AlexeyRaga

We are using a somewhat simpler solution.

The GLIBC problem doesn't exist if an "older" version of the OS is used, for example, Ubuntu 16.04.

So we use a build container that is based on 16.04, for example, quay.io/haskell_works/ghc-8.10.1:ubuntu-16.04 and the binary can then be used in AWS Lambda directly.

Take this example as an illustration:

$ mkdir ./bin
$ docker pull quay.io/haskell_works/ghc-8.10.1:ubuntu-16.04
$ docker run --rm \
            -v $PWD/bin:/opt/bin \
            -v $PWD:$PWD \
            -it quay.io/haskell_works/ghc-8.10.1:ubuntu-16.04 \
            /bin/sh -c "cd $PWD && cabal update && cabal install --installdir /opt/bin --install-method=copy"

$ echo $MY_EVENT_JSON | docker run --rm -i -e DOCKER_LAMBDA_USE_STDIN=1 -v $PWD/bin:/var/task lambci/lambda:provided

START RequestId: 56bc0919-8692-1775-01c7-fb7cad59e3d2 Version: $LATEST

I hope that it helps.

@IamfromSpace
Collaborator

I'd still really like to get a simple, elegant approach to this long term, as I feel like this does have an ergonomic impact. One thing I wanted to get down is that copyright/copyleft issues of complete static linking (like including glibc, musl, etc.) shouldn't be an issue for lambdas. This does get controversial, but practically, the copyright concerns are only a factor in distribution of statically linked software. Anyone running a lambda is just running the statically linked software, which is within the license. This means that with static linking, a brief warning about copyright concerns is probably above and beyond. More to do in looking down this path.


4 participants