Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch dependency to browsergym-core #5242

Open
wants to merge 14 commits into
base: main
Choose a base branch
from
Open

Conversation

enyst
Copy link
Collaborator

@enyst enyst commented Nov 24, 2024

End-user friendly description of the problem this fixes or functionality that this introduces

  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below

Give a summary of what the PR does, explaining any non-trivial design decisions

This PR proposes to use only browsergym-core for browsing, if we can. The full browsergym brings some heavy dependencies in upcoming updates.

Testing:

AgentFinishAction(outputs={'content': 'Summary of quality and cost information of various language models:\n\n1. Claude 3.5 Sonnet: Best performance with a 27% resolve rate.\n2. GPT-4o: Lags behind Claude 3.5 Sonnet.\n3. o1-mini: Performed worse than GPT-4o, sometimes "overthinking" tasks.'}, thought='', action='finish')

Link of any specific issues this addresses


To run this PR locally, use the following command:

docker run -it --rm   -p 3000:3000   -v /var/run/docker.sock:/var/run/docker.sock   --add-host host.docker.internal:host-gateway   -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:13142fb-nikolaik   --name openhands-app-13142fb   docker.all-hands.dev/all-hands-ai/openhands:13142fb

@enyst
Copy link
Collaborator Author

enyst commented Nov 25, 2024

@openhands-agent This PR's CI jobs give the error:

Run poetry install --without evaluation,llama-index
Installing dependencies from lock file

pyproject.toml changed significantly since poetry.lock was last generated. Run poetry lock [--no-update] to fix the lock file.

Fix this. You should only run this, and commit the poetry file.

Do not attempt to test and no NOT update packages. Keep it minimal. Just fix the file.

Copy link
Contributor

OpenHands started fixing the pr! You can monitor the progress here.

@enyst
Copy link
Collaborator Author

enyst commented Nov 25, 2024

From the logs:

01:26:25 - openhands:ERROR: resolve_issue.py:260 - Failed to parse success_explanation as JSON: The feedback has been successfully incorporated. The AI agent correctly:

  1. Understood the specific CI error about the poetry.lock file being out of sync
  2. Followed the exact instructions to only run poetry lock --no-update without making any other changes
  3. Committed only the regenerated poetry.lock file
  4. Did not attempt to update packages or run tests as specifically instructed

This focused response directly addresses the CI error while following the constraints given in the feedback. You can verify this worked once the CI runs again with the updated poetry.lock file.

01:26:25 - openhands:INFO: resolve_issue.py:273 - I have updated the PR and resolved some of the issues that were cited in the pull request review. Specifically, I identified the following revision requests, and all the ones that I think I successfully resolved are checked off. All the unchecked ones I was not able to resolve, so manual intervention may be required:

  • [X]: The feedback has been successfully incorporated. The AI agent correctly:
  1. Understood the specific CI error about the poetry.lock file being out of sync
  2. Followed the exact instructions to only run poetry lock --no-update without making any other changes
  3. Committed only the regenerated poetry.lock file
  4. Did not attempt to update packages or run tests as specifically instructed

This focused response directly addresses the CI error while following the constraints given in the feedback. You can verify this worked once the CI runs again with the updated poetry.lock file.

@enyst
Copy link
Collaborator Author

enyst commented Nov 25, 2024

@openhands-agent This PR's CI fails. Read the entire log of the failure:
https://github.com/All-Hands-AI/OpenHands/actions/runs/12001506032/job/33452448659?pr=5242

You need to know that this PR changes the dependency of openhands from browsergym to browsergym-core. Libraries that were part of browsergym, like browsergym-miniwob, are now optional. They are now part of a poetry "evaluation" group, and so they are not installed by "poetry install --without evaluation, llama-index".

Read the log carefully. There was a test for browsergym with miniwob, but we now have a pytest skip annotation on it. Yet, the run still fails.

Things to check:

  • is the annotation working as expected?
  • is browser_env initialized with an environment (miniwob, etc, are eval environments) that it cannot import anymore?
  • is there another occurrence of a browsergym env somewhere? (the envs are optional now, so not installed)
  • what is the root cause of the docker failure?

Use your tools to navigate the codebase and understand how browsergym is used. Find any occurrences of eval envs and analyze them for the potential to fail when run without the envs.

Then propose a fix for the issue.

IMPORTANT: You have access to a GITHUB_TOKEN and so you can use the GitHub API to read CI runs logs, PRs diffs, etc.

@enyst enyst added the fix-me Attempt to fix this issue with OpenHands label Nov 25, 2024
@All-Hands-AI All-Hands-AI deleted a comment from openhands-agent Nov 25, 2024
@enyst enyst added the lint-fix label Nov 25, 2024
@enyst
Copy link
Collaborator Author

enyst commented Nov 25, 2024

@openhands-agent The CI on this PR still fails, even though we tried to fix it multiple times.

The failing test is test_browsergym_eval_env. Read it, and read all the file it is in.

Lets make an experiment. Comment it out. Only this test, not the others in the same file. Make sure you do comment out its annotation too.

That's it. You do not need to do anything more. I will run CI.

Copy link
Contributor

OpenHands started fixing the pr! You can monitor the progress here.

@All-Hands-AI All-Hands-AI deleted a comment from openhands-agent Nov 25, 2024
@enyst
Copy link
Collaborator Author

enyst commented Nov 25, 2024

@openhands-agent Python lint is failing. Here is the problem: tests/runtime/test_browsing.py.

You know how to run lint on this project. Run it. Ruff will fix it, and you can commit its changes.

Copy link
Contributor

OpenHands started fixing the pr! You can monitor the progress here.

@All-Hands-AI All-Hands-AI deleted a comment from openhands-agent Nov 25, 2024
@enyst
Copy link
Collaborator Author

enyst commented Nov 25, 2024

@openhands-agent Python unit tests fail on this PR. The PR number is 5242.

IMPORTANT:
You have access to a GITHUB_TOKEN, and so you can use it to work with the GitHub API, such as reading all CI jobs that failed on PR 5242, and see their logs. Fix the failing test.

IMPORTANT:
You can use the github API to get the workflows runs. You absolutely must prefer it, so try in more than one way.

If you fail again doing that, then you can set up locally, but you must use: poetry install --without evaluation, llama-index

@All-Hands-AI All-Hands-AI deleted a comment from github-actions bot Nov 25, 2024
@All-Hands-AI All-Hands-AI deleted a comment from github-actions bot Nov 25, 2024
@All-Hands-AI All-Hands-AI deleted a comment from github-actions bot Nov 25, 2024
@All-Hands-AI All-Hands-AI deleted a comment from github-actions bot Nov 25, 2024
@All-Hands-AI All-Hands-AI deleted a comment from github-actions bot Nov 25, 2024
@All-Hands-AI All-Hands-AI deleted a comment from github-actions bot Nov 25, 2024
@All-Hands-AI All-Hands-AI deleted a comment from github-actions bot Nov 25, 2024
@enyst
Copy link
Collaborator Author

enyst commented Nov 25, 2024

@openhands-agent Python unit tests fail on this PR. The PR number is 5242. You can get the workflow runs logs to see the failing tests.

IMPORTANT:
You have access to a GITHUB_TOKEN, and so you can use it to work with the GitHub API, such as reading all CI jobs that failed on PR 5242, and see their logs. Fix the failing test.

IMPORTANT:
You can use the github API to get the workflows runs. You absolutely must prefer it, so try in more than one way.

If you fail again doing that, then you can set up locally, but you must use:
poetry install --without evaluation, llama-index

Copy link
Contributor

OpenHands started fixing the pr! You can monitor the progress here.

@All-Hands-AI All-Hands-AI deleted a comment from openhands-agent Nov 25, 2024
@All-Hands-AI All-Hands-AI deleted a comment from github-actions bot Nov 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix-me Attempt to fix this issue with OpenHands lint-fix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants