Alpha v0.2

Pre-release
@LouisCastricato released this 21 Oct 22:20
06cd30f

Complete revamp of our initial release.

New features:

  • Hydra models, 20x faster than vanilla PPO with minimal loss in performance at large scales.
  • Massively revamped API with significantly less boilerplate.
  • Save/load callbacks.
  • Greatly improved orchestrator.
  • Better-commented RL code, so it is easier to understand what's going on.
  • Cool examples, including architext and simulacra.
  • Better extensibility and standardized styling.
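
For readers unfamiliar with the Hydra idea, here is a minimal sketch. It assumes "Hydra" means a policy branch and a frozen reference branch that share one frozen trunk, so the trunk runs once per forward pass instead of once per model. Every name below (`SharedTrunk`, `make_head`, `hydra_forward`) is hypothetical and is not part of the trlX API. The toy math only illustrates the sharing pattern and does not reproduce the speedup quoted above.

```python
import math


class SharedTrunk:
    """Hypothetical stand-in for the frozen lower layers shared by both branches."""

    def __init__(self):
        self.calls = 0  # counts how many times the expensive trunk is evaluated

    def __call__(self, x):
        self.calls += 1
        return [math.tanh(v) for v in x]


def make_head(scale):
    """Hypothetical stand-in for a small stack of top layers that emits logits."""
    def head(hidden):
        return [scale * h for h in hidden]
    return head


def log_softmax(logits):
    # Numerically stable log-softmax over a flat list of logits.
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]


def hydra_forward(trunk, policy_head, reference_head, x):
    # The trunk is evaluated once; both branches reuse the same hidden state.
    hidden = trunk(x)
    return log_softmax(policy_head(hidden)), log_softmax(reference_head(hidden))


def kl_divergence(logp, logq):
    # KL(p || q) computed from log-probabilities.
    return sum(math.exp(p) * (p - q) for p, q in zip(logp, logq))


trunk = SharedTrunk()
policy_logp, ref_logp = hydra_forward(
    trunk, make_head(1.5), make_head(1.0), [0.2, -0.4, 0.9]
)
kl = kl_divergence(policy_logp, ref_logp)
```

In this sketch the KL term between policy and reference comes from a single trunk evaluation (`trunk.calls` stays at 1). A setup with two fully separate models would need two full forward passes for the same quantity.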

Features coming soon:

  • Megatron support! We're already working on this.
  • More interesting examples that are relevant to production use cases of TRLX.
  • Better integration of W&B, including sweeps.
  • Evaluation and benchmarking.

:)

Autogenerated release notes below:

What's Changed

New Contributors

Full Changelog: https://github.com/CarperAI/trlx/commits/v0.2