Skip to content

llamafile v0.8.16

Latest
Compare
Choose a tag to compare
@jart jart released this 02 Nov 03:42
· 56 commits to main since this release
011d720
  • Add Julia syntax highlighting support
  • Fix possible crash on Windows due to MT bug
  • Improve accuracy of chatbot context window management
  • The new llamafiler server now supports GPU. Pass the -ngl 999 flag.
  • The new llamafiler server's /v1/chat/completions endpoint now supports prompt caching. It may be configured using the --slots COUNT and --ctx-size TOKENS flags.