Describe the issue as clearly as possible:
On certain prompts, the LLM can spiral into an infinite loop, emitting the same item over and over until it is cut off by the max_tokens parameter.
When that happens, the output fails JSON validation with an exception and no result is returned.
Both llama.cpp and MLX-LM expose parameters that penalize repetition and thus prevent the loop.
However, while Outlines accepts additional parameters to pass through to llama.cpp, it does not for MLX-LM,
so such prompts fail.
long_42k_llm_prompt.md
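
For reference, both backends expose such a penalty natively. A rough sketch (exact signatures depend on the installed llama-cpp-python and mlx-lm versions, and the model paths/names are illustrative):

```python
from llama_cpp import Llama
from mlx_lm import load, generate

# llama-cpp-python: repeat_penalty is a sampling parameter on completion calls.
llm = Llama(model_path="./mistral-7b-instruct.Q4_K_M.gguf")
llm("List the items.", max_tokens=256, repeat_penalty=1.1)

# mlx-lm: generate() accepts a repetition_penalty kwarg (in the versions
# current at the time of writing; newer releases may route this through
# a sampler instead).
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.2-4bit")
generate(model, tokenizer, prompt="List the items.", max_tokens=256,
         repetition_penalty=1.1)
```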
Steps/code to reproduce the bug:
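A minimal sketch of the failure mode, using the attached prompt (the schema and model names are placeholders; the llama.cpp pass-through kwarg is assumed to be repeat_penalty, as in llama-cpp-python):

```python
import outlines

prompt = open("long_42k_llm_prompt.md").read()
schema = (
    '{"type": "object", "properties": '
    '{"items": {"type": "array", "items": {"type": "string"}}}, '
    '"required": ["items"]}'
)

# llama.cpp backend: extra sampling kwargs are forwarded to the backend,
# so the repetition loop can be suppressed.
llama_model = outlines.models.llamacpp(
    "TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
    "mistral-7b-instruct-v0.2.Q4_K_M.gguf",
)
llama_gen = outlines.generate.json(llama_model, schema)
llama_gen(prompt, max_tokens=4096, repeat_penalty=1.1)  # returns valid JSON

# MLX-LM backend: there is no way to pass repetition_penalty through, so
# on this prompt the model repeats one item until max_tokens is hit and
# the truncated output fails JSON validation.
mlx_model = outlines.models.mlxlm("mlx-community/Mistral-7B-Instruct-v0.2-4bit")
mlx_gen = outlines.generate.json(mlx_model, schema)
mlx_gen(prompt, max_tokens=4096)  # raises: invalid JSON
```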
Expected result:
Error message:
No response
Outlines/Python version information:
Context for the issue:
No response