Infinite repetitions and invalid JSON - Outlines with MLX #1131

Open
ea167 opened this issue Sep 5, 2024 · 1 comment · May be fixed by #1134

ea167 commented Sep 5, 2024

Describe the issue as clearly as possible:

On certain prompts, the LLM can spiral into an infinite loop, emitting the same item repeatedly until it is stopped by the max_tokens parameter.

In that case, the truncated JSON is invalid, so generation fails with an exception instead of returning any result.

Llama.cpp and MLX-LM both expose parameters that penalize repetition and thus prevent it. While Outlines accepts additional parameters to pass through to Llama.cpp, it does not for MLX-LM, so such prompts fail. A hedged sketch of calling MLX-LM directly with a repetition penalty is shown below the attached prompt.

long_42k_llm_prompt.md
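For illustration, here is a minimal sketch of calling MLX-LM directly with a repetition penalty, assuming the installed mlx-lm version (0.18.1 here) accepts a repetition_penalty keyword on generate(); this is exactly the kind of argument that Outlines currently has no way to forward to MLX-LM:

# Minimal sketch, not part of the repro below.
# Assumes this mlx-lm version accepts a repetition_penalty keyword on generate().
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Meta-Llama-3.1-8B-Instruct-4bit")

with open("long_42k_llm_prompt.md") as f:
    long_prompt = f.read()

answer = generate(
    model,
    tokenizer,
    prompt=long_prompt,
    max_tokens=1000,
    repetition_penalty=1.2,  # values > 1.0 discount already-generated tokens
)
print(answer)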

Steps/code to reproduce the bug:

RESULTS_JSON_SCHEMA = """{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
 "results": {
  "type": "array",
  "items": {
   "type": "string"
  }
 }
},
"required": ["results"],
"additionalProperties": false
}"""
 
 
from outlines import models, generate, samplers
import json

# Long prompt attached to this issue (long_42k_llm_prompt.md)
with open("long_42k_llm_prompt.md") as f:
    long_42k_llm_prompt = f.read()

model = models.mlxlm("mlx-community/Meta-Llama-3.1-8B-Instruct-4bit")
sampler = samplers.multinomial(top_p=0.1)
generator = generate.json(model, RESULTS_JSON_SCHEMA, sampler)

json_answer = generator(long_42k_llm_prompt, max_tokens=1000)
print(json.dumps(json_answer, indent=4))
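For context, the desired call would look something like the sketch below; the repetition_penalty keyword here is purely hypothetical and only illustrates the kind of pass-through that PR #1134 is meant to enable (the PR defines the actual mechanism):

# Hypothetical sketch: repetition_penalty is NOT a current Outlines kwarg
# for MLX-LM models; it only illustrates the desired pass-through.
json_answer = generator(
    long_42k_llm_prompt,
    max_tokens=1000,
    repetition_penalty=1.2,
)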

Expected result:

A list without endless repetition at the end.

When running MLX-LM directly, we get an infinite loop, stopped only by max_tokens:

python -m mlx_lm.generate --model mlx-community/Meta-Llama-3.1-8B-Instruct-4bit --prompt "$(< ~/Downloads/long_42k_llm_prompt.md)" --max-tokens 5000

...
687. **Methodist Hospital**
688. **Methodist Hospital**
689. **Methodist Hospital**
690. **Methodist Hospital**
691. **Methodist Hospital**
692. **Methodist Hospital**
693. **Methodist Hospital**
694. **Methodist Hospital**
695. **Methodist Hospital**
696. **Methodist Hospital**
697. **Methodist Hospital**

==========
Prompt: 11380 tokens, 432.382 tokens-per-sec
Generation: 5000 tokens, 26.872 tokens-per-sec
Peak memory: 6.891 GB

Error message:

No response

Outlines/Python version information:

Version information

0.0.47.dev69+g72377db
Python 3.12.4
mlx==0.17.2
mlx-lm==0.18.1

Context for the issue:

No response

ea167 commented Sep 6, 2024

I created PR #1134 to fix this problem.

Please review and merge it.

rlouf added the JSON label Sep 9, 2024