Actions: EleutherAI/lm-evaluation-harness
3,053 workflow runs
--examples Argument for Fine-Grained Task Evaluation in lm-evaluation-harness. This feature is the first step towards efficient multi-prompt evaluation with PromptEval [1,2]
Unit Tests #3770: Pull request #2520 synchronize by mirianfsilva
--examples Argument for Fine-Grained Task Evaluation in lm-evaluation-harness. This feature is the first step towards efficient multi-prompt evaluation with PromptEval [1,2]
Unit Tests #3769: Pull request #2520 opened by felipemaiapolo
until
Unit Tests #3763: Pull request #2518 synchronize by baberabb
until
Unit Tests #3762: Pull request #2518 opened by baberabb
metrics and filter to logged sample
Unit Tests #3761: Pull request #2517 synchronize by baberabb
metrics and filter to logged sample
Unit Tests #3760: Pull request #2517 opened by baberabb