Actions: EleutherAI/lm-evaluation-harness
3,100 workflow runs
metrics and filter to logged sample (#2517)
Unit Tests #3779: Commit 5680a2e pushed by baberabb
metrics and filter to logged sample
Unit Tests #3777: Pull request #2517 synchronize by baberabb
metrics and filter to logged sample
Unit Tests #3776: Pull request #2517 synchronize by baberabb
metrics and filter to logged sample
Unit Tests #3775: Pull request #2517 synchronize by baberabb
metrics and filter to logged sample
Unit Tests #3774: Pull request #2517 synchronize by baberabb
--examples Argument for Fine-Grained Task Evaluation in lm-evaluation-harness. This feature is the first step towards efficient multi-prompt evaluation with PromptEval [1,2]
Unit Tests #3770: Pull request #2520 synchronize by mirianfsilva
--examples Argument for Fine-Grained Task Evaluation in lm-evaluation-harness. This feature is the first step towards efficient multi-prompt evaluation with PromptEval [1,2]
Unit Tests #3769: Pull request #2520 opened by felipemaiapolo
until
Unit Tests #3763: Pull request #2518 synchronize by baberabb
until
Unit Tests #3762: Pull request #2518 opened by baberabb
metrics and filter to logged sample
Unit Tests #3761: Pull request #2517 synchronize by baberabb
metrics and filter to logged sample
Unit Tests #3760: Pull request #2517 opened by baberabb