Pyserini: Reproducing BPR Results

Binary passage retriever (BPR) is a two-stage ranking approach that represents passages as both binary codes and dense vectors, for memory efficiency and effectiveness.

Ikuya Yamada, Akari Asai, Hannaneh Hajishirzi. Efficient Passage Retrieval with Hashing for Open-domain Question Answering. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 979-986, 2021.
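To make the memory argument concrete, here is a minimal sketch of how a dense vector can be turned into a compact binary code. It assumes 768-dimensional float32 embeddings and sign-based hashing at inference time; neither detail is stated in this guide, and the helper name `hamming` is purely illustrative.

```python
import numpy as np

DIM = 768  # assumed embedding width; not stated in this guide

rng = np.random.default_rng(0)
dense = rng.standard_normal((4, DIM)).astype(np.float32)  # 4 toy "passages"

# Binarize by sign (one bit per dimension), then pack 8 bits into each byte.
bits = (dense > 0).astype(np.uint8)
codes = np.packbits(bits, axis=1)

dense_bytes = dense.shape[1] * dense.itemsize  # bytes per float32 vector
code_bytes = codes.shape[1]                    # bytes per packed binary code
print(dense_bytes, code_bytes, dense_bytes // code_bytes)  # 3072 96 32


def hamming(a, b):
    """Number of differing bits between two packed uint8 codes."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())
```

Under these assumptions each passage shrinks from 3072 bytes to 96 bytes, and codes can be compared with XOR plus a bit count instead of a floating-point inner product.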

We have replicated BPR's results and incorporated the model into Pyserini. To be clear, we started with the model checkpoint and index releases from the official BPR repo and did not train the query and passage encoders from scratch.

This guide provides instructions for reproducing BPR's results.

Summary

Here's how our results stack up against results reported in the paper using the BPR model (index 2.3 GB + model 0.4 GB):

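| Dataset | Method               | Top-20 (orig) | Top-20 (us) | Top-100 (orig) | Top-100 (us) |
|:--------|:---------------------|--------------:|------------:|---------------:|-------------:|
| NQ      | BPR                  |          77.9 |        77.9 |           85.7 |         85.7 |
| NQ      | BPR w/o reranking    |          76.5 |        76.0 |           84.9 |         85.0 |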

Natural Questions (NQ) with BPR

BPR with brute-force index:

python -m pyserini.search.faiss \
  --index wikipedia-dpr-100w.bpr-single-nq \
  --topics dpr-nq-test \
  --encoded-queries bpr_single_nq-nq-test \
  --output runs/run.bpr.rerank.nq-test.nq.hash.trec \
  --batch-size 512 --threads 16 \
  --hits 100 --binary-hits 1000 \
  --searcher bpr --rerank

The option --encoded-queries specifies the use of encoded queries (i.e., queries that have already been converted into dense vectors and cached).
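The two-stage search that the command above requests can be sketched in a few lines of NumPy. This is a toy illustration, not Pyserini's implementation. It assumes that `--binary-hits` sets the number of first-stage candidates and `--hits` the size of the final list (as the flag names suggest), and that reranking scores each candidate by the inner product between the dense query vector and the +1/-1 expansion of the passage code (our reading of the BPR paper, not something this guide states). The function names `to_code` and `two_stage_search` are hypothetical.

```python
import numpy as np


def to_code(vectors):
    """Sign-binarize float vectors and pack the bits into uint8 codes."""
    return np.packbits((vectors > 0).astype(np.uint8), axis=-1)


def two_stage_search(query_vec, passage_codes, hits=100, binary_hits=1000):
    """Toy two-stage search: Hamming candidates, then dense-query rerank."""
    dim = query_vec.shape[0]
    query_code = to_code(query_vec)

    # Stage 1: Hamming distance between the query code and every passage code.
    xor = np.bitwise_xor(passage_codes, query_code)
    distances = np.unpackbits(xor, axis=1).sum(axis=1)
    binary_hits = min(binary_hits, len(passage_codes))
    candidates = np.argsort(distances, kind="stable")[:binary_hits]

    # Stage 2: rerank candidates by the inner product between the dense query
    # and the +1/-1 expansion of each candidate's binary code.
    cand_bits = np.unpackbits(passage_codes[candidates], axis=1)[:, :dim]
    cand_pm1 = cand_bits.astype(np.float32) * 2.0 - 1.0
    scores = cand_pm1 @ query_vec
    order = np.argsort(-scores, kind="stable")[:hits]
    return candidates[order], scores[order]


# Toy corpus: 5000 random 64-dimensional "passages"; the query is a noisy
# copy of passage 42.
rng = np.random.default_rng(0)
passages = rng.standard_normal((5000, 64)).astype(np.float32)
codes = to_code(passages)
query = passages[42] + 0.1 * rng.standard_normal(64).astype(np.float32)

ids, scores = two_stage_search(query, codes, hits=10, binary_hits=100)
print(ids[:3], scores[:3])
```

The first stage only touches packed bytes, so it stays cheap even over a large corpus; the second stage spends floating-point work on just `binary_hits` candidates.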

To evaluate, first convert the TREC output format to DPR's JSON format, then compute top-k retrieval accuracy:

python -m pyserini.eval.convert_trec_run_to_dpr_retrieval_run \
  --index wikipedia-dpr \
  --topics dpr-nq-test \
  --input runs/run.bpr.rerank.nq-test.nq.hash.trec \
  --output runs/run.bpr.rerank.nq-test.nq.hash.json

python -m pyserini.eval.evaluate_dpr_retrieval \
  --retrieval runs/run.bpr.rerank.nq-test.nq.hash.json \
  --topk 20 100

Results:

Top20  accuracy: 0.7792
Top100 accuracy: 0.8571
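For readers unfamiliar with the metric, top-k accuracy is the fraction of questions for which at least one of the first k retrieved passages contains an answer. The sketch below computes it over a simplified, hypothetical structure in which every retrieved context already carries a `has_answer` flag; the real evaluation script may derive that flag differently (for example, by matching answer strings against passage text), and the function name `topk_accuracy` is illustrative only.

```python
def topk_accuracy(retrieval, ks=(20, 100)):
    """Fraction of questions with an answer-bearing passage in the top k."""
    accuracy = {}
    for k in ks:
        answered = sum(
            any(ctx["has_answer"] for ctx in entry["contexts"][:k])
            for entry in retrieval.values()
        )
        accuracy[k] = answered / len(retrieval)
    return accuracy


# Hypothetical two-question run; contexts are listed in rank order.
toy = {
    "q1": {"contexts": [{"docid": "a", "has_answer": False},
                        {"docid": "b", "has_answer": True}]},
    "q2": {"contexts": [{"docid": "c", "has_answer": False},
                        {"docid": "d", "has_answer": False}]},
}
print(topk_accuracy(toy, ks=(1, 2)))  # {1: 0.0, 2: 0.5}
```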

Reproduction Log