Commit

chore: refactor the code, remove data files
honghanhh committed Oct 31, 2024
1 parent 95a4bd6 commit 034c49b
Showing 5 changed files with 4 additions and 6,770 deletions.
7 changes: 3 additions & 4 deletions lib/questions_eval/bash/experiments/tiny_mimoracle.sh
@@ -1,4 +1,3 @@
-# python run_mimoracle.py -m model=gpt-4o samples=18 num_questions=5
-# python run_mimoracle.py -m model=gpt-4o-mini samples=18 num_questions=5
-# python run_mimoracle.py -m model=llama3.1-405b-local samples=18 num_questions=5
-python run_mimoracle.py -m model=gpt-4o samples=2 num_questions=2
+python run_mimoracle.py -m model=gpt-4o samples=18 num_questions=5
+python run_mimoracle.py -m model=gpt-4o-mini samples=18 num_questions=5
+python run_mimoracle.py -m model=llama3.1-405b-local samples=18 num_questions=5
@@ -13,7 +13,7 @@ def _preprocess(text: str) -> str:
 def _resample(df: pd.DataFrame, n_sample: int, n_section: int) -> pd.DataFrame:
     patterns = "allergies|history of present illness|past medical history|\
 discharge medications|social history|medications on admission"
-    df["section_title"] = [_preprocess(x) for x in df["section_title"]]
+    df["section_title"] = df["section_title"].apply(_preprocess)
     df = df[df.section_title.str.contains(patterns)]
     df = df.groupby("section_title").filter(lambda x: len(x) > n_sample)
     df = df.groupby("document_id").filter(lambda x: len(x) == n_section)
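
Review note: the hunk above swaps a list comprehension for `Series.apply`, which should be behavior-preserving. The sketch below checks that on a toy frame and walks through the three filtering steps. The body of `_preprocess` is not shown in this diff, so the version here is a hypothetical stand-in, and `resample_sketch`, the shortened `patterns` string, and the toy data are assumptions made only for illustration.

```python
import pandas as pd


def _preprocess(text: str) -> str:
    # Hypothetical stand-in: the real _preprocess body is not part of this diff.
    return text.strip().lower()


def resample_sketch(df: pd.DataFrame, n_sample: int, n_section: int) -> pd.DataFrame:
    # Shortened pattern list, for illustration only.
    patterns = "allergies|social history"
    df = df.copy()  # avoid mutating the caller's frame in this sketch
    df["section_title"] = df["section_title"].apply(_preprocess)
    # Keep only rows whose title matches one of the section patterns (regex match).
    df = df[df.section_title.str.contains(patterns)]
    # Keep section titles that occur more than n_sample times overall.
    df = df.groupby("section_title").filter(lambda x: len(x) > n_sample)
    # Keep documents that still have exactly n_section rows.
    df = df.groupby("document_id").filter(lambda x: len(x) == n_section)
    return df


toy = pd.DataFrame(
    {
        "document_id": [1, 1, 2, 2, 3, 3],
        "section_title": [
            " Allergies",
            "Social History ",
            "allergies",
            "social history",
            "Allergies",
            "Family History",
        ],
    }
)

# The old and new forms of the changed line yield the same values.
old_style = [_preprocess(x) for x in toy["section_title"]]
new_style = toy["section_title"].apply(_preprocess)
same = old_style == new_style.tolist()

result = resample_sketch(toy, n_sample=1, n_section=2)
```

On this toy input, document 3 drops out because only one of its sections matches `patterns`, so it no longer has exactly `n_section` rows after the first filter.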