How can I get the dataset? #2

Open
hxiaoj opened this issue Sep 16, 2024 · 1 comment
Comments

@hxiaoj

hxiaoj commented Sep 16, 2024

How can I get the dataset? I have downloaded the dataset from Eedi, but I don't know how to process it.

@alexscarlatos
Contributor

alexscarlatos commented Sep 30, 2024

Hi! Apologies for the late response. The data files (.csv) should have the following columns:

  • id: the unique question ID
  • question: the question text
  • correct_option: a JSON object containing:
    • option: the correct option text
    • explanation: the textual solution for the question
  • construct_info: a JSON object containing:
    • construct1: a list for the top-level construct: [construct ID: int, construct description: str]
    • construct2: a list for the mid-level construct: [construct ID: int, construct description: str]
    • construct3: a list for the low-level construct: [construct ID: int, construct description: str]
  • distractors: a JSON list of 3 objects (one for each distractor), each containing:
    • option_idx: the index of the option in the question (1-4)
    • option: the distractor option text
    • explanation: the feedback/explanation text for the distractor
    • proportion: the percentage of students that chose this option (set to 0 if unknown)
    • misconception: the text for the misconception for this distractor (optional, only needed for misconception-based ICL prompts or rule-based method)
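For illustration, here is a minimal sketch of one row in the layout described above, written with Python's standard `csv` and `json` modules. The question, IDs, explanations, and misconception text are all made up, and the helper name `write_rows` is hypothetical; it is not part of the repository. The three nested columns are stored as JSON strings inside the CSV cells.

```python
import csv
import io
import json

# Column names as listed in the comment above.
COLUMNS = ["id", "question", "correct_option", "construct_info", "distractors"]

# Hypothetical example row: every value below is invented for illustration.
row = {
    "id": 101,
    "question": "What is 1/2 + 1/4?",
    "correct_option": json.dumps({
        "option": "3/4",
        "explanation": "Rewrite 1/2 as 2/4, then add the numerators: 2/4 + 1/4 = 3/4.",
    }),
    "construct_info": json.dumps({
        "construct1": [1, "Number"],
        "construct2": [12, "Fractions"],
        "construct3": [123, "Adding fractions with different denominators"],
    }),
    # Three distractors; the correct answer is assumed to sit at option index 2.
    "distractors": json.dumps([
        {"option_idx": 1, "option": "2/6",
         "explanation": "The numerators and denominators were added separately.",
         "proportion": 0,  # 0 means the proportion is unknown
         "misconception": "Adds numerators and denominators when adding fractions"},
        {"option_idx": 3, "option": "1/8",
         "explanation": "The fractions were multiplied instead of added.",
         "proportion": 0,
         "misconception": "Multiplies fractions when asked to add them"},
        {"option_idx": 4, "option": "2/4",
         "explanation": "Only the numerators were added, without converting 1/2.",
         "proportion": 0,
         "misconception": "Adds numerators without finding a common denominator"},
    ]),
}


def write_rows(rows, fileobj):
    """Write rows to an open text file object in the column layout above."""
    writer = csv.DictWriter(fileobj, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


# Written to an in-memory buffer here; a real file would be opened with
# open(path, "w", newline="", encoding="utf-8") instead.
buffer = io.StringIO()
write_rows([row], buffer)
```

The `csv` module handles quoting of the embedded JSON, so the nested objects can contain commas and double quotes without breaking the file.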

You can specify the train and test file paths using the data.trainFilepath and data.testFilepath command line arguments. Hope that helps!
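Before pointing `data.trainFilepath` and `data.testFilepath` at your own files, a quick sanity check may save a failed run. The sketch below is not part of the repository; `check_rows` is a hypothetical helper that only verifies the column layout described in this comment, and the repository's own loader may be stricter or more lenient.

```python
import csv
import io
import json

REQUIRED_COLUMNS = {"id", "question", "correct_option", "construct_info", "distractors"}


def check_rows(fileobj):
    """Return a list of problems; an empty list means none were found."""
    reader = csv.DictReader(fileobj)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        return [f"missing columns: {sorted(missing)}"]

    problems = []
    for record_no, row in enumerate(reader, start=1):
        # The three nested columns must each hold valid JSON.
        try:
            correct = json.loads(row["correct_option"])
            constructs = json.loads(row["construct_info"])
            distractors = json.loads(row["distractors"])
        except (json.JSONDecodeError, TypeError) as err:
            problems.append(f"record {record_no}: invalid JSON ({err})")
            continue

        if not (isinstance(correct, dict)
                and {"option", "explanation"} <= set(correct)):
            problems.append(f"record {record_no}: correct_option lacks option/explanation")

        if not (isinstance(constructs, dict)
                and {"construct1", "construct2", "construct3"} <= set(constructs)):
            problems.append(f"record {record_no}: construct_info lacks construct1-3")

        if not (isinstance(distractors, list) and len(distractors) == 3):
            problems.append(f"record {record_no}: distractors is not a list of 3 objects")
            continue

        for d in distractors:
            # option_idx is described as an index in the range 1-4.
            if not isinstance(d, dict) or d.get("option_idx") not in (1, 2, 3, 4):
                problems.append(f"record {record_no}: distractor has a bad option_idx")
    return problems
```

To use it on a file, open the CSV with `open(path, newline="", encoding="utf-8")` and pass the file object to `check_rows`; an empty list means the layout checks passed.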
