The Surprisal Toolkit is a web-based user interface for computing surprisal measurements over text through the languagemodels toolkit.
Developed using: Angular 14.2.2 frontend, Python Flask 2.2.2 backend.
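As a point of reference for what the toolkit measures, the surprisal of a token is its negative log probability under a language model, so less probable tokens receive higher values. The following is a minimal sketch of that formula only; the function name `surprisal_bits` and the example probabilities are illustrative assumptions and are not part of the surprisal-toolkit or languagemodels code.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Return the surprisal -log2(p), in bits, of an event with probability p."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in the interval (0, 1]")
    return -math.log2(probability)


# Hypothetical per-token probabilities, not output from any real model.
token_probs = {"the": 0.5, "cat": 0.125, "sat": 0.25}
for token, p in token_probs.items():
    print(f"{token}: {surprisal_bits(p):.2f} bits")
```

In practice the probabilities would come from a language model's next-token distribution rather than a hand-written dictionary.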
(1) Create an empty project folder. Inside this folder, clone (if contributing) or download (if using only) the surprisal-toolkit and languagemodels repositories:
git clone https://github.com/uds-lsv/surprisal-toolkit.git
git clone https://github.com/uds-lsv/languagemodels.git
(2) Inside the main folder, create and activate a virtual environment using either:
- Python venv
  - Create a new virtual environment:
    python3 -m venv ./languagemodels-venv
  - Activate the virtual environment:
    source languagemodels-venv/bin/activate
  - Upgrade pip:
    pip install --upgrade pip
- Python miniconda
  - Create a new virtual environment:
    conda create --name languagemodels python=3.9
  - Activate the virtual environment:
    conda activate languagemodels
  - Upgrade pip:
    pip install --upgrade pip
With the virtual environment activated, install the requirements in the following steps.
(3) Install the appropriate version of PyTorch: https://pytorch.org/get-started/locally/
(4) Install the languagemodels package & requirements: pip install -e languagemodels
(5) Install the requirements for surprisal-toolkit: pip install -r surprisal-toolkit/requirements.txt
(6) Only if developing or modifying the Angular frontend: Navigate to folder surprisal-toolkit. Install npm dependencies: npm install
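Once steps (1) to (6) are done, a quick sanity check can confirm that an environment is active and that the main packages can be found. This is a sketch under assumptions: the import names `torch`, `flask`, and `languagemodels` are assumed from the package names above, and the helper names are hypothetical. It relies on `sys.prefix` differing from `sys.base_prefix` inside a venv, and on the `CONDA_DEFAULT_ENV` variable that conda sets on activation.

```python
import importlib.util
import os
import sys


def active_environment(prefix, base_prefix, environ):
    """Return a short label for the active environment, or None if none is detected."""
    if prefix != base_prefix:
        # Inside a venv, sys.prefix points at the environment directory.
        return "venv"
    conda_env = environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return f"conda:{conda_env}"
    return None


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names; adjust if the installed packages expose different ones.
REQUIRED = ["torch", "flask", "languagemodels"]

print("environment:", active_environment(sys.prefix, sys.base_prefix, os.environ))
print("missing modules:", missing_modules(REQUIRED))
```

An empty list of missing modules together with a `venv` or `conda:languagemodels` label suggests the installation steps took effect in the intended environment.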
- Activate the virtual environment, if not done so already: source languagemodels-venv/bin/activate or conda activate languagemodels.
- From the command line, navigate to folder surprisal-toolkit.
- Optional. (If needed to re-compile dist/ after changes. Otherwise, skip.) Run ng build --configuration production --build-optimizer. (This creates the dist/ folder in the Angular project directory, from which the Flask backend renders the built files.) This command only needs to be run once. If changes are made to surprisal-toolkit/src/, run it again to update dist/.
- Run python3 backend/main.py.
- In a browser, navigate to http://localhost:5000/.
(Use Control+C to stop running the application.)
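To confirm from a script that the backend started by the steps above is reachable, one option is a small standard-library check. This sketch assumes the default address http://localhost:5000/ given above; the function name `server_responds` is hypothetical and not part of the toolkit.

```python
import urllib.error
import urllib.request


def server_responds(url, timeout=2.0):
    """Return True if a GET request to url succeeds with a 2xx status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    url = "http://localhost:5000/"
    print(url, "is up" if server_responds(url) else "is not reachable")
```

Run it in a second terminal while python3 backend/main.py is still running; a "not reachable" result usually means the backend has not started or is listening on a different port.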
Note: the Angular dev server described below will not connect to functionality in the Python back-end files.
- Navigate to parent folder.
- Run ng serve for a dev server.
- Navigate to http://localhost:4200/. The application will automatically reload if you change any of the source files in src/.
(Use Control+C to stop running the application.)
The application can also be hosted on a web server. We use Apache 2.4 and mod_wsgi 4.7.1. As a reference, files for configuring the Flask application are stored under \server.
This code is described in our paper, "An Interactive Toolkit for Approachable NLP" by AriaRay Brown, Julius Steuer, Marius Mosbach, and Dietrich Klakow, presented at the TeachNLP Workshop at ACL 2024.
The Surprisal Toolkit can be used as a resource for teaching information theory along with the calculation of surprisal from large language models. We share our teaching materials here for your interest.