Interactive HTML Intersecting Texts Visualizer

This script generates an interactive HTML visualization of intersecting texts, featuring word counters, hover effects, and dynamic highlighting. It now supports multiple file formats and URL-based input.

Example output

https://genaforvena.github.io/tikziod-thinking/

Scum Manifesto - The only thing worth reading. The rest are paste.
Scum Manifesto - degendered - Removing all gender-related words from Scum Manifesto makes it even more true.
Tolstoy's Gospel - A visualization of the Gospel by Tolstoy in correct form.
Minima Moralia - A visualization of the book "Minima Moralia" by Theodor W. Adorno.
LP Tractatus - Young Wittgenstein's Tractatus Logico-Philosophicus.
Worstward Ho by Beckett - A visualization of the book "Worstward Ho" by Samuel Beckett.
Илья Масодов - Мрак твоих глаз - Ilya Masodov's "Мрак твоих глаз" ("The Darkness of Your Eyes"), a masterpiece in Russian, yet to be translated into English.
Ilya Masodov - The darkness of your eyes - The masterpiece above in English, roughly translated with small LLMs (low quality, yet to be edited).
Anti-Oedipus - A visualization of the book "Anti-Oedipus" by Deleuze and Guattari. It is large and may take a while to load.
Finnegans Wake - A visualization of the book "Finnegans Wake" by James Joyce.

Supported File Formats

The tool supports the following file formats:

Plain text (.txt)
PDF (.pdf)
Microsoft Word (.docx)
Markdown (.md)
HTML (.html, .htm)
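
As a rough illustration of how the extensions above could be mapped to a handler, here is a minimal sketch. The `SUPPORTED_FORMATS` table, the format labels, and the `detect_format` helper are hypothetical names chosen for this example; they are not taken from the project's code, which may detect formats differently.

```python
from pathlib import Path

# Hypothetical lookup table: extension -> format label.
# The labels are illustrative only.
SUPPORTED_FORMATS = {
    ".txt": "text",
    ".pdf": "pdf",
    ".docx": "docx",
    ".md": "markdown",
    ".html": "html",
    ".htm": "html",
}


def detect_format(path: str) -> str:
    """Return the format label for a file path, based on its extension."""
    suffix = Path(path).suffix.lower()  # case-insensitive match
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported file format: {suffix or '(none)'}")
    return SUPPORTED_FORMATS[suffix]
```

Failing early with a clear error for an unknown extension keeps unsupported files from reaching a parser that cannot read them.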

Basic Usage

The script supports three input methods:

Local File Input

Prepare your input file in any of the supported formats.
Run the script with the file input option:
```
python main.py -f input.txt
```
Replace input.txt with your file name and appropriate extension.

Direct Text Input

You can provide texts directly as command-line arguments:

```
python main.py -t "First text here" "Second text here" "Third text here"
```

Remote File Download

You can now provide a URL to download and process a file:

```
python main.py -u https://example.com/path/to/document.pdf
```

The script will download the file, detect its format, and process it accordingly.
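
For readers curious how three input methods like these can be wired into one command line, the sketch below uses `argparse` from the standard library. Only the short flags `-f`, `-t`, and `-u` come from this README; the long option names, the `build_parser` helper, and the choice to make the three options mutually exclusive are assumptions, and main.py may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts exactly one of the three input methods."""
    parser = argparse.ArgumentParser(
        description="Generate an interactive HTML visualization of texts."
    )
    # Assumption: the input methods are treated as mutually exclusive.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("-f", "--file", help="path to a local input file")
    source.add_argument(
        "-t", "--texts", nargs="+", help="one or more texts given directly"
    )
    source.add_argument("-u", "--url", help="URL of a file to download")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["-t", "First text here", "Second text here"])
print(args.texts)
```

With `nargs="+"`, each quoted string after `-t` arrives as a separate list item, which matches the direct text input form shown above.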

Viewing and Interacting with the Visualization

After running the script:

An interactive HTML file named index.html will be generated in the docs folder.
Open this file in any modern web browser to view the visualization.
The visualization offers a rich set of interactive features:
- Word Selection: Click on any word to highlight all its occurrences across the text(s).
- Word Controls: For each selected word, a control panel appears with options to remove, strike out, or navigate between occurrences.
- Frequency Slider: Use the slider to hide less frequent words, dynamically updating the visualization.
- Hidden Words Popup: View a list of words hidden by the frequency slider.
- Search Functionality: Use the search bar to find specific words or phrases in the text.
- Shareable State: Generate a shareable link that captures the current state of your visualization.
Additional Interactive Elements:
- Hover over words to see their frequency across all texts.
- The font size of words reflects their frequency or importance in the text.
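
The frequency-driven features above (hover counts, font size, and the frequency slider) all rest on a word count across the texts. The sketch below shows one simple way to compute it. It uses a plain regular expression rather than NLTK so that it runs without extra dependencies, and every function name, the linear size scale, and the pixel bounds are assumptions made for illustration, not the project's actual logic.

```python
import re
from collections import Counter


def count_words(texts):
    """Count case-insensitive word occurrences across all texts."""
    counts = Counter()
    for text in texts:
        # \w+ also matches non-Latin letters, e.g. Cyrillic.
        counts.update(re.findall(r"\w+", text.lower()))
    return counts


def font_size(count, max_count, min_px=12, max_px=36):
    """Map a word count linearly onto a font size in pixels."""
    if max_count <= 1:
        return min_px
    scale = (count - 1) / (max_count - 1)
    return round(min_px + scale * (max_px - min_px))


def hidden_words(counts, threshold):
    """Words a frequency slider set to `threshold` would hide."""
    return sorted(word for word, n in counts.items() if n < threshold)


counts = count_words(["the cat sat", "The cat ran"])
print(counts["the"], font_size(counts["the"], max(counts.values())))
print(hidden_words(counts, threshold=2))
```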

Advanced Features

Multi-format Support: The tool can process various text formats, automatically detecting and handling the file type.
Remote File Processing: Ability to download and process files from URLs, expanding the range of accessible texts.
Natural Language Processing: Utilizes NLTK for advanced text tokenization and analysis.
LaTeX Integration: Uses Jinja2 for potential LaTeX template rendering, useful for academic or scientific texts.
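
To make the automatic format detection for remote files more concrete, here is a minimal sketch that guesses a format from a URL and, if the URL has no recognizable extension, from the `Content-Type` response header. It performs no network access itself. The function name `guess_remote_format`, the two lookup tables, and the "extension first, header second" order are assumptions for this example; the project's own detection may work differently.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical lookup tables; the format labels are illustrative only.
EXTENSIONS = {
    ".txt": "text", ".pdf": "pdf", ".docx": "docx",
    ".md": "markdown", ".html": "html", ".htm": "html",
}
CONTENT_TYPES = {
    "text/plain": "text",
    "application/pdf": "pdf",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document": "docx",
    "text/markdown": "markdown",
    "text/html": "html",
}


def guess_remote_format(url, content_type=None):
    """Return a format label for a remote file, or None if unknown."""
    # 1. Try the extension of the URL path (query string is ignored).
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    if suffix in EXTENSIONS:
        return EXTENSIONS[suffix]
    # 2. Fall back to the MIME type, dropping parameters such as charset.
    if content_type:
        mime = content_type.split(";", 1)[0].strip().lower()
        return CONTENT_TYPES.get(mime)
    return None


print(guess_remote_format("https://example.com/path/to/document.pdf"))
```

Checking the extension first is a deliberate choice here: servers often serve Markdown files as `text/plain`, so the header alone would misclassify them.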

Customization

You can customize the visualization by modifying the interactive.js file:

Adjust the color scheme for highlighted words
Modify the behavior of word selection and navigation
Add new interactive features or buttons
Customize the styling of various elements (words, control panels, popups)

Dependencies

To run this script with all features, you need to have the following Python libraries installed:

numpy (1.21.0): For numerical computations
matplotlib (3.4.2): For data visualization
nltk (3.6.2): For natural language processing
PyPDF2 (3.0.1): For PDF file support
python-docx (0.8.11): For DOCX file support
markdown (3.3.4): For Markdown file support
beautifulsoup4 (4.9.3): For HTML parsing
requests (2.25.1): For downloading files from URLs
jinja2 (3.0.1): For template rendering

Development dependencies:

pylint (2.8.3): For code linting
black (21.6b0): For code formatting

You can install the required dependencies using the provided requirements.txt file:

```
pip install -r requirements.txt
```

Troubleshooting

If you encounter issues with PDF processing, ensure you have the PyPDF2 version listed under Dependencies (3.0.1) installed.
For NLTK-related functions, you may need to download additional NLTK data. Refer to the NLTK documentation for details.
When processing files from URLs, ensure you have a stable internet connection and the URL is accessible.
If a specific file format fails to process, check that you have the necessary dependencies installed and the file is not corrupted.

For any further questions or issues, please refer to the script comments or reach out to the project maintainers.
