In the realm where technology and nature intertwine, Fern emerges as a beacon of innovation and mystery. This deep learning model transcends the ordinary, seamlessly blending text, vision, and audio into a harmonious symphony of understanding. Much like its mythological namesake, the fern—believed to possess hidden powers and secrets of the forest—Fern delves into the depths of multimodal data, uncovering patterns and insights that elude conventional models. Enigmatic and almost magical, Fern transforms the way we perceive and interact with information, revealing the unseen and unheard with an elegance that borders on the supernatural.
Fern is an ongoing effort to create a model similar to Chameleon, but one that covers all modalities. Currently, its architecture resembles Llama 2.
The following system components are implemented:
- Transformer architecture
- RMSNorm normalization
- RoPE positional embeddings
- SwiGLU activation function
- Embedding sharing
- Byte-Pair Encoding tokenizer
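To make two of the less common components concrete, here is a minimal pure-Python sketch of RMSNorm and SwiGLU. This is an illustration of the math only, not the repo's implementation (the function names and the plain-list representation are mine; a real model would use tensors):

```python
import math

def rms_norm(x, weight, eps=1e-5):
    # RMSNorm: divide by the root mean square of the vector, then scale
    # by a learned per-channel weight. Unlike LayerNorm, no mean is subtracted.
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for w, v in zip(weight, x)]

def silu(a):
    # SiLU (a.k.a. swish): a * sigmoid(a).
    return a / (1.0 + math.exp(-a))

def swiglu(gate, up):
    # SwiGLU gating: SiLU(gate) elementwise-multiplied with up, where
    # `gate` and `up` stand in for the two linear projections (W1 x and W3 x).
    return [silu(g) * u for g, u in zip(gate, up)]
```

The gated output is then passed through a final down-projection in the feed-forward block.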
The training data is all the books from The Elder Scrolls video game series, taken from The Imperial Library. The BPE tokenizer is trained on the whole dataset to create 2048 new tokens, including the special `<|endoftext|>` token. This results in a vocabulary of 2304 tokens in total (256 byte tokens, 2047 pair tokens, and 1 special token). A vocabulary of this size captures lore terms such as `Valenwood` and `Bosmer`, as well as literature categories such as `Fiction` and `Narrative`, in a single token; these tokens are learned near the end of tokenizer training. Further investigation into the optimal vocabulary size is needed.
The model dimension is 128; it has 32 layers with 8 heads each. The model is trained with a context size of 512 tokens and a batch size of 32 examples. It achieves 2.74 train loss and 3.52 validation loss after 10,000 training steps. Training took nearly 50 minutes on a single laptop Nvidia RTX 4060 GPU.
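The hyperparameters above can be collected in a single config object. The field names here are hypothetical (chosen for illustration, not taken from the repo), but the values match the reported run:

```python
from dataclasses import dataclass

@dataclass
class FernConfig:
    # Hypothetical field names; values match the run reported above.
    dim: int = 128          # model (embedding) dimension
    n_layers: int = 32      # transformer blocks
    n_heads: int = 8        # attention heads per block
    vocab_size: int = 2304  # 256 byte + 2047 pair + 1 special token
    context_size: int = 512 # tokens per training example
    batch_size: int = 32    # examples per step
```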
| Configuration | Train loss | Val loss |
|---|---|---|
| Transformer (d=128, L=32, h=8) + RMSNorm + RoPE + SwiGLU + ES; Tok2304, batch 32, context 512 | 2.74 | 3.52 |
- Install miniforge3.
- Clone this repository:

  ```shell
  git clone https://github.com/vdyma/fern
  ```

- Open the repository root directory as a working directory in your shell and create a new conda environment:

  ```shell
  conda env create -f environment.yaml
  ```

- Activate the conda environment:

  ```shell
  conda activate fern
  ```
You are now ready to use the model.
(Optional) In order to run Jupyter Notebooks, you need to additionally install Jupyter Lab and `ipywidgets`:

```shell
conda install jupyterlab ipywidgets
```
You can then run Jupyter Lab as follows:

```shell
jupyter lab
```