inline relevant fields of HookedTransformer rather than nesting it in config #53

Open
JasonGross opened this issue Feb 19, 2024 · 0 comments
Labels
engineering priority: medium-low Non-blocking but somewhat time-sensitive, e.g., adds overhead or friction

Comments

@JasonGross
Owner

I think we made the wrong decision (/ I gave bad design advice) when making HookedTransformerConfig a field of each of the experiments. I think on reflection the downsides outweigh the upsides.

Upsides:

  1. uniform way of adding CLI arguments (the utility functions can be replaced by a lookup table of the argparse arguments corresponding to each HookedTransformerConfig field)
  2. enforced uniform naming scheme
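
For reference, the lookup table mentioned in upside 1 could look roughly like the sketch below. The table name `ARGPARSE_ARGS`, the helper `add_model_arguments`, the defaults, and the help strings are all hypothetical; only the field names (`d_model`, `n_heads`, `n_layers`) are taken from HookedTransformerConfig.

```python
import argparse

# Hypothetical lookup table: HookedTransformerConfig field name -> argparse keyword arguments.
# The defaults and help strings are illustrative assumptions, not the project's values.
ARGPARSE_ARGS = {
    "d_model": dict(type=int, default=32, help="residual stream width"),
    "n_heads": dict(type=int, default=1, help="number of attention heads"),
    "n_layers": dict(type=int, default=1, help="number of transformer blocks"),
}


def add_model_arguments(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    """Register one CLI flag per table entry, e.g. d_model -> --d-model."""
    for field, kwargs in ARGPARSE_ARGS.items():
        parser.add_argument(f"--{field.replace('_', '-')}", dest=field, **kwargs)
    return parser


parser = add_model_arguments(argparse.ArgumentParser())
args = parser.parse_args(["--d-model", "64"])
```

Each experiment would then pick the fields it cares about out of the table instead of inheriting every field of the nested config.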

Downsides:

  1. upgrading HookedTransformer invalidates all of our config hashes / wandb models (see also #52, "More compatibility in config", and the comment on #40, "Update dependency transformer-lens to v1.14.0")
  2. we have to introduce kludges when we want more control over renaming and defaulting arguments, e.g., when we want the config to be parameterized by sequence length rather than context window size, or by a prime p rather than d_vocab_out, etc.
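
A minimal sketch of what inlining could look like, assuming a modular-arithmetic-style experiment. The class name `ModularArithmeticConfig`, its defaults, the `transformer_kwargs` and `config_hash` helpers, and the mapping from `seq_len` and `p` onto `n_ctx`, `d_vocab`, and `d_vocab_out` are all hypothetical. The point is only that the hash depends on fields we own, so a transformer-lens upgrade that adds or renames config fields would not change it.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ModularArithmeticConfig:
    """Hypothetical experiment config with the relevant model fields inlined."""

    p: int = 113       # prime modulus, exposed instead of d_vocab_out
    seq_len: int = 2   # sequence length, exposed instead of n_ctx
    d_model: int = 32
    n_heads: int = 1
    n_layers: int = 1
    seed: int = 0

    def transformer_kwargs(self) -> dict:
        """Keyword arguments named after HookedTransformerConfig fields.

        The mapping below is an assumption for illustration; a real
        experiment may need extra tokens or a longer context.
        """
        return dict(
            n_layers=self.n_layers,
            d_model=self.d_model,
            n_heads=self.n_heads,
            d_head=self.d_model // self.n_heads,
            n_ctx=self.seq_len,
            d_vocab=self.p,
            d_vocab_out=self.p,
            seed=self.seed,
        )

    def config_hash(self) -> str:
        """Hash only the fields declared on this class."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


cfg = ModularArithmeticConfig()
kwargs = cfg.transformer_kwargs()
```

The model itself would be built at the point of use, e.g. `HookedTransformerConfig(**cfg.transformer_kwargs())`, so the renaming and defaulting logic lives in one method rather than in kludges around a nested config.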

JasonGross added the engineering and priority: medium-low labels Feb 19, 2024