Replies: 1 comment
-
@lucidrains I didn't want to pollute the issues, but I'm not sure whether you're looking here.
-
Hello! I've been following along on the implementation to try and learn machine learning while doing the fast.ai course and have been fascinated with this.
My question is about the section of code that implements this part of the paper.
I believe this is the related code:
One major aspect I'm missing: why is there a 50% coin flip to choose between the segment masking and the mask controlled by `pdrop`?
The other part I wanted to make sure I understood: what is "batch"?
Does "batch" represent each individual file, so that at inference time you would only have one "batch"?
Or is it for training on multiple files at once, as in the part of the paper that talks about training in batches?
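For reference, the two masking paths the question describes can be sketched roughly as follows. This is an illustrative pure-Python sketch, not the repository's code: the function names, the defaults `p_drop=0.2` and `frac_range=(0.7, 1.0)`, and the choice to flip the coin once per batch rather than once per sequence are all assumptions made for the example.

```python
import random

def segment_mask(seq_len, frac, rng):
    """Mask one contiguous span covering roughly `frac` of the sequence."""
    span = max(1, round(seq_len * frac))
    start = rng.randint(0, seq_len - span)  # inclusive bounds
    return [start <= i < start + span for i in range(seq_len)]

def dropout_mask(seq_len, p_drop, rng):
    """Mask each position independently with probability `p_drop`."""
    return [rng.random() < p_drop for _ in range(seq_len)]

def make_cond_mask(batch, seq_len, p_drop=0.2, frac_range=(0.7, 1.0), rng=None):
    """Return a batch x seq_len nested list of booleans; True means masked.

    All default values here are hypothetical, chosen only for illustration.
    """
    rng = rng or random.Random()
    if rng.random() < 0.5:  # the 50% coin flip from the question
        # Path 1: one contiguous masked segment per sequence.
        return [segment_mask(seq_len, rng.uniform(*frac_range), rng)
                for _ in range(batch)]
    # Path 2: independent per-position masking controlled by p_drop.
    return [dropout_mask(seq_len, p_drop, rng) for _ in range(batch)]

# Four sequences of ten frames each, processed together as one batch.
mask = make_cond_mask(batch=4, seq_len=10, rng=random.Random(0))
```

In this sketch `batch` is simply the number of sequences processed together, so under that reading a single file at inference time would correspond to `batch=1`.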
Thanks for any help!