Hello, thank you very much for your work. In my network, I need to use multiple Mamba1 blocks, which makes the network quite large and slows down training. Could you advise on how to reduce the number of Mamba operations in each block?
I instantiate Mamba1 directly through the provided integrated block:
```python
from mamba_ssm import Mamba

model = Mamba(
    # This module uses roughly 3 * expand * d_model^2 parameters
    d_model=dim,  # Model dimension d_model
    d_state=16,   # SSM state expansion factor
    d_conv=4,     # Local convolution width
    expand=2,     # Block expansion factor
)
```
Thanks!
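For context, the constructor comment above gives a rough parameter estimate of `3 * expand * d_model^2` per block, so a back-of-the-envelope sketch (just arithmetic, no mamba_ssm dependency; the helper name is mine) suggests that lowering `expand` shrinks each block roughly proportionally:

```python
def approx_mamba_params(d_model: int, expand: int = 2) -> int:
    """Rough per-block parameter estimate from the constructor
    comment: ~3 * expand * d_model^2 parameters."""
    return 3 * expand * d_model ** 2

# Halving expand roughly halves each block's size.
print(approx_mamba_params(256, expand=2))  # 393216
print(approx_mamba_params(256, expand=1))  # 196608
```

This is only an estimate from the comment in the snippet, not an exact count, but it illustrates which knob dominates the block's size.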