How to reduce the number of Mamba operations in each block? #606

xiaosa269 · 2024-10-25T03:25:15Z

Hello, thank you very much for your work. In my network, I need to use multiple Mamba1 blocks, which makes the network quite large and slows down training. Could you advise on how to reduce the number of Mamba operations in each block?

I call Mamba1 by directly using the provided integrated block:
from mamba_ssm import Mamba

model = Mamba(
    # This module uses roughly 3 * expand * d_model^2 parameters
    d_model=dim,  # Model dimension d_model
    d_state=16,   # SSM state expansion factor
    d_conv=4,     # Local convolution width
    expand=2,     # Block expansion factor
)
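
To make the size question concrete, here is a minimal sketch that evaluates only the rough estimate quoted in the comment above (3 * expand * d_model^2) for a few settings. The helper name `approx_mamba_params` and the example values of d_model and expand are hypothetical, chosen for illustration. The estimate does not include d_state or d_conv, so it is not an exact parameter count.

```python
def approx_mamba_params(d_model: int, expand: int = 2) -> int:
    """Rough per-block parameter estimate: 3 * expand * d_model^2.

    Hypothetical helper: it only evaluates the approximation from the
    code comment and does not inspect a real Mamba module.
    """
    return 3 * expand * d_model ** 2


# Illustrative settings (assumed values, not taken from my network).
for d_model, expand in [(256, 2), (256, 1), (128, 2)]:
    estimate = approx_mamba_params(d_model, expand)
    print(f"d_model={d_model}, expand={expand}: ~{estimate} parameters")
```

By this estimate, halving expand halves the per-block size, and halving d_model cuts it to a quarter. I do not know how either change would affect accuracy or actual training speed.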
Thanks!
