
Questions related to TRT conversion and TRT-LLM support #26

Open
shixianc opened this issue Aug 28, 2023 · 3 comments
Labels
enhancement New feature or request non-stale

Comments

@shixianc

I have two separate questions that I could not find answers to, so I'm posting them here in the hope that someone can answer:

  1. When doing TRT conversion from TorchScript to TRT, does nav call `polygraphy surgeon sanitize` to perform steps like constant folding? This is helpful when dealing with larger models. It seems nav uses Polygraphy under the hood, but I want to check whether it also sanitizes.

  2. There's an alpha release of TRT-LLM, a tool that combines TensorRT and FasterTransformer. Is supporting this tool on your roadmap? As a nav user, I like the simpler interface it provides compared to doing compilation/conversion in multiple steps. It would be great to see future support for LLMs.
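For reference, the sanitization step asked about in question 1 is normally run as a Polygraphy CLI command (`polygraphy surgeon sanitize ... --fold-constants`). A minimal sketch of building that command from Python, with placeholder file names:

```python
# Sketch: construct the `polygraphy surgeon sanitize` CLI invocation that
# performs constant folding on an ONNX model. The file paths are placeholders;
# this only builds the command list (it does not require polygraphy installed).
import shlex


def sanitize_cmd(onnx_in: str, onnx_out: str) -> list[str]:
    """Return the argv list for a constant-folding sanitize pass."""
    return shlex.split(
        f"polygraphy surgeon sanitize {onnx_in} --fold-constants -o {onnx_out}"
    )


# The resulting list can be passed to subprocess.run() on a machine
# where polygraphy is installed.
cmd = sanitize_cmd("model.onnx", "model_folded.onnx")
```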

@ptarasiewiczNV
Collaborator

Hi @shixianc ,

Thank you for the questions.

  1. Currently, we do not use polygraphy surgeon, but it is on our roadmap, and we're aiming to support it as early as next month.

  2. Our plans for supporting TRT-LLM are still under discussion, but we're definitely interested in integrating it where Navigator can provide assistance. We're also open to suggestions – if you have any specific use cases where you see Model Navigator being helpful, please feel free to share.

Best regards,
Piotr

@shixianc
Author

shixianc commented Sep 20, 2023

@ptarasiewiczNV

Thank you for the reply. Regarding 1: it would be nice to have that, as some of our models are small enough to load on a 16 GB GPU but go OOM during compilation, and it seems the sanitize scripts would help reduce the ONNX model size.

We also tried out TRT-LLM. It looks promising when we compare its benchmarks against FasterTransformer's, since it provides the latest MHA attention optimization techniques. However, building its engines requires many steps and parameters. I think model-navigator could package all those scripts and provide a higher-level, user-friendly API on top of TRT-LLM.

These are a few suggestions from an external user; you likely have more up-to-date information than I do on where these projects are heading.
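To make the suggestion concrete, here is a purely illustrative sketch of the kind of one-call wrapper the comment asks for. `EngineConfig` and `build_engine` are hypothetical names invented for this example, not a real model-navigator or TRT-LLM API; the step descriptions only stand in for TRT-LLM's actual checkpoint-conversion and engine-build commands:

```python
# Hypothetical sketch of a high-level wrapper that packages TRT-LLM's
# multi-step engine build behind a single call. All names here are
# invented for illustration; no real API is implied.
from dataclasses import dataclass


@dataclass
class EngineConfig:
    model_dir: str
    dtype: str = "float16"
    max_batch_size: int = 8
    use_attention_plugin: bool = True  # toggle fused-MHA optimization


def build_engine(cfg: EngineConfig) -> list[str]:
    """Return the ordered build steps the wrapper would run for the user."""
    steps = [
        f"convert checkpoint in {cfg.model_dir} to TRT-LLM weight format",
        f"build engine (dtype={cfg.dtype}, max_batch_size={cfg.max_batch_size})",
    ]
    if cfg.use_attention_plugin:
        steps.append("enable fused multi-head attention plugin")
    return steps
```

A wrapper like this would let a user express the whole pipeline as one config object instead of memorizing each build script's flags.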


This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Nov 20, 2023
@jkosek jkosek added enhancement New feature or request non-stale and removed Stale labels Nov 21, 2023
3 participants