TensorRT-LLM Triton Backend Support #33

Open
shixianc opened this issue Nov 15, 2023 · 6 comments

Assignees: jkosek
Labels: enhancement (New feature or request), non-stale

shixianc commented Nov 15, 2023

When will NAV support creating a Triton model repository for this new backend? Is it on your roadmap?
https://github.com/triton-inference-server/tensorrtllm_backend

@jkosek jkosek self-assigned this Nov 21, 2023
@jkosek jkosek added enhancement New feature or request non-stale labels Nov 21, 2023

jkosek commented Nov 21, 2023

@shixianc thanks for the feature request. We are going to review the backend options and add support in the next release.

If you have any specific requirements in mind, let us know. Thanks!

ishandhanani commented

Hi team! Was this ever added? I'm looking through the release notes but cannot find support for TRT-LLM

jkosek commented Apr 4, 2024

Hi @ishandhanani. Apologies, not yet. Let us prioritize this feature and provide an ETA.

jkosek commented Apr 4, 2024

@ishandhanani a few questions to clarify the expected behavior: do you see this feature as generating the model store for the tensorrtllm backend only (example), or would you expect the whole deployment with pre/post-processing and BLS to be created (similar to this example)?

ishandhanani commented

I think a good first step would be to have it generate the model repo for the trtllm backend only. In the future, it would be great if we could also generate the entire pre/post-processing model repo. @jkosek
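
For illustration, a minimal sketch of what a model store containing only the tensorrtllm backend model could look like, generated here with plain Python. The field names (backend, gpt_model_path) follow the tensorrtllm_backend example configs; the model name, paths, and max_batch_size are placeholder assumptions, and a complete config would also declare input/output tensors and batching options.

```python
# Sketch only: lay out a Triton model repository with a single tensorrtllm
# backend model. Field names follow the tensorrtllm_backend example configs;
# the model name, paths, and max_batch_size below are placeholder assumptions.
from pathlib import Path


def write_trtllm_model_store(repo_dir: str, engine_dir: str, model_name: str = "tensorrt_llm") -> None:
    model_dir = Path(repo_dir) / model_name
    (model_dir / "1").mkdir(parents=True, exist_ok=True)  # model version directory

    config = f'''name: "{model_name}"
backend: "tensorrtllm"
max_batch_size: 8
parameters: {{
  key: "gpt_model_path"
  value: {{
    string_value: "{engine_dir}"
  }}
}}
'''
    # A complete config also declares the input/output tensors and batching settings.
    (model_dir / "config.pbtxt").write_text(config)


write_trtllm_model_store("model_repository", "/workspace/engines/llama")
```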

jkosek commented Aug 6, 2024

@ishandhanani you may want to review the newly added TensorRTLLMModelConfig class that specifies the TensorRT-LLM backend configuration: https://triton-inference-server.github.io/model_navigator/0.11.0/inference_deployment/triton/api/specialized_configs/#model_navigator.triton.TensorRTLLMModelConfig
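
For anyone landing here, a rough sketch of how that config class might be wired into a model store build. The add_model helper and the argument names below are assumptions based on the linked API reference; check the 0.11.0 docs for the exact signatures and the required TensorRTLLMModelConfig fields.

```python
# Rough sketch only: the add_model helper and argument names here are
# assumptions based on the linked API reference; consult the 0.11.0 docs for
# the exact signatures and the required TensorRTLLMModelConfig fields.
import pathlib

from model_navigator.triton import TensorRTLLMModelConfig, model_repository

model_repository.add_model(
    model_repository_path=pathlib.Path("model_repository"),
    model_name="tensorrt_llm",
    model_path=pathlib.Path("/workspace/engines/llama"),  # compiled TRT-LLM engines (placeholder path)
    config=TensorRTLLMModelConfig(),  # backend-specific options go here
)
```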
