TensorRT-LLM Triton Backend Support #33
Original issue (from @shixianc):

When can NAV support creating Triton Repo for this new backend? Is it on your roadmap?
https://github.com/triton-inference-server/tensorrtllm_backend

Comments
@shixianc thanks for the feature request. We are going to review the backend options and add the support in the next release. If there are any specific requirements you see, let us know. Thanks!
Hi team! Was this ever added? I'm looking through the release notes but cannot find support for TRT-LLM.
Hi @ishandhanani. Apologies, not yet. Let us prioritize this feature and provide an ETA.
@ishandhanani a few questions to clarify the expected behavior. Do you see this feature as generating the model store for the tensorrtllm backend only (example), or would you expect the whole deployment with pre/post processing and BLS to be created (similar to this example)?
I think a good first step would be to have it generate the model repo for the trtllm backend only. In the future it would be great if we could generate the entire pre/post processing model repo, @jkosek.
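For context, here is a minimal sketch of the kind of single-model repository the tensorrtllm backend expects, written out by hand rather than generated by Model Navigator. The engine path is hypothetical, the config is abbreviated (real deployments also declare input/output tensors and batching options), and the parameter names follow the tensorrtllm_backend example repo, so they should be checked against the backend docs for the release in use.

```python
# Hypothetical sketch: lay out a single-model Triton repository for the
# tensorrtllm backend, pointing at a prebuilt TensorRT-LLM engine directory.
# Parameter names follow the tensorrtllm_backend example and are abbreviated;
# verify them against the backend documentation before deploying.
from pathlib import Path

ENGINE_DIR = "/workspace/engines/llama-7b"        # hypothetical engine location
REPO_DIR = Path("model_repository/tensorrt_llm")  # target model store entry

config_pbtxt = f"""
name: "tensorrt_llm"
backend: "tensorrtllm"
max_batch_size: 8
parameters: {{
  key: "gpt_model_type"
  value: {{ string_value: "inflight_fused_batching" }}
}}
parameters: {{
  key: "gpt_model_path"
  value: {{ string_value: "{ENGINE_DIR}" }}
}}
"""

(REPO_DIR / "1").mkdir(parents=True, exist_ok=True)   # version directory
(REPO_DIR / "config.pbtxt").write_text(config_pbtxt.strip() + "\n")
```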
@ishandhanani you may want to review the newly added TensorRTLLMModelConfig class that specifies the TensorRT-LLM backend configuration: https://triton-inference-server.github.io/model_navigator/0.11.0/inference_deployment/triton/api/specialized_configs/#model_navigator.triton.TensorRTLLMModelConfig
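As a rough sketch of how repository generation might look with that API: the TensorRTLLMModelConfig class comes from the linked docs, but the add_model helper, its keyword names, and the config's constructor arguments are assumptions here and should be verified against the 0.11.0 API reference.

```python
# Hedged sketch -- the add_model call and the keyword names below are
# assumptions based on the model_navigator.triton docs; check the linked
# API reference before relying on this.
from model_navigator.triton import TensorRTLLMModelConfig, model_repository

model_repository.add_model(
    model_repository_path="model_repository",   # target Triton model store
    model_name="tensorrt_llm",
    model_version=1,
    model_path="/workspace/engines/llama-7b",    # hypothetical prebuilt engine dir
    config=TensorRTLLMModelConfig(),             # TRT-LLM-specific options go here
)
```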