Updates:
- new: simple and detailed reporting of the optimization process
- new: adjusted TensorFlow SavedModel export for Keras 3.x
- new: inform the user when a wrapped module is not called during optimize
- new: inform the user when a module uses a custom forward function
- new: support for dynamic shapes in Torch ExportedProgram
- new: use ExportedProgram for Torch-TensorRT conversion
- new: support a back-off policy during profiling to avoid reporting a local minimum
- new: automatically scale the conversion batch size when modules have different batch sizes within the scope of a single pipeline
- change: the TensorRT conversion max batch size search relies on saturating throughput for base formats
- change: adjusted profiling configuration for throughput cutoff search
- change: include the optimized pipeline in the list of variants examined during nav.profile
- change: performance testing is not executed when correctness failed for a format and runtime
- change: the verify command is not executed when a verify function is not provided
- change: do not create a model copy before executing torch.compile
- fix: pipelines sometimes placed the model and tensors on different devices during nav.profile
- fix: extract the graph from ExportedProgram when running inference
- fix: runner configuration not propagated to pre-processing steps
Version of external components used during testing:
- PyTorch 2.4.0a0+3bcc3cddb5
- TensorFlow 2.16.1
- TensorRT 10.3.0.26
- Torch-TensorRT 2.4.0.a0
- ONNX Runtime 1.18.1
- Polygraphy: 0.49.12
- GraphSurgeon: 0.5.2
- tf2onnx v1.16.1
- Other component versions depend on the versions of the framework containers used.
See their support matrix
for a detailed summary.