You are viewing documentation for Kubeflow 1.2

This is a static snapshot from the time of the Kubeflow 1.2 release.
For up-to-date information, see the latest version.

NVIDIA Triton Inference Server

Model serving with Triton Inference Server

Kubeflow does not currently have a dedicated guide for the NVIDIA Triton Inference Server (previously known as the TensorRT Inference Server). See the NVIDIA documentation for instructions on running Triton Inference Server on Kubernetes.
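As a starting point, a Triton deployment on Kubernetes typically runs the `nvcr.io/nvidia/tritonserver` container and points it at a model repository. The manifest below is a minimal sketch under assumed names, not an official example: the image tag is only illustrative (pick a release from the NGC catalog), and the `models-pvc` volume claim holding the model repository is hypothetical.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: triton-inference-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: triton
  template:
    metadata:
      labels:
        app: triton
    spec:
      containers:
      - name: triton
        # Example tag from the Kubeflow 1.2 era; replace with your chosen release.
        image: nvcr.io/nvidia/tritonserver:20.11-py3
        command: ["tritonserver", "--model-repository=/models"]
        ports:
        - containerPort: 8000  # HTTP/REST
        - containerPort: 8001  # gRPC
        - containerPort: 8002  # Prometheus metrics
        resources:
          limits:
            nvidia.com/gpu: 1
        volumeMounts:
        - name: model-store
          mountPath: /models
      volumes:
      - name: model-store
        persistentVolumeClaim:
          # Hypothetical PVC containing the Triton model repository.
          claimName: models-pvc
```

A Service exposing ports 8000–8002 would normally accompany this Deployment; consult the NVIDIA documentation for the supported model repository layout and GPU scheduling requirements.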