# Use ONNX Runtime for Inference
## Docker Images
- ONNX-Ecosystem: includes ONNX Runtime (CPU, Python), dependencies, tools to convert from various frameworks, and Jupyter notebooks to help get started
- Additional dockerfiles
## API Documentation
API | Supported Versions | Samples |
---|---|---|
Python | 3.5, 3.6, 3.7, 3.8 (3.8 excludes Win GPU and Linux ARM); see Python Dev Notes | Samples |
C# | | Samples |
C++ | | Samples |
C | | Samples |
WinRT | Windows.AI.MachineLearning | Samples |
Java | 8+ | Samples |
Ruby (external project) | 2.4-2.7 | Samples |
Javascript (node.js) | 12.x | Samples |
## Supported Accelerators
CPU | GPU | IoT/Edge/Mobile | Other |
---|---|---|---|
Default CPU - MLAS (Microsoft Linear Algebra Subprograms) + Eigen | NVIDIA CUDA | Intel OpenVINO | |
Intel DNNL | NVIDIA TensorRT | ARM Compute Library (preview) | Rockchip NPU (preview) |
Intel nGraph | DirectML | Android Neural Networks API (preview) | Xilinx Vitis-AI (preview) |
Intel MKL-ML (build option) | AMD MIGraphX (preview) | ARM-NN (preview) | |
## Deploying ONNX Runtime
### Cloud
- ONNX Runtime can be deployed to any cloud for model inference, including Azure Machine Learning Services.
- ONNX Runtime Server (beta) is a hosting application for serving ONNX models using ONNX Runtime, providing a REST API for prediction.
### IoT and edge devices
The growing selection of IoT devices with sensors and steady signal streams creates new opportunities to move AI workloads to the edge. This is especially valuable when large volumes of incoming data or signals would be inefficient or impractical to push to the cloud due to storage or latency constraints. Consider surveillance footage where 99% of the video is uneventful, or real-time person-detection scenarios where immediate action is required. In these cases, running model inference directly on the target device is essential.
### Client applications
- Install or build the package you need to use in your application. (sample implementations using the C++ API)
- On newer Windows 10 devices (1809+), ONNX Runtime is available by default as part of the OS and is accessible via the Windows Machine Learning APIs. (Tutorials for Windows Desktop or UWP app)
## Build from Source
For production scenarios, it’s strongly recommended to build only from an official release branch.