#machine-learning #inference #ONNX #GPU #model #runtime #shape

App wonnx-cli

Command line interface for WONNX. WONNX is an ONNX runtime based on wgpu, aimed at being a universal GPU runtime, written in Rust.

5 versions (2 breaking changes)

0.5.1 Sep 30, 2023
0.5.0 Apr 20, 2023
0.4.0 Mar 5, 2023
0.3.0 Jul 31, 2022
0.2.5 Jul 31, 2022

#132 in Graphics APIs

27 downloads per month

MIT/Apache

765KB
14K SLoC

Wonnx command line interface (nnx)

Command line interface for doing inference using wonnx

ONNX defines a standardized format for exchanging machine learning models. However, until now there was no easy way to perform one-off inference with such a model without resorting to Python. Installing Python and the required libraries (e.g. TensorFlow), as well as the underlying GPU setup, can be cumbersome. In addition, specific code is always needed to translate data between the format of your inputs (images, text, etc.) and the format the model requires (i.e. an image classification model wants its image as a fixed-size tensor with pixel values normalized to specific values, etc.).

This project provides a very simple binary command line tool that can be used to perform inference with ONNX models. Thanks to wonnx, inference is performed on the GPU.

NNX tries to guess how a model's inputs and outputs should be transformed. These guesses are defaults, i.e. it should always be possible to override them. The goal is to reduce the amount of configuration needed to run a model. Currently the following heuristics are applied (a preprocessing sketch follows the list):

  • By default, the first input and the first output specified in the ONNX file are used.
  • Models accepting an input of shape (1,3,w,h) or (3,w,h) are fed an image resized to w*h with pixel values normalized to 0...1 (currently SqueezeNet normalization is also applied).
  • Similarly, models accepting an input of shape (1,1,w,h) or (1,w,h) are fed a grayscale image with pixel values normalized to 0...1.
  • When a labels file is provided, an output vector of shape (n,) is interpreted as providing the probability for each class. The label for class n is taken from the n-th line of the labels file.
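
To make the image heuristic concrete, here is a rough Python sketch of the preprocessing described above. The mean/std values are an assumption (the commonly used ImageNet statistics); the exact constants nnx applies may differ.

import numpy as np
from PIL import Image

def prepare_image(path, w, h):
    # Resize to the model's expected size and scale pixel values to 0...1
    img = Image.open(path).convert("RGB").resize((w, h))
    x = np.asarray(img).astype("float32") / 255.0
    # "SqueezeNet" normalization, assumed here to be the ImageNet mean/std
    x = (x - np.array([0.485, 0.456, 0.406])) / np.array([0.229, 0.224, 0.225])
    # HWC -> CHW plus a batch dimension, giving shape (1, 3, h, w)
    return x.transpose(2, 0, 1)[np.newaxis, ...].astype("float32")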

Usage

Inference

$ nnx infer ./data/models/opt-squeeze.onnx -i data=./data/images/pelican.jpeg --labels ./data/models/squeeze-labels.txt --probabilities
n01608432 kite: 21.820244
n02051845 pelican: 21.112095
n02018795 bustard: 20.359694
n01622779 great grey owl, great gray owl, Strix nebulosa: 20.176003
n04417672 thatch, thatched roof: 19.638676
n02028035 redshank, Tringa totanus: 19.606218
n02011460 bittern: 18.90648
n02033041 dowitcher: 18.708323
n01829413 hornbill: 18.595457
n01616318 vulture: 17.508785
$ nnx infer ./data/models/opt-mnist.onnx -i Input3=./data/images/7.jpg
[-1.2942507, 0.5192305, 8.655695, 9.474595, -13.768464, -5.8907413, -23.467274, 28.252314, -6.7598896, 3.9513395]
$ nnx infer ./data/models/opt-mnist.onnx -i Input3=./data/images/7.jpg --labels ./data/models/mnist-labels.txt --top=1
Seven
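
The label lookup in the last example boils down to picking the line of the labels file that corresponds to the highest-scoring class; a minimal sketch using the scores printed above:

# Scores as printed by the second MNIST example
logits = [-1.2942507, 0.5192305, 8.655695, 9.474595, -13.768464,
          -5.8907413, -23.467274, 28.252314, -6.7598896, 3.9513395]
labels = open("./data/models/mnist-labels.txt").read().splitlines()
# --top=1 prints the label of the best class: index 7, i.e. "Seven"
print(labels[logits.index(max(logits))])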

Getting basic model info

$ nnx info ./data/models/opt-mnist.onnx
+------------------+----------------------------------------------------+
| Model version    | 1                                                  |
+------------------+----------------------------------------------------+
| IR version       | 3                                                  |
+------------------+----------------------------------------------------+
| Producer name    | CNTK                                               |
+------------------+----------------------------------------------------+
| Producer version | 2.5.1                                              |
+------------------+----------------------------------------------------+
| Opsets           | 8                                                  |
+------------------+----------------------------------------------------+
| Inputs           | +--------+-------------+-----------------+------+  |
|                  | | Name   | Description | Shape           | Type |  |
|                  | +--------+-------------+-----------------+------+  |
|                  | | Input3 |             | 1 x 1 x 28 x 28 | f32  |  |
|                  | +--------+-------------+-----------------+------+  |
+------------------+----------------------------------------------------+
| Outputs          | +------------------+-------------+--------+------+ |
|                  | | Name             | Description | Shape  | Type | |
|                  | +------------------+-------------+--------+------+ |
|                  | | Plus214_Output_0 |             | 1 x 10 | f32  | |
|                  | +------------------+-------------+--------+------+ |
+------------------+----------------------------------------------------+
| Ops used         | +---------+---------------------+                  |
|                  | | Op      | Attributes          |                  |
|                  | +---------+---------------------+                  |
|                  | | Conv    | auto_pad=SAME_UPPER |                  |
|                  | |         | dilations=<INTS>    |                  |
|                  | |         | group=1             |                  |
|                  | |         | kernel_shape=<INTS> |                  |
|                  | |         | strides=<INTS>      |                  |
|                  | +---------+---------------------+                  |
|                  | | Gemm    | alpha=1             |                  |
|                  | |         | beta=1              |                  |
|                  | |         | transA=0            |                  |
|                  | |         | transB=0            |                  |
|                  | +---------+---------------------+                  |
|                  | | MaxPool | auto_pad=NOTSET     |                  |
|                  | |         | kernel_shape=<INTS> |                  |
|                  | |         | pads=<INTS>         |                  |
|                  | |         | strides=<INTS>      |                  |
|                  | +---------+---------------------+                  |
|                  | | Relu    |                     |                  |
|                  | +---------+---------------------+                  |
|                  | | Reshape |                     |                  |
|                  | +---------+---------------------+                  |
+------------------+----------------------------------------------------+
| Memory usage     | +--------------+----------+                        |
|                  | | Inputs       |  26.5 KB |                        |
|                  | +--------------+----------+                        |
|                  | | Outputs      |     40 B |                        |
|                  | +--------------+----------+                        |
|                  | | Intermediate |  81.6 KB |                        |
|                  | +--------------+----------+                        |
|                  | | Weights      |  23.4 KB |                        |
|                  | +--------------+----------+                        |
|                  | | Total        | 131.6 KB |                        |
|                  | +--------------+----------+                        |
+------------------+----------------------------------------------------+
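
If you want to post-process this kind of information, most of it can also be read from Python with the onnx package (a minimal sketch, not a full equivalent of nnx info):

import onnx

m = onnx.load("./data/models/opt-mnist.onnx")
print("IR version:", m.ir_version, "producer:", m.producer_name, m.producer_version)
for i in m.graph.input:
    dims = [d.dim_value for d in i.type.tensor_type.shape.dim]
    print("input:", i.name, dims)
print("ops used:", sorted({n.op_type for n in m.graph.node}))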

Printing a trace from a specific output back to the inputs

$ nnx trace ./data/models/opt-mnist.onnx Plus214_Output_0
+ Plus214_Output_0: output #0 of node #7 '' (Gemm)
| + Pooling160_Output_0_reshape0: output #0 of node #6 'Times212_reshape0' (Reshape)
| | + Pooling160_Output_0: output #0 of node #5 'Pooling160' (MaxPool)
| | | + ReLU114_Output_0: output #0 of node #4 'ReLU114' (Relu)
| | | | + Convolution110_Output_0: output #0 of node #3 'Convolution110' (Conv)
| | | | | + Pooling66_Output_0: output #0 of node #2 'Pooling66' (MaxPool)
| | | | | | + ReLU32_Output_0: output #0 of node #1 'ReLU32' (Relu)
| | | | | | | + Convolution28_Output_0: output #0 of node #0 'Convolution28' (Conv)
| | | | | | | | + Input3: input
| | | | | | | | + Parameter5: initializer
| | | | | | | | + 23: initializer
| | | | | + Parameter87: initializer
| | | | | + 24: initializer
| | + Pooling160_Output_0_reshape0_shape: initializer
| + Parameter193_reshape1: initializer
| + Parameter194: initializer
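
The idea behind the trace can be reproduced in Python by walking the graph backwards from an output (a simplified sketch; nnx's own output format differs slightly):

import onnx

m = onnx.load("./data/models/opt-mnist.onnx")
producers = {out: node for node in m.graph.node for out in node.output}
initializers = {i.name for i in m.graph.initializer}

def trace(value, depth=0):
    prefix = "| " * depth + "+ "
    if value in producers:
        node = producers[value]
        print(prefix + f"{value}: output of '{node.name}' ({node.op_type})")
        for inp in node.input:
            trace(inp, depth + 1)
    else:
        kind = "initializer" if value in initializers else "input"
        print(prefix + f"{value}: {kind}")

trace("Plus214_Output_0")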

Shape inference

$ RUST_LOG=info nnx prepare ./data/models/opt-mnist-clear.onnx ./data/models/opt-mnist-inferred.onnx --discard-shapes
[2023-02-11T17:48:50Z INFO  nnx] writing model to './data/models/opt-mnist-inferred.onnx'
[2023-02-11T17:48:50Z INFO  nnx] model written to file
$ RUST_LOG=info nnx prepare ./data/models/opt-mnist-clear.onnx ./data/models/opt-mnist-inferred.onnx --infer-shapes
[2023-02-11T17:48:56Z INFO  wonnx_preprocessing::shape_inference] node Convolution28 inferred shape: 1x8x28x28:f32
[2023-02-11T17:48:56Z INFO  wonnx_preprocessing::shape_inference] node ReLU32 inferred shape: 1x8x28x28:f32
[2023-02-11T17:48:56Z INFO  wonnx_preprocessing::shape_inference] node Pooling66 inferred shape: 1x8x14x14:f32
[2023-02-11T17:48:56Z INFO  wonnx_preprocessing::shape_inference] node Convolution110 inferred shape: 1x16x14x14:f32
[2023-02-11T17:48:56Z INFO  wonnx_preprocessing::shape_inference] node ReLU114 inferred shape: 1x16x14x14:f32
[2023-02-11T17:48:56Z INFO  wonnx_preprocessing::shape_inference] node Pooling160 inferred shape: 1x16x4x4:f32
[2023-02-11T17:48:56Z INFO  wonnx_preprocessing::shape_inference] node Times212_reshape0 inferred shape: 1x256:f32
[2023-02-11T17:48:56Z INFO  nnx] writing model to './data/models/opt-mnist-inferred.onnx'
[2023-02-11T17:48:56Z INFO  nnx] model written to file
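
The shape annotations themselves can also be computed from Python with the onnx package (a sketch; nnx prepare uses wonnx's own shape inference, which may cover a different set of operators):

import onnx
from onnx import shape_inference

m = onnx.load("./data/models/opt-mnist-clear.onnx")
onnx.save(shape_inference.infer_shapes(m), "./data/models/opt-mnist-inferred.onnx")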
  • Replace nnx with cargo run --release -- to run a development build.
  • Prepend RUST_LOG=wonnx-cli=info to the command line to see useful logging from the CLI, or RUST_LOG=wonnx=info to see WONNX's own logging.

CPU inference using tract

The nnx utility can use tract as a CPU-based backend for ONNX inference. To use it, nnx needs to be compiled with the cpu feature enabled. You can then specify one of the following flags (example invocations follow the list):

  • --backend cpu to select the CPU backend
  • --fallback to select the CPU backend when the GPU backend cannot be used (e.g. because of an unsupported op type)
  • --compare to run inference on both the CPU and GPU backends and compare the outputs
  • --benchmark to run the specified inference a hundred times and report performance
  • --compare --benchmark to run inference a hundred times each on CPU and GPU and compare their performance
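
For example (hypothetical invocations; the model and image paths are taken from the earlier examples):

$ cargo run --release --features=cpu -- infer ./data/models/opt-squeeze.onnx -i data=./data/images/pelican.jpeg --backend cpu
$ cargo run --release --features=cpu -- infer ./data/models/opt-squeeze.onnx -i data=./data/images/pelican.jpeg --fallback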

An example benchmark (the results below were obtained on an Apple M1 Max system):

# Run from workspace root
$ cargo run --release --features=cpu -- infer ./data/models/opt-squeeze.onnx -i data=./data/images/pelican.jpeg --compare --benchmark
OK (gpu=572ms, cpu=1384ms, 2.42x)

End-to-end example with Keras

  1. Install tensorflow, onnx and tf2onnx (e.g. pip install tensorflow onnx tf2onnx).

  2. Create a very simple model for MNIST digits:

from tensorflow.keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# train_images will be (60000,28,28) i.e. 60k black-and-white images of 28x28 pixels (which are ints between 0..255)
# train_labels will be (60000,) i.e. 60k integers ranging 0...9
# test_images/test_labels are similar but only have 10k items

# Build model
from tensorflow import keras
from tensorflow.keras import layers

# Convert images to have pixel values as floats between 0...1
train_images_input = train_images.astype("float32") / 255

model = keras.Sequential([
    layers.Reshape((28*28,), input_shape=(28,28)),
    layers.Dense(512, activation = 'relu'),
    layers.Dropout(rate=0.01),
    layers.Dense(10,  activation = 'softmax')
])

model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Train the model
model.fit(train_images_input, train_labels, epochs=20, batch_size=1024)
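
Optionally, sanity-check the trained model on the held-out test set before exporting (a sketch; the same 0...1 scaling is applied to the test images):

# Evaluate on the 10k test images, scaled like the training input
test_images_input = test_images.astype("float32") / 255
model.evaluate(test_images_input, test_labels)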
  3. Save the Keras model to ONNX, inferring dimensions:
import tf2onnx
import tensorflow as tf
import onnx
input_signature = [tf.TensorSpec([1,28,28], tf.float32, name='input')]
onnx_model, _ = tf2onnx.convert.from_keras(model, input_signature, opset=13)

from onnx import shape_inference
inferred_model = shape_inference.infer_shapes(onnx_model)

onnx.save(onnx_model, "tymnist.onnx")
onnx.save(inferred_model, "tymnist-inferred.onnx")
  4. Perform inference using NNX:
$ nnx infer ./tymnist-inferred.onnx -i input=./data/mnist-7.png --labels ./data/models/mnist-labels.txt
  5. Compare the inference result with what Keras itself produces (pip install numpy pillow matplotlib):
import PIL.Image
import numpy
import matplotlib.pyplot as plt  # only needed if you want to plot the image

# Resize to 28x28 grayscale (LANCZOS replaces the removed ANTIALIAS constant)
m5 = PIL.Image.open("data/mnist-7.png").convert("L").resize((28,28), PIL.Image.LANCZOS)
# Scale pixels to 0...1 like the training input and add a batch dimension
nm5 = numpy.array(m5).astype("float32").reshape((1,28,28)) / 255
# The highest-probability class should match the label printed by nnx
print(numpy.argmax(model.predict(nm5)))

Dependencies

~29–65MB
~1M SLoC