Implementing LSTM in Different Frameworks and Converting to ONNX

Contents

1 Generating an LSTM with Paddle
2 Generating an LSTM with PyTorch
3 Generating an LSTM with TensorFlow 2
4 LSTM conversion tools

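This article implements single-layer, two-layer, and bidirectional two-layer LSTMs in three frameworks (Paddle, PyTorch, and TensorFlow 2), converts each resulting model to ONNX, and shows the structure of the ONNX models. All code and execution results are available at https://gitee.com/tdddeeel/lstm_pytorch_tensorflow_paddle.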

1 Generating an LSTM with Paddle

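The whole process covers model definition, export, conversion to ONNX, and ONNX optimization. The last ONNX file produced at each step (the one ending in _sim_opt.onnx) is the model we ultimately want, and its graph can be inspected. This section therefore covers both building the model in Paddle and converting it to ONNX; for more on the overall workflow, see the referenced blog post.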
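The optimization stage in the scripts below first removes every graph input that is also an initializer, which is what the onnxruntime warning "Initializer ... appears in graph inputs" asks for. The following is a minimal sketch of that bookkeeping only, using plain Python objects rather than a real ONNX model. ValueInfo, Initializer, and drop_initializer_inputs are hypothetical stand-ins introduced for illustration, as are the names "x" and "lstm_weight".

```python
from collections import namedtuple

# Hypothetical stand-ins for the protobuf entries in model.graph.input and
# model.graph.initializer; only the `name` field matters for this step.
ValueInfo = namedtuple("ValueInfo", ["name"])
Initializer = namedtuple("Initializer", ["name"])


def drop_initializer_inputs(graph_inputs, initializers):
    """Return the graph inputs whose names are not also initializer names."""
    initializer_names = {init.name for init in initializers}
    return [vi for vi in graph_inputs if vi.name not in initializer_names]


# "x" is the real data input; the other two are constants that were also
# listed as graph inputs (illustrative names).
graph_inputs = [ValueInfo("x"), ValueInfo("Concat_4"), ValueInfo("Slice_12")]
initializers = [
    Initializer("Concat_4"),
    Initializer("Slice_12"),
    Initializer("lstm_weight"),
]

kept = drop_initializer_inputs(graph_inputs, initializers)
print([vi.name for vi in kept])
```

After this step only the true data input remains a graph input, so a runtime is free to treat the remaining initializers as constants.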

import os
import sys
import paddle
from paddle import nn
import numpy as np
from onnxsim import simplify
import onnxoptimizer
import onnx
import onnxruntime

/home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:36: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead.
  'nearest': Image.NEAREST,
/home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:37: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
  'bilinear': Image.BILINEAR,
/home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:38: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  'bicubic': Image.BICUBIC,
/home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:39: DeprecationWarning: BOX is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BOX instead.
  'box': Image.BOX,
/home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:40: DeprecationWarning: LANCZOS is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.LANCZOS instead.
  'lanczos': Image.LANCZOS,
/home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:41: DeprecationWarning: HAMMING is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.HAMMING instead.
  'hamming': Image.HAMMING
/home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/onnx/mapping.py:27: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe. 
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  int(TensorProto.STRING): np.dtype(np.object)

1.1 time_major=False

This has the same effect as batch_first=True in PyTorch: the input is laid out as [batch, time, features].
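To make the layout concrete, here is a small shape-only sketch of the squeeze and transpose that the forward methods in this section apply to the [1,3,1,6] input. squeeze_shape and transpose_shape are hypothetical helpers written for this illustration; they mimic the shape effect of paddle.squeeze and paddle.transpose without touching any tensors.

```python
def squeeze_shape(shape, axis):
    """Shape after removing a size-1 axis."""
    if shape[axis] != 1:
        raise ValueError(f"axis {axis} has size {shape[axis]}, expected 1")
    return shape[:axis] + shape[axis + 1:]


def transpose_shape(shape, perm):
    """Shape after permuting axes; output axis i is input axis perm[i]."""
    return [shape[p] for p in perm]


infer_shape = [1, 3, 1, 6]                           # batch, channels, height, width
squeezed = squeeze_shape(infer_shape, 2)             # drop the height axis
batch_major = transpose_shape(squeezed, (0, 2, 1))   # [batch, time, features]
time_major = transpose_shape(squeezed, (2, 0, 1))    # [time, batch, features]
print(batch_major, time_major)
```

The width axis (6) plays the role of the time steps and the channel axis (3) the role of the input features, which matches in_channels=3 in the model definitions.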

class One_LSTM_batch(nn.Layer):
    def __init__(self,in_channels=3,out_channels=4):
        super().__init__()
        self.rnn = nn.LSTM(in_channels,out_channels,time_major=False)
    def forward(self,x):
        # b,c,h,w =x.shape
        # x1 = paddle.reshape(x,[b,c,h*w])
        h0 = paddle.zeros((1,1,4))
        c0 = paddle.zeros((1,1,4))
        x2 = paddle.squeeze(x,2)
        x3 = paddle.transpose(x2,(0,2,1))
        out,_ = self.rnn(x3,(h0,c0))
        return out # shape 1,6,4

model_path = "paddle/One_LSTM_batch"
model = One_LSTM_batch()
model.eval()
infer_shape = [1,3,1,6]
input_spec=paddle.static.InputSpec(shape=infer_shape, dtype="float32")
paddle.onnx.export(model,model_path,input_spec=[input_spec],opset_version=11,enable_onnx_checker=True)

model = onnx.load(model_path+'.onnx')
model_sim ,check = simplify(model)
assert check,"simplified onnx model could not be validated"
save_path = model_path+"_sim.onnx"
onnx.save(model_sim,save_path)

model = onnx.load(save_path)
if model.ir_version<4:
    print("Model with ir_version below 4 requires to include initializer in graph input")
    exit()
inputs = model.graph.input
name_to_input = {}
for input in inputs:
    name_to_input[input.name]=input
for initializer in model.graph.initializer:
    if initializer.name in name_to_input:
        inputs.remove(name_to_input[initializer.name])
passes=["extract_constant_to_initializer","eliminate_unused_initializer"]
optimized_model = onnxoptimizer.optimize(model,passes)
save_path = model_path+"_sim_opt.onnx"
onnx.save(optimized_model,save_path)



class Two_LSTM_batch(nn.Layer):
    def __init__(self,in_channels=3,out_channels=4):
        super().__init__()
        self.rnn = nn.LSTM(in_channels,out_channels,time_major=False,num_layers=2)
    def forward(self,x):
        # b,c,h,w =x.shape
        # x1 = paddle.reshape(x,[b,c,h*w])
        h0 = paddle.zeros((2,1,4))
        c0 = paddle.zeros((2,1,4))
        x2 = paddle.squeeze(x,2)
        x3 = paddle.transpose(x2,(0,2,1))
        out,_ = self.rnn(x3,(h0,c0))
        return out # shape 1,6,4

model_path = "paddle/Two_LSTM_batch"
model = Two_LSTM_batch()
model.eval()
infer_shape = [1,3,1,6]
input_spec=paddle.static.InputSpec(shape=infer_shape, dtype="float32")
paddle.onnx.export(model,model_path,input_spec=[input_spec],opset_version=11,enable_onnx_checker=True)

model = onnx.load(model_path+'.onnx')
model_sim ,check = simplify(model)
assert check,"simplified onnx model could not be validated"
save_path = model_path+"_sim.onnx"
onnx.save(model_sim,save_path)

model = onnx.load(save_path)
if model.ir_version<4:
    print("Model with ir_version below 4 requires to include initializer in graph input")
    exit()
inputs = model.graph.input
name_to_input = {}
for input in inputs:
    name_to_input[input.name]=input
for initializer in model.graph.initializer:
    if initializer.name in name_to_input:
        inputs.remove(name_to_input[initializer.name])
passes=["extract_constant_to_initializer","eliminate_unused_initializer"]
optimized_model = onnxoptimizer.optimize(model,passes)
save_path = model_path+"_sim_opt.onnx"
onnx.save(optimized_model,save_path)



class Bi_Two_LSTM_batch(nn.Layer):
    def __init__(self,in_channels=3,out_channels=4):
        super().__init__()
        self.rnn = nn.LSTM(in_channels,out_channels,time_major=False,direction="bidirect",num_layers=2)
    def forward(self,x):
        # b,c,h,w =x.shape
        # x1 = paddle.reshape(x,[b,c,h*w])
        h0 = paddle.zeros((4,1,4))
        c0 = paddle.zeros((4,1,4))
        x2 = paddle.squeeze(x,2)
        x3 = paddle.transpose(x2,(0,2,1))
        out,_ = self.rnn(x3,(h0,c0))
        return out # shape 1,6,8 (forward and backward outputs are concatenated)

model_path = "paddle/Bi_Two_LSTM_batch"
model = Bi_Two_LSTM_batch()
model.eval()
infer_shape = [1,3,1,6]
input_spec=paddle.static.InputSpec(shape=infer_shape, dtype="float32")
paddle.onnx.export(model,model_path,input_spec=[input_spec],opset_version=11,enable_onnx_checker=True)

model = onnx.load(model_path+'.onnx')
model_sim ,check = simplify(model)
assert check,"simplified onnx model could not be validated"
save_path = model_path+"_sim.onnx"
onnx.save(model_sim,save_path)

model = onnx.load(save_path)
if model.ir_version<4:
    print("Model with ir_version below 4 requires to include initializer in graph input")
    exit()
inputs = model.graph.input
name_to_input = {}
for input in inputs:
    name_to_input[input.name]=input
for initializer in model.graph.initializer:
    if initializer.name in name_to_input:
        inputs.remove(name_to_input[initializer.name])
passes=["extract_constant_to_initializer","eliminate_unused_initializer"]
optimized_model = onnxoptimizer.optimize(model,passes)
save_path = model_path+"_sim_opt.onnx"
onnx.save(optimized_model,save_path)


2022-08-03 09:39:41 [INFO]	ONNX model generated is valid.
2022-08-03 09:39:41 [INFO]	ONNX model saved in paddle/One_LSTM_batch.onnx
2022-08-03 09:39:41 [INFO]	ONNX model generated is valid.
2022-08-03 09:39:41 [INFO]	ONNX model saved in paddle/Two_LSTM_batch.onnx
2022-08-03 09:39:41 [INFO]	ONNX model generated is valid.
2022-08-03 09:39:41 [INFO]	ONNX model saved in paddle/Bi_Two_LSTM_batch.onnx


/home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/onnx/numpy_helper.py:93: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe. 
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  if arr.dtype == np.object:

1.2 time_major=True

This is the same as batch_first=False in PyTorch: the input is laid out as [time, batch, features].
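The initial-state shapes used in the models of this section, (1,1,4), (2,1,4) and (4,1,4), follow from the number of layers and directions. The sketch below computes the expected output and state shapes for each variant. lstm_shapes is a hypothetical helper written for this illustration, not a Paddle function, and it assumes the usual convention that the state's leading dimension is num_layers * num_directions.

```python
def lstm_shapes(batch, steps, hidden, num_layers=1, bidirectional=False,
                time_major=False):
    """Return (output_shape, state_shape) for one LSTM call."""
    num_directions = 2 if bidirectional else 1
    # One (h, c) slot per layer and per direction.
    state_shape = (num_layers * num_directions, batch, hidden)
    # A bidirectional LSTM concatenates both directions on the feature axis.
    features = num_directions * hidden
    if time_major:
        output_shape = (steps, batch, features)
    else:
        output_shape = (batch, steps, features)
    return output_shape, state_shape


one = lstm_shapes(1, 6, 4, time_major=True)
two = lstm_shapes(1, 6, 4, num_layers=2, time_major=True)
bi_two = lstm_shapes(1, 6, 4, num_layers=2, bidirectional=True, time_major=True)
print(one, two, bi_two)
```

With time_major=True the output is therefore laid out as [6,1,4] (or [6,1,8] for the bidirectional model), not [1,6,4] as in the batch-major case.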

class One_LSTM_time(nn.Layer):
    def __init__(self,in_channels=3,out_channels=4):
        super().__init__()
        self.rnn = nn.LSTM(in_channels,out_channels,time_major=True)
    def forward(self,x):
        # b,c,h,w =x.shape
        # x1 = paddle.reshape(x,[b,c,h*w])
        h0 = paddle.zeros((1,1,4))
        c0 = paddle.zeros((1,1,4))
        x2 = paddle.squeeze(x,2)
        x3 = paddle.transpose(x2,(2,0,1))
        out,_ = self.rnn(x3,(h0,c0))
        return out # shape 6,1,4 (time, batch, hidden)

model_path = "paddle/One_LSTM_time"
model = One_LSTM_time()
model.eval()
infer_shape = [1,3,1,6]
input_spec=paddle.static.InputSpec(shape=infer_shape, dtype="float32")
paddle.onnx.export(model,model_path,input_spec=[input_spec],opset_version=11,enable_onnx_checker=True)

model = onnx.load(model_path+'.onnx')
model_sim ,check = simplify(model)
assert check,"simplified onnx model could not be validated"
save_path = model_path+"_sim.onnx"
onnx.save(model_sim,save_path)

model = onnx.load(save_path)
if model.ir_version<4:
    print("Model with ir_version below 4 requires to include initializer in graph input")
    exit()
inputs = model.graph.input
name_to_input = {}
for input in inputs:
    name_to_input[input.name]=input
for initializer in model.graph.initializer:
    if initializer.name in name_to_input:
        inputs.remove(name_to_input[initializer.name])
passes=["extract_constant_to_initializer","eliminate_unused_initializer"]
optimized_model = onnxoptimizer.optimize(model,passes)
save_path = model_path+"_sim_opt.onnx"
onnx.save(optimized_model,save_path)


class Two_LSTM_time(nn.Layer):
    def __init__(self,in_channels=3,out_channels=4):
        super().__init__()
        self.rnn = nn.LSTM(in_channels,out_channels,time_major=True,num_layers=2)
    def forward(self,x):
        # b,c,h,w =x.shape
        # x1 = paddle.reshape(x,[b,c,h*w])
        h0 = paddle.zeros((2,1,4))
        c0 = paddle.zeros((2,1,4))
        x2 = paddle.squeeze(x,2)
        x3 = paddle.transpose(x2,(2,0,1))
        out,_ = self.rnn(x3,(h0,c0))
        return out # shape 6,1,4 (time, batch, hidden)

model_path = "paddle/Two_LSTM_time"
model = Two_LSTM_time()
model.eval()
infer_shape = [1,3,1,6]
input_spec=paddle.static.InputSpec(shape=infer_shape, dtype="float32")
paddle.onnx.export(model,model_path,input_spec=[input_spec],opset_version=11,enable_onnx_checker=True)

model = onnx.load(model_path+'.onnx')
model_sim ,check = simplify(model)
assert check,"simplified onnx model could not be validated"
save_path = model_path+"_sim.onnx"
onnx.save(model_sim,save_path)

model = onnx.load(save_path)
if model.ir_version<4:
    print("Model with ir_version below 4 requires to include initializer in graph input")
    exit()
inputs = model.graph.input
name_to_input = {}
for input in inputs:
    name_to_input[input.name]=input
for initializer in model.graph.initializer:
    if initializer.name in name_to_input:
        inputs.remove(name_to_input[initializer.name])
passes=["extract_constant_to_initializer","eliminate_unused_initializer"]
optimized_model = onnxoptimizer.optimize(model,passes)
save_path = model_path+"_sim_opt.onnx"
onnx.save(optimized_model,save_path)




class Bi_Two_LSTM_time(nn.Layer):
    def __init__(self,in_channels=3,out_channels=4):
        super().__init__()
        self.rnn = nn.LSTM(in_channels,out_channels,time_major=True,direction="bidirect",num_layers=2)
    def forward(self,x):
        # b,c,h,w =x.shape
        # x1 = paddle.reshape(x,[b,c,h*w])
        h0 = paddle.zeros((4,1,4))
        c0 = paddle.zeros((4,1,4))
        x2 = paddle.squeeze(x,2)
        x3 = paddle.transpose(x2,(2,0,1))
        out,_ = self.rnn(x3,(h0,c0))
        return out # shape 6,1,8 (time, batch, 2*hidden)

model_path = "paddle/Bi_Two_LSTM_time"
model = Bi_Two_LSTM_time()
model.eval()
infer_shape = [1,3,1,6]
input_spec=paddle.static.InputSpec(shape=infer_shape, dtype="float32")
paddle.onnx.export(model,model_path,input_spec=[input_spec],opset_version=11,enable_onnx_checker=True)

model = onnx.load(model_path+'.onnx')
model_sim ,check = simplify(model)
assert check,"simplified onnx model could not be validated"
save_path = model_path+"_sim.onnx"
onnx.save(model_sim,save_path)

model = onnx.load(save_path)
if model.ir_version<4:
    print("Model with ir_version below 4 requires to include initializer in graph input")
    exit()
inputs = model.graph.input
name_to_input = {}
for input in inputs:
    name_to_input[input.name]=input
for initializer in model.graph.initializer:
    if initializer.name in name_to_input:
        inputs.remove(name_to_input[initializer.name])
passes=["extract_constant_to_initializer","eliminate_unused_initializer"]
optimized_model = onnxoptimizer.optimize(model,passes)
save_path = model_path+"_sim_opt.onnx"
onnx.save(optimized_model,save_path)



2022-08-03 09:40:26 [INFO]	ONNX model generated is valid.
2022-08-03 09:40:26 [INFO]	ONNX model saved in paddle/One_LSTM_time.onnx
2022-08-03 09:40:26 [INFO]	ONNX model generated is valid.
2022-08-03 09:40:26 [INFO]	ONNX model saved in paddle/Two_LSTM_time.onnx
2022-08-03 09:40:26 [INFO]	ONNX model generated is valid.
2022-08-03 09:40:26 [INFO]	ONNX model saved in paddle/Bi_Two_LSTM_time.onnx

1.3 sequence_lens

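Paddle's LSTM also accepts a sequence_length argument (passed as sequence_lens in the code here), which PyTorch's LSTM does not have. It gives the valid length of each sequence in the batch: time steps at or beyond sequence_length are cut off and treated as padding elements. The small experiment below uses only a single-layer LSTM with time_major=True.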
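One way to picture this rule is as a boolean mask over time steps. The sketch below builds such a mask in plain Python. sequence_mask is a hypothetical helper for illustration and says nothing about how Paddle implements the masking internally; the second batch element with length 4 is also made up to show the effect.

```python
def sequence_mask(sequence_length, max_steps):
    """Build a time-major mask: mask[t][b] is True when step t holds real
    data for batch element b, and False when it counts as padding."""
    return [[t < length for length in sequence_length]
            for t in range(max_steps)]


# Two sequences padded to 6 steps: the first is full length, the second
# has only 4 valid steps.
mask = sequence_mask([6, 4], max_steps=6)
for t, row in enumerate(mask):
    print(t, row)
```

In the experiment here the only sequence has length 6 and the input has 6 time steps, so every step is valid and nothing is treated as padding.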

class Seq_One_LSTM_time(nn.Layer):
    def __init__(self,in_channels=3,out_channels=4):
        super().__init__()
        self.rnn = nn.LSTM(in_channels,out_channels,time_major=True)
    def forward(self,x):
        # b,c,h,w =x.shape
        # x1 = paddle.reshape(x,[b,c,h*w])
        h0 = paddle.zeros((1,1,4))
        c0 = paddle.zeros((1,1,4))
        sequence_lens = paddle.to_tensor([6]) # one valid length per batch element, shape [batch]
        x2 = paddle.squeeze(x,2)
        x3 = paddle.transpose(x2,(2,0,1))
        out,_ = self.rnn(inputs=x3,initial_states=(h0,c0),sequence_length=sequence_lens)
        return out # shape 6,1,4 (time, batch, hidden)

model_path = "paddle/Seq_One_LSTM_time"
model = Seq_One_LSTM_time()
model.eval()
infer_shape = [1,3,1,6]
input_spec=paddle.static.InputSpec(shape=infer_shape, dtype="float32")
paddle.onnx.export(model,model_path,input_spec=[input_spec],opset_version=11,enable_onnx_checker=True)

model = onnx.load(model_path+'.onnx')
model_sim ,check = simplify(model)
assert check,"simplified onnx model could not be validated"
save_path = model_path+"_sim.onnx"
onnx.save(model_sim,save_path)

model = onnx.load(save_path)
if model.ir_version<4:
    print("Model with ir_version below 4 requires to include initializer in graph input")
    exit()
inputs = model.graph.input
name_to_input = {}
for input in inputs:
    name_to_input[input.name]=input
for initializer in model.graph.initializer:
    if initializer.name in name_to_input:
        inputs.remove(name_to_input[initializer.name])
passes=["extract_constant_to_initializer","eliminate_unused_initializer"]
optimized_model = onnxoptimizer.optimize(model,passes)
save_path = model_path+"_sim_opt.onnx"
onnx.save(optimized_model,save_path)

W0803 15:08:57.771442 24847 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.4, Runtime API Version: 11.2
W0803 15:08:57.774729 24847 device_context.cc:465] device: 0, cuDNN Version: 8.1.


2022-08-03 15:09:00 [INFO]	ONNX model generated is valid.
2022-08-03 15:09:00 [INFO]	ONNX model saved in paddle/Seq_One_LSTM_time.onnx


/home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle2onnx/constant/dtypes.py:47: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  np.bool: core.VarDesc.VarType.BOOL,
/home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle2onnx/constant/dtypes.py:48: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  core.VarDesc.VarType.FP32: np.float,
/home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle2onnx/constant/dtypes.py:53: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  core.VarDesc.VarType.BOOL: np.bool
/home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:77: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
  return (isinstance(seq, collections.Sequence) and
/home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/onnx/helper.py:343: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
  is_iterable = isinstance(value, collections.Iterable)
/home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/onnx/numpy_helper.py:93: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe. 
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  if arr.dtype == np.object:

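1.4 No initial state

Here we call the LSTM without passing the initial states h0 and c0 by hand; internally they are initialized to all zeros. PyTorch works the same way, but the resulting ONNX graphs differ: the graph PyTorch produces when no initial state is passed matches the one Paddle produces when the state is passed explicitly. More on this later; comparing all the graphs side by side makes the differences clear.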
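The claim that omitting h0 and c0 is numerically the same as passing zeros can be checked on a toy LSTM. The following is a minimal pure-Python sketch, not Paddle's implementation: lstm_forward is a hypothetical function, the weights are arbitrary, the gate rows are assumed to be ordered i, f, g, o, and a single combined bias is used for brevity.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def lstm_forward(xs, w_ih, w_hh, bias, h0=None, c0=None):
    """Run one LSTM layer over xs (a list of input vectors, one sequence).

    w_ih has 4*hidden rows of input weights and w_hh has 4*hidden rows of
    recurrent weights; rows are assumed to be ordered i, f, g, o.
    Missing initial states default to zeros.
    """
    hidden = len(w_hh[0])
    h = list(h0) if h0 is not None else [0.0] * hidden
    c = list(c0) if c0 is not None else [0.0] * hidden
    outputs = []
    for x in xs:
        gates = [
            sum(w * v for w, v in zip(w_ih[r], x))
            + sum(w * v for w, v in zip(w_hh[r], h))
            + bias[r]
            for r in range(4 * hidden)
        ]
        i_gate = [sigmoid(v) for v in gates[0:hidden]]
        f_gate = [sigmoid(v) for v in gates[hidden:2 * hidden]]
        g_cand = [math.tanh(v) for v in gates[2 * hidden:3 * hidden]]
        o_gate = [sigmoid(v) for v in gates[3 * hidden:4 * hidden]]
        c = [f_gate[k] * c[k] + i_gate[k] * g_cand[k] for k in range(hidden)]
        h = [o_gate[k] * math.tanh(c[k]) for k in range(hidden)]
        outputs.append(h)
    return outputs


in_size, hidden, steps = 3, 4, 6
# Arbitrary deterministic weights and inputs, chosen only for the demo.
w_ih = [[0.01 * (r + k + 1) for k in range(in_size)] for r in range(4 * hidden)]
w_hh = [[0.02 * (r - k) for k in range(hidden)] for r in range(4 * hidden)]
bias = [0.0] * (4 * hidden)
xs = [[0.1 * t, 0.2, -0.1 * t] for t in range(steps)]

implicit = lstm_forward(xs, w_ih, w_hh, bias)
explicit = lstm_forward(xs, w_ih, w_hh, bias,
                        h0=[0.0] * hidden, c0=[0.0] * hidden)
print(implicit == explicit)
```

Both calls run exactly the same arithmetic, so the outputs agree; any difference between the exported graphs is therefore structural rather than numerical.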

class Ini_One_LSTM_time(nn.Layer):
    def __init__(self,in_channels=3,out_channels=4):
        super().__init__()
        self.rnn = nn.LSTM(in_channels,out_channels,time_major=True)
    def forward(self,x):
        # b,c,h,w =x.shape
        # x1 = paddle.reshape(x,[b,c,h*w])
        x2 = paddle.squeeze(x,2)
        x3 = paddle.transpose(x2,(2,0,1))
        out,_ = self.rnn(x3)
        return out # shape 6,1,4 (time, batch, hidden)

model_path = "paddle/Ini_One_LSTM_time"
model = Ini_One_LSTM_time()
model.eval()
infer_shape = [1,3,1,6]
input_spec=paddle.static.InputSpec(shape=infer_shape, dtype="float32")
paddle.onnx.export(model,model_path,input_spec=[input_spec],opset_version=11,enable_onnx_checker=True)

model = onnx.load(model_path+'.onnx')
model_sim ,check = simplify(model)
assert check,"simplified onnx model could not be validated"
save_path = model_path+"_sim.onnx"
onnx.save(model_sim,save_path)

model = onnx.load(save_path)
if model.ir_version<4:
    print("Model with ir_version below 4 requires to include initializer in graph input")
    exit()
inputs = model.graph.input
name_to_input = {}
for input in inputs:
    name_to_input[input.name]=input
for initializer in model.graph.initializer:
    if initializer.name in name_to_input:
        inputs.remove(name_to_input[initializer.name])
passes=["extract_constant_to_initializer","eliminate_unused_initializer"]
optimized_model = onnxoptimizer.optimize(model,passes)
save_path = model_path+"_sim_opt.onnx"
onnx.save(optimized_model,save_path)

2022-08-03 15:06:50 [INFO]	ONNX model generated is valid.
2022-08-03 15:06:50 [INFO]	ONNX model saved in paddle/Ini_One_LSTM_time.onnx

1.5 Inspecting the generated ONNX models

paddle_onnx = sorted(os.listdir('paddle'))
paddle_onnx_paths = sorted([os.path.join('paddle',path) for path in paddle_onnx])
print(paddle_onnx)

['Bi_Two_LSTM_batch.onnx', 'Bi_Two_LSTM_batch_sim.onnx', 'Bi_Two_LSTM_batch_sim_opt.onnx', 'Bi_Two_LSTM_time.onnx', 'Bi_Two_LSTM_time_sim.onnx', 'Bi_Two_LSTM_time_sim_opt.onnx', 'Ini_One_LSTM_time.onnx', 'Ini_One_LSTM_time_sim.onnx', 'Ini_One_LSTM_time_sim_opt.onnx', 'One_LSTM_batch.onnx', 'One_LSTM_batch_sim.onnx', 'One_LSTM_batch_sim_opt.onnx', 'One_LSTM_time.onnx', 'One_LSTM_time_sim.onnx', 'One_LSTM_time_sim_opt.onnx', 'Seq_One_LSTM_time.onnx', 'Seq_One_LSTM_time_sim.onnx', 'Seq_One_LSTM_time_sim_opt.onnx', 'Two_LSTM_batch.onnx', 'Two_LSTM_batch_sim.onnx', 'Two_LSTM_batch_sim_opt.onnx', 'Two_LSTM_time.onnx', 'Two_LSTM_time_sim.onnx', 'Two_LSTM_time_sim_opt.onnx']

# Check the size of each model
! du -sh paddle/*

16K	paddle/Bi_Two_LSTM_batch.onnx
8.0K	paddle/Bi_Two_LSTM_batch_sim.onnx
8.0K	paddle/Bi_Two_LSTM_batch_sim_opt.onnx
16K	paddle/Bi_Two_LSTM_time.onnx
8.0K	paddle/Bi_Two_LSTM_time_sim.onnx
8.0K	paddle/Bi_Two_LSTM_time_sim_opt.onnx
8.0K	paddle/Ini_One_LSTM_time.onnx
4.0K	paddle/Ini_One_LSTM_time_sim.onnx
4.0K	paddle/Ini_One_LSTM_time_sim_opt.onnx
8.0K	paddle/One_LSTM_batch.onnx
4.0K	paddle/One_LSTM_batch_sim.onnx
4.0K	paddle/One_LSTM_batch_sim_opt.onnx
8.0K	paddle/One_LSTM_time.onnx
4.0K	paddle/One_LSTM_time_sim.onnx
4.0K	paddle/One_LSTM_time_sim_opt.onnx
8.0K	paddle/Seq_One_LSTM_time.onnx
4.0K	paddle/Seq_One_LSTM_time_sim.onnx
4.0K	paddle/Seq_One_LSTM_time_sim_opt.onnx
12K	paddle/Two_LSTM_batch.onnx
4.0K	paddle/Two_LSTM_batch_sim.onnx
4.0K	paddle/Two_LSTM_batch_sim_opt.onnx
12K	paddle/Two_LSTM_time.onnx
4.0K	paddle/Two_LSTM_time_sim.onnx
4.0K	paddle/Two_LSTM_time_sim_opt.onnx

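Load each ONNX model, run inference, and compare the outputs pairwise within each group of three files (exported, simplified, and optimized).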
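The comparison loop in this section relies on the sorted file list containing exactly three consecutive files per model. A slightly more defensive variant, sketched here as an assumption rather than as part of the original code, groups the paths by model name first. group_by_variant is a hypothetical helper, and notes.txt is an invented stray file used to show that non-ONNX files are skipped.

```python
import os
from collections import defaultdict


def group_by_variant(paths):
    """Group .onnx paths by model name, ignoring _sim and _sim_opt suffixes."""
    groups = defaultdict(list)
    for path in sorted(paths):
        name = os.path.basename(path)
        if not name.endswith(".onnx"):
            continue  # skip anything that is not an ONNX file
        stem = name[: -len(".onnx")]
        # Check the longer suffix first so "_sim_opt" is not cut to "_opt".
        for suffix in ("_sim_opt", "_sim"):
            if stem.endswith(suffix):
                stem = stem[: -len(suffix)]
                break
        groups[stem].append(path)
    return dict(groups)


paths = [
    "paddle/One_LSTM_time_sim.onnx",
    "paddle/One_LSTM_time.onnx",
    "paddle/One_LSTM_time_sim_opt.onnx",
    "paddle/Two_LSTM_time.onnx",
    "paddle/notes.txt",
]
groups = group_by_variant(paths)
for stem, members in groups.items():
    print(stem, members)
```

Each group can then be passed to the inference function and compared, and a group with fewer than three members signals a missing export step.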

def onnx_infer(model_path,data):
    """Run one ONNX model with onnxruntime and return its first output.

    Args:
        model_path (str): Path to the .onnx file.
        data (np.ndarray): Input array fed to the model's first input.
    """
    onnx_session=onnxruntime.InferenceSession(model_path)
    input_name = onnx_session.get_inputs()[0].name
    output_name = onnx_session.get_outputs()[0].name
    result = onnx_session.run([output_name],{input_name:data})
    return result[0]

test_data = np.random.random((1,3,1,6)).astype(np.float32) # batch,channel,height,width
results={}

for i,onnx_path in enumerate(paddle_onnx_paths):

    result = onnx_infer(onnx_path,test_data)
    results[os.path.basename(onnx_path)]=result

    if i%3 ==2:
        try:
            values = list(results.values())
            np.testing.assert_allclose(values[0],values[1],rtol=1e-5)
            np.testing.assert_allclose(values[2],values[1],rtol=1e-5)
            print(f"{list(results.keys())} have same results")
        except AssertionError:
            print(f"{list(results.keys())} have different results")
        finally:
            results={}

2022-08-03 17:03:35.322915835 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_4 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.322938940 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_5 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.322945528 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_8 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.322951708 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_12 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.322957288 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.322963180 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_44 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.322968668 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.322974083 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_14 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.322979304 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_17 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.322984535 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_26 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.322990949 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_27 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.322996580 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_87 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.392730468 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_4 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.392757571 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_5 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.392764542 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_8 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.392770328 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_12 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.392775836 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.392781517 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_44 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.392786892 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.392792184 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_14 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.392797446 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_17 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.392802640 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_26 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.392808940 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_27 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.392814520 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_87 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.482187472 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_0 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.482211960 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_0 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.482218895 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_4 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.482224351 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_1 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.482229628 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_6 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.482235089 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_7 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.482240251 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_10 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.482245348 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_44 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.482250361 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_45 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.482255449 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_46 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.482267806 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_47 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.482273280 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_48 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.482278188 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_49 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.482283068 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_50 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.544810887 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_4 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.544836135 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_5 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.544842815 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_8 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.544848451 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_12 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.544853781 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.544859396 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_44 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.


['Bi_Two_LSTM_batch.onnx', 'Bi_Two_LSTM_batch_sim.onnx', 'Bi_Two_LSTM_batch_sim_opt.onnx'] have same results
['Bi_Two_LSTM_time.onnx', 'Bi_Two_LSTM_time_sim.onnx', 'Bi_Two_LSTM_time_sim_opt.onnx'] have same results
['Ini_One_LSTM_time.onnx', 'Ini_One_LSTM_time_sim.onnx', 'Ini_One_LSTM_time_sim_opt.onnx'] have same results
['One_LSTM_batch.onnx', 'One_LSTM_batch_sim.onnx', 'One_LSTM_batch_sim_opt.onnx'] have same results
['One_LSTM_time.onnx', 'One_LSTM_time_sim.onnx', 'One_LSTM_time_sim_opt.onnx'] have same results
['Seq_One_LSTM_time.onnx', 'Seq_One_LSTM_time_sim.onnx', 'Seq_One_LSTM_time_sim_opt.onnx'] have same results
['Two_LSTM_batch.onnx', 'Two_LSTM_batch_sim.onnx', 'Two_LSTM_batch_sim_opt.onnx'] have same results
['Two_LSTM_time.onnx', 'Two_LSTM_time_sim.onnx', 'Two_LSTM_time_sim_opt.onnx'] have same results


2022-08-03 17:03:35.627142152 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_4 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.627166154 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_5 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.627172672 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_8 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.627178399 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_12 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.627184004 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.627189596 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_44 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.680342507 [W:onnxruntime:, graph.cc:3559 CleanUnusedInitializersAndNodeArgs] Removing initializer 'assign_0.tmp_0'. It is not used by any node and should be removed from the model.
2022-08-03 17:03:35.684640360 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_4 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.684662133 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_5 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.684672193 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_8 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.684680922 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_12 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.684688933 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.684697379 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_44 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.710238396 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_4 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.710278276 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_5 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.710292209 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_8 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.710303902 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_12 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.710315317 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.710326920 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_44 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.710337914 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.710348837 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_14 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.710359584 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_17 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.710370872 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_26 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.710383823 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_27 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.710395467 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_87 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.810340323 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_4 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.810366323 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_5 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.810374143 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_8 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.810380074 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_12 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.810385473 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.810391336 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_44 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.810396558 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.810401858 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_14 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.810407006 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_17 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.810412186 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_26 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.810417317 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_27 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2022-08-03 17:03:35.810424180 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_87 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.

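The inference runs above show that, within a tolerance of 1e-5 (one hundred-thousandth, roughly five decimal places), the original, _sim and _sim_opt ONNX models give identical results. The many warnings come from the models with the _sim suffix; the original models and the models ending in _opt do not have this problem.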
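The comparison behind the "have same results" lines can be sketched as follows. This is only an illustration: the function name same_results and the stand-in arrays are assumptions, and using 1e-5 for both rtol and atol is a guess at how the tolerance was applied.

```python
import numpy as np

def same_results(outputs, rtol=1e-5, atol=1e-5):
    """Return True if every array in `outputs` matches the first one within tolerance."""
    reference = np.asarray(outputs[0])
    return all(
        np.allclose(reference, np.asarray(other), rtol=rtol, atol=atol)
        for other in outputs[1:]
    )

# Stand-in arrays (assumption): in practice these would be the outputs of the
# original, _sim and _sim_opt models for the same input.
base = np.array([[0.12345678, -0.5], [0.25, 0.75]], dtype=np.float32)
nearly_equal = base + np.float32(1e-7)  # differs far below the tolerance
different = base + np.float32(1e-2)     # differs far above the tolerance

print(same_results([base, nearly_equal, base.copy()]))
print(same_results([base, different]))
```

With real models, each array would come from an onnxruntime InferenceSession run on the same input.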

The following are the structure diagrams of the generated ONNX models whose names end in opt:

Type	Single-layer	Two-layer	Two-layer bidirectional
time_major=False
time_major=True

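Next comes the effect of the sequence_lens parameter, shown for a single-layer LSTM only. The resulting graph is:
[Figure: ONNX graph of the single-layer LSTM with sequence_lens]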
There is also the graph of a single-layer LSTM without custom initial states, shown below:

You can see that extra operators are added; this part is actually unnecessary.
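Without the images, one way to see which operators were added is to count node types in two exported graphs and take the difference. The sketch below is an assumption-laden illustration: count_ops and extra_ops are hypothetical helper names, and the toy graphs only stand in for real `onnx.load(...).graph` objects, which expose the same `node` and `op_type` attributes.

```python
from collections import Counter
from types import SimpleNamespace

def count_ops(graph):
    """Count how many nodes of each op_type a graph contains."""
    return Counter(node.op_type for node in graph.node)

def extra_ops(graph_a, graph_b):
    """Operators that appear more often in graph_a than in graph_b."""
    return count_ops(graph_a) - count_ops(graph_b)

# Toy stand-ins (assumption) so the sketch runs without any model files.
plain = SimpleNamespace(node=[
    SimpleNamespace(op_type="Transpose"),
    SimpleNamespace(op_type="LSTM"),
    SimpleNamespace(op_type="Squeeze"),
])
with_extras = SimpleNamespace(node=plain.node + [
    SimpleNamespace(op_type="Shape"),
    SimpleNamespace(op_type="Expand"),
])

print(dict(extra_ops(with_extras, plain)))

# With real files (placeholder names, requires the onnx package):
#   import onnx
#   a = onnx.load("model_a.onnx").graph
#   b = onnx.load("model_b.onnx").graph
#   print(extra_ops(a, b))
```

Counter subtraction drops zero and negative counts, so only the operators that were actually added remain.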

2 Generating an LSTM with PyTorch

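Because the PyTorch models are exported to ONNX with keep_initializers_as_inputs=False, only the sim (simplify) step is needed here; otherwise one extra step would be required, as with Paddle.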
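The ONNX Runtime warnings of the form "Initializer ... appears in graph inputs" refer to weights that are listed both as initializers and as graph inputs, which is what keep_initializers_as_inputs controls at export time. As a rough sketch of what "moving them out of graph inputs" means, the hypothetical helper below removes such duplicates. It is written against the `input` and `initializer` fields of an ONNX graph, and the toy graph built with SimpleNamespace is only a stand-in so the example runs without a model file.

```python
from types import SimpleNamespace

def remove_initializers_from_inputs(graph):
    """Drop graph inputs that are also initializers; return the removed names."""
    initializer_names = {init.name for init in graph.initializer}
    duplicated = [inp for inp in graph.input if inp.name in initializer_names]
    for inp in duplicated:
        graph.input.remove(inp)
    return [inp.name for inp in duplicated]

# Toy stand-in for model.graph (assumption): one real input plus two weights
# that were also exported as graph inputs.
toy_graph = SimpleNamespace(
    input=[
        SimpleNamespace(name="input"),
        SimpleNamespace(name="Concat_4"),
        SimpleNamespace(name="Constant_44"),
    ],
    initializer=[
        SimpleNamespace(name="Concat_4"),
        SimpleNamespace(name="Constant_44"),
    ],
)

removed = remove_initializers_from_inputs(toy_graph)
remaining = [inp.name for inp in toy_graph.input]
print(removed, remaining)
```

On a real model this would be applied to `onnx.load(path).graph` before saving again; the script named in the warning text, onnxruntime/tools/python/remove_initializer_from_input.py, serves the same purpose.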

2.1 batch_first=True
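With batch_first=True the LSTM expects input laid out as (batch, seq_len, features). The models in this subsection take a (1, 3, 1, 6) tensor, squeeze axis 2 and swap the last two axes before the LSTM. The shape arithmetic can be traced with a small sketch; the helper names below are illustrative and operate on plain tuples rather than tensors.

```python
def squeeze(shape, axis):
    """Shape after torch.squeeze(x, axis): drop the axis only if its size is 1."""
    if shape[axis] != 1:
        return tuple(shape)
    return tuple(s for i, s in enumerate(shape) if i != axis)

def permute(shape, order):
    """Shape after torch.permute(x, order)."""
    return tuple(shape[i] for i in order)

def lstm_output_shape(shape, hidden_size, bidirectional=False):
    """Output shape of a batch_first LSTM for input (batch, seq_len, features)."""
    batch, seq_len, _ = shape
    directions = 2 if bidirectional else 1
    return (batch, seq_len, directions * hidden_size)

x = (1, 3, 1, 6)
x2 = squeeze(x, 2)            # (1, 3, 6)
x3 = permute(x2, (0, 2, 1))   # (1, 6, 3): batch, seq_len, features
uni = lstm_output_shape(x3, hidden_size=4)
bi = lstm_output_shape(x3, hidden_size=4, bidirectional=True)
print(x2, x3, uni, bi)  # (1, 3, 6) (1, 6, 3) (1, 6, 4) (1, 6, 8)
```

The number of layers does not change the output shape; only the bidirectional variant doubles the last dimension.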

import os
import sys
sys.path.append('/home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages')
import torch
from torch import nn
import numpy as np
from onnxsim import simplify
import onnxoptimizer
import onnx
import onnxruntime

class One_LSTM_batch(nn.Module):
    def __init__(self,in_channels=3,out_channels=4):
        super().__init__()
        self.rnn = nn.LSTM(in_channels,out_channels,batch_first=True)
    def forward(self,x):
        # b,c,h,w =x.shape
        # x1 = torch.reshape(x,[b,c,h*w])
        x2 = torch.squeeze(x,2)
        x3 = torch.permute(x2,(0,2,1))
        out,_ = self.rnn(x3)
        return out # shape 1,6,4
    
model = One_LSTM_batch()
model.to('cpu')
model.eval()

input = torch.randn(1,3,1,6)
output = model(input)
print("output shape:",output.shape)

input_shapes=[(1,3,1,6)]
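onnx_export_path = "torch/One_lstm_batch.onnx"
os.makedirs(os.path.dirname(onnx_export_path), exist_ok=True)  # make sure the output directory exists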
dummy_input=[]
for ele in input_shapes:
    dummy_input.append(torch.randn(ele))
dummy_input=tuple(dummy_input)

# torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"], dynamic_axes={'input' : {0 : 'batch_size'},'output' : {0 : 'batch_size'}})
torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True,keep_initializers_as_inputs=False,input_names=["input"], output_names=["output"])
print("export onnx to:",onnx_export_path)

onnx_model = onnx.load(onnx_export_path)
model_sim ,check = simplify(onnx_model)
assert check,"simplified onnx model could not be validated"
save_path = os.path.splitext(onnx_export_path)[0]+"_sim.onnx"
onnx.save(model_sim,save_path)

class Two_LSTM_batch(nn.Module):
    def __init__(self,in_channels=3,out_channels=4):
        super().__init__()
        self.rnn = nn.LSTM(in_channels,out_channels,batch_first=True,num_layers=2)
    def forward(self,x):
        # b,c,h,w =x.shape
        # x1 = torch.reshape(x,[b,c,h*w])
        x2 = torch.squeeze(x,2)
        x3 = torch.permute(x2,(0,2,1))
        out,_ = self.rnn(x3)
        return out # shape 1,6,4
    
model = Two_LSTM_batch()
model.to('cpu')
model.eval()

input = torch.randn(1,3,1,6)
output = model(input)
print("output shape:",output.shape)

input_shapes=[(1,3,1,6)]
onnx_export_path = "torch/Two_lstm_batch.onnx"
dummy_input=[]
for ele in input_shapes:
    dummy_input.append(torch.randn(ele))
dummy_input=tuple(dummy_input)

# torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"], dynamic_axes={'input' : {0 : 'batch_size'},'output' : {0 : 'batch_size'}})
torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, keep_initializers_as_inputs=False,input_names=["input"], output_names=["output"])
print("export onnx to:",onnx_export_path)

onnx_model = onnx.load(onnx_export_path)
model_sim ,check = simplify(onnx_model)
assert check,"simplified onnx model could not be validated"
save_path = os.path.splitext(onnx_export_path)[0]+"_sim.onnx"
onnx.save(model_sim,save_path)

class Bi_Two_LSTM_batch(nn.Module):
    def __init__(self,in_channels=3,out_channels=4):
        super().__init__()
        self.rnn = nn.LSTM(in_channels,out_channels,batch_first=True,num_layers=2,bidirectional=True)
    def forward(self,x):
        # b,c,h,w =x.shape
        # x1 = torch.reshape(x,[b,c,h*w])
        x2 = torch.squeeze(x,2)
        x3 = torch.permute(x2,(0,2,1))
        out,_ = self.rnn(x3)
        return out # shape 1,6,8 (forward and backward outputs concatenated)
    
model = Bi_Two_LSTM_batch()
model.to('cpu')
model.eval()

input = torch.randn(1,3,1,6)
output = model(input)
print("output shape:",output.shape)

input_shapes=[(1,3,1,6)]
onnx_export_path = "torch/Bi_Two_lstm_batch.onnx"
dummy_input=[]
for ele in input_shapes:
    dummy_input.append(torch.randn(ele))
dummy_input=tuple(dummy_input)

# torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"], dynamic_axes={'input' : {0 : 'batch_size'},'output' : {0 : 'batch_size'}})
torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, keep_initializers_as_inputs=False,input_names=["input"], output_names=["output"])
print("export onnx to:",onnx_export_path)

onnx_model = onnx.load(onnx_export_path)
model_sim ,check = simplify(onnx_model)
assert check,"simplified onnx model could not be validated"
save_path = os.path.splitext(onnx_export_path)[0]+"_sim.onnx"
onnx.save(model_sim,save_path)

output shape: torch.Size([1, 6, 4])
export onnx to: torch/One_lstm_batch.onnx
output shape: torch.Size([1, 6, 4])
export onnx to: torch/Two_lstm_batch.onnx


/home/tl/anaconda3/envs/ptch/lib/python3.7/site-packages/torch/onnx/symbolic_opset9.py:2192: UserWarning: Exporting a model to ONNX with a batch_size other than 1, with a variable length with LSTM can cause an error when running the ONNX model with a different batch size. Make sure to save the model with a batch size of 1, or define the initial states (h0/c0) as inputs of the model. 
  "or define the initial states (h0/c0) as inputs of the model. ")
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.


output shape: torch.Size([1, 6, 8])
export onnx to: torch/Bi_Two_lstm_batch.onnx


WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.

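The export emits some warnings, so it may be better to pass the initial states (h0/c0) to the model explicitly, as the first warning suggests.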
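A minimal sketch of passing the initial states explicitly, for the single-layer batch-first case. The class name `One_LSTM_batch_states`, the helper `export_with_states`, and the input names `h0`/`c0` are illustrative assumptions rather than part of the original notebook; the export call simply mirrors the arguments used above.

```python
import torch
from torch import nn

class One_LSTM_batch_states(nn.Module):
    """Hypothetical variant of the batch-first model that takes h0/c0 as inputs."""
    def __init__(self, in_channels=3, out_channels=4):
        super().__init__()
        self.rnn = nn.LSTM(in_channels, out_channels, batch_first=True)

    def forward(self, x, h0, c0):
        x2 = torch.squeeze(x, 2)            # (b, c, 1, w) -> (b, c, w)
        x3 = torch.permute(x2, (0, 2, 1))   # (b, c, w) -> (b, w, c)
        out, _ = self.rnn(x3, (h0, c0))
        return out

model = One_LSTM_batch_states()
model.eval()

x = torch.randn(1, 3, 1, 6)
# States have shape (num_layers * num_directions, batch, hidden_size),
# also when batch_first=True.
h0 = torch.zeros(1, 1, 4)
c0 = torch.zeros(1, 1, 4)
with torch.no_grad():
    output = model(x, h0, c0)
print("output shape:", output.shape)

def export_with_states(path):
    """Export with the states as named graph inputs (same settings as above)."""
    torch.onnx.export(model, (x, h0, c0), path, export_params=True, opset_version=11,
                      do_constant_folding=True,
                      input_names=["input", "h0", "c0"], output_names=["output"])
```

Calling `export_with_states("torch/One_lstm_batch_states.onnx")` would write a model with three inputs; the file name is only an example. At inference time the caller then has to feed `h0` and `c0` along with `input`.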

2.2 batch_first=False
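
Compared with the batch_first=True models, the models in this subsection only change the permutation applied before the LSTM: `(0,2,1)` yields a batch-first tensor, `(2,0,1)` a time-major one. The shape bookkeeping can be sketched with NumPy stand-ins for `torch.squeeze` and `torch.permute` (NumPy is used here purely for illustration):

```python
import numpy as np

x = np.zeros((1, 3, 1, 6), dtype=np.float32)   # batch, channel, height, width

x2 = np.squeeze(x, axis=2)                      # (1, 3, 6): drop the height axis
batch_first = np.transpose(x2, (0, 2, 1))       # (1, 6, 3): batch, time, feature
time_major = np.transpose(x2, (2, 0, 1))        # (6, 1, 3): time, batch, feature

print(batch_first.shape, time_major.shape)
```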

class One_LSTM_time(nn.Module):
    def __init__(self,in_channels=3,out_channels=4):
        super().__init__()
        self.rnn = nn.LSTM(in_channels,out_channels,batch_first=False)
    def forward(self,x):
        # b,c,h,w =x.shape
        # x1 = torch.reshape(x,[b,c,h*w])
        x2 = torch.squeeze(x,2)
        x3 = torch.permute(x2,(2,0,1))
        out,_ = self.rnn(x3)
        return out # shape 6,1,4 (time, batch, hidden)
    
model = One_LSTM_time()
model.to('cpu')
model.eval()

input = torch.randn(1,3,1,6)
output = model(input)
print("output shape:",output.shape)

input_shapes=[(1,3,1,6)]
onnx_export_path = "torch/One_lstm_time.onnx"
dummy_input=[]
for ele in input_shapes:
    dummy_input.append(torch.randn(ele))
dummy_input=tuple(dummy_input)

# torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"], dynamic_axes={'input' : {0 : 'batch_size'},'output' : {0 : 'batch_size'}})
torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"])
print("export onnx to:",onnx_export_path)

onnx_model = onnx.load(onnx_export_path)
model_sim ,check = simplify(onnx_model)
assert check,"simplified onnx model could not be validated"
save_path = os.path.splitext(onnx_export_path)[0]+"_sim.onnx"
onnx.save(model_sim,save_path)

class Two_LSTM_time(nn.Module):
    def __init__(self,in_channels=3,out_channels=4):
        super().__init__()
        self.rnn = nn.LSTM(in_channels,out_channels,batch_first=False,num_layers=2)
    def forward(self,x):
        # b,c,h,w =x.shape
        # x1 = torch.reshape(x,[b,c,h*w])
        x2 = torch.squeeze(x,2)
        x3 = torch.permute(x2,(2,0,1))
        out,_ = self.rnn(x3)
        return out # shape 6,1,4 (time, batch, hidden)
    
model = Two_LSTM_time()
model.to('cpu')
model.eval()

input = torch.randn(1,3,1,6)
output = model(input)
print("output shape:",output.shape)

input_shapes=[(1,3,1,6)]
onnx_export_path = "torch/Two_lstm_time.onnx"
dummy_input=[]
for ele in input_shapes:
    dummy_input.append(torch.randn(ele))
dummy_input=tuple(dummy_input)

# torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"], dynamic_axes={'input' : {0 : 'batch_size'},'output' : {0 : 'batch_size'}})
torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"])
print("export onnx to:",onnx_export_path)

onnx_model = onnx.load(onnx_export_path)
model_sim ,check = simplify(onnx_model)
assert check,"simplified onnx model could not be validated"
save_path = os.path.splitext(onnx_export_path)[0]+"_sim.onnx"
onnx.save(model_sim,save_path)

class Bi_Two_LSTM_time(nn.Module):
    def __init__(self,in_channels=3,out_channels=4):
        super().__init__()
        self.rnn = nn.LSTM(in_channels,out_channels,batch_first=False,num_layers=2,bidirectional=True)
    def forward(self,x):
        # b,c,h,w =x.shape
        # x1 = torch.reshape(x,[b,c,h*w])
        x2 = torch.squeeze(x,2)
        x3 = torch.permute(x2,(2,0,1))
        out,_ = self.rnn(x3)
        return out # shape 6,1,8 (time, batch, 2*hidden)
    
model = Bi_Two_LSTM_time()
model.to('cpu')
model.eval()

input = torch.randn(1,3,1,6)
output = model(input)
print("output shape:",output.shape)

input_shapes=[(1,3,1,6)]
onnx_export_path = "torch/Bi_Two_lstm_time.onnx"
dummy_input=[]
for ele in input_shapes:
    dummy_input.append(torch.randn(ele))
dummy_input=tuple(dummy_input)

# torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"], dynamic_axes={'input' : {0 : 'batch_size'},'output' : {0 : 'batch_size'}})
torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"])
print("export onnx to:",onnx_export_path)

onnx_model = onnx.load(onnx_export_path)
model_sim ,check = simplify(onnx_model)
assert check,"simplified onnx model could not be validated"
save_path = os.path.splitext(onnx_export_path)[0]+"_sim.onnx"
onnx.save(model_sim,save_path)

output shape: torch.Size([6, 1, 4])
export onnx to: torch/One_lstm_time.onnx
output shape: torch.Size([6, 1, 4])
export onnx to: torch/Two_lstm_time.onnx


/home/tl/anaconda3/envs/ptch/lib/python3.7/site-packages/torch/onnx/symbolic_opset9.py:2192: UserWarning: Exporting a model to ONNX with a batch_size other than 1, with a variable length with LSTM can cause an error when running the ONNX model with a different batch size. Make sure to save the model with a batch size of 1, or define the initial states (h0/c0) as inputs of the model. 
  "or define the initial states (h0/c0) as inputs of the model. ")
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.


output shape: torch.Size([6, 1, 8])
export onnx to: torch/Bi_Two_lstm_time.onnx


WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
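
The bidirectional models report a last dimension of 8 instead of 4 because the forward and backward hidden states are concatenated at every time step. The NumPy sketch below implements a single LSTM direction with the PyTorch gate order (i, f, g, o) and random weights. `lstm_forward` and `random_params` are illustrative names, and the weights are not those of the exported models.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_forward(x, w_ih, w_hh, b, reverse=False):
    """Run one LSTM direction over x of shape (T, input_size); returns (T, hidden)."""
    hidden = w_hh.shape[1]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    steps = range(x.shape[0] - 1, -1, -1) if reverse else range(x.shape[0])
    out = np.zeros((x.shape[0], hidden))
    for t in steps:
        gates = w_ih @ x[t] + w_hh @ h + b       # shape (4 * hidden,)
        i, f, g, o = np.split(gates, 4)           # gate order i, f, g, o
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        out[t] = h                                # stored at the original time index
    return out

rng = np.random.default_rng(0)
T, input_size, hidden = 6, 3, 4
x = rng.standard_normal((T, input_size))

def random_params():
    """Random (w_ih, w_hh, bias) for one direction; purely illustrative."""
    return (rng.standard_normal((4 * hidden, input_size)),
            rng.standard_normal((4 * hidden, hidden)),
            rng.standard_normal(4 * hidden))

fwd = lstm_forward(x, *random_params())
bwd = lstm_forward(x, *random_params(), reverse=True)
bi = np.concatenate([fwd, bwd], axis=-1)          # (T, 2 * hidden)
print(fwd.shape, bi.shape)
```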

2.3 Inspecting the generated ONNX models

pytorch_onnx = sorted(os.listdir('torch'))
pytorch_onnx_paths = sorted([os.path.join('torch',path) for path in pytorch_onnx])
print(pytorch_onnx)

['Bi_Two_lstm_batch.onnx', 'Bi_Two_lstm_batch_sim.onnx', 'Bi_Two_lstm_time.onnx', 'Bi_Two_lstm_time_sim.onnx', 'One_lstm_batch.onnx', 'One_lstm_batch_sim.onnx', 'One_lstm_time.onnx', 'One_lstm_time_sim.onnx', 'Two_lstm_batch.onnx', 'Two_lstm_batch_sim.onnx', 'Two_lstm_time.onnx', 'Two_lstm_time_sim.onnx']

! du -sh torch/*

8.0K	torch/Bi_Two_lstm_batch.onnx
8.0K	torch/Bi_Two_lstm_batch_sim.onnx
8.0K	torch/Bi_Two_lstm_time.onnx
8.0K	torch/Bi_Two_lstm_time_sim.onnx
4.0K	torch/One_lstm_batch.onnx
4.0K	torch/One_lstm_batch_sim.onnx
4.0K	torch/One_lstm_time.onnx
4.0K	torch/One_lstm_time_sim.onnx
8.0K	torch/Two_lstm_batch.onnx
4.0K	torch/Two_lstm_batch_sim.onnx
8.0K	torch/Two_lstm_time.onnx
4.0K	torch/Two_lstm_time_sim.onnx
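
The sizes above are coarse because `du` reports disk usage in filesystem blocks rather than exact byte counts. For a rough sense of how much of each file is weights, the parameter count can be worked out by hand. A minimal sketch, assuming the PyTorch layout with two bias vectors per layer and direction; `lstm_param_count` is an illustrative helper, not part of the original notebook.

```python
def lstm_param_count(input_size, hidden_size, num_layers=1, bidirectional=False):
    """Number of scalars in a torch.nn.LSTM-style stack (two bias vectors per gate)."""
    directions = 2 if bidirectional else 1
    total = 0
    layer_input = input_size
    for _ in range(num_layers):
        # 4 gates, each with input weights, recurrent weights and two biases
        per_direction = 4 * hidden_size * (layer_input + hidden_size + 2)
        total += directions * per_direction
        layer_input = hidden_size * directions   # next layer sees the concatenated output
    return total

for name, kwargs in [("one layer", dict(num_layers=1)),
                     ("two layers", dict(num_layers=2)),
                     ("two layers, bidirectional", dict(num_layers=2, bidirectional=True))]:
    n = lstm_param_count(3, 4, **kwargs)
    print(f"{name}: {n} parameters, {4 * n} bytes as float32")
```

With `in_channels=3` and `out_channels=4` this gives 144, 304 and 736 parameters, so the weights themselves account for well under 4 KB in every case.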

def onnx_infer(model_path,data):
    """Run an ONNX model on one input array and return its first output.

    Args:
        model_path (str): path to the .onnx file.
        data (np.ndarray): input array matching the model's first input.
    """
    onnx_session=onnxruntime.InferenceSession(model_path)
    input_name = onnx_session.get_inputs()[0].name
    output_name = onnx_session.get_outputs()[0].name
    result = onnx_session.run([output_name],{input_name:data})
    return result[0]

test_data = np.random.random((1,3,1,6)).astype(np.float32) # batch,channel,height,width
results={}

for i,onnx_path in enumerate(pytorch_onnx_paths):

    result = onnx_infer(onnx_path,test_data)
    results[os.path.basename(onnx_path)]=result

    if i%2 ==1:
        try:
            values = list(results.values())
            np.testing.assert_allclose(values[0],values[1],rtol=1e-7)
            print(f"{list(results.keys())} have same results")
        except AssertionError:
            print(f"{list(results.keys())} have different results")
        finally:
            results={}

['Bi_Two_lstm_batch.onnx', 'Bi_Two_lstm_batch_sim.onnx'] have same results
['Bi_Two_lstm_time.onnx', 'Bi_Two_lstm_time_sim.onnx'] have same results
['One_lstm_batch.onnx', 'One_lstm_batch_sim.onnx'] have same results
['One_lstm_time.onnx', 'One_lstm_time_sim.onnx'] have same results
['Two_lstm_batch.onnx', 'Two_lstm_batch_sim.onnx'] have same results
['Two_lstm_time.onnx', 'Two_lstm_time_sim.onnx'] have same results
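
The loop above pairs each model with its simplified version by relying on the sorted order of the directory listing (`X.onnx` directly followed by `X_sim.onnx`). A slightly more explicit alternative is to pair files by name. `pair_with_simplified` below is a hypothetical helper, not part of the original notebook:

```python
import os

def pair_with_simplified(paths, suffix="_sim"):
    """Return (original, simplified) path pairs, matched by file name."""
    available = set(paths)
    pairs = []
    for path in sorted(paths):
        stem, ext = os.path.splitext(path)
        if stem.endswith(suffix):
            continue                      # simplified files are picked up via their original
        candidate = stem + suffix + ext
        if candidate in available:
            pairs.append((path, candidate))
    return pairs

names = ["torch/One_lstm_batch.onnx", "torch/One_lstm_batch_sim.onnx",
         "torch/Two_lstm_time_sim.onnx", "torch/Two_lstm_time.onnx"]
for original, simplified in pair_with_simplified(names):
    print(original, "<->", simplified)
```

This keeps working if an unrelated file ends up in the directory, whereas the `i%2` test would silently start comparing the wrong pairs.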

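It appears that for PyTorch, each exported ONNX model and its simplified version produce the same results at a relative tolerance of 1e-7, which is somewhat tighter than what was achieved with Paddle.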
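
For reference, `np.testing.assert_allclose` checks `|actual - desired| <= atol + rtol * |desired|` with `atol=0` by default, so `rtol=1e-7` is a relative bound close to the resolution of float32. The sketch below uses made-up arrays (not model outputs) to show where the check passes and where it fails:

```python
import numpy as np

desired = np.array([0.5, -0.25, 0.125], dtype=np.float32)

# Identical values pass at any tolerance.
np.testing.assert_allclose(desired.copy(), desired, rtol=1e-7)

# A relative error of 1e-5 passes with rtol=1e-4 but fails with rtol=1e-7.
perturbed = desired.astype(np.float64) * (1 + 1e-5)
np.testing.assert_allclose(perturbed, desired, rtol=1e-4)
try:
    np.testing.assert_allclose(perturbed, desired, rtol=1e-7)
    strict_ok = True
except AssertionError:
    strict_ok = False
print("passes at rtol=1e-7:", strict_ok)
```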

Let's take a look at the ONNX graphs.

Type	Single layer	Two layers	Two layers, bidirectional
batch_first=True
batch_first=False

3 Generating LSTMs with TensorFlow 2

Here I use TensorFlow 2.8.

import os
import tensorflow as tf
import onnx
import tf2onnx
from onnxsim import simplify
import onnxruntime
import numpy as np
from tensorflow.keras import layers as nn
#only use cpu
devices = tf.config.list_physical_devices("CPU")
tf.config.set_visible_devices(devices)


2022-08-09 16:10:42.419611: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.

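Paddle and PyTorch return the output of every time step by default, whereas TensorFlow lets you choose between returning only the last step or all steps via return_sequences. To stay consistent, it is set to True here.
TensorFlow's input layout is B, H, W, C, so the models below are built on that basis.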
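
To make these two points concrete: the (batch, channel, height, width) test tensor of shape (1,3,1,6) used in the PyTorch section corresponds to a B,H,W,C tensor of shape (1,1,6,3), and `return_sequences` decides whether the layer returns every time step or only the last one. A NumPy sketch of the shapes only; the `all_steps` array is a stand-in for an LSTM output, not real LSTM values.

```python
import numpy as np

nchw = np.arange(18, dtype=np.float32).reshape(1, 3, 1, 6)   # batch, channel, height, width
nhwc = np.transpose(nchw, (0, 2, 3, 1))                       # batch, height, width, channel
print(nhwc.shape)                                             # (1, 1, 6, 3)

# After squeezing the height axis the LSTM sees (batch, time, feature).
sequence = np.squeeze(nhwc, axis=1)
print(sequence.shape)                                         # (1, 6, 3)

# Stand-in for an LSTM output with 4 units over 6 time steps.
all_steps = np.zeros((1, 6, 4), dtype=np.float32)   # shape with return_sequences=True
last_step = all_steps[:, -1, :]                     # shape with return_sequences=False
print(all_steps.shape, last_step.shape)             # (1, 6, 4) (1, 4)
```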

3.1 time_major=False

def One_LSTM_batch():
    input = nn.Input(shape=[1,6,3],batch_size=1,name="input")
    middle = tf.squeeze(input,axis=1)
    output = nn.LSTM(4,time_major=False,return_sequences=True,name='one')(middle)
    model = tf.keras.models.Model(input,output,name="One_LSTM_batch")
    return model
model = One_LSTM_batch()
#tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
model.save("tensorflow/One_LSTM_batch")
spec = (tf.TensorSpec((1,1,6,3),tf.float32,name="input"),)
output_path="tensorflow/"+model.name+'.onnx'
model_proto,_=tf2onnx.convert.from_keras(model,input_signature=spec,opset=11,output_path=output_path)
output_names=[n.name for n in model_proto.graph.output]
model = onnx.load(output_path)
model_sim ,check = simplify(model)
assert check,"simplified onnx model could not be validated"
save_path = output_path.split('.')[0]+"_sim.onnx"
onnx.save(model_sim,save_path)

def Two_LSTM_batch():
    input = nn.Input(shape=[1,6,3],batch_size=1,name="input")
    middle = tf.squeeze(input,axis=1)
    output1 = nn.LSTM(4,time_major=False,return_sequences=True,name='one')(middle)
    output = nn.LSTM(4,time_major=False,return_sequences=True,name='two')(output1)
    model = tf.keras.models.Model(input,output,name="Two_LSTM_batch")
    return model
model = Two_LSTM_batch()
#tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
model.save("tensorflow/Two_LSTM_batch")
spec = (tf.TensorSpec((1,1,6,3),tf.float32,name="input"),)
output_path="tensorflow/"+model.name+'.onnx'
model_proto,_=tf2onnx.convert.from_keras(model,input_signature=spec,opset=11,output_path=output_path)
output_names=[n.name for n in model_proto.graph.output]
model = onnx.load(output_path)
model_sim ,check = simplify(model)
assert check,"simplified onnx model could not be validated"
save_path = output_path.split('.')[0]+"_sim.onnx"
onnx.save(model_sim,save_path)

def Bi_Two_LSTM_batch():
    input = nn.Input(shape=[1,6,3],batch_size=1,name="input")
    middle = tf.squeeze(input,axis=1)
    output1 = nn.Bidirectional(nn.LSTM(4,time_major=False,return_sequences=True,name='one'),merge_mode="concat")(middle)
    output = nn.Bidirectional(nn.LSTM(4,time_major=False,return_sequences=True,name='two'),merge_mode="concat")(output1)
    model = tf.keras.models.Model(input,output,name="Bi_Two_LSTM_batch")
    return model
model = Bi_Two_LSTM_batch()
#tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
model.save("tensorflow/Bi_Two_LSTM_batch")
spec = (tf.TensorSpec((1,1,6,3),tf.float32,name="input"),)
output_path="tensorflow/"+model.name+'.onnx'
model_proto,_=tf2onnx.convert.from_keras(model,input_signature=spec,opset=11,output_path=output_path)
output_names=[n.name for n in model_proto.graph.output]
model = onnx.load(output_path)
model_sim ,check = simplify(model)
assert check,"simplified onnx model could not be validated"
save_path = output_path.split('.')[0]+"_sim.onnx"
onnx.save(model_sim,save_path)

WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.


WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
WARNING:absl:Found untraced functions such as lstm_cell_191_layer_call_fn, lstm_cell_191_layer_call_and_return_conditional_losses while saving (showing 2 of 2). These functions will not be directly callable after loading.


INFO:tensorflow:Assets written to: tensorflow/One_LSTM_batch/assets


INFO:tensorflow:Assets written to: tensorflow/One_LSTM_batch/assets
WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
2022-08-16 16:19:10.687955: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-16 16:19:10.688047: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-16 16:19:10.706293: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-16 16:19:10.707378: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-16 16:19:10.708381: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-16 16:19:10.709447: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
2022-08-16 16:19:10.719888: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
  function_optimizer: Graph size after: 87 nodes (60), 98 edges (68), time = 2.078ms.
  function_optimizer: Graph size after: 87 nodes (0), 98 edges (0), time = 1.092ms.
Optimization results for grappler item: while_cond_1209930
  function_optimizer: function_optimizer did nothing. time = 0.004ms.
  function_optimizer: function_optimizer did nothing. time = 0ms.
Optimization results for grappler item: while_body_1209931
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0ms.

2022-08-16 16:19:10.780564: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-16 16:19:10.780636: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-16 16:19:10.798557: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-16 16:19:10.799642: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-16 16:19:10.800648: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-16 16:19:10.801728: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
2022-08-16 16:19:10.812277: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
  constant_folding: Graph size after: 30 nodes (-23), 30 edges (-27), time = 1.377ms.
  function_optimizer: Graph size after: 30 nodes (0), 30 edges (0), time = 0.6ms.
  constant_folding: Graph size after: 30 nodes (0), 30 edges (0), time = 0.554ms.
  function_optimizer: Graph size after: 30 nodes (0), 30 edges (0), time = 0.592ms.
Optimization results for grappler item: while_cond_1209930
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.272ms.
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.182ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_1209931
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.766ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.646ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.



WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.


WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
WARNING:absl:Found untraced functions such as lstm_cell_192_layer_call_fn, lstm_cell_192_layer_call_and_return_conditional_losses, lstm_cell_193_layer_call_fn, lstm_cell_193_layer_call_and_return_conditional_losses while saving (showing 4 of 4). These functions will not be directly callable after loading.


INFO:tensorflow:Assets written to: tensorflow/Two_LSTM_batch/assets


INFO:tensorflow:Assets written to: tensorflow/Two_LSTM_batch/assets
WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
2022-08-16 16:19:16.650941: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-16 16:19:16.651052: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-16 16:19:16.669042: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-16 16:19:16.670119: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-16 16:19:16.671115: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-16 16:19:16.672189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
2022-08-16 16:19:16.689047: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
  function_optimizer: Graph size after: 170 nodes (120), 193 edges (136), time = 3.802ms.
  function_optimizer: Graph size after: 170 nodes (0), 193 edges (0), time = 2.068ms.
Optimization results for grappler item: while_cond_1221920
  function_optimizer: function_optimizer did nothing. time = 0.005ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_1221921
  function_optimizer: function_optimizer did nothing. time = 0.003ms.
  function_optimizer: function_optimizer did nothing. time = 0ms.
Optimization results for grappler item: while_body_1221499
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_cond_1221498
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0ms.

2022-08-16 16:19:16.788220: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-16 16:19:16.788291: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-16 16:19:16.812080: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-16 16:19:16.813181: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-16 16:19:16.814187: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-16 16:19:16.815259: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
2022-08-16 16:19:16.832669: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
  constant_folding: Graph size after: 56 nodes (-46), 57 edges (-54), time = 2.381ms.
  function_optimizer: Graph size after: 56 nodes (0), 57 edges (0), time = 1.085ms.
  constant_folding: Graph size after: 56 nodes (0), 57 edges (0), time = 0.99ms.
  function_optimizer: Graph size after: 56 nodes (0), 57 edges (0), time = 1.093ms.
Optimization results for grappler item: while_cond_1221920
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.284ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.182ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_1221921
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.779ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.65ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_1221499
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.776ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.645ms.
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
Optimization results for grappler item: while_cond_1221498
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.271ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.182ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.



WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.


WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
WARNING:absl:Found untraced functions such as lstm_cell_195_layer_call_fn, lstm_cell_195_layer_call_and_return_conditional_losses, lstm_cell_196_layer_call_fn, lstm_cell_196_layer_call_and_return_conditional_losses, lstm_cell_198_layer_call_fn while saving (showing 5 of 8). These functions will not be directly callable after loading.


INFO:tensorflow:Assets written to: tensorflow/Bi_Two_LSTM_batch/assets


INFO:tensorflow:Assets written to: tensorflow/Bi_Two_LSTM_batch/assets
WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
2022-08-16 16:19:31.962845: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-16 16:19:31.962974: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-16 16:19:31.980983: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-16 16:19:31.982057: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-16 16:19:31.983048: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-16 16:19:31.984113: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
2022-08-16 16:19:32.016037: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
  function_optimizer: Graph size after: 348 nodes (244), 397 edges (276), time = 8.373ms.
  function_optimizer: Graph size after: 348 nodes (0), 397 edges (0), time = 4.505ms.
Optimization results for grappler item: while_cond_1256176
  function_optimizer: function_optimizer did nothing. time = 0.004ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_1255323
  function_optimizer: function_optimizer did nothing. time = 0.003ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_cond_1255322
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_1256601
  function_optimizer: function_optimizer did nothing. time = 0.003ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_cond_1255746
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_1256177
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_1255747
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_cond_1256600
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.

2022-08-16 16:19:32.194241: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-16 16:19:32.194314: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-16 16:19:32.212231: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-16 16:19:32.213308: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-16 16:19:32.214300: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-16 16:19:32.215369: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
2022-08-16 16:19:32.248503: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
  constant_folding: Graph size after: 120 nodes (-92), 125 edges (-108), time = 4.738ms.
  function_optimizer: Graph size after: 120 nodes (0), 125 edges (0), time = 2.234ms.
  constant_folding: Graph size after: 120 nodes (0), 125 edges (0), time = 2.266ms.
  function_optimizer: Graph size after: 120 nodes (0), 125 edges (0), time = 2.25ms.
Optimization results for grappler item: while_cond_1256176
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.274ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.181ms.
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
Optimization results for grappler item: while_body_1255323
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.79ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.647ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_cond_1255322
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.266ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.18ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_1256601
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.777ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.654ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_cond_1255746
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.276ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.181ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_1256177
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.78ms.
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.651ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_1255747
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.772ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.649ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_cond_1256600
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.277ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.181ms.
  function_optimizer: function_optimizer did nothing. time = 0ms.

3.2 time_major=True

def One_LSTM_time():
    # Input is (batch=1, 1, 6, 3); squeezing axis 1 gives (1, 6, 3).
    input = nn.Input(shape=[1, 6, 3], batch_size=1, name="input")
    middle1 = tf.squeeze(input, axis=1)
    # Switch to the time-major layout (timesteps, batch, features) = (6, 1, 3).
    middle = tf.transpose(middle1, [1, 0, 2])
    output = nn.LSTM(4, time_major=True, return_sequences=True, name='one')(middle)
    model = tf.keras.models.Model(input, output, name="One_LSTM_time")
    return model

model = One_LSTM_time()
#tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
model.save("tensorflow/One_LSTM_time")

# Convert the Keras model to ONNX (opset 11) with a fixed input signature.
spec = (tf.TensorSpec((1, 1, 6, 3), tf.float32, name="input"),)
output_path = "tensorflow/" + model.name + '.onnx'
model_proto, _ = tf2onnx.convert.from_keras(model, input_signature=spec, opset=11, output_path=output_path)
output_names = [n.name for n in model_proto.graph.output]

# Simplify the exported graph and save it next to the original file.
# Note: from here on `model` refers to the loaded ONNX model, not the Keras model.
model = onnx.load(output_path)
model_sim, check = simplify(model)
assert check, "simplified onnx model could not be validated"
save_path = output_path.split('.')[0] + "_sim.onnx"
onnx.save(model_sim, save_path)

def Two_LSTM_time():
    # Same input handling as One_LSTM_time: (1, 1, 6, 3) -> (1, 6, 3) -> (6, 1, 3).
    input = nn.Input(shape=[1, 6, 3], batch_size=1, name="input")
    middle1 = tf.squeeze(input, axis=1)
    middle = tf.transpose(middle1, [1, 0, 2])
    # Two stacked LSTM layers; the first returns the full sequence so the second can consume it.
    output1 = nn.LSTM(4, time_major=True, return_sequences=True, name='one')(middle)
    output = nn.LSTM(4, time_major=True, return_sequences=True, name='two')(output1)
    model = tf.keras.models.Model(input, output, name="Two_LSTM_time")
    return model

model = Two_LSTM_time()
#tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
model.save("tensorflow/Two_LSTM_time")

# Convert the Keras model to ONNX (opset 11) with a fixed input signature.
spec = (tf.TensorSpec((1, 1, 6, 3), tf.float32, name="input"),)
output_path = "tensorflow/" + model.name + '.onnx'
model_proto, _ = tf2onnx.convert.from_keras(model, input_signature=spec, opset=11, output_path=output_path)
output_names = [n.name for n in model_proto.graph.output]

# Simplify the exported graph and save it next to the original file.
# Note: from here on `model` refers to the loaded ONNX model, not the Keras model.
model = onnx.load(output_path)
model_sim, check = simplify(model)
assert check, "simplified onnx model could not be validated"
save_path = output_path.split('.')[0] + "_sim.onnx"
onnx.save(model_sim, save_path)

def Bi_Two_LSTM_time():
    # Same input handling as One_LSTM_time: (1, 1, 6, 3) -> (1, 6, 3) -> (6, 1, 3).
    input = nn.Input(shape=[1, 6, 3], batch_size=1, name="input")
    middle1 = tf.squeeze(input, axis=1)
    middle = tf.transpose(middle1, [1, 0, 2])
    # Two stacked bidirectional LSTM layers; merge_mode="concat" joins the forward and
    # backward outputs, so each layer emits 8 features per timestep.
    output1 = nn.Bidirectional(nn.LSTM(4, time_major=True, return_sequences=True, name='one'), merge_mode="concat")(middle)
    output = nn.Bidirectional(nn.LSTM(4, time_major=True, return_sequences=True, name='two'), merge_mode="concat")(output1)
    model = tf.keras.models.Model(input, output, name="Bi_Two_LSTM_time")
    return model

model = Bi_Two_LSTM_time()
#tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
model.save("tensorflow/Bi_Two_LSTM_time")

# Convert the Keras model to ONNX (opset 11) with a fixed input signature.
spec = (tf.TensorSpec((1, 1, 6, 3), tf.float32, name="input"),)
output_path = "tensorflow/" + model.name + '.onnx'
model_proto, _ = tf2onnx.convert.from_keras(model, input_signature=spec, opset=11, output_path=output_path)
output_names = [n.name for n in model_proto.graph.output]

# Simplify the exported graph and save it next to the original file.
# Note: from here on `model` refers to the loaded ONNX model, not the Keras model.
model = onnx.load(output_path)
model_sim, check = simplify(model)
assert check, "simplified onnx model could not be validated"
save_path = output_path.split('.')[0] + "_sim.onnx"
onnx.save(model_sim, save_path)
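
The three models above all squeeze the input and then transpose it before the LSTM. As a rough illustration of what those two steps do to the layout, here is a pure-Python sketch on nested lists. The helper names (`shape`, `squeeze_axis1`, `to_time_major`) are made up for this example and are not TensorFlow APIs; the sketch only mimics the effect of `tf.squeeze(input, axis=1)` followed by `tf.transpose(middle1, [1, 0, 2])` on a (1, 1, 6, 3) input.

```python
# Pure-Python illustration only; the helper names are hypothetical, not TensorFlow APIs.

def shape(x):
    """Return the shape of a regular (non-ragged) nested list as a tuple."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0]
    return tuple(dims)

def squeeze_axis1(x):
    """Drop a size-1 axis at position 1: (B, 1, T, F) -> (B, T, F)."""
    assert all(len(sample) == 1 for sample in x), "axis 1 must have size 1"
    return [sample[0] for sample in x]

def to_time_major(x):
    """Swap the first two axes: (B, T, F) -> (T, B, F), like transpose with perm [1, 0, 2]."""
    batch, steps = len(x), len(x[0])
    return [[x[b][t] for b in range(batch)] for t in range(steps)]

# Dummy data shaped like the models' input: (batch=1, 1, timesteps=6, features=3).
data = [[[[float(t * 3 + f) for f in range(3)] for t in range(6)]]]

batch_major = squeeze_axis1(data)        # (1, 6, 3)
time_major = to_time_major(batch_major)  # (6, 1, 3)

print(shape(data), shape(batch_major), shape(time_major))
```

In the time-major layout the leading axis indexes timesteps, so `time_major[t][0]` holds the same feature vector as `batch_major[0][t]`. This is the layout that `time_major=True` tells the Keras LSTM layer to expect.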

WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
WARNING:absl:Found untraced functions such as lstm_cell_8_layer_call_fn, lstm_cell_8_layer_call_and_return_conditional_losses while saving (showing 2 of 2). These functions will not be directly callable after loading.


INFO:tensorflow:Assets written to: tensorflow/One_LSTM_time/assets
WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
2022-08-16 15:43:52.434772: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-16 15:43:52.434862: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-16 15:43:52.452892: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-16 15:43:52.453967: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-16 15:43:52.454957: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-16 15:43:52.456033: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
2022-08-16 15:43:52.465981: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
  function_optimizer: Graph size after: 85 nodes (56), 96 edges (64), time = 1.955ms.
  function_optimizer: Graph size after: 85 nodes (0), 96 edges (0), time = 1.076ms.
Optimization results for grappler item: while_cond_48297
  function_optimizer: function_optimizer did nothing. time = 0.004ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_48298
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0ms.

2022-08-16 15:43:52.520244: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-16 15:43:52.520309: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-16 15:43:52.538162: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-16 15:43:52.539247: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-16 15:43:52.540262: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-16 15:43:52.541338: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
2022-08-16 15:43:52.551627: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
  constant_folding: Graph size after: 28 nodes (-23), 28 edges (-27), time = 1.298ms.
  function_optimizer: Graph size after: 28 nodes (0), 28 edges (0), time = 0.575ms.
  constant_folding: Graph size after: 28 nodes (0), 28 edges (0), time = 0.512ms.
  function_optimizer: Graph size after: 28 nodes (0), 28 edges (0), time = 0.588ms.
Optimization results for grappler item: while_cond_48297
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.269ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.179ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_48298
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.769ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.642ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.



WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
WARNING:absl:Found untraced functions such as lstm_cell_9_layer_call_fn, lstm_cell_9_layer_call_and_return_conditional_losses, lstm_cell_10_layer_call_fn, lstm_cell_10_layer_call_and_return_conditional_losses while saving (showing 4 of 4). These functions will not be directly callable after loading.


INFO:tensorflow:Assets written to: tensorflow/Two_LSTM_time/assets
WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
2022-08-16 15:43:57.663352: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-16 15:43:57.663442: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-16 15:43:57.681413: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-16 15:43:57.682504: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-16 15:43:57.683492: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-16 15:43:57.684558: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
2022-08-16 15:43:57.701027: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
  function_optimizer: Graph size after: 164 nodes (112), 187 edges (128), time = 3.95ms.
  function_optimizer: Graph size after: 164 nodes (0), 187 edges (0), time = 2.055ms.
Optimization results for grappler item: while_cond_59528
  function_optimizer: function_optimizer did nothing. time = 0.004ms.
  function_optimizer: function_optimizer did nothing. time = 0ms.
Optimization results for grappler item: while_body_59937
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0ms.
Optimization results for grappler item: while_cond_59936
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_59529
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0ms.

2022-08-16 15:43:57.789964: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-16 15:43:57.790031: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-16 15:43:57.807917: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-16 15:43:57.809002: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-16 15:43:57.809991: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-16 15:43:57.811055: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
2022-08-16 15:43:57.832990: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
  constant_folding: Graph size after: 50 nodes (-46), 51 edges (-54), time = 2.266ms.
  function_optimizer: Graph size after: 50 nodes (0), 51 edges (0), time = 1.037ms.
  constant_folding: Graph size after: 50 nodes (0), 51 edges (0), time = 0.898ms.
  function_optimizer: Graph size after: 50 nodes (0), 51 edges (0), time = 1.077ms.
Optimization results for grappler item: while_cond_59528
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.274ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.182ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_59937
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.767ms.
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.655ms.
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
Optimization results for grappler item: while_cond_59936
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.261ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.183ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_59529
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 2.66ms.
  function_optimizer: function_optimizer did nothing. time = 0.004ms.
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 2.102ms.
  function_optimizer: function_optimizer did nothing. time = 0.002ms.



WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
WARNING:absl:Found untraced functions such as lstm_cell_12_layer_call_fn, lstm_cell_12_layer_call_and_return_conditional_losses, lstm_cell_13_layer_call_fn, lstm_cell_13_layer_call_and_return_conditional_losses, lstm_cell_15_layer_call_fn while saving (showing 5 of 8). These functions will not be directly callable after loading.


INFO:tensorflow:Assets written to: tensorflow/Bi_Two_LSTM_time/assets
WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
2022-08-16 15:44:12.614816: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-16 15:44:12.614936: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-16 15:44:12.633044: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-16 15:44:12.634138: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-16 15:44:12.635134: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-16 15:44:12.636200: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
2022-08-16 15:44:12.667021: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
  function_optimizer: Graph size after: 334 nodes (228), 383 edges (260), time = 8.111ms.
  function_optimizer: Graph size after: 334 nodes (0), 383 edges (0), time = 4.232ms.
Optimization results for grappler item: while_body_93140
  function_optimizer: function_optimizer did nothing. time = 0.005ms.
  function_optimizer: function_optimizer did nothing. time = 0ms.
Optimization results for grappler item: while_body_93550
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0ms.
Optimization results for grappler item: while_cond_92313
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_cond_93139
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_cond_93549
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_92314
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_cond_92723
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_92724
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.

2022-08-16 15:44:12.837667: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-16 15:44:12.837739: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-16 15:44:12.855667: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-16 15:44:12.856749: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-16 15:44:12.857738: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-16 15:44:12.858802: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
2022-08-16 15:44:12.896488: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
  constant_folding: Graph size after: 106 nodes (-92), 111 edges (-108), time = 4.422ms.
  function_optimizer: Graph size after: 106 nodes (0), 111 edges (0), time = 2.035ms.
  constant_folding: Graph size after: 106 nodes (0), 111 edges (0), time = 1.936ms.
  function_optimizer: Graph size after: 106 nodes (0), 111 edges (0), time = 2.095ms.
Optimization results for grappler item: while_body_93140
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.783ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.648ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_93550
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.778ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.65ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_cond_92313
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.265ms.
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.181ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_cond_93139
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.258ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.179ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_cond_93549
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.893ms.
  function_optimizer: function_optimizer did nothing. time = 0.004ms.
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.633ms.
  function_optimizer: function_optimizer did nothing. time = 0.004ms.
Optimization results for grappler item: while_body_92314
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 1.257ms.
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.959ms.
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
Optimization results for grappler item: while_cond_92723
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.377ms.
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.274ms.
  function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: while_body_92724
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 1.113ms.
  function_optimizer: function_optimizer did nothing. time = 0.002ms.
  constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.952ms.
  function_optimizer: function_optimizer did nothing. time = 0.002ms.

3.3 return_state=True

With return_state=True, the state returned by one layer can be passed to the next layer as its initial state.

def One_LSTM_time_state():
    input = nn.Input(shape=[1,6,3],batch_size=1,name="input")
    middle1 = tf.squeeze(input,axis=1)
    middle = tf.transpose(middle1,[1,0,2])
    output,h_state,c_state = nn.LSTM(4,time_major=True,return_sequences=True,return_state=True,name='one')(middle)
    model = tf.keras.models.Model(inputs=input,outputs=[output,h_state,c_state],name="One_LSTM_time_state")
    return model
model = One_LSTM_time_state()
#tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
model.save("tensorflow/One_LSTM_time_state")
spec = (tf.TensorSpec((1,1,6,3),tf.float32,name="input"),)
output_path="tensorflow/"+model.name+'.onnx'
model_proto,_=tf2onnx.convert.from_keras(model,input_signature=spec,opset=11,output_path=output_path)
output_names=[n.name for n in model_proto.graph.output]
model = onnx.load(output_path)
model_sim ,check = simplify(model)
assert check,"simplified onnx model could not be validated"
save_path = output_path.split('.')[0]+"_sim.onnx"
onnx.save(model_sim,save_path)

def Two_LSTM_time_state():
    input = nn.Input(shape=[1,6,3],batch_size=1,name="input")
    middle1 = tf.squeeze(input,axis=1)
    middle = tf.transpose(middle1,[1,0,2])
    output1,h_state,c_state = nn.LSTM(4,time_major=True,return_sequences=True,return_state=True,name='one')(middle)
    output,h_state1,c_state1 = nn.LSTM(4,time_major=True,return_sequences=True,return_state=True,name='two')(output1,initial_state=(h_state,c_state))
    model = tf.keras.models.Model(inputs=input,outputs=[output,h_state1,c_state1],name="Two_LSTM_time_state")
    return model
model = Two_LSTM_time_state()
#tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
model.save("tensorflow/Two_LSTM_time_state")
spec = (tf.TensorSpec((1,1,6,3),tf.float32,name="input"),)
output_path="tensorflow/"+model.name+'.onnx'

model_proto,_=tf2onnx.convert.from_keras(model,input_signature=spec,opset=11,output_path=output_path)
output_names=[n.name for n in model_proto.graph.output]
model = onnx.load(output_path)
model_sim ,check = simplify(model)
assert check,"simplified onnx model could not be validated"
save_path = output_path.split('.')[0]+"_sim.onnx"
onnx.save(model_sim,save_path)

def Bi_Two_LSTM_time_state():
    input = nn.Input(shape=[1,6,3],batch_size=1,name="input")
    middle1 = tf.squeeze(input,axis=1)
    middle = tf.transpose(middle1,[1,0,2])
    output1,h_state,c_state,h_state1,c_state1= nn.Bidirectional(nn.LSTM(4,time_major=True,return_sequences=True,return_state=True,name='one'),merge_mode="concat")(middle)
    output= nn.Bidirectional(nn.LSTM(4,time_major=True,return_sequences=True,return_state=True,name='two'),merge_mode="concat")(output1,initial_state=(h_state,c_state,h_state1,c_state1))
    model = tf.keras.models.Model(inputs=input,outputs=output,name="Bi_Two_LSTM_time_state")
    return model
model = Bi_Two_LSTM_time_state()
#tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
model.save("tensorflow/Bi_Two_LSTM_time_state")
spec = (tf.TensorSpec((1,1,6,3),tf.float32,name="input"),)
output_path="tensorflow/"+model.name+'.onnx'
model_proto,_=tf2onnx.convert.from_keras(model,input_signature=spec,opset=11,output_path=output_path)
output_names=[n.name for n in model_proto.graph.output]
model = onnx.load(output_path)
model_sim ,check = simplify(model)
assert check,"simplified onnx model could not be validated"
save_path = output_path.split('.')[0]+"_sim.onnx"
onnx.save(model_sim,save_path)
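The export cells above build the simplified model's filename with `output_path.split('.')[0]`, which works here only because neither the directory nor the model name contains a dot. As a sketch of a slightly more robust variant (the helper name `sim_path` is hypothetical and not part of the original code), `os.path.splitext` strips only the final extension:

```python
import os

def sim_path(onnx_path, suffix="_sim"):
    """Return the path for the simplified model, e.g. a.onnx -> a_sim.onnx."""
    root, ext = os.path.splitext(onnx_path)  # splits off only the last extension
    return root + suffix + ext

print(sim_path("tensorflow/One_LSTM_time_state.onnx"))
# A leading "./" would make split('.')[0] return an empty string;
# splitext keeps the whole path intact.
print(sim_path("./tensorflow/Two_LSTM_time_state.onnx"))
```

With the paths used in this article both approaches give the same filenames, so this only matters if the output directory is changed.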

WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.


WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
WARNING:absl:Found untraced functions such as lstm_cell_151_layer_call_fn, lstm_cell_151_layer_call_and_return_conditional_losses while saving (showing 2 of 2). These functions will not be directly callable after loading.


INFO:tensorflow:Assets written to: tensorflow/One_LSTM_time_state/assets


INFO:tensorflow:Assets written to: tensorflow/One_LSTM_time_state/assets
2022-08-10 10:27:29.569744: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-10 10:27:29.569887: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-10 10:27:29.587666: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-10 10:27:29.588730: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-10 10:27:29.589789: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-10 10:27:29.590839: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
2022-08-10 10:27:29.657141: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-10 10:27:29.657227: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-10 10:27:29.674878: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-10 10:27:29.675934: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-10 10:27:29.676968: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-10 10:27:29.678019: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0


WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.


WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
WARNING:absl:Found untraced functions such as lstm_cell_152_layer_call_fn, lstm_cell_152_layer_call_and_return_conditional_losses, lstm_cell_153_layer_call_fn, lstm_cell_153_layer_call_and_return_conditional_losses while saving (showing 4 of 4). These functions will not be directly callable after loading.


INFO:tensorflow:Assets written to: tensorflow/Two_LSTM_time_state/assets


INFO:tensorflow:Assets written to: tensorflow/Two_LSTM_time_state/assets
2022-08-10 10:27:34.989735: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-10 10:27:34.989854: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-10 10:27:35.007588: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-10 10:27:35.008641: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-10 10:27:35.009675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-10 10:27:35.010708: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
2022-08-10 10:27:35.118860: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-10 10:27:35.118956: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-10 10:27:35.136643: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-10 10:27:35.137703: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-10 10:27:35.138736: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-10 10:27:35.139769: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0


WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.


WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
WARNING:absl:Found untraced functions such as lstm_cell_155_layer_call_fn, lstm_cell_155_layer_call_and_return_conditional_losses, lstm_cell_156_layer_call_fn, lstm_cell_156_layer_call_and_return_conditional_losses, lstm_cell_158_layer_call_fn while saving (showing 5 of 8). These functions will not be directly callable after loading.


INFO:tensorflow:Assets written to: tensorflow/Bi_Two_LSTM_time_state/assets


INFO:tensorflow:Assets written to: tensorflow/Bi_Two_LSTM_time_state/assets
2022-08-10 10:27:50.572328: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-10 10:27:50.572459: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-10 10:27:50.590225: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-10 10:27:50.591290: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-10 10:27:50.592334: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-10 10:27:50.593388: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
2022-08-10 10:27:50.800566: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
2022-08-10 10:27:50.800672: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2022-08-10 10:27:50.818458: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
2022-08-10 10:27:50.819552: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
2022-08-10 10:27:50.820605: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
2022-08-10 10:27:50.821643: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0

3.4 Inspecting the generated models

tf_models = sorted(os.listdir('tensorflow'))
tf_models_path=[os.path.join('tensorflow',p) for p in tf_models if p.endswith('onnx')]
print(tf_models)

['Bi_Two_LSTM_batch', 'Bi_Two_LSTM_batch.onnx', 'Bi_Two_LSTM_batch_sim.onnx', 'Bi_Two_LSTM_time', 'Bi_Two_LSTM_time.onnx', 'Bi_Two_LSTM_time_sim.onnx', 'Bi_Two_LSTM_time_state', 'Bi_Two_LSTM_time_state.onnx', 'Bi_Two_LSTM_time_state_sim.onnx', 'One_LSTM_batch', 'One_LSTM_batch.onnx', 'One_LSTM_batch_sim.onnx', 'One_LSTM_time', 'One_LSTM_time.onnx', 'One_LSTM_time_sim.onnx', 'One_LSTM_time_state', 'One_LSTM_time_state.onnx', 'One_LSTM_time_state_sim.onnx', 'Two_LSTM_batch', 'Two_LSTM_batch.onnx', 'Two_LSTM_batch_sim.onnx', 'Two_LSTM_time', 'Two_LSTM_time.onnx', 'Two_LSTM_time_sim.onnx', 'Two_LSTM_time_state', 'Two_LSTM_time_state.onnx', 'Two_LSTM_time_state_sim.onnx']

tf_models_path

['tensorflow/Bi_Two_LSTM_batch.onnx',
 'tensorflow/Bi_Two_LSTM_batch_sim.onnx',
 'tensorflow/Bi_Two_LSTM_time.onnx',
 'tensorflow/Bi_Two_LSTM_time_sim.onnx',
 'tensorflow/Bi_Two_LSTM_time_state.onnx',
 'tensorflow/Bi_Two_LSTM_time_state_sim.onnx',
 'tensorflow/One_LSTM_batch.onnx',
 'tensorflow/One_LSTM_batch_sim.onnx',
 'tensorflow/One_LSTM_time.onnx',
 'tensorflow/One_LSTM_time_sim.onnx',
 'tensorflow/One_LSTM_time_state.onnx',
 'tensorflow/One_LSTM_time_state_sim.onnx',
 'tensorflow/Two_LSTM_batch.onnx',
 'tensorflow/Two_LSTM_batch_sim.onnx',
 'tensorflow/Two_LSTM_time.onnx',
 'tensorflow/Two_LSTM_time_sim.onnx',
 'tensorflow/Two_LSTM_time_state.onnx',
 'tensorflow/Two_LSTM_time_state_sim.onnx']

! du -sh tensorflow/*

4.2M	tensorflow/Bi_Two_LSTM_batch
8.0K	tensorflow/Bi_Two_LSTM_batch.onnx
8.0K	tensorflow/Bi_Two_LSTM_batch_sim.onnx
4.0M	tensorflow/Bi_Two_LSTM_time
8.0K	tensorflow/Bi_Two_LSTM_time.onnx
8.0K	tensorflow/Bi_Two_LSTM_time_sim.onnx
4.0M	tensorflow/Bi_Two_LSTM_time_state
8.0K	tensorflow/Bi_Two_LSTM_time_state.onnx
8.0K	tensorflow/Bi_Two_LSTM_time_state_sim.onnx
708K	tensorflow/One_LSTM_batch
4.0K	tensorflow/One_LSTM_batch.onnx
4.0K	tensorflow/One_LSTM_batch_sim.onnx
684K	tensorflow/One_LSTM_time
4.0K	tensorflow/One_LSTM_time.onnx
4.0K	tensorflow/One_LSTM_time_sim.onnx
696K	tensorflow/One_LSTM_time_state
4.0K	tensorflow/One_LSTM_time_state.onnx
4.0K	tensorflow/One_LSTM_time_state_sim.onnx
1.4M	tensorflow/Two_LSTM_batch
4.0K	tensorflow/Two_LSTM_batch.onnx
4.0K	tensorflow/Two_LSTM_batch_sim.onnx
1.3M	tensorflow/Two_LSTM_time
4.0K	tensorflow/Two_LSTM_time.onnx
4.0K	tensorflow/Two_LSTM_time_sim.onnx
1.3M	tensorflow/Two_LSTM_time_state
4.0K	tensorflow/Two_LSTM_time_state.onnx
4.0K	tensorflow/Two_LSTM_time_state_sim.onnx

def onnx_infer(model_path,data):
    """Run an ONNX model with onnxruntime and return its first output.

    Args:
        model_path (str): path to the .onnx file
        data (np.ndarray): array fed to the model's first input
    """
    onnx_session=onnxruntime.InferenceSession(model_path)
    input_name = onnx_session.get_inputs()[0].name
    output_name = onnx_session.get_outputs()[0].name
    result = onnx_session.run([output_name],{input_name:data})
    return result[0]

# tf_models_path=["tensorflow/One_LSTM_time.onnx"]  # uncomment to check a single model only

test_data = np.random.random(size=(1,1,6,3)).astype(np.float32) # batch, squeezed dim, time steps, features
for i,onnx_path in enumerate(tf_models_path):
    base_path = os.path.splitext(onnx_path)[0]
    # skip the *_sim.onnx files; each one is checked together with its original model
    if not base_path.endswith('sim'):
        results={}
        onnx_sim=base_path+'_sim.onnx'
        # the SavedModel directory has the same name as the .onnx file without its extension
        tf_result = tf.keras.models.load_model(base_path)(tf.convert_to_tensor(test_data))
        # print(f'base_path:{base_path} len:{len(tf_result)} type:{type(tf_result)}')
        # models built with return_state=True return a list; only the first output is compared
        if isinstance(tf_result,list):
            tf_result=tf_result[0].numpy()
        results[os.path.basename(base_path)]=tf_result
        onnx_result = onnx_infer(onnx_path,test_data)
        results[os.path.basename(onnx_path)]=onnx_result
        sim_result = onnx_infer(onnx_sim,test_data)
        results[os.path.basename(onnx_sim)]=sim_result
        try:
            values = list(results.values())
            np.testing.assert_allclose(values[0],values[1],rtol=1e-5)
            np.testing.assert_allclose(values[1],values[2],rtol=1e-5)
            print(f"{list(results.keys())} have same results")
        except AssertionError:
            # assert_allclose raises AssertionError when the outputs differ
            print(f"{list(results.keys())} have different results")
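The check above relies on `np.testing.assert_allclose` with `rtol=1e-5` and the default `atol=0`, which passes when `|actual - desired| <= atol + rtol * |desired|` holds for every element. As a rough illustration only, the same criterion can be written in plain Python (the function name `within_tolerance` is hypothetical, and the short flat lists are made-up stand-ins for model outputs):

```python
def within_tolerance(actual, desired, rtol=1e-5, atol=0.0):
    """Element-wise |a - d| <= atol + rtol * |d| over two flat sequences."""
    if len(actual) != len(desired):
        return False
    return all(abs(a - d) <= atol + rtol * abs(d) for a, d in zip(actual, desired))

keras_out = [0.123456, -0.500000, 0.031250]       # made-up reference values
onnx_out = [0.1234561, -0.5000004, 0.0312501]     # differs around the 7th digit
print(within_tolerance(onnx_out, keras_out))                 # True
print(within_tolerance([0.1235, -0.5, 0.03125], keras_out))  # False
```

Because `atol` is 0, the allowed error shrinks with the magnitude of the reference value, so outputs very close to zero must match almost exactly.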

WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


['Bi_Two_LSTM_batch', 'Bi_Two_LSTM_batch.onnx', 'Bi_Two_LSTM_batch_sim.onnx'] have same results
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


['Bi_Two_LSTM_time', 'Bi_Two_LSTM_time.onnx', 'Bi_Two_LSTM_time_sim.onnx'] have same results
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


['Bi_Two_LSTM_time_state', 'Bi_Two_LSTM_time_state.onnx', 'Bi_Two_LSTM_time_state_sim.onnx'] have same results
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


['One_LSTM_batch', 'One_LSTM_batch.onnx', 'One_LSTM_batch_sim.onnx'] have same results
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


['One_LSTM_time', 'One_LSTM_time.onnx', 'One_LSTM_time_sim.onnx'] have same results
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


['One_LSTM_time_state', 'One_LSTM_time_state.onnx', 'One_LSTM_time_state_sim.onnx'] have same results
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


['Two_LSTM_batch', 'Two_LSTM_batch.onnx', 'Two_LSTM_batch_sim.onnx'] have same results
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


['Two_LSTM_time', 'Two_LSTM_time.onnx', 'Two_LSTM_time_sim.onnx'] have same results
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.


['Two_LSTM_time_state', 'Two_LSTM_time_state.onnx', 'Two_LSTM_time_state_sim.onnx'] have same results

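Inspecting the ONNX structure:
TensorFlow differs from the previous two frameworks, which return the state by default: in TensorFlow you choose whether the state is returned through return_state. With return_sequences=True the layer outputs every time step rather than only the last one, and return_state=True adds the final hidden and cell states.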

Type	Single layer	Two layers	Two-layer bidirectional
time_major=False,return_state=False
time_major=True,return_state=False
time_major=True,return_state=True
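To make the difference between the per-step outputs and the returned state concrete, here is a minimal sketch of a one-unit LSTM in plain Python. It assumes the standard LSTM gate equations; the weights are made up, and the function `lstm_scalar` is hypothetical rather than any framework's API. It also shows the final state of one layer being reused as the initial state of a second one, as in Two_LSTM_time_state.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_scalar(xs, w, u, b, h0=0.0, c0=0.0):
    """Run a 1-unit LSTM over the sequence xs.

    w, u, b map each gate name ('i', 'f', 'g', 'o') to its input weight,
    recurrent weight and bias. Returns (all hidden states, final h, final c).
    """
    h, c = h0, c0
    outputs = []
    for x in xs:
        i = sigmoid(w['i'] * x + u['i'] * h + b['i'])    # input gate
        f = sigmoid(w['f'] * x + u['f'] * h + b['f'])    # forget gate
        g = math.tanh(w['g'] * x + u['g'] * h + b['g'])  # candidate cell value
        o = sigmoid(w['o'] * x + u['o'] * h + b['o'])    # output gate
        c = f * c + i * g
        h = o * math.tanh(c)
        outputs.append(h)                                # one output per step
    return outputs, h, c

# Made-up parameters, for illustration only.
w = {'i': 0.5, 'f': -0.3, 'g': 0.8, 'o': 0.1}
u = {'i': 0.2, 'f': 0.4, 'g': -0.6, 'o': 0.7}
b = {'i': 0.0, 'f': 1.0, 'g': 0.0, 'o': 0.0}

seq, h_last, c_last = lstm_scalar([0.1, -0.2, 0.3, 0.4, -0.5, 0.6], w, u, b)
print(len(seq))            # 6: the full sequence of outputs
print(seq[-1] == h_last)   # True: the last output is the final hidden state

# Second layer started from the first layer's final state.
seq2, h2, c2 = lstm_scalar(seq, w, u, b, h0=h_last, c0=c_last)
```

The cell state `c_last` is the only value that cannot be read off the output sequence, which is why it has to be requested explicitly.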

4 LSTM conversion tool

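A small tool is available at https://gitee.com/tdddeeel/lstm_pytorch_tensorflow_paddle that converts a bidirectional LSTM into two unidirectional LSTMs.
The two-layer bidirectional LSTMs from PyTorch (batch_first=True) and TensorFlow (time_major=True) are used as examples here. The accuracy of the ONNX models before and after conversion can be compared in the same way as in the examples above; in testing no problems were found. At present the tool only works on Paddle and PyTorch models.
Usage: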

./bilstm_opt --onnx_path ... --save_path ..
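The idea behind such a split can be illustrated with a toy example. The sketch below does not reproduce how `bilstm_opt` rewrites the ONNX graph; it only shows, under simplifying assumptions (a one-unit tanh RNN standing in for the LSTM, made-up weights, hypothetical function names), that the backward half of a bidirectional layer equals a forward layer applied to the reversed input with its output reversed again.

```python
import math

def run_rnn(xs, w, u, reverse=False):
    """Toy 1-unit tanh RNN. With reverse=True it walks the sequence backwards
    but still stores each output at its original time index."""
    n = len(xs)
    out = [0.0] * n
    h = 0.0
    order = range(n - 1, -1, -1) if reverse else range(n)
    for t in order:
        h = math.tanh(w * xs[t] + u * h)
        out[t] = h
    return out

def bidirectional(xs, fw, bw):
    """Reference: one bidirectional layer, outputs paired per time step."""
    f = run_rnn(xs, *fw)
    b = run_rnn(xs, *bw, reverse=True)
    return [(f[t], b[t]) for t in range(len(xs))]

def split_into_two(xs, fw, bw):
    """Same result from two forward-only layers:
    reverse the input, run a forward layer, reverse its output."""
    f = run_rnn(xs, *fw)
    b = run_rnn(xs[::-1], *bw)[::-1]
    return [(f[t], b[t]) for t in range(len(xs))]

xs = [0.1, -0.2, 0.3, 0.4, -0.5, 0.6]
fw, bw = (0.5, 0.2), (-0.7, 0.3)   # (input weight, recurrent weight), made up
assert bidirectional(xs, fw, bw) == split_into_two(xs, fw, bw)
```

Both variants perform the same arithmetic in the same order, so the results match exactly here; with real converted models a tolerance-based comparison like the one in section 3.4 is the safer check.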

The resulting ONNX graphs:

Type	Before conversion	After conversion
pytorch

Original article: https://blog.csdn.net/u011119817/article/details/126465541