• Implementing LSTM in Different Frameworks and Converting to ONNX



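    This article implements LSTM in three frameworks (Paddle, PyTorch, and TensorFlow 2) in three forms: single-layer, two-layer, and bidirectional two-layer. Every model produced along the way is converted to ONNX, and the structure of the resulting ONNX model is shown. All code and execution results are available at https://gitee.com/tdddeeel/lstm_pytorch_tensorflow_paddle.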

    1. Building LSTMs with Paddle

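    The whole process consists of model definition, export, conversion to ONNX, and ONNX optimization. The last ONNX file produced (the one ending in _sim_opt.onnx) is the one we ultimately need, and its graph can be inspected. This part therefore covers both building the model in Paddle and converting it to ONNX; for more on the overall workflow, see the blog.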
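
The simplify and optimize steps described above are repeated verbatim for every model in this section. As a rough sketch, they could be collected into one helper. The names `derived_paths` and `simplify_and_optimize` are hypothetical and not part of the original code; the library calls are the same ones the article uses. One assumed difference: the sketch raises an exception for `ir_version < 4` instead of calling `exit()`.

```python
def derived_paths(model_path):
    """Return the three ONNX file names used for one model (hypothetical helper)."""
    return {
        "raw": f"{model_path}.onnx",              # written by paddle.onnx.export
        "sim": f"{model_path}_sim.onnx",          # after onnxsim.simplify
        "sim_opt": f"{model_path}_sim_opt.onnx",  # after onnxoptimizer.optimize
    }


def simplify_and_optimize(model_path):
    """Simplify, then optimize, an already exported `<model_path>.onnx`."""
    # Imported lazily so derived_paths() can be used without these packages.
    import onnx
    import onnxoptimizer
    from onnxsim import simplify

    paths = derived_paths(model_path)

    # Step 1: simplify the exported graph and make sure the check passed.
    model_sim, ok = simplify(onnx.load(paths["raw"]))
    assert ok, "simplified onnx model could not be validated"
    onnx.save(model_sim, paths["sim"])

    # Step 2: remove initializers from the graph inputs, then run the passes.
    model = onnx.load(paths["sim"])
    if model.ir_version < 4:
        raise RuntimeError("ir_version below 4 requires initializers in graph inputs")
    name_to_input = {inp.name: inp for inp in model.graph.input}
    for initializer in model.graph.initializer:
        if initializer.name in name_to_input:
            model.graph.input.remove(name_to_input[initializer.name])
    passes = ["extract_constant_to_initializer", "eliminate_unused_initializer"]
    onnx.save(onnxoptimizer.optimize(model, passes), paths["sim_opt"])
    return paths
```

With such a helper, each model would need only its class definition, the `paddle.onnx.export` call, and one call like `simplify_and_optimize("paddle/One_LSTM_batch")`.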

    import os
    import sys
    import paddle
    from paddle import nn
    import numpy as np
    from onnxsim import simplify
    import onnxoptimizer
    import onnx
    import onnxruntime
    
    /home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:36: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead.
      'nearest': Image.NEAREST,
    /home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:37: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
      'bilinear': Image.BILINEAR,
    /home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:38: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
      'bicubic': Image.BICUBIC,
    /home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:39: DeprecationWarning: BOX is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BOX instead.
      'box': Image.BOX,
    /home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:40: DeprecationWarning: LANCZOS is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.LANCZOS instead.
      'lanczos': Image.LANCZOS,
    /home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle/vision/transforms/functional_pil.py:41: DeprecationWarning: HAMMING is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.HAMMING instead.
      'hamming': Image.HAMMING
    /home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/onnx/mapping.py:27: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe. 
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      int(TensorProto.STRING): np.dtype(np.object)
    

    1.1 time_major=False

    This has the same effect as batch_first=True in PyTorch.
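
The models in this section use initial states of shape (1,1,4), (2,1,4), and (4,1,4). A small sketch of where those numbers come from, and of how `time_major` changes the output layout, is given here. The function `lstm_shapes` is a hypothetical helper written for illustration; it only does shape arithmetic and does not call Paddle.

```python
def lstm_shapes(batch, steps, hidden, num_layers=1, bidirectional=False, time_major=False):
    """Return (output_shape, state_shape) for an LSTM with the given settings."""
    num_directions = 2 if bidirectional else 1
    # h0 and c0 each hold one vector per layer and per direction.
    state_shape = (num_layers * num_directions, batch, hidden)
    # A bidirectional LSTM concatenates both directions on the feature axis.
    features = num_directions * hidden
    if time_major:
        output_shape = (steps, batch, features)
    else:
        output_shape = (batch, steps, features)
    return output_shape, state_shape


# Settings used in this article: batch=1, 6 time steps, hidden size 4.
print(lstm_shapes(1, 6, 4))                                   # single layer
print(lstm_shapes(1, 6, 4, num_layers=2))                     # two layers
print(lstm_shapes(1, 6, 4, num_layers=2, bidirectional=True)) # bidirectional, two layers
```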

    class One_LSTM_batch(nn.Layer):
        def __init__(self,in_channels=3,out_channels=4):
            super().__init__()
            self.rnn = nn.LSTM(in_channels,out_channels,time_major=False)
        def forward(self,x):
            # b,c,h,w =x.shape
            # x1 = paddle.reshape(x,[b,c,h*w])
            h0 = paddle.zeros((1,1,4))
            c0 = paddle.zeros((1,1,4))
            x2 = paddle.squeeze(x,2)
            x3 = paddle.transpose(x2,(0,2,1))
            out,_ = self.rnn(x3,(h0,c0))
            return out # shape 1,6,4
    
    model_path = "paddle/One_LSTM_batch"
    model = One_LSTM_batch()
    model.eval()
    infer_shape = [1,3,1,6]
    input_spec=paddle.static.InputSpec(shape=infer_shape, dtype="float32")
    paddle.onnx.export(model,model_path,input_spec=[input_spec],opset_version=11,enable_onnx_checker=True)
    
    model = onnx.load(model_path+'.onnx')
    model_sim ,check = simplify(model)
    assert check,"simplified onnx model could not be validated"
    save_path = model_path+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    model = onnx.load(save_path)
    if model.ir_version<4:
        print("Models with ir_version below 4 require initializers to be included in the graph inputs")
        exit()
    inputs = model.graph.input
    name_to_input = {}
    for input in inputs:
        name_to_input[input.name]=input
    for initializer in model.graph.initializer:
        if initializer.name in name_to_input:
            inputs.remove(name_to_input[initializer.name])
    passes=["extract_constant_to_initializer","eliminate_unused_initializer"]
    optimized_model = onnxoptimizer.optimize(model,passes)
    save_path = model_path+"_sim_opt.onnx"
    onnx.save(optimized_model,save_path)
    
    
    
    class Two_LSTM_batch(nn.Layer):
        def __init__(self,in_channels=3,out_channels=4):
            super().__init__()
            self.rnn = nn.LSTM(in_channels,out_channels,time_major=False,num_layers=2)
        def forward(self,x):
            # b,c,h,w =x.shape
            # x1 = paddle.reshape(x,[b,c,h*w])
            h0 = paddle.zeros((2,1,4))
            c0 = paddle.zeros((2,1,4))
            x2 = paddle.squeeze(x,2)
            x3 = paddle.transpose(x2,(0,2,1))
            out,_ = self.rnn(x3,(h0,c0))
            return out # shape 1,6,4
    
    model_path = "paddle/Two_LSTM_batch"
    model = Two_LSTM_batch()
    model.eval()
    infer_shape = [1,3,1,6]
    input_spec=paddle.static.InputSpec(shape=infer_shape, dtype="float32")
    paddle.onnx.export(model,model_path,input_spec=[input_spec],opset_version=11,enable_onnx_checker=True)
    
    model = onnx.load(model_path+'.onnx')
    model_sim ,check = simplify(model)
    assert check,"simplified onnx model could not be validated"
    save_path = model_path+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    model = onnx.load(save_path)
    if model.ir_version<4:
        print("Models with ir_version below 4 require initializers to be included in the graph inputs")
        exit()
    inputs = model.graph.input
    name_to_input = {}
    for input in inputs:
        name_to_input[input.name]=input
    for initializer in model.graph.initializer:
        if initializer.name in name_to_input:
            inputs.remove(name_to_input[initializer.name])
    passes=["extract_constant_to_initializer","eliminate_unused_initializer"]
    optimized_model = onnxoptimizer.optimize(model,passes)
    save_path = model_path+"_sim_opt.onnx"
    onnx.save(optimized_model,save_path)
    
    
    
    class Bi_Two_LSTM_batch(nn.Layer):
        def __init__(self,in_channels=3,out_channels=4):
            super().__init__()
            self.rnn = nn.LSTM(in_channels,out_channels,time_major=False,direction="bidirect",num_layers=2)
        def forward(self,x):
            # b,c,h,w =x.shape
            # x1 = paddle.reshape(x,[b,c,h*w])
            h0 = paddle.zeros((4,1,4))
            c0 = paddle.zeros((4,1,4))
            x2 = paddle.squeeze(x,2)
            x3 = paddle.transpose(x2,(0,2,1))
            out,_ = self.rnn(x3,(h0,c0))
            return out # shape 1,6,8 (both directions are concatenated)
    
    model_path = "paddle/Bi_Two_LSTM_batch"
    model = Bi_Two_LSTM_batch()
    model.eval()
    infer_shape = [1,3,1,6]
    input_spec=paddle.static.InputSpec(shape=infer_shape, dtype="float32")
    paddle.onnx.export(model,model_path,input_spec=[input_spec],opset_version=11,enable_onnx_checker=True)
    
    model = onnx.load(model_path+'.onnx')
    model_sim ,check = simplify(model)
    assert check,"simplified onnx model could not be validated"
    save_path = model_path+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    model = onnx.load(save_path)
    if model.ir_version<4:
        print("Models with ir_version below 4 require initializers to be included in the graph inputs")
        exit()
    inputs = model.graph.input
    name_to_input = {}
    for input in inputs:
        name_to_input[input.name]=input
    for initializer in model.graph.initializer:
        if initializer.name in name_to_input:
            inputs.remove(name_to_input[initializer.name])
    passes=["extract_constant_to_initializer","eliminate_unused_initializer"]
    optimized_model = onnxoptimizer.optimize(model,passes)
    save_path = model_path+"_sim_opt.onnx"
    onnx.save(optimized_model,save_path)
    
    
    2022-08-03 09:39:41 [INFO]	ONNX model generated is valid.
    2022-08-03 09:39:41 [INFO]	ONNX model saved in paddle/One_LSTM_batch.onnx
    2022-08-03 09:39:41 [INFO]	ONNX model generated is valid.
    2022-08-03 09:39:41 [INFO]	ONNX model saved in paddle/Two_LSTM_batch.onnx
    2022-08-03 09:39:41 [INFO]	ONNX model generated is valid.
    2022-08-03 09:39:41 [INFO]	ONNX model saved in paddle/Bi_Two_LSTM_batch.onnx
    
    
    /home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/onnx/numpy_helper.py:93: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe. 
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      if arr.dtype == np.object:
    

    1.2 time_major=True

    This is the same as batch_first=False in PyTorch.
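
The only change in the forward pass between the two layouts is the transpose permutation: (0,2,1) for batch-major and (2,0,1) for time-major. The following sketch traces the input shape [1,3,1,6] through `squeeze` and `transpose` using plain tuples. The helpers `squeeze_shape` and `transpose_shape` are hypothetical stand-ins for the shape effect of `paddle.squeeze` and `paddle.transpose`, not calls into Paddle.

```python
def squeeze_shape(shape, axis):
    """Shape after removing a size-1 axis."""
    assert shape[axis] == 1, "only size-1 axes can be squeezed"
    return shape[:axis] + shape[axis + 1:]


def transpose_shape(shape, perm):
    """Shape after permuting the axes: new axis k is old axis perm[k]."""
    return tuple(shape[p] for p in perm)


infer_shape = (1, 3, 1, 6)              # batch, channel, height, width
x2 = squeeze_shape(infer_shape, 2)      # (1, 3, 6): batch, channel, width

batch_major = transpose_shape(x2, (0, 2, 1))  # batch, steps, features
time_major = transpose_shape(x2, (2, 0, 1))   # steps, batch, features
print(batch_major, time_major)
```

In both cases the width axis (6) becomes the time axis and the channel axis (3) becomes the feature axis; only the position of the batch axis differs.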

    class One_LSTM_time(nn.Layer):
        def __init__(self,in_channels=3,out_channels=4):
            super().__init__()
            self.rnn = nn.LSTM(in_channels,out_channels,time_major=True)
        def forward(self,x):
            # b,c,h,w =x.shape
            # x1 = paddle.reshape(x,[b,c,h*w])
            h0 = paddle.zeros((1,1,4))
            c0 = paddle.zeros((1,1,4))
            x2 = paddle.squeeze(x,2)
            x3 = paddle.transpose(x2,(2,0,1))
            out,_ = self.rnn(x3,(h0,c0))
            return out # shape 6,1,4 (time-major)
    
    model_path = "paddle/One_LSTM_time"
    model = One_LSTM_time()
    model.eval()
    infer_shape = [1,3,1,6]
    input_spec=paddle.static.InputSpec(shape=infer_shape, dtype="float32")
    paddle.onnx.export(model,model_path,input_spec=[input_spec],opset_version=11,enable_onnx_checker=True)
    
    model = onnx.load(model_path+'.onnx')
    model_sim ,check = simplify(model)
    assert check,"simplified onnx model could not be validated"
    save_path = model_path+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    model = onnx.load(save_path)
    if model.ir_version<4:
        print("Models with ir_version below 4 require initializers to be included in the graph inputs")
        exit()
    inputs = model.graph.input
    name_to_input = {}
    for input in inputs:
        name_to_input[input.name]=input
    for initializer in model.graph.initializer:
        if initializer.name in name_to_input:
            inputs.remove(name_to_input[initializer.name])
    passes=["extract_constant_to_initializer","eliminate_unused_initializer"]
    optimized_model = onnxoptimizer.optimize(model,passes)
    save_path = model_path+"_sim_opt.onnx"
    onnx.save(optimized_model,save_path)
    
    
    class Two_LSTM_time(nn.Layer):
        def __init__(self,in_channels=3,out_channels=4):
            super().__init__()
            self.rnn = nn.LSTM(in_channels,out_channels,time_major=True,num_layers=2)
        def forward(self,x):
            # b,c,h,w =x.shape
            # x1 = paddle.reshape(x,[b,c,h*w])
            h0 = paddle.zeros((2,1,4))
            c0 = paddle.zeros((2,1,4))
            x2 = paddle.squeeze(x,2)
            x3 = paddle.transpose(x2,(2,0,1))
            out,_ = self.rnn(x3,(h0,c0))
            return out # shape 6,1,4 (time-major)
    
    model_path = "paddle/Two_LSTM_time"
    model = Two_LSTM_time()
    model.eval()
    infer_shape = [1,3,1,6]
    input_spec=paddle.static.InputSpec(shape=infer_shape, dtype="float32")
    paddle.onnx.export(model,model_path,input_spec=[input_spec],opset_version=11,enable_onnx_checker=True)
    
    model = onnx.load(model_path+'.onnx')
    model_sim ,check = simplify(model)
    assert check,"simplified onnx model could not be validated"
    save_path = model_path+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    model = onnx.load(save_path)
    if model.ir_version<4:
        print("Models with ir_version below 4 require initializers to be included in the graph inputs")
        exit()
    inputs = model.graph.input
    name_to_input = {}
    for input in inputs:
        name_to_input[input.name]=input
    for initializer in model.graph.initializer:
        if initializer.name in name_to_input:
            inputs.remove(name_to_input[initializer.name])
    passes=["extract_constant_to_initializer","eliminate_unused_initializer"]
    optimized_model = onnxoptimizer.optimize(model,passes)
    save_path = model_path+"_sim_opt.onnx"
    onnx.save(optimized_model,save_path)
    
    
    
    
    class Bi_Two_LSTM_time(nn.Layer):
        def __init__(self,in_channels=3,out_channels=4):
            super().__init__()
            self.rnn = nn.LSTM(in_channels,out_channels,time_major=True,direction="bidirect",num_layers=2)
        def forward(self,x):
            # b,c,h,w =x.shape
            # x1 = paddle.reshape(x,[b,c,h*w])
            h0 = paddle.zeros((4,1,4))
            c0 = paddle.zeros((4,1,4))
            x2 = paddle.squeeze(x,2)
            x3 = paddle.transpose(x2,(2,0,1))
            out,_ = self.rnn(x3,(h0,c0))
            return out # shape 6,1,8 (time-major, both directions concatenated)
    
    model_path = "paddle/Bi_Two_LSTM_time"
    model = Bi_Two_LSTM_time()
    model.eval()
    infer_shape = [1,3,1,6]
    input_spec=paddle.static.InputSpec(shape=infer_shape, dtype="float32")
    paddle.onnx.export(model,model_path,input_spec=[input_spec],opset_version=11,enable_onnx_checker=True)
    
    model = onnx.load(model_path+'.onnx')
    model_sim ,check = simplify(model)
    assert check,"simplified onnx model could not be validated"
    save_path = model_path+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    model = onnx.load(save_path)
    if model.ir_version<4:
        print("Models with ir_version below 4 require initializers to be included in the graph inputs")
        exit()
    inputs = model.graph.input
    name_to_input = {}
    for input in inputs:
        name_to_input[input.name]=input
    for initializer in model.graph.initializer:
        if initializer.name in name_to_input:
            inputs.remove(name_to_input[initializer.name])
    passes=["extract_constant_to_initializer","eliminate_unused_initializer"]
    optimized_model = onnxoptimizer.optimize(model,passes)
    save_path = model_path+"_sim_opt.onnx"
    onnx.save(optimized_model,save_path)
    
    
    
    2022-08-03 09:40:26 [INFO]	ONNX model generated is valid.
    2022-08-03 09:40:26 [INFO]	ONNX model saved in paddle/One_LSTM_time.onnx
    2022-08-03 09:40:26 [INFO]	ONNX model generated is valid.
    2022-08-03 09:40:26 [INFO]	ONNX model saved in paddle/Two_LSTM_time.onnx
    2022-08-03 09:40:26 [INFO]	ONNX model generated is valid.
    2022-08-03 09:40:26 [INFO]	ONNX model saved in paddle/Bi_Two_LSTM_time.onnx
    

    1.3 sequence_lens

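    Paddle's LSTM takes one more argument, sequence_length (named sequence_lens in the code below), which PyTorch's LSTM does not have. It gives the valid length of each sequence: time steps whose index is not less than sequence_length are cut off and treated as padding. Here is a small experiment using only the single-layer LSTM with time_major=True.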
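
To make the padding rule concrete, here is a minimal sketch that builds the per-step validity mask implied by it: step `t` of a sequence is valid only when `t < sequence_length`. The function `valid_step_mask` is a hypothetical illustration in plain Python and is not how Paddle implements the argument internally.

```python
def valid_step_mask(sequence_length, steps):
    """For each batch element, mark which time steps are real data (True)
    and which are treated as padding (False)."""
    return [[t < length for t in range(steps)] for length in sequence_length]


# The experiment in this section: one sequence, 6 steps, valid length 6.
print(valid_step_mask([6], 6))

# A shorter valid length would mark the trailing steps as padding.
print(valid_step_mask([4], 6))
```

Note that in the experiment the valid length equals the number of time steps, so no step is actually treated as padding.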

    class Seq_One_LSTM_time(nn.Layer):
        def __init__(self,in_channels=3,out_channels=4):
            super().__init__()
            self.rnn = nn.LSTM(in_channels,out_channels,time_major=True)
        def forward(self,x):
            # b,c,h,w =x.shape
            # x1 = paddle.reshape(x,[b,c,h*w])
            h0 = paddle.zeros((1,1,4))
            c0 = paddle.zeros((1,1,4))
            sequence_lens = paddle.to_tensor([6]) # one valid length per batch element, shape [b]
            x2 = paddle.squeeze(x,2)
            x3 = paddle.transpose(x2,(2,0,1))
            out,_ = self.rnn(inputs=x3,initial_states=(h0,c0),sequence_length=sequence_lens)
            return out # shape 6,1,4 (time-major)
    
    model_path = "paddle/Seq_One_LSTM_time"
    model = Seq_One_LSTM_time()
    model.eval()
    infer_shape = [1,3,1,6]
    input_spec=paddle.static.InputSpec(shape=infer_shape, dtype="float32")
    paddle.onnx.export(model,model_path,input_spec=[input_spec],opset_version=11,enable_onnx_checker=True)
    
    model = onnx.load(model_path+'.onnx')
    model_sim ,check = simplify(model)
    assert check,"simplified onnx model could not be validated"
    save_path = model_path+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    model = onnx.load(save_path)
    if model.ir_version<4:
        print("Models with ir_version below 4 require initializers to be included in the graph inputs")
        exit()
    inputs = model.graph.input
    name_to_input = {}
    for input in inputs:
        name_to_input[input.name]=input
    for initializer in model.graph.initializer:
        if initializer.name in name_to_input:
            inputs.remove(name_to_input[initializer.name])
    passes=["extract_constant_to_initializer","eliminate_unused_initializer"]
    optimized_model = onnxoptimizer.optimize(model,passes)
    save_path = model_path+"_sim_opt.onnx"
    onnx.save(optimized_model,save_path)
    
    W0803 15:08:57.771442 24847 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.4, Runtime API Version: 11.2
    W0803 15:08:57.774729 24847 device_context.cc:465] device: 0, cuDNN Version: 8.1.
    
    
    2022-08-03 15:09:00 [INFO]	ONNX model generated is valid.
    2022-08-03 15:09:00 [INFO]	ONNX model saved in paddle/Seq_One_LSTM_time.onnx
    
    
    /home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle2onnx/constant/dtypes.py:47: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      np.bool: core.VarDesc.VarType.BOOL,
    /home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle2onnx/constant/dtypes.py:48: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      core.VarDesc.VarType.FP32: np.float,
    /home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle2onnx/constant/dtypes.py:53: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      core.VarDesc.VarType.BOOL: np.bool
    /home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:77: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
      return (isinstance(seq, collections.Sequence) and
    /home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/onnx/helper.py:343: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
      is_iterable = isinstance(value, collections.Iterable)
    /home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages/onnx/numpy_helper.py:93: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe. 
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      if arr.dtype == np.object:
    

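    1.4 No initial state

    Here we call the LSTM without manually passing the initial states h0 and c0; internally they are automatically set to all zeros. PyTorch works the same way, but the resulting ONNX graphs differ: the graph PyTorch produces when no initial state is passed is the same as the one Paddle produces when the initial state is passed manually. This is covered later; comparing all the graphs side by side makes the differences clear.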
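
As a toy illustration of "omitted initial state means all zeros", here is a scalar LSTM step (input size 1, hidden size 1) written in plain Python. It is only a sketch: the function `lstm_step` and its parameters are hypothetical, and for brevity all four gates share the same weights, which a real LSTM does not do.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev=0.0, c_prev=0.0, w=0.5, u=0.5, b=0.0):
    """One step of a toy scalar LSTM cell; state defaults to zero."""
    # Toy simplification: every gate uses the same pre-activation.
    pre = w * x + u * h_prev + b
    i = sigmoid(pre)      # input gate
    f = sigmoid(pre)      # forget gate
    o = sigmoid(pre)      # output gate
    g = math.tanh(pre)    # candidate cell value
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Omitting the state gives the same result as passing zeros explicitly.
print(lstm_step(1.0) == lstm_step(1.0, 0.0, 0.0))
```

The numerical result is the same either way; what differs between the two cases in this article is only how the exported ONNX graph represents the initial state.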

    class Ini_One_LSTM_time(nn.Layer):
        def __init__(self,in_channels=3,out_channels=4):
            super().__init__()
            self.rnn = nn.LSTM(in_channels,out_channels,time_major=True)
        def forward(self,x):
            # b,c,h,w =x.shape
            # x1 = paddle.reshape(x,[b,c,h*w])
            x2 = paddle.squeeze(x,2)
            x3 = paddle.transpose(x2,(2,0,1))
            out,_ = self.rnn(x3)
            return out # shape 6,1,4 (time-major)
    
    model_path = "paddle/Ini_One_LSTM_time"
    model = Ini_One_LSTM_time()
    model.eval()
    infer_shape = [1,3,1,6]
    input_spec=paddle.static.InputSpec(shape=infer_shape, dtype="float32")
    paddle.onnx.export(model,model_path,input_spec=[input_spec],opset_version=11,enable_onnx_checker=True)
    
    model = onnx.load(model_path+'.onnx')
    model_sim ,check = simplify(model)
    assert check,"simplified onnx model could not be validated"
    save_path = model_path+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    model = onnx.load(save_path)
    if model.ir_version<4:
        print("Models with ir_version below 4 require initializers to be included in the graph inputs")
        exit()
    inputs = model.graph.input
    name_to_input = {}
    for input in inputs:
        name_to_input[input.name]=input
    for initializer in model.graph.initializer:
        if initializer.name in name_to_input:
            inputs.remove(name_to_input[initializer.name])
    passes=["extract_constant_to_initializer","eliminate_unused_initializer"]
    optimized_model = onnxoptimizer.optimize(model,passes)
    save_path = model_path+"_sim_opt.onnx"
    onnx.save(optimized_model,save_path)
    
    2022-08-03 15:06:50 [INFO]	ONNX model generated is valid.
    2022-08-03 15:06:50 [INFO]	ONNX model saved in paddle/Ini_One_LSTM_time.onnx
    

    1.5 Inspecting the generated ONNX models

    paddle_onnx = sorted(os.listdir('paddle'))
    paddle_onnx_paths = sorted([os.path.join('paddle',path) for path in paddle_onnx])
    print(paddle_onnx)
    
    ['Bi_Two_LSTM_batch.onnx', 'Bi_Two_LSTM_batch_sim.onnx', 'Bi_Two_LSTM_batch_sim_opt.onnx', 'Bi_Two_LSTM_time.onnx', 'Bi_Two_LSTM_time_sim.onnx', 'Bi_Two_LSTM_time_sim_opt.onnx', 'Ini_One_LSTM_time.onnx', 'Ini_One_LSTM_time_sim.onnx', 'Ini_One_LSTM_time_sim_opt.onnx', 'One_LSTM_batch.onnx', 'One_LSTM_batch_sim.onnx', 'One_LSTM_batch_sim_opt.onnx', 'One_LSTM_time.onnx', 'One_LSTM_time_sim.onnx', 'One_LSTM_time_sim_opt.onnx', 'Seq_One_LSTM_time.onnx', 'Seq_One_LSTM_time_sim.onnx', 'Seq_One_LSTM_time_sim_opt.onnx', 'Two_LSTM_batch.onnx', 'Two_LSTM_batch_sim.onnx', 'Two_LSTM_batch_sim_opt.onnx', 'Two_LSTM_time.onnx', 'Two_LSTM_time_sim.onnx', 'Two_LSTM_time_sim_opt.onnx']
    
    # Check the size of each model
    ! du -sh paddle/*
    
    16K	paddle/Bi_Two_LSTM_batch.onnx
    8.0K	paddle/Bi_Two_LSTM_batch_sim.onnx
    8.0K	paddle/Bi_Two_LSTM_batch_sim_opt.onnx
    16K	paddle/Bi_Two_LSTM_time.onnx
    8.0K	paddle/Bi_Two_LSTM_time_sim.onnx
    8.0K	paddle/Bi_Two_LSTM_time_sim_opt.onnx
    8.0K	paddle/Ini_One_LSTM_time.onnx
    4.0K	paddle/Ini_One_LSTM_time_sim.onnx
    4.0K	paddle/Ini_One_LSTM_time_sim_opt.onnx
    8.0K	paddle/One_LSTM_batch.onnx
    4.0K	paddle/One_LSTM_batch_sim.onnx
    4.0K	paddle/One_LSTM_batch_sim_opt.onnx
    8.0K	paddle/One_LSTM_time.onnx
    4.0K	paddle/One_LSTM_time_sim.onnx
    4.0K	paddle/One_LSTM_time_sim_opt.onnx
    8.0K	paddle/Seq_One_LSTM_time.onnx
    4.0K	paddle/Seq_One_LSTM_time_sim.onnx
    4.0K	paddle/Seq_One_LSTM_time_sim_opt.onnx
    12K	paddle/Two_LSTM_batch.onnx
    4.0K	paddle/Two_LSTM_batch_sim.onnx
    4.0K	paddle/Two_LSTM_batch_sim_opt.onnx
    12K	paddle/Two_LSTM_time.onnx
    4.0K	paddle/Two_LSTM_time_sim.onnx
    4.0K	paddle/Two_LSTM_time_sim_opt.onnx
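
Each model appears in the listings above as three files: the exported file, the `_sim` file, and the `_sim_opt` file. A minimal sketch of grouping them explicitly by model name, rather than relying on their position in the sorted list, could look like this. The function `group_by_model` and the `SUFFIXES` constant are hypothetical names introduced for illustration.

```python
from collections import defaultdict

# Longest suffix first, so "_sim_opt.onnx" is not mistaken for ".onnx".
SUFFIXES = ("_sim_opt.onnx", "_sim.onnx", ".onnx")


def group_by_model(file_names):
    """Map each model name to its ONNX files (exported, _sim, _sim_opt)."""
    groups = defaultdict(list)
    for name in sorted(file_names):
        for suffix in SUFFIXES:
            if name.endswith(suffix):
                groups[name[: -len(suffix)]].append(name)
                break
    return dict(groups)


names = [
    "One_LSTM_time_sim.onnx",
    "One_LSTM_time.onnx",
    "Two_LSTM_time_sim_opt.onnx",
    "One_LSTM_time_sim_opt.onnx",
]
print(group_by_model(names))
```

Calling `group_by_model(os.listdir('paddle'))` would give one entry per model, and files with other extensions in the directory would simply be ignored.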
    

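    Load the ONNX models, run inference, and compare the results pairwise within each model's group of three files (exported, simplified, optimized).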

    def onnx_infer(model_path,data):
        """Run one ONNX model with onnxruntime and return its first output.
    
        Args:
            model_path (str): path to the .onnx file.
            data (np.ndarray): array fed to the model's first input.
        """
        onnx_session=onnxruntime.InferenceSession(model_path)
        input_name = onnx_session.get_inputs()[0].name
        output_name = onnx_session.get_outputs()[0].name
        result = onnx_session.run([output_name],{input_name:data})
        return result[0]
    
    test_data = np.random.random((1,3,1,6)).astype(np.float32) # batch,channel,height,width
    results={}
    
    for i,onnx_path in enumerate(paddle_onnx_paths):
    
        result = onnx_infer(onnx_path,test_data)
        results[os.path.basename(onnx_path)]=result
    
        if i%3 ==2:
            try:
                values = list(results.values())
                np.testing.assert_allclose(values[0],values[1],rtol=1e-5)
                np.testing.assert_allclose(values[2],values[1],rtol=1e-5)
                print(f"{list(results.keys())} have same results")
            except AssertionError:
                print(f"{list(results.keys())} have different results")
            finally:
                results={}
    
    2022-08-03 17:03:35.322915835 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_4 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.322938940 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_5 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.322945528 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_8 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.322951708 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_12 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.322957288 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.322963180 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_44 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.322968668 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.322974083 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_14 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.322979304 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_17 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.322984535 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_26 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.322990949 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_27 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.322996580 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_87 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.392730468 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_4 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.392757571 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_5 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.392764542 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_8 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.392770328 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_12 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.392775836 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.392781517 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_44 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.392786892 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.392792184 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_14 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.392797446 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_17 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.392802640 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_26 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.392808940 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_27 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.392814520 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_87 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.482187472 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_0 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.482211960 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_0 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.482218895 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_4 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.482224351 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_1 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.482229628 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_6 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.482235089 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_7 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.482240251 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_10 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.482245348 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_44 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.482250361 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_45 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.482255449 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_46 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.482267806 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_47 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.482273280 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_48 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.482278188 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_49 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.482283068 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_50 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.544810887 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_4 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.544836135 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_5 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.544842815 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_8 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.544848451 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_12 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.544853781 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.544859396 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_44 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    
    
    ['Bi_Two_LSTM_batch.onnx', 'Bi_Two_LSTM_batch_sim.onnx', 'Bi_Two_LSTM_batch_sim_opt.onnx'] have same results
    ['Bi_Two_LSTM_time.onnx', 'Bi_Two_LSTM_time_sim.onnx', 'Bi_Two_LSTM_time_sim_opt.onnx'] have same results
    ['Ini_One_LSTM_time.onnx', 'Ini_One_LSTM_time_sim.onnx', 'Ini_One_LSTM_time_sim_opt.onnx'] have same results
    ['One_LSTM_batch.onnx', 'One_LSTM_batch_sim.onnx', 'One_LSTM_batch_sim_opt.onnx'] have same results
    ['One_LSTM_time.onnx', 'One_LSTM_time_sim.onnx', 'One_LSTM_time_sim_opt.onnx'] have same results
    ['Seq_One_LSTM_time.onnx', 'Seq_One_LSTM_time_sim.onnx', 'Seq_One_LSTM_time_sim_opt.onnx'] have same results
    ['Two_LSTM_batch.onnx', 'Two_LSTM_batch_sim.onnx', 'Two_LSTM_batch_sim_opt.onnx'] have same results
    ['Two_LSTM_time.onnx', 'Two_LSTM_time_sim.onnx', 'Two_LSTM_time_sim_opt.onnx'] have same results
    
    
    2022-08-03 17:03:35.627142152 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_4 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.627166154 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_5 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.627172672 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_8 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.627178399 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_12 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.627184004 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.627189596 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_44 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.680342507 [W:onnxruntime:, graph.cc:3559 CleanUnusedInitializersAndNodeArgs] Removing initializer 'assign_0.tmp_0'. It is not used by any node and should be removed from the model.
    2022-08-03 17:03:35.684640360 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_4 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.684662133 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_5 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.684672193 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_8 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.684680922 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_12 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.684688933 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.684697379 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_44 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.710238396 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_4 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.710278276 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_5 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.710292209 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_8 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.710303902 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_12 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.710315317 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.710326920 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_44 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.710337914 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.710348837 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_14 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.710359584 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_17 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.710370872 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_26 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.710383823 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_27 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.710395467 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_87 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.810340323 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_4 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.810366323 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_5 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.810374143 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_8 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.810380074 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_12 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.810385473 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.810391336 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_44 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.810396558 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_13 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.810401858 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_14 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.810407006 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Concat_17 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.810412186 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_26 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.810417317 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Slice_27 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    2022-08-03 17:03:35.810424180 [W:onnxruntime:, graph.cc:1271 Graph] Initializer Constant_87 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
    

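    The ONNX inference above shows that, within a tolerance of 1e-5 (one part in 100,000, i.e. roughly five significant digits), the results are identical. The many warnings are produced by the models with the sim suffix; the original models and the models ending in opt do not have this problem.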
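
To make the tolerance statement concrete, here is a minimal NumPy sketch of how a relative tolerance behaves. The array values are made up for illustration and are not outputs of the models above; only `np.allclose` itself is a real API.

```python
import numpy as np

# Hypothetical values for illustration only; not taken from the models above.
reference = np.array([0.123456789, -0.5, 2.0], dtype=np.float64)
# Perturb every element by a relative error of about 3e-6.
perturbed = reference * (1.0 + 3e-6)

# np.allclose checks |a - b| <= atol + rtol * |b| element-wise.
# With rtol=1e-5 the perturbation stays inside the tolerance ...
same_at_1e5 = np.allclose(perturbed, reference, rtol=1e-5, atol=0.0)
# ... while rtol=1e-7 is strict enough to detect it.
same_at_1e7 = np.allclose(perturbed, reference, rtol=1e-7, atol=0.0)

print(same_at_1e5, same_at_1e7)
```

Setting `atol=0.0` keeps the check purely relative, which is the easiest way to reason about "how many digits agree".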

    The structure diagrams of the generated ONNX models ending in opt are shown below:

    | Type | Single layer | Two layers | Two layers, bidirectional |
    | --- | --- | --- | --- |
    | time_major=False | [image] | [image] | [image] |
    | time_major=True | [image] | [image] | [image] |

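    Next is the effect of the sequence_lens parameter, tested on a single-layer LSTM only. The resulting graph is:
    [image]
    There is also the graph of a single-layer LSTM without user-defined initial states:
    [image]
    You can see that extra operators are added, which are actually unnecessary.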

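    2 Generating LSTMs with PyTorch

    Because the models are exported from PyTorch to ONNX with keep_initializers_as_inputs=False, only the sim (simplify) step is needed. Otherwise one more step would be required, as with Paddle.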

    2.1 batch_first=True

    import os
    import sys
    sys.path.append('/home/tl/anaconda3/envs/pdle/lib/python3.7/site-packages')
    import torch
    from torch import nn
    import numpy as np
    from onnxsim import simplify
    import onnxoptimizer
    import onnx
    import onnxruntime
    
    class One_LSTM_batch(nn.Module):
        def __init__(self,in_channels=3,out_channels=4):
            super().__init__()
            self.rnn = nn.LSTM(in_channels,out_channels,batch_first=True)
        def forward(self,x):
            # b,c,h,w =x.shape
            # x1 = torch.reshape(x,[b,c,h*w])
            x2 = torch.squeeze(x,2)
            x3 = torch.permute(x2,(0,2,1))
            out,_ = self.rnn(x3)
            return out # shape 1,6,4
        
    model = One_LSTM_batch()
    model.to('cpu')
    model.eval()
    
    input = torch.randn(1,3,1,6)
    output = model(input)
    print("output shape:",output.shape)
    
    input_shapes=[(1,3,1,6)]
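    onnx_export_path = "torch/One_lstm_batch.onnx"
    os.makedirs(os.path.dirname(onnx_export_path), exist_ok=True)  # make sure the output directory exists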
    dummy_input=[]
    for ele in input_shapes:
        dummy_input.append(torch.randn(ele))
    dummy_input=tuple(dummy_input)
    
    # torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"], dynamic_axes={'input' : {0 : 'batch_size'},'output' : {0 : 'batch_size'}})
    torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True,keep_initializers_as_inputs=False,input_names=["input"], output_names=["output"])
    print("export onnx to:",onnx_export_path)
    
    onnx_model = onnx.load(onnx_export_path)
    model_sim ,check = simplify(onnx_model)
    assert check,"simplified onnx model could not be validated"
    save_path = os.path.splitext(onnx_export_path)[0]+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    class Two_LSTM_batch(nn.Module):
        def __init__(self,in_channels=3,out_channels=4):
            super().__init__()
            self.rnn = nn.LSTM(in_channels,out_channels,batch_first=True,num_layers=2)
        def forward(self,x):
            # b,c,h,w =x.shape
            # x1 = torch.reshape(x,[b,c,h*w])
            x2 = torch.squeeze(x,2)
            x3 = torch.permute(x2,(0,2,1))
            out,_ = self.rnn(x3)
            return out # shape 1,6,4
        
    model = Two_LSTM_batch()
    model.to('cpu')
    model.eval()
    
    input = torch.randn(1,3,1,6)
    output = model(input)
    print("output shape:",output.shape)
    
    input_shapes=[(1,3,1,6)]
    onnx_export_path = "torch/Two_lstm_batch.onnx"
    dummy_input=[]
    for ele in input_shapes:
        dummy_input.append(torch.randn(ele))
    dummy_input=tuple(dummy_input)
    
    # torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"], dynamic_axes={'input' : {0 : 'batch_size'},'output' : {0 : 'batch_size'}})
    torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, keep_initializers_as_inputs=False,input_names=["input"], output_names=["output"])
    print("export onnx to:",onnx_export_path)
    
    onnx_model = onnx.load(onnx_export_path)
    model_sim ,check = simplify(onnx_model)
    assert check,"simplified onnx model could not be validated"
    save_path = os.path.splitext(onnx_export_path)[0]+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    class Bi_Two_LSTM_batch(nn.Module):
        def __init__(self,in_channels=3,out_channels=4):
            super().__init__()
            self.rnn = nn.LSTM(in_channels,out_channels,batch_first=True,num_layers=2,bidirectional=True)
        def forward(self,x):
            # b,c,h,w =x.shape
            # x1 = torch.reshape(x,[b,c,h*w])
            x2 = torch.squeeze(x,2)
            x3 = torch.permute(x2,(0,2,1))
            out,_ = self.rnn(x3)
            return out # shape 1,6,8 (bidirectional: 2 * out_channels)
        
    model = Bi_Two_LSTM_batch()
    model.to('cpu')
    model.eval()
    
    input = torch.randn(1,3,1,6)
    output = model(input)
    print("output shape:",output.shape)
    
    input_shapes=[(1,3,1,6)]
    onnx_export_path = "torch/Bi_Two_lstm_batch.onnx"
    dummy_input=[]
    for ele in input_shapes:
        dummy_input.append(torch.randn(ele))
    dummy_input=tuple(dummy_input)
    
    # torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"], dynamic_axes={'input' : {0 : 'batch_size'},'output' : {0 : 'batch_size'}})
    torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, keep_initializers_as_inputs=False,input_names=["input"], output_names=["output"])
    print("export onnx to:",onnx_export_path)
    
    onnx_model = onnx.load(onnx_export_path)
    model_sim ,check = simplify(onnx_model)
    assert check,"simplified onnx model could not be validated"
    save_path = os.path.splitext(onnx_export_path)[0]+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    output shape: torch.Size([1, 6, 4])
    export onnx to: torch/One_lstm_batch.onnx
    output shape: torch.Size([1, 6, 4])
    export onnx to: torch/Two_lstm_batch.onnx
    
    
    /home/tl/anaconda3/envs/ptch/lib/python3.7/site-packages/torch/onnx/symbolic_opset9.py:2192: UserWarning: Exporting a model to ONNX with a batch_size other than 1, with a variable length with LSTM can cause an error when running the ONNX model with a different batch size. Make sure to save the model with a batch size of 1, or define the initial states (h0/c0) as inputs of the model. 
      "or define the initial states (h0/c0) as inputs of the model. ")
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    
    
    output shape: torch.Size([1, 6, 8])
    export onnx to: torch/Bi_Two_lstm_batch.onnx
    
    
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    

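    There are some warnings, so it is better to also pass the parameters in explicitly (for example the initial states h0/c0 that the warning mentions).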
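
As a rough sketch of what passing the initial states explicitly could look like, the variant below takes h0 and c0 as model inputs, which is the remedy the export warning itself suggests. The class name `OneLSTMWithStates` and the helper `export_with_states` are hypothetical and not part of the original code; whether this silences every warning depends on the PyTorch version.

```python
import torch
from torch import nn

class OneLSTMWithStates(nn.Module):
    """Hypothetical variant of One_LSTM_batch that takes h0/c0 as inputs."""
    def __init__(self, in_channels=3, out_channels=4):
        super().__init__()
        self.rnn = nn.LSTM(in_channels, out_channels, batch_first=True)

    def forward(self, x, h0, c0):
        x = torch.squeeze(x, 2)   # (B, C, 1, W) -> (B, C, W)
        x = x.permute(0, 2, 1)    # (B, C, W) -> (B, W, C)
        out, (hn, cn) = self.rnn(x, (h0, c0))
        return out, hn, cn

model = OneLSTMWithStates().eval()
x = torch.randn(1, 3, 1, 6)
# State shape is (num_layers * num_directions, batch, hidden_size),
# even when batch_first=True.
h0 = torch.zeros(1, 1, 4)
c0 = torch.zeros(1, 1, 4)
with torch.no_grad():
    out, hn, cn = model(x, h0, c0)
print(out.shape, hn.shape, cn.shape)

def export_with_states(path):
    """Export the model with the states exposed as named graph inputs."""
    torch.onnx.export(model, (x, h0, c0), path, opset_version=11,
                      input_names=["input", "h0", "c0"],
                      output_names=["output", "hn", "cn"])
```

Calling `export_with_states("torch/One_lstm_states.onnx")` (an illustrative path) would write the model; at inference time the three inputs then have to be fed by name.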

    2.2 batch_first=False

    class One_LSTM_time(nn.Module):
        def __init__(self,in_channels=3,out_channels=4):
            super().__init__()
            self.rnn = nn.LSTM(in_channels,out_channels,batch_first=False)
        def forward(self,x):
            # b,c,h,w =x.shape
            # x1 = torch.reshape(x,[b,c,h*w])
            x2 = torch.squeeze(x,2)
            x3 = torch.permute(x2,(2,0,1))
            out,_ = self.rnn(x3)
            return out # shape 6,1,4 (time-major: seq_len, batch, hidden)
        
    model = One_LSTM_time()
    model.to('cpu')
    model.eval()
    
    input = torch.randn(1,3,1,6)
    output = model(input)
    print("output shape:",output.shape)
    
    input_shapes=[(1,3,1,6)]
    onnx_export_path = "torch/One_lstm_time.onnx"
    dummy_input=[]
    for ele in input_shapes:
        dummy_input.append(torch.randn(ele))
    dummy_input=tuple(dummy_input)
    
    # torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"], dynamic_axes={'input' : {0 : 'batch_size'},'output' : {0 : 'batch_size'}})
    torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"])
    print("export onnx to:",onnx_export_path)
    
    onnx_model = onnx.load(onnx_export_path)
    model_sim ,check = simplify(onnx_model)
    assert check,"simplified onnx model could not be validated"
    save_path = os.path.splitext(onnx_export_path)[0]+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    class Two_LSTM_time(nn.Module):
        def __init__(self,in_channels=3,out_channels=4):
            super().__init__()
            self.rnn = nn.LSTM(in_channels,out_channels,batch_first=False,num_layers=2)
        def forward(self,x):
            # b,c,h,w =x.shape
            # x1 = torch.reshape(x,[b,c,h*w])
            x2 = torch.squeeze(x,2)
            x3 = torch.permute(x2,(2,0,1))
            out,_ = self.rnn(x3)
            return out # shape 6,1,4 (time-major: seq_len, batch, hidden)
        
    model = Two_LSTM_time()
    model.to('cpu')
    model.eval()
    
    input = torch.randn(1,3,1,6)
    output = model(input)
    print("output shape:",output.shape)
    
    input_shapes=[(1,3,1,6)]
    onnx_export_path = "torch/Two_lstm_time.onnx"
    dummy_input=[]
    for ele in input_shapes:
        dummy_input.append(torch.randn(ele))
    dummy_input=tuple(dummy_input)
    
    # torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"], dynamic_axes={'input' : {0 : 'batch_size'},'output' : {0 : 'batch_size'}})
    torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"])
    print("export onnx to:",onnx_export_path)
    
    onnx_model = onnx.load(onnx_export_path)
    model_sim ,check = simplify(onnx_model)
    assert check,"simplified onnx model could not be validated"
    save_path = os.path.splitext(onnx_export_path)[0]+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    class Bi_Two_LSTM_time(nn.Module):
        def __init__(self,in_channels=3,out_channels=4):
            super().__init__()
            self.rnn = nn.LSTM(in_channels,out_channels,batch_first=False,num_layers=2,bidirectional=True)
        def forward(self,x):
            # b,c,h,w =x.shape
            # x1 = torch.reshape(x,[b,c,h*w])
            x2 = torch.squeeze(x,2)
            x3 = torch.permute(x2,(2,0,1))
            out,_ = self.rnn(x3)
            return out # shape 6,1,8 (time-major, bidirectional: 2 * out_channels)
        
    model = Bi_Two_LSTM_time()
    model.to('cpu')
    model.eval()
    
    input = torch.randn(1,3,1,6)
    output = model(input)
    print("output shape:",output.shape)
    
    input_shapes=[(1,3,1,6)]
    onnx_export_path = "torch/Bi_Two_lstm_time.onnx"
    dummy_input=[]
    for ele in input_shapes:
        dummy_input.append(torch.randn(ele))
    dummy_input=tuple(dummy_input)
    
    # torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"], dynamic_axes={'input' : {0 : 'batch_size'},'output' : {0 : 'batch_size'}})
    torch.onnx.export(model, dummy_input, onnx_export_path,export_params=True,verbose=False, opset_version=11,do_constant_folding=True, input_names=["input"], output_names=["output"])
    print("export onnx to:",onnx_export_path)
    
    onnx_model = onnx.load(onnx_export_path)
    model_sim ,check = simplify(onnx_model)
    assert check,"simplified onnx model could not be validated"
    save_path = os.path.splitext(onnx_export_path)[0]+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    output shape: torch.Size([6, 1, 4])
    export onnx to: torch/One_lstm_time.onnx
    output shape: torch.Size([6, 1, 4])
    export onnx to: torch/Two_lstm_time.onnx
    
    
    /home/tl/anaconda3/envs/ptch/lib/python3.7/site-packages/torch/onnx/symbolic_opset9.py:2192: UserWarning: Exporting a model to ONNX with a batch_size other than 1, with a variable length with LSTM can cause an error when running the ONNX model with a different batch size. Make sure to save the model with a batch size of 1, or define the initial states (h0/c0) as inputs of the model. 
      "or define the initial states (h0/c0) as inputs of the model. ")
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    
    
    output shape: torch.Size([6, 1, 8])
    export onnx to: torch/Bi_Two_lstm_time.onnx
    
    
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
    

    2.3 Inspecting the generated ONNX models

    pytorch_onnx = sorted(os.listdir('torch'))
    pytorch_onnx_paths = sorted([os.path.join('torch',path) for path in pytorch_onnx])
    print(pytorch_onnx)
    
    ['Bi_Two_lstm_batch.onnx', 'Bi_Two_lstm_batch_sim.onnx', 'Bi_Two_lstm_time.onnx', 'Bi_Two_lstm_time_sim.onnx', 'One_lstm_batch.onnx', 'One_lstm_batch_sim.onnx', 'One_lstm_time.onnx', 'One_lstm_time_sim.onnx', 'Two_lstm_batch.onnx', 'Two_lstm_batch_sim.onnx', 'Two_lstm_time.onnx', 'Two_lstm_time_sim.onnx']
    
    ! du -sh torch/*
    
    8.0K	torch/Bi_Two_lstm_batch.onnx
    8.0K	torch/Bi_Two_lstm_batch_sim.onnx
    8.0K	torch/Bi_Two_lstm_time.onnx
    8.0K	torch/Bi_Two_lstm_time_sim.onnx
    4.0K	torch/One_lstm_batch.onnx
    4.0K	torch/One_lstm_batch_sim.onnx
    4.0K	torch/One_lstm_time.onnx
    4.0K	torch/One_lstm_time_sim.onnx
    8.0K	torch/Two_lstm_batch.onnx
    4.0K	torch/Two_lstm_batch_sim.onnx
    8.0K	torch/Two_lstm_time.onnx
    4.0K	torch/Two_lstm_time_sim.onnx
    
    def onnx_infer(model_path,data):
        """Run an ONNX model with onnxruntime and return its first output.
    
        Args:
            model_path (str): path to the .onnx file.
            data (np.ndarray): array fed to the model's first input.
        """
        onnx_session=onnxruntime.InferenceSession(model_path)
        input_name = onnx_session.get_inputs()[0].name
        output_name = onnx_session.get_outputs()[0].name
        result = onnx_session.run([output_name],{input_name:data})
        return result[0]
    
    test_data = np.random.random((1,3,1,6)).astype(np.float32) # batch,channel,height,width
    results={}
    
    for i,onnx_path in enumerate(pytorch_onnx_paths):
    
        result = onnx_infer(onnx_path,test_data)
        results[os.path.basename(onnx_path)]=result
    
        if i%2 ==1:
            try:
                values = list(results.values())
                np.testing.assert_allclose(values[0],values[1],rtol=1e-7)
                print(f"{list(results.keys())} have same results")
            except AssertionError:
                print(f"{list(results.keys())} have different results")
            finally:
                results={}
    
    ['Bi_Two_lstm_batch.onnx', 'Bi_Two_lstm_batch_sim.onnx'] have same results
    ['Bi_Two_lstm_time.onnx', 'Bi_Two_lstm_time_sim.onnx'] have same results
    ['One_lstm_batch.onnx', 'One_lstm_batch_sim.onnx'] have same results
    ['One_lstm_time.onnx', 'One_lstm_time_sim.onnx'] have same results
    ['Two_lstm_batch.onnx', 'Two_lstm_batch_sim.onnx'] have same results
    ['Two_lstm_time.onnx', 'Two_lstm_time_sim.onnx'] have same results
    

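    It appears that, for the ONNX models exported from PyTorch, the original and simplified models give identical results at a relative tolerance of 1e-7, which is somewhat tighter than the 1e-5 tolerance used for Paddle.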
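
The comparison loop above pairs files by their position in the sorted list. A slightly more defensive sketch, assuming the same `<name>.onnx` / `<name>_sim.onnx` naming, pairs them by name instead. The function `compare_pairs` is a hypothetical helper, and the arrays below are made up to stand in for real model outputs.

```python
import numpy as np

def compare_pairs(results, rtol=1e-7, suffix="_sim.onnx"):
    """Compare each `<name>_sim.onnx` entry with its `<name>.onnx` partner.

    `results` maps file names to output arrays.
    Returns {original_file_name: True/False}.
    """
    report = {}
    for name, simplified in results.items():
        if not name.endswith(suffix):
            continue
        original_name = name[: -len(suffix)] + ".onnx"
        if original_name not in results:
            continue  # no partner to compare against
        original = results[original_name]
        report[original_name] = bool(
            original.shape == simplified.shape
            and np.allclose(simplified, original, rtol=rtol, atol=0.0)
        )
    return report

# Made-up arrays standing in for real model outputs.
base = np.linspace(0.1, 1.0, 24, dtype=np.float32).reshape(1, 6, 4)
fake_results = {
    "One_lstm_batch.onnx": base,
    "One_lstm_batch_sim.onnx": base.copy(),
    "Two_lstm_batch.onnx": base,
    "Two_lstm_batch_sim.onnx": base * np.float32(1.001),  # deliberately off
}
report = compare_pairs(fake_results)
print(report)
```

Pairing by name means an extra or missing file in the directory cannot silently shift every later comparison by one position.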

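    Let's look at the ONNX graphs:

    | Type | Single layer | Two layers | Two layers, bidirectional |
    | --- | --- | --- | --- |
    | batch_first=True | [image] | [image] | [image] |
    | batch_first=False | [image] | [image] | [image] |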
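
When reading the weight initializers of the LSTM nodes in these graphs, it helps to know that the ONNX LSTM operator stacks its gates in the order input, output, forget, cell (iofc), while PyTorch's nn.LSTM stores them as input, forget, cell, output. The NumPy sketch below illustrates the reordering on a labelled dummy matrix; the function name `torch_to_onnx_gate_order` is hypothetical, and the real ONNX tensor additionally carries a leading num_directions axis that is omitted here.

```python
import numpy as np

def torch_to_onnx_gate_order(weight):
    """Reorder stacked gate blocks along axis 0.

    PyTorch nn.LSTM stacks gates as (i, f, g, o); the ONNX LSTM operator
    expects (i, o, f, c), where ONNX "c" is PyTorch's cell gate "g".
    """
    i, f, g, o = np.split(weight, 4, axis=0)
    return np.concatenate([i, o, f, g], axis=0)

hidden_size, input_size = 4, 3
# Label each gate block with a constant so the reordering is visible:
# rows 0-3 -> 0 (i), rows 4-7 -> 1 (f), rows 8-11 -> 2 (g), rows 12-15 -> 3 (o).
labels = np.repeat(np.arange(4), hidden_size)
torch_style = np.tile(labels[:, None], (1, input_size)).astype(np.float32)

onnx_style = torch_to_onnx_gate_order(torch_style)
# Read back the label of the first row of each block.
block_labels = onnx_style[::hidden_size, 0].astype(int).tolist()
print(block_labels)  # [0, 3, 1, 2]
```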

    3 Generating LSTMs with TensorFlow 2

    Here I use TensorFlow 2.8.

    import os
    import tensorflow as tf
    import onnx
    import tf2onnx
    from onnxsim import simplify
    import onnxruntime
    import numpy as np
    from tensorflow.keras import layers as nn
    #only use cpu
    devices = tf.config.list_physical_devices("CPU")
    tf.config.set_visible_devices(devices)
    
    
    2022-08-09 16:10:42.419611: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
    

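    Paddle and PyTorch return the output of every time step by default, whereas TensorFlow lets you choose between returning only the last step and returning all steps through return_sequences. To stay consistent with the other two frameworks, it is set to True here.
    TensorFlow's initial input layout is B,H,W,C, and the models below are built on that basis.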
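
To illustrate the layout difference, the NumPy sketch below (made-up data, no framework required) shows that the PyTorch-style B,C,H,W input from section 2.1 and the TensorFlow-style B,H,W,C input end up as the same (batch, time, feature) sequence once the height axis is squeezed out.

```python
import numpy as np

# A made-up sequence: batch=1, 6 time steps (width), 3 features (channels).
seq = np.arange(18, dtype=np.float32).reshape(1, 6, 3)

# PyTorch-style input: B, C, H, W = (1, 3, 1, 6)
nchw = seq.transpose(0, 2, 1)[:, :, np.newaxis, :]
# TensorFlow-style input: B, H, W, C = (1, 1, 6, 3)
nhwc = seq[:, np.newaxis, :, :]

# PyTorch path: squeeze H (axis 2), then permute (0, 2, 1) to get (B, W, C).
from_nchw = np.squeeze(nchw, axis=2).transpose(0, 2, 1)
# TensorFlow path: squeeze H (axis 1); the result is already (B, W, C).
from_nhwc = np.squeeze(nhwc, axis=1)

print(nchw.shape, nhwc.shape, from_nchw.shape, from_nhwc.shape)
print(np.array_equal(from_nchw, from_nhwc))
```

This is why the TensorFlow models need only a squeeze before the LSTM layer, with no permute.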

    3.1 time_major=False

    def One_LSTM_batch():
        input = nn.Input(shape=[1,6,3],batch_size=1,name="input")
        middle = tf.squeeze(input,axis=1)
        output = nn.LSTM(4,time_major=False,return_sequences=True,name='one')(middle)
        model = tf.keras.models.Model(input,output,name="One_LSTM_batch")
        return model
    model = One_LSTM_batch()
    #tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
    model.save("tensorflow/One_LSTM_batch")
    spec = (tf.TensorSpec((1,1,6,3),tf.float32,name="input"),)
    output_path="tensorflow/"+model.name+'.onnx'
    model_proto,_=tf2onnx.convert.from_keras(model,input_signature=spec,opset=11,output_path=output_path)
    output_names=[n.name for n in model_proto.graph.output]
    model = onnx.load(output_path)
    model_sim ,check = simplify(model)
    assert check,"simplified onnx model could not be validated"
    save_path = os.path.splitext(output_path)[0]+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    def Two_LSTM_batch():
        input = nn.Input(shape=[1,6,3],batch_size=1,name="input")
        middle = tf.squeeze(input,axis=1)
        output1 = nn.LSTM(4,time_major=False,return_sequences=True,name='one')(middle)
        output = nn.LSTM(4,time_major=False,return_sequences=True,name='two')(output1)
        model = tf.keras.models.Model(input,output,name="Two_LSTM_batch")
        return model
    model = Two_LSTM_batch()
    #tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
    model.save("tensorflow/Two_LSTM_batch")
    spec = (tf.TensorSpec((1,1,6,3),tf.float32,name="input"),)
    output_path="tensorflow/"+model.name+'.onnx'
    model_proto,_=tf2onnx.convert.from_keras(model,input_signature=spec,opset=11,output_path=output_path)
    output_names=[n.name for n in model_proto.graph.output]
    model = onnx.load(output_path)
    model_sim ,check = simplify(model)
    assert check,"simplified onnx model could not be validated"
    save_path = os.path.splitext(output_path)[0]+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    def Bi_Two_LSTM_batch():
        input = nn.Input(shape=[1,6,3],batch_size=1,name="input")
        middle = tf.squeeze(input,axis=1)
        output1 = nn.Bidirectional(nn.LSTM(4,time_major=False,return_sequences=True,name='one'),merge_mode="concat")(middle)
        output = nn.Bidirectional(nn.LSTM(4,time_major=False,return_sequences=True,name='two'),merge_mode="concat")(output1)
        model = tf.keras.models.Model(input,output,name="Bi_Two_LSTM_batch")
        return model
    model = Bi_Two_LSTM_batch()
    #tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
    model.save("tensorflow/Bi_Two_LSTM_batch")
    spec = (tf.TensorSpec((1,1,6,3),tf.float32,name="input"),)
    output_path="tensorflow/"+model.name+'.onnx'
    model_proto,_=tf2onnx.convert.from_keras(model,input_signature=spec,opset=11,output_path=output_path)
    output_names=[n.name for n in model_proto.graph.output]
    model = onnx.load(output_path)
    model_sim ,check = simplify(model)
    assert check,"simplified onnx model could not be validated"
    save_path = os.path.splitext(output_path)[0]+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
    
    
    WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
    WARNING:absl:Found untraced functions such as lstm_cell_191_layer_call_fn, lstm_cell_191_layer_call_and_return_conditional_losses while saving (showing 2 of 2). These functions will not be directly callable after loading.
    
    
    INFO:tensorflow:Assets written to: tensorflow/One_LSTM_batch/assets
    
    
    INFO:tensorflow:Assets written to: tensorflow/One_LSTM_batch/assets
    WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
    2022-08-16 16:19:10.687955: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-16 16:19:10.688047: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-16 16:19:10.706293: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-16 16:19:10.707378: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-16 16:19:10.708381: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-16 16:19:10.709447: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    2022-08-16 16:19:10.719888: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
      function_optimizer: Graph size after: 87 nodes (60), 98 edges (68), time = 2.078ms.
      function_optimizer: Graph size after: 87 nodes (0), 98 edges (0), time = 1.092ms.
    Optimization results for grappler item: while_cond_1209930
      function_optimizer: function_optimizer did nothing. time = 0.004ms.
      function_optimizer: function_optimizer did nothing. time = 0ms.
    Optimization results for grappler item: while_body_1209931
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0ms.
    
    2022-08-16 16:19:10.780564: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-16 16:19:10.780636: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-16 16:19:10.798557: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-16 16:19:10.799642: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-16 16:19:10.800648: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-16 16:19:10.801728: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    2022-08-16 16:19:10.812277: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
      constant_folding: Graph size after: 30 nodes (-23), 30 edges (-27), time = 1.377ms.
      function_optimizer: Graph size after: 30 nodes (0), 30 edges (0), time = 0.6ms.
      constant_folding: Graph size after: 30 nodes (0), 30 edges (0), time = 0.554ms.
      function_optimizer: Graph size after: 30 nodes (0), 30 edges (0), time = 0.592ms.
    Optimization results for grappler item: while_cond_1209930
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.272ms.
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.182ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_1209931
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.766ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.646ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    
    
    
    WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
    WARNING:absl:Found untraced functions such as lstm_cell_192_layer_call_fn, lstm_cell_192_layer_call_and_return_conditional_losses, lstm_cell_193_layer_call_fn, lstm_cell_193_layer_call_and_return_conditional_losses while saving (showing 4 of 4). These functions will not be directly callable after loading.
    
    
    INFO:tensorflow:Assets written to: tensorflow/Two_LSTM_batch/assets
    WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
    WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
    2022-08-16 16:19:16.650941: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-16 16:19:16.651052: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-16 16:19:16.669042: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-16 16:19:16.670119: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-16 16:19:16.671115: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-16 16:19:16.672189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    2022-08-16 16:19:16.689047: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
      function_optimizer: Graph size after: 170 nodes (120), 193 edges (136), time = 3.802ms.
      function_optimizer: Graph size after: 170 nodes (0), 193 edges (0), time = 2.068ms.
    Optimization results for grappler item: while_cond_1221920
      function_optimizer: function_optimizer did nothing. time = 0.005ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_1221921
      function_optimizer: function_optimizer did nothing. time = 0.003ms.
      function_optimizer: function_optimizer did nothing. time = 0ms.
    Optimization results for grappler item: while_body_1221499
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_cond_1221498
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0ms.
    
    2022-08-16 16:19:16.788220: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-16 16:19:16.788291: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-16 16:19:16.812080: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-16 16:19:16.813181: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-16 16:19:16.814187: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-16 16:19:16.815259: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    2022-08-16 16:19:16.832669: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
      constant_folding: Graph size after: 56 nodes (-46), 57 edges (-54), time = 2.381ms.
      function_optimizer: Graph size after: 56 nodes (0), 57 edges (0), time = 1.085ms.
      constant_folding: Graph size after: 56 nodes (0), 57 edges (0), time = 0.99ms.
      function_optimizer: Graph size after: 56 nodes (0), 57 edges (0), time = 1.093ms.
    Optimization results for grappler item: while_cond_1221920
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.284ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.182ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_1221921
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.779ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.65ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_1221499
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.776ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.645ms.
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
    Optimization results for grappler item: while_cond_1221498
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.271ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.182ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    
    
    
    WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
    WARNING:absl:Found untraced functions such as lstm_cell_195_layer_call_fn, lstm_cell_195_layer_call_and_return_conditional_losses, lstm_cell_196_layer_call_fn, lstm_cell_196_layer_call_and_return_conditional_losses, lstm_cell_198_layer_call_fn while saving (showing 5 of 8). These functions will not be directly callable after loading.
    
    
    INFO:tensorflow:Assets written to: tensorflow/Bi_Two_LSTM_batch/assets
    WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
    WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
    WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
    WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
    2022-08-16 16:19:31.962845: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-16 16:19:31.962974: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-16 16:19:31.980983: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-16 16:19:31.982057: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-16 16:19:31.983048: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-16 16:19:31.984113: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    2022-08-16 16:19:32.016037: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
      function_optimizer: Graph size after: 348 nodes (244), 397 edges (276), time = 8.373ms.
      function_optimizer: Graph size after: 348 nodes (0), 397 edges (0), time = 4.505ms.
    Optimization results for grappler item: while_cond_1256176
      function_optimizer: function_optimizer did nothing. time = 0.004ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_1255323
      function_optimizer: function_optimizer did nothing. time = 0.003ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_cond_1255322
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_1256601
      function_optimizer: function_optimizer did nothing. time = 0.003ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_cond_1255746
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_1256177
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_1255747
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_cond_1256600
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    
    2022-08-16 16:19:32.194241: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-16 16:19:32.194314: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-16 16:19:32.212231: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-16 16:19:32.213308: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-16 16:19:32.214300: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-16 16:19:32.215369: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    2022-08-16 16:19:32.248503: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
      constant_folding: Graph size after: 120 nodes (-92), 125 edges (-108), time = 4.738ms.
      function_optimizer: Graph size after: 120 nodes (0), 125 edges (0), time = 2.234ms.
      constant_folding: Graph size after: 120 nodes (0), 125 edges (0), time = 2.266ms.
      function_optimizer: Graph size after: 120 nodes (0), 125 edges (0), time = 2.25ms.
    Optimization results for grappler item: while_cond_1256176
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.274ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.181ms.
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
    Optimization results for grappler item: while_body_1255323
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.79ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.647ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_cond_1255322
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.266ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.18ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_1256601
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.777ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.654ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_cond_1255746
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.276ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.181ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_1256177
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.78ms.
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.651ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_1255747
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.772ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.649ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_cond_1256600
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.277ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.181ms.
      function_optimizer: function_optimizer did nothing. time = 0ms.
    

    3.2 time_major=True
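
    The models in this subsection squeeze and transpose the input before the LSTM so that the time axis comes first. As a minimal sketch of that layout change only, the NumPy snippet below mirrors `tf.squeeze(input,axis=1)` and `tf.transpose(middle1,[1,0,2])`; the array `x` and its contents are hypothetical and chosen just to match the `(1,1,6,3)` input signature used here.

```python
import numpy as np

# Hypothetical input with the same shape as the models' input signature:
# (batch=1, 1, time=6, features=3).
x = np.arange(18, dtype=np.float32).reshape(1, 1, 6, 3)

# Mirrors tf.squeeze(input, axis=1): drop the singleton axis.
# Result is batch-major: (batch, time, features) = (1, 6, 3).
batch_major = np.squeeze(x, axis=1)

# Mirrors tf.transpose(middle1, [1, 0, 2]): swap batch and time.
# Result is time-major: (time, batch, features) = (6, 1, 3).
time_major = np.transpose(batch_major, (1, 0, 2))

print(batch_major.shape)  # (1, 6, 3)
print(time_major.shape)   # (6, 1, 3)
```

    The values are unchanged; only the axis order differs, so time step `t` of sample `b` moves from `batch_major[b, t]` to `time_major[t, b]`.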

    def One_LSTM_time():
        # Input arrives as (batch=1, 1, time=6, features=3).
        input = nn.Input(shape=[1,6,3],batch_size=1,name="input")
        # Drop the singleton axis: (1, 1, 6, 3) -> (1, 6, 3).
        middle1 = tf.squeeze(input,axis=1)
        # Swap batch and time so the LSTM receives time-major data: (6, 1, 3).
        middle = tf.transpose(middle1,[1,0,2])
        output = nn.LSTM(4,time_major=True,return_sequences=True,name='one')(middle)
        model = tf.keras.models.Model(input,output,name="One_LSTM_time")
        return model
    model = One_LSTM_time()
    #tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
    # Save the Keras model to the tensorflow/One_LSTM_time directory.
    model.save("tensorflow/One_LSTM_time")
    # Convert to ONNX (opset 11) with a fixed input signature.
    spec = (tf.TensorSpec((1,1,6,3),tf.float32,name="input"),)
    output_path="tensorflow/"+model.name+'.onnx'
    model_proto,_=tf2onnx.convert.from_keras(model,input_signature=spec,opset=11,output_path=output_path)
    output_names=[n.name for n in model_proto.graph.output]
    # Reload the exported file, simplify it with onnxsim, and save the result as *_sim.onnx.
    model = onnx.load(output_path)
    model_sim ,check = simplify(model)
    assert check,"simplified onnx model could not be validated"
    save_path = output_path.split('.')[0]+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    def Two_LSTM_time():
        # Same preprocessing as the single-layer model: (1, 1, 6, 3) -> (6, 1, 3).
        input = nn.Input(shape=[1,6,3],batch_size=1,name="input")
        middle1 = tf.squeeze(input,axis=1)
        middle = tf.transpose(middle1,[1,0,2])
        # Two stacked time-major LSTM layers; the first must return the full sequence for the second.
        output1 = nn.LSTM(4,time_major=True,return_sequences=True,name='one')(middle)
        output = nn.LSTM(4,time_major=True,return_sequences=True,name='two')(output1)
        model = tf.keras.models.Model(input,output,name="Two_LSTM_time")
        return model
    model = Two_LSTM_time()
    #tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
    model.save("tensorflow/Two_LSTM_time")
    # Convert to ONNX (opset 11) with a fixed input signature.
    spec = (tf.TensorSpec((1,1,6,3),tf.float32,name="input"),)
    output_path="tensorflow/"+model.name+'.onnx'
    
    model_proto,_=tf2onnx.convert.from_keras(model,input_signature=spec,opset=11,output_path=output_path)
    output_names=[n.name for n in model_proto.graph.output]
    # Reload the exported file, simplify it with onnxsim, and save the result as *_sim.onnx.
    model = onnx.load(output_path)
    model_sim ,check = simplify(model)
    assert check,"simplified onnx model could not be validated"
    save_path = output_path.split('.')[0]+"_sim.onnx"
    onnx.save(model_sim,save_path)
    
    def Bi_Two_LSTM_time():
        # Same preprocessing as the single-layer model: (1, 1, 6, 3) -> (6, 1, 3).
        input = nn.Input(shape=[1,6,3],batch_size=1,name="input")
        middle1 = tf.squeeze(input,axis=1)
        middle = tf.transpose(middle1,[1,0,2])
        # Each Bidirectional wrapper concatenates forward and backward outputs (4 + 4 = 8 features).
        output1 = nn.Bidirectional(nn.LSTM(4,time_major=True,return_sequences=True,name='one'),merge_mode="concat")(middle)
        output = nn.Bidirectional(nn.LSTM(4,time_major=True,return_sequences=True,name='two'),merge_mode="concat")(output1)
        model = tf.keras.models.Model(input,output,name="Bi_Two_LSTM_time")
        return model
    model = Bi_Two_LSTM_time()
    #tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
    model.save("tensorflow/Bi_Two_LSTM_time")
    # Convert to ONNX (opset 11) with a fixed input signature.
    spec = (tf.TensorSpec((1,1,6,3),tf.float32,name="input"),)
    output_path="tensorflow/"+model.name+'.onnx'
    model_proto,_=tf2onnx.convert.from_keras(model,input_signature=spec,opset=11,output_path=output_path)
    output_names=[n.name for n in model_proto.graph.output]
    # Reload the exported file, simplify it with onnxsim, and save the result as *_sim.onnx.
    model = onnx.load(output_path)
    model_sim ,check = simplify(model)
    assert check,"simplified onnx model could not be validated"
    save_path = output_path.split('.')[0]+"_sim.onnx"
    onnx.save(model_sim,save_path)
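
    The export steps above build the simplified file name with `output_path.split('.')[0]`. That works for paths such as `tensorflow/One_LSTM_time.onnx`, but it cuts the path at the first dot, so a path beginning with `./` would collapse to an empty prefix. A minimal sketch of a more defensive alternative follows; the helper name `simplified_path` is hypothetical and not part of the original code.

```python
import os


def simplified_path(onnx_path: str) -> str:
    """Return the file name for the simplified model: a/b.onnx -> a/b_sim.onnx."""
    # os.path.splitext splits only at the final extension,
    # so dots elsewhere in the path are left untouched.
    root, ext = os.path.splitext(onnx_path)
    return f"{root}_sim{ext}"


print(simplified_path("tensorflow/One_LSTM_time.onnx"))
print(simplified_path("./tensorflow/One_LSTM_time.onnx"))
```

    For the paths used in this article both approaches give the same result; the helper only matters if the output directory is changed to one whose path contains a dot.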
    
    WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
    WARNING:absl:Found untraced functions such as lstm_cell_8_layer_call_fn, lstm_cell_8_layer_call_and_return_conditional_losses while saving (showing 2 of 2). These functions will not be directly callable after loading.
    
    
    INFO:tensorflow:Assets written to: tensorflow/One_LSTM_time/assets
    WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
    2022-08-16 15:43:52.434772: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-16 15:43:52.434862: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-16 15:43:52.452892: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-16 15:43:52.453967: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-16 15:43:52.454957: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-16 15:43:52.456033: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    2022-08-16 15:43:52.465981: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
      function_optimizer: Graph size after: 85 nodes (56), 96 edges (64), time = 1.955ms.
      function_optimizer: Graph size after: 85 nodes (0), 96 edges (0), time = 1.076ms.
    Optimization results for grappler item: while_cond_48297
      function_optimizer: function_optimizer did nothing. time = 0.004ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_48298
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0ms.
    
    2022-08-16 15:43:52.520244: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-16 15:43:52.520309: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-16 15:43:52.538162: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-16 15:43:52.539247: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-16 15:43:52.540262: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-16 15:43:52.541338: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    2022-08-16 15:43:52.551627: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
      constant_folding: Graph size after: 28 nodes (-23), 28 edges (-27), time = 1.298ms.
      function_optimizer: Graph size after: 28 nodes (0), 28 edges (0), time = 0.575ms.
      constant_folding: Graph size after: 28 nodes (0), 28 edges (0), time = 0.512ms.
      function_optimizer: Graph size after: 28 nodes (0), 28 edges (0), time = 0.588ms.
    Optimization results for grappler item: while_cond_48297
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.269ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.179ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_48298
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.769ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.642ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    
    
    
    WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
    WARNING:absl:Found untraced functions such as lstm_cell_9_layer_call_fn, lstm_cell_9_layer_call_and_return_conditional_losses, lstm_cell_10_layer_call_fn, lstm_cell_10_layer_call_and_return_conditional_losses while saving (showing 4 of 4). These functions will not be directly callable after loading.
    
    
    INFO:tensorflow:Assets written to: tensorflow/Two_LSTM_time/assets
    WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
    WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
    2022-08-16 15:43:57.663352: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-16 15:43:57.663442: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-16 15:43:57.681413: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-16 15:43:57.682504: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-16 15:43:57.683492: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-16 15:43:57.684558: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    2022-08-16 15:43:57.701027: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
      function_optimizer: Graph size after: 164 nodes (112), 187 edges (128), time = 3.95ms.
      function_optimizer: Graph size after: 164 nodes (0), 187 edges (0), time = 2.055ms.
    Optimization results for grappler item: while_cond_59528
      function_optimizer: function_optimizer did nothing. time = 0.004ms.
      function_optimizer: function_optimizer did nothing. time = 0ms.
    Optimization results for grappler item: while_body_59937
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0ms.
    Optimization results for grappler item: while_cond_59936
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_59529
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0ms.
    
    2022-08-16 15:43:57.789964: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-16 15:43:57.790031: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-16 15:43:57.807917: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-16 15:43:57.809002: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-16 15:43:57.809991: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-16 15:43:57.811055: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    2022-08-16 15:43:57.832990: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
      constant_folding: Graph size after: 50 nodes (-46), 51 edges (-54), time = 2.266ms.
      function_optimizer: Graph size after: 50 nodes (0), 51 edges (0), time = 1.037ms.
      constant_folding: Graph size after: 50 nodes (0), 51 edges (0), time = 0.898ms.
      function_optimizer: Graph size after: 50 nodes (0), 51 edges (0), time = 1.077ms.
    Optimization results for grappler item: while_cond_59528
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.274ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.182ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_59937
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.767ms.
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.655ms.
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
    Optimization results for grappler item: while_cond_59936
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.261ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.183ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_59529
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 2.66ms.
      function_optimizer: function_optimizer did nothing. time = 0.004ms.
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 2.102ms.
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
    
    
    
    WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
    
    
    WARNING:absl:Found untraced functions such as lstm_cell_12_layer_call_fn, lstm_cell_12_layer_call_and_return_conditional_losses, lstm_cell_13_layer_call_fn, lstm_cell_13_layer_call_and_return_conditional_losses, lstm_cell_15_layer_call_fn while saving (showing 5 of 8). These functions will not be directly callable after loading.
    
    
    INFO:tensorflow:Assets written to: tensorflow/Bi_Two_LSTM_time/assets
    
    
    WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming  to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
    2022-08-16 15:44:12.614816: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-16 15:44:12.614936: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-16 15:44:12.633044: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-16 15:44:12.634138: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-16 15:44:12.635134: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-16 15:44:12.636200: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    2022-08-16 15:44:12.667021: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
      function_optimizer: Graph size after: 334 nodes (228), 383 edges (260), time = 8.111ms.
      function_optimizer: Graph size after: 334 nodes (0), 383 edges (0), time = 4.232ms.
    Optimization results for grappler item: while_body_93140
      function_optimizer: function_optimizer did nothing. time = 0.005ms.
      function_optimizer: function_optimizer did nothing. time = 0ms.
    Optimization results for grappler item: while_body_93550
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0ms.
    Optimization results for grappler item: while_cond_92313
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_cond_93139
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_cond_93549
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_92314
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_cond_92723
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_92724
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    
    2022-08-16 15:44:12.837667: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-16 15:44:12.837739: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-16 15:44:12.855667: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-16 15:44:12.856749: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-16 15:44:12.857738: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-16 15:44:12.858802: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    2022-08-16 15:44:12.896488: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1164] Optimization results for grappler item: graph_to_optimize
      constant_folding: Graph size after: 106 nodes (-92), 111 edges (-108), time = 4.422ms.
      function_optimizer: Graph size after: 106 nodes (0), 111 edges (0), time = 2.035ms.
      constant_folding: Graph size after: 106 nodes (0), 111 edges (0), time = 1.936ms.
      function_optimizer: Graph size after: 106 nodes (0), 111 edges (0), time = 2.095ms.
    Optimization results for grappler item: while_body_93140
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.783ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.648ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_93550
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.778ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.65ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_cond_92313
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.265ms.
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.181ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_cond_93139
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.258ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.179ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_cond_93549
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.893ms.
      function_optimizer: function_optimizer did nothing. time = 0.004ms.
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.633ms.
      function_optimizer: function_optimizer did nothing. time = 0.004ms.
    Optimization results for grappler item: while_body_92314
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 1.257ms.
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.959ms.
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
    Optimization results for grappler item: while_cond_92723
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.377ms.
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      constant_folding: Graph size after: 14 nodes (0), 4 edges (0), time = 0.274ms.
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
    Optimization results for grappler item: while_body_92724
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 1.113ms.
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
      constant_folding: Graph size after: 50 nodes (0), 50 edges (0), time = 0.952ms.
      function_optimizer: function_optimizer did nothing. time = 0.002ms.
    

    3.3 return_state=True

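    With return_state=True, an LSTM layer also returns its final hidden state and cell state, so the states of one layer can be passed to the next layer as its initial state.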
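
    As a rough illustration of what return_state=True adds, the sketch below does shape bookkeeping only for a time-major LSTM layer; it does not run TensorFlow. The helper name lstm_output_shapes is hypothetical, and the bidirectional branch assumes merge_mode="concat", as used by the models in this section.

```python
from typing import List, Tuple

Shape = Tuple[int, ...]


def lstm_output_shapes(
    input_shape: Shape,
    units: int,
    return_sequences: bool = True,
    return_state: bool = True,
    bidirectional: bool = False,
) -> List[Shape]:
    """Shapes returned by a time-major LSTM layer (hypothetical helper).

    input_shape is (timesteps, batch, features), i.e. the time_major=True layout.
    For bidirectional layers, merge_mode="concat" is assumed.
    """
    timesteps, batch, _features = input_shape
    directions = 2 if bidirectional else 1
    width = units * directions  # concat doubles the feature width

    if return_sequences:
        shapes = [(timesteps, batch, width)]
    else:
        shapes = [(batch, width)]

    if return_state:
        # One (h, c) pair per direction; each state has shape (batch, units).
        shapes += [(batch, units)] * (2 * directions)
    return shapes


# The models in this section feed a (6, 1, 3) tensor into LSTM(4, ...).
middle = (6, 1, 3)
one = lstm_output_shapes(middle, units=4)
bi = lstm_output_shapes(middle, units=4, bidirectional=True)
print(one)  # [(6, 1, 4), (1, 4), (1, 4)]
print(bi)   # [(6, 1, 8), (1, 4), (1, 4), (1, 4), (1, 4)]
```

    The first entry is the sequence output that the next layer consumes; the remaining (1, 4) entries are the final states that can be handed to initial_state.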

    def One_LSTM_time_state():
        # Input is (1, 1, 6, 3): drop the dummy axis, then switch to time-major (6, 1, 3).
        input = nn.Input(shape=[1, 6, 3], batch_size=1, name="input")
        middle1 = tf.squeeze(input, axis=1)
        middle = tf.transpose(middle1, [1, 0, 2])
        # return_state=True also returns the final hidden state and cell state.
        output, h_state, c_state = nn.LSTM(4, time_major=True, return_sequences=True, return_state=True, name='one')(middle)
        model = tf.keras.models.Model(inputs=input, outputs=[output, h_state, c_state], name="One_LSTM_time_state")
        return model

    model = One_LSTM_time_state()
    #tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
    model.save("tensorflow/One_LSTM_time_state")
    # Convert the Keras model to ONNX (opset 11) and write it to output_path.
    spec = (tf.TensorSpec((1, 1, 6, 3), tf.float32, name="input"),)
    output_path = "tensorflow/" + model.name + '.onnx'
    model_proto, _ = tf2onnx.convert.from_keras(model, input_signature=spec, opset=11, output_path=output_path)
    output_names = [n.name for n in model_proto.graph.output]
    # Reload the ONNX file, simplify it with onnxsim, and save it next to the original.
    model = onnx.load(output_path)
    model_sim, check = simplify(model)
    assert check, "simplified onnx model could not be validated"
    save_path = output_path.split('.')[0] + "_sim.onnx"
    onnx.save(model_sim, save_path)
    
    def Two_LSTM_time_state():
        # Same preprocessing as above: (1, 1, 6, 3) -> (1, 6, 3) -> time-major (6, 1, 3).
        input = nn.Input(shape=[1, 6, 3], batch_size=1, name="input")
        middle1 = tf.squeeze(input, axis=1)
        middle = tf.transpose(middle1, [1, 0, 2])
        output1, h_state, c_state = nn.LSTM(4, time_major=True, return_sequences=True, return_state=True, name='one')(middle)
        # The final states of layer 'one' become the initial state of layer 'two'.
        output, h_state1, c_state1 = nn.LSTM(4, time_major=True, return_sequences=True, return_state=True, name='two')(output1, initial_state=(h_state, c_state))
        model = tf.keras.models.Model(inputs=input, outputs=[output, h_state1, c_state1], name="Two_LSTM_time_state")
        return model

    model = Two_LSTM_time_state()
    #tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
    model.save("tensorflow/Two_LSTM_time_state")
    # Convert the Keras model to ONNX (opset 11) and write it to output_path.
    spec = (tf.TensorSpec((1, 1, 6, 3), tf.float32, name="input"),)
    output_path = "tensorflow/" + model.name + '.onnx'
    model_proto, _ = tf2onnx.convert.from_keras(model, input_signature=spec, opset=11, output_path=output_path)
    output_names = [n.name for n in model_proto.graph.output]
    # Reload the ONNX file, simplify it with onnxsim, and save it next to the original.
    model = onnx.load(output_path)
    model_sim, check = simplify(model)
    assert check, "simplified onnx model could not be validated"
    save_path = output_path.split('.')[0] + "_sim.onnx"
    onnx.save(model_sim, save_path)
    
    def Bi_Two_LSTM_time_state():
        # Same preprocessing as above: (1, 1, 6, 3) -> (1, 6, 3) -> time-major (6, 1, 3).
        input = nn.Input(shape=[1, 6, 3], batch_size=1, name="input")
        middle1 = tf.squeeze(input, axis=1)
        middle = tf.transpose(middle1, [1, 0, 2])
        # A bidirectional layer returns four states: forward h, forward c, backward h, backward c.
        output1, h_state, c_state, h_state1, c_state1 = nn.Bidirectional(nn.LSTM(4, time_major=True, return_sequences=True, return_state=True, name='one'), merge_mode="concat")(middle)
        # With return_state=True, `output` is a list: the sequence output followed by the four final states.
        output = nn.Bidirectional(nn.LSTM(4, time_major=True, return_sequences=True, return_state=True, name='two'), merge_mode="concat")(output1, initial_state=(h_state, c_state, h_state1, c_state1))
        model = tf.keras.models.Model(inputs=input, outputs=output, name="Bi_Two_LSTM_time_state")
        return model

    model = Bi_Two_LSTM_time_state()
    #tf.keras.utils.plot_model(model,to_file=f'tensorflow/{model.name}.png',show_shapes=True,show_layer_names=True,show_dtype=True)
    model.save("tensorflow/Bi_Two_LSTM_time_state")
    # Convert the Keras model to ONNX (opset 11) and write it to output_path.
    spec = (tf.TensorSpec((1, 1, 6, 3), tf.float32, name="input"),)
    output_path = "tensorflow/" + model.name + '.onnx'
    model_proto, _ = tf2onnx.convert.from_keras(model, input_signature=spec, opset=11, output_path=output_path)
    output_names = [n.name for n in model_proto.graph.output]
    # Reload the ONNX file, simplify it with onnxsim, and save it next to the original.
    model = onnx.load(output_path)
    model_sim, check = simplify(model)
    assert check, "simplified onnx model could not be validated"
    save_path = output_path.split('.')[0] + "_sim.onnx"
    onnx.save(model_sim, save_path)
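
    The listings above derive the simplified file name with output_path.split('.')[0], which works here because nothing else in the path contains a dot. A minimal sketch of a more defensive variant, assuming a hypothetical helper named simplified_path built on os.path.splitext, could look like this:

```python
import os


def simplified_path(onnx_path: str, suffix: str = "_sim") -> str:
    """Insert `suffix` before the file extension (hypothetical helper)."""
    # splitext only splits off the last extension of the final path component,
    # so dots in directory names are left untouched.
    root, ext = os.path.splitext(onnx_path)
    return f"{root}{suffix}{ext}"


print(simplified_path("tensorflow/One_LSTM_time_state.onnx"))
# tensorflow/One_LSTM_time_state_sim.onnx

# A directory name containing a dot (made up for illustration):
dotted = "models.v2/One_LSTM_time_state.onnx"
print(simplified_path(dotted))            # models.v2/One_LSTM_time_state_sim.onnx
print(dotted.split('.')[0] + "_sim.onnx")  # models_sim.onnx
```

    For the paths used in this article both approaches give the same result; the difference only shows up once a directory or model name contains a dot.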
    
    WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
    
    
    WARNING:absl:Found untraced functions such as lstm_cell_151_layer_call_fn, lstm_cell_151_layer_call_and_return_conditional_losses while saving (showing 2 of 2). These functions will not be directly callable after loading.
    
    
    INFO:tensorflow:Assets written to: tensorflow/One_LSTM_time_state/assets
    
    
    2022-08-10 10:27:29.569744: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-10 10:27:29.569887: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-10 10:27:29.587666: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-10 10:27:29.588730: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-10 10:27:29.589789: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-10 10:27:29.590839: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    2022-08-10 10:27:29.657141: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-10 10:27:29.657227: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-10 10:27:29.674878: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-10 10:27:29.675934: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-10 10:27:29.676968: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-10 10:27:29.678019: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    
    
    WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
    
    
    WARNING:absl:Found untraced functions such as lstm_cell_152_layer_call_fn, lstm_cell_152_layer_call_and_return_conditional_losses, lstm_cell_153_layer_call_fn, lstm_cell_153_layer_call_and_return_conditional_losses while saving (showing 4 of 4). These functions will not be directly callable after loading.
    
    
    INFO:tensorflow:Assets written to: tensorflow/Two_LSTM_time_state/assets
    
    
    2022-08-10 10:27:34.989735: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-10 10:27:34.989854: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-10 10:27:35.007588: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-10 10:27:35.008641: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-10 10:27:35.009675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-10 10:27:35.010708: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    2022-08-10 10:27:35.118860: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-10 10:27:35.118956: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-10 10:27:35.136643: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-10 10:27:35.137703: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-10 10:27:35.138736: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-10 10:27:35.139769: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    
    
    WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
    
    
    WARNING:absl:Found untraced functions such as lstm_cell_155_layer_call_fn, lstm_cell_155_layer_call_and_return_conditional_losses, lstm_cell_156_layer_call_fn, lstm_cell_156_layer_call_and_return_conditional_losses, lstm_cell_158_layer_call_fn while saving (showing 5 of 8). These functions will not be directly callable after loading.
    
    
    INFO:tensorflow:Assets written to: tensorflow/Bi_Two_LSTM_time_state/assets
    
    
    2022-08-10 10:27:50.572328: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-10 10:27:50.572459: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-10 10:27:50.590225: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-10 10:27:50.591290: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-10 10:27:50.592334: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-10 10:27:50.593388: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    2022-08-10 10:27:50.800566: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 4
    2022-08-10 10:27:50.800672: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
    2022-08-10 10:27:50.818458: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14627 MB memory:  -> device: 0, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:da:00.0, compute capability: 7.0
    2022-08-10 10:27:50.819552: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14627 MB memory:  -> device: 1, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:db:00.0, compute capability: 7.0
    2022-08-10 10:27:50.820605: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 14627 MB memory:  -> device: 2, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dc:00.0, compute capability: 7.0
    2022-08-10 10:27:50.821643: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 14627 MB memory:  -> device: 3, name: Tesla V100-PCIE-16GB-LS, pci bus id: 0000:dd:00.0, compute capability: 7.0
    

    3.4 Inspecting the generated models

    tf_models = sorted(os.listdir('tensorflow'))
    tf_models_path=[os.path.join('tensorflow',p) for p in tf_models if p.endswith('onnx')]
    print(tf_models)
    
    ['Bi_Two_LSTM_batch', 'Bi_Two_LSTM_batch.onnx', 'Bi_Two_LSTM_batch_sim.onnx', 'Bi_Two_LSTM_time', 'Bi_Two_LSTM_time.onnx', 'Bi_Two_LSTM_time_sim.onnx', 'Bi_Two_LSTM_time_state', 'Bi_Two_LSTM_time_state.onnx', 'Bi_Two_LSTM_time_state_sim.onnx', 'One_LSTM_batch', 'One_LSTM_batch.onnx', 'One_LSTM_batch_sim.onnx', 'One_LSTM_time', 'One_LSTM_time.onnx', 'One_LSTM_time_sim.onnx', 'One_LSTM_time_state', 'One_LSTM_time_state.onnx', 'One_LSTM_time_state_sim.onnx', 'Two_LSTM_batch', 'Two_LSTM_batch.onnx', 'Two_LSTM_batch_sim.onnx', 'Two_LSTM_time', 'Two_LSTM_time.onnx', 'Two_LSTM_time_sim.onnx', 'Two_LSTM_time_state', 'Two_LSTM_time_state.onnx', 'Two_LSTM_time_state_sim.onnx']
    
    tf_models_path
    
    ['tensorflow/Bi_Two_LSTM_batch.onnx',
     'tensorflow/Bi_Two_LSTM_batch_sim.onnx',
     'tensorflow/Bi_Two_LSTM_time.onnx',
     'tensorflow/Bi_Two_LSTM_time_sim.onnx',
     'tensorflow/Bi_Two_LSTM_time_state.onnx',
     'tensorflow/Bi_Two_LSTM_time_state_sim.onnx',
     'tensorflow/One_LSTM_batch.onnx',
     'tensorflow/One_LSTM_batch_sim.onnx',
     'tensorflow/One_LSTM_time.onnx',
     'tensorflow/One_LSTM_time_sim.onnx',
     'tensorflow/One_LSTM_time_state.onnx',
     'tensorflow/One_LSTM_time_state_sim.onnx',
     'tensorflow/Two_LSTM_batch.onnx',
     'tensorflow/Two_LSTM_batch_sim.onnx',
     'tensorflow/Two_LSTM_time.onnx',
     'tensorflow/Two_LSTM_time_sim.onnx',
     'tensorflow/Two_LSTM_time_state.onnx',
     'tensorflow/Two_LSTM_time_state_sim.onnx']
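
    The listing above suggests that every model comes as three artifacts: a SavedModel directory, an `.onnx` file, and a simplified `_sim.onnx` file. A minimal sketch of grouping such a listing by model name, so that a missing artifact is easy to spot, could look like the following. The helper name `group_model_artifacts` and the naming convention it relies on are assumptions made for illustration; they are not part of the original notebook.

```python
import os
from collections import defaultdict


def group_model_artifacts(names):
    """Group directory entries by model name.

    Assumes (hypothetically) the naming convention seen in the listing:
    `<name>` is a SavedModel directory, `<name>.onnx` the converted model,
    and `<name>_sim.onnx` the simplified model.
    """
    groups = defaultdict(dict)
    for entry in sorted(names):
        stem, ext = os.path.splitext(entry)
        if ext == ".onnx" and stem.endswith("_sim"):
            groups[stem[: -len("_sim")]]["simplified"] = entry
        elif ext == ".onnx":
            groups[stem]["onnx"] = entry
        else:
            groups[entry]["saved_model"] = entry
    return dict(groups)


if __name__ == "__main__":
    listing = [
        "One_LSTM_time", "One_LSTM_time.onnx", "One_LSTM_time_sim.onnx",
        "Two_LSTM_batch", "Two_LSTM_batch.onnx",
    ]
    expected = {"saved_model", "onnx", "simplified"}
    for model, artifacts in group_model_artifacts(listing).items():
        missing = sorted(expected - artifacts.keys())
        print(model, "complete" if not missing else f"missing {missing}")
```

    With a real directory, the input would simply be `os.listdir('tensorflow')`.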
    
    ! du -sh tensorflow/*
    
    4.2M	tensorflow/Bi_Two_LSTM_batch
    8.0K	tensorflow/Bi_Two_LSTM_batch.onnx
    8.0K	tensorflow/Bi_Two_LSTM_batch_sim.onnx
    4.0M	tensorflow/Bi_Two_LSTM_time
    8.0K	tensorflow/Bi_Two_LSTM_time.onnx
    8.0K	tensorflow/Bi_Two_LSTM_time_sim.onnx
    4.0M	tensorflow/Bi_Two_LSTM_time_state
    8.0K	tensorflow/Bi_Two_LSTM_time_state.onnx
    8.0K	tensorflow/Bi_Two_LSTM_time_state_sim.onnx
    708K	tensorflow/One_LSTM_batch
    4.0K	tensorflow/One_LSTM_batch.onnx
    4.0K	tensorflow/One_LSTM_batch_sim.onnx
    684K	tensorflow/One_LSTM_time
    4.0K	tensorflow/One_LSTM_time.onnx
    4.0K	tensorflow/One_LSTM_time_sim.onnx
    696K	tensorflow/One_LSTM_time_state
    4.0K	tensorflow/One_LSTM_time_state.onnx
    4.0K	tensorflow/One_LSTM_time_state_sim.onnx
    1.4M	tensorflow/Two_LSTM_batch
    4.0K	tensorflow/Two_LSTM_batch.onnx
    4.0K	tensorflow/Two_LSTM_batch_sim.onnx
    1.3M	tensorflow/Two_LSTM_time
    4.0K	tensorflow/Two_LSTM_time.onnx
    4.0K	tensorflow/Two_LSTM_time_sim.onnx
    1.3M	tensorflow/Two_LSTM_time_state
    4.0K	tensorflow/Two_LSTM_time_state.onnx
    4.0K	tensorflow/Two_LSTM_time_state_sim.onnx
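
    `du` reports disk usage, which is normally rounded up to whole filesystem blocks, so the small `.onnx` files most likely appear as 4.0K or 8.0K regardless of their exact length. To see the byte-exact sizes from inside the notebook, a sketch like the following could be used. The helper names `path_size_bytes` and `report_sizes` are hypothetical and not part of the original code.

```python
import os


def path_size_bytes(path):
    """Return the size in bytes of a file, or of all files under a directory."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def report_sizes(folder):
    """Map each entry of `folder` to its size in bytes, sorted by name."""
    return {
        name: path_size_bytes(os.path.join(folder, name))
        for name in sorted(os.listdir(folder))
    }


if __name__ == "__main__":
    # Assumes the 'tensorflow' output folder from the steps above exists.
    if os.path.isdir("tensorflow"):
        for name, size in report_sizes("tensorflow").items():
            print(f"{size:>10d}  {name}")
```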
    
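    def onnx_infer(model_path,data):
        """Run an ONNX model with onnxruntime and return its first output.
    
        Args:
            model_path (str): path to the .onnx file
            data (np.ndarray): array fed to the model's first input
        Returns:
            np.ndarray: value of the model's first output
        """
        onnx_session=onnxruntime.InferenceSession(model_path)
        # Only the first input and the first output are used
        input_name = onnx_session.get_inputs()[0].name
        output_name = onnx_session.get_outputs()[0].name
        result = onnx_session.run([output_name],{input_name:data})
        return result[0]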
    
    # Uncomment to check a single model only; the output below was produced with the full list
    # tf_models_path=["tensorflow/One_LSTM_time.onnx"]
    
    # Random input shared by all models; layout: batch,channel,height,width
    test_data = np.random.random(size=(1,1,6,3)).astype(np.float32)
    for onnx_path in tf_models_path:
        base_path = os.path.splitext(onnx_path)[0]
        # Skip the simplified models; each one is checked together with its original
        if base_path.endswith('sim'):
            continue
        onnx_sim=base_path+'_sim.onnx'
        results={}
        # The SavedModel directory has the same name as the ONNX file without its extension
        tf_result = tf.keras.models.load_model(base_path)(tf.convert_to_tensor(test_data))
        # Multi-output models return a list; only the first output is compared
        if isinstance(tf_result,(list,tuple)):
            tf_result=tf_result[0]
        results[os.path.basename(base_path)]=tf_result.numpy()
        results[os.path.basename(onnx_path)]=onnx_infer(onnx_path,test_data)
        results[os.path.basename(onnx_sim)]=onnx_infer(onnx_sim,test_data)
        try:
            values = list(results.values())
            np.testing.assert_allclose(values[0],values[1],rtol=1e-5)
            np.testing.assert_allclose(values[1],values[2],rtol=1e-5)
            print(f"{list(results.keys())} have same results")
        except AssertionError:
            print(f"{list(results.keys())} have different results")
    
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    ['Bi_Two_LSTM_batch', 'Bi_Two_LSTM_batch.onnx', 'Bi_Two_LSTM_batch_sim.onnx'] have same results
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    ['Bi_Two_LSTM_time', 'Bi_Two_LSTM_time.onnx', 'Bi_Two_LSTM_time_sim.onnx'] have same results
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    ['Bi_Two_LSTM_time_state', 'Bi_Two_LSTM_time_state.onnx', 'Bi_Two_LSTM_time_state_sim.onnx'] have same results
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    ['One_LSTM_batch', 'One_LSTM_batch.onnx', 'One_LSTM_batch_sim.onnx'] have same results
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    ['One_LSTM_time', 'One_LSTM_time.onnx', 'One_LSTM_time_sim.onnx'] have same results
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    ['One_LSTM_time_state', 'One_LSTM_time_state.onnx', 'One_LSTM_time_state_sim.onnx'] have same results
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    ['Two_LSTM_batch', 'Two_LSTM_batch.onnx', 'Two_LSTM_batch_sim.onnx'] have same results
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    ['Two_LSTM_time', 'Two_LSTM_time.onnx', 'Two_LSTM_time_sim.onnx'] have same results
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
    
    
    ['Two_LSTM_time_state', 'Two_LSTM_time_state.onnx', 'Two_LSTM_time_state_sim.onnx'] have same results
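
    The check above only prints whether the three outputs agree. When a comparison fails, it usually helps to know by how much. The following sketch applies the same criterion as `np.testing.assert_allclose` (`|candidate - reference| <= atol + rtol * |reference|`) but returns the largest absolute and relative differences instead of raising. The helper name `compare_outputs` is an assumption for illustration and is not part of the original notebook.

```python
import numpy as np


def compare_outputs(reference, candidate, rtol=1e-5, atol=0.0):
    """Return (ok, max_abs_diff, max_rel_diff) for two arrays.

    `ok` follows the assert_allclose criterion:
    |candidate - reference| <= atol + rtol * |reference|
    """
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        # Different shapes can never match
        return False, float("inf"), float("inf")
    if reference.size == 0:
        return True, 0.0, 0.0
    abs_diff = np.abs(candidate - reference)
    # Avoid division by zero when the reference contains exact zeros
    denom = np.maximum(np.abs(reference), np.finfo(np.float64).tiny)
    ok = bool(np.all(abs_diff <= atol + rtol * np.abs(reference)))
    return ok, float(abs_diff.max()), float((abs_diff / denom).max())


if __name__ == "__main__":
    ref = np.array([1.0, 2.0, 3.0])
    print(compare_outputs(ref, ref + 1e-7))   # within tolerance
    print(compare_outputs(ref, ref * 1.01))   # 1% off, outside tolerance
```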
    

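    Viewing the ONNX structure:
    Unlike the previous two frameworks, which return the states by default, TensorFlow lets you choose whether to return them. If they are returned, all of them are returned, not only those of the last step.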

    Type | Single layer | Two layers | Two layers, bidirectional
    time_major=False, return_state=False | [image] | [image] | [image]
    time_major=True, return_state=False | [image] | [image] | [image]
    time_major=True, return_state=True | [image] | [image] | [image]
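
    To make the two settings listed above more concrete, here is a minimal NumPy sketch of a single-layer LSTM forward pass with a `time_major` and a `return_state` switch. It is not the Keras implementation: the function name `lstm_forward`, the weight layout, and the gate order (i, f, g, o) are assumptions of this sketch, and the extra values it returns are simply the hidden and cell state after the final time step. What each framework actually exports should be checked in the ONNX graphs.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_forward(x, W, U, b, time_major=False, return_state=False):
    """Single-layer LSTM forward pass returning the full output sequence.

    x: (batch, time, features), or (time, batch, features) if time_major.
    W: (features, 4 * units), U: (units, 4 * units), b: (4 * units,)
    Gate order i, f, g, o is an assumption of this sketch.
    """
    if not time_major:
        x = np.swapaxes(x, 0, 1)          # -> (time, batch, features)
    steps, batch, _ = x.shape
    units = U.shape[0]
    h = np.zeros((batch, units))
    c = np.zeros((batch, units))
    outputs = []
    for t in range(steps):
        z = x[t] @ W + h @ U + b
        i, f, g, o = np.split(z, 4, axis=-1)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    seq = np.stack(outputs)               # (time, batch, units)
    if not time_major:
        seq = np.swapaxes(seq, 0, 1)      # back to (batch, time, units)
    return (seq, h, c) if return_state else seq


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W, U, b = rng.normal(size=(3, 8)), rng.normal(size=(2, 8)), np.zeros(8)
    x = rng.normal(size=(1, 6, 3))        # batch, time, features
    seq, h, c = lstm_forward(x, W, U, b, return_state=True)
    print(seq.shape, h.shape, c.shape)    # (1, 6, 2) (1, 2) (1, 2)
```

    Only the layout of the input and output changes with `time_major`; the computed values are the same after swapping the first two axes.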

    4 LSTM conversion tool

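    A small tool is available at https://gitee.com/tdddeeel/lstm_pytorch_tensorflow_paddle that converts a bidirectional LSTM into two unidirectional LSTMs.
    As examples, the two-layer bidirectional LSTMs from PyTorch (batch_first=True) and TensorFlow (time_major=True) are converted here. The accuracy of the ONNX models before and after conversion can be compared in the same way as in the examples above; in testing, no problems were found. At present the tool only works with Paddle and PyTorch models.
    Usage: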

    ./bilstm_opt --onnx_path ... --save_path ..
    

    Take a look at the ONNX models:

    Type | Before conversion | After conversion
    pytorch | [image] | [image]
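
    The idea behind splitting a bidirectional LSTM can be illustrated in NumPy: the forward half is an ordinary LSTM, and the backward half is an ordinary LSTM applied to the time-reversed input whose output is reversed again before concatenation. The sketch below only demonstrates this equivalence; it is not the implementation of `bilstm_opt`, and all function names, the weight layout, and the gate order (i, f, g, o) are assumptions.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x_t, h, c, params):
    """One LSTM step; params = (W, U, b), gate order i, f, g, o (assumed)."""
    W, U, b = params
    i, f, g, o = np.split(x_t @ W + h @ U + b, 4, axis=-1)
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c


def unidirectional(x, params):
    """Forward-only LSTM over x of shape (time, batch, features)."""
    units = params[1].shape[0]
    h = np.zeros((x.shape[1], units))
    c = np.zeros((x.shape[1], units))
    out = []
    for x_t in x:
        h, c = lstm_step(x_t, h, c, params)
        out.append(h)
    return np.stack(out)


def bidirectional(x, fwd, bwd):
    """Fused bidirectional LSTM: both directions advance in one loop."""
    steps, batch, _ = x.shape
    units = fwd[1].shape[0]
    out = np.zeros((steps, batch, 2 * units))
    hf, cf = np.zeros((batch, units)), np.zeros((batch, units))
    hb, cb = np.zeros((batch, units)), np.zeros((batch, units))
    for t in range(steps):
        hf, cf = lstm_step(x[t], hf, cf, fwd)
        hb, cb = lstm_step(x[steps - 1 - t], hb, cb, bwd)
        out[t, :, :units] = hf                 # forward half at step t
        out[steps - 1 - t, :, units:] = hb     # backward half at mirrored step
    return out


def split_bidirectional(x, fwd, bwd):
    """Same result from two unidirectional LSTMs plus reverse and concat."""
    y_fwd = unidirectional(x, fwd)
    y_bwd = unidirectional(x[::-1], bwd)[::-1]
    return np.concatenate([y_fwd, y_bwd], axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    fwd = (rng.normal(size=(3, 8)), rng.normal(size=(2, 8)), rng.normal(size=8))
    bwd = (rng.normal(size=(3, 8)), rng.normal(size=(2, 8)), rng.normal(size=8))
    x = rng.normal(size=(6, 1, 3))             # time, batch, features
    same = np.allclose(bidirectional(x, fwd, bwd), split_bidirectional(x, fwd, bwd))
    print("outputs match:", same)
```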
  • Original article: https://blog.csdn.net/u011119817/article/details/126465541