• MobileNetV2 Explained, with a Multi-Class Animal Classification Walkthrough


    1. MobileNet in Detail

    MobileNetV1

    Traditional CNNs have large memory footprints and heavy computation, which makes them impractical on mobile and embedded devices. Profiling shows that the convolutional and fully connected layers are the two most time-consuming stages, and the cost grows with batch size, so these are the parts that need to be made lightweight.

     MobileNet replaces standard convolutions with depthwise separable convolutions, i.e. a DW (depth-wise) convolution followed by a PW (point-wise) convolution, which sharply reduces the number of parameters.

    In a DW convolution, each kernel is responsible for exactly one channel of the input feature map, so each kernel has channel = 1, and input channels = number of kernels = output channels. In other words, a DW convolution does not change the channel dimension of the feature map.

    A PW convolution has the same structure as a standard convolution; its distinguishing feature is a kernel size of 1.

    Stacking DW and PW convolutions together forms the MobileNet network.

     A network built this way has far fewer parameters and is therefore much more lightweight.
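    As a minimal sketch (channel and feature-map sizes chosen arbitrarily for illustration), a depthwise separable convolution can be built in PyTorch from a Conv2d with groups=in_channels (the DW part) followed by a 1x1 Conv2d (the PW part); the output shape matches a standard 3x3 convolution, with far fewer parameters:

    ```python
    import torch
    import torch.nn as nn

    in_c, out_c = 32, 64
    standard = nn.Conv2d(in_c, out_c, kernel_size=3, padding=1, bias=False)
    dw = nn.Conv2d(in_c, in_c, kernel_size=3, padding=1, groups=in_c, bias=False)  # depth-wise
    pw = nn.Conv2d(in_c, out_c, kernel_size=1, bias=False)                         # point-wise

    x = torch.randn(1, in_c, 56, 56)
    print(standard(x).shape, pw(dw(x)).shape)  # both: torch.Size([1, 64, 56, 56])

    p_std = sum(p.numel() for p in standard.parameters())  # 3*3*32*64 = 18432
    p_sep = sum(p.numel() for p in dw.parameters()) + sum(p.numel() for p in pw.parameters())  # 3*3*32 + 32*64 = 2336
    print(p_std, p_sep)  # 18432 2336
    ```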

    In terms of computation, with DF the feature-map size, DK the kernel size, M the input channels and N the output channels, a standard convolution costs DF*DF*DK*DK*M*N, while DW+PW costs DF*DF*DK*DK*M + DF*DF*M*N.
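    A quick back-of-the-envelope check of these formulas, with hypothetical sizes (112x112 feature map, 3x3 kernels, 32 input / 64 output channels):

    ```python
    # Hypothetical sizes for illustration only
    DF, DK, M, N = 112, 3, 32, 64
    standard = DF * DF * DK * DK * M * N                 # standard convolution cost
    dw_pw = DF * DF * DK * DK * M + DF * DF * M * N      # DW + PW cost
    ratio = dw_pw / standard                             # simplifies to 1/N + 1/DK**2
    print(round(ratio, 4))  # 0.1267 -- roughly an 8x reduction here
    ```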

    The network also has two hyperparameters: the width multiplier α, which scales the number of kernels (channels), and the resolution multiplier ρ, which scales the input image resolution.

    In the network structure shown below, Conv dw/s1 denotes a DW convolution and Conv/s1 denotes a PW convolution; the network is simply a stack of these modules. The input size is 224*224*3 and the last convolutional layer outputs 7*7*1024. Global average pooling then averages each 7*7 map into a single value, giving 1*1*1024; these 1024 values are fed into a fully connected (FC) layer that outputs the logits for the 1000 classes, and softmax normalizes them into 1000 probabilities.
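    The tail of the network described above can be sketched as a shape walkthrough (illustrative layers built ad hoc, not the actual model code):

    ```python
    import torch
    import torch.nn as nn

    x = torch.randn(1, 1024, 7, 7)            # output of the last conv layer
    x = nn.AdaptiveAvgPool2d((1, 1))(x)       # global average pooling -> [1, 1024, 1, 1]
    x = torch.flatten(x, 1)                   # -> [1, 1024]
    logits = nn.Linear(1024, 1000)(x)         # FC layer -> [1, 1000] class logits
    probs = torch.softmax(logits, dim=1)      # normalized into 1000 probabilities
    print(probs.shape)                        # torch.Size([1, 1000]); probs sum to ~1.0
    ```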

    MobileNetV2 

    V1 is mainly a stack of depthwise separable convolutions. The original paper observed that, after training, most of the DW kernels' parameters were zero, i.e. those kernels were contributing nothing, which motivated the improved MobileNetV2.

    V2 keeps depthwise separable convolutions and additionally introduces an Expansion layer and a Projection layer.

    The Expansion layer uses a 1*1 convolution to map a low-dimensional space into a high-dimensional one; the Projection layer uses a 1*1 convolution to map the high-dimensional space back down to a low-dimensional one.

    The figure shows the whole module in more detail. The input is 24-dimensional and the final output is also 24-dimensional, but in between the channels are expanded 6x, and the depthwise separable convolution operates in that expanded space. The block is thus thick in the middle and thin at both ends, like a spindle. The bottleneck residual block in the ResNet paper is the opposite, thin in the middle and thick at both ends; since MobileNetV2 inverts that shape, its paper calls this structure Inverted Residuals. Note that the residual connection joins the input and the output of the block. The paper also devotes considerable space to the Linear Bottleneck: when mapping from a high-dimensional space down to a low-dimensional one, a ReLU activation can lose or destroy information, so the projection convolution uses a linear activation (i.e. no non-linear activation) instead of ReLU.
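    The point about the linear bottleneck can be seen on a toy vector: ReLU zeroes every negative component, which after a projection into a low-dimensional space means irrecoverable information loss, while a linear output passes the values through unchanged:

    ```python
    import torch

    v = torch.tensor([-1.5, 0.3, -0.2, 2.0])  # pretend this is a low-dimensional projection output
    print(torch.relu(v))                       # tensor([0.0000, 0.3000, 0.0000, 2.0000]) -- negatives lost
    print(v)                                   # linear activation: all components preserved
    ```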

    2. Dataset Preparation

    Create a project folder MobileNet, with a data_set folder inside it to hold the dataset. Inside data_set create a new folder "raw_data" and download a prepared dataset into it (a 10-class animal dataset is used here):

    Link: https://pan.baidu.com/s/12fxvkaIJ9cnmz7iTbU3RtA  Extraction code: 8x8w

    Unzip it into the raw_data folder, then run the "split_data.py" script, which automatically splits the dataset into a training set train and a validation set val.

      split_data.py is as follows:

    import os
    from shutil import copy, rmtree
    import random


    def mk_file(file_path: str):
        if os.path.exists(file_path):
            # if the folder already exists, delete it first and then recreate it
            rmtree(file_path)
        os.makedirs(file_path)


    def main():
        # make the random split reproducible
        random.seed(0)
        # put 10% of the dataset into the validation set
        split_rate = 0.1
        # points to the raw_photo folder you unzipped
        cwd = os.getcwd()
        data_root = os.path.join(cwd, "raw_data")
        origin_flower_path = os.path.join(data_root, "raw_photo")
        assert os.path.exists(origin_flower_path), "path '{}' does not exist.".format(origin_flower_path)
        flower_class = [cla for cla in os.listdir(origin_flower_path)
                        if os.path.isdir(os.path.join(origin_flower_path, cla))]
        # create the folder for the training set
        train_root = os.path.join(data_root, "train")
        mk_file(train_root)
        for cla in flower_class:
            # create a folder for each class
            mk_file(os.path.join(train_root, cla))
        # create the folder for the validation set
        val_root = os.path.join(data_root, "val")
        mk_file(val_root)
        for cla in flower_class:
            # create a folder for each class
            mk_file(os.path.join(val_root, cla))
        for cla in flower_class:
            cla_path = os.path.join(origin_flower_path, cla)
            images = os.listdir(cla_path)
            num = len(images)
            # randomly sample the images for the validation set
            eval_index = random.sample(images, k=int(num * split_rate))
            for index, image in enumerate(images):
                if image in eval_index:
                    # copy files assigned to the validation set into the matching folder
                    image_path = os.path.join(cla_path, image)
                    new_path = os.path.join(val_root, cla)
                    copy(image_path, new_path)
                else:
                    # copy files assigned to the training set into the matching folder
                    image_path = os.path.join(cla_path, image)
                    new_path = os.path.join(train_root, cla)
                    copy(image_path, new_path)
                print("\r[{}] processing [{}/{}]".format(cla, index + 1, num), end="")  # processing bar
            print()
        print("processing done!")


    if __name__ == '__main__':
        main()

    Afterwards, train and val folders are generated inside raw_data; the dataset is now ready.

     3. Defining the Network

    From the overall structure you can see that V2 is built from bottlenecks (inverted residual blocks); several of them (the parameter n in the table gives the count) are grouped into a block.

    The network definition below is a lightly modified and simplified version of the official PyTorch code; the logic is explained in the comments of the modified code.

    Official PyTorch source code: MobileNetV2

    The modified model_v2.py (imported later by train.py):

    from torch import nn
    import torch


    def _make_divisible(ch, divisor=8, min_ch=None):
        """
        Round the channel count to the nearest multiple of 8; this is more
        hardware-friendly and can also speed up training slightly.
        This function is taken from the original tf repo.
        It ensures that all layers have a channel number that is divisible by 8
        It can be seen here:
        https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py
        """
        if min_ch is None:
            min_ch = divisor
        new_ch = max(min_ch, int(ch + divisor / 2) // divisor * divisor)
        # Make sure that round down does not go down by more than 10%.
        if new_ch < 0.9 * ch:
            new_ch += divisor
        return new_ch


    class ConvBNReLU(nn.Sequential):
        def __init__(self, in_channel, out_channel, kernel_size=3, stride=1, groups=1):
            padding = (kernel_size - 1) // 2
            super(ConvBNReLU, self).__init__(
                # groups=1 gives a standard convolution; groups = input depth gives a DW convolution
                nn.Conv2d(in_channel, out_channel, kernel_size, stride, padding, groups=groups, bias=False),
                nn.BatchNorm2d(out_channel),
                nn.ReLU6(inplace=True)
            )


    class InvertedResidual(nn.Module):
        def __init__(self, in_channel, out_channel, stride, expand_ratio):
            super(InvertedResidual, self).__init__()
            hidden_channel = in_channel * expand_ratio
            # use the shortcut branch only when stride is 1 and input/output dims match
            self.use_shortcut = stride == 1 and in_channel == out_channel
            layers = []
            if expand_ratio != 1:
                # 1x1 pointwise conv
                layers.append(ConvBNReLU(in_channel, hidden_channel, kernel_size=1))
            layers.extend([
                # 3x3 depthwise conv
                ConvBNReLU(hidden_channel, hidden_channel, stride=stride, groups=hidden_channel),
                # 1x1 pointwise conv (linear)
                nn.Conv2d(hidden_channel, out_channel, kernel_size=1, bias=False),
                nn.BatchNorm2d(out_channel),
            ])
            self.conv = nn.Sequential(*layers)

        def forward(self, x):
            if self.use_shortcut:
                return x + self.conv(x)
            else:
                return self.conv(x)


    class MobileNetV2(nn.Module):
        def __init__(self, num_classes=1000, alpha=1.0, round_nearest=8):
            super(MobileNetV2, self).__init__()
            block = InvertedResidual
            input_channel = _make_divisible(32 * alpha, round_nearest)
            last_channel = _make_divisible(1280 * alpha, round_nearest)
            inverted_residual_setting = [
                # t, c, n, s
                # t: expansion factor for the input depth
                # c: output channel count
                # n: number of times the bottleneck (inverted residual) repeats
                # s: stride of the first bottleneck in each block
                [1, 16, 1, 1],
                [6, 24, 2, 2],
                [6, 32, 3, 2],
                [6, 64, 4, 2],
                [6, 96, 3, 1],
                [6, 160, 3, 2],
                [6, 320, 1, 1],
            ]
            features = []
            # conv1 layer
            features.append(ConvBNReLU(3, input_channel, stride=2))
            # building inverted residual blocks
            for t, c, n, s in inverted_residual_setting:
                output_channel = _make_divisible(c * alpha, round_nearest)
                for i in range(n):
                    stride = s if i == 0 else 1
                    features.append(block(input_channel, output_channel, stride, expand_ratio=t))
                    input_channel = output_channel
            # building last several layers
            features.append(ConvBNReLU(input_channel, last_channel, 1))
            # combine feature layers
            self.features = nn.Sequential(*features)
            # building classifier
            self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
            self.classifier = nn.Sequential(
                nn.Dropout(0.2),
                nn.Linear(last_channel, num_classes)
            )
            # weight initialization
            for m in self.modules():
                if isinstance(m, nn.Conv2d):
                    nn.init.kaiming_normal_(m.weight, mode='fan_out')
                    if m.bias is not None:
                        nn.init.zeros_(m.bias)
                elif isinstance(m, nn.BatchNorm2d):
                    nn.init.ones_(m.weight)
                    nn.init.zeros_(m.bias)
                elif isinstance(m, nn.Linear):
                    nn.init.normal_(m.weight, 0, 0.01)
                    nn.init.zeros_(m.bias)

        def forward(self, x):
            x = self.features(x)
            x = self.avgpool(x)
            x = torch.flatten(x, 1)
            x = self.classifier(x)
            return x


    if __name__ == "__main__":
        net = MobileNetV2(num_classes=10)
        in_data = torch.randn(1, 3, 224, 224)
        out = net(in_data)
        print(out)

    Once the network is defined, run this file on its own to check that the definition is correct. If it produces output, everything is fine.

    Here the output is

    tensor([[-0.1906,  0.1323, -0.0054,  0.0503, -0.4200, -0.2074, -0.1114,  0.4141,
              0.2739, -0.0870]], grad_fn=<AddmmBackward0>)

    which shows the network is defined correctly.

    4. Training

     Loading the dataset

    First define a dictionary of transforms for preprocessing train and val: cropping to 224*224, random horizontal flips for the training set (the validation set generally doesn't need this), conversion to tensors, and image normalization.

    Then load the datasets with the DataLoader module, set batch_size to 16, and set the number of data-loading worker processes nw to speed things up.

    import os
    import sys
    import json
    import torch
    import torch.nn as nn
    import torch.optim as optim
    from torchvision import transforms, datasets
    from tqdm import tqdm
    from model_v2 import MobileNetV2


    def main():
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        print(f"using {device} device.")
        data_transform = {
            "train": transforms.Compose([transforms.RandomResizedCrop(224),
                                         transforms.RandomHorizontalFlip(),
                                         transforms.ToTensor(),
                                         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]),
            "val": transforms.Compose([transforms.Resize(256),
                                       transforms.CenterCrop(224),
                                       transforms.ToTensor(),
                                       transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])}
        # dataset path
        image_path = os.path.join(os.getcwd(), "data_set", "raw_data")
        assert os.path.exists(image_path), "{} path does not exist.".format(image_path)
        # prepare the datasets for reading
        train_dataset = datasets.ImageFolder(root=os.path.join(image_path, "train"), transform=data_transform["train"])
        validate_dataset = datasets.ImageFolder(root=os.path.join(image_path, "val"), transform=data_transform["val"])
        nw = min([os.cpu_count(), 16 if 16 > 1 else 0, 8])  # number of workers
        print('Using {} dataloader workers every process'.format(nw))
        # build the data loaders
        train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=16, shuffle=True, num_workers=nw)
        validate_loader = torch.utils.data.DataLoader(validate_dataset, batch_size=16, shuffle=False, num_workers=nw)
        train_num = len(train_dataset)
        val_num = len(validate_dataset)
        print(f"using {train_num} images for training, {val_num} images for validation.")

    Generating the json file

    Convert the training set's class labels into a dictionary and write it to a file named 'class_indices.json':

    1. Get the class-label-to-index mapping from train_dataset and store it in the flower_list variable.
    2. Use a comprehension to invert the key-value pairs of flower_list into a new dictionary cla_dict, whose keys are the indices and whose values are the original class labels.
    3. Use json.dumps() to convert cla_dict into a JSON-formatted string with an indent of 4 spaces.
    4. Use a with open() statement to open 'class_indices.json' in write mode and write the JSON string to it.
    # {'cane':0, 'cavallo':1, 'elefante':2, 'farfalla':3, 'gallina':4, 'gatto':5, 'mucca':6, 'pecora':7, 'ragno':8, 'scoiattolo':9}
    flower_list = train_dataset.class_to_idx
    cla_dict = dict((val, key) for key, val in flower_list.items())
    # write dict into json file
    json_str = json.dumps(cla_dict, indent=4)
    with open('class_indices.json', 'w') as json_file:
        json_file.write(json_str)

    Loading the pretrained model and training

    First create the network object net. Transfer learning is used here to make training more effective. Note that the pretrained parameters were trained on ImageNet with 1000 classes, while we need 10-class classification, so the final-layer parameters must be dropped and only the rest kept. Training runs for 5 epochs, using train_bar = tqdm(train_loader, file=sys.stdout) to show a progress bar, followed by back-propagation and parameter updates; after each epoch the accuracy on the validation set is computed, and the model is saved whenever it improves.

    # load pretrain weights
    # download url: https://download.pytorch.org/models/mobilenet_v2-b0353104.pth
    net = MobileNetV2(num_classes=10)
    model_weight_path = "./mobilenet_v2.pth"
    assert os.path.exists(model_weight_path), f"file {model_weight_path} does not exist."
    pre_weights = torch.load(model_weight_path, map_location='cpu')
    # delete classifier weights: the pretrained parameters come from ImageNet (1000 classes),
    # so drop the final-layer parameters and keep everything else
    pre_dict = {k: v for k, v in pre_weights.items() if net.state_dict()[k].numel() == v.numel()}
    missing_keys, unexpected_keys = net.load_state_dict(pre_dict, strict=False)
    # freeze features weights
    for param in net.features.parameters():
        param.requires_grad = False
    net.to(device)
    # define loss function
    loss_function = nn.CrossEntropyLoss()
    # construct an optimizer over the trainable parameters only
    params = [p for p in net.parameters() if p.requires_grad]
    optimizer = optim.Adam(params, lr=0.0001)
    epochs = 5
    best_acc = 0.0
    train_steps = len(train_loader)
    for epoch in range(epochs):
        # train
        net.train()
        running_loss = 0.0
        train_bar = tqdm(train_loader, file=sys.stdout)
        for step, data in enumerate(train_bar):
            images, labels = data
            optimizer.zero_grad()
            logits = net(images.to(device))
            loss = loss_function(logits, labels.to(device))
            loss.backward()
            optimizer.step()
            # print statistics
            running_loss += loss.item()
            train_bar.desc = f"train epoch[{epoch + 1}/{epochs}] loss:{loss:.3f}"
        # validate
        net.eval()
        acc = 0.0  # accumulate accurate number / epoch
        with torch.no_grad():
            val_bar = tqdm(validate_loader, file=sys.stdout)
            for val_data in val_bar:
                val_images, val_labels = val_data
                outputs = net(val_images.to(device))
                predict_y = torch.max(outputs, dim=1)[1]
                acc += torch.eq(predict_y, val_labels.to(device)).sum().item()
                val_bar.desc = f"valid epoch[{epoch + 1}/{epochs}]"
        val_accurate = acc / val_num
        print('[epoch %d] train_loss: %.3f  val_accuracy: %.3f' %
              (epoch + 1, running_loss / train_steps, val_accurate))
        if val_accurate > best_acc:
            best_acc = val_accurate
            torch.save(net, "./MobileNetV2.pth")
    print('Finished Training')


    if __name__ == '__main__':
        main()

     Putting it all together, the complete train.py is as follows:

    import os
    import sys
    import json
    import torch
    import torch.nn as nn
    import torch.optim as optim
    from torchvision import transforms, datasets
    from tqdm import tqdm
    from model_v2 import MobileNetV2


    def main():
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        print(f"using {device} device.")
        data_transform = {
            "train": transforms.Compose([transforms.RandomResizedCrop(224),
                                         transforms.RandomHorizontalFlip(),
                                         transforms.ToTensor(),
                                         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]),
            "val": transforms.Compose([transforms.Resize(256),
                                       transforms.CenterCrop(224),
                                       transforms.ToTensor(),
                                       transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])}
        # dataset path
        image_path = os.path.join(os.getcwd(), "data_set", "raw_data")
        assert os.path.exists(image_path), "{} path does not exist.".format(image_path)
        # prepare the datasets for reading
        train_dataset = datasets.ImageFolder(root=os.path.join(image_path, "train"), transform=data_transform["train"])
        validate_dataset = datasets.ImageFolder(root=os.path.join(image_path, "val"), transform=data_transform["val"])
        nw = min([os.cpu_count(), 16 if 16 > 1 else 0, 8])  # number of workers
        print('Using {} dataloader workers every process'.format(nw))
        # build the data loaders
        train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=16, shuffle=True, num_workers=nw)
        validate_loader = torch.utils.data.DataLoader(validate_dataset, batch_size=16, shuffle=False, num_workers=nw)
        train_num = len(train_dataset)
        val_num = len(validate_dataset)
        print(f"using {train_num} images for training, {val_num} images for validation.")
        # {'cane':0, 'cavallo':1, 'elefante':2, 'farfalla':3, 'gallina':4, 'gatto':5, 'mucca':6, 'pecora':7, 'ragno':8, 'scoiattolo':9}
        flower_list = train_dataset.class_to_idx
        cla_dict = dict((val, key) for key, val in flower_list.items())
        # write dict into json file
        json_str = json.dumps(cla_dict, indent=4)
        with open('class_indices.json', 'w') as json_file:
            json_file.write(json_str)
        # load pretrain weights
        # download url: https://download.pytorch.org/models/mobilenet_v2-b0353104.pth
        net = MobileNetV2(num_classes=10)
        model_weight_path = "./mobilenet_v2.pth"
        assert os.path.exists(model_weight_path), f"file {model_weight_path} does not exist."
        pre_weights = torch.load(model_weight_path, map_location='cpu')
        # delete classifier weights: the pretrained parameters come from ImageNet (1000 classes),
        # so drop the final-layer parameters and keep everything else
        pre_dict = {k: v for k, v in pre_weights.items() if net.state_dict()[k].numel() == v.numel()}
        missing_keys, unexpected_keys = net.load_state_dict(pre_dict, strict=False)
        # freeze features weights
        # only the last layers are fine-tuned, so freeze the feature extractor's parameters
        for param in net.features.parameters():
            param.requires_grad = False
        net.to(device)
        # define loss function
        loss_function = nn.CrossEntropyLoss()
        # construct an optimizer over the trainable parameters only
        params = [p for p in net.parameters() if p.requires_grad]
        optimizer = optim.Adam(params, lr=0.0001)
        epochs = 5
        best_acc = 0.0
        train_steps = len(train_loader)
        for epoch in range(epochs):
            # train
            net.train()
            running_loss = 0.0
            train_bar = tqdm(train_loader, file=sys.stdout)
            for step, data in enumerate(train_bar):
                images, labels = data
                optimizer.zero_grad()
                logits = net(images.to(device))
                loss = loss_function(logits, labels.to(device))
                loss.backward()
                optimizer.step()
                # print statistics
                running_loss += loss.item()
                train_bar.desc = f"train epoch[{epoch + 1}/{epochs}] loss:{loss:.3f}"
            # validate
            net.eval()
            acc = 0.0  # accumulate accurate number / epoch
            with torch.no_grad():
                val_bar = tqdm(validate_loader, file=sys.stdout)
                for val_data in val_bar:
                    val_images, val_labels = val_data
                    outputs = net(val_images.to(device))
                    predict_y = torch.max(outputs, dim=1)[1]
                    acc += torch.eq(predict_y, val_labels.to(device)).sum().item()
                    val_bar.desc = f"valid epoch[{epoch + 1}/{epochs}]"
            val_accurate = acc / val_num
            print('[epoch %d] train_loss: %.3f  val_accuracy: %.3f' %
                  (epoch + 1, running_loss / train_steps, val_accurate))
            if val_accurate > best_acc:
                best_acc = val_accurate
                torch.save(net, "./MobileNetV2.pth")
        print('Finished Training')


    if __name__ == '__main__':
        main()

    5. Prediction

    Create a predict.py file for inference. The input image is preprocessed and converted into a tensor; img = torch.unsqueeze(img, dim=0) adds a dimension of size 1 at the front of the image tensor img, turning its shape from [channels, height, width] into [1, channels, height, width]. Then load the model, run the prediction, print the result, and visualize it.
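    The shape change from torch.unsqueeze can be checked directly (a toy tensor stands in for a real image here):

    ```python
    import torch

    img = torch.randn(3, 224, 224)         # [channels, height, width]
    img = torch.unsqueeze(img, dim=0)      # add a batch dimension of size 1 at the front
    print(img.shape)                       # torch.Size([1, 3, 224, 224])
    ```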

    import os
    import json
    import torch
    from PIL import Image
    from torchvision import transforms
    import matplotlib.pyplot as plt
    from model_v2 import MobileNetV2


    def main():
        device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
        data_transform = transforms.Compose(
            [transforms.Resize(256),
             transforms.CenterCrop(224),
             transforms.ToTensor(),
             transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
        # load image
        img_path = "./ea34b0062bf4063ed1584d05fb1d4e9fe777ead218ac104497f5c978a6ebb3bf_640.jpg"
        img = Image.open(img_path)
        plt.imshow(img)
        # [N, C, H, W]
        img = data_transform(img)
        # expand batch dimension
        img = torch.unsqueeze(img, dim=0)
        # read class_indict
        json_path = './class_indices.json'
        assert os.path.exists(json_path), "file: '{}' does not exist.".format(json_path)
        with open(json_path, "r") as f:
            class_indict = json.load(f)
        # load the saved model (saved with torch.save(net, ...), so this restores the whole module)
        model = torch.load("./MobileNetV2.pth", map_location=device)
        model.eval()
        with torch.no_grad():
            # predict class
            output = torch.squeeze(model(img.to(device))).cpu()
            predict = torch.softmax(output, dim=0)
            predict_cla = torch.argmax(predict).numpy()
        print_res = f"class: {class_indict[str(predict_cla)]}   prob: {predict[predict_cla].numpy():.3}"
        plt.title(print_res)
        for i in range(len(predict)):
            print(f"class: {class_indict[str(i)]:10}   prob: {predict[i].numpy():.3}")
        plt.show()


    if __name__ == '__main__':
        main()

     Prediction result

    6. Model Visualization

    Import the generated pth file into the netron tool; the visualization result is shown below.

    The result is not very readable, so the model is converted to the ONNX format commonly used for embedded deployment.

    Write onnx.py:

    import torch
    from model_v2 import MobileNetV2

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    # the whole module was saved with torch.save(net, ...), so load it back directly
    model = torch.load("/home/lm/MobileNet/MobileNetV2.pth", map_location=device)
    model.eval()
    example = torch.ones(1, 3, 224, 224)  # the network was trained on 224*224 inputs
    example = example.to(device)
    torch.onnx.export(model, example, "MobileNetV2.onnx", verbose=True, opset_version=11)

     

     7. Batch Prediction

    Now create a data folder, place samples to be predicted from several classes inside it, and write code that predicts every sample under the folder, i.e. batch prediction.

    batch_predict.py is as follows:

    import os
    import json
    import torch
    from PIL import Image
    from torchvision import transforms
    from model_v2 import MobileNetV2


    def main():
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        data_transform = transforms.Compose(
            [transforms.Resize(256),
             transforms.CenterCrop(224),
             transforms.ToTensor(),
             transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
        # load image
        # point this at the folder of images to predict
        imgs_root = "./data/imgs"
        # collect the paths of all jpg images in that folder
        img_path_list = [os.path.join(imgs_root, i) for i in os.listdir(imgs_root) if i.endswith(".jpg")]
        # read class_indict
        with open('./class_indices.json', "r") as json_file:
            class_indict = json.load(json_file)
        # load the saved model
        model = torch.load("./MobileNetV2.pth", map_location=device)
        # prediction
        model.eval()
        batch_size = 8  # how many images to pack into one batch per prediction
        with torch.no_grad():
            for ids in range(0, len(img_path_list) // batch_size):
                img_list = []
                for img_path in img_path_list[ids * batch_size: (ids + 1) * batch_size]:
                    img = Image.open(img_path)
                    img = data_transform(img)
                    img_list.append(img)
                # batch img
                # pack all images in img_list into one batch
                batch_img = torch.stack(img_list, dim=0)
                # predict class
                output = model(batch_img.to(device)).cpu()
                predict = torch.softmax(output, dim=1)
                probs, classes = torch.max(predict, dim=1)
                for idx, (pro, cla) in enumerate(zip(probs, classes)):
                    print(f"image: {img_path_list[ids*batch_size+idx]}  class: {class_indict[str(cla.numpy())]}  prob: {pro.numpy():.3}")


    if __name__ == '__main__':
        main()

    Running it produces

    image: ./data/imgs/ea34b4062cf6043ed1584d05fb1d4e9fe777ead218ac104497f5c97ca5edb3bd_640.jpg  class: ragno  prob: 1.0
    image: ./data/imgs/e833b20820f4073ed1584d05fb1d4e9fe777ead218ac104497f5c97ca5ecb5b1_640.jpg  class: ragno  prob: 1.0
    image: ./data/imgs/ea37b0062bfc093ed1584d05fb1d4e9fe777ead218ac104497f5c97faeebb5bb_640.jpg  class: farfalla  prob: 1.0
    image: ./data/imgs/ea37b70b20fd003ed1584d05fb1d4e9fe777ead218ac104497f5c97faeebb5bb_640.jpg  class: farfalla  prob: 0.998
    image: ./data/imgs/ea34b4062cf7013ed1584d05fb1d4e9fe777ead218ac104497f5c97ca5edb3bd_640.jpg  class: ragno  prob: 0.922
    image: ./data/imgs/ea36b0082af5053ed1584d05fb1d4e9fe777ead218ac104497f5c97faee8b1b8_640.jpg  class: farfalla  prob: 0.998
    image: ./data/imgs/ea34b3072ff1043ed1584d05fb1d4e9fe777ead218ac104497f5c97ca5edb3bd_640.jpg  class: ragno  prob: 0.957
    image: ./data/imgs/ea37b20d2bfc063ed1584d05fb1d4e9fe777ead218ac104497f5c97faee8b1b8_640.jpg  class: farfalla  prob: 0.998
    image: ./data/imgs/ea34b3072ff1033ed1584d05fb1d4e9fe777ead218ac104497f5c97ca5ecb3b9_640.jpg  class: ragno  prob: 0.983
    image: ./data/imgs/ea36b7072ff0023ed1584d05fb1d4e9fe777ead218ac104497f5c97faeebb5bb_640.jpg  class: farfalla  prob: 0.989
    image: ./data/imgs/ea36b7072ff7083ed1584d05fb1d4e9fe777ead218ac104497f5c97faee9bdba_640.jpg  class: farfalla  prob: 0.999
    image: ./data/imgs/e834b2082afc093ed1584d05fb1d4e9fe777ead218ac104497f5c97ca5edb3bd_640.jpg  class: ragno  prob: 0.906
    image: ./data/imgs/e832b5072ef4053ed1584d05fb1d4e9fe777ead218ac104497f5c97ca5edb3bd_640.jpg  class: ragno  prob: 0.982
    image: ./data/imgs/ea36b4092dfc033ed1584d05fb1d4e9fe777ead218ac104497f5c97faee9bdba_640.jpg  class: farfalla  prob: 0.998
    image: ./data/imgs/e83cb4072bf21c22d2524518b7444f92e37fe5d404b0144390f8c47ba7ebb0_640.jpg  class: ragno  prob: 1.0
    image: ./data/imgs/ea36b10929f6023ed1584d05fb1d4e9fe777ead218ac104497f5c97faeebb5bb_640.jpg  class: farfalla  prob: 0.999

    which is the expected behavior.

    8. Model Improvements

    Transfer learning was used here, which effectively raises training accuracy and speeds up convergence. Meanwhile, because the model is lightweight, inference is fast and the model can readily be deployed on embedded devices, with typical applications such as face recognition on phones.

    This article used MobileNetV2; the MobileNetV3 network will be tried in a follow-up.

    Other improvement methods will be added later.

  • Original article: https://blog.csdn.net/qq_46454669/article/details/134010321