YOLOv5 create_dataloader: source code and walkthrough

Call chain for building the dataset
create_dataloader(…) -----> LoadImagesAndLabels(…)

create_dataloader

Part 1: Parameters

def create_dataloader(path,
                      imgsz,
                      batch_size,
                      stride,
                      single_cls=False,
                      hyp=None,
                      augment=False,
                      cache=False,
                      pad=0.0,
                      rect=False,
                      rank=-1,
                      workers=8,
                      image_weights=False,
                      quad=False,
                      prefix='',
                      shuffle=False):

The return value is train_loader, dataset.

(1) rect

parser.add_argument('--rect', action='store_true', help='rectangular training')

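The figure below compares square inference with rectangular inference.

[Figure: square inference vs. rectangular inference]

Rectangular inference speeds up the model by cutting redundant information: instead of padding every image to a full square, each image is padded only up to the nearest multiple of the stride.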
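
To put a rough number on the saving, here is a minimal sketch. The helper names square_shape and rect_shape and the 1280x720 image are my own assumptions for illustration, not YOLOv5 functions; the sketch only compares how many pixels the network sees in each mode.

```python
import math

def square_shape(img_size=640):
    # Square inference: every image is letterboxed to img_size x img_size.
    return img_size, img_size

def rect_shape(h, w, img_size=640, stride=32):
    # Rectangular inference (simplified): scale the longer side to img_size,
    # then round each side up to the next multiple of stride.
    r = img_size / max(h, w)
    new_h, new_w = round(h * r), round(w * r)
    pad_h = math.ceil(new_h / stride) * stride
    pad_w = math.ceil(new_w / stride) * stride
    return pad_h, pad_w

h, w = 720, 1280  # hypothetical 16:9 input image
sq = square_shape(640)
rc = rect_shape(h, w)
saved = 1 - (rc[0] * rc[1]) / (sq[0] * sq[1])

print(sq)                             # (640, 640)
print(rc)                             # (384, 640)
print(f"pixels saved: {saved:.1%}")   # pixels saved: 40.0%
```

For a 16:9 image the rectangular shape is 384x640 instead of 640x640, so roughly 40% of the pixels (all of them padding) never reach the network.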

[Reference blog]: 手把手带你调参Yolo v5 (v6.2)（二）

[Reference blog]: yolov5中的Rectangular training和Rectangular inference
The corresponding code in YOLOv5 is:

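        # Rectangular Training
        # (bi and nb are computed earlier in LoadImagesAndLabels.__init__:
        #  bi = np.floor(np.arange(n) / batch_size).astype(int); nb = bi[-1] + 1)
        if self.rect:
            # Sort by aspect ratio
            s = self.shapes  # wh

            ar = s[:, 1] / s[:, 0]  # aspect ratio: height / width of each image
            irect = ar.argsort()  # indices that sort ar in ascending order

            # Reorder files, labels and shapes with the same index array
            self.im_files = [self.im_files[i] for i in irect]
            self.label_files = [self.label_files[i] for i in irect]
            self.labels = [self.labels[i] for i in irect]
                            #self.labels = list(labels)
            self.shapes = s[irect]  # wh

            # Reorder ar itself so that it matches the new image order
            ar = ar[irect]

            # Set training image shapes
            shapes = [[1, 1]] * nb
            for i in range(nb):  # nb: number of batches
                ari = ar[bi == i]  # bi: batch index; ari: aspect ratios of the images in batch i
                mini, maxi = ari.min(), ari.max()  # smallest and largest ratio in this batch

                # Remember: ar is height / width. self.shapes is wh, but each entry of shapes is [h, w]
                if maxi < 1:
                    shapes[i] = [maxi, 1]  # every image is wider than tall: shrink the height
                elif mini > 1:
                    shapes[i] = [1, 1 / mini]  # every image is taller than wide: shrink the width

            # Scale to img_size and round up to a multiple of stride, which gives the minimal padding per batch
            self.batch_shapes = np.ceil(np.array(shapes) * img_size / stride + pad).astype(int) * stride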
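
To see the whole block run outside the class, here is a standalone sketch of the same logic. The six image sizes, batch_size=2 and the variable name shapes_wh are assumptions I made up for the demo; the arithmetic follows the excerpt above.

```python
import numpy as np

img_size, stride, pad, batch_size = 640, 32, 0.0, 2

# Hypothetical image sizes as (width, height), mirroring self.shapes
shapes_wh = np.array([
    [1280, 720],   # h/w = 0.5625
    [320, 640],    # h/w = 2.0
    [1000, 900],   # h/w = 0.9
    [640, 480],    # h/w = 0.75
    [400, 1000],   # h/w = 2.5
    [500, 600],    # h/w = 1.2
], dtype=float)

ar = shapes_wh[:, 1] / shapes_wh[:, 0]   # height / width
irect = ar.argsort()                     # ascending order of aspect ratio
shapes_wh, ar = shapes_wh[irect], ar[irect]

n = len(ar)
bi = np.floor(np.arange(n) / batch_size).astype(int)  # batch index per image
nb = bi[-1] + 1                                       # number of batches

shapes = [[1, 1]] * nb                   # [h, w] as a fraction of img_size
for i in range(nb):
    ari = ar[bi == i]
    mini, maxi = ari.min(), ari.max()
    if maxi < 1:                         # all wide: reduce the height
        shapes[i] = [maxi, 1]
    elif mini > 1:                       # all tall: reduce the width
        shapes[i] = [1, 1 / mini]

batch_shapes = np.ceil(np.array(shapes) * img_size / stride + pad).astype(int) * stride
print(batch_shapes)
# [[480 640]
#  [640 640]
#  [640 320]]
```

The middle batch mixes a wide image (0.9) with a tall one (1.2), so neither branch fires and it stays square at 640x640; sorting by aspect ratio is what keeps such mixed batches rare.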


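ar = ar[irect] turns the sort indices into the sorted array. Note that argsort does not modify the array it is called on; it only returns the indices.
A simple example:

import numpy as np

x = np.array([1, 4, 3, -1, 6, 9])
print(f"before sorting: {x}")
y = x.argsort()
print(f"x after argsort: {x}")
print(f"y (sort indices): {y}")
x = x[y]
print(x)

Output:
before sorting: [ 1  4  3 -1  6  9]
x after argsort: [ 1  4  3 -1  6  9]
y (sort indices): [3 0 2 1 4 5]
[-1  1  3  4  6  9]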

How the loop over ari = ar[bi == i] works
Another example, building on the previous code (x is the sorted array from above):

batch_size = 2
n = 6
bi = np.floor(np.arange(n) / batch_size).astype(int)
nb = bi[-1] + 1
print(f"bi:{bi}", f"nb:{nb}")
shapes = []
for i in range(nb):  # nb: number of batches
    print(bi == i)
    ari = x[bi == i]
    mini, maxi = ari.min(), ari.max()
    print(mini, maxi)
Output:
bi:[0 0 1 1 2 2] nb:3
[ True  True False False False False]
-1 1
[False False  True  True False False]
3 4
[False False False False  True  True]
6 9

(2) quad

parser.add_argument('--quad', action='store_true', help='quad dataloader')

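The benefit is better training results when training at image sizes larger than the default 640.
The side effect is that results at the default size of 640 may be slightly worse.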

(3) Image weights: image_weights

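During training, when the --image-weights flag is set (opt.image_weights is True), a sampling weight is computed for every image: the larger an image's weight, the more likely that image is to be sampled. When the images are iterated afterwards, the loader follows the resampled indices stored in dataset.indices.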
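
Here is a small sketch of that idea on toy data. As far as I can tell YOLOv5 does this by combining per-class weights with per-image class counts and then resampling with random.choices, but treat the code as my reconstruction rather than the library source: the three label arrays, nc=3 and the seed are made up, and the class weights are plain inverse frequencies.

```python
import random
import numpy as np

nc = 3  # number of classes (assumption for the demo)

# Hypothetical per-image labels; each row is [class, x, y, w, h]
labels = [
    np.array([[0, .5, .5, .2, .2], [0, .3, .3, .1, .1]]),  # image 0: two boxes of class 0
    np.array([[1, .5, .5, .2, .2]]),                       # image 1: one box of class 1
    np.array([[2, .5, .5, .2, .2], [0, .4, .4, .1, .1]]),  # image 2: class 2 and class 0
]

# Class weights: inverse frequency, normalized to sum to 1
counts = np.bincount(np.concatenate(labels, 0)[:, 0].astype(int), minlength=nc).astype(float)
counts[counts == 0] = 1
class_weights = 1 / counts
class_weights /= class_weights.sum()

# Image weight: sum of the class weights of all boxes in the image
class_counts = np.array([np.bincount(l[:, 0].astype(int), minlength=nc) for l in labels])
image_weights = (class_weights.reshape(1, nc) * class_counts).sum(1)
print(image_weights)  # 2/7, 3/7, 4/7

# Resample image indices: images with rare classes are drawn more often
random.seed(0)
n = len(labels)
indices = random.choices(range(n), weights=image_weights, k=n)
print(indices)
```

Image 2 carries a rare class plus a common one, so it ends up with the largest weight; image 0 contains only the common class and gets the smallest.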

Class weights

 model.class_weights = labels_to_class_weights(dataset.labels, nc).to(device) * nc  # attach class weights

labels_to_class_weights is defined in utils/general.py:

def labels_to_class_weights(labels, nc=80):
    # Get class weights (inverse frequency) from training labels
    if labels[0] is None:  # no labels loaded
        return torch.Tensor()
    # Stack the per-image label arrays along axis 0 (row-wise), one row per box
    labels = np.concatenate(labels, 0)  # labels.shape = (866643, 5) for COCO

    # What do the columns of a label row mean? (question revisited below)
    classes = labels[:, 0].astype(int)  # labels = [class xywh]
    weights = np.bincount(classes, minlength=nc)  # occurrences per class

    # Prepend gridpoint count (for uCE training)
    # gpi = ((320 / 32 * np.array([1, 2, 4])) ** 2 * 3).sum()  # gridpoints per image
    # weights = np.hstack([gpi * len(labels)  - weights.sum() * 9, weights * 9]) ** 0.5  # prepend gridpoints to start

    weights[weights == 0] = 1  # replace empty bins with 1
    weights = 1 / weights  # inverse of the number of targets (boxes) per class
    weights /= weights.sum()  # normalize
    return torch.from_numpy(weights).float()

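A small worked example makes this clear.
Suppose classes A, B and C have 2, 4 and 8 labels respectively. After weights = 1 / weights they become 1/2, 1/4 and 1/8, which sum to 7/8; normalizing then gives the per-class weights 4/7, 2/7 and 1/7.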
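
The same arithmetic in a few lines of NumPy, as a sketch that mirrors the last three statements of labels_to_class_weights. The counts array and nc=3 are just the A, B, C example; the final multiplication by nc is the one applied where model.class_weights is attached.

```python
import numpy as np

nc = 3
counts = np.array([2, 4, 8], dtype=float)  # boxes per class A, B, C

counts[counts == 0] = 1        # avoid division by zero for empty classes
weights = 1 / counts           # 1/2, 1/4, 1/8
weights /= weights.sum()       # 4/7, 2/7, 1/7
print(weights)

scaled = weights * nc          # as in: labels_to_class_weights(...) * nc
print(scaled, scaled.mean())   # the scaled weights average to 1
```

The rarest class in the example (A, with 2 boxes) ends up with four times the weight of the most common one (C, with 8 boxes).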


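You will notice that I left one question open in the code:

What do the columns of a label row mean?

The inline comment already hints at the answer: each row is [class, x, y, w, h], so column 0 is the class index and the (866643, 5) shape for COCO means one row per annotated box. Getting this straight helps in understanding the whole architecture.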
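
A quick way to convince yourself is to parse a label file by hand. The three lines of label_text below are invented for the demo; the layout (class, then normalized x_center, y_center, width, height) is the usual YOLO text format.

```python
import numpy as np

# Hypothetical contents of one label file: "class x_center y_center width height"
label_text = """0 0.50 0.50 0.20 0.30
2 0.25 0.40 0.10 0.10
0 0.70 0.60 0.15 0.25"""

labels = np.array([line.split() for line in label_text.splitlines()], dtype=np.float32)
print(labels.shape)                        # (3, 5)

classes = labels[:, 0].astype(int)         # column 0 is the class index
print(classes)                             # [0 2 0]
print(np.bincount(classes, minlength=3))   # [2 0 1]
```

This is exactly the labels[:, 0] column that labels_to_class_weights feeds into np.bincount.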

If you need extra image augmentation, install the optional package listed in requirements.txt:

albumentations>=1.0.3

Original article: https://blog.csdn.net/weixin_50862344/article/details/126796709