Towards Class-Oriented Poisoning Attacks Against Neural Networks: Paper Notes

#Paper notes#

1. Paper information

Title	Towards Class-Oriented Poisoning Attacks Against Neural Networks
Authors	Bingyin Zhao
Venue/Publisher	WACV 2022
PDF	📄 Online PDF
Code	None

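Class-oriented availability attacks. Conventional availability attacks only aim to reduce the model's overall accuracy. This paper also considers reducing the accuracy of specific classes, or forcing the model to predict every other class as the target class.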

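2. Introduction

The paper proposes class-oriented availability attacks and generates the poison data with a gradient-based optimization method. A model trained on this poison data shows abnormal accuracy on specific classes.

The optimization objective of availability attacks is a bi-level optimization problem: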

$\underset{\mathcal{D}_{p}}{\arg \max } \sum_{(\boldsymbol{x}, y) \in \mathcal{D}_{\text {val }}} L\left[\mathcal{F}_{\theta^{*}}(\boldsymbol{x}), y, \theta^{*}\right]$
s.t. $\quad \theta^{*} \in \underset{\theta^{*} \in \Theta}{\arg \min } \sum_{(\boldsymbol{x}, y) \in \mathcal{D}_{t r} \cup \mathcal{D}_{p}} L\left[\mathcal{F}_{\theta^{*}}(\boldsymbol{x}), y, \theta\right]$

Threat model
- The attacker knows the algorithm, the model architecture, the hyperparameters, and the training dataset.
- The attacker can inject poisoned data into the training dataset and modify labels.

3. Method

Class-oriented availability attacks come in two forms:

COEG (class-oriented error-generic)

Goal: make the model classify every input as the target class (the supplanter class).

Objective function: $\underset{\mathcal{D}_{p}}{\arg \max } \sum_{(\boldsymbol{x}, y) \in \mathcal{D}_{\text {val }}} L\left[\mathcal{F}_{\theta^{*}}(\boldsymbol{x}), y, \theta^{*}\right]$
s.t. $\quad \theta^{*} \in \underset{\theta^{*} \in \Theta}{\arg \min } \sum_{(\boldsymbol{x}, y) \in \mathcal{D}_{t r} \cup \mathcal{D}_{p}} L\left[\mathcal{F}_{\theta^{*}}(\boldsymbol{x}), y_{s}, \theta\right]$

$y_s$ denotes the target (supplanter) class.

COES (class-oriented error-specific)

Goal: reduce the accuracy of the victim classes while keeping the accuracy of the non-victim classes (all other classes) unchanged.

Objective function: $\underset{\mathcal{D}_{p}}{\arg \max } \sum_{(\boldsymbol{x}, y) \in \mathcal{D}_{v a l}} L\left[\mathcal{F}_{\theta^{*}}(\boldsymbol{x}), y_{v}, \theta^{*}\right]$
s.t. $\quad \theta^{*} \in \underset{\theta^{*} \in \Theta}{\arg \min } \sum_{(\boldsymbol{x}, y) \in \mathcal{D}_{t r} \cup \mathcal{D}_{p}} L\left[\mathcal{F}_{\theta^{*}}(\boldsymbol{x}), y_{\bar{v}}, \theta\right]$

$y_v$ denotes the victim classes and $y_{\bar{v}}$ denotes the non-victim classes.


Procedures for the two attacks

COEG Attack

Objective function:
- $L=\lambda \cdot L_{f_{y_{s}}}-L_{f_{y_{o}}}$
- $L_{f_{y_{s}}}=f_{y_{s}}(\boldsymbol{x})$
- $f_{y_k}$ is the logit corresponding to the categorical label $y_k$
- $f_{y_{o}}(\boldsymbol{x})$ is the logit output of the ground-truth class

The poisoned image $x_{p}$:

$\boldsymbol{x}_{\boldsymbol{p}}=\boldsymbol{x}_{\boldsymbol{o}}-\epsilon \cdot \operatorname{sign}\left(\nabla_{\boldsymbol{x}_{o}}\left(\lambda \cdot L_{f_{y_{s}}}-L_{f_{y_{o}}}\right)\right)$
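
The update above can be sketched in a few lines of NumPy. This is only an illustration under a strong simplifying assumption: the model is linear, with logits $f(\boldsymbol{x}) = W\boldsymbol{x}$, so the gradient of logit $k$ with respect to the input is just the row `W[k]`. The function name `coeg_poison`, the toy sizes, and the values of `eps` and `lam` are made up for the example and do not come from the paper. A real network would need automatic differentiation to get the gradient.

```python
import numpy as np

def coeg_poison(x_o, W, y_o, y_s, eps=0.1, lam=1.0):
    """One signed-gradient step following the COEG update rule in these notes.

    Assumption: a linear model with logits f(x) = W @ x, so the gradient of
    logit k with respect to x is the row W[k].
    """
    # Gradient of L = lam * f_{y_s}(x) - f_{y_o}(x) with respect to x.
    grad = lam * W[y_s] - W[y_o]
    # x_p = x_o - eps * sign(grad)
    return x_o - eps * np.sign(grad)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))   # toy sizes: 3 classes, 5 input features
x_o = rng.normal(size=5)      # a single "clean" input
x_p = coeg_poison(x_o, W, y_o=0, y_s=2, eps=0.1, lam=1.0)
```

Every coordinate of `x_p` differs from `x_o` by at most `eps`, which is what the sign function guarantees.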

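Algorithm flow (shown as a figure in the original post)

COES Attack

COES must lower the accuracy of the targeted (victim) classes while preserving the accuracy of the other classes.

The poisoning process consists of:
1. Select the same number of images from each class.
2. Use Algorithm 2 to strengthen or weaken the features in each image that correspond to its label.
3. Change the labels of the targeted (victim) class.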
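
Steps 1 and 3 can be sketched as follows (step 2, the perturbation itself, is left out here). This is a rough illustration, not the paper's implementation. The notes do not say which label the victim-class images receive, so `flipped_label` is an assumed placeholder, and the function name `select_and_relabel` and the toy labels are also invented for the example.

```python
import numpy as np

def select_and_relabel(labels, per_class, victim_classes, flipped_label, seed=0):
    """Pick `per_class` samples from every class, then relabel the victim ones.

    Returns the chosen indices and the labels to attach to them.
    `flipped_label` is an assumption: the notes do not specify the new label.
    """
    rng = np.random.default_rng(seed)
    chosen = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        # Step 1: the same number of samples from each class, without repeats.
        chosen.append(rng.choice(idx, size=per_class, replace=False))
    chosen = np.concatenate(chosen)
    new_labels = labels[chosen].copy()
    # Step 3: only samples from victim classes get a different label.
    new_labels[np.isin(new_labels, list(victim_classes))] = flipped_label
    return chosen, new_labels

labels = np.repeat(np.arange(3), 10)   # toy data: 10 samples each of classes 0, 1, 2
chosen, new_labels = select_and_relabel(
    labels, per_class=4, victim_classes={0}, flipped_label=2
)
```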
Objective function:

$L=\begin{cases}\lambda \cdot L_{f_{y_{s}}}-L_{f_{y_{o}}}, & \text{if } \boldsymbol{x}_{o} \in C_{v} \\ L_{f_{y_{o}}}, & \text{otherwise}\end{cases}$

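The poisoned image $x_{p}$:

$\boldsymbol{x}_{\boldsymbol{p}}=\begin{cases}\boldsymbol{x}_{\boldsymbol{o}}-\epsilon \cdot \operatorname{sign}\left(\nabla_{\boldsymbol{x}_{o}}\left(\lambda \cdot L_{f_{y_{s}}}-L_{f_{y_{o}}}\right)\right), & \text{if } \boldsymbol{x}_{o} \in C_{v} \\ \boldsymbol{x}_{\boldsymbol{o}}+\epsilon \cdot \operatorname{sign}\left(\nabla_{\boldsymbol{x}_{o}}\left(L_{f_{y_{o}}}\right)\right), & \text{otherwise}\end{cases}$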
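
The two-branch COES update can be sketched the same way. As before, this assumes a linear model with logits $f(\boldsymbol{x}) = W\boldsymbol{x}$, so that the gradient of logit $k$ with respect to the input is the row `W[k]`. The name `coes_poison`, the `victim_classes` argument, and the toy values are hypothetical and only there to make the example runnable. The signs follow the formula as written in these notes.

```python
import numpy as np

def coes_poison(x_o, W, y_o, y_s, victim_classes, eps=0.1, lam=1.0):
    """One signed-gradient step following the COES case split in these notes.

    Assumption: a linear model with logits f(x) = W @ x.
    """
    if y_o in victim_classes:
        # Victim branch: x_p = x_o - eps * sign(grad(lam * f_{y_s} - f_{y_o}))
        grad = lam * W[y_s] - W[y_o]
        return x_o - eps * np.sign(grad)
    # Non-victim branch: x_p = x_o + eps * sign(grad(f_{y_o}))
    grad = W[y_o]
    return x_o + eps * np.sign(grad)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))   # toy sizes: 3 classes, 5 input features
x_o = rng.normal(size=5)
x_victim = coes_poison(x_o, W, y_o=0, y_s=2, victim_classes={0})
x_other = coes_poison(x_o, W, y_o=1, y_s=2, victim_classes={0})
```

In this toy setting the non-victim branch raises the logit of the image's own class, which matches the idea of strengthening the features tied to its label.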

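Algorithm 2 (shown as a figure in the original post)

4. Experiments

Evaluation metrics:

Change-to-Target (CTT): the proportion of samples from other classes that are classified into the target class
Change-from-Target (CFT): the proportion of target-class samples that are misclassified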
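
Both metrics are simple ratios over the predictions. Here is a minimal sketch based on the one-line definitions above, not on the paper's evaluation code. The function names `ctt` and `cft` and the toy label arrays are assumptions for illustration.

```python
import numpy as np

def ctt(y_true, y_pred, target):
    """Fraction of samples from other classes that are predicted as `target`."""
    others = y_true != target
    return float(np.mean(y_pred[others] == target))

def cft(y_true, y_pred, target):
    """Fraction of `target`-class samples that are predicted as another class."""
    is_target = y_true == target
    return float(np.mean(y_pred[is_target] != target))

# Toy predictions: 2 samples per class, 3 classes.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 1, 2])

ctt_1 = ctt(y_true, y_pred, target=1)   # 2 of the 4 non-target samples go to class 1
cft_1 = cft(y_true, y_pred, target=1)   # both class-1 samples stay in class 1
```

With these toy arrays, `ctt_1` is 0.5 and `cft_1` is 0.0.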

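🟠 Dataset 1
MNIST

🟠 Dataset 2
CIFAR-10

🟠 Dataset 3
ImageNet-ILSVRC2012

COEG results on the three datasets (shown as a figure in the original post)
COES results on CIFAR-10 (shown as a figure in the original post)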

Original post: https://blog.csdn.net/weiyuxin107/article/details/127984495