private lazy val writeMetrics =
  SQLShuffleWriteMetricsReporter.createShuffleWriteMetrics(sparkContext)
private[sql] lazy val readMetrics =
  SQLShuffleReadMetricsReporter.createShuffleReadMetrics(sparkContext)
These two metric maps are used in two different places and cover the two adjacent stages of the shuffle: writeMetrics goes to the upstream (map-side) stage that writes the shuffle data, and readMetrics goes to the downstream (reduce-side) stage that reads it.
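To make the role of these maps concrete, below is a minimal, Spark-free sketch of the decorator idea behind the SQL shuffle metrics reporters: every write update is forwarded both to the ordinary task-level reporter and to a driver-visible SQL metric. The trait, class, and metric-key names here are simplified assumptions for illustration, not the actual SQLShuffleWriteMetricsReporter code.

import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-ins: the real classes live in
// org.apache.spark.sql.execution.metric and update SQLMetric, not AtomicLong.
trait WriteReporter {
  def incBytesWritten(v: Long): Unit
  def incRecordsWritten(v: Long): Unit
}

class ForwardingWriteReporter(
    underlying: WriteReporter,            // task-level reporter (executor metrics)
    sqlMetrics: Map[String, AtomicLong]   // stand-in for Map[String, SQLMetric]
) extends WriteReporter {
  override def incBytesWritten(v: Long): Unit = {
    underlying.incBytesWritten(v)                    // normal task metrics stay correct
    sqlMetrics("shuffleBytesWritten").addAndGet(v)   // and the SQL UI metric is fed too
  }
  override def incRecordsWritten(v: Long): Unit = {
    underlying.incRecordsWritten(v)
    sqlMetrics("shuffleRecordsWritten").addAndGet(v)
  }
}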
/**
* A [[ShuffleDependency]] that will partition rows of its child based on
* the partitioning scheme defined in `newPartitioning`. Those partitions of
* the returned ShuffleDependency will be the input of shuffle.
*/
@transient
lazy val shuffleDependency: ShuffleDependency[Int, InternalRow, InternalRow] = {
  val dep = ShuffleExchangeExec.prepareShuffleDependency(
    inputRDD,
    child.output,
    outputPartitioning,
    serializer,
    writeMetrics)
  metrics("numPartitions").set(dep.partitioner.numPartitions)
  val executionId = sparkContext.getLocalProperty(SQLExecution.EXECUTION_ID_KEY)
  SQLMetrics.postDriverMetricUpdates(
    sparkContext, executionId, metrics("numPartitions") :: Nil)
  dep
}
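prepareShuffleDependency works by computing a partition id for every row on the map side (according to outputPartitioning) and keying each row with that id, so the shuffle's partitioner only needs to pass the precomputed id through. The following is a minimal, Spark-free sketch of that idea; Row, partitionIdOf, and the other names are illustrative stand-ins, not Spark APIs.

// Illustrative only: Spark does this with InternalRow, partitioners derived
// from outputPartitioning, and a pass-through partitioner over the key.
final case class Row(values: Vector[Any])

// Stand-in for hash partitioning: pick the target reducer on the map side.
def partitionIdOf(row: Row, numPartitions: Int): Int = {
  val h = row.values.hashCode()
  ((h % numPartitions) + numPartitions) % numPartitions  // keep the id non-negative
}

// Map side: key every row with its partition id, mirroring the
// ShuffleDependency[Int, InternalRow, InternalRow] key type above.
def keyRowsByPartition(rows: Seq[Row], numPartitions: Int): Seq[(Int, Row)] =
  rows.map(r => partitionIdOf(r, numPartitions) -> r)

// Shuffle side: the partitioner simply returns the precomputed key.
def passthroughPartition(key: Int): Int = key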
protected override def doExecute(): RDD[InternalRow] = {
  // Returns the same ShuffledRowRDD if this plan is used by multiple plans.
  if (cachedShuffleRDD == null) {
    cachedShuffleRDD = new ShuffledRowRDD(shuffleDependency, readMetrics)
  }
  cachedShuffleRDD
}
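doExecute caches the ShuffledRowRDD so that, when the same exchange feeds several parent plans, the shuffle is built only once and every parent sees the same RDD. A small illustrative sketch of that caching pattern, with hypothetical names:

// Hypothetical helper showing the "build at most once, return the same
// instance afterwards" pattern used by cachedShuffleRDD.
class CachedResult[T >: Null <: AnyRef](build: () => T) {
  private var cached: T = null
  def get(): T = {
    if (cached == null) {
      cached = build()   // first caller pays the construction cost
    }
    cached               // later callers reuse the same object
  }
}

// Every caller of get() sees the same instance, just as every parent plan
// calling doExecute() gets the same ShuffledRowRDD.
val result = new CachedResult(() => new java.lang.StringBuilder("shuffled"))
assert(result.get() eq result.get())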

Normally the two metrics agree: the write side runs first (the upstream map stage writes the shuffle data) and the read side runs afterwards (the downstream reduce stage reads it).
If the downstream shuffle read tasks have not finished yet, or some of them fail, the read metrics will come out smaller than the write metrics.