• SparkML



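    SparkML_lr_train: reads the train table produced by the Python preprocessing step, trains the model on it, and saves the trained model.
    SparkML_lr_predict: loads the trained model and reads the test table produced by the Python preprocessing step for prediction. It writes the predictions to normal_data and updates stream_is_normal for each row by id.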

    Submitting the Spark jobs

    bin/spark-submit \
    --class SparkML_lr_train \
    --master yarn \
    --deploy-mode cluster \
    ./SparkML_lr_train1.jar \
    10
    
    
    bin/spark-submit \
    --class SparkML_lr_train \
    --master yarn \
    --deploy-mode client \
    ./SparkML_lr_train4.jar \
    10
    
    
    bin/spark-submit \
    --class SparkML_lr_predict \
    --master yarn \
    --deploy-mode client \
    ./SparkML_lr_predict.jar \
    10
    
    
    bin/spark-submit \
    --class lr_train \
    --master yarn \
    --deploy-mode client \
    ./lr_train.jar \
    10
    
    
    bin/spark-submit \
    --class lr_predict \
    --master yarn \
    --deploy-mode client \
    ./lr_predict.jar \
    10
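
The commands above differ only in the main class, the jar path, and the deploy mode, so they can be generated by one small helper. The following is a minimal sketch, not something from the original post: the function name `build_submit_cmd` is hypothetical, and it assumes the trailing `10` is the single application argument. It only prints the command, so the result can be inspected before it is run.

```shell
# Hypothetical helper (not part of the original post): prints a spark-submit
# command for the given main class, jar path and deploy mode.
build_submit_cmd() {
  bsc_class="$1"            # main class, e.g. SparkML_lr_train
  bsc_jar="$2"              # path to the application jar
  bsc_mode="${3:-client}"   # deploy mode: client (default) or cluster
  bsc_arg="${4:-10}"        # application argument; the post always passes 10

  # Reject anything other than the two deploy modes used in the post.
  case "$bsc_mode" in
    client|cluster) ;;
    *) echo "unknown deploy mode: $bsc_mode" >&2; return 1 ;;
  esac

  printf 'bin/spark-submit --class %s --master yarn --deploy-mode %s %s %s\n' \
    "$bsc_class" "$bsc_mode" "$bsc_jar" "$bsc_arg"
}
```

For example, `build_submit_cmd SparkML_lr_train ./SparkML_lr_train1.jar cluster` prints the first command shown above on a single line; piping the output to `sh` from the Spark home directory would run it.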
    
    
    

    Start Hadoop (using the startup script)
    hdp.sh start
    Start Spark (from the command line)
    sbin/start-all.sh

    
    bin/spark-submit \
    --class SparkSQL_lr_train \
    --master yarn \
    --deploy-mode client \
    ./SparkSQL_lr_train.jar \
    10
    
    
    
    bin/spark-submit \
    --class SparkML_lr_predict \
    --master yarn \
    --deploy-mode client \
    ./SparkML_lr_predict.jar \
    10
    
    
    

    Livy request parameters

    {
      "file": "hdfs://hadoop102:8020/spark_jar/SparkSQL_lr_train1.jar",
      "className": "SparkSQL_lr_train",
      "driverMemory": "1g",
      "executorMemory": "1g",
      "numExecutors": 1,
      "driverCores": 1,
      "executorCores": 1,
      "conf":{
          "spark.master":"yarn",
          "deploy-mode":"client"
      },
      "args": ["10"]
    }
    
    {
      "file": "hdfs://hadoop102:8020/spark_jar/SparkML_lr_predict1.jar",
      "className": "SparkML_lr_predict",
      "driverMemory": "1g",
      "executorMemory": "1g",
      "numExecutors": 1,
      "driverCores": 1,
      "executorCores": 1,
      "conf":{
          "spark.master":"yarn",
          "deploy-mode":"client"
      },
      "args": ["10"]
    }
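
A payload like the ones above is sent to Livy as the body of a `POST /batches` request. The sketch below shows one way to do that with `curl`. The function names `livy_payload` and `submit_batch` are hypothetical, the Livy host is a guess (only the default port 8998 is standard), and the `conf` object is left out for brevity.

```shell
# Assumed Livy endpoint: 8998 is Livy's default port; the host name is a guess
# based on the HDFS address used in the post. Override LIVY_URL as needed.
LIVY_URL="${LIVY_URL:-http://hadoop102:8998}"

# Hypothetical helper: prints a Livy batch payload for a jar stored on HDFS.
# The resource settings mirror the values in the post; "conf" is omitted.
livy_payload() {
  lp_file="$1"        # HDFS path of the application jar
  lp_class="$2"       # main class inside the jar
  lp_arg="${3:-10}"   # application argument; the post always passes 10

  printf '{"file": "%s", "className": "%s", "driverMemory": "1g", "executorMemory": "1g", "numExecutors": 1, "driverCores": 1, "executorCores": 1, "args": ["%s"]}\n' \
    "$lp_file" "$lp_class" "$lp_arg"
}

# Hypothetical helper: builds the payload and posts it to Livy's batch endpoint.
# curl reads the request body from standard input via "-d @-".
submit_batch() {
  livy_payload "$@" |
    curl -s -X POST -H 'Content-Type: application/json' -d @- "$LIVY_URL/batches"
}
```

Calling `submit_batch hdfs://hadoop102:8020/spark_jar/SparkML_lr_predict1.jar SparkML_lr_predict` would submit the prediction job; running `livy_payload` with the same arguments prints the JSON without contacting the server.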
    
  • Original article: https://blog.csdn.net/qq_45972323/article/details/133843432