A Brief Look at Sqoop, a Data Migration Tool



    title: Sqoop Series


    Chapter 1: Sqoop Theory

    1.1 Overview

    Sqoop is an Apache tool for transferring data between Hadoop and relational database servers.

    Import: bring data from MySQL, Oracle, etc. into Hadoop storage systems such as HDFS, Hive, and HBase.
    Export: move data from the Hadoop file system out to a relational database such as MySQL.

    At heart, Sqoop is just a command-line tool; compared with HDFS or Hive, there is no deep theory behind it.


    1.2 How It Works

    Sqoop translates each import or export command into a MapReduce program.
    In the generated MapReduce job, the customization is mainly in the InputFormat and OutputFormat.
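
    You can see part of this translation yourself: the codegen tool runs just the code-generation phase and emits the Java record class that an import job would compile and ship inside its MapReduce job. A minimal sketch (the connection details follow the examples later in this article):

    sqoop codegen \
    --connect jdbc:mysql://hadoop10:3306/mysql \
    --username root \
    --password 111111 \
    --table help_keyword
    # the tool prints where the generated help_keyword.java / .jar were written
    # (e.g. under /tmp/sqoop-root/compile/..., as in the job logs shown later)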

    1.3 Installing Sqoop

    1.3.1 Installing Sqoop 1.4.6

    Prerequisites: a working Java and Hadoop environment.

    Package: sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz

    Installation steps:

    1. Upload, extract, and rename
    [root@hadoop0 software]# tar -zxvf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
    [root@hadoop0 software]# mv sqoop-1.4.6.bin__hadoop-2.0.4-alpha sqoop
    
    2. Edit the configuration file
    [root@hadoop0 conf]# mv sqoop-env-template.sh sqoop-env.sh
    Add the following:
    export HADOOP_COMMON_HOME=/software/hadoop
    export HADOOP_MAPRED_HOME=/software/hadoop
    export HBASE_HOME=/software/hbase
    export HIVE_HOME=/software/hive
    export ZOOCFGDIR=/software/zk/conf
    
    3. Add the MySQL driver jar to Sqoop's lib directory
    Put mysql-connector-java-5.1.40-bin.jar into Sqoop's lib directory.
    
    4. Configure environment variables
    export SQOOP_HOME=/software/sqoop
    export PATH=$PATH:$SQOOP_HOME/bin
    After editing, remember to source the profile so the variables take effect; a minimal sketch follows.
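
    A sketch of this step, assuming the variables go into the system-wide /etc/profile:

    echo 'export SQOOP_HOME=/software/sqoop' >> /etc/profile
    echo 'export PATH=$PATH:$SQOOP_HOME/bin' >> /etc/profile
    source /etc/profile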
    
    5. Verify the installation
    [root@hadoop0 sqoop]# sqoop-version
    Warning: /software/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /software/sqoop/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    20/08/23 00:34:13 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
    Sqoop 1.4.6
    git commit id c0c5a81723759fa575844a0a1eae8f510fa32c25
    Compiled by root on Mon Apr 27 14:38:36 CST 2015
    
    

    1.3.2 Installing Sqoop 1.4.7

    Official site: http://sqoop.apache.org/

    The project has been updated quite slowly.

    Prerequisites: a working Java and Hadoop environment.

    Package: sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz

    Installation steps:

    1. Upload, extract, and rename
    [root@hadoop10 software]# tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
    [root@hadoop10 software]# mv sqoop-1.4.7.bin__hadoop-2.6.0 sqoop
    
    2. Edit the configuration file
    [root@hadoop10 conf]# mv sqoop-env-template.sh sqoop-env.sh
    Add the following:
    export HADOOP_COMMON_HOME=/software/hadoop
    export HADOOP_MAPRED_HOME=/software/hadoop
    export HBASE_HOME=/software/hbase
    export HIVE_HOME=/software/hive
    export ZOOCFGDIR=/software/zk/conf
    
    3. Add the MySQL driver jar to Sqoop's lib directory
    Put mysql-connector-java-5.1.40-bin.jar into Sqoop's lib directory.
    
    4. Configure environment variables
    export SQOOP_HOME=/software/sqoop
    export PATH=$PATH:$SQOOP_HOME/bin
    After editing, remember to source the profile (as in 1.3.1).
    
    5. Verify the installation
    [root@hadoop10 software]# sqoop-version
    Warning: /software/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /software/sqoop/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    2021-10-19 11:37:31,693 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    Sqoop 1.4.7
    git commit id 2328971411f57f0cb683dfb79d19d4d19d185dd8
    Compiled by maugli on Thu Dec 21 15:59:58 STD 2017
    
    

    1.4 Basic Sqoop Commands

    First, run sqoop help to see which commands Sqoop supports:

    [root@hadoop0 sqoop]# sqoop help
    Warning: /software/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /software/sqoop/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    20/08/23 00:35:03 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
    usage: sqoop COMMAND [ARGS]
    
    Available commands:
      codegen            Generate code to interact with database records
      create-hive-table  Import a table definition into Hive
      eval               Evaluate a SQL statement and display the results
      export             Export an HDFS directory to a database table
      help               List available commands
      import             Import a table from a database to HDFS
      import-all-tables  Import tables from a database to HDFS
      import-mainframe   Import datasets from a mainframe server to HDFS
      job                Work with saved jobs
      list-databases     List available databases on a server
      list-tables        List available tables in a database
      merge              Merge results of incremental imports
      metastore          Run a standalone Sqoop metastore
      version            Display version information
    
    See 'sqoop help COMMAND' for information on a specific command.
    

    Once you have the list of supported commands, if you're unsure how to use one, run sqoop help <command> to see its usage, for example:

    sqoop help import
    


    Chapter 2: Sqoop in Practice

    2.1 Importing and Exporting Data with Sqoop

    Sqoop: a tool for moving data in and out
    
    RDBMS --  import   -> Hadoop
    Hadoop --  export   -> RDBMS
    
    Drop a Hive table:
    	drop table student;
    	drop table if exists student;
    
    Drop an HBase table:
    disable 'student'
    drop 'student'
    
    Now let's put it to use:
    
    
    List the databases on the MySQL server:
    sqoop list-databases \
    --connect jdbc:mysql://localhost:3306/ \
    --username root \
    --password 111111
    
    sqoop list-databases --connect jdbc:mysql://localhost:3306/ --username root --password 111111
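
    Putting the password on the command line is insecure (Sqoop itself warns about this, as the logs later show); -P prompts for it interactively, and --password-file reads it from a protected file. A sketch of the file-based variant (the path /root/.sqoop.pwd is an arbitrary choice):
    
    echo -n '111111' > /root/.sqoop.pwd
    chmod 400 /root/.sqoop.pwd
    sqoop list-databases \
    --connect jdbc:mysql://localhost:3306/ \
    --username root \
    --password-file file:///root/.sqoop.pwd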
    
    List the tables in a given MySQL database:
    sqoop list-tables \
    --connect jdbc:mysql://localhost:3306/mysql \
    --username root \
    --password 111111
    
    Create a Hive table hk with the same schema as the MySQL table help_keyword:
    Using the default database:
    sqoop create-hive-table \
    --connect jdbc:mysql://localhost:3306/mysql \
    --username root \
    --password 111111 \
    --table help_keyword \
    --hive-table hk
    
    
    In another Hive session:
    hive> use default;
    hive> show tables;
    hive> desc hk;
    hive> drop table hk;
    hive> show tables;
    
    
    Using the myhive database:
    sqoop create-hive-table \
    --connect jdbc:mysql://localhost:3306/mysql \
    --username root \
    --password 111111 \
    --table help_keyword \
    --hive-database myhive \
    --hive-table hk
    
    
    First: import
    
    
    Import data from MySQL tables into HDFS:
    
    // Plain import: load the help_keyword table of the mysql database into the default HDFS path /user/root/help_keyword
    sqoop import   \
    --connect jdbc:mysql://hadoop10:3306/mysql   \
    --username root  \
    --password 111111   \
    --table help_keyword   \
    -m 1
    
    
    
    -m 1 means a single map task.
    
    If you use localhost here the job may fail; change it to the IP address or the hostname hadoop10 (see section 2.2.3).
    
    Check the result: hadoop fs -cat /user/root/help_keyword/p*
    First delete the previous result files: hadoop fs -rm -r /user/root/help_keyword
    The command below produces 4 output files, i.e. the default number of map tasks is 4.
    sqoop import   \
    --connect jdbc:mysql://hadoop10:3306/mysql   \
    --username root  \
    --password 111111  \
    --table help_keyword
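
    To confirm the default split count, list the output directory; you should see part-m-00000 through part-m-00003:
    
    hadoop fs -ls /user/root/help_keyword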
    
    
    // Import: specify the field delimiter and the target directory
    sqoop import   \
    --connect jdbc:mysql://hadoop10:3306/mysql   \
    --username root  \
    --password 111111   \
    --table help_keyword   \
    --target-dir /user/root/my_help_keyword  \
    --fields-terminated-by '\t'  \
    -m 1
    
    Check:
    hadoop fs -cat /user/root/my_help_keyword/p*
    
    // Import with a where condition
    sqoop import   \
    --connect jdbc:mysql://hadoop10:3306/mysql   \
    --username root  \
    --password 111111   \
    --where "name='STRING' " \
    --table help_keyword   \
    --target-dir /sqoop/root/myoutport1  \
    -m 1
    Check:
    hadoop fs -cat /sqoop/root/myoutport1/part-m-00000
    
    
    // Import with a custom SQL query
    sqoop import   \
    --connect jdbc:mysql://hadoop10:3306/mysql   \
    --username root  \
    --password 111111   \
    --target-dir /user/root/myimport  \
    --query 'select help_keyword_id,name from help_keyword WHERE $CONDITIONS and  name = "STRING"'  \
    --split-by  help_keyword_id \
    --fields-terminated-by '\t'  \
    -m 1
    Check:
    hadoop fs -cat /user/root/myimport/*
    
    The next command imports all rows; there is no real filter, but WHERE \$CONDITIONS must still be present.
    sqoop import   \
    --connect jdbc:mysql://hadoop10:3306/mysql   \
    --username root  \
    --password 111111   \
    --target-dir /user/root/myimport10  \
    --query "select help_keyword_id,name from help_keyword WHERE \$CONDITIONS"  \
    --split-by  help_keyword_id \
    --fields-terminated-by '\t'  \
    -m 2
    Check:
    hadoop fs -cat /user/root/myimport10/*
    
    The following command fails:
    sqoop import   \
    --connect jdbc:mysql://hadoop10:3306/mysql   \
    --username root  \
    --password 111111   \
    --target-dir /user/root/myimport222  \
    --query "select help_keyword_id,name from help_keyword"  \
    --split-by help_keyword_id \
    --fields-terminated-by '\t'  \
    -m 3
    
    The exact error:
    2021-10-19 15:41:56,392 ERROR tool.ImportTool: Import failed: java.io.IOException: Query [select help_keyword_id,name from help_keyword] must contain '$CONDITIONS' in WHERE clause.
    
    When importing data into HDFS with a custom SQL statement, as above:
    1. Quoting: either wrap the query in single quotes and use double quotes inside, in which case the $ in $CONDITIONS needs no escaping; or wrap it in double quotes and use single quotes inside, in which case the $ must be escaped as \$CONDITIONS.
    2. The custom SQL statement must contain WHERE $CONDITIONS.
    
    
    
    
    Import data from a MySQL table into Hive:
    
    // Plain import: the data lands in Hive's default database, and the table name matches the MySQL table name:
    sqoop import   \
    --connect jdbc:mysql://hadoop10:3306/mysql   \
    --username root  \
    --password 111111   \
    --table help_keyword   \
    --hive-import \
    -m 1
    
    First remove the earlier test output:
    hadoop fs -rm -r /user/root/help_keyword
    Check:
    hadoop fs -cat /user/hive/warehouse/help_keyword/p*     // view the data
    hadoop fs -cat /user/root/help_keyword/p*
    
    // Specify the row and column delimiters,
    enable hive-import,
    enable overwrite on import (which doesn't seem to have much effect),
    auto-create the Hive table,
    specify the database and table name,
    and delete the intermediate target directory:
    sqoop import  \
    --connect jdbc:mysql://hadoop10:3306/mysql  \
    --username root  \
    --password 111111  \
    --table help_keyword  \
    --fields-terminated-by "\t"  \
    --lines-terminated-by "\n"  \
    --hive-import  \
    --hive-overwrite  \
    --create-hive-table  \
    --delete-target-dir \
    --hive-database  myhive \
    --hive-table new_help_keyword
    This command may fail depending on your component versions.
    Check:
    hadoop fs -rmr /user/hive/warehouse/myhive.db/new_help_keyword
    hadoop fs -cat /user/hive/warehouse/myhive.db/new_help_keyword/p*
    
    
    // Incremental import into HDFS
    sqoop import   \
    --connect jdbc:mysql://hadoop10:3306/mysql   \
    --username root  \
    --password 111111   \
    --table help_keyword  \
    --target-dir /user/root/myimport  \
    --incremental  append  \
    --check-column  help_keyword_id \
    --last-value 500  \
    -m 1
    Check:
    hadoop fs -cat /user/root/myimport/p*
    
    Incremental import arguments:
       --check-column <column>        Source column to check for incremental change
       --incremental <import-type>    Define an incremental import of type 'append' or 'lastmodified'
       --last-value <value>           Last imported value in the incremental check column
    
       --last-value 500 imports only the records with id > 500.
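
    For recurring incremental imports, a saved Sqoop job is convenient: the job stores its state and advances --last-value automatically after each run, so you don't have to track it by hand. A minimal sketch (the job name incr_help_keyword is an arbitrary choice; note the space between -- and import):
    
    sqoop job --create incr_help_keyword -- import \
    --connect jdbc:mysql://hadoop10:3306/mysql \
    --username root \
    --password 111111 \
    --table help_keyword \
    --target-dir /user/root/myimport \
    --incremental append \
    --check-column help_keyword_id \
    --last-value 0 \
    -m 1
    
    sqoop job --list                     # list saved jobs
    sqoop job --exec incr_help_keyword   # run it; the stored last-value advances after each execution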
    
       
       
    Concepts:
    1. Incremental import: only rows added since the last import are transferred.
    2. Full import: the whole table is transferred every time.
    
    
    Export:
     Big-data storage system --> RDBMS
    Note: the target table in the RDBMS must be created in advance; Sqoop will not create it automatically.
    
    create database sqoopdb default character set utf8 COLLATE utf8_general_ci; 
    use sqoopdb;
    CREATE TABLE sqoopstudent ( 
       id INT NOT NULL PRIMARY KEY, 
       name VARCHAR(20), 
       sex VARCHAR(20),
       age INT,
       department VARCHAR(20)
    );
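
    The exports below read /user/hive/warehouse/myhive.db/student, so they assume a comma-delimited Hive table myhive.student already exists. A sketch to create and populate one if you don't have it (the sample row is made up):
    
    hive -e "create database if not exists myhive;
    create table if not exists myhive.student(
      id int, name string, sex string, age int, department string)
    row format delimited fields terminated by ',';
    insert into myhive.student values (1, 'zhangsan', 'male', 20, 'CS');"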
    
    // Export HDFS data to MySQL:
    sqoop export \
    --connect jdbc:mysql://hadoop10:3306/sqoopdb  \
    --username root \
    --password 111111 \
    --table sqoopstudent \
    --export-dir /user/hive/warehouse/myhive.db/student \
    --fields-terminated-by ','
    
    
    // Export Hive data to MySQL (a Hive table is just files under the warehouse directory in HDFS, so the command looks the same):
    sqoop export \
    --connect jdbc:mysql://localhost:3306/sqoopdb \
    --username root \
    --password 111111 \
    --table sqoopstudent \
    --export-dir /user/hive/warehouse/myhive.db/student \
    --fields-terminated-by ','
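
    To confirm that the rows arrived, count them on the MySQL side (a quick check via the mysql CLI):
    
    mysql -uroot -p111111 -e 'select count(*) from sqoopdb.sqoopstudent;'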
    
    
    
    A few other operations:
    
    List all databases in MySQL
    sqoop list-databases \
    --connect jdbc:mysql://localhost:3306/ \
    -username root \
    -password 111111
    
    Connect to MySQL and list the tables in a database
    sqoop list-tables \
    --connect jdbc:mysql://localhost:3306/mysql \
    -username root \
    -password 111111
    
    

    2.2 Problems You May Hit During Import and Export

    2.2.1 StringUtils Class Not Found

    When executing the following command:

    [root@hadoop10 software]# sqoop list-databases --connect jdbc:mysql://localhost:3306/ --username root --password 111111
    Warning: /software/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /software/sqoop/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    2021-10-19 11:44:21,417 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    2021-10-19 11:44:21,468 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
    2021-10-19 11:44:21,532 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/lang/StringUtils
    	at org.apache.sqoop.manager.MySQLManager.initOptionDefaults(MySQLManager.java:73)
    	at org.apache.sqoop.manager.SqlManager.<init>(SqlManager.java:89)
    	at com.cloudera.sqoop.manager.SqlManager.<init>(SqlManager.java:33)
    	at org.apache.sqoop.manager.GenericJdbcManager.<init>(GenericJdbcManager.java:51)
    	at com.cloudera.sqoop.manager.GenericJdbcManager.<init>(GenericJdbcManager.java:30)
    	at org.apache.sqoop.manager.CatalogQueryManager.<init>(CatalogQueryManager.java:46)
    	at com.cloudera.sqoop.manager.CatalogQueryManager.<init>(CatalogQueryManager.java:31)
    	at org.apache.sqoop.manager.InformationSchemaManager.<init>(InformationSchemaManager.java:38)
    	at com.cloudera.sqoop.manager.InformationSchemaManager.<init>(InformationSchemaManager.java:31)
    	at org.apache.sqoop.manager.MySQLManager.<init>(MySQLManager.java:65)
    	at org.apache.sqoop.manager.DefaultManagerFactory.accept(DefaultManagerFactory.java:67)
    	at org.apache.sqoop.ConnFactory.getManager(ConnFactory.java:184)
    	at org.apache.sqoop.tool.BaseSqoopTool.init(BaseSqoopTool.java:272)
    	at org.apache.sqoop.tool.ListDatabasesTool.run(ListDatabasesTool.java:44)
    	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
    	at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
    Caused by: java.lang.ClassNotFoundException: org.apache.commons.lang.StringUtils
    	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    	... 20 more
    
    

    Solution: download the older commons-lang-2.6.jar and put it into Sqoop's lib directory.
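
    A sketch of the fix, assuming the jar has already been downloaded to the current directory:
    
    # commons-lang 2.x provides org.apache.commons.lang.StringUtils;
    # the commons-lang3 jar bundled with newer Sqoop builds uses a different package name (org.apache.commons.lang3)
    cp commons-lang-2.6.jar /software/sqoop/lib/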


    2.2.2 HIVE_CONF_DIR Problem
    
    When creating a Hive table through Sqoop, the following error appeared:

    2021-10-19 14:32:34,277 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
    2021-10-19 14:32:34,278 ERROR tool.CreateHiveTableTool: Encountered IOException running create table job: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
    	at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:50)
    	at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
    	at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
    	at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
    	at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
    	at org.apache.sqoop.tool.CreateHiveTableTool.run(CreateHiveTableTool.java:57)
    	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
    	at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
    Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
    	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    	at java.lang.Class.forName0(Native Method)
    	at java.lang.Class.forName(Class.java:264)
    	at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
    	... 11 more
    
    

    Solution:

    Some sources online say you need to configure the Hive conf like this:
    export HADOOP_CLASSPATH=/software/hadoop/lib
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/software/hive/lib/*
    export HIVE_CONF_DIR=$HIVE_HOME/conf
    
    In our tests, however, adding the above had no effect.
    The real cause is that Sqoop's lib directory is missing the Hive dependency jar; look inside sqoop/lib and you will indeed find no Hive jars.
    Go into the Hive directory and copy hive-exec-3.1.2.jar into Sqoop's lib directory.
    With the Hive dependency in place, the import succeeds.
    
    [root@hadoop10 lib]# cp hive-exec-3.1.2.jar /software/sqoop/lib/
    [root@hadoop10 lib]# pwd
    /software/hive/lib
    [root@hadoop10 lib]# 
    
    

    2.2.3 Problems with localhost in Commands (Communications link failure, 0 ms)

    [root@hadoop10 lib]# hadoop fs -rm -r /user/root/help_keyword
    Deleted /user/root/help_keyword
    [root@hadoop10 lib]# sqoop import   \
    > --connect jdbc:mysql://localhost:3306/mysql   \
    > --username root  \
    > --password 111111   \
    > --table help_keyword   \
    > -m 1
    Warning: /software/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /software/sqoop/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    2021-10-19 15:11:03,501 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    2021-10-19 15:11:03,569 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
    2021-10-19 15:11:03,643 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    2021-10-19 15:11:03,643 INFO tool.CodeGenTool: Beginning code generation
    Tue Oct 19 15:11:03 CST 2021 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
    2021-10-19 15:11:04,093 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `help_keyword` AS t LIMIT 1
    2021-10-19 15:11:04,110 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `help_keyword` AS t LIMIT 1
    2021-10-19 15:11:04,117 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /software/hadoop
    Note: /tmp/sqoop-root/compile/553aaec3a5c8ad15b4bbc9e369955b27/help_keyword.java uses or overrides a deprecated API.
    Note: Recompile with -Xlint:deprecation for details.
    2021-10-19 15:11:05,134 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/553aaec3a5c8ad15b4bbc9e369955b27/help_keyword.jar
    2021-10-19 15:11:05,143 WARN manager.MySQLManager: It looks like you are importing from mysql.
    2021-10-19 15:11:05,143 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
    2021-10-19 15:11:05,143 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
    2021-10-19 15:11:05,143 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
    2021-10-19 15:11:05,149 INFO mapreduce.ImportJobBase: Beginning import of help_keyword
    2021-10-19 15:11:05,150 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
    2021-10-19 15:11:05,228 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
    2021-10-19 15:11:05,586 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
    2021-10-19 15:11:05,649 INFO client.RMProxy: Connecting to ResourceManager at hadoop10/192.168.22.136:8032
    2021-10-19 15:11:05,920 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/root/.staging/job_1634625143234_0004
    Tue Oct 19 15:11:14 CST 2021 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
    2021-10-19 15:11:14,804 INFO db.DBInputFormat: Using read commited transaction isolation
    2021-10-19 15:11:15,659 INFO mapreduce.JobSubmitter: number of splits:1
    2021-10-19 15:11:16,162 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1634625143234_0004
    2021-10-19 15:11:16,162 INFO mapreduce.JobSubmitter: Executing with tokens: []
    2021-10-19 15:11:16,331 INFO conf.Configuration: resource-types.xml not found
    2021-10-19 15:11:16,331 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
    2021-10-19 15:11:16,376 INFO impl.YarnClientImpl: Submitted application application_1634625143234_0004
    2021-10-19 15:11:16,400 INFO mapreduce.Job: The url to track the job: http://hadoop10:8088/proxy/application_1634625143234_0004/
    2021-10-19 15:11:16,400 INFO mapreduce.Job: Running job: job_1634625143234_0004
    2021-10-19 15:11:25,485 INFO mapreduce.Job: Job job_1634625143234_0004 running in uber mode : false
    2021-10-19 15:11:25,485 INFO mapreduce.Job:  map 0% reduce 0%
    2021-10-19 15:11:29,602 INFO mapreduce.Job: Task Id : attempt_1634625143234_0004_m_000000_0, Status : FAILED
    Error: java.lang.RuntimeException: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
    
    The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
    	at org.apache.sqoop.mapreduce.db.DBInputFormat.setDbConf(DBInputFormat.java:170)
    	at org.apache.sqoop.mapreduce.db.DBInputFormat.setConf(DBInputFormat.java:161)
    	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:77)
    	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:157)
    	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:125)
    	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
    	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
    	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
    	at java.security.AccessController.doPrivileged(Native Method)
    	at javax.security.auth.Subject.doAs(Subject.java:422)
    	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
    	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
    Caused by: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
    
    The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
    	at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:223)
    	at org.apache.sqoop.mapreduce.db.DBInputFormat.setDbConf(DBInputFormat.java:168)
    	... 11 more
    Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
    
    The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
    	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    	at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
    	at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:989)
    	at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:341)
    	at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2251)
    	at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2284)
    	at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2083)
    	at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:806)
    	at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
    	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    	at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
    	at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:410)
    	at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:328)
    	at java.sql.DriverManager.getConnection(DriverManager.java:664)
    	at java.sql.DriverManager.getConnection(DriverManager.java:247)
    	at org.apache.sqoop.mapreduce.db.DBConfiguration.getConnection(DBConfiguration.java:302)
    	at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:216)
    	... 12 more
    Caused by: java.net.ConnectException: Connection refused (Connection refused)
    	at java.net.PlainSocketImpl.socketConnect(Native Method)
    	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    	at java.net.Socket.connect(Socket.java:589)
    	at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:211)
    	at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:300)
    	... 28 more
    
    2021-10-19 15:11:34,707 INFO mapreduce.Job: Task Id : attempt_1634625143234_0004_m_000000_1, Status : FAILED
    Error: java.lang.RuntimeException: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
    (same stack trace as the first attempt)
    
    2021-10-19 15:11:38,729 INFO mapreduce.Job: Task Id : attempt_1634625143234_0004_m_000000_2, Status : FAILED
    Error: java.lang.RuntimeException: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
    (same stack trace as the first attempt)
    
    2021-10-19 15:11:44,839 INFO mapreduce.Job:  map 100% reduce 0%
    2021-10-19 15:11:45,852 INFO mapreduce.Job: Job job_1634625143234_0004 failed with state FAILED due to: Task failed task_1634625143234_0004_m_000000
    Job failed as tasks failed. failedMaps:1 failedReduces:0 killedMaps:0 killedReduces: 0
    
    2021-10-19 15:11:45,898 INFO mapreduce.Job: Counters: 8
    	Job Counters 
    		Failed map tasks=4
    		Launched map tasks=4
    		Other local map tasks=4
    		Total time spent by all maps in occupied slots (ms)=12619
    		Total time spent by all reduces in occupied slots (ms)=0
    		Total time spent by all map tasks (ms)=12619
    		Total vcore-milliseconds taken by all map tasks=12619
    		Total megabyte-milliseconds taken by all map tasks=12921856
    2021-10-19 15:11:45,905 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
    2021-10-19 15:11:45,906 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 40.3111 seconds (0 bytes/sec)
    2021-10-19 15:11:45,911 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
    2021-10-19 15:11:45,911 INFO mapreduce.ImportJobBase: Retrieved 0 records.
    2021-10-19 15:11:45,911 ERROR tool.ImportTool: Import failed: Import job failed!
    
    

    Solution:

    Change localhost to the IP address 192.168.22.136 or the hostname hadoop10 and the import works. (The map tasks run on cluster nodes, where localhost does not point at the machine running MySQL.)

    Reference: https://www.aboutyun.com/thread-9302-1-1.html
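
    Note that with an IP or hostname the map tasks connect over the network rather than the local socket, so the MySQL account must also allow remote connections; if it doesn't, a sketch of the usual grant (MySQL 5.x syntax):
    
    mysql -uroot -p111111 -e "GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '111111'; FLUSH PRIVILEGES;"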

    2.3 HBase-Related Operations

    Import MySQL data into HBase
    Create the table manually first:
    create 'new_help_keyword','info'
    Note: Sqoop 1.4.6 can only auto-create HBase tables on HBase versions before 1.0.1.
    sqoop import \
    --connect jdbc:mysql://localhost:3306/mysql \
    --username root \
    --password 111111 \
    --table help_keyword \
    --columns "help_keyword_id,name" \
    --column-family "info" \
    --hbase-create-table \
    --hbase-row-key "help_keyword_id" \
    --hbase-table "new_help_keyword"
    (note: no trailing backslash after the last option; a stray trailing backslash leaves the shell waiting for more input, as happens in section 2.4 below)
    
    
    
    Option reference:
    --connect jdbc:mysql://localhost:3306/mysql    URI of the remote or local MySQL service
    --hbase-create-table                           create the table in HBase
    --hbase-table new_help_keyword                 create the HBase table new_help_keyword
    --hbase-row-key help_keyword_id                use the MySQL column help_keyword_id as the HBase row key
    --column-family "info"                         create the column family info in table new_help_keyword
    --username root                                connect to MySQL as user root
    --password 111111                              the MySQL user's password
    --table help_keyword                           the MySQL table to read, help_keyword
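
    To verify the import, scan a few rows from the HBase shell (a quick check):
    
    echo "scan 'new_help_keyword', {LIMIT => 5}" | hbase shell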
    
    
    Export HBase data to MySQL
    Unfortunately, there is still no direct command to export HBase data to MySQL.
    The usual approaches are:
    1. Dump the HBase data to HDFS files, then export them with Sqoop (a sketch follows this list).
    2. Read the table directly through the HBase Java API and write into MySQL, bypassing Sqoop.
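
    A sketch of approach 1 using Hive as the intermediary: map the HBase table into Hive with the HBase storage handler, dump it to a delimited HDFS directory, then run a normal sqoop export. The Hive table name hbase_help_keyword and the MySQL target table help_keyword_copy are hypothetical, and the target table must be created in MySQL beforehand, as noted earlier:
    
    # In Hive: map the HBase table, then dump it to a comma-delimited HDFS directory
    hive -e "
    create external table hbase_help_keyword(help_keyword_id int, name string)
    stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    with serdeproperties ('hbase.columns.mapping' = ':key,info:name')
    tblproperties ('hbase.table.name' = 'new_help_keyword');
    insert overwrite directory '/user/root/hbase_dump'
    row format delimited fields terminated by ','
    select * from hbase_help_keyword;"
    
    # Then export the dump to MySQL
    sqoop export \
    --connect jdbc:mysql://hadoop10:3306/sqoopdb \
    --username root \
    --password 111111 \
    --table help_keyword_copy \
    --export-dir /user/root/hbase_dump \
    --fields-terminated-by ','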
    

    2.4 Problems Encountered During the HBase Operations

    [root@hadoop10 bin]# sqoop import \
    > --connect jdbc:mysql://localhost:3306/mysql \
    > --username root \
    > --password 111111 \
    > --table help_keyword \
    > --columns "help_keyword_id,name" \
    > --column-family "info" \
    > --hbase-create-table \
    > --hbase-row-key "help_keyword_id" \
    > --hbase-table "new_help_keyword" \
    > 
    Warning: /software/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /software/sqoop/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    2021-10-19 16:06:22,807 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    2021-10-19 16:06:22,868 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
    2021-10-19 16:06:22,950 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    2021-10-19 16:06:22,950 INFO tool.CodeGenTool: Beginning code generation
    Tue Oct 19 16:06:23 CST 2021 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
    2021-10-19 16:06:23,428 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `help_keyword` AS t LIMIT 1
    2021-10-19 16:06:23,440 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `help_keyword` AS t LIMIT 1
    2021-10-19 16:06:23,451 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /software/hadoop
    Note: /tmp/sqoop-root/compile/410f0acf253deeca3bc21db9a3f628e3/help_keyword.java uses or overrides a deprecated API.
    Note: Recompile with -Xlint:deprecation for details.
    2021-10-19 16:06:24,697 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/410f0acf253deeca3bc21db9a3f628e3/help_keyword.jar
    2021-10-19 16:06:24,717 WARN manager.MySQLManager: It looks like you are importing from mysql.
    2021-10-19 16:06:24,717 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
    2021-10-19 16:06:24,717 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
    2021-10-19 16:06:24,717 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
    2021-10-19 16:06:24,819 INFO mapreduce.ImportJobBase: Beginning import of help_keyword
    2021-10-19 16:06:24,820 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
    2021-10-19 16:06:24,968 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
    2021-10-19 16:06:25,001 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
    Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hbase.client.HBaseAdmin.<init>(Lorg/apache/hadoop/conf/Configuration;)V
    	at org.apache.sqoop.mapreduce.HBaseImportJob.jobSetup(HBaseImportJob.java:163)
    	at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:268)
    	at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
    	at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:127)
    	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:520)
    	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
    	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
    	at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
    
    

    Solution:

    Sqoop 1.4.7 is updated slowly and has some compatibility problems with HBase 2.3.6. Switching HBase to version 1.3.1 made the test pass.



    Disclaimer:
            The code and statements in this article were written based on my own understanding; the images are screenshots from my own practice or images associated with the technologies covered. If you have any objections, please contact me for removal. Thank you. Please credit the source when reposting.
    
    
    By luoyepiaoxue2014
    
    Bilibili: https://space.bilibili.com/1523287361
    Weibo: http://weibo.com/luoyepiaoxue2014

    Original article: https://blog.csdn.net/luoyepiaoxue2014/article/details/128064976