VMware 10 is used here; it can be downloaded from the official site.
Image file: CentOS-6.8-x86_64-bin-DVD1.iso, downloaded from the official site.
2.1 Create a new virtual machine

2. Select Custom, then click Next.

3. Keep the defaults and click Next.

4. Select "Install the operating system later", then Next.

5. Select Linux, CentOS 64-bit, then Next.

6. Name the virtual machine hadoop01 and choose a location on drive D; avoid drive C, since the VM will take up a lot of disk space later. Next.

7. Click Next.

8. 2 GB of memory is recommended. Next.

9. Select host-only networking, then Next.

10. Accept the recommended defaults, then Next.


11. Select "Create a new virtual disk", then Next.

12. Set the disk size to 20 GB and split the virtual disk into multiple files, then Next.

13. Keep the defaults and click Next.

14. Keep the defaults and click Finish.

15. Edit the virtual machine settings.
16. Select "Use ISO image file" and point it at CentOS-6.8-x86_64-bin-DVD1.iso (downloaded from the official site).

17. Select "Install or upgrade an existing system" and press Enter.

18. Select Skip and press Enter.

19. Click Next; keep the default language and keyboard settings, clicking Next on each page.

20. Select "Basic Storage Devices", then Next.

21. Select "Yes, discard any data".
22. Keep the default hostname, then Next.

23. Keep the default time zone, then Next.

24. Set the root password, then Next.

25. If prompted about a weak password, select Use Anyway.

26. Select "Use All Space", then Next.

27. Select "Write changes to disk".

28. Keep the default Desktop installation, then Next.

29. The installation takes a while.

30. Select Reboot.
31. Click Forward, then Forward again on the next screen.
32. Select Yes.

33. Select Finish, then Yes.

1. Enter the username and password and select Log In.

2. Open a terminal (Open in Terminal).

3. Configure a static IP for the virtual machine
- [root@hadoop01 ~]# vim /etc/sysconfig/network-scripts/ifcfg-eth0
- DEVICE=eth0
- HWADDR=00:0C:29:1C:AE:A7
- TYPE=Ethernet
- UUID=a01f4ce4-e877-4696-aa24-caeef5395b9f
- ONBOOT=yes #change to yes
- NM_CONTROLLED=yes
- BOOTPROTO=static #change to static
- IPADDR=192.168.86.101 #IP on the host-only subnet; the final 101 is your own choice
- NETMASK=255.255.255.0 #subnet mask
- GATEWAY=192.168.86.1 #gateway address of the subnet

4. Restart the network after configuring
[root@hadoop01 ~]# service network restart
5. Check the IP address
- [root@hadoop01 ~]# ifconfig
- eth0 Link encap:Ethernet HWaddr 00:0C:29:1C:AE:A7
- inet addr:192.168.86.101 Bcast:192.168.86.255 Mask:255.255.255.0
- inet6 addr: fe80::20c:29ff:fe1c:aea7/64 Scope:Link
- UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
- RX packets:2093404 errors:0 dropped:0 overruns:0 frame:0
- TX packets:2229452 errors:0 dropped:0 overruns:0 carrier:0
- collisions:0 txqueuelen:1000
- RX bytes:1148990236 (1.0 GiB) TX bytes:2157139065 (2.0 GiB)
-
- lo Link encap:Local Loopback
- inet addr:127.0.0.1 Mask:255.0.0.0
- inet6 addr: ::1/128 Scope:Host
- UP LOOPBACK RUNNING MTU:65536 Metric:1
- RX packets:660419 errors:0 dropped:0 overruns:0 frame:0
- TX packets:660419 errors:0 dropped:0 overruns:0 carrier:0
- collisions:0 txqueuelen:0
- RX bytes:92983459 (88.6 MiB) TX bytes:92983459 (88.6 MiB)
6. Check the firewall status
[root@hadoop01 ~]# service iptables status
7. Stop the firewall (temporary; it comes back after a reboot)
[root@hadoop01 ~]# service iptables stop
8. Check the firewall status again
- [root@hadoop01 ~]# service iptables status
- iptables: Firewall is not running. #this message means the firewall is stopped
9. Disable the firewall permanently
- [root@hadoop01 ~]# chkconfig --list iptables
- iptables 0:off 1:off 2:on 3:on 4:on 5:on 6:off
- [root@hadoop01 ~]# chkconfig iptables off #disable at boot
- [root@hadoop01 ~]# chkconfig --list iptables
- iptables 0:off 1:off 2:off 3:off 4:off 5:off 6:off #all runlevels off, so it stays disabled
1. Open the hostname configuration file and edit it
- [root@hadoop01 ~]# vim /etc/sysconfig/network
-
- NETWORKING=yes
- HOSTNAME=hadoop01
2. Add the IP-to-hostname mappings
- [root@hadoop01 ~]# vim /etc/hosts
-
- 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
- ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
- 192.168.86.101 hadoop01
- 192.168.86.102 hadoop02
- 192.168.86.103 hadoop03
3. Reboot the virtual machine
[root@hadoop01 software]# reboot
4. Verify the hostname
- [root@hadoop01 software]# hostname
- hadoop01
- [root@hadoop01 software]# ping hadoop01
- PING hadoop01 (192.168.86.101) 56(84) bytes of data.
- 64 bytes from hadoop01 (192.168.86.101): icmp_seq=1 ttl=64 time=0.057 ms
- 64 bytes from hadoop01 (192.168.86.101): icmp_seq=2 ttl=64 time=0.047 ms
- 64 bytes from hadoop01 (192.168.86.101): icmp_seq=3 ttl=64 time=0.049 ms
- ^C
- --- hadoop01 ping statistics ---
- 3 packets transmitted, 3 received, 0% packet loss, time 2795ms
- rtt min/avg/max/mdev = 0.047/0.051/0.057/0.004 ms
jdk-8u144-linux-x64.tar.gz is used here, downloaded from the official site.
1. Create a directory dedicated to installation packages
- [root@hadoop01 ~]# mkdir /opt/software/
- [root@hadoop01 ~]# cd /opt/software/
- [root@hadoop01 software]# ll
- total 0
2. Upload the JDK to that directory
Use Xshell (or any other file-transfer tool) to upload the file.
- [root@hadoop01 software]# ls
- jdk-8u144-linux-x64.tar.gz
3. Create a directory for the extracted software
- [root@hadoop01 software]# mkdir /opt/module/
- [root@hadoop01 software]# cd /opt/module/
- [root@hadoop01 module]# ll
- total 0
4. Extract the JDK
- [root@hadoop01 software]# tar -zxvf jdk-8u144-linux-x64.tar.gz -C /opt/module/
- [root@hadoop01 module]# java -version #after installation, check the JDK version
- java version "1.8.0_144"
- Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
- Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
5. If the reported version is not 1.8.0_144, remove the preinstalled OpenJDK packages
- [root@hadoop01 software]# rpm -qa | grep java
- tzdata-java-2016c-1.el6.noarch
- java-1.7.0-openjdk-1.7.0.99-2.6.5.1.el6.x86_64
- java-1.6.0-openjdk-1.6.0.38-1.13.10.4.el6.x86_64
- [root@hadoop01 software]# rpm -e --nodeps java-1.7.0-openjdk-1.7.0.99-2.6.5.1.el6.x86_64 #uninstall the package
- [root@hadoop01 software]# rpm -e --nodeps java-1.6.0-openjdk-1.6.0.38-1.13.10.4.el6.x86_64
- [root@hadoop01 software]# rpm -qa | grep java
- tzdata-java-2016c-1.el6.noarch
6. Configure the environment variables
- [root@hadoop01 software]# vim /etc/profile #append the JDK paths
- export JAVA_HOME=/opt/module/jdk1.8.0_144
- export PATH=$PATH:$JAVA_HOME/bin
- [root@hadoop01 software]# source /etc/profile
7. Check the JDK version again
- [root@hadoop01 software]# vim /etc/profile
- [root@hadoop01 software]# java -version
- java version "1.8.0_144"
- Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
- Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
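A quick sanity check (a sketch; the expected output assumes the paths used above and that the bundled OpenJDK was removed):
- [root@hadoop01 software]# echo $JAVA_HOME
- /opt/module/jdk1.8.0_144
- [root@hadoop01 software]# which java
- /opt/module/jdk1.8.0_144/bin/java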
1. Right-click the virtual machine and choose Clone.

2. Click Next.


3. Select "Create a full clone".

4. Set the clone name (hadoop02) and its location.

5. Click Continue to finish.


Repeat the cloning process to create another virtual machine
named hadoop03.
6. Start both clones.

1. Check the network interface information (on the hadoop02 clone)
- [root@hadoop02 ~]# ifconfig
- eth1 Link encap:Ethernet HWaddr 00:0C:29:96:83:5A
- inet addr:192.168.86.102 Bcast:192.168.86.255 Mask:255.255.255.0
- inet6 addr: fe80::20c:29ff:fe96:835a/64 Scope:Link
- UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
- RX packets:1648800 errors:0 dropped:0 overruns:0 frame:0
- TX packets:1521767 errors:0 dropped:0 overruns:0 carrier:0
- collisions:0 txqueuelen:1000
- RX bytes:1229868487 (1.1 GiB) TX bytes:187659267 (178.9 MiB)
-
- lo Link encap:Local Loopback
- inet addr:127.0.0.1 Mask:255.0.0.0
- inet6 addr: ::1/128 Scope:Host
- UP LOOPBACK RUNNING MTU:65536 Metric:1
- RX packets:72557 errors:0 dropped:0 overruns:0 frame:0
- TX packets:72557 errors:0 dropped:0 overruns:0 carrier:0
- collisions:0 txqueuelen:0
- RX bytes:5452856 (5.2 MiB) TX bytes:5452856 (5.2 MiB)
2. In ifcfg-eth0, update the MAC address and IP (the clone's NIC shows up as eth1; we keep using the eth0 config file)
- [root@hadoop02 ~]# vim /etc/sysconfig/network-scripts/ifcfg-eth0
-
- DEVICE=eth0
- HWADDR=00:0C:29:96:83:5A #the MAC address shown on the eth1 line of ifconfig
- TYPE=Ethernet
- UUID=a01f4ce4-e877-4696-aa24-caeef5395b9f
- ONBOOT=yes
- NM_CONTROLLED=yes
- BOOTPROTO=static
- IPADDR=192.168.86.102 #hadoop02's IP address
- NETMASK=255.255.255.0
- GATEWAY=192.168.86.1
- #only the MAC address and the IP need to change
3. Edit the udev rules file: remove the old eth0 entry and rename the remaining entry (the new MAC) from eth1 to eth0
- [root@hadoop02 ~]# vim /etc/udev/rules.d/70-persistent-net.rules
-
- # This file was automatically generated by the /lib/udev/write_net_rules
- # program, run by the persistent-net-generator.rules rules file.
- #
- # You can modify it, as long as you keep each rule on a single
- # line, and change only the value of the NAME= key.
-
-
- # PCI device 0x8086:0x100f (e1000)
- SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:0c:29:96:83:5a", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"
4. Modify the hostname
- [root@hadoop02 ~]# vim /etc/sysconfig/network
-
- NETWORKING=yes
- HOSTNAME=hadoop02
5. Reboot the virtual machine so the hostname takes effect
[root@hadoop02 ~]# reboot
6. Repeat the same network and hostname changes on hadoop03 (IP 192.168.86.103, hostname hadoop03)
hadoop-2.7.3.tar.gz is used here, downloaded from the official site.
1. Upload hadoop-2.7.3.tar.gz to /opt/software
- [root@hadoop01 software]# ls
- hadoop-2.7.3.tar.gz
2. Extract the file
[root@hadoop01 software]# tar -zxvf hadoop-2.7.3.tar.gz -C /opt/module/
3. Configure hadoop-env.sh
(In vim, press Esc and type :set nu to display line numbers.)
Open hadoop-env.sh and point JAVA_HOME at the JDK:
- [root@hadoop01 ~]# cd /opt/module/hadoop-2.7.3/etc/hadoop
- [root@hadoop01 hadoop]# vim hadoop-env.sh
- 25 export JAVA_HOME=/opt/module/jdk1.8.0_144
-
4. Add Hadoop to the environment variables
- [root@hadoop01 hadoop]# vim /etc/profile #append at the end of the file
- export HADOOP_HOME=/opt/module/hadoop-2.7.3
- export PATH=$PATH:$HADOOP_HOME/bin
- export PATH=$PATH:$HADOOP_HOME/sbin
5. Reload the profile so the changes take effect
[root@hadoop01 software]# source /etc/profile
6. Repeat steps 3-5 on hadoop02 and hadoop03.
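To confirm the environment variables took effect, a quick check (a sketch; the first line of output should report the version):
- [root@hadoop01 software]# hadoop version
- Hadoop 2.7.3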
1. Generate the public/private key pair on hadoop01 (if /root/.ssh does not exist yet, ssh to localhost once first so it gets created)
- [root@hadoop01 ~]# cd .ssh
- [root@hadoop01 .ssh]# pwd
- /root/.ssh
- [root@hadoop01 .ssh]# ssh-keygen -t rsa
- [root@hadoop01 .ssh]# ssh-copy-id hadoop01 #type yes, press Enter, then enter the password
- [root@hadoop01 .ssh]# ssh-copy-id hadoop02
- [root@hadoop01 .ssh]# ssh-copy-id hadoop03
- [root@hadoop01 .ssh]# ssh-copy-id localhost
2. Repeat the same steps on hadoop02 and hadoop03 so every node can ssh to every other node without a password.
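A quick way to verify passwordless login (a sketch): each of these should print the remote hostname without prompting for a password.
- [root@hadoop01 ~]# ssh hadoop02 hostname
- hadoop02
- [root@hadoop01 ~]# ssh hadoop03 hostname
- hadoop03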
1. Create a bin directory under /root
[root@hadoop01 ~]# mkdir bin
2. Create the script file
[root@hadoop01 bin]# touch xsync
3. Write the cluster sync script
- [root@hadoop01 bin]# vim xsync
-
- #!/bin/bash
- #1 Get the number of arguments; exit if there are none
- pcount=$#
- if((pcount==0)); then
- echo no args;
- exit;
- fi
-
- #2 Get the file name
- p1=$1
- fname=`basename $p1`
- echo fname=$fname
-
- #3 Get the absolute path of the parent directory
- pdir=`cd -P $(dirname $p1); pwd`
- echo pdir=$pdir
-
- #4 Get the current user name
- user=`whoami`
-
- #5 Loop over the nodes
- for((host=1; host<4; host++)); do
- #echo $pdir/$fname $user@hadoop$host:$pdir
- echo --------------- hadoop0$host ----------------
- rsync -rvl $pdir/$fname $user@hadoop0$host:$pdir
- done
4. Make the script executable
[root@hadoop01 bin]# chmod 777 xsync
5. Sync the /root/bin directory to the other nodes
[root@hadoop01 bin]# /root/bin/xsync /root/bin
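Once xsync works, it can also be used to push /etc/profile (the JDK and Hadoop variables) to the other nodes instead of editing the file by hand. A sketch; remember to re-source it on each node afterwards:
- [root@hadoop01 bin]# xsync /etc/profile
- [root@hadoop02 ~]# source /etc/profile
- [root@hadoop03 ~]# source /etc/profile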
1. Configure core-site.xml
- [root@hadoop01 hadoop]# vim core-site.xml
- <configuration>
- <!-- address of the NameNode in HDFS -->
- <property>
- <name>fs.defaultFS</name>
- <value>hdfs://hadoop01:9000</value>
- </property>
-
- <!-- directory for files Hadoop generates at runtime -->
- <property>
- <name>hadoop.tmp.dir</name>
- <value>/opt/module/hadoop-2.7.3/data/tmp</value>
- </property>
- </configuration>
2. Configure hdfs-site.xml
- [root@hadoop01 hadoop]# vim hdfs-site.xml
- <configuration>
- <property>
- <name>dfs.replication</name>
- <value>3</value>
- </property>
-
- <property>
- <name>dfs.namenode.secondary.http-address</name>
- <value>hadoop01:50090</value>
- </property>
- </configuration>
3. Configure slaves
- [root@hadoop01 hadoop]# vim slaves
-
- hadoop01
- hadoop02
- hadoop03
1. Configure yarn-env.sh
- [root@hadoop01 hadoop]# vim yarn-env.sh
- 23 export JAVA_HOME=/opt/module/jdk1.8.0_144
2. Configure yarn-site.xml
- [root@hadoop01 hadoop]# vim yarn-site.xml
- <configuration>
-
- <!-- how reducers fetch data -->
- <property>
- <name>yarn.nodemanager.aux-services</name>
- <value>mapreduce_shuffle</value>
- </property>
-
- <!-- address of the YARN ResourceManager -->
- <property>
- <name>yarn.resourcemanager.hostname</name>
- <value>hadoop01</value>
- </property>
- <!-- enable log aggregation -->
- <property>
- <name>yarn.log-aggregation-enable</name>
- <value>true</value>
- </property>
- <!-- keep aggregated logs for 7 days -->
- <property>
- <name>yarn.log-aggregation.retain-seconds</name>
- <value>604800</value>
- </property>
-
- </configuration>
3. Configure mapred-env.sh
- [root@hadoop01 hadoop]# vim mapred-env.sh
- export JAVA_HOME=/opt/module/jdk1.8.0_144
-
-
- export HADOOP_JOB_HISTORYSERVER_HEAPSIZE=1000
-
- export HADOOP_MAPRED_ROOT_LOGGER=INFO,RFA
4. Configure mapred-site.xml
- [root@hadoop01 hadoop]# mv mapred-site.xml.template mapred-site.xml #rename the template
- [root@hadoop01 hadoop]# vim mapred-site.xml
- <configuration>
- <!-- run MapReduce on YARN -->
- <property>
- <name>mapreduce.framework.name</name>
- <value>yarn</value>
- </property>
- <property>
- <name>mapreduce.jobhistory.address</name>
- <value>hadoop01:10020</value>
- </property>
- <property>
- <name>mapreduce.jobhistory.webapp.address</name>
- <value>hadoop01:19888</value>
- </property>
-
- </configuration>
5. Sync the configured Hadoop directory to hadoop02 and hadoop03
[root@hadoop01 hadoop]# /root/bin/xsync /opt/module/hadoop-2.7.3/
1. Format the NameNode (only before the first start of the cluster)
[root@hadoop01 hadoop-2.7.3]# bin/hdfs namenode -format
2. Start HDFS
[root@hadoop01 hadoop-2.7.3]# sbin/start-dfs.sh
3. Check the processes
- [root@hadoop01 hadoop-2.7.3]# jps
- 10496 Jps
- 28469 SecondaryNameNode
- 28189 NameNode
- 28286 DataNode
- [root@hadoop02 ~]# jps
- 27242 Jps
- 3614 DataNode
- [root@hadoop03 ~]# jps
- 27242 Jps
- 3614 DataNode
4. Open the HDFS web UI at port 50070 (http://hadoop01:50070)

5. Start the YARN cluster
[root@hadoop01 hadoop-2.7.3]# sbin/start-yarn.sh
6. Check the YARN processes
- [root@hadoop01 hadoop-2.7.3]# jps
- 49155 NodeManager
- 28469 SecondaryNameNode
- 48917 ResourceManager
- 10600 Jps
- 28189 NameNode
- 28286 DataNode
- [root@hadoop02 ~]# jps
- 3736 NodeManager
- 27242 Jps
- 3614 DataNode
- [root@hadoop03 ~]# jps
- 3736 NodeManager
- 27242 Jps
- 3614 DataNode
7. Open the YARN web UI at port 8088 (http://hadoop01:8088)

1. Locate the hosts file on Windows (C:\Windows\System32\drivers\etc\hosts)

2. Edit it and add the IP addresses and hostnames
- # Copyright (c) 1993-2009 Microsoft Corp.
- #
- # This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
- #
- # This file contains the mappings of IP addresses to host names. Each
- # entry should be kept on an individual line. The IP address should
- # be placed in the first column followed by the corresponding host name.
- # The IP address and the host name should be separated by at least one
- # space.
- #
- # Additionally, comments (such as these) may be inserted on individual
- # lines or following the machine name denoted by a '#' symbol.
- #
- # For example:
- #
- # 102.54.94.97 rhino.acme.com # source server
- # 38.25.63.10 x.acme.com # x client host
-
- # localhost name resolution is handled within DNS itself.
- # 127.0.0.1 localhost
- # ::1 localhost
- # append at the end
- 192.168.86.101 hadoop01
- 192.168.86.102 hadoop02
- 192.168.86.103 hadoop03
1. Create a mount point
- [root@hadoop01 ~]# mkdir /mnt/cdrom
- [root@hadoop01 ~]# cd /mnt
- [root@hadoop01 mnt]# ll
- total 4
- dr-xr-xr-x. 7 root root 4096 May 23 2016 cdrom
2. Mount the CD-ROM and back up the existing repo files
- [root@hadoop01 mnt]# mount -t auto /dev/cdrom /mnt/cdrom
- [root@hadoop01 mnt]# cd /etc/yum.repos.d/
- [root@hadoop01 yum.repos.d]# mkdir bak
- [root@hadoop01 yum.repos.d]# mv CentOS-* bak
3. Create and configure CentOS-DVD.repo
- [root@hadoop01 yum.repos.d]# touch CentOS-DVD.repo
- [root@hadoop01 yum.repos.d]# vim CentOS-DVD.repo
-
- [centos6-dvd]
- name=Welcome to local source yum
- baseurl=file:///mnt/cdrom
- enabled=1
- gpgcheck=0
4. Reload the yum repositories
- [root@hadoop01 yum.repos.d]# yum clean all
- [root@hadoop01 yum.repos.d]# yum repolist all
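To confirm the local DVD repository works, a hedged example: install a package from it (ntp and ntpdate are used in the time-synchronization section later and should be available on the CentOS 6.8 DVD).
- [root@hadoop01 yum.repos.d]# yum install -y ntp ntpdate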
1. Configure mapred-site.xml
- [root@hadoop01 hadoop]# vim mapred-site.xml
- <configuration>
- <!-- run MapReduce on YARN -->
- <property>
- <name>mapreduce.framework.name</name>
- <value>yarn</value>
- </property>
- <property>
- <name>mapreduce.jobhistory.address</name>
- <value>hadoop01:10020</value>
- </property>
- <property>
- <name>mapreduce.jobhistory.webapp.address</name>
- <value>hadoop01:19888</value>
- </property>
-
- </configuration>
2. Find the history server startup script
- [root@hadoop01 hadoop-2.7.3]# ls sbin/ | grep mr
- mr-jobhistory-daemon.sh
3. Start the history server
[root@hadoop01 hadoop-2.7.3]# sbin/mr-jobhistory-daemon.sh start historyserver
4. Check that the history server is running
[root@hadoop01 hadoop-2.7.3]# jps
5. View the JobHistory UI at port 19888
http://hadoop01:19888/jobhistory
1. Configure yarn-site.xml
- [root@hadoop01 hadoop]# vim yarn-site.xml
- <configuration>
-
- <!-- how reducers fetch data -->
- <property>
- <name>yarn.nodemanager.aux-services</name>
- <value>mapreduce_shuffle</value>
- </property>
-
- <!-- address of the YARN ResourceManager -->
- <property>
- <name>yarn.resourcemanager.hostname</name>
- <value>hadoop01</value>
- </property>
- <!-- enable log aggregation -->
- <property>
- <name>yarn.log-aggregation-enable</name>
- <value>true</value>
- </property>
- <!-- keep aggregated logs for 7 days -->
- <property>
- <name>yarn.log-aggregation.retain-seconds</name>
- <value>604800</value>
- </property>
-
- </configuration>
2. Stop the NodeManager, ResourceManager, and JobHistoryServer
- [root@hadoop01 hadoop-2.7.3]$ sbin/yarn-daemon.sh stop resourcemanager
- [root@hadoop01 hadoop-2.7.3]$ sbin/yarn-daemon.sh stop nodemanager
- [root@hadoop01 hadoop-2.7.3]$ sbin/mr-jobhistory-daemon.sh stop historyserver
3. Start the NodeManager, ResourceManager, and JobHistoryServer
- [root@hadoop01 hadoop-2.7.3]$ sbin/yarn-daemon.sh start resourcemanager
- [root@hadoop01 hadoop-2.7.3]$ sbin/yarn-daemon.sh start nodemanager
- [root@hadoop01 hadoop-2.7.3]$ sbin/mr-jobhistory-daemon.sh start historyserver
4. Delete the existing output directory on HDFS (if present)
[root@hadoop01 hadoop-2.7.3]$ bin/hdfs dfs -rm -R /user/root/output
5. Run the wordcount example
[root@hadoop01 hadoop-2.7.3]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /user/root/input /user/root/output
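The job above assumes /user/root/input already exists on HDFS. If it does not, a sketch of creating it and uploading some sample files (the Hadoop config files are used here only as example data):
- [root@hadoop01 hadoop-2.7.3]$ bin/hdfs dfs -mkdir -p /user/root/input
- [root@hadoop01 hadoop-2.7.3]$ bin/hdfs dfs -put etc/hadoop/*.xml /user/root/input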
6. View the job logs in the JobHistory UI at port 19888

1. Install MySQL
[root@hadoop01 ~]# yum install mysql-server -y
2. Start MySQL
[root@hadoop01 ~]# service mysqld start
3. Initialize the root password: when asked for the current password, just press Enter (none is set yet), then set a new one (123456 is used below)
[root@hadoop01 ~]# /usr/bin/mysql_secure_installation
4. Restart MySQL and log in
- [root@hadoop01 ~]# service mysqld restart
- [root@hadoop01 ~]# mysql -u root -p123456
5. Allow remote clients to connect to MySQL
- mysql> GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123456' WITH
- GRANT OPTION;
- Query OK, 0 rows affected (0.01 sec)
- mysql> FLUSH PRIVILEGES;
- Query OK, 0 rows affected (0.00 sec)

6. Exit MySQL
- mysql> exit
- Bye
- [root@hadoop01 ~]#
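To confirm the grant works, a hedged check from another node (this assumes the mysql client is installed there, e.g. via yum install -y mysql):
- [root@hadoop02 ~]# mysql -h hadoop01 -u root -p123456 -e "show databases;"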
1. Upload the Hive package
apache-hive-2.1.1-bin.tar.gz is the version used here
[root@hadoop01 software]# ls #check that the package has been uploaded
2. Extract the package
[root@hadoop01 software]# tar -zxvf apache-hive-2.1.1-bin.tar.gz -C /opt/module/
3. Rename the directory, then edit hive-env.sh and set the environment variables
- [root@hadoop01 module]# mv apache-hive-2.1.1-bin/ hive
- [root@hadoop01 module]# cd hive/conf/
- [root@hadoop01 conf]# mv hive-env.sh.template hive-env.sh
- [root@hadoop01 conf]# vim hive-env.sh
- 47 # Set HADOOP_HOME to point to a specific hadoop install directory
- 48 HADOOP_HOME=/opt/module/hadoop-2.7.3
- 49
- 50 # Hive Configuration Directory can be controlled by:
- 51 export HIVE_CONF_DIR=/opt/module/hive/conf
1. Upload and extract the MySQL JDBC driver package
mysql-connector-java-5.1.27.tar.gz, downloaded from the official site
2. Copy the MySQL driver jar into Hive's lib directory
[root@hadoop01 mysql-connector-java-5.1.27]# cp mysql-connector-java-5.1.27-bin.jar /opt/module/hive/lib
3. Configure MySQL as Hive's metastore
- [root@hadoop01 mysql-connector-java-5.1.27]# cd /opt/module/hive/conf/
- [root@hadoop01 conf]# vim hive-site.xml
-
- <?xml version="1.0"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <configuration>
- <property>
- <name>javax.jdo.option.ConnectionURL</name>
- <value>jdbc:mysql://hadoop01:3306/hive?createDatabaseIfNotExist=true</value>
- <description>JDBC connect string for a JDBC metastore</description>
- </property>
- <property>
- <name>javax.jdo.option.ConnectionDriverName</name>
- <value>com.mysql.jdbc.Driver</value>
- <description>Driver class name for a JDBC metastore</description>
- </property>
- <property>
- <name>javax.jdo.option.ConnectionUserName</name>
- <value>root</value>
- <description>username to use against metastore database</description>
- </property>
- <property>
- <name>javax.jdo.option.ConnectionPassword</name>
- <value>123456</value>
- <description>password to use against metastore database</description>
- </property>
- </configuration>
4. Create the /tmp and /user/hive/warehouse directories on HDFS and make them group-writable
- [root@hadoop01 hadoop-2.7.3]$ bin/hadoop fs -mkdir /tmp
- [root@hadoop01 hadoop-2.7.3]$ bin/hadoop fs -mkdir -p /user/hive/warehouse
- [root@hadoop01 hadoop-2.7.3]$ bin/hadoop fs -chmod g+w /tmp
- [root@hadoop01 hadoop-2.7.3]$ bin/hadoop fs -chmod g+w /user/hive/warehouse
5. Initialize the metastore schema (run from the Hive installation directory)
[root@hadoop01 hive]# bin/schematool -dbType mysql -initSchema
6. Launch the Hive shell to test
- [root@hadoop01 hive]# bin/hive
- which: no hbase in (/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/module/jdk1.8.0_144/bin:/opt/module/hadoop-2.7.3/bin:/opt/module/hadoop-2.7.3/sbin:/root/bin)
- SLF4J: Class path contains multiple SLF4J bindings.
- SLF4J: Found binding in [jar:file:/opt/module/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: Found binding in [jar:file:/opt/module/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
-
- Logging initialized using configuration in jar:file:/opt/module/hive/lib/hive-common-2.1.1.jar!/hive-log4j2.properties Async: true
- Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
- hive> exit;
- [root@hadoop01 hive]#
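A minimal smoke test inside the Hive shell (a sketch, assuming the metastore initialized cleanly; test_db is just a throwaway name):
- hive> create database test_db;
- hive> show databases;
- hive> drop database test_db;
- hive> exit;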

1. Set the same time zone (Asia/Shanghai) on all three servers
- [root@hadoop01 ~] # cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
- [root@hadoop02 ~] # cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
- [root@hadoop03 ~] # cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
2. Configuration on hadoop01
hadoop01 acts as the master time server; the other machines synchronize their clocks against it.
- [root@hadoop01 ~]# vim /etc/ntp.conf
- 10
- 11 # Permit all access over the loopback interface. This could
- 12 # be tightened as well, but to do so would effect some of
- 13 # the administrative functions.
- 14 restrict 192.168.86.101 nomodify notrap nopeer noquery #change the IP to this server's address
- 15 restrict 127.0.0.1
- 16 restrict -6 ::1
- 17
- 18 # Hosts on local network are less restricted.
- 19 restrict 192.168.86.0 mask 255.255.255.0 nomodify notrap #network address of the local subnet
- 20
- 21 # Use public servers from the pool.ntp.org project.
- 22 # Please consider joining the pool (http://www.pool.ntp.org/join.html).
- 23 #server 0.centos.pool.ntp.org iburst
- 24 #server 1.centos.pool.ntp.org iburst
- 25 #server 2.centos.pool.ntp.org iburst
- 26 #server 3.centos.pool.ntp.org iburst
- 27 server 127.127.1.0
- 28 fudge 127.127.1.0 stratum 10
3. Configuration on hadoop02 and hadoop03
- 6 # Permit time synchronization with our time source, but do not
- 7 # permit the source to query or modify the service on this system.
- 8 restrict default kod nomodify notrap nopeer noquery
- 9 restrict -6 default kod nomodify notrap nopeer noquery
- 10
- 11 # Permit all access over the loopback interface. This could
- 12 # be tightened as well, but to do so would effect some of
- 13 # the administrative functions.
- 14 restrict 192.168.86.101 nomodify notrap nopeer noquery
- 15 restrict 127.0.0.1
- 16 restrict -6 ::1
- 17
- 18 # Hosts on local network are less restricted.
- 19 restrict 192.168.86.0 mask 255.255.255.0 nomodify notrap
- 20
- 21 # Use public servers from the pool.ntp.org project.
- 22 # Please consider joining the pool (http://www.pool.ntp.org/join.html).
- 23 #server 0.centos.pool.ntp.org iburst
- 24 #server 1.centos.pool.ntp.org iburst
- 25 #server 2.centos.pool.ntp.org iburst
- 26 #server 3.centos.pool.ntp.org iburst
- 27 server 192.168.86.101
- 28 fudge 192.168.86.101 stratum 10
1. Start the ntpd service on all three virtual machines
- [root@hadoop01 ~]# service ntpd start
- [root@hadoop02 ~]# service ntpd start
- [root@hadoop03 ~]# service ntpd start
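The service commands above only start ntpd for the current session; to have it start automatically after a reboot (not part of the original steps, but the standard CentOS 6 way), run on all three machines:
- [root@hadoop01 ~]# chkconfig ntpd on
- [root@hadoop02 ~]# chkconfig ntpd on
- [root@hadoop03 ~]# chkconfig ntpd on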
2. Change the system time on hadoop02 (or hadoop03)
[root@hadoop02 ~]# date -s "2017-9-11 11:11:11"
3. From that machine, request a manual sync from the time server hadoop01 (manual sync test). If ntpd is already running on this machine, stop it first with service ntpd stop, run ntpdate, then start ntpd again; ntpdate cannot run while ntpd holds the NTP port.
[root@hadoop02 ~]# ntpdate 192.168.86.101
Check the result of the sync:
[root@hadoop02 ~]# date
4. Change the time again on hadoop02 (or hadoop03)
[root@hadoop02 ~]# date -s "2017-9-11 11:11:11"
5. After about ten minutes, check whether the machine has synchronized with the time server again (automatic sync test)
[root@hadoop02 ~]# date
1. Upload the package
zookeeper-3.4.10.tar.gz
2. Extract it and rename the directory
- [root@hadoop01 software]# tar -zxvf zookeeper-3.4.10.tar.gz -C /opt/module/
- [root@hadoop01 software]# cd /opt/module/
- [root@hadoop01 module]# mv zookeeper-3.4.10/ zookeeper
- [root@hadoop01 conf]# cd /opt/module/zookeeper/conf
1. Rename zoo_sample.cfg
[root@hadoop01 conf]# mv zoo_sample.cfg zoo.cfg
2. Create a directory for the ZooKeeper data:
[root@hadoop01 zookeeper]# mkdir /opt/module/zookeeper/data
3. Configure zoo.cfg
- [root@hadoop01 zookeeper]# vim conf/zoo.cfg
-
- # The number of milliseconds of each tick
- tickTime=2000
- # The number of ticks that the initial
- # synchronization phase can take
- initLimit=10
- # The number of ticks that can pass between
- # sending a request and getting an acknowledgement
- syncLimit=5
- # the directory where the snapshot is stored.
- # do not use /tmp for storage, /tmp here is just
- # example sakes.
- dataDir=/opt/module/zookeeper/data
- # the port at which the clients will connect
- clientPort=2181
- # the maximum number of client connections.
- # increase this if you need to handle more clients
- #maxClientCnxns=60
- #
- # Be sure to read the maintenance section of the
- # administrator guide before turning on autopurge.
- #
- # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
- #
- # The number of snapshots to retain in dataDir
- #autopurge.snapRetainCount=3
- # Purge task interval in hours
- # Set to "0" to disable auto purge feature
- #autopurge.purgeInterval=1
- #cluster members
- server.1=hadoop01:2888:3888
- server.2=hadoop02:2888:3888
- server.3=hadoop03:2888:3888
4. Create the myid file on hadoop01
- [root@hadoop01 data]# cd /opt/module/zookeeper/data/
- [root@hadoop01 data]# touch myid
- [root@hadoop01 data]# echo 1 > myid
- [root@hadoop01 data]# cat myid
- 1
1. Sync the ZooKeeper directory to the other nodes
[root@hadoop01 zookeeper]# /root/bin/xsync /opt/module/zookeeper
2. Set a unique myid on each node
- [root@hadoop01 zookeeper]# cd /opt/module/zookeeper/data/
- [root@hadoop01 data]# cat myid
- 1
- [root@hadoop02 data]# echo 2 > myid
- [root@hadoop02 data]# cat myid
- 2
- [root@hadoop03 module]# cd /opt/module/zookeeper/data/
- [root@hadoop03 data]# echo 3 > myid
- [root@hadoop03 data]# cat myid
- 3
- [root@hadoop01 data]# cd /opt/module/zookeeper/
- [root@hadoop02 data]# cd /opt/module/zookeeper/
- [root@hadoop03 data]# cd /opt/module/zookeeper/
- Run the following on all three machines:
- # bin/zkServer.sh start ---start
- # bin/zkServer.sh stop ---stop
- # bin/zkServer.sh status ---check status
- [root@hadoop01 zookeeper]# bin/zkServer.sh status
- ZooKeeper JMX enabled by default
- Using config: /opt/module/zookeeper/bin/../conf/zoo.cfg
- Mode: follower
-
- [root@hadoop02 zookeeper]# bin/zkServer.sh status
- ZooKeeper JMX enabled by default
- Using config: /opt/module/zookeeper/bin/../conf/zoo.cfg
- Mode: leader
-
- [root@hadoop03 zookeeper]# bin/zkServer.sh status
- ZooKeeper JMX enabled by default
- Using config: /opt/module/zookeeper/bin/../conf/zoo.cfg
- Mode: follower
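Beyond checking the Mode, a quick client-side check (a sketch): connect with the bundled CLI and list the root znode, which should show the built-in /zookeeper node.
- [root@hadoop01 zookeeper]# bin/zkCli.sh -server hadoop01:2181
- [zk: hadoop01:2181(CONNECTED) 0] ls /
- [zookeeper]
- [zk: hadoop01:2181(CONNECTED) 1] quit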
1. Upload the package
hbase-1.3.1-bin.tar.gz
2. Extract it and rename the directory to hbase
[root@hadoop01 software]# tar -zxvf hbase-1.3.1-bin.tar.gz -C /opt/module/
[root@hadoop01 software]# mv /opt/module/hbase-1.3.1 /opt/module/hbase
1. Changes to hbase-env.sh:
- [root@hadoop01 conf]# pwd
- /opt/module/hbase/conf
- [root@hadoop01 conf]# vim hbase-env.sh
- 27 export JAVA_HOME=/opt/module/jdk1.8.0_144
- 129 export HBASE_MANAGES_ZK=false
2. Changes to hbase-site.xml
- [root@hadoop01 conf]# vim hbase-site.xml
- <configuration>
- <property>
- <name>hbase.rootdir</name>
- <value>hdfs://hadoop01:9000/hbase</value>
- </property>
-
- <property>
- <name>hbase.cluster.distributed</name>
- <value>true</value>
- </property>
-
- <!-- New since HBase 0.98; earlier versions had no .port property and the default port was 60000 -->
- <property>
- <name>hbase.master.port</name>
- <value>16000</value>
- </property>
-
- <property>
- <name>hbase.zookeeper.quorum</name>
- <value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
- </property>
-
- <property>
- <name>hbase.zookeeper.property.dataDir</name>
- <value>/opt/module/zookeeper/data</value>
- </property>
- </configuration>
3. Configure regionservers
- [root@hadoop01 conf]# vim regionservers
-
- hadoop01
- hadoop02
- hadoop03
4. Symlink the Hadoop configuration files into HBase and sync HBase to the other nodes
- [root@hadoop01 module]# ln -s /opt/module/hadoop-2.7.3/etc/hadoop/core-site.xml /opt/module/hbase/conf/core-site.xml
- [root@hadoop01 module]# ln -s /opt/module/hadoop-2.7.3/etc/hadoop/hdfs-site.xml /opt/module/hbase/conf/hdfs-site.xml
[root@hadoop01 module]# xsync hbase/
1. Start HBase
- [root@hadoop02 hbase]# bin/start-hbase.sh
- starting master, logging to /opt/module/hbase/bin/../logs/hbase-root-master-hadoop02.out
- hadoop03: starting regionserver, logging to /opt/module/hbase/bin/../logs/hbase-root-regionserver-hadoop03.out
- hadoop01: starting regionserver, logging to /opt/module/hbase/bin/../logs/hbase-root-regionserver-hadoop01.out
- hadoop02: starting regionserver, logging to /opt/module/hbase/bin/../logs/hbase-root-regionserver-hadoop02.out
2. Stop HBase
[root@hadoop02 hbase]$ bin/stop-hbase.sh
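To verify the cluster came up, a sketch: check jps for HMaster/HRegionServer, open the HBase master web UI at port 16010 on the node where the HMaster runs (hadoop02 above), or run a status command in the HBase shell.
- [root@hadoop02 hbase]# bin/hbase shell
- hbase(main):001:0> status
- hbase(main):002:0> exit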

1. Upload the package
spark-2.1.1-bin-hadoop2.7.tgz, downloaded from the official site
2. Extract it
[root@hadoop01 software]# tar -zxvf spark-2.1.1-bin-hadoop2.7.tgz -C /opt/module/
1. Configure slaves
- [root@hadoop01 module]# mv spark-2.1.1-bin-hadoop2.7 spark
- [root@hadoop01 conf]# mv slaves.template slaves
- [root@hadoop01 conf]# vim slaves
-
- #
- # Licensed to the Apache Software Foundation (ASF) under one or more
- # contributor license agreements. See the NOTICE file distributed with
- # this work for additional information regarding copyright ownership.
- # The ASF licenses this file to You under the Apache License, Version 2.0
- # (the "License"); you may not use this file except in compliance with
- # the License. You may obtain a copy of the License at
- #
- # http://www.apache.org/licenses/LICENSE-2.0
- #
- # Unless required by applicable law or agreed to in writing, software
- # distributed under the License is distributed on an "AS IS" BASIS,
- # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- # See the License for the specific language governing permissions and
- # limitations under the License.
- #
-
- # A Spark Worker will be started on each of the machines listed below.
- hadoop01
- hadoop02
- hadoop03
2. Edit spark-env.sh
- [root@hadoop01 conf]# vim spark-env.sh
- SPARK_MASTER_HOST=hadoop01
- SPARK_MASTER_PORT=7077
3. Configure sbin/spark-config.sh (add JAVA_HOME so the daemons started over SSH can find Java)
- [root@hadoop01 sbin]# vim spark-config.sh
- export JAVA_HOME=/opt/module/jdk1.8.0_144
4. Sync Spark to the other nodes and start the cluster
[root@hadoop01 module]$ xsync spark/
- [root@hadoop01 spark]$ sbin/start-all.sh
- [root@hadoop01 spark]$ util.sh #a custom helper script that runs jps on every node
- ================root@hadoop01================
- 3330 Jps
- 3238 Worker
- 3163 Master
- ================root@hadoop02================
- 2966 Jps
- 2908 Worker
- ================root@hadoop03================
- 2978 Worker
- 3036 Jps
5. View the Spark master web UI at port 8080 (http://hadoop01:8080)

1. Rename spark-defaults.conf.template
[root@hadoop01 conf]$ mv spark-defaults.conf.template spark-defaults.conf
2. Edit spark-defaults.conf to enable event logging.
Note: the directory on HDFS must exist in advance.
Create it if it does not:
[root@hadoop01 conf]# hdfs dfs -mkdir /directory
- [root@hadoop01 conf]# vim spark-defaults.conf
-
- #
- # Licensed to the Apache Software Foundation (ASF) under one or more
- # contributor license agreements. See the NOTICE file distributed with
- # this work for additional information regarding copyright ownership.
- # The ASF licenses this file to You under the Apache License, Version 2.0
- # (the "License"); you may not use this file except in compliance with
- # the License. You may obtain a copy of the License at
- #
- # http://www.apache.org/licenses/LICENSE-2.0
- #
- # Unless required by applicable law or agreed to in writing, software
- # distributed under the License is distributed on an "AS IS" BASIS,
- # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- # See the License for the specific language governing permissions and
- # limitations under the License.
- #
-
- # Default system properties included when running spark-submit.
- # This is useful for setting default environmental settings.
-
- # Example:
- # spark.master spark://master:7077
- # spark.eventLog.enabled true
- # spark.eventLog.dir hdfs://namenode:8021/directory
- # spark.serializer org.apache.spark.serializer.KryoSerializer
- # spark.driver.memory 5g
- # spark.executor.extraJavaOptions -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
- spark.eventLog.enabled true
- spark.eventLog.dir hdfs://hadoop01:9000/directory
3. Add the following to spark-env.sh:
- [root@hadoop01 conf]# vim spark-env.sh
- export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=18080
- -Dspark.history.retainedApplications=30
- -Dspark.history.fs.logDirectory=hdfs://hadoop01:9000/directory"
4. Sync the configuration to the other nodes
[root@hadoop01 conf]# xsync /opt/module/spark/conf
5. Start the history server
[root@hadoop01 spark]# sbin/start-history-server.sh
6. Run a job again
- [root@hadoop01 spark]$ bin/spark-submit \
- --class org.apache.spark.examples.SparkPi \
- --master spark://hadoop01:7077 \
- --executor-memory 1G \
- --total-executor-cores 2 \
- ./examples/jars/spark-examples_2.11-2.1.1.jar \
- 100
7. View the history server UI at port 18080 (http://hadoop01:18080)

1. Upload the package
apache-flume-1.7.0-bin.tar.gz, downloaded from the official site
2. Extract it
[root@hadoop01 software]# tar -zxvf apache-flume-1.7.0-bin.tar.gz -C /opt/module/
3. Rename apache-flume-1.7.0-bin to flume
[root@hadoop01 module]# mv apache-flume-1.7.0-bin flume
1. Rename flume-env.sh.template under flume/conf to flume-env.sh and configure it
- [root@hadoop01 conf]# mv flume-env.sh.template flume-env.sh
- [root@hadoop01 conf]# vim flume-env.sh
- export JAVA_HOME=/opt/module/jdk1.8.0_144
[root@hadoop01 flume]# /root/bin/xsync flume/
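A quick way to confirm the Flume installation on each node (a sketch): print the version.
- [root@hadoop01 flume]# bin/flume-ng version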