Hadoop
Setting Up the Ubuntu Environment
192.168.1.64 HNClient
192.168.1.65 HNName
On SUSE and Ubuntu, vi's Backspace key does not delete characters. To delete, press ESC and then x; to insert, use i; to open a new line below the current one, use o.
On HNClient:
norman@HNClient:~$ sudo vi /etc/hostname
HNClient
norman@HNClient:~$ sudo apt-get install openssh-server
norman@HNClient:~$ sudo vi /etc/hosts
192.168.1.64 HNClient
192.168.1.65 HNName
norman@HNClient:~$ ssh-keygen (press Enter to accept every default)
Generating public/private rsa key pair.
Enter file in which to save the key (/home/norman/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/norman/.ssh/id_rsa.
Your public key has been saved in /home/norman/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:rj3kM5OeqxceqGP6DcofXa+hZFReLQmKqksqoYL+YH4 norman@HNClient
The key's randomart image is:
(randomart omitted)
norman@HNClient:~$ ssh localhost (ssh localhost still asks for a password)
norman@localhost's password:
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.13.0-36-generic i686)
251 packages can be updated.
79 updates are security updates.
New release '18.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.
Last login: Wed Oct 31 23:14:08 2018 from 192.168.1.65
norman@HNClient:~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
norman@HNClient:~$ ssh localhost (ssh localhost no longer asks for a password)
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.13.0-36-generic i686)
251 packages can be updated.
79 updates are security updates.
New release '18.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.
Last login: Wed Oct 31 23:18:02 2018 from 127.0.0.1
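The passwordless-login steps above boil down to appending the public key to authorized_keys with the permissions sshd insists on. A minimal sketch of that step — the demo directory and placeholder key are illustrative; the real setup uses ~/.ssh and the key produced by ssh-keygen:

```shell
# Demo stand-ins; the real setup uses "$HOME/.ssh" and the real id_rsa.pub.
SSH_DIR="./ssh-demo"
PUBKEY="ssh-rsa AAAAB3-demo-key norman@HNClient"   # placeholder public key

mkdir -p "$SSH_DIR"
chmod 700 "$SSH_DIR"                  # sshd ignores keys if the dir is too open
touch "$SSH_DIR/authorized_keys"
# Append only if absent, so re-running the setup stays idempotent.
grep -qxF "$PUBKEY" "$SSH_DIR/authorized_keys" || \
    printf '%s\n' "$PUBKEY" >> "$SSH_DIR/authorized_keys"
chmod 600 "$SSH_DIR/authorized_keys"  # must not be group/world writable
```

Unlike the plain `cat >> authorized_keys` used above, the grep guard keeps the file duplicate-free if the step is repeated.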
norman@HNClient:~$ ssh HNName (ssh HNName still asks for a password)
norman@hnname's password:
norman@HNClient:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub norman@HNName
norman@HNClient:~$ ssh HNName (ssh HNName now logs in without a password)
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.13.0-36-generic i686)
254 packages can be updated.
79 updates are security updates.
New release '18.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.
Last login: Wed Oct 31 23:23:21 2018 from 192.168.1.64
norman@HNName:~$
On HNName:
norman@HNName:~$ sudo vi /etc/hosts
192.168.1.64 HNClient
192.168.1.65 HNName
norman@HNName:~$ ssh-keygen (press Enter to accept every default)
Generating public/private rsa key pair.
Enter file in which to save the key (/home/norman/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/norman/.ssh/id_rsa.
Your public key has been saved in /home/norman/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:YXrPGdhKYkPsAroDlIZJ4sYdbrpHyvaMQccMV3GJn9I norman@HNName
The key's randomart image is:
(randomart omitted)
norman@HNName:~$ ssh localhost (ssh localhost still asks for a password)
norman@localhost's password:
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.13.0-36-generic i686)
251 packages can be updated.
79 updates are security updates.
New release '18.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.
Last login: Wed Oct 31 22:55:29 2018 from 127.0.0.1
norman@HNName:~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
norman@HNName:~$ ssh localhost (ssh localhost no longer asks for a password)
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.13.0-36-generic i686)
254 packages can be updated.
79 updates are security updates.
New release '18.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.
Last login: Wed Oct 31 23:00:28 2018 from 127.0.0.1
norman@HNName:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub norman@hnclient
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/norman/.ssh/id_rsa.pub"
The authenticity of host 'hnclient (192.168.1.64)' can't be established.
ECDSA key fingerprint is SHA256:w5dwBrXor00JfFtpGXc0G/+deJJwmAxKmjXE32InhgA.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
norman@hnclient's password:
Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'norman@hnclient'"
and check to make sure that only the key(s) you wanted were added.
norman@HNName:~$ ssh hnclient (ssh hnclient now logs in without a password)
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.13.0-36-generic i686)
251 packages can be updated.
79 updates are security updates.
New release '18.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.
Last login: Wed Oct 31 23:05:13 2018 from 192.168.1.58
norman@HNClient:~$ exit
norman@HNName:~$ sudo apt-get install openjdk-7-jdk
[sudo] password for norman:
Reading package lists... Done
Building dependency tree
Reading state information... Done
Package openjdk-7-jdk is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
E: Package 'openjdk-7-jdk' has no installation candidate
This happens because Ubuntu 16.04's default sources no longer carry OpenJDK 7, so the repository has to be added manually:
norman@HNName:~$ sudo add-apt-repository ppa:openjdk-r/ppa (add the OpenJDK PPA source; add-apt-repository ppa:xxx/ppa fetches the latest personal package archive, adds it to the current apt sources, and imports its public key automatically)
norman@HNName:~$ sudo apt-get update
norman@HNName:~$ sudo apt-get install openjdk-7-jdk
norman@HNName:~$ java -version
java version "1.7.0_95"
OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-3)
OpenJDK Client VM (build 24.95-b01, mixed mode, sharing)
norman@HNName:~$ wget
norman@HNName:~$ tar -zxvf hadoop-1.2.0-bin.tar.gz
norman@HNName:~$ sudo cp -r hadoop-1.2.0 /usr/local/hadoop
norman@HNName:~$ dir /usr/local/hadoop
bin          hadoop-ant-1.2.0.jar          hadoop-tools-1.2.0.jar  NOTICE.txt
build.xml    hadoop-client-1.2.0.jar       ivy                     README.txt
c++          hadoop-core-1.2.0.jar         ivy.xml                 sbin
CHANGES.txt  hadoop-examples-1.2.0.jar     lib                     share
conf         hadoop-minicluster-1.2.0.jar  libexec                 src
contrib      hadoop-test-1.2.0.jar         LICENSE.txt             webapps
norman@HNName:~$ sudo vi $HOME/.bashrc (append the following at the end)
export HADOOP_PREFIX=/usr/local/hadoop
export PATH=$PATH:$HADOOP_PREFIX/bin
norman@HNName:~$ exec bash
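Appending to .bashrc by hand duplicates the exports if the step is ever repeated. A guarded sketch of the same edit — the demo file name is illustrative; the real target is $HOME/.bashrc:

```shell
BASHRC="./bashrc-demo"    # demo file; the real setup edits "$HOME/.bashrc"
touch "$BASHRC"
# Append the Hadoop exports only once, no matter how often this runs.
if ! grep -q 'HADOOP_PREFIX' "$BASHRC"; then
    cat >> "$BASHRC" <<'EOF'
export HADOOP_PREFIX=/usr/local/hadoop
export PATH=$PATH:$HADOOP_PREFIX/bin
EOF
fi
```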
norman@HNName:~$ echo $PATH
norman@HNName:~$ sudo vi /usr/local/hadoop/conf/hadoop-env.sh
( The java implementation to use. Required.)
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-i386
( Extra Java runtime options. Empty by default. Disable IPv6:)
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
Installing Apache Hadoop (Single Node)
norman@HNName:~$ sudo vi /usr/local/hadoop/conf/core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://HNName:10001</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>
norman@HNName:~$ sudo vi /usr/local/hadoop/conf/mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>HNName:10002</value>
  </property>
</configuration>
norman@HNName:~$ sudo mkdir /usr/local/hadoop/tmp
norman@HNName:~$ sudo chown norman /usr/local/hadoop/tmp
norman@HNName:~$ hadoop namenode -format (the line below indicates success)
18/11/01 19:07:36 INFO common.Storage: Storage directory /usr/local/hadoop/tmp/dfs/name has been successfully formatted.
norman@HNName:~$ hadoop-daemons.sh start namenode (fails with the following errors)
localhost: mkdir: cannot create directory '/usr/local/hadoop/libexec/../logs': Permission denied
localhost: chown: cannot access '/usr/local/hadoop/libexec/../logs': No such file or directory
localhost: starting namenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-norman-namenode-HNName.out
localhost: /usr/local/hadoop/bin/hadoop-daemon.sh: line 137: /usr/local/hadoop/libexec/../logs/hadoop-norman-namenode-HNName.out: No such file or directory
localhost: head: cannot open '/usr/local/hadoop/libexec/../logs/hadoop-norman-namenode-HNName.out' for reading: No such file or directory
localhost: /usr/local/hadoop/bin/hadoop-daemon.sh: line 147: /usr/local/hadoop/libexec/../logs/hadoop-norman-namenode-HNName.out: No such file or directory
localhost: /usr/local/hadoop/bin/hadoop-daemon.sh: line 148: /usr/local/hadoop/libexec/../logs/hadoop-norman-namenode-HNName.out: No such file or directory
norman@HNName:~$ ll /usr/local
total 44
drwxr-xr-x 11 root root 4096 Nov  1 02:02 ./
drwxr-xr-x 11 root root 4096 Feb 28  2018 ../
drwxr-xr-x  2 root root 4096 Feb 28  2018 bin/
drwxr-xr-x  2 root root 4096 Feb 28  2018 etc/
drwxr-xr-x  2 root root 4096 Feb 28  2018 games/
drwxr-xr-x 15 root root 4096 Nov  1 20:05 hadoop/
drwxr-xr-x  2 root root 4096 Feb 28  2018 include/
drwxr-xr-x  4 root root 4096 Feb 28  2018 lib/
lrwxrwxrwx  1 root root    9 Jul 26 23:29 man -> share/man/
drwxr-xr-x  2 root root 4096 Feb 28  2018 sbin/
drwxr-xr-x  8 root root 4096 Feb 28  2018 share/
drwxr-xr-x  2 root root 4096 Feb 28  2018 src/
norman@HNName:~$ sudo chown norman /usr/local/hadoop
norman@HNName:~$ ll /usr/local
total 44
drwxr-xr-x 11 root   root 4096 Nov  1 02:02 ./
drwxr-xr-x 11 root   root 4096 Feb 28  2018 ../
drwxr-xr-x  2 root   root 4096 Feb 28  2018 bin/
drwxr-xr-x  2 root   root 4096 Feb 28  2018 etc/
drwxr-xr-x  2 root   root 4096 Feb 28  2018 games/
drwxr-xr-x 15 norman root 4096 Nov  1 20:05 hadoop/
drwxr-xr-x  2 root   root 4096 Feb 28  2018 include/
drwxr-xr-x  4 root   root 4096 Feb 28  2018 lib/
lrwxrwxrwx  1 root   root    9 Jul 26 23:29 man -> share/man/
drwxr-xr-x  2 root   root 4096 Feb 28  2018 sbin/
drwxr-xr-x  8 root   root 4096 Feb 28  2018 share/
drwxr-xr-x  2 root   root 4096 Feb 28  2018 src/
norman@HNName:~$ hadoop-daemons.sh start namenode
localhost: starting namenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-norman-namenode-HNName.out
norman@HNName:~$ start-all.sh
norman@HNName:~$ jps
23297 DataNode
23610 TaskTracker
23484 JobTracker
23739 Jps
23102 NameNode
23416 SecondaryNameNode
norman@HNName:~$ dir /usr/local/hadoop/bin
hadoop            hadoop-daemon.sh   rcc        start-all.sh       start-dfs.sh               start-mapred.sh  stop-balancer.sh  stop-jobhistoryserver.sh  task-controller
hadoop-config.sh  hadoop-daemons.sh  slaves.sh  start-balancer.sh  start-jobhistoryserver.sh  stop-all.sh      stop-dfs.sh       stop-mapred.sh
Managing HDFS
(Download the text data) Copy the web page contents into war_and_peace.txt. (Download any bulk data) e.g. QCLCD201701.zip and QCLCD201702.zip, then extract 201701hourly.txt and 201702hourly.txt.
On HNClient:
Put war_and_peace.txt under /home/norman/data/book, and 201701hourly.txt and 201702hourly.txt under /home/norman/data/weather.
norman@HNClient:~$ sudo mkdir -p /home/norman/data/book
norman@HNClient:~$ sudo mkdir -p /home/norman/data/weather
norman@HNClient:~$ sudo chown norman /home/norman/data/weather
norman@HNClient:~$ sudo chown norman /home/norman/data/book
norman@HNClient:~$ sudo add-apt-repository ppa:openjdk-r/ppa
norman@HNClient:~$ sudo apt-get update
norman@HNClient:~$ sudo apt-get install openjdk-7-jdk
norman@HNClient:~$ java -version
norman@HNClient:~$ wget
norman@HNClient:~$ tar -zxvf hadoop-1.2.0-bin.tar.gz
norman@HNClient:~$ sudo cp -r hadoop-1.2.0 /usr/local/hadoop
norman@HNClient:~$ sudo vi $HOME/.bashrc (append the following at the end)
export HADOOP_PREFIX=/usr/local/hadoop
export PATH=$PATH:$HADOOP_PREFIX/bin
norman@HNClient:~$ exec bash
norman@HNClient:~$ echo $PATH
norman@HNClient:~$ sudo vi /usr/local/hadoop/conf/hadoop-env.sh
( The java implementation to use. Required.)
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-i386
( Extra Java runtime options. Empty by default. Disable IPv6:)
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
norman@HNClient:~$ sudo vi /usr/local/hadoop/conf/core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://HNName:10001</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>
norman@HNClient:~$ sudo vi /usr/local/hadoop/conf/mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>HNName:10002</value>
  </property>
</configuration>
norman@HNClient:~$ hadoop fs -mkdir test
norman@HNClient:~$ hadoop fs -ls
Found 1 items
drwxr-xr-x - norman supergroup 0 2018-11-02 01:17 /user/norman/test
norman@HNClient:~$ hadoop fs -mkdir hdfs://hnname:10001/data/small
norman@HNClient:~$ hadoop fs -mkdir hdfs://hnname:10001/data/big
Open http://192.168.1.65:50070 in a browser.
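Note the two addressing styles above: `hadoop fs -mkdir test` created /user/norman/test because bare paths resolve against the user's home directory on the default filesystem from core-site.xml, while hdfs:// URIs are taken as-is. A sketch of that resolution rule, using plain string handling so no cluster is required (names are illustrative):

```shell
default_fs="hdfs://HNName:10001"   # fs.default.name from core-site.xml
user="norman"

# Mimic how the fs shell resolves a path argument.
resolve() {
    case "$1" in
        hdfs://*) echo "$1" ;;                         # fully-qualified URI
        /*)       echo "$default_fs$1" ;;              # absolute path on the default FS
        *)        echo "$default_fs/user/$user/$1" ;;  # relative to the user's home
    esac
}

resolve test            # → hdfs://HNName:10001/user/norman/test
resolve /data/big       # → hdfs://HNName:10001/data/big
```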
norman@HNClient:~$ hadoop fs -rmr test (test deleting)
Deleted hdfs://HNName:10001/user/norman/test
norman@HNClient:~$ hadoop fs -moveFromLocal /home/norman/data/book/war_and_peace.txt hdfs://hnname:10001/data/small/war_and_peace.txt
(The file is now visible under /data/small in the web UI; screenshot omitted.)
norman@HNClient:~$ hadoop fs -copyToLocal hdfs://hnname:10001/data/small/war_and_peace.txt /home/norman/data/book/war_and_peace.bak.txt (test copying back to local)
norman@HNClient:~$ hadoop fs -put /home/norman/data/weather hdfs://hnname:10001/data/big
(The weather files are now visible under /data/big; screenshot omitted.)
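The three transfer commands used above differ in what happens to the source: -moveFromLocal deletes the local copy after upload, while -put and -copyToLocal leave their source in place. Mimicked here with local directories standing in for HDFS (all paths are demo stand-ins):

```shell
src="./data-demo/book"          # stands in for /home/norman/data/book
dst="./hdfs-demo/data/small"    # stands in for hdfs://hnname:10001/data/small
mkdir -p "$src" "$dst"
echo "War and Peace" > "$src/war_and_peace.txt"

# hadoop fs -moveFromLocal: upload, then remove the local source
cp "$src/war_and_peace.txt" "$dst/" && rm "$src/war_and_peace.txt"

# hadoop fs -copyToLocal: download a copy; the "remote" copy stays put
cp "$dst/war_and_peace.txt" "$src/war_and_peace.bak.txt"
```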
norman@HNClient:~$ hadoop dfsadmin -report
Configured Capacity: 19033165824 (17.73 GB)
Present Capacity: 13114503168 (12.21 GB)
DFS Remaining: 12005150720 (11.18 GB)
DFS Used: 1109352448 (1.03 GB)
DFS Used%: 8.46%
Under replicated blocks: 19
Blocks with corrupt replicas: 0
Missing blocks: 0

Datanodes available: 1 (1 total, 0 dead)
Name: 192.168.1.65:50010
Decommission Status : Normal
Configured Capacity: 19033165824 (17.73 GB)
DFS Used: 1109352448 (1.03 GB)
Non DFS Used: 5918662656 (5.51 GB)
DFS Remaining: 12005150720 (11.18 GB)
DFS Used%: 5.83%
DFS Remaining%: 63.07%
Last contact: Fri Nov 02 01:49:43 GMT-08:00 2018
norman@HNClient:~$ hadoop dfsadmin -safemode enter (safe mode is needed during upgrades)
Safe mode is ON
norman@HNClient:~$ hadoop dfsadmin -safemode leave
Safe mode is OFF
On HNName:
norman@HNName:~$ hadoop fsck -blocks
Status: HEALTHY
 Total size: 1100586452 B
 Total dirs: 13
 Total files: 4
 Total blocks (validated): 19 (avg. block size 57925602 B)
 Minimally replicated blocks: 19 (100.0 %)
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 19 (100.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 3
 Average block replication: 1.0
 Corrupt blocks: 0
 Missing replicas: 38 (200.0 %)
 Number of data-nodes: 1
 Number of racks: 1
FSCK ended at Fri Nov 02 01:54:46 GMT-08:00 2018 in 1049 milliseconds
The filesystem under path '/' is HEALTHY
norman@HNName:~$ hadoop fsck /data/big
Status: HEALTHY
 Total size: 1097339705 B
 Total dirs: 2
 Total files: 2
 Total blocks (validated): 17 (avg. block size 64549394 B)
 Minimally replicated blocks: 17 (100.0 %)
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 17 (100.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 3
 Average block replication: 1.0
 Corrupt blocks: 0
 Missing replicas: 34 (200.0 %)
 Number of data-nodes: 1
 Number of racks: 1
FSCK ended at Fri Nov 02 19:33:55 GMT-08:00 2018 in 14 milliseconds
The filesystem under path '/data/big' is HEALTHY
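The headline numbers in the report and fsck output can be cross-checked: the cluster-level DFS Used% is measured against present capacity, the per-datanode figure against configured capacity, and the 38 missing replicas follow from a replication factor of 3 on a single datanode. Recomputed with awk (byte counts copied from the output above):

```shell
configured=19033165824   # Configured Capacity, in bytes
present=13114503168      # Present Capacity
used=1109352448          # DFS Used
blocks=19                # Total blocks, from fsck
replication=3            # Default replication factor
datanodes=1

awk -v u="$used" -v p="$present" -v c="$configured" 'BEGIN {
    printf "cluster DFS Used%%:  %.2f%%\n", u / p * 100   # against present capacity
    printf "datanode DFS Used%%: %.2f%%\n", u / c * 100   # against configured capacity
}'
# Each block is short (replication - datanodes) replicas.
echo "missing replicas: $(( blocks * (replication - datanodes) ))"
```

This reproduces the 8.46%, 5.83%, and 38 seen in the logs, and explains why the same cluster shows two different "DFS Used%" values.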
Reposted from: https://blog.51cto.com/2290153/2312517