1. Add environment variables:
vi etc/hadoop/hadoop-env.sh, and also set in ~/.profile:
export JAVA_HOME=/opt/jdk1.8.0_25
export HADOOP_PREFIX=/home/zjy/hadoop
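A quick way to confirm the exports took effect; this sketch writes them to a temporary file standing in for ~/.profile so it can be run safely, and uses the paths from this walkthrough (adjust to your install):

```shell
# Write the two exports to a temp file, source it, and confirm the
# variables are visible in the current shell.
profile=$(mktemp)
cat >> "$profile" <<'EOF'
export JAVA_HOME=/opt/jdk1.8.0_25
export HADOOP_PREFIX=/home/zjy/hadoop
EOF
. "$profile"
echo "JAVA_HOME=$JAVA_HOME"
echo "HADOOP_PREFIX=$HADOOP_PREFIX"
```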
2. Edit the configuration files:
etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
etc/hadoop/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
3. Set up passwordless SSH login:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
4. Format the filesystem:
bin/hdfs namenode -format
5. Start HDFS:
sbin/start-dfs.sh
6. Check status:
Logs: $HADOOP_HOME/logs
NameNode web UI: http://localhost:50070/
7. Create the HDFS user directories:
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>
8. Run one of the bundled examples:
Copy the input files into the distributed filesystem:
$ bin/hdfs dfs -put etc/hadoop input    # load etc/hadoop into HDFS as input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar grep input output 'dfs[a-z.]+'
$ bin/hdfs dfs -get output output       # fetch the output files from HDFS
$ cat output/*                          # view the fetched files
$ bin/hdfs dfs -cat output/*            # or view them directly in HDFS
9. Stop:
sbin/stop-dfs.sh    # stop Hadoop
YARN (a feature of newer versions), configured in single-node mode:
Edit the configuration files:
etc/hadoop/mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
-------
etc/hadoop/yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
---------
Start YARN:
sbin/start-yarn.sh
Stop YARN:
sbin/stop-yarn.sh
------------------------------------------------
Common problems:
1. /tmp/hadoop-zjy-secondarynamenode.pid: Permission denied
fix: chmod -R 777 tmp/
2。Java HotSpot(TM) Client VM warning: You have loaded library /home/zjy/hadoop/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
14/11/09 23:09:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
fix:
vi ~/.profile
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib"
source ~/.profile
3. When putting a file into HDFS:
put: File /input/file1.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
Cause: repeated formats have left the DFS versions inconsistent.
fix: delete the filesystem data and rerun: hdfs namenode -format
Alternatively, change the filesystem version to match the one recorded in the VERSION file under $HADOOP_PREFIX/tmp/dfs/.
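A sketch of the clean-slate route. It is destructive (it erases all HDFS data), assumes a running cluster, and uses the tmp path from this setup:

```shell
# Stop HDFS, wipe the old filesystem data so the next format starts from
# a consistent version, then reformat and restart.
sbin/stop-dfs.sh
rm -rf /home/zjy/hadoop/tmp/dfs
bin/hdfs namenode -format
sbin/start-dfs.sh
```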
4. Name node is in safe mode
fix: bin/hadoop dfsadmin -safemode leave
Safe mode is controlled with dfsadmin -safemode <value>,
where value is one of:
enter - enter safe mode
leave - force the NameNode to leave safe mode
get - report whether safe mode is on
wait - block until safe mode ends
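For example (these need a live NameNode; on 2.x the hdfs dfsadmin form is preferred over the older hadoop dfsadmin):

```shell
# Query, then toggle, safe mode on the running NameNode.
bin/hdfs dfsadmin -safemode get    # reports e.g. "Safe mode is ON"
bin/hdfs dfsadmin -safemode leave  # force the NameNode out of safe mode
bin/hdfs dfsadmin -safemode wait   # block until safe mode ends
```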
-------------
5. org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost/user/zjy/input
The input directory was never uploaded.
fix: hadoop fs -put conf input
6. 2014-11-10 01:01:53,107 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: localhost/127.0.0.1:9000
Cause: the 127.0.0.1 localhost mapping in /etc/hosts.
fix: vi /etc/hosts and comment out the line:
# 127.0.0.1 localhost
then run stop-all.sh
and reformat: hdfs namenode -format
7. There are no datanodes in the cluster.
Run the following, as the hadoop user, to bring the DataNode back up.
Format the cluster:
hadoop namenode -format -clusterid clustername
Start HDFS (the dfs service):
start-dfs.sh
Start YARN (the resource-management service):
start-yarn.sh
Start the httpfs service:
httpfs.sh start
-------------------------------------
8. Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to YFCS-S6-APP/10.200.25.154:9000. Exiting
Open the datanode and namenode directories configured in hdfs-site.xml and look at the VERSION file in each one's current/ folder: the clusterID values differ, just as the log says. Change the clusterID in the datanode's VERSION file to match the namenode's, restart DFS (start-dfs.sh), and jps should then show the DataNode running again.
Cause: after the first format, Hadoop was started and used, and a later format (hdfs namenode -format) regenerated the namenode's clusterID while the datanode's clusterID stayed unchanged.
If making them match does not help, delete the files under the datanode directory: rm -rf /home/zjy/hadoop/tmp/dfs/data/current/* and restart.
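The clusterID check can be scripted. This sketch generates two sample VERSION files so it runs anywhere; point the variables at the real .../dfs/name/current/VERSION and .../dfs/data/current/VERSION instead:

```shell
# Extract clusterID from a namenode and a datanode VERSION file, and if
# they differ, copy the namenode's ID into the datanode's file (the fix
# described above). Sample files are created here for illustration.
d=$(mktemp -d)
printf 'clusterID=CID-aaa\n' > "$d/name_VERSION"   # stand-in for the namenode file
printf 'clusterID=CID-bbb\n' > "$d/data_VERSION"   # stand-in for the datanode file
nn=$(grep '^clusterID=' "$d/name_VERSION" | cut -d= -f2)
dn=$(grep '^clusterID=' "$d/data_VERSION" | cut -d= -f2)
if [ "$nn" != "$dn" ]; then
  echo "clusterID mismatch: namenode=$nn datanode=$dn"
  sed -i "s/^clusterID=.*/clusterID=$nn/" "$d/data_VERSION"
fi
grep '^clusterID=' "$d/data_VERSION"   # now matches the namenode's ID
```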
9. Running custom Hadoop code remotely from Eclipse throws: Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=Administrator, access=WRITE, inode="/":zjy:supergroup:drwxr-xr-x
Cause: the local Windows development environment has no ssh installed (or it is misconfigured), or the client user has no write permission on the relevant HDFS directories.
fix 1: install and configure ssh locally, and add the matching user on the local machine, ideally in the Administrators group.
fix 2: on the HDFS server: hdfs dfs -chmod -R 777 /tmp ; hdfs dfs -chmod -R 777 /user/
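An alternative to opening up the permissions is to make the client identify itself as the HDFS-side user: with simple (non-Kerberos) authentication, the Hadoop client honors the HADOOP_USER_NAME environment variable. Here zjy is the server-side owner from the error above:

```shell
# With simple authentication, the Hadoop client reports the value of
# HADOOP_USER_NAME as the acting user instead of the OS login name,
# so writes land as zjy rather than Administrator.
export HADOOP_USER_NAME=zjy
echo "$HADOOP_USER_NAME"   # prints: zjy
# Subsequent client calls, e.g. hdfs dfs -put ..., now run as zjy.
```

In Eclipse the same variable can be set in the run configuration's environment tab.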
A successful example run:
zjy@zjy:/home/zjy/hadoop$bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar grep /input /output 'dfs[a-z.]+'
Java HotSpot(TM) Client VM warning: You have loaded library /home/zjy/hadoop/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
14/11/10 17:52:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/11/10 17:52:34 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/11/10 17:52:34 WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
14/11/10 17:52:34 INFO input.FileInputFormat: Total input paths to process : 1
14/11/10 17:52:35 INFO mapreduce.JobSubmitter: number of splits:1
14/11/10 17:52:35 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1415670144299_0004
14/11/10 17:52:35 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
14/11/10 17:52:35 INFO impl.YarnClientImpl: Submitted application application_1415670144299_0004
14/11/10 17:52:35 INFO mapreduce.Job: The url to track the job: http://zjy:8088/proxy/application_1415670144299_0004/
14/11/10 17:52:35 INFO mapreduce.Job: Running job: job_1415670144299_0004
14/11/10 17:52:41 INFO mapreduce.Job: Job job_1415670144299_0004 running in uber mode : false
14/11/10 17:52:41 INFO mapreduce.Job: map 0% reduce 0%
14/11/10 17:52:47 INFO mapreduce.Job: map 100% reduce 0%
14/11/10 17:52:54 INFO mapreduce.Job: map 100% reduce 100%
14/11/10 17:52:54 INFO mapreduce.Job: Job job_1415670144299_0004 completed successfully
14/11/10 17:52:54 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=6
FILE: Number of bytes written=194175
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=115
HDFS: Number of bytes written=86
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=3082
Total time spent by all reduces in occupied slots (ms)=3570
Total time spent by all map tasks (ms)=3082
Total time spent by all reduce tasks (ms)=3570
Total vcore-seconds taken by all map tasks=3082
Total vcore-seconds taken by all reduce tasks=3570
Total megabyte-seconds taken by all map tasks=3155968
Total megabyte-seconds taken by all reduce tasks=3655680
Map-Reduce Framework
Map input records=1
Map output records=0
Map output bytes=0
Map output materialized bytes=6
Input split bytes=104
Combine input records=0
Combine output records=0
Reduce input groups=0
Reduce shuffle bytes=6
Reduce input records=0
Reduce output records=0
Spilled Records=0
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=162
CPU time spent (ms)=1160
Physical memory (bytes) snapshot=221036544
Virtual memory (bytes) snapshot=629686272
Total committed heap usage (bytes)=137498624
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=11
File Output Format Counters
Bytes Written=86
14/11/10 17:52:54 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/11/10 17:52:54 WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
14/11/10 17:52:54 INFO input.FileInputFormat: Total input paths to process : 1
14/11/10 17:52:54 INFO mapreduce.JobSubmitter: number of splits:1
14/11/10 17:52:54 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1415670144299_0005
14/11/10 17:52:54 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
14/11/10 17:52:54 INFO impl.YarnClientImpl: Submitted application application_1415670144299_0005
14/11/10 17:52:54 INFO mapreduce.Job: The url to track the job: http://zjy:8088/proxy/application_1415670144299_0005/
14/11/10 17:52:54 INFO mapreduce.Job: Running job: job_1415670144299_0005
14/11/10 17:53:06 INFO mapreduce.Job: Job job_1415670144299_0005 running in uber mode : false
14/11/10 17:53:06 INFO mapreduce.Job: map 0% reduce 0%
14/11/10 17:53:12 INFO mapreduce.Job: map 100% reduce 0%
14/11/10 17:53:17 INFO mapreduce.Job: map 100% reduce 100%
14/11/10 17:53:18 INFO mapreduce.Job: Job job_1415670144299_0005 completed successfully
14/11/10 17:53:18 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=6
FILE: Number of bytes written=193153
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=220
HDFS: Number of bytes written=0
HDFS: Number of read operations=7
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=3259
Total time spent by all reduces in occupied slots (ms)=2955
Total time spent by all map tasks (ms)=3259
Total time spent by all reduce tasks (ms)=2955
Total vcore-seconds taken by all map tasks=3259
Total vcore-seconds taken by all reduce tasks=2955
Total megabyte-seconds taken by all map tasks=3337216
Total megabyte-seconds taken by all reduce tasks=3025920
Map-Reduce Framework
Map input records=0
Map output records=0
Map output bytes=0
Map output materialized bytes=6
Input split bytes=134
Combine input records=0
Combine output records=0
Reduce input groups=0
Reduce shuffle bytes=6
Reduce input records=0
Reduce output records=0
Spilled Records=0
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=157
CPU time spent (ms)=1200
Physical memory (bytes) snapshot=219996160
Virtual memory (bytes) snapshot=628498432
Total committed heap usage (bytes)=137498624
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=86
File Output Format Counters
Bytes Written=0
zjy@zjy:/home/zjy/hadoop$