Hadoop笔记
hadoop 的安装
我主要参考 国人写的这篇 [Hadoop安装教程_单机/伪分布式配置_Hadoop2.6.0/Ubuntu14.04] (http://www.powerxing.com/install-hadoop/)
hadoop fs的使用
基本同 linux shell下的文件操作一样 ls mkdir rm cat 不同点 多了一个get put;
put: 把文件放进hadoop fs
get: 把文件从hadoop fs 拿出来.
hadoop fs -ls
hadoop fs -mkdir hello
hadoop fs -put /etc/passwd hello
hadoop -ls
hadoop fs -ls
hadoop fs -ls -al
hadoop fs -ls -R
hadoop fs -cat hello/passwd
hadoop fs -get hello/passwd
实验mapreducer
实验使用 Hadoop Streaming
Writing an Hadoop MapReduce Program in Python
The “trick” behind the following Python code is that we will use the Hadoop Streaming API (see also the corresponding wiki entry) for helping us passing data between our Map and Reduce code via STDIN (standard input) and STDOUT (standard output).
hadoop jar ./share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar -file /home/hadoop/mapper.py -mapper /home/hadoop/mapper.py -file /home/hadoop/reducer.py -reducer /home/hadoop/reducer.py -input hello/* -output hello/gutenberg-output