How to learn Hadoop

This is an email I sent to my colleagues sharing my Hadoop learning experience. I hope it helps others on the Internet a little as well.


I thought it would be good to share my experience of learning Hadoop, to help others avoid the mistakes I made, and to pass along some useful links. I learned on the Apache distribution, so this probably applies mainly to Apache (and maybe partially to other distributions). Either way, I hope it helps you in some way. So here are my suggestions:

1. To start, you may want to follow this tutorial to get a MapReduce job running in single-node mode.

This will definitely help you understand how Hadoop works, and it gives you a really good prototype to scale out to a real cluster. In this tutorial you’ll learn how to set up a single-node cluster and how to write a MapReduce job. Here’s another article on how to set up a single-node cluster, and it’s easier to read.
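If the idea of a MapReduce job is still fuzzy before you start the tutorial, a tiny local simulation may help. The sketch below is not Hadoop code and does not come from the tutorial. It is plain Python that mimics the three phases (map, shuffle, reduce) of a word count, and the function names are my own for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map phase: emit (word, 1) for every whitespace-separated word."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle phase: group all values by key.

    In a real cluster the framework does this between map and reduce;
    here it is simulated in memory.
    """
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce phase: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce over the input lines."""
    pairs = (pair for line in lines for pair in map_words(line))
    grouped = shuffle(pairs)
    return dict(reduce_counts(word, counts) for word, counts in sorted(grouped.items()))


if __name__ == "__main__":
    sample = ["hello hadoop", "hello world"]
    for word, count in word_count(sample).items():
        print(f"{word}\t{count}")
```

In a real job you only write the map and reduce parts. The framework handles the shuffle and spreads the work across nodes.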

2. Set up a multi-node cluster

Following this article will help you set up a real cluster. We don’t have enough machines to do that, but with VMware Player you won’t need to worry about it (just create 2-3 Ubuntu instances in it).

3. Run a bigger Hadoop job

When you scale out and start processing a much bigger data set, some issues will always appear. So you may want to do this on a real cluster to know what it feels like. The job doesn’t have to be complicated, but the input dataset should be HUGE, so you can see how the cluster runs, distributes data, and controls its nodes.
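If you don’t have a huge dataset at hand, one option is to generate a synthetic one. The sketch below is just one way to do it, assuming plain text input is good enough for your job. The vocabulary, file name, and function name are made up for illustration.

```python
import random

# Hypothetical vocabulary; any list of words works.
VOCABULARY = ["hadoop", "cluster", "node", "map", "reduce", "data", "block", "job"]


def write_synthetic_text(path: str, target_bytes: int,
                         words_per_line: int = 10, seed: int = 0) -> int:
    """Write random lines of words to `path` until at least `target_bytes`
    have been written. Returns the number of bytes written."""
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    written = 0
    # newline="\n" keeps the byte count exact on every platform
    with open(path, "w", encoding="ascii", newline="\n") as out:
        while written < target_bytes:
            line = " ".join(rng.choice(VOCABULARY) for _ in range(words_per_line)) + "\n"
            out.write(line)
            written += len(line)
    return written


if __name__ == "__main__":
    # About 1 GiB of text; raise this until the cluster has to work for it.
    size = write_synthetic_text("big_input.txt", 1024 ** 3)
    print(f"wrote {size} bytes")
```

You would then copy the file into HDFS and use it as the job input. A word count over it is enough to watch how the data gets split across nodes.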

Other than the above, the Hadoop documentation and Stack Overflow are always my top choices for troubleshooting. The Hadoop Wiki is also a very valuable resource for answering common questions you may have while learning Hadoop.

Besides, Hadoop: The Definitive Guide is a very popular book recommended by many users, and another book, Hadoop in Action, is the one I used the most while setting up clusters.

Hope this helps. Cheers
