Getting Started with Hadoop


To begin playing around with what Hadoop does, I decided to go down the path of using the HortonWorks Sandbox.  One of the first things the setup has you do is install Oracle VirtualBox, which is virtualization software for running virtual machines.  The Sandbox runs inside one of those virtual machines.  One note: the browser IP address the tutorial gives for opening the Sandbox GUI is wrong.

I then proceeded to follow the “Hello World” tutorial, in which I was able to import some actual data from the NYSE and run some Hive and Pig queries.  I have a substantial SQL background (though that is not essential), so it was a breeze.

I’m impressed by how easy to follow and well written the tutorial was.  Great way to get started!