When: Mon November 12, 2012 6:00 pm
Organization: Boston Predictive Analytics
Location: Microsoft (NERD), 1 Memorial Drive, Cambridge, MA
By the end of this weekend (11/4), I will send out instructions for software preparation (e.g., installing Python) and cloud setup (e.g., Amazon).
5:30-6:00 Networking, Software Install, Cloud Setup
6:00-6:10 M/R and Workshop Overview - John Verostek
6:10-7:20 Map/Reduce Tutorial - Vipin Sachdeva (IBM Research Labs)
The Map/Reduce programming framework will be introduced through a hands-on Word Count example in Python. Next, the basics of Hadoop Map/Reduce and its distributed file system (HDFS) will be covered, followed by a demo of running the Python M/R program on Hadoop.
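For anyone who wants a preview of the Word Count idea, here is a minimal sketch in plain Python. It is not the workshop's actual code: the function names (mapper, shuffle, reducer, word_count) and the sample sentences are assumptions made for illustration, and everything runs in one process rather than on Hadoop.

```python
from collections import defaultdict


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one line of text."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, which a Map/Reduce framework does between the two steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(word, counts):
    """Reduce step: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce locally over an iterable of text lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, not workshop data.
    sample = ["the quick brown fox", "the lazy dog", "The fox"]
    print(word_count(sample))
```

On a real cluster the grouping step is handled by the framework, so only the mapper and reducer would need to be written. One common way to run Python with Hadoop is Hadoop Streaming, where the mapper and reducer are separate scripts that read from standard input and write to standard output; whether the demo uses that approach is not stated here.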
7:20-7:25 Short Break
7:25-8:30 Applications using Amazon Elastic M/R - J Singh (EarlyStageIT)
Running a large file on a laptop eventually crashes the machine because of memory and processor limitations, so the same program will be run on Amazon's Elastic Map Reduce. In addition to the word count example, a Facebook application will be walked through. For this dataset, everyone who attends the workshop will have the option to sign in to a workshop prep page with their Facebook account and give permission to share their likes. The data is automatically anonymized and stored in a file on Amazon S3. The exercise will find likes common to people in the sample. What might someone do after analyzing such data? Design an advertising campaign, perhaps (but designing an ad campaign is not part of the workshop).
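To give a rough idea of what "finding common likes" can look like as a Map/Reduce job, here is a small sketch in plain Python. The record layout (an anonymized user id paired with a list of likes), the function names, the min_users threshold, and the sample data are all assumptions for illustration; the format of the actual workshop file on S3 may differ.

```python
from collections import defaultdict


def mapper(user_id, likes):
    """Map step: emit a (like, user_id) pair for each page the user likes."""
    for like in likes:
        yield like, user_id


def reducer(like, user_ids):
    """Reduce step: count the distinct users who share one like."""
    return like, len(set(user_ids))


def common_likes(records, min_users=2):
    """Return the likes shared by at least min_users people, with their counts."""
    groups = defaultdict(list)
    for user_id, likes in records:
        for like, uid in mapper(user_id, likes):
            groups[like].append(uid)
    counts = dict(reducer(like, uids) for like, uids in groups.items())
    return {like: n for like, n in counts.items() if n >= min_users}


if __name__ == "__main__":
    # Hypothetical anonymized records, not real workshop data.
    records = [
        ("u1", ["Python", "Hadoop"]),
        ("u2", ["Python", "Red Sox"]),
        ("u3", ["Hadoop", "Python"]),
    ]
    print(common_likes(records))
```

The shape is the same as Word Count: the key is the like rather than the word, and the reducer counts distinct people rather than occurrences.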
If you have not used Amazon Web Services before, please sign up ahead of time. We will be using the EC2, S3, and Elastic Map Reduce (EMR) services. Signing up requires a credit card. Expect to spend about $2.50 during the training, and definitely less than $10. Signing up also requires a cell phone, because Amazon needs to call you with a code to set up your AWS account.