Agile Data Science: Building Data Analytics Applications with Hadoop

Overview

Mining big data requires a deep investment in people and time. How can you be sure you’re building the right models? With this hands-on book, you’ll learn a flexible toolset and methodology for building effective analytics applications with Hadoop.

Using lightweight tools such as Python, Apache Pig, and the D3.js library, your team will create an agile environment for exploring data, starting with an example application to mine your own email inboxes. You’ll learn an iterative approach that enables you to quickly change the kind of analysis you’re doing, depending on what the data is telling you. All example code in this book is available as working Heroku apps.

Create analytics applications by using the agile big data development methodology
Build value from your data in a series of agile sprints, using the data-value stack
Gain insight by using several data structures to extract multiple features from a single dataset
Visualize data with charts, and expose different aspects through interactive reports
Use historical data to predict the future, and translate predictions into action
Get feedback from users after each sprint to keep your project on track

Product Details

ISBN-13: 9781449326265
Publisher: O'Reilly Media, Incorporated
Publication date: 11/4/2013
Edition number: 1
Pages: 178
Sales rank: 451143
Product dimensions: 6.90 (w) x 9.00 (h) x 0.30 (d)

Meet the Author

Russell Jurney cut his data teeth in casino gaming, building web apps to analyze the performance of slot machines in the US and Mexico. After dabbling in entrepreneurship, interactive media, and journalism, he moved to Silicon Valley to build analytics applications at scale at Ning and LinkedIn. He lives on the ocean in Pacifica, California, with his wife Kate and two fuzzy dogs.

Table of Contents

Preface;
Who This Book Is For;
How This Book Is Organized;
Conventions Used in This Book;
Using Code Examples;
Safari® Books Online;
How to Contact Us;
Setup;
Chapter 1: Theory;
1.1 Agile Big Data;
1.2 Big Words Defined;
1.3 Agile Big Data Teams;
1.4 Agile Big Data Process;
1.5 Code Review and Pair Programming;
1.6 Agile Environments: Engineering Productivity;
1.7 Realizing Ideas with Large-Format Printing;
Chapter 2: Data;
2.1 Email;
2.2 Working with Raw Data;
2.3 SQL;
2.4 NoSQL;
2.5 Data Perspectives;
Chapter 3: Agile Tools;
3.1 Scalability = Simplicity;
3.2 Agile Big Data Processing;
3.3 Setting Up a Virtual Environment for Python;
3.4 Serializing Events with Avro;
3.5 Collecting Data;
3.6 Data Processing with Pig;
3.7 Publishing Data with MongoDB;
3.8 Searching Data with ElasticSearch;
3.9 Reflecting on our Workflow;
3.10 Lightweight Web Applications;
3.11 Presenting Our Data;
3.12 Conclusion;
Chapter 4: To the Cloud!;
4.1 Introduction;
4.2 GitHub;
4.3 dotCloud;
4.4 Amazon Web Services;
4.5 Instrumentation;
Climbing the Pyramid;
Chapter 5: Collecting and Displaying Records;
5.1 Putting It All Together;
5.2 Collect and Serialize Our Inbox;
5.3 Process and Publish Our Emails;
5.4 Presenting Emails in a Browser;
5.5 Agile Checkpoint;
5.6 Listing Emails;
5.7 Searching Our Email;
5.8 Conclusion;
Chapter 6: Visualizing Data with Charts;
6.1 Good Charts;
6.2 Extracting Entities: Email Addresses;
6.3 Visualizing Time;
6.4 Conclusion;
Chapter 7: Exploring Data with Reports;
7.1 Building Reports with Multiple Charts;
7.2 Linking Records;
7.3 Extracting Keywords from Emails with TF-IDF;
7.4 Conclusion;
Chapter 8: Making Predictions;
8.1 Predicting Response Rates to Emails;
8.2 Personalization;
8.3 Conclusion;
Chapter 9: Driving Actions;
9.1 Properties of Successful Emails;
9.2 Better Predictions with Naive Bayes;
9.3 P(Reply | From & To);
9.4 P(Reply | Token);
9.5 Making Predictions in Real Time;
9.6 Logging Events;
9.7 Conclusion;
Colophon;

