HDInsight Essentials

Overview

In Detail

We live in an era in which data is generated with every action, and much of it is unstructured: Twitter feeds, Facebook updates, photos, and digital sensor inputs. Current relational databases cannot cope with the volume, velocity, and variety of this data. HDInsight gives you the ability to gain the full value of Big Data with a modern, cloud-based data platform that manages data of any size and type, whether structured or unstructured.

This hands-on guide shows you how to seamlessly store and process Big Data of all types through Microsoft's modern data platform, which provides simplicity, ease of management, and an open, enterprise-ready Hadoop service, all running in the cloud. You will then learn how to analyze your Hadoop data with PowerPivot, Power View, Excel, and other Microsoft BI tools, thanks to their integration with the Microsoft data platform. This will give you a solid foundation to build your own HDInsight solution, both on-premises and in the cloud.

First, we will provide an overview of Hadoop and Microsoft's Big Data strategy, in which HDInsight plays a key role. We will then show you how to set up your HDInsight cluster and take you through the four stages of collecting, processing, analyzing, and reporting. For each of these stages, you will see a practical example with working code.

You will then learn core Hadoop concepts such as HDFS and MapReduce. You will also get a closer look at how Microsoft's HDInsight leverages the Hortonworks Data Platform, which is built on Apache Hadoop. You will then be guided through Hadoop commands and programming using open source software, such as Hive and Pig, with HDInsight. Finally, you will learn to analyze and report using PowerPivot, Power View, Excel, and other Microsoft BI tools.

This guide provides step-by-step instructions on how to build a Big Data solution using HDInsight with open source software, produce useful Excel reports, and unlock the full value of HDInsight.

Approach

This book is a fast-paced guide full of step-by-step instructions on how to build a multi-node Hadoop cluster on Windows servers.

Who this book is for

If you are a data architect or developer who wants to understand how to transform your data using open source software such as MapReduce, Hive, Pig, and JavaScript, and also leverage the Windows infrastructure, this book is perfect for you. It is also ideal if you are part of a team that is starting or planning a Hadoop implementation and you want to understand the key components of Hadoop and how HDInsight provides added value in administration and reporting.

Product Details

ISBN-13: 9781849695374
Publisher: Packt Publishing Pvt. Ltd.
Publication date: 9/23/2013
Sold by: Barnes & Noble
Format: eBook
Edition number: 1
File size: 6 MB

Meet the Author

Rajesh Nadipalli has over 17 years of IT experience and has held a technical leadership position at Cisco Systems. His key focus areas have been Data Management, Enterprise Architecture, Business Intelligence, Data Warehousing, and Extract, Transform, Load (ETL). He has demonstrated success by delivering scalable data management and BI solutions that empower businesses to make informed decisions.

In his current role as a Senior Solutions Architect at Zaloni, Raj evaluates Big Data goals for his clients, recommends a target state architecture, assists in proofs of concept, and prepares them for a production implementation. In addition, Raj is an instructor for Hadoop for Developers, Hive, Pig, and HBase. His clients include Verizon, American Express, NetApp, Cisco, EMC, and UnitedHealth Group.

Rajesh Nadipalli holds an MBA from NC State University and a BS in EE from the University of Mumbai, India.

Customer Reviews

Average Rating: 3 (2 reviews)

5 Star: 0
4 Star: 0
3 Star: 2
2 Star: 0
1 Star: 0

Anonymous

Posted Thu Jan 09 00:00:00 EST 2014

I would like to congratulate Mr. Rajesh Nadipalli for publishing the HDInsight Essentials book. Below are some comments that I feel would make this book indispensable for Windows Azure developers, DevOps specialists, and data managers.

After reading the book, I would like to see a chapter on Mahout integration with HDInsight. If you are using HDInsight in the cloud, Mahout comes pre-installed for your use, whereas if you are running a local HDInsight instance on Windows Server, you must deploy Mahout on your own.

I propose the following chapter structure:

Introduction – what Mahout is, the need and motivation for machine learning jobs in the context of Big Data, and installing and setting up Mahout in HDInsight

Data transformation using Mahout – how Mahout can be used for data transformation, running machine learning tasks, importing data from Pig and Hive, and exporting the machine learning results to MS Excel

Case studies using Mahout – real-life scenarios where Mahout is deployed to deliver meaningful results extracted from Big Data, with some sample test code

There are plenty of use cases where Mahout is used while working with big data. Examples include building a recommendation engine or a classification engine, and performing market basket analysis. The typical process could look like this:

1. Provisioning a cluster on Windows Azure (HDInsight)

2. Getting the data for analysis from the source (using APIs, torrents, etc.)

3. Extracting the data we need from the gathered data

4. Writing the MapReduce job (depending on the requirement, number of map/reduce tasks)

5. Building the machine learning engine using Mahout
Anonymous

Posted Fri Dec 27 00:00:00 EST 2013

I did not have time to go through all the chapters. The initial chapters discuss the general need for big data and the available Hadoop distributions. The next chapters describe how to deploy HDInsight and cover HDInsight cluster administration.

Some examples have been added, and the content follows along those lines.

Available on NOOK devices and apps

NOOK Devices

Samsung Galaxy Tab 4 NOOK

NOOK HD/HD+ Tablet

NOOK

NOOK Color

NOOK Tablet

Tablet/Phone

NOOK for Windows 8 Tablet

NOOK for iOS

NOOK for Android

NOOK Kids for iPad

PC/Mac

NOOK for Windows 8

NOOK for PC

NOOK for Mac
