Apache Hadoop YARN: Moving beyond MapReduce and Batch Processing with Apache Hadoop 2

by Arun Murthy, Vinod Vavilapalli, Doug Eadline, Joseph Niemiec

“This book is a critically needed resource for the newly released Apache Hadoop 2.0, highlighting YARN as the significant breakthrough that broadens Hadoop beyond the MapReduce paradigm.”
—From the Foreword by Raymie Stata, CEO of Altiscale

Overview

The Insider’s Guide to Building Distributed, Big Data Applications with Apache Hadoop™ YARN

Apache Hadoop is helping drive the Big Data revolution. Now, its data processing has been completely overhauled: Apache Hadoop YARN provides resource management at data center scale and easier ways to create distributed applications that process petabytes of data. And now in Apache Hadoop™ YARN, two Hadoop technical leaders show you how to develop new applications and adapt existing code to fully leverage these revolutionary advances.

YARN project founder Arun Murthy and project lead Vinod Kumar Vavilapalli demonstrate how YARN increases scalability and cluster utilization, enables new programming models and services, and opens new options beyond Java and batch processing. They walk you through the entire YARN project lifecycle, from installation through deployment.

You’ll find many examples drawn from the authors’ cutting-edge experience—first as Hadoop’s earliest developers and implementers at Yahoo! and now as Hortonworks developers moving the platform forward and helping customers succeed with it.

Coverage includes

  • YARN’s goals, design, architecture, and components—how it expands the Apache Hadoop ecosystem
  • Exploring YARN on a single node
  • Administering YARN clusters and Capacity Scheduler
  • Running existing MapReduce applications
  • Developing a large-scale clustered YARN application
  • Discovering new open source frameworks that run under YARN

Editorial Reviews

From the Publisher

"This book is a desperately needed resource for administrators, developers, and power-users of the Hadoop YARN framework. It does an excellent job of documenting the (often unknown) history that inevitably led up to YARN from previous versions of Hadoop, which provides a valuable canvas against which to present the remaining pragmatically-oriented text. Moving from the history of YARN, it wisely jumps right into getting the reader up and running with their own YARN setup (on a single machine or on a larger cluster) such that the rest of the text is not merely conjecturing, but real guidance for a real instance of YARN. Chapters 7 and 8 were the ones I was most looking forward to in the text from the start, as those "core" components of YARN are some of the ones which are least understood and yet concurrently most impacting on performance. They did not disappoint."

- Ellis H. Wilson III, Storage Scientist

Product Details

ISBN-13: 9780321934505
Publisher: Addison-Wesley
Publication date: 04/02/2014
Series: Addison-Wesley Data & Analytics Series
Pages: 336
Sales rank: 1,301,830
Product dimensions: 6.90(w) x 9.00(h) x 0.90(d)
