










Hadoop Programming with Java for Big Data Solutions

Course Type: Practitioner
Course Number: 1251
Duration: 4 Days


About This Course: The availability of large data sets presents new opportunities and challenges to organizations of all sizes. This course explains Hadoop best practices and provides the training and programming skills needed to develop solutions that run on the Apache Hadoop platform. You also learn to test and deploy Big Data solutions on commodity clusters.

You Will Learn How To

  • Implement Hadoop jobs to extract business value from large and varied data sets
  • Write, customize and deploy Java MapReduce jobs to summarize data
  • Develop Hive and Pig queries to simplify data analysis
  • Test and debug jobs using MRUnit
  • Monitor task execution and cluster health

Course Outline

Introduction to Hadoop

  • Identifying the business benefits of Hadoop
  • Surveying the Hadoop ecosystem
  • Selecting a suitable distribution

Parallelizing Program Execution

Meeting the challenges of parallel programming

  • Investigating parallelizable challenges: algorithms, data and information exchange
  • Estimating the storage and complexity of Big Data

Parallel programming with MapReduce

  • Dividing and conquering large-scale problems
  • Uncovering jobs suitable for MapReduce
  • Solving typical business problems

Implementing Real-World MapReduce Jobs

Applying the Hadoop MapReduce paradigm

  • Configuring the development environment
  • Exploring the Hadoop distribution
  • Creating the components of MapReduce jobs
  • Introducing the Hadoop daemons
  • Analyzing the stages of MapReduce processing: splitting, mapping, shuffling and reducing

Building complex MapReduce jobs

  • Selecting and employing multiple mappers and reducers
  • Leveraging built-in mappers, reducers and partitioners
  • Analyzing time series data with secondary sort
  • Streaming tasks through various programming languages

Customizing MapReduce

Solving common data manipulation problems

  • Executing algorithms: parallel sorts, joins and searches
  • Analyzing log files, social media data and e-mails

Implementing partitioners and comparators

  • Identifying network-bound, CPU-bound and disk I/O-bound parallel algorithms
  • Dividing the workload efficiently using partitioners
  • Controlling grouping and sort order with comparators
  • Collecting metrics with counters

Persisting Big Data with Distributed Data Stores

Making the case for distributed data

  • Achieving high performance data throughput
  • Recovering from media failure through redundancy

Interfacing with Hadoop Distributed File System (HDFS)

  • Breaking down the structure and organization of HDFS
  • Loading raw data and retrieving results
  • Reading and writing data programmatically
  • Manipulating Hadoop SequenceFile types
  • Sharing reference data with DistributedCache

Structuring data with HBase

  • Migrating from structured to unstructured storage
  • Applying NoSQL concepts with schema on read
  • Connecting to HBase from MapReduce jobs
  • Comparing HBase to other types of NoSQL data stores

Simplifying Data Analysis with Query Languages

Unleashing the power of SQL with Hive

  • Structuring databases, tables, views and partitions
  • Integrating MapReduce jobs with Hive queries
  • Querying with HiveQL
  • Accessing Hive servers through JDBC
  • Extending HiveQL with User-Defined Functions (UDF)

Executing workflows with Pig

  • Developing Pig Latin scripts to consolidate workflows
  • Integrating Pig queries with Java
  • Interacting with data through the grunt console
  • Extending Pig with User-Defined Functions (UDF)

Managing and Deploying Big Data Solutions

Testing and debugging Hadoop code

  • Logging significant events for auditing and debugging
  • Debugging in local mode
  • Validating requirements with MRUnit

Deploying, monitoring and tuning performance

  • Deploying to a production cluster
  • Optimizing performance with administrative tools
  • Monitoring job execution through web user interfaces

Course Schedule

Attend this live, instructor-led course In-Class or Online via AnyWare.

Hassle-Free Enrollment: No advance payment required.
Tuition due 30 days after your course.

  • Nov 1 - 4: Herndon, VA/AnyWare
  • Dec 19 - 22: New York/AnyWare
  • Feb 14 - 17: Toronto/AnyWare
  • Feb 21 - 24: Herndon, VA/AnyWare
  • Apr 4 - 7: Ottawa/AnyWare
  • Apr 18 - 21: New York/AnyWare
  • May 23 - 26: Toronto/AnyWare
  • Jun 27 - 30: Herndon, VA/AnyWare
  • Aug 15 - 18: New York/AnyWare
  • Aug 29 - Sep 1: Toronto/AnyWare

Guaranteed to Run

Bring this Course to Your Organization and Train Your Entire Team
For more information, call 1-888-843-8733.

Tuition

Standard: $2990
Government: $2659

Course Tuition Includes:

After-Course Instructor Coaching
When you return to work, you are entitled to schedule a free coaching session with your instructor for help and guidance as you apply your new skills.

Free Course Exam
You can take your course exam on the last day of your course and receive a Certificate of Achievement with the designation "Awarded with Distinction."


Questions

Call 1-888-843-8733.

An experienced training advisor will happily answer any questions you may have and alert you to any tuition savings to which you or your organization may be entitled.

Training Hours

Standard Course Hours: 9:00 am – 4:30 pm
*Informal discussion with instructor about your projects or areas of special interest: 4:30 pm – 5:30 pm


FREE Online Course Exam (if applicable) – Last Day: 3:30 pm – 4:30 pm
By successfully completing your FREE online course exam, you will:

  • Have a record of your growth and learning results
  • Bring proof of your progress back to your organization
  • Earn credits toward industry certifications (if applicable)
  • Make progress toward one or more Learning Tree Specialist & Expert Certifications (if applicable)

Enhance Your Credentials with Professional Certification

Learning Tree's comprehensive training and exam preparation guarantee that you will gain the knowledge and confidence to achieve professional certification and advance your career.

This course qualifies for 23 CPE credits from the National Association of State Boards of Accountancy CPE program.
