Big Data & Hadoop Training Course Content

Overview / Description: Hadoop developer training builds your development skills in the big data domain. Hadoop is an open-source, Java-based programming framework that supports the processing and storage of extremely large data sets in a distributed computing environment. A Hadoop developer's job is much like a software developer's: the roles are similar, but the domain in which they work is different.

Prerequisites / Eligibility: No prior Hadoop knowledge is required. This course is designed for people proficient in object-oriented programming (OOP) and Java.

Detailed Course Content:

Chapter 01: Introduction

  • 1.1 Big Data Introduction
  • 1.2 Distributed System
  • 1.3 Big Data Use Cases
  • 1.4 Various Solutions
  • 1.5 Overview of Hadoop Ecosystem

Chapter 02: ZooKeeper

  • 2.1 ZooKeeper – Race Condition
  • 2.2 ZooKeeper – Deadlock
  • 2.3 Use Cases
  • 2.4 When not to Use

Chapter 03: HDFS

  • 3.1 Why HDFS and Why not existing File system
  • 3.2 HDFS – Name Node and Data Node
  • 3.3 Advanced HDFS concepts (HA, Federation)
  • 3.4 Data Locality (Rack Awareness)

Chapter 04: YARN

  • 4.1 YARN – Why not existing tools
  • 4.2 YARN – Evolution from MapReduce 1.0
  • 4.3 Resource Management – YARN Architecture
  • 4.4 Advanced Concept – Speculative Execution

Chapter 05: MapReduce Basics

  • 5.1 MapReduce – Understanding Sorting
  • 5.2 MapReduce – Overview
  • 5.3 Example 1 – Word Frequency Problem without MapReduce
  • 5.4 Example 2 – Only Mapper – Image Resizing
  • 5.5 Example 3 – Word Frequency Problem
  • 5.6 Example 4 – Temperature Problem
  • 5.7 Example 5 – Multiple Reducers
  • 5.8 Example 6 – Java MapReduce Walkthrough

Chapter 06: MapReduce – Advanced

  • 6.1 Writing MapReduce Code Using Java
  • 6.2 Building MapReduce project using Ant
  • 6.3 Concept – Associative and Commutative
  • 6.4 Example 1 – Combiner
  • 6.5 Example 2 – Hadoop Streaming
  • 6.6 Example 3 – Advanced Problem Solving – Anagrams
  • 6.7 Example 4 – Advanced Problem Solving – Same DNA
  • 6.8 Example 5 – Advanced Problem Solving – Similar DNA
  • 6.9 Example 6 – Joins – Voting
  • 6.10 Limitations of MapReduce

Chapter 07: Analyzing Data with Pig

  • 7.1 Pig – Introduction
  • 7.2 Pig – Modes
  • 7.3 Getting Started
  • 7.4 Example – NYSE (New York Stock Exchange)
  • 7.5 Concept – Lazy Evaluation

Chapter 08: Processing Data with Hive

  • 8.1 Hive Introduction
  • 8.2 Hive – Data Type
  • 8.3 Getting Started
  • 8.4 Loading Data in Hive Tables
  • 8.5 Example: MovieLens Data Processing
  • 8.6 Advanced Concepts – Views
  • 8.7 Connecting Tableau and HiveServer2
  • 8.8 Connecting Microsoft Excel and HiveServer2
  • 8.9 Project: Sentiment Analysis of Twitter Data
  • 8.10 Advanced – Partitioned Tables
  • 8.11 Understanding HCatalog & Impala

Chapter 09: NoSQL and HBase

  • 9.1 NoSQL – Scaling Out/Up
  • 9.2 NoSQL – ACID properties and RDBMS
  • 9.3 CAP Theorem
  • 9.4 HBase Architecture – Region Servers
  • 9.5 HBase Data Model – Column-Family Orientation
  • 9.6 Getting Started – Creating tables and adding Data
  • 9.7 Example – Google Link Storage
  • 9.8 Bloom Filter – Concept
  • 9.9 Comparison of NoSQL Databases

Chapter 10: Importing Data with Sqoop, Flume and Oozie

  • 10.1 Sqoop – Introduction
  • 10.2 Sqoop Import – MySQL to HDFS
  • 10.3 Sqoop Export – HDFS to MySQL
  • 10.4 Unbounded Dataset Processing or Stream Processing
  • 10.5 Flume Overview
  • 10.6 Source, Sink, Channel
  • 10.7 Example 1: Data from Local Network Service into HDFS
  • 10.8 Example 2: Extracting Twitter Data
  • 10.9 Creating Workflows with Oozie

Chapter 11: Fundamentals of Scala

  • 11.1 Scala – Quick Introduction
  • 11.2 Scala – Quick Introduction – Variables and Methods
  • 11.3 Getting Started: Interactive, Compilation, SBT
  • 11.4 Types, Variables & Values
  • 11.5 Functions
  • 11.6 Collections
  • 11.7 Classes
  • 11.8 Parameters

Chapter 12: Spark Basics

  • 12.1 Spark Introduction – Why Spark?
  • 12.2 Using the Spark Shell
  • 12.3 Example 1 – Performing Word Count
  • 12.4 Understanding Spark Cluster Modes on YARN
  • 12.5 RDDs (Resilient Distributed Datasets)
  • 12.6 General RDD Operations: Transformations & Actions
  • 12.7 RDD lineage
  • 12.8 RDD Persistence Overview
  • 12.9 Distributed Persistence