Big Data & Hadoop Training Course Content

Overview / Description: Hadoop developer training builds your development skills in the big data domain. Hadoop is an open-source, Java-based programming framework that supports the processing and storage of very large data sets in a distributed computing environment. A Hadoop developer's job is much like a software developer's: the roles are similar, but the domain in which they work is different.

Prerequisites / Eligibility: No prior Hadoop knowledge is required. This course is designed for people proficient in object-oriented programming (OOP) and Java.

Detailed Course Content:

Chapter 01: Introduction

  • 1.1 Big Data Introduction
  • 1.2 Distributed Systems
  • 1.3 Big Data Use Cases
  • 1.4 Various Solutions
  • 1.5 Overview of the Hadoop Ecosystem

Chapter 02: ZooKeeper

  • 2.1 ZooKeeper – Race Condition
  • 2.2 ZooKeeper – Deadlock
  • 2.3 Use Cases
  • 2.4 When Not to Use

Chapter 03: HDFS

  • 3.1 Why HDFS and Why Not Existing File Systems
  • 3.2 HDFS – Name Node and Data Node
  • 3.3 Advanced HDFS Concepts (HA, Federation)
  • 3.4 Data Locality (Rack Awareness)

Chapter 04: YARN

  • 4.1 YARN – Why Not Existing Tools
  • 4.2 YARN – Evolution from MapReduce 1.0
  • 4.3 Resource Management – YARN Architecture
  • 4.4 Advanced Concept – Speculative Execution

Chapter 05: MapReduce Basics

  • 5.1 MapReduce – Understanding Sorting
  • 5.2 MapReduce – Overview
  • 5.3 Example 1 – Word Frequency Problem without MapReduce
  • 5.4 Example 2 – Only Mapper – Image Resizing
  • 5.5 Example 3 – Word Frequency Problem
  • 5.6 Example 4 – Temperature Problem
  • 5.7 Example 5 – Multiple Reducers
  • 5.8 Example 6 – Java MapReduce Walk-Through

Chapter 06: MapReduce – Advanced

  • 6.1 Writing MapReduce Code Using Java
  • 6.2 Building a MapReduce Project Using Ant
  • 6.3 Concept – Associative and Commutative
  • 6.4 Example 1 – Combiner
  • 6.5 Example 2 – Hadoop Streaming
  • 6.6 Example 3 – Advanced Problem Solving – Anagrams
  • 6.7 Example 4 – Advanced Problem Solving – Same DNA
  • 6.8 Example 5 – Advanced Problem Solving – Similar DNA
  • 6.9 Example 6 – Joins – Voting
  • 6.10 Limitations of MapReduce

Chapter 07: Analyzing Data with Pig

  • 7.1 Pig – Introduction
  • 7.2 Pig – Modes
  • 7.3 Getting Started
  • 7.4 Example – NYSE (New York Stock Exchange)
  • 7.5 Concept – Lazy Evaluation

Chapter 08: Processing Data with Hive

  • 8.1 Hive Introduction
  • 8.2 Hive – Data Types
  • 8.3 Getting Started
  • 8.4 Loading Data in Hive Tables
  • 8.5 Example: MovieLens Data Processing
  • 8.6 Advanced Concepts – Views
  • 8.7 Connecting Tableau and HiveServer2
  • 8.8 Connecting Microsoft Excel and HiveServer2
  • 8.9 Project: Sentiment Analysis of Twitter Data
  • 8.10 Advanced – Partitioned Tables
  • 8.11 Understanding HCatalog & Impala

Chapter 09: NoSQL and HBase

  • 9.1 NoSQL – Scaling Out / Up
  • 9.2 NoSQL – ACID Properties and RDBMS
  • 9.3 CAP Theorem
  • 9.4 HBase Architecture – Region Servers
  • 9.5 HBase Data Model – Column-Family Orientation
  • 9.6 Getting Started – Creating Tables and Adding Data
  • 9.7 Example – Google Link Storage
  • 9.8 Bloom Filter – Concept
  • 9.9 Comparison of NoSQL Databases

Chapter 10: Importing Data with Sqoop, Flume and Oozie

  • 10.1 Sqoop – Introduction
  • 10.2 Sqoop Import – MySQL to HDFS
  • 10.3 Sqoop Export – HDFS to MySQL
  • 10.4 Unbounded Dataset Processing or Stream Processing
  • 10.5 Flume Overview
  • 10.6 Source, Sink, Channel
  • 10.7 Example 1: Data from a Local Network Service into HDFS
  • 10.8 Example 2: Extracting Twitter Data
  • 10.9 Creating Workflows with Oozie

Chapter 11: Fundamentals of Scala

  • 11.1 Scala – Quick Introduction
  • 11.2 Scala – Quick Introduction – Variables and Methods
  • 11.3 Getting Started: Interactive, Compilation, SBT
  • 11.4 Types, Variables & Values
  • 11.5 Functions
  • 11.6 Collections
  • 11.7 Classes
  • 11.8 Parameters

Chapter 12: Spark Basics

  • 12.1 Spark Introduction – Why Spark?
  • 12.2 Using the Spark Shell
  • 12.3 Example 1 – Performing Word Count
  • 12.4 Understanding Spark Cluster Modes on YARN
  • 12.5 RDDs (Resilient Distributed Datasets)
  • 12.6 General RDD Operations: Transformations & Actions
  • 12.7 RDD Lineage
  • 12.8 RDD Persistence Overview
  • 12.9 Distributed Persistence