Big Data Hadoop

  • Introduction to Big Data and Hadoop.
  • Relation between Big Data and Hadoop.
  • Why is Hadoop needed?
  • Scenarios for adopting Hadoop technology in real-time projects.
  • How Hadoop addresses Big Data changes
  • Importance of Hadoop Ecosystem Components
  • What is HDFS (Hadoop Distributed File System).
  • HDFS Architecture – 5 Daemons of Hadoop
  • Replication in Hadoop – Fail Over Mechanism
  • Hadoop Cluster Setup and JDK Installation.
  • Why is MapReduce essential in Hadoop?
  • MapReduce and its drawbacks with respect to TaskTracker failure in a Hadoop cluster.
  • MapReduce Life Cycle & Communication Mechanism of JobTracker & TaskTracker
  • How to write a basic MapReduce program
  • Compression Techniques in MapReduce
  • Unix Shell Scripting Basics and commands.
  • How the Unix shell is used in Hadoop
  • Pig Installation (Hands-on Installation on Laptops)
  • Introduction to Apache Pig
  • MapReduce Vs Apache Pig
  • Where to Use MapReduce and Pig in Real-Time Hadoop Projects
  • How to write a simple pig script
  • Parameter substitution in Pig scripts
  • How to develop a complex Pig script
  • Bags, Tuples and Fields in Pig
  • Hive Installation (Hands-on Installation on Laptops)
  • Local Mode & Clustered Mode
  • Introduction to Hive and the need for Apache Hive in Hadoop
  • When to choose Pig or Hive in a Real-Time Project
  • Importance of the Hive Metastore.
  • Communication mechanism with Metastore.
  • Hive Integration with Hadoop & Hive Query Language (HiveQL)
  • SQL Vs HiveQL, Data Slicing Mechanisms and Partitions in Hive
  • Partitioning Vs Bucketing
  • Collection Data Types in Hive
  • User Defined Functions (UDFs) in Hive
  • UDFs, UDAFs, UDTFs and the need for UDFs in Hive
  • Hive Serializer/Deserializer – SerDe
  • Semi-Structured Data Processing Using Hive (XML/JSON)
  • Hive – HBase Integration
  • Sqoop installation with MySQL Client
  • Introduction to Sqoop.
  • MySQL client and Server Installation
  • How to connect to Relational Database using Sqoop
  • Different Sqoop Commands
  • Hive imports, incremental imports, importing all tables, and importing with a password file
  • HBase Introduction and HDFS Vs HBase
  • HBase Data Modeling Elements
  • HBase Architecture & Clients (REST, Thrift, Java-Based, Avro)
  • MongoDB basics & Introduction to MongoDB
  • Features of MongoDB
  • Real-Time Use Cases for Hadoop & MongoDB
  • What is YARN?
  • Difference between MapReduce & YARN
  • YARN Architecture (ResourceManager, ApplicationMaster, NodeManager)
  • When to use YARN
  • YARN Process flow and Web UI
  • Different Configuration Files for YARN.
  • What is Impala, and how can it be used for query processing?
  • When to use Impala
  • Hive Vs Impala
  • Real-Time Use Cases with Impala.
  • Interactive Scala – Scala Shell
  • Functional Programming in Scala
  • What is Functional Programming
  • Difference between Object-Oriented and Functional Programming
  • Flume Master, Flume Collector and Flume Agent
  • Real-Time Use Case using Apache Flume.
  • Oozie Introduction, Oozie Architecture & Job Submission
  • Spark Vs MapReduce Processing
  • File Operations in Spark Shell.
  • Introduction to Spark Components.
  • What is an RDD and why is it important in Spark
  • Core Features of RDDs & Lazy Evaluation
  • Different Operations on RDDs
  • Actions and Transformations on RDDs
  • Running Spark in a Clustered Mode.
  • Introduction to Spark SQL
  • The SQLContext, Hive Vs Spark SQL
  • Introduction to DataFrames (DFs)