Hadoop Big Data Training

Welcome to the Best Hadoop & Big Data Training Institute in Hyderabad.

The Big Data Hadoop training course is created by Hadoop industry experts and covers Hadoop and Big Data in depth. You will learn Hadoop ecosystem tools such as HDFS, YARN, Hive, MapReduce, Pig, HBase, Spark, Oozie, Flume, and Sqoop.

Course Content

HADOOP ADMINISTRATION COURSE

Introduction to Big Data and Hadoop

Introduction to big data
Limitations of existing solutions
Hadoop architecture
Hadoop components and ecosystem
Data loading & reading from HDFS
Replication rules
Rack awareness theory
Hadoop cluster administrator: Roles and responsibilities

Hadoop Architecture and cluster setup

Hadoop server roles and their usage
Hadoop installation and initial configuration
Deploying Hadoop in a pseudo-distributed mode
Deploying a multi-node Hadoop cluster
Understanding the working of HDFS and resolving simulated problems

Hadoop Cluster Administration and Understanding MapReduce

Understanding the Secondary NameNode
Working with Hadoop distributed cluster
Decommissioning or commissioning of nodes
Understanding MapReduce

Backup, Recovery and Maintenance

Key Hadoop Admin Commands
Trash
Import checkpoint
DistCp, data backup, and recovery
Enabling trash
Namespace count quota or space quota
Manual failover or metadata recovery

Capacity Planning and Management

Planning a Hadoop 2.0 cluster
Cluster sizing, hardware
Network and software considerations
Popular Hadoop distributions
Workload and usage patterns

Hadoop 2.0 Features

Limitations of Hadoop 1.x
Features of Hadoop 2.0
YARN framework
Hadoop high availability and federation
YARN ecosystem and Hadoop 2.0 Cluster setup

Setting up Hadoop 2.x with High Availability and upgrading Hadoop

Configuring Hadoop 2 with high availability
Upgrading to Hadoop 2
Working with Sqoop
Understanding Oozie
Working with Hive
Working with HBase

Cloudera Manager and Cluster Setup, Overview of Kerberos

Hive administration
HBase architecture
HBase setup
Hadoop/Hive/HBase performance optimization
Pig setup and working with the Grunt shell
Why Kerberos and how it helps

HADOOP DEVELOPMENT COURSE

Understanding Big Data and Hadoop

Big Data,
Limitations and Solutions of Existing Data Analytics Architecture,
Hadoop, Hadoop Features,
Hadoop Ecosystem, Hadoop 2.x Core Components,
Hadoop Storage: HDFS,
Hadoop Processing: MapReduce Framework,
Different Hadoop Distributions.

YARN

YARN (Yet Another Resource Negotiator) – Next-Gen MapReduce
What is YARN?
Difference between MapReduce & YARN
YARN Architecture
Resource Manager
Application Master
Node Manager

Hadoop Architecture and HDFS

Hadoop 2.x Cluster Architecture – Federation and High Availability,
A Typical Production Hadoop Cluster,
Hadoop Cluster Modes,
Common Hadoop Shell Commands,
Hadoop 2.x Configuration Files,
Single-Node and Multi-Node Cluster Setup,
Hadoop Administration.

Hadoop MapReduce Framework

MapReduce Use Cases,
Why MapReduce,
Hadoop 2.x MapReduce Architecture,
Hadoop 2.x MapReduce Components,
YARN MR Application Execution Flow,
YARN Workflow,
Demo on MapReduce,
Input Splits,
Relation between Input Splits and HDFS Blocks,
MapReduce: Combiner & Partitioner,
Demo on De-identifying a Health Care Data Set,
Sequence Input Format,
XML File Parsing using MapReduce.

Pig

About Pig,
MapReduce Vs Pig,
Pig Use Cases,
Programming Structure in Pig,
Pig Running Modes,
Pig components,
Pig Execution,
Pig Latin Program,
Data Models in Pig,
Pig Data Types,
Shell and Utility Commands,
Pig Latin: Relational Operators, Group Operator, COGROUP Operator, Joins and COGROUP, Union, Diagnostic Operators, Specialized Joins in Pig,
Built-in Functions (Eval Functions, Load and Store Functions, Math Functions, String Functions, Date Functions),
Pig UDF, Piggybank,
Parameter Substitution (Pig Macros and Pig Parameter Substitution).

Hive

Hive Background,
Hive Use Case,
About Hive, Hive Vs Pig,
Hive Architecture and Components,
Metastore in Hive,
Limitations of Hive,
Comparison with Traditional Database,
Hive Data Types and Data Models,
Partitions and Buckets,
Hive Tables (Managed Tables and External Tables),
Importing Data, Querying Data,
Managing Outputs,
Hive Script, Hive UDF,
Retail Use Case in Hive.

Advanced Hive and HBase

HiveQL: Joining Tables,
Dynamic Partitioning,
Hive Indexes and Views,
Hive Query Optimizers,
Hive: Thrift Server,
User Defined Functions,
HBase: Introduction to NoSQL Databases and HBase,
HBase v/s RDBMS,
HBase Components,
HBase Architecture,
Run Modes & Configuration,
HBase Cluster Deployment.

Advanced HBase

HBase Data Model,
HBase Shell,
Data Loading Techniques,
ZooKeeper Data Model,
ZooKeeper Service,
ZooKeeper,
Demos on Bulk Loading,
Getting and Inserting Data,
Filters in HBase.

Sqoop

Sqoop Architecture
Installation
Commands (Import, Hive Import, Eval, HBase Import, Import All Tables, Export)
Connectors to Existing Databases
Hands-on Exercise

Flume

Flume Introduction
Flume Architecture
Flume Master, Flume Collector and Flume Agent
Flume Configurations
Real Time Use Case using Apache Flume

MongoDB (As part of NoSQL Databases)

Need of NoSQL Databases
Relational VS Non-Relational Databases
Introduction to MongoDB
Features of MongoDB
Installation of MongoDB
MongoDB Basic Operations
Real-Time Use Cases on Hadoop & MongoDB

Oozie and Hadoop Project

Flume and Sqoop Demo,
Oozie, Oozie Components,
Oozie Workflow,
Scheduling with Oozie,
Demo on Oozie Workflow,
Oozie Coordinator,
Oozie Commands,
Oozie Web Console,
Oozie for MapReduce, Pig, Hive, and Sqoop,
Combined Flow of MR, Pig, and Hive in Oozie,
Hadoop Project Demo.

Spark

Introduction to Apache Spark
Role of Spark in Big data
Who is using Spark
Installation of Spark Shell and Standalone Cluster
Configuration
RDD Operations (Transformations and Actions)

What They’re Saying

Student Reviews

I am thankful to Tech Marshals Academy, which is one of the best educational organizations. I have completed two highly rated courses (Big Data and Hadoop, Spark and Scala). Now I am doing well with what I learnt, and after getting certified in Big Data and Hadoop, I’m getting many offers from many companies. After the great experience of learning Hadoop technology…

Viresh Dagade

Data Engineer, HCL Technologies

Tech Marshals Academy did a great job training our employees for an upcoming application development project. The classes were very well paced and flexible enough for our employees' crazy work schedule. I am very impressed and highly recommend Tech Marshals.

D Manu

Director, Tech AI Pro Solutions

The instructor explains everything with related, practical real-world use cases and examples of Hadoop implementation. Thanks to Tech Marshals Academy.

Revanth Kumar. D

Student, Bangalore

Be future ready. Start learning

Structure your learning and get a certificate to prove it. For admission, call 9133333875.

Location
Tech Marshals Academy,
B2, 2nd Floor, KVR Enclave,
Beside Satyam Theater,
Above Bata Showroom,
Ameerpet, Hyderabad.
+91 9133333875 / 9133333871 / 04040034050
info@techmarshals.com