Hadoop Online Training Course Content

Basics of Hadoop:

  1. Motivation for Hadoop
  2. Large-scale system training
  3. Survey of data storage literature
  4. Survey of data processing literature
  5. Networking constraints
  6. New approach requirements

Basic concepts of Hadoop

  1. What is Hadoop?
  2. Distributed file system of Hadoop
  3. How Hadoop MapReduce works
  4. Hadoop cluster and its anatomy
  5. Hadoop daemons
  6. Master daemons
  7. NameNode
  8. JobTracker
  9. Secondary NameNode
  10. Slave daemons
  11. TaskTracker
  12. HDFS (Hadoop Distributed File System)
  13. Splits and blocks
  14. Input splits
  15. HDFS splits
  16. Replication of data
  17. Hadoop rack awareness
  18. High availability of data
  19. Block placement and cluster architecture
  21. Best practices and performance tuning
  22. Developing MapReduce programs
  23. Local mode
  24. Running without HDFS
  25. Pseudo-distributed mode
  26. All daemons running on a single node
  27. Fully distributed mode
  28. Daemons running on dedicated nodes

Hadoop administration

  1. Setting up Hadoop clusters with the Cloudera, Apache, Greenplum, and Hortonworks distributions
  2. Set up a full Hadoop cluster on a single desktop.
  3. Install and configure Apache Hadoop on a multi-node cluster.
  4. Install and configure the Cloudera distribution in distributed mode.
  5. Install and configure the Hortonworks distribution in fully distributed mode.
  6. Configure the Greenplum distribution in fully distributed mode.
  7. Monitor the cluster
  8. Get familiar with the Hortonworks and Cloudera management consoles.
  9. NameNode in safe mode
  10. Data backup.
  11. Case studies
  12. Monitoring of clusters

Hadoop Development:

  1. Writing a MapReduce Program
  2. Sample MapReduce program
  3. Basic API concepts
  4. Driver code
  5. Mapper
  6. Reducer
  7. Hadoop Streaming API
  8. Running multiple Hadoop jobs
  9. The configure and close methods
  10. Sequence files
  11. Record reader
  12. Record writer
  13. Reporter and its role
  14. Counters
  15. Output collector
  16. Accessing HDFS
  17. ToolRunner
  18. Using the distributed cache
  19. Multiple MapReduce jobs (in detail)
  23. Identity mapper
  24. Identity reducer
  25. Exploring the problems using this application
  26. Debugging the MapReduce Programs
  27. Unit testing with MRUnit
  28. Logging
  29. Debugging strategies
  30. Advanced MapReduce Programming
  31. Secondary sort
  32. Customizing input and output formats
  33. MapReduce joins
  34. Monitoring & debugging on a Production Cluster
  35. Counters
  36. Skipping Bad Records
  37. Running in local mode
  38. MapReduce performance tuning
  39. Reducing network traffic with a combiner
  40. Partitioners
  41. Reducing input data
  42. Using compression
  43. Reusing the JVM
  44. Speculative execution
  45. Performance Aspects
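
As a rough illustration of the mapper, reducer, and driver topics listed above, the following is a minimal single-process sketch of the MapReduce word-count pattern in plain Python. It is not Hadoop API code: the names mapper, reducer, and run_job are hypothetical and exist only for this example, and the in-memory sort merely stands in for Hadoop's shuffle-and-sort phase.

```python
from itertools import groupby


def mapper(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Sum all counts that were grouped under one key."""
    return word, sum(counts)


def run_job(lines):
    """Simulate map -> shuffle/sort -> reduce in a single process."""
    # Map phase: apply the mapper to every input line.
    pairs = [pair for line in lines for pair in mapper(line)]
    # Sorting by key stands in for the shuffle-and-sort phase.
    pairs.sort(key=lambda kv: kv[0])
    # Reduce phase: call the reducer once per distinct key.
    return dict(
        reducer(word, (count for _, count in group))
        for word, group in groupby(pairs, key=lambda kv: kv[0])
    )


if __name__ == "__main__":
    print(run_job(["big data big cluster", "data node"]))
```

On a real cluster the same two functions would typically be split into separate mapper and reducer programs, with the framework handling the grouping between them.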

CDH4 Enhancements:
1. NameNode high availability
2. NameNode federation
3. Fencing
4. MapReduce 2

1. Concepts of Hive
2. Hive and its architecture
3. Install and configure Hive on a cluster
4. Types of tables in Hive
5. Hive library functions
6. Buckets
7. Partitions
8. Joins
   1. Inner joins
   2. Outer joins
9. Hive UDFs

1. Pig basics
2. Install and configure Pig
3. Pig library functions
4. Pig vs. Hive
5. Writing sample Pig Latin scripts
6. Modes of running
   1. Grunt shell
   2. Java program
8. Pig macros
9. Debugging Pig

1. Differences between Impala, Hive, and Pig
2. Does Impala give good performance?
3. Exclusive features
4. Impala and its challenges
5. Use cases

1. HBase
2. HBase concepts
3. HBase architecture
4. Basics of HBase
5. Server architecture
6. File storage architecture
7. Column access
8. Scans
9. HBase cases
10. Installation and configuration of HBase on a multi-node cluster
11. Create a database; develop and run sample applications
12. Access data stored in HBase using clients like Python, Java and Perl
13. MapReduce client
14. HBase and Hive Integration
15. HBase administration tasks
16. Defining a schema and its basic operations
17. Cassandra Basics
18. MongoDB Basics

Ecosystem Components
1. Sqoop
2. Install and configure Sqoop
3. Connecting to an RDBMS
4. Installation of MySQL
5. Importing data from Oracle/MySQL to Hive
6. Exporting data to Oracle/MySQL
7. Internal mechanism

1. Oozie and its architecture
2. XML file
3. Installing and configuring Apache Oozie
4. Specifying the workflow
5. Action nodes
6. Control nodes
7. Job coordinator

Avro, Scribe, Flume, Chukwa, Thrift
1. Concepts of Flume and Chukwa
2. Use cases of Scribe, Thrift and Avro
3. Installation and configuration of Flume
4. Creation of a sample application

Challenges of Hadoop
1. Hadoop recovery
2. Cases suitable for Hadoop

10 Comments to Hadoop Online Training Course Content

  1. Madhumitha United states says:

    Great Hadoop online training, guys. I am really happy with this training, and the support was really good.

  2. Anuradha says:

    This is Anuradha. I took the Hadoop course from this institute.

    They provide very good support with resume preparation and interviews.

  3. Ravindra United States says:

    Guys, I was just thinking of taking Hadoop classes. Can anyone help with interview questions on Java? Is Core Java mandatory for this Hadoop Online Training? Please suggest.

  4. Sridevi India says:

    I would like to take this Hadoop training. Please give me the details…

  5. Naveen Kumar USA says:

    What is the difference between the TextInputFormat and KeyValueInputFormat classes? Please clarify this for me. I took Hadoop online training through another training institute in India, but they didn't teach real-time scenarios. Please help me out.

  6. Sushma India says:

    What is the JobTracker? What are the responsibilities of the JobTracker? Can someone explain this to me?

  7. Priyanka Nandu USA says:

    Can someone explain the Distributed Cache in Hadoop and how it is useful in MapReduce?
