Introduction - OOP concepts (Object - Class - Inheritance - Polymorphism - Abstraction - Encapsulation)
String (Concept of String - Immutable String - String Concatenation - Concept of Substring - String class methods and their usage - StringBuilder class)
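As a minimal sketch of the String topics above (immutability, substring, and StringBuilder-based concatenation), the class and method names below are illustrative assumptions, not part of the course material:

```java
class StringBasics {
    // Strings are immutable: toUpperCase() returns a new String
    // and leaves the argument unchanged.
    static String shout(String s) {
        return s.toUpperCase();
    }

    // Builds "0,1,...,n-1" in one mutable StringBuilder instead of
    // creating a new String object on every concatenation.
    static String joinNumbers(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "hadoop";
        String upper = shout(original);
        System.out.println(original);                 // hadoop (unchanged)
        System.out.println(upper);                    // HADOOP
        System.out.println(original.substring(0, 3)); // had
        System.out.println(joinNumbers(4));           // 0,1,2,3
    }
}
```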
Exception Handling (try - throw - catch) - Advanced (throws - finally) - Input and Output (I/O) functions
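The try, throw, catch, throws, and finally keywords listed above can be seen together in a short sketch. The class `ExceptionFlow` and its methods are hypothetical names chosen for illustration:

```java
class ExceptionFlow {
    // "throws" documents that callers may receive this exception.
    // It is unchecked here, so the clause is informational only.
    static int parseNonNegative(String text) throws IllegalArgumentException {
        int n = Integer.parseInt(text); // may throw NumberFormatException
        if (n < 0) {
            throw new IllegalArgumentException("negative");
        }
        return n;
    }

    // Returns a short trace of which blocks ran for the given input.
    static String describe(String text) {
        StringBuilder log = new StringBuilder();
        try {
            log.append("ok:").append(parseNonNegative(text));
        } catch (NumberFormatException e) {
            // The more specific subclass must be caught first.
            log.append("not a number");
        } catch (IllegalArgumentException e) {
            log.append("rejected:").append(e.getMessage());
        } finally {
            // Runs whether or not an exception was thrown.
            log.append(";done");
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("7"));   // ok:7;done
        System.out.println(describe("abc")); // not a number;done
        System.out.println(describe("-1"));  // rejected:negative;done
    }
}
```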
Collections - (List, Map, Set) interfaces and their algorithms - Iterator interface - Map (HashMap - TreeMap - LinkedHashMap - multi-key map) - List (ArrayList - LinkedList) - Set (HashSet - TreeSet) - Serialization - Deserialization
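One practical difference between the Map implementations named above is iteration order. The sketch below, using a hypothetical helper `keysOf`, inserts the same keys into different maps and reads them back:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapOrdering {
    // Inserts three keys in a fixed order and returns the order
    // in which the given map implementation iterates over them.
    static List<String> keysOf(Map<String, Integer> map) {
        map.put("pig", 1);
        map.put("hive", 2);
        map.put("sqoop", 3);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order.
        System.out.println(keysOf(new LinkedHashMap<>())); // [pig, hive, sqoop]
        // TreeMap sorts keys by their natural ordering.
        System.out.println(keysOf(new TreeMap<>()));       // [hive, pig, sqoop]
        // HashMap makes no ordering guarantee, so no output is shown here.
        System.out.println(keysOf(new HashMap<>()));
    }
}
```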
Big Data (What, Why, Who) - 3++ Vs - Overview of the Hadoop Ecosystem - Role of Hadoop in Big Data - Overview of other Big Data systems - Who is using Hadoop - Hadoop integrations into existing software products - Current scenario in the Hadoop Ecosystem - Installation - Configuration - Use cases of Hadoop (Healthcare, Retail, Telecom)
Concepts - Architecture - Data Flow (File Read, File Write) - Fault Tolerance - Shell Commands - Java-based API - Data Flow Archives - Coherency - Data Integrity - Role of Secondary NameNode
Theory - Data Flow (Map - Shuffle - Reduce) - MapRed vs MapReduce APIs - Programming [Mapper, Reducer, Combiner, Partitioner] - Writable - InputFormat - OutputFormat - Streaming API using Python - Inherent Failure Handling using Speculative Execution - Magic of the Shuffle phase - File Formats - Sequence Files
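The Map - Shuffle - Reduce data flow can be illustrated with a word count simulated in plain Java. This sketch deliberately does not use the Hadoop API. The class `WordCountSketch` and its methods are hypothetical stand-ins for what a Mapper, the framework's shuffle, and a Reducer would do on a cluster:

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

class WordCountSketch {
    // Map phase: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // Shuffle phase: group all values by key, with keys in sorted order.
    static SortedMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        SortedMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Runs the three phases in sequence over all input lines.
    static SortedMap<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : lines) {
            mapped.addAll(map(line));
        }
        SortedMap<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(mapped).entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("big data", "Big Hadoop")));
        // {big=2, data=1, hadoop=1}
    }
}
```

In a real job the map and reduce steps run in parallel on different nodes, and the shuffle moves data across the network. Here everything runs in one process to show only the shape of the data at each step.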
Counters (Built-in and Custom) - Custom InputFormat - Distributed Cache - Joins (Map-Side, Reduce-Side) - Sorting - Performance Tuning - GenericOptionsParser - ToolRunner - Debugging (LocalJobRunner)
Multi-Node Cluster Setup using AWS Cloud Machines - Hardware Considerations - Software Considerations - Commands (fsck, job, dfsadmin) - Schedulers in JobTracker - Rack Awareness Policy - Balancing - NameNode Failure and Recovery - Commissioning and Decommissioning a Node - Compression Codecs
Introduction to NoSQL - CAP Theorem - Classification of NoSQL - HBase and RDBMS - HBase and HDFS - Architecture (Read Path, Write Path, Compactions, Splits) - Installation - Configuration - Role of ZooKeeper - HBase Shell - Java-based APIs (Scan, Get, other advanced APIs) - Introduction to Filters - RowKey Design - MapReduce Integration - Performance Tuning - What's New in HBase 0.98 - Backup and Disaster Recovery - Hands-On
Architecture - Installation - Configuration - Hive vs RDBMS - Tables - DDL - DML - UDF - UDAF - Partitioning - Bucketing - MetaStore - Hive-HBase Integration - Hive Web Interface - Hive Server (JDBC, ODBC, Thrift) - File Formats (RCFile - ORCFile) - Other SQL on Hadoop
Architecture - Installation - Hive vs Pig - Pig Latin Syntax - Data Types - Functions (Eval, Load/Store, String, DateTime) - Joins - PigServer - Macros - UDFs - Performance - Troubleshooting - Commonly Used Functions
Architecture - Installation - Commands (Import, Hive-Import, Eval, HBase Import, Import All Tables, Export) - Connectors to Existing DBs and DW
Why Flume? - Architecture - Configuration (Agents) - Sources (Exec, Avro, NetCat) - Channels (File, Memory, JDBC, HBase) - Sinks (Logger, Avro, HDFS, HBase, FileRoll) - Contextual Routing (Interceptors, Channel Selectors) - Introduction to other aggregation frameworks
Architecture - Installation - Workflow - Coordinator - Actions (MapReduce, Hive, Pig, Sqoop) - Introduction to Bundle - Mail Notifications
Limitations in Hadoop 1.0 - HDFS Federation - High Availability in HDFS - HDFS Snapshots - Other Improvements in HDFS2 - Introduction to YARN aka MR2 - Limitations in MR1 - Architecture of YARN - MapReduce Job Flow in YARN - Introduction to Stinger Initiative and Tez - Backward Compatibility for Hadoop 1.x
Introduction to Information Retrieval - Common use cases - Introduction to Solr and Lucene - Installation - Concepts (Cores, Schema, Documents, Fields, Inverted Index) - Configuration - CRUD operation requests and responses - Java-based APIs - Introduction to SolrCloud