20MCA352 Big Data Analytics syllabus for MCA




Module-1 Big Data and Analytics 0 hours

Big Data and Analytics

Example Applications, Basic Nomenclature, Analysis Process Model, Analytical Model Requirements, Types of Data Sources, Sampling, Types of Data Elements, Data Exploration, Exploratory Statistical Analysis, Missing Values, Outlier Detection and Treatment, Standardizing Data Labels, Categorization.

Module-2 Big Data Technology 0 hours

Big Data Technology

Hadoop’s Parallel World, Data Discovery, Open Source Technology for Big Data Analytics, Cloud and Big Data, Predictive Analytics, Mobile Business Intelligence and Big Data, Crowd Sourcing Analytics, Inter- and Trans-Firewall Analytics.

Module-3 Meet Hadoop 0 hours

Meet Hadoop

Data, Data Storage and Analysis, Comparison with Other Systems, RDBMS, Grid Computing, Volunteer Computing, A Brief History of Hadoop, Apache Hadoop and the Hadoop Ecosystem, Hadoop Releases Response.

Module-4 The Hadoop Distributed File System 0 hours

The Hadoop Distributed File System

The Design of HDFS, HDFS Concepts, Blocks, Namenodes and Datanodes, HDFS Federation, HDFS High-Availability, The Command-Line Interface, Basic Filesystem Operations, Hadoop Filesystems, Interfaces, The Java Interface, Reading Data from a Hadoop URL, Reading Data Using the FileSystem API, Writing Data, Directories, Querying the Filesystem, Deleting Data, Data Flow, Anatomy of a File Read, Anatomy of a File Write, Coherency Model, Parallel Copying with distcp, Keeping an HDFS Cluster Balanced, Hadoop Archives.

Module-5 A Weather Dataset 0 hours

A Weather Dataset, Data Format, Analyzing the Data with Unix Tools, Analyzing the Data with Hadoop, Map and Reduce, Java MapReduce, Scaling Out, Data Flow, Combiner Functions, Running a Distributed MapReduce Job, Hadoop Streaming, Hadoop Pipes, Compiling and Running, Developing a MapReduce Application, The Configuration API, Combining Resources, Variable Expansion, Configuring the Development Environment, Managing Configuration, GenericOptionsParser, Tool and ToolRunner, Writing a Unit Test, Mapper, Reducer, Running Locally on Test Data, Running a Job in a Local Job Runner, Testing the Driver, Running on a Cluster, Packaging, Launching a Job, The MapReduce Web UI, Retrieving the Results, Debugging a Job, Hadoop Logs, Remote Debugging.

 

Question Paper Pattern:

• The question paper will have TEN questions.

• Each full question will carry 20 marks.

• There will be TWO full questions (with a maximum of four sub-questions) from each module.

• Each full question will have sub-questions covering all the topics under a module.

• The students will have to answer FIVE full questions, selecting one full question from each module.

 

Textbooks

1. Bart Baesens, “Analytics in a Big Data World: The Essential Guide to Data Science and its Applications”, Wiley.

2. Michael Minelli, Michele Chambers, Ambiga Dhiraj, “Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today’s Businesses”, 1st Edition, Wiley CIO Series, 2013.

3. Tom White, “Hadoop: The Definitive Guide”, 3rd Edition, O’Reilly, 2012.

 

References

1. Boris Lublinsky, Kevin T. Smith, Alexey Yakubovich, “Professional Hadoop Solutions”, Wiley, ISBN: 9788126551071, 2015.

2. Chris Eaton, Dirk deRoos et al., “Understanding Big Data”, McGraw Hill, 2012.

3. Vignesh Prajapati, “Big Data Analytics with R and Hadoop”, Packt Publishing, 2013.

4. Tom Plunkett, Brian Macdonald et al., “Oracle Big Data Handbook”, Oracle Press, 2014.

 

Course Outcomes:

CO1: Identify the business problem for a given context and frame the objectives to solve it through data analytics tools.

CO2: Apply various algorithms for handling large volumes of data.

CO3: Illustrate the architecture of HDFS and explain the functioning of HDFS clusters.

CO4: Analyse the usage of Map-Reduce techniques for solving big data problems.

CO5: Conduct experiments with various datasets for analysis / visualization and arrive at valid conclusions.

Last Updated: Tuesday, January 24, 2023