MTech Data Warehousing And Data Mining syllabus for 2 Sem 2020 scheme 20BBI252

Module-1 Introduction to Data Warehousing 0 hours

Introduction to Data Warehousing:

Heterogeneous information, Integration problem. Warehouse architecture. Data warehousing, Warehouse vs DBMS. Aggregations: SQL and Aggregations, Aggregation functions and Grouping. Data Warehouse Models and OLAP Operations: Decision support; Data Marts, OLAP vs OLTP. Multi-dimensional data model. Dimensional Modelling. ROLAP vs MOLAP; Star and snowflake schemas; the MOLAP cube; roll-up, slicing, and pivoting.
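
As an illustration of the grouping, roll-up, and slicing operations listed above, here is a minimal sketch in plain Python. The fact table, its dimension names, and the helper functions `roll_up` and `slice_cube` are hypothetical and chosen only for this example; a real warehouse would express the same idea with SQL `GROUP BY` or an OLAP server.

```python
from collections import defaultdict

# Hypothetical fact table: (year, region, product, sales) rows.
FACTS = [
    (2019, "North", "Pen", 100),
    (2019, "North", "Ink", 50),
    (2019, "South", "Pen", 70),
    (2020, "North", "Pen", 120),
    (2020, "South", "Ink", 30),
]
DIMS = ("year", "region", "product")  # the last column is the measure


def roll_up(facts, keep):
    """Sum the sales measure, grouping only by the dimensions in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def slice_cube(facts, dim, value):
    """Keep only the rows where one dimension equals a fixed value."""
    i = DIMS.index(dim)
    return [row for row in facts if row[i] == value]


# Roll up region and product, leaving totals per year.
by_year = roll_up(FACTS, ["year"])
# Slice on region = "North", then roll up to year.
north_by_year = roll_up(slice_cube(FACTS, "region", "North"), ["year"])
```

Grouping by fewer dimensions gives a coarser summary (roll-up), while fixing one dimension value selects a sub-cube (slice).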

Module-2 Issues in Data Warehouse Design 0 hours

Issues in Data Warehouse Design:

Design issues - Monitoring, Wrappers, Integration, Data cleaning, Data loading, Materialised views, Warehouse maintenance, OLAP servers and Metadata. Building Data Warehouses: Conceptual data modeling, Entity-Relationship (ER) modeling and Dimension modeling. Data warehouse design using ER approach. Aspects of building data warehouses.

Module-3 Introducing Data Mining 0 hours

Introducing Data Mining:

KDD Process, Problems and Techniques, Data Mining Applications, Prospects for the Technology. CRISP-DM Methodology: Approach, Objectives, Documents, Structure, Binding to Contexts, Phases, Task, and Outputs. Data Mining Inputs and Outputs: Concepts, Instances, Attributes. Kinds of Learning, Kinds of Attributes and Preparing Inputs. Knowledge representations – Decision tables and Decision trees, Classification rules, Association rules, Regression trees & Model trees and Instance-Level representations.

Module-4 Data Mining Algorithms 0 hours

Data Mining Algorithms:

One-R, Naïve Bayes Classifier, Decision trees, Decision rules, Association Rules, Regression, K-Nearest Neighbour Classifiers.
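
As an illustration of the simplest algorithm in this list, One-R can be sketched as follows: for each attribute, build a rule that maps every attribute value to its most frequent class, then keep the attribute whose rule makes the fewest errors on the training data. The function name `one_r` and the toy weather-style rows are assumptions made for this example.

```python
from collections import Counter, defaultdict


def one_r(rows, target):
    """Return (attribute, rule, errors) for the attribute with fewest training errors."""
    best = None
    attributes = [a for a in rows[0] if a != target]
    for attr in attributes:
        # Count class frequencies for each value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[attr]][row[target]] += 1
        # Rule: each attribute value predicts its majority class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(1 for row in rows if rule[row[attr]] != row[target])
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


# Hypothetical training data.
ROWS = [
    {"outlook": "sunny", "windy": "false", "play": "no"},
    {"outlook": "sunny", "windy": "true", "play": "no"},
    {"outlook": "overcast", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "true", "play": "no"},
    {"outlook": "overcast", "windy": "true", "play": "yes"},
    {"outlook": "rainy", "windy": "false", "play": "yes"},
]

attr, rule, errors = one_r(ROWS, "play")
```

On this toy data the `outlook` rule misclassifies one row and the `windy` rule misclassifies two, so One-R selects `outlook`.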

Module-5 Evaluating Data Mining Results 0 hours

Evaluating Data Mining Results:

Issues in Evaluation; Training and Testing Principles; Error Measures, Holdout, Cross Validation. Comparing Algorithms; Taking costs into account and Trade-offs in the Confusion Matrix.
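
As an illustration of the evaluation topics above, the following minimal sketch builds a two-class confusion matrix, derives accuracy and a cost-weighted error from it, and generates train/test index splits for k-fold cross-validation. The labels, the misclassification costs, and the function names are assumptions made for this example, and the folds are formed without shuffling for simplicity.

```python
def confusion_matrix(actual, predicted, positive="yes"):
    """Count true/false positives and negatives for a two-class problem."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


def k_fold_indices(n, k):
    """Yield (train, test) index lists for k folds over n instances."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


# Hypothetical labels and predictions.
actual = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "yes", "no"]

cm = confusion_matrix(actual, predicted)
accuracy = (cm["tp"] + cm["tn"]) / len(actual)

# Assumed costs: a missed positive is five times worse than a false alarm.
total_cost = 5 * cm["fn"] + 1 * cm["fp"]

splits = list(k_fold_indices(6, 3))
```

Two classifiers with the same accuracy can have different total costs once false negatives and false positives are weighted differently, which is the trade-off the confusion matrix makes visible.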

 

Course outcomes:

At the end of the course the student will be able to:

  • Learn about data warehouse design and concepts of data warehousing.
  • Understand data mining algorithms and the evaluation of data mining results.

 

Question paper pattern:

The SEE question paper will be set for 100 marks and the marks scored will be proportionately reduced to 60.

  • The question paper will have ten full questions carrying equal marks.
  • Each full question is for 20 marks.
  • There will be two full questions (with a maximum of four sub-questions) from each module.
  • Each full question will have sub-questions covering all the topics under a module.
  • The students will have to answer five full questions, selecting one full question from each module.

 

Textbooks

1 Fundamentals of Data Warehouses M. Jarke, M. Lenzerini, Y. Vassiliou, P. Vassiliadis (ed.), Springer-Verlag 1999

2 Data Mining: Concepts and Techniques J. Han and M. Kamber, Morgan Kaufmann 2000

 

Reference Books

1 The Data Warehouse Toolkit Ralph Kimball, Wiley 1996

2 Principles of Data Mining D. Hand, H. Mannila and P. Smyth MIT Press 2001

3 Data Mining: Introductory and Advanced Topics M. H. Dunham, Prentice Hall 2003