MTech Natural Language Processing And Text Mining syllabus for 2 Sem 2020 scheme 20SCE243

Module-1 OVERVIEW AND LANGUAGE MODELLING 0 hours

OVERVIEW AND LANGUAGE MODELLING:

Overview: Origins and Challenges of NLP, Language and Grammar, Processing Indian Languages, NLP Applications, Information Retrieval. Language Modelling: Various Grammar-based Language Models, Statistical Language Model.
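
As an illustration of the Statistical Language Model topic, here is a minimal sketch of a bigram model with add-one (Laplace) smoothing. The function names, the whitespace tokenizer, and the `<s>` / `</s>` boundary markers are assumptions made for this example and are not prescribed by the syllabus or the textbooks.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        # Sentence boundary markers are an assumption of this sketch.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_probability(prev, word, unigrams, bigrams):
    """Estimate P(word | prev) with add-one smoothing."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


corpus = ["the cat sat", "the dog sat"]
unigrams, bigrams = train_bigram_model(corpus)
print(bigram_probability("the", "cat", unigrams, bigrams))
```

Smoothing gives unseen word pairs a small non-zero probability, which is why the denominator adds the vocabulary size.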

Module-2 WORD LEVEL AND SYNTACTIC ANALYSIS 0 hours

WORD LEVEL AND SYNTACTIC ANALYSIS:

Word Level Analysis: Regular Expressions, Finite-State Automata, Morphological Parsing, Spelling Error Detection and Correction, Words and Word Classes, Part-of-Speech Tagging. Syntactic Analysis: Context-free Grammar, Constituency, Parsing, Probabilistic Parsing.
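
As an illustration of the Spelling Error Detection and Correction topic, here is a minimal sketch based on minimum edit distance (Levenshtein distance with unit costs). The function names and the tiny lexicon are hypothetical; practical correctors usually also weigh word frequency and context.

```python
def edit_distance(source, target):
    """Levenshtein distance: insertions, deletions, substitutions cost 1 each."""
    # previous[j] holds the distance between the processed prefix of
    # source and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,                 # delete s_char
                current[j - 1] + 1,              # insert t_char
                previous[j - 1] + substitution,  # substitute or match
            ))
        previous = current
    return previous[-1]


def suggest(word, lexicon):
    """Return the lexicon entry closest to word (first entry wins ties)."""
    return min(lexicon, key=lambda candidate: edit_distance(word, candidate))


print(suggest("parsng", ["parsing", "tagging", "grammar"]))
```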

Module-3 Extracting Relations from Text 0 hours

Extracting Relations from Text:

From Word Sequences to Dependency Paths: Introduction, Subsequence Kernels for Relation Extraction, A Dependency-Path Kernel for Relation Extraction and Experimental Evaluation. Mining Diagnostic Text Reports by Learning to Annotate Knowledge Roles: Introduction, Domain Knowledge and Knowledge Roles, Frame Semantics and Semantic Role Labelling, Learning to Annotate Cases with Knowledge Roles and Evaluations. A Case Study in Natural Language Based Web Search: InFact System Overview, The GlobalSecurity.org Experience.

Module-4 Evaluating Self-Explanations in iSTART 0 hours

Evaluating Self-Explanations in iSTART:

Word Matching, Latent Semantic Analysis, and Topic Models: Introduction, iSTART: Feedback Systems, iSTART: Evaluation of Feedback Systems. Textual Signatures: Identifying Text-Types Using Latent Semantic Analysis to Measure the Cohesion of Text Structures: Introduction, Cohesion, Coh-Metrix, Approaches to Analysing Texts, Latent Semantic Analysis, Predictions, Results of Experiments. Automatic Document Separation: A Combination of Probabilistic Classification and Finite-State Sequence Modelling: Introduction, Related Work, Data Preparation, Document Separation as a Sequence Mapping Problem, Results. Evolving Explanatory Novel Patterns for Semantically Based Text Mining: Related Work, A Semantically Guided Model for Effective Text Mining.

Module-5 INFORMATION RETRIEVAL AND LEXICAL RESOURCES 0 hours

INFORMATION RETRIEVAL AND LEXICAL RESOURCES:

Information Retrieval: Design Features of Information Retrieval Systems, Classical, Non-classical, Alternative Models of Information Retrieval, Evaluation. Lexical Resources: WordNet, FrameNet, Stemmers, POS Tagger, Research Corpora.
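
As an illustration of the information retrieval models topic, here is a minimal sketch of vector space retrieval with TF-IDF weights and cosine similarity. The function names, the whitespace tokenizer, the `log(N / df) + 1` weighting variant, and the three sample documents are assumptions made for this example.

```python
import math
from collections import Counter


def tf_idf_vectors(documents):
    """Build one sparse TF-IDF vector (a dict) per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # One common IDF variant; the +1 keeps terms found in every document.
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}
    vectors = [
        {term: count * idf[term] for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return document indices, most similar to the query first."""
    vectors, idf = tf_idf_vectors(documents)
    counts = Counter(query.lower().split())
    # Query terms that never occur in the collection are ignored.
    query_vec = {t: c * idf[t] for t, c in counts.items() if t in idf}
    scores = [(cosine(query_vec, vec), i) for i, vec in enumerate(vectors)]
    return [i for _, i in sorted(scores, key=lambda pair: (-pair[0], pair[1]))]


docs = [
    "statistical language model",
    "context free grammar parsing",
    "information retrieval model",
]
print(rank("grammar parsing", docs))
```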

 

Course outcomes:

At the end of the course the student will be able to:

  • Analyze natural language text and generate natural language.
  • Demonstrate text mining.
  • Apply information retrieval techniques.

 

Question paper pattern:

The SEE question paper will be set for 100 marks and the marks scored will be proportionately reduced to 60.

  • The question paper will have ten full questions carrying equal marks.
  • Each full question is for 20 marks.
  • There will be two full questions (with a maximum of four sub-questions) from each module.
  • Each full question will have sub-questions covering all the topics under a module.
  • The students will have to answer five full questions, selecting one full question from each module.
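
The pattern above implies that five answered questions of 20 marks each make up the 100-mark paper, which is then scaled to 60. A minimal sketch of that proportional reduction follows; the function name is hypothetical, and any rounding rule applied by the university is not stated here, so none is applied.

```python
def scale_see_marks(raw_marks, paper_total=100, scaled_total=60):
    """Proportionally reduce SEE marks from paper_total to scaled_total."""
    if not 0 <= raw_marks <= paper_total:
        raise ValueError("raw_marks must lie between 0 and paper_total")
    # No rounding is applied; the official rounding rule is not given.
    return raw_marks * scaled_total / paper_total


# Five full questions of 20 marks each make up the 100-mark paper.
assert 5 * 20 == 100
print(scale_see_marks(75))
```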

 

Textbooks

1 Natural Language Processing and Information Retrieval, Tanveer Siddiqui and U.S. Tiwary, Oxford University Press, 2008

2 Natural Language Processing and Text Mining, Anne Kao and Stephen R. Poteet, Springer-Verlag London Limited, 2007

 

Reference Books

1 Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition, Daniel Jurafsky and James H. Martin, Prentice Hall, 2nd edition, 2008

2 Natural Language Understanding, James Allen, Benjamin/Cummings Publishing Company, 2nd edition, 1995

3 Information Storage and Retrieval Systems, Gerald J. Kowalski and Mark T. Maybury, Kluwer Academic Publishers, 2000

4 Natural Language Processing with Python, Steven Bird, Ewan Klein and Edward Loper, O'Reilly Media, 2009

5 Foundations of Statistical Natural Language Processing, Christopher D. Manning and Hinrich Schütze, MIT Press, 1999