- Boolean Information Retrieval (IR), TF-IDF
- Evaluation Models (Precision, Recall, MAP, NDCG)
- Probabilistic IR, BM25
- Hypothesis testing
- Statistical language models
- Latent topic models (LSI, pLSI, LDA)
- Relevance feedback, novelty & diversity
- PageRank, HITS
- Spam detection, social networks
- Inverted lists
- Index compression, top-k query processing
- Frequent itemsets & association rules
- Hierarchical, density-based, and co-clustering
- Decision trees and Naive Bayes
- Support vector machines
Information Retrieval and Data Mining (M, 4.0 CP)
|Module Number||Module Name||CP (Effort)|
|INF-24-52-M-6||Information Retrieval and Data Mining||4.0 CP (120 h)|
|CP, Effort||4.0 CP = 120 h|
|Position in the semester||1 semester, offered irregularly in the summer semester (SuSe)|
|Level||Master (General)|
|Area of study||[INF-INSY] Information Systems|
|Reference course of study||[INF-88.79-SG] M.Sc. Computer Science|
|Course Number||Title||Choice in||Presence-Time||Self-Study|
|INF-24-52-K-6||Information Retrieval and Data Mining||P||42 h||78 h|
- About [INF-24-52-K-6]: Title: "Information Retrieval and Data Mining"; Presence-Time: 42 h; Self-Study: 78 h
- About [INF-24-52-K-6]: The study achievement "[U-Schein] proof of successful participation in the exercise classes (ungraded)" must be obtained. It is a prerequisite for the examination PL1.
Examination achievement PL1
- Form of examination: oral examination (20-60 min)
- Examination Frequency: Examination only within the course
- Examination number: 62452 ("Information Retrieval and Data Mining")
Evaluation of grades
The grade of the module examination is also the module grade.
Competencies / intended learning achievements
After successfully completing the module, students will be able to:
- explain how modern information retrieval systems are realized,
- handle unstructured textual information, accounting for human-made typos, synonymy, polysemy, etc., as well as novelty among documents,
- apply core data mining approaches such as frequent itemset mining, decision trees, k-means clustering, and Bayesian classification to build data analytics solutions, for instance for smart decision making (concepts that are becoming increasingly important in the Big Data era).
- Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze. Introduction to Information Retrieval, Cambridge University Press, 2008.
- Larry Wasserman. All of Statistics, Springer, 2004.
- Stefan Büttcher, Charles L. A. Clarke, Gordon V. Cormack. Information Retrieval: Implementing and Evaluating Search Engines, MIT Press, 2010.
- Anand Rajaraman and Jeffrey D. Ullman. Mining of Massive Datasets, Cambridge University Press, 2011.
- Supplementary literature references will be given in the lecture.