Module Handbook

Module INF-24-53-M-6

Distributed Data Management (M, 4.0 LP)

Module Identification

Module Number: INF-24-53-M-6
Module Name: Distributed Data Management
CP (Effort): 4.0 CP (120 h)

Basedata

CP, Effort 4.0 CP = 120 h
Position of the semester 1 Sem. irreg. SuSe
Level [6] Master (General)
Language [EN] English
Module Manager
Lecturers
Area of study [INF-INSY] Information Systems
Reference course of study [INF-88.79-SG] M.Sc. Computer Science
Lifecycle-State [NORM] Active

Courses

Type/SWS: 2V+1U
Course Number: INF-24-53-K-6
Title: Distributed Data Management
Choice in Module-Part: P
Presence-Time / Self-Study: 42 h / 78 h
SL: U-Schein
SL is required for exam: yes
PL: PL1
CP: 4.0
Sem.: irreg. SuSe
  • About [INF-24-53-K-6]: Title: "Distributed Data Management"; Presence-Time: 42 h; Self-Study: 78 h
  • About [INF-24-53-K-6]: The study achievement "[U-Schein] proof of successful participation in the exercise classes (ungraded)" must be obtained.
    • It is a prerequisite for the examination for PL1.

Examination achievement PL1

  • Form of examination: oral examination (20-60 min)
  • Examination Frequency: Examination only within the course
  • Examination number: 62153 ("Distributed Data Management")

Evaluation of grades

The grade of the module examination is also the module grade.


Contents

  • Distributed Query Processing
  • Fault Tolerance
  • Replication
  • MapReduce (Hadoop) Fundamentals
  • Spark and SparkSQL
  • Pig and Hive
  • NoSQL: key-value stores, graph databases, ...
  • Consensus algorithms (Paxos)
  • State machine replication
  • Lamport timestamps
  • CAP Theorem, BASE
  • Consistency Models
  • Vector clocks
  • Cloud Computing
  • Stream Processing (STREAM, Storm)
  • Probabilistic Counting and Data Synopses

Competencies / intended learning achievements

After successfully completing the module, students
  • can implement data analytics algorithms such as Google’s PageRank, frequent n-gram counting, or near-duplicate detection via min-hashing, using the complex APIs of Big Data processing frameworks such as Apache Spark or Hadoop,
  • can install the runtime environments of these frameworks and execute the developed algorithms on real datasets,
  • have acquired in-depth knowledge of how such distributed systems are realized, allowing them to implement scalable and efficient solutions,
  • have acquired widely applicable core concepts of distributed systems, such as consensus algorithms or distributed clocks.

Literature

Will be announced during the course.

Requirements for attendance of the module (informal)

None

Requirements for attendance of the module (formal)

None

References to Module / Module Number [INF-24-53-M-6]

Course of Study Section Choice/Obligation
[INF-88.79-SG] M.Sc. Computer Science [Specialisation] Specialization 1 [WP] Compulsory Elective