Big Data Management Technologies

This course introduces the concept of “Big Data” and studies modern techniques and storage platforms for managing it at Internet scale. Specifically, the course covers:

  • Large-scale system architectures: Peer-to-Peer and Cloud Computing.
  • Databases on the Internet: relational, parallel, and distributed databases, with emphasis on distributed file system technologies (HDFS), NoSQL (HBase, Cassandra), graph databases (Neo4j), and NewSQL.
  • Execution models over large amounts of data (MapReduce, BSP) and the platforms that implement them (Hadoop, Hama, Spark, etc.).
  • Applications of the above and the implementation of distributed algorithms.

Code      Hours   Type       eClass    Semester
DCS262    4       Elective   e-Class   7

Bibliography:

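  • “Mining of Massive Datasets” (Greek edition: “Εξόρυξη από Μεγάλα Σύνολα Δεδομένων”), Anand Rajaraman, Jeffrey David Ullman (Eudoxus link)
  • “Google's PageRank and Beyond: The Science of Search Engine Rankings” (Greek edition: “Η Μέθοδος PageRank της Google και Άλλα Συστήματα Κατάταξης Ιστοσελίδων”), Amy Langville, Carl Meyer (Eudoxus link)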