Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19623
Full metadata record
DC Field | Value | Language
dc.contributor.author | Σιδέρης, Κωνσταντίνος | -
dc.date.accessioned | 2025-06-27T08:38:37Z | -
dc.date.available | 2025-06-27T08:38:37Z | -
dc.date.issued | 2025-06-23 | -
dc.identifier.uri | http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19623 | -
dc.description.abstract | As organisations increasingly adopt lakehouse architectures to support big data analytics, understanding the performance trade-offs of using enhanced storage layers instead of standard data lake architectures is essential. This master's dissertation presents a comprehensive performance evaluation of two leading data lakehouse solutions, Delta Lake and Apache Hudi, focusing on both batch and stream processing workloads. Through the benchmarking process, we compare Delta Lake and Hudi against standard data lake implementations, which consist of a simple storage layer queried by an analytics engine, in this case HDFS and Apache Spark. Being built on top of data lakes, lakehouses leverage their strengths while introducing new features, such as ACID transactions, schema enforcement, schema evolution and data governance mechanisms, to address the issues data lakes face. They also introduce optimisations, such as indexing, data skipping and partition pruning, to further improve on them. Throughout this thesis, we present these features and, through benchmarks, evaluate how they improve performance and whether the added functionality justifies the use of lakehouses even in cases where they may underperform. | en_US
dc.language | en | en_US
dc.subject | Big Data | en_US
dc.subject | Data Lakes | en_US
dc.subject | Batch Processing | en_US
dc.subject | Stream Processing | en_US
dc.subject | Delta Lake | en_US
dc.subject | Apache Hudi | en_US
dc.title | Data Lakehouse Performance Study | en_US
dc.description.pages | 81 | en_US
dc.contributor.supervisor | Τσουμάκος Δημήτριος | en_US
dc.department | Division of Computer Science | en_US
Appears in Collections: Diploma Theses

Files in This Item:
File | Description | Size | Format
03118134_thesis.pdf | | 1.34 MB | Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.