Please use this identifier to cite or link to this item:
http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19484
Title: | Extending the Daphne Runtime: Lustre File System integration |
Authors: | Stamatis, Apostolos (Σταματής, Απόστολος); Tsoumakos, Dimitrios (Τσουμάκος, Δημήτριος) |
Keywords: | Distributed file systems; Distributed systems; DAPHNE; Lustre file system; IDA pipelines; Data analytics |
Issue Date: | 21-Feb-2025 |
Abstract: | Recently, there has been a trend toward Integrated Data Analysis (IDA) pipelines that integrate various computational and data processing tasks within a unified framework. DAPHNE is an open and extensible system infrastructure for such IDA pipelines. This study focuses on the integration of the DAPHNE runtime with the Lustre file system. Lustre is a POSIX-compliant, object-based distributed file system, which is widely adopted in High-Performance Computing (HPC) due to its ability to handle parallel I/O operations efficiently. This integration is achieved via the development of specialized C++ kernels that support read and write operations for CSV and DAPHNE Binary Data Format (dbdf) files. The Single-File approach is selected to reduce metadata overhead and improve scalability. Experiments were conducted in an AWS-based cluster to analyze performance improvements in read/write operations, scalability with increasing worker nodes, and the impact of various optimization techniques such as stripe size adjustments, file preallocation, and stripe alignment. Results indicate that Lustre integration significantly enhances the performance of DAPHNE’s distributed runtime and enables better scalability for large datasets. |
URI: | http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19484 |
Appears in Collections: | Diploma Theses (Διπλωματικές Εργασίες) |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
DAPHNE_Lustre_Integration_Apostolis_Stamatis.pdf | | 1.87 MB | Adobe PDF | View/Open |
Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.