Please use this identifier to cite or link to this item:
http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19786
Title: Smart Systems for Transportation Applications
Authors: Δούκας, Αλέξανδρος; Τσανάκας, Παναγιώτης
Keywords: STG-Tx Transformer; Spatio-temporal graph forecasting; Short-term traffic prediction; Flash Attention; Node subsampling; Mixed precision training (AMP); Masked MAE / RMSE / MAPE
Issue Date: 9-Oct-2025
Abstract: Accurate short-term traffic forecasting is a cornerstone of modern intelligent transportation systems, enabling proactive congestion management, adaptive signal control, and informed route guidance. Existing deep-learning models deliver high accuracy on small or moderately sized road networks, yet they struggle to scale to state-wide sensor graphs comprising thousands of nodes and years of high-frequency data. This diploma thesis addresses that gap by designing and evaluating STG-Tx, a streamlined Spatio-Temporal Graph Transformer tailored for large-scale traffic flow prediction. STG-Tx couples a dual-path attention architecture (temporal attention first, spatial attention second) with efficiency techniques such as patchified sensor sequences and FlashAttention, reducing the quadratic memory footprint of classical transformers by two orders of magnitude. An end-to-end pipeline was implemented in PyTorch: historical flow records from the LargeST-CA dataset (8,600 loop detectors, 5-minute resolution, 2017–2021) were re-indexed, imputed, and Z-score normalised on GPUs before being streamed to training jobs. Node subsampling, mixed-precision arithmetic (AMP), and efficient batching further lowered the hardware barrier for experimentation. Model performance was assessed on a held-out 2020–2021 test split using the standard metrics MAE, RMSE, and MAPE across twelve 5-minute horizons. STG-Tx achieved an overall MAE of 30.2, RMSE of 39.7, and MAPE of 8.3 %, matching or outperforming several established spatio-temporal baselines on LargeST-CA. Training throughput reached ≈280 samples s⁻¹, inference for the full 8,600-sensor graph completed in ≈120 ms, and peak GPU memory stayed at approximately 22 GB. These results confirm the practical scalability of the proposed design. The thesis contributes (i) a reproducible large-scale preprocessing and training toolkit, (ii) an efficient transformer variant for spatio-temporal graphs, and (iii) an empirical study that clarifies the trade-offs between accuracy, memory, and runtime on the largest publicly available traffic benchmark.
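The dual-path attention pattern the abstract names (temporal attention first, spatial attention second) can be sketched in a few lines of PyTorch. Everything below is illustrative: the module name, dimensions, and head count are assumptions, not the thesis's code. Recent PyTorch builds can route nn.MultiheadAttention through a FlashAttention-style scaled_dot_product_attention kernel on supported GPUs, which is one way to realise the memory savings the abstract claims.

```python
import torch
import torch.nn as nn

class DualPathAttention(nn.Module):
    """Temporal-first / spatial-second attention sketch.
    Residual connections and layer norms are omitted for brevity."""
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.temporal = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.spatial = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch B, time T, nodes N, channels D]
        B, T, N, D = x.shape
        # Temporal path: attend over T independently for each of the N sensors.
        xt = x.permute(0, 2, 1, 3).reshape(B * N, T, D)
        xt, _ = self.temporal(xt, xt, xt)
        x = xt.reshape(B, N, T, D).permute(0, 2, 1, 3)
        # Spatial path: attend over N independently for each time step.
        xs = x.reshape(B * T, N, D)
        xs, _ = self.spatial(xs, xs, xs)
        return xs.reshape(B, T, N, D)
```

A patchified variant would first fold consecutive time steps into patches along T before the temporal pass, shrinking the attention length (and hence the quadratic cost) further.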
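Two of the efficiency techniques mentioned, node subsampling and mixed-precision training (AMP), compose naturally in a single training step. A minimal sketch using the stock torch.cuda.amp utilities; the tensor layout [batch, time, nodes, channels], the n_keep default, and the plain (unmasked) MAE loss are assumptions for illustration, not the thesis's actual code.

```python
import torch

scaler = torch.cuda.amp.GradScaler()

def train_step(model, x, y, optimizer, n_keep=2000):
    # x, y: [batch, time, nodes, channels]; inputs assumed Z-score normalised.
    # Node subsampling: train on a random subset of sensors to cap memory.
    idx = torch.randperm(x.size(2), device=x.device)[:n_keep]
    x, y = x[:, :, idx], y[:, :, idx]
    optimizer.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast():          # forward pass in mixed precision
        loss = (model(x) - y).abs().mean()   # plain MAE; the thesis uses a masked variant
    scaler.scale(loss).backward()            # scale gradients against fp16 underflow
    scaler.step(optimizer)
    scaler.update()
    return loss.item()
```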
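The keyword list names masked MAE / RMSE / MAPE, the convention on traffic benchmarks where entries with a null sensor reading are excluded from the average. A minimal sketch of that convention as commonly implemented in traffic-forecasting codebases (e.g. the DCRNN lineage); the function name and the null_val default of 0.0 are assumptions.

```python
import torch

def masked_metrics(pred: torch.Tensor, true: torch.Tensor, null_val: float = 0.0):
    # Exclude entries whose ground truth equals null_val (missing readings),
    # then re-weight the mask so the surviving entries average to one.
    mask = (true != null_val).float()
    mask = torch.nan_to_num(mask / mask.mean())
    err = pred - true
    mae = (err.abs() * mask).mean()
    rmse = ((err ** 2) * mask).mean().sqrt()
    # inf/nan from division by null targets are zeroed by the mask + nan_to_num.
    mape = torch.nan_to_num((err / true).abs() * mask).mean()
    return mae.item(), rmse.item(), mape.item()
```

Computed separately at each of the twelve 5-minute steps, this yields the per-horizon error curves the abstract's evaluation refers to.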
URI: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19786
Appears in Collections: Διπλωματικές Εργασίες - Theses
Files in This Item:
File | Description | Size | Format
---|---|---|---
diplomatiki.pdf | | 821.41 kB | Adobe PDF