Year of Graduation
2024
Level of Access
Open Access Thesis
Embargo Period
5-16-2025
Department or Program
Computer Science
First Advisor
David Byrd
Abstract
High-frequency financial time series forecasting is only lightly explored in the academic literature. Challenges arise from the nature of the data, which is noisy, voluminous, time-dependent, and sequential. This paper proposes a clustering framework that partitions such data to select training sets for deep learning models. We perform a comparative analysis using multiple distance-based clustering methods and time series-specific distance metrics to select training data for recurrent neural forecasting models. Evaluating our approach over a three-year period for three large-cap technology stocks, we find that models trained on the partitioned data achieve lower loss values and higher directional prediction accuracy than equivalent models trained without partitioning.
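As a rough illustration of distance-based selection of training data, the sketch below computes a dynamic time warping (DTW) distance between price windows and keeps the windows closest to a query window. The abstract does not name the specific metrics or clustering methods used, so DTW, the function names `dtw_distance` and `select_nearest_windows`, and the toy data are all assumptions made for illustration, not the thesis's actual method.

```python
import numpy as np


def dtw_distance(a, b):
    """Classic O(n*m) dynamic time warping distance between two 1-D series.

    Assumption: absolute difference as the pointwise cost, no warping window.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # step in a only
                                 cost[i, j - 1],      # step in b only
                                 cost[i - 1, j - 1])  # step in both
    return float(cost[n, m])


def select_nearest_windows(query, candidates, k):
    """Return indices of the k candidate windows closest to `query` under DTW.

    Hypothetical helper: a stand-in for partition-based training data selection.
    """
    dists = np.array([dtw_distance(query, c) for c in candidates])
    order = np.argsort(dists, kind="stable")[:k]
    return order.tolist(), dists


# Toy usage with made-up windows (not real market data).
query = [1.0, 2.0, 3.0]
candidates = [[5.0, 6.0, 7.0], [1.0, 2.0, 3.0], [1.0, 2.0, 4.0]]
nearest, dists = select_nearest_windows(query, candidates, k=2)
```

A forecasting model would then be trained only on the selected windows. In practice the windows would typically be normalized first, and a full clustering method would replace this simple nearest-neighbor selection.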