Year of Graduation

2024

Level of Access

Open Access Thesis

Embargo Period

5-16-2025

Department or Program

Computer Science

First Advisor

David Byrd

Abstract

High-frequency financial time series forecasting has been only lightly explored in the academic literature. Challenges arise from the nature of the data, which is noisy, voluminous, time-dependent, and sequential. This paper proposes a clustering framework for such data that partitions the data used to train deep learning models. We perform a comparative analysis using multiple distance-based clustering methods and time series-specific distance metrics to select training data for recurrent neural forecasting models. Evaluating our approach over a three-year period for three large-cap technology stocks, we find that models trained on the partitioned data achieve lower loss values and higher directional prediction accuracy than equivalent models trained without partitioning.
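As a rough illustration of the idea of partitioning training data by time series similarity, the sketch below clusters fixed windows of a series and selects the cluster closest to a query window. It is not the thesis's implementation: the abstract does not name its distance metrics or clustering methods, so dynamic time warping (DTW) and a simple k-medoids loop are used here as assumed stand-ins, and all function names (dtw_distance, k_medoids, select_training_windows) are hypothetical.

```python
from typing import List, Sequence, Tuple


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Dynamic time warping distance with absolute-difference step cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def assign(dist: List[List[float]], medoids: List[int]) -> List[int]:
    """Label each window with the position of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
        for i in range(len(dist))
    ]


def k_medoids(
    windows: Sequence[Sequence[float]], k: int, max_iter: int = 20
) -> Tuple[List[int], List[int]]:
    """Alternating k-medoids on a precomputed DTW distance matrix.

    Assumption: medoids are initialised to the first k windows so the
    result is deterministic; a real system would likely choose differently.
    """
    n = len(windows)
    dist = [[dtw_distance(windows[i], windows[j]) for j in range(n)]
            for i in range(n)]
    medoids = list(range(k))
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster empties
                new_medoids.append(medoids[c])
                continue
            # the new medoid minimises total distance to its cluster members
            new_medoids.append(
                min(members, key=lambda i: sum(dist[i][j] for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)


def select_training_windows(
    windows: Sequence[Sequence[float]],
    medoids: List[int],
    labels: List[int],
    query: Sequence[float],
) -> List[int]:
    """Return indices of windows in the cluster whose medoid is nearest to query."""
    nearest = min(
        range(len(medoids)),
        key=lambda c: dtw_distance(query, windows[medoids[c]]),
    )
    return [i for i, lab in enumerate(labels) if lab == nearest]


# Toy data: rising and falling windows (illustrative values only).
windows = [
    [0.0, 1.0, 2.0, 3.0],
    [3.0, 2.0, 1.0, 0.0],
    [0.0, 0.5, 2.0, 3.0],
    [3.0, 2.5, 1.0, 0.0],
    [0.0, 1.0, 2.5, 3.0],
]
medoids, labels = k_medoids(windows, k=2)
train_idx = select_training_windows(windows, medoids, labels, [0.0, 1.0, 2.0, 3.2])
```

In this toy setup, a forecasting model for the query window would be trained only on the windows indexed by train_idx rather than on the full set, which is the general shape of the partitioning the abstract describes.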

Available for download on Friday, May 16, 2025
