Evaluating data augmentation for financial time series classification

  • Elizabeth Fons (1)
  • Paula Dawson (2)
  • Xiao-jun Zeng (1)
  • John Keane (1)
  • Alexandros Iosifidis (3)

      (1) Department of Computer Science, University of Manchester, UK.
      (2) AllianceBernstein, London, UK.
      (3) Department of Electrical and Computer Engineering, Aarhus University, Denmark.

     

    Examples of time-series data augmentation methods on a sine wave. The blue line corresponds to the original time-series and the dotted orange lines correspond to the generated time-series patterns.
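
    To make the idea in the caption concrete, here is a minimal sketch of two simple augmentations applied to a sine wave. The specific methods (additive Gaussian jitter and random magnitude scaling), the function names and the `sigma` values are assumptions chosen for illustration; the caption does not say which methods the figure shows.

```python
import numpy as np

def jitter(x, sigma=0.05, rng=None):
    """Add independent Gaussian noise to every time step (assumed method)."""
    rng = np.random.default_rng() if rng is None else rng
    return x + rng.normal(0.0, sigma, size=x.shape)

def scale(x, sigma=0.1, rng=None):
    """Multiply the whole series by one random factor close to 1 (assumed method)."""
    rng = np.random.default_rng() if rng is None else rng
    factor = rng.normal(1.0, sigma)
    return x * factor

# Original series: one period of a sine wave.
t = np.linspace(0.0, 2.0 * np.pi, 100)
original = np.sin(t)

# Generate one augmented pattern per method with a fixed seed.
rng = np.random.default_rng(0)
jittered = jitter(original, rng=rng)
scaled = scale(original, rng=rng)
```

    Each call returns a new series of the same length as the original, so the generated patterns can be added to a training set under the label of the series they were derived from.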

    Abstract

    Data augmentation methods in combination with deep neural networks have been used extensively in computer vision on classification tasks, achieving great success; however, their use in time series classification is still at an early stage. This is even more so in the field of financial prediction, where data tends to be small, noisy and non-stationary. In this paper we evaluate several augmentation methods applied to stock datasets using two state-of-the-art deep learning models. The results show that several augmentation methods significantly improve financial performance when used in combination with a trading strategy. For a relatively small dataset (30K samples), augmentation methods achieve up to 400% improvement in risk-adjusted return performance; for a larger stock dataset (300K samples), results show up to 40% improvement.


    Cumulative profit over time (out of sample) of the LSTMNet and TLo-NBoF models trained with different augmentation methods and of the baseline (no augmentation), evaluated using a simple trading strategy of buying the predicted top 10 stocks. The dataset consists of the 50 largest stocks in the S&P 500 index by market capitalization. We focus on the most competitive techniques and, for comparison, add the benchmark calculated from the market-weighted returns of the 50 constituent stocks.
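
    The evaluation described in the caption can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the equal weighting of the selected stocks, the use of a cumulative sum for cumulative profit, the function names, and the synthetic scores, returns and market capitalizations are all hypothetical.

```python
import numpy as np

def top_k_returns(scores, returns, k=10):
    """Per-period return of buying the k stocks with the highest predicted score.

    scores, returns: arrays of shape (n_periods, n_stocks).
    The k selected stocks are equally weighted (an assumption).
    """
    # Column indices of the k largest scores in each row (order within the k is arbitrary).
    top = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    picked = np.take_along_axis(returns, top, axis=1)
    return picked.mean(axis=1)

def market_weighted_returns(market_caps, returns):
    """Per-period benchmark return, weighting each stock by its market capitalization."""
    weights = market_caps / market_caps.sum(axis=1, keepdims=True)
    return (weights * returns).sum(axis=1)

# Synthetic data standing in for model predictions and realized returns.
rng = np.random.default_rng(0)
n_periods, n_stocks = 250, 50
returns = rng.normal(0.0005, 0.01, size=(n_periods, n_stocks))
scores = returns + rng.normal(0.0, 0.02, size=returns.shape)  # noisy "predictions"
market_caps = rng.uniform(50.0, 500.0, size=(n_periods, n_stocks))

strategy = top_k_returns(scores, returns, k=10)
benchmark = market_weighted_returns(market_caps, returns)

# Cumulative profit over time, taken here as the running sum of per-period returns.
cumulative_strategy = np.cumsum(strategy)
cumulative_benchmark = np.cumsum(benchmark)
```

    Plotting `cumulative_strategy` for each model and augmentation method against `cumulative_benchmark` gives curves of the kind the caption describes.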


    Acknowledgement

    This work was supported by the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie Grant Agreement no. 675044 (http://bigdatafinance.eu/), Training for Big Data in Financial Research and Risk Management. A. Iosifidis acknowledges funding from the Independent Research Fund Denmark project DISPA (Project Number: 9041-00004).