Novel Deep Reinforcement Algorithm with Adaptive Sampling Strategy for Continuous Portfolio Optimization

Szu Hao Huang*, Yu Hsiang Miao, Yi Ting Hsiao

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Quantitative trading seeks favorable returns by identifying patterns in historical data through statistical or mathematical approaches. With advances in artificial intelligence, many studies have indicated that deep reinforcement learning (RL) can perform well in quantitative trading by predicting price trends in the financial market. However, most of the related frameworks generalize poorly in the testing stage. We therefore incorporated adversarial learning and a novel sampling strategy into RL-based portfolio management. The goal was to construct a portfolio of five assets drawn from the constituents of the Dow Jones Industrial Average and to achieve excellent performance through our trading strategy. We used adversarial learning during the RL process to enhance the model's robustness. Moreover, to improve the model's computational efficiency, we introduced a novel sampling strategy that determines which data are worth learning by observing the learning condition. The experimental results revealed that the model performed more favorably with our sampling strategy than with the random learning strategy: the Sharpe ratio increased by 6% to 7%, and profit increased by nearly 45%. Thus, our proposed learning framework and sampling strategy are conducive to obtaining reliable trading rules.

    Original language: English
    Article number: 9437210
    Pages (from-to): 77371-77385
    Number of pages: 15
    Journal: IEEE Access
    Volume: 9
    State: Published - 2021

    Keywords

    • Adversarial learning
    • Portfolio management
    • Reinforcement learning
