# My recent attempts at using GANs for simulating realistic stocks returns

A presentation for the Hong Kong Machine Learning meetup summarizing my hobby research over the past year. My goal is to be able to simulate realistic multivariate financial time series. If I succeed, I will be able to compare different statistical methods for portfolio construction, study complex networks, test algorithmic trading strategies, do some reinforcement learning, etc. This is still far from being achieved...

Published in: Economy & Finance

### My recent attempts at using GANs for simulating realistic stocks returns

1. My recent attempts at using GANs for simulating realistic stocks returns. Hong Kong Machine Learning Meetup, Season 2 Episode 4 [online]. Gautier Marti, HKML, 8 April 2020. [Slide footer: Gautier Marti (HKML), GANs and financial stock returns, 8 April 2020]
2. Table of contents: (1) Motivations; (2) My attempts at building CorrGAN: starting simple, always: the 3-dimensional case; from 3D to nD, many difficulties arise...; exploring different architectures; evaluation of CorrGAN; (3) Next steps: comparison of ML-based portfolio allocation methods; cCorrGAN for conditional sampling on the market state.
3. Section 1: Motivations
4. Motivations: Most financial time series are too short! We only observe one path of history out of the many possible. As a consequence, most findings (e.g. trading algos, cross-sectional alphas, portfolio construction methods) could be overfitted to this one particular observed path.
5. Monte Carlo simulations: a set of techniques to alleviate these problems. Ideally, we want to sample time series from the underlying true (multivariate) distribution. Some of the techniques available: sampling from a parametric distribution (iid, parameters fit on a single path, simplistic and unrealistic distribution) [1946]; bootstrapping (iid, only historical values) [1979]; stationary block-bootstrapping (only historical values) [1994]; GANs (less obvious assumptions, but dependent on many hyper-parameters such as the architecture) [2014].
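
To make the two bootstrap variants listed on this slide concrete, here is a minimal NumPy sketch. The function names, the toy Gaussian returns, and the mean block length of 10 are illustrative assumptions, not taken from the talk. The block version draws geometrically distributed block lengths and wraps around the end of the sample.

```python
import numpy as np

def iid_bootstrap(returns, n_samples, rng):
    """Resample rows (dates) independently, with replacement."""
    idx = rng.integers(0, len(returns), size=n_samples)
    return returns[idx]

def stationary_bootstrap(returns, n_samples, mean_block, rng):
    """Resample blocks of consecutive rows with geometric random lengths."""
    n = len(returns)
    p = 1.0 / mean_block                      # probability of starting a new block
    idx = np.empty(n_samples, dtype=int)
    idx[0] = rng.integers(0, n)
    for t in range(1, n_samples):
        if rng.random() < p:
            idx[t] = rng.integers(0, n)       # jump: start a new block
        else:
            idx[t] = (idx[t - 1] + 1) % n     # continue the block, wrapping around
    return returns[idx]

rng = np.random.default_rng(0)
# Toy data (assumption): 250 days of returns for 3 stocks.
toy_returns = 0.01 * rng.standard_normal((250, 3))
path_iid = iid_bootstrap(toy_returns, 250, rng)
path_block = stationary_bootstrap(toy_returns, 250, mean_block=10, rng=rng)
```

Both functions resample whole rows, so cross-sectional dependence within a day is preserved. Only the block version keeps some of the serial dependence, and neither can produce a daily return vector that was not observed historically.
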
6. GANs: Already presented at the meetup by Alex Lau: http://www.hkml.ai/2019/07/hong-kong-machine-learning-season-1-episode-12/ In finance (time series), not much yet, but: https://arxiv.org/abs/1901.01751 (univariate time series); https://arxiv.org/abs/1907.06673 (univariate time series). For multivariate time series, i.e. capturing the joint behaviour of a large number of stocks, nothing really. CorrGAN, https://arxiv.org/abs/1910.09504, is a first step.
7. CorrGAN scope: Simulating the full multivariate distribution of stocks returns, that is both their joint behaviour (think correlations between the stocks) and their marginal behaviour (think typical volatility and occasional jumps), is hard. With CorrGAN, I focus only on the joint behaviour as captured by correlation matrices (already a major simplification of the full dependence distribution, cf. copula theory). Goal: sampling realistic correlation matrices which could have been estimated from real stock returns.
8. Section 2: My attempts at building CorrGAN
9. Subsection 1: Starting simple, always: the 3-dimensional case
10. 3D CorrGAN: $E_3 = \left\{ (\rho_{12}, \rho_{13}, \rho_{23}) \in \mathbb{R}^3 \;\middle|\; \begin{pmatrix} 1 & \rho_{12} & \rho_{13} \\ \rho_{12} & 1 & \rho_{23} \\ \rho_{13} & \rho_{23} & 1 \end{pmatrix} \succeq 0 \right\}$. http://marti.ai/ml/2019/06/23/CorrGan-3D.html http://marti.ai/ml/2019/07/01/CorrGan-3D-empirical.html OK, it works! In 3D...
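
The set of valid 3×3 correlation matrices on this slide can be explored numerically. The sketch below tests membership by checking positive semidefiniteness, then samples the set by rejection from the cube [-1, 1]^3. The helper names and the tolerance are assumptions made for illustration. This is not the GAN itself, only a way to see what it has to learn in 3D.

```python
import numpy as np

def corr_from_coeffs(r12, r13, r23):
    """Build the 3x3 correlation matrix from its three free coefficients."""
    return np.array([[1.0, r12, r13],
                     [r12, 1.0, r23],
                     [r13, r23, 1.0]])

def in_elliptope(r12, r13, r23, tol=1e-12):
    """True if the matrix is positive semidefinite (up to a small tolerance)."""
    eigenvalues = np.linalg.eigvalsh(corr_from_coeffs(r12, r13, r23))
    return bool(eigenvalues.min() >= -tol)

rng = np.random.default_rng(0)
candidates = rng.uniform(-1.0, 1.0, size=(10_000, 3))
# Rejection sampling: keep only the triplets that define a valid matrix.
accepted = np.array([c for c in candidates if in_elliptope(*c)])
acceptance_rate = len(accepted) / len(candidates)
```

Plotting `accepted` as a 3D scatter shows the shape of the valid region. Such a direct visual check is no longer available in higher dimensions.
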
11. Subsection 2: From 3D to nD, many difficulties arise...
12. How to evaluate in nD? Challenge: it is no longer possible to visualize the space of empirical and simulated correlations, so how do we evaluate? Several stylized facts are known about these matrices: the distribution of pairwise correlations is significantly shifted to the positive; eigenvalues follow the Marchenko–Pastur distribution, except for (1) a very large first eigenvalue and (2) a couple of other large eigenvalues; the Perron-Frobenius property (the first eigenvector has positive entries); a hierarchical structure of clusters; the scale-free property of the corresponding MST (minimum spanning tree). http://marti.ai/ml/2019/07/15/financial-correlations-stylized-facts.html Alternative: compare the empirical (real) and generated (fake) distributions using Topological Data Analysis, https://arxiv.org/abs/1802.02664
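
Three of these stylized facts are easy to check numerically. The sketch below does so on a toy one-factor model, which is an assumption standing in for real stock returns. The variable names and the noise scale are illustrative. The Marchenko–Pastur upper edge used here is the standard one for a pure-noise correlation matrix with n stocks and T observations, (1 + sqrt(n/T))^2.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stocks, n_days = 50, 500

# Toy one-factor model (assumption, not real data): a common "market"
# factor plus idiosyncratic noise.
market = rng.standard_normal(n_days)
betas = rng.uniform(0.5, 1.5, size=n_stocks)
returns = np.outer(market, betas) + 2.0 * rng.standard_normal((n_days, n_stocks))

corr = np.corrcoef(returns, rowvar=False)

# 1. Pairwise correlations shifted to the positive.
off_diag = corr[np.triu_indices(n_stocks, k=1)]
mean_corr = off_diag.mean()

# 2. Eigenvalues versus the Marchenko-Pastur upper edge for pure noise.
eigenvalues, eigenvectors = np.linalg.eigh(corr)      # ascending order
mp_upper_edge = (1.0 + np.sqrt(n_stocks / n_days)) ** 2
n_above_edge = int((eigenvalues > mp_upper_edge).sum())

# 3. Perron-Frobenius: entries of the first eigenvector share one sign.
first_vec = eigenvectors[:, -1]
first_vec = first_vec * np.sign(first_vec.sum())      # fix the arbitrary sign
perron_frobenius = bool((first_vec > 0).all())
```

The same three quantities can be computed on generated matrices and compared with their empirical counterparts. The hierarchical and MST properties need more machinery and are left out of this sketch.
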
13. Permutation invariance in neural networks? GANs rely on deep nets. Those are in general not permutation invariant.
14. Why do we care about permutation invariance? Regression task: given a set of coefficients (the upper triangular part of a correlation matrix), output the sum of its values. Remark: there are $\left(\frac{n(n-1)}{2}\right)!$ equivalent input vectors. If we don't leverage permutation invariance, the number of examples is not sufficient for the model to "learn". http://marti.ai/ml/2019/09/01/correl-invariance-permutations-nn.html
15. Idea 1: Build invariance directly into the NN architecture. Deep Sets (https://arxiv.org/abs/1703.06114) is a simple neural network module, based on the permutation invariance of the sum operator, that one can plug into the main deep net to make it permutation invariant. My experience is that it is not working technology yet. Some other research supports this claim: https://arxiv.org/abs/1901.09006.
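
The sum-based construction can be illustrated in a few lines of NumPy: apply the same map to every element, sum the results, then apply a second map to the pooled vector. The weights below are random and untrained, and the layer sizes and names are assumptions. The sketch only shows why the output cannot depend on the order of the inputs, not how well such a module trains.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden = 16
# Random, untrained weights (assumption): this illustrates the architecture only.
W_phi = rng.standard_normal((1, hidden))
b_phi = rng.standard_normal(hidden)
w_rho = rng.standard_normal(hidden)

def deep_set(coeffs):
    """Compute rho(sum_i phi(x_i)) with small one-layer phi and rho."""
    x = np.asarray(coeffs, dtype=float).reshape(-1, 1)
    embedded = np.tanh(x @ W_phi + b_phi)   # phi, applied to each element
    pooled = embedded.sum(axis=0)           # sum pooling: order does not matter
    return float(pooled @ w_rho)            # rho, applied to the pooled vector

# E.g. the 10 upper-triangular coefficients of a 5x5 correlation matrix.
coeffs = rng.uniform(-1.0, 1.0, size=10)
shuffled = rng.permutation(coeffs)
out_original = deep_set(coeffs)
out_shuffled = deep_set(shuffled)
```

The two outputs agree up to floating-point rounding, whatever the permutation. A plain MLP fed the same two vectors would in general return different values.
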
16. Idea 2: Find a canonical representation, e.g. map each of the n! equivalent correlation matrices to the same one, the representer. [Figure 1 panels: arbitrary $C$; $R_{ij} = C_{\pi_S(i)\pi_S(j)}$; $R_{ij} = C_{\pi_H(i)\pi_H(j)}$.] Figure 1: Three equivalent correlation matrices. The leftmost one has been obtained by estimation on returns of arbitrarily ordered stocks; the middle one has been reordered by applying the same permutation $\pi_S$ to the rows and columns (obtained by sorting the rows according to their sum); the rightmost one by applying the same permutation $\pi_H$ to the rows and columns (induced by a hierarchical clustering algorithm). Question: Are some representations better than others?
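
Both reorderings described in the Figure 1 caption can be sketched with NumPy and SciPy. The slide does not specify the sort direction, the distance, or the linkage method. The ascending sort, the 1 - correlation distance, and the average linkage below are assumptions, as are the function names and the toy data.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import squareform

def reorder(corr, perm):
    """Apply the same permutation to the rows and the columns."""
    return corr[np.ix_(perm, perm)]

def row_sum_permutation(corr):
    """pi_S: order the stocks by the sum of their correlations."""
    return np.argsort(corr.sum(axis=1))

def hierarchical_permutation(corr, method="average"):
    """pi_H: leaf order of a hierarchical clustering on 1 - correlation."""
    dist = 1.0 - corr
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)
    return leaves_list(linkage(condensed, method=method))

rng = np.random.default_rng(0)
# Toy data (assumption): 6 stocks, the first 3 sharing a common factor.
toy_returns = rng.standard_normal((300, 6))
toy_returns[:, :3] += rng.standard_normal((300, 1))
corr = np.corrcoef(toy_returns, rowvar=False)

# The same matrix with its stocks listed in a different, arbitrary order.
corr_shuffled = reorder(corr, rng.permutation(6))

R_S = reorder(corr, row_sum_permutation(corr))
R_S_shuffled = reorder(corr_shuffled, row_sum_permutation(corr_shuffled))
R_H = reorder(corr, hierarchical_permutation(corr))
```

Barring ties in the row sums, `R_S` and `R_S_shuffled` are the same matrix, which is what makes the row-sum ordering a canonical representation. The hierarchical ordering groups correlated stocks into visible blocks, but its leaf order can depend on tie-breaking and branch orientation.
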
17. Subsection 3: Exploring different architectures
18. MLP GAN: Did not manage to make it work: the GAN converges toward generating the mean of the dataset. [Figure 2 panels: Empirical; Generated; Mean of empirical.] Figure 2: (Left) Flattened upper triangle of an empirical correlation matrix re-ordered by $\pi_S$ and displayed in Figure 1; (Center) an example of a vector generated by the MLP GAN trained on 10,000 flattened upper triangles of empirical correlation matrices re-ordered by $\pi_S$. It seems that the model has learnt to generate an average of the empirical correlations (Right). http://marti.ai/ml/2019/09/22/tf-mlp-gan-repr-correlation-matrices.html
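
The input representation mentioned in the Figure 2 caption, the flattened upper triangle of a correlation matrix, can be sketched as follows. The helper names are illustrative assumptions.

```python
import numpy as np

def flatten_upper(corr):
    """Vector of the n(n-1)/2 coefficients strictly above the diagonal."""
    return corr[np.triu_indices(corr.shape[0], k=1)]

def unflatten_upper(vec, n):
    """Rebuild the symmetric matrix with a unit diagonal."""
    corr = np.eye(n)
    rows, cols = np.triu_indices(n, k=1)
    corr[rows, cols] = vec
    corr[cols, rows] = vec
    return corr

corr = np.array([[1.0, 0.5, 0.2],
                 [0.5, 1.0, 0.3],
                 [0.2, 0.3, 1.0]])
vec = flatten_upper(corr)              # row-major order: rho_12, rho_13, rho_23
rebuilt = unflatten_upper(vec, 3)
```

Note that an unconstrained vector produced by a generator does not necessarily map back to a positive semidefinite matrix, so generated outputs may need a validity check or a projection step.
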
19. DCGAN + Hierarchical sorting ≈ CorrGAN. Figure 3: Three correlation matrices; can you guess which one is DCGAN-generated? http://marti.ai/ml/2019/10/13/tf-dcgan-financial-correlation-matrices.html
20. Subsection 4: Evaluation of CorrGAN
21. Evaluation of CorrGAN: As a first evaluation, we can check that the generated matrices exhibit the known stylized facts. Figure 4: (Left) Distribution of correlations; (Center) Distribution of eigenvalues; (Right) First eigenvector entries. Results are summarized in the paper: https://arxiv.org/abs/1910.09504 http://marti.ai/ml/2019/10/13/tf-dcgan-financial-correlation-matrices.html
22. CorrGAN.io: One can look at outputs of the model (fake) vs real empirical correlations, and try to guess which is which. Figure 5: http://www.corrgan.io/, a simple web app using Flask.
23. Section 3: Next steps
24. Subsection 1: Comparison of ML-based portfolio allocation methods
25. Lopez de Prado HRP vs. Papenbrock-Raffinot HERC: http://marti.ai/qfin/2019/12/04/hierarchical-risk-parity-part-3.html http://marti.ai/qfin/2020/03/22/herc-part-i-implementation.html
26. Subsection 2: cCorrGAN for conditional sampling on the market state
27. cCorrGAN, {normal, stressed, rally} market correlations: We may want to sample conditional on the market state, for example with three modes: normal, rally, and stressed. Figure 6: Correlation matrices estimated when the market was in a normal, rally, and stressed state respectively. Preparing the training set: http://marti.ai/qfin/2020/02/03/sp500-sharpe-vs-corrmats.html
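
One possible way to prepare such a labeled training set is sketched below. It is not necessarily what the linked post does: the labeling rule (annualized Sharpe ratio of an equal-weighted index over each window), the thresholds, the window sizes, and all names are hypothetical choices made for illustration.

```python
import numpy as np

def label_market_state(index_returns, low=-0.5, high=2.0, periods_per_year=252):
    """Label a window from its annualized Sharpe ratio (hypothetical thresholds)."""
    r = np.asarray(index_returns, dtype=float)
    sharpe = np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1)
    if sharpe < low:
        return "stressed"
    if sharpe > high:
        return "rally"
    return "normal"

def build_conditional_dataset(returns, window=252, step=21):
    """Pair each rolling-window correlation matrix with a market-state label."""
    samples = []
    for start in range(0, len(returns) - window + 1, step):
        chunk = returns[start:start + window]
        corr = np.corrcoef(chunk, rowvar=False)
        # Equal-weighted average of the stocks as a stand-in market index.
        state = label_market_state(chunk.mean(axis=1))
        samples.append((corr, state))
    return samples

rng = np.random.default_rng(0)
# Toy data (assumption): 3 years of daily returns for 5 stocks.
toy_returns = 0.01 * rng.standard_normal((756, 5))
dataset = build_conditional_dataset(toy_returns)
```

Each `(corr, state)` pair could then serve as one training example for a conditional GAN, with the label as the conditioning variable.
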
28. Questions? Suggestions?