
Using random data


As you might expect I spend quite a lot of my time using real financial data - asset prices and returns; and returns from live and simulated trading. It may surprise you to know that I also spend time examining randomly created financial data.

This post explains why. I also explain how to generate random data for various purposes, using both python and excel. I'll then give you an example of using random data: drawing conclusions about drawdown distributions and other return statistics.

This is the first in a series of three posts. I intend to follow up with two more posts illustrating how to use random data - the second will be on 'trading the equity curve' (essentially adjusting your system's risk depending on your current performance), and the third illustrating why you should use robust out of sample portfolio allocation techniques (also covered in chapter 4 of my book).

Why random data is a good thing

As systematic traders we spend a fair amount of our time looking backwards. We look at backtests - simulations of what would have happened if we had run our trading system in the past. We then draw conclusions, such as 'It would have been better to drop this trading rule variation as it seems to be highly correlated with this other one', or 'This trading rule variation holds its positions for only a week', or 'The maximum drawdown I should expect is around 20%', or 'If I had stopped trading once I'd lost 5%, and started again once I was flat, then I'd make more money'.

However it is important to bear in mind that any backtest is, to an extent, random. I like to think of any backtest that I have run as a random draw from a massive universe of unseen backtests. Or perhaps a better way of thinking about this is that any financial price data we have is a random draw from a massive universe of unseen financial data, on which we then run a backtest.

This is an important idea, because any conclusions you might draw from a given backtest are also going to depend, randomly, on precisely how that backtest turned out. For example, one of my pet hates is overfitting. Overfitting is when you tune your strategy to one specific backtest. But when you actually start trading you will get another random set of prices, which is unlikely to look like the random draw you had with your original backtest.

As this guy would probably say, we are easily fooled by randomness:

Despite his best efforts in the interview, Nassim did not get the Apple CEO job when Steve retired. Source: poptech.org

In fact, arguably, whenever you look at an equity curve you should draw error bars around each value to remind you of this underlying uncertainty! I'll actually do that later in this post.

There are four different ways to deal with this problem (apart from ignoring it, of course):

  • Forget about using your actual data, if you haven't got enough of it to draw any meaningful conclusions

  • Use as many different samples of real data as possible - fit across multiple instruments and use long histories of price data, not just a few years.

  • Resample your real data to create more data. For example, suppose you want to know how likely a given drawdown is. You could resample the returns from a strategy's account curve, see what the maximum drawdown was in each resample, and then look at the distribution of those maximum drawdowns to get a better idea. Resampling is quite fun and useful, but I won't be talking about it today (though there's a quick sketch of the idea just after this list).

  • Generate large amounts of random data with the characteristics you want, and then analyse that.
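To make the resampling idea concrete, here is a minimal sketch in python. It bootstraps daily returns from a single account curve and looks at the distribution of maximum drawdowns. All the names are mine, invented for illustration, and the stand-in 'real' returns are just Gaussian noise:

import numpy as np

def max_drawdown(perc_returns):
    # Additive account curve, and its worst fall from the high water mark
    curve = np.cumsum(perc_returns)
    return np.min(curve - np.maximum.accumulate(curve))

rng = np.random.default_rng(0)
# Stand-in for a real daily return series (roughly 20% annualised vol)
real_returns = rng.normal(0.0004, 0.0125, 2500)

resampled = [rng.choice(real_returns, size=len(real_returns), replace=True) for _ in range(1000)]
maxdd_distribution = [max_drawdown(x) for x in resampled]
print(np.percentile(maxdd_distribution, [5, 50, 95]))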

This post is about the fourth method. I am a big fan of using randomly generated data as much as possible when designing trading systems. Using random data, especially in the early stages of creating a system, is an excellent way to steer clear of real financial data for as long as possible, and avoid being snared in the trap of overfitting.

The three types of random data

There are three kinds of random data that I use:

  • Random price data, on which I then run a backtest. This is good for examining the performance of individual trading rules under certain market conditions. We need to specify: the process that is producing the price data, which will be some pattern we're interested in (vague I know, but I will provide examples later), plus Gaussian noise.

  • Random asset returns. The assets in question could be trading rule variations for one instrument, or the returns from trading multiple instruments. This is good for experimenting with and calibrating your portfolio optimisation techniques. We need to specify: the Sharpe ratio, standard deviation and correlation of each asset with the others.

  • Random backtest returns for an entire strategy. The strategy that is producing these returns is completely arbitrary. We need to specify: the Sharpe ratio, standard deviation and skew of strategy returns (higher moments are also possible).

Note that I am not going to cover in this post:

  • Price processes with non-Gaussian noise
  • Generating prices for multiple instruments. So this rules out testing relative value rules, or those which use data from different places such as the carry rule. It is possible to do this of course, but you need to specify the process by which the prices are linked together.
  • Skewed asset returns (or indeed anything except for a normal, symmetric Gaussian distribution). Again in principle it's possible to do this, but rather complicated.
  • Backtest returns which have ugly kurtosis, jumps, or anything else.
Generally, the more complicated your random model is, the more you will have to calibrate it to real data to produce realistic results, and you will start to lose the benefits.

In an abstract sense then, we need to be able to generate:

  • Price returns from some process, plus Gaussian noise (mean zero returns) with some standard deviation
  • Multiple Gaussian asset returns with some mean, standard deviation and correlation
  • Backtest returns with some mean, standard deviation and skew

Generating random data

Random price data

To make a random price series we need two elements: an underlying process and random noise. The underlying process is something that has the characteristics we are interested in testing our trading rules against. On top of that we add random Gaussian noise to make the price series more realistic.

The 'scale' of the process is unimportant (although you can scale it against a familiar asset price if you like), but the ratio of that scale to the volatility of the noise is critical.

Underlying processes can take many forms. The simplest version would be a flat line (in which case the final price process would be a random walk). The next simplest version would be a single, secular, trend. Naturally the final price series would then be a random walk with drift. The process could also, for example, be a sharp drop.
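To make that concrete, here is a minimal sketch of 'underlying process plus noise', assuming a single secular trend as the process, with Gaussian noise added to the price level and scaled relative to the trend's amplitude (the variable names are mine, not those of the code linked below):

import numpy as np
import matplotlib.pyplot as plt

n_days = 180
amplitude = 10.0
process = np.linspace(0.0, amplitude, n_days)  # a single secular trend

# Noise volatility defined relative to the scale of the process
vol_scale = 0.10
noise = np.random.normal(0.0, vol_scale * amplitude, n_days)

price = process + noise  # the final, more realistic, price series
plt.plot(price)
plt.show()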

Rather than use those trivial cases, the spreadsheet and python code I've created illustrate techniques with trends occurring on a regular basis.

Using the python code, here is the underlying process for one month trends over a 6 month period (Volscale=0.0 adds no noise, and so shows us the underlying process):

ans=arbitrary_timeseries(generate_trendy_price(Nlength=180, Tlength=30, Xamplitude=10.0, Volscale=0.0)).plot()

Here is one random series with noise at 10% of the amplitude added on:

ans=arbitrary_timeseries(generate_trendy_price(Nlength=180, Tlength=30, Xamplitude=10.0, Volscale=0.10)).plot()

And here is another random price series with five times the noise:

ans=arbitrary_timeseries(generate_trendy_price(Nlength=180, Tlength=30, Xamplitude=10.0, Volscale=0.50)).plot()
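The real implementation of generate_trendy_price is in the linked code; purely for illustration, a rough hand-rolled equivalent might look something like this sketch, with trends of a fixed length that alternate direction, plus level noise:

import numpy as np
import matplotlib.pyplot as plt

def rough_trendy_price(Nlength, Tlength, Xamplitude, Volscale, seed=None):
    # Alternating up and down trends, each Tlength days long, spanning Xamplitude
    rng = np.random.default_rng(seed)
    one_up = np.linspace(0.0, Xamplitude, Tlength)
    one_down = np.linspace(Xamplitude, 0.0, Tlength)
    repeats = Nlength // (2 * Tlength) + 1
    process = np.tile(np.concatenate([one_up, one_down]), repeats)[:Nlength]
    noise = rng.normal(0.0, Volscale * Xamplitude, Nlength)
    return process + noise

plt.plot(rough_trendy_price(Nlength=180, Tlength=30, Xamplitude=10.0, Volscale=0.10, seed=1))
plt.show()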

Random correlated asset returns

Spreadsheet: There is a good resource here; others are available on the internet.

My python code is here.

It's for three assets, but you should be able to adapt it easily enough.

Here is an example of the python code running. Note that the code generates returns; these are cumulated up to make an equity curve.

SRlist=[.5, 1.0, 0.0]

clist=[.9,.5,-0.5]

threeassetportfolio(plength=5000, SRlist=SRlist,    clist=clist).cumsum().plot()
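If you'd rather roll your own than adapt my code, here is a minimal sketch of one standard way to do it, using a Cholesky decomposition of the correlation matrix. The names are mine, and note that the correlations you choose have to form a valid (positive definite) matrix for this approach to work:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

DAYS_IN_YEAR = 256.0

def correlated_asset_returns(plength, SRlist, corr_matrix, annual_vol=0.15):
    # Daily vol and mean implied by the annualised Sharpe ratios
    daily_vol = annual_vol / np.sqrt(DAYS_IN_YEAR)
    daily_means = np.array(SRlist) * annual_vol / DAYS_IN_YEAR
    # Impose the correlation structure on iid Gaussian draws
    chol = np.linalg.cholesky(np.array(corr_matrix))
    iid = np.random.normal(size=(plength, len(SRlist)))
    return pd.DataFrame(iid @ chol.T * daily_vol + daily_means)

corr_matrix = [[1.0, 0.9, 0.5], [0.9, 1.0, 0.5], [0.5, 0.5, 1.0]]
returns = correlated_asset_returns(5000, SRlist=[0.5, 1.0, 0.0], corr_matrix=corr_matrix)
returns.cumsum().plot()
plt.show()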

We'll be using this kind of random data in the final post of this series (why you should use robust out of sample portfolio optimisation techniques).

Random backtest returns (equity curve) with skew

Spreadsheet: There are a number of resources on the net showing how skewed returns can be generated in excel, such as this one.

My python code

From my python code, here is an equity curve with an expected Sharpe Ratio of 0.5 and no skew (Gaussian returns):

cum_perc(arbitrary_timeseries(skew_returns_annualised(annualSR=0.5, want_skew=0.0, size=2500))).plot()

Now the same Sharpe, but with skew of 1 (typical of a fairly fast trend following system):

cum_perc(arbitrary_timeseries(skew_returns_annualised(annualSR=0.5, want_skew=1.0, size=2500))).plot()

Here is a backtest with Sharpe 1.0 and skew -3 (typical of a short gamma strategy such as relative value fixed income trading or option selling):

cum_perc(arbitrary_timeseries(skew_returns_annualised(annualSR=1.0, want_skew=-3.0, size=2500))).plot()

Ah the joys of compounding. But look out for the short, sharp, shocks of negative skew.
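In case you want to experiment without my code, here is a minimal sketch of one way to hit a target Sharpe ratio and skew, using scipy's Pearson type III distribution (which is parameterised by skewness directly). This is my own stand-in, not what skew_returns_annualised actually does:

import numpy as np
import scipy.stats as st
import matplotlib.pyplot as plt

DAYS_IN_YEAR = 256.0

def rough_skewed_returns(annualSR, want_skew, size, annual_vol=0.20):
    # Daily mean and vol implied by the annualised Sharpe ratio and vol target
    daily_vol = annual_vol / np.sqrt(DAYS_IN_YEAR)
    daily_mean = annualSR * annual_vol / DAYS_IN_YEAR
    return st.pearson3.rvs(want_skew, loc=daily_mean, scale=daily_vol, size=size)

returns = rough_skewed_returns(annualSR=1.0, want_skew=-3.0, size=2500)
print(st.skew(returns))       # should be near -3 for a large enough sample
plt.plot(np.cumsum(returns))  # additive percentage-return equity curve
plt.show()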

We'll use this python code again in the example at the end of this post (and in the next post, on trading the equity curve).

Safe use of random data

What can we use random data for?

Some things that I have used random data for in the past include:

  • Looking at the correlation of trading rule returns
  • Seeing how sensitive optimal parameters are over different backtests
  • Understanding the likely holding period, and so trading costs, of different trading rules
  • Understanding how a trading rule will react to a given market event (eg 1987 stock market crash being repeated) or market environment (rising interest rates)
  • Checking that a trading rule behaves the way you expect - picks up on trends of a given length, handles corner cases and missing data, reduces positions at a given speed for a reversal of a given severity
  • Understanding how long it takes to get meaningful statistical information about asset returns and correlations
  • Understanding how to use different portfolio optimisation techniques; for example if using a Bayesian technique calibrating how much shrinkage to use (to be covered in the final post of this series)
  • Understanding the likely properties of strategy account curves (as in the example below)
  • Understanding the effects of modifying capital at risk in a drawdown (eg using Kelly scaling or 'trading the equity curve' as I'll talk about in the next post in this series)
[If you've read my book, "Systematic Trading", then you'll recognise many of the applications listed here]

What shouldn't we use random data for?

Random data cannot tell you how profitable a trading rule was in the past (or will be in the future... but then nothing can tell you that!). It can't tell you what portfolio weights to give to instruments or trading rule variations. For that you need real data, although you should be very careful - the usual rules about avoiding overfitting, and fitting on a pure out of sample basis apply.

Random data in a system design workflow

Bearing in mind the above, I'd use a combination of random and real data as follows when designing a trading system:

  • Using random data, design a bunch of trading rule variations to capture the effects I want to exploit, eg market trends that last a month or so
  • Design and calibrate a method for allocating asset weights under uncertainty, using random data (I'll cover this in the final post of this series)
  • Use the allocation method and real data to set the forecast weights (allocations to trading rule variations) and instrument weights, on a pure out of sample basis (eg an expanding window; there's a sketch of this below).
  • Using random data, decide on my capital scaling strategy - volatility target, use of Kelly scaling to reduce positions in a drawdown, trading the equity curve and so on (I'll give an example of this in the next post of this series).
Notice we only use real data once - the minimum possible.
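As an aside, here is a minimal sketch of what a pure out of sample, expanding window, fit looks like in practice: the weights used in each year are estimated only from the data available before that year. The function names are mine, and the 'optimiser' is a trivial stand-in:

import numpy as np
import pandas as pd

DAYS_IN_YEAR = 256

def equal_weights(past_returns):
    # Trivial stand-in for a real portfolio optimiser
    n_assets = past_returns.shape[1]
    return np.repeat(1.0 / n_assets, n_assets)

def expanding_window_weights(returns, fit_function):
    weight_frames = []
    n_years = len(returns) // DAYS_IN_YEAR
    for year in range(1, n_years):
        past = returns.iloc[: year * DAYS_IN_YEAR]  # fit on the past only
        period = returns.iloc[year * DAYS_IN_YEAR : (year + 1) * DAYS_IN_YEAR]
        weights = fit_function(past)                # never sees 'period'
        weight_frames.append(pd.DataFrame([weights] * len(period), index=period.index))
    return pd.concat(weight_frames)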

Example: Properties of backtested equity curves

To end this post, let's look at a simple example. Suppose you want to know how likely it is that you will see certain returns in a given live trading record, based on your backtest. You might be interested in:

  • The average return, volatility and skew
  • The likely distribution of daily returns
  • The likely distribution of drawdown severity
To do this we're going to assume that:

  • We know roughly what Sharpe ratio to expect (from a backtest or based on experience)
  • We know roughly what skew to expect (ditto)
  • We have a constant volatility target (this is arbitrary; let's make it 20%) which on average we achieve
  • We reduce our capital at risk when we make losses, according to Kelly scaling (i.e. a 10% fall in account value means a 10% reduction in risk; in practice this means we deal in percentage returns - see the sketch below)
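That last assumption is worth a tiny sketch. Under Kelly scaling, risk (and hence P&L) is proportional to current capital, which is exactly what compounding percentage returns gives you. The numbers here are purely illustrative:

import numpy as np

rng = np.random.default_rng(42)
perc_returns = rng.normal(0.0004, 0.0125, 2500)  # illustrative daily returns, ~20% annual vol

capital = 1.0
for r in perc_returns:
    capital *= (1.0 + r)  # bet size shrinks and grows with capital

# The loop is just a compounded product of (1 + r)
print(capital, np.prod(1.0 + perc_returns))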
Python code is here

Let's go through the python code and see what it's doing (some lines are left out for readability). First we create 1,000 equity curves, with the default annualised volatility target of 20%. If your computer is slow, or you have no patience, feel free to reduce the number of random curves.

length_backtest_years=10

number_of_random_curves=1000

annualSR=0.5

want_skew=1.0

random_curves=[skew_returns_annualised(annualSR=annualSR, want_skew=want_skew, size=length_bdays)

               for NotUsed in range(number_of_random_curves)]

We can then plot those:

plot_random_curves(random_curves)

show()
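The function plot_random_curves comes from the linked code; a hand-rolled equivalent might look like this sketch, with a light line for each curve and a dark line for their average:

import numpy as np
import matplotlib.pyplot as plt

def plot_random_curves(random_curves):
    # Additive equity curves: one light line per random draw
    curves = np.cumsum(np.array(random_curves), axis=1)
    for curve in curves:
        plt.plot(curve, color="lightgray", linewidth=0.5)
    # The dark line is the average across all the random curves
    plt.plot(curves.mean(axis=0), color="black", linewidth=2.0)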

Each of the light lines is a single equity curve. Note that to make the results clearer here I am adding up percentage returns, rather than applying compound interest properly. Alternatively I could graph the equity curves on a log scale to get the same picture.

All have an expected Sharpe Ratio of 0.5, but over 10 years their total return ranges from losing 'all' their capital (not in practice, if we are using Kelly scaling), to making three times the initial capital (again, in practice we would make more).

The dark line shows the average of these curves. So on average we should expect to make back our initial capital (20% vol target, multiplied by a Sharpe Ratio of 0.5, over 10 years). Notice that the cloud of equity curves around the average gives us an indication of how uncertain our actual performance over 10 years will be, even though we know for sure what the expected Sharpe Ratio is*. This picture alone should convince you that any one backtest is just a random draw.

* Of course in the real world we don't know what the true Sharpe Ratio is. We just get one of the light coloured lines when we run a backtest on real financial data. From that we have to try and infer what the real Sharpe Ratio might be (assuming that the 'real' Sharpe doesn't change over time ... ). This is a very important point - never forget it.

Note that to do our analysis we can't just look at the statistics of the black line. It has the right mean return, but it is way too smooth! Instead we need to take statistics from each of the lighter lines, and then look at their distribution.

Magic moments

Mean annualised return

function_to_apply=np.mean

results=pddf_rand_data.apply(function_to_apply, axis=0)

## Annualise (not always needed, depends on the statistic)

results=[x*DAYS_IN_YEAR for x in results]

hist(results, 100)

As you'd expect, the average return clusters around the expected value of 10% (with a Sharpe ratio of 0.5 and a 20% annualised volatility target, that is what you'd expect). But it's not unlikely that even over 10 years we'd see losses.

Volatility

function_to_apply=np.std

results=pddf_rand_data.apply(function_to_apply, axis=0)

## Annualise (not always needed, depends on the statistic)

results=[x*ROOT_DAYS_IN_YEAR for x in results]

hist(results, 100)

The realised annual standard deviation of returns is much more stable. Of course this isn't realistic. It doesn't account for the fact that a good trading system will reduce its risk when opportunities are poor, and vice versa. It assumes we can always target volatility precisely, and that volatility doesn't unexpectedly change, or follow a different distribution to the skewed Gaussian we are assuming here.

But all those things apply equally to the mean - and that is still extremely unstable, even in the simplified random world we're using here.

Skew

import scipy.stats as st

function_to_apply=st.skew

results=pddf_rand_data.apply(function_to_apply, axis=0)

hist(results, 100)

Skew is also quite stable (in this simplified random world). I plotted this to check that the random process I'm using reproduces the right skew (in expectation).

Return distribution

function_to_apply=np.percentile

function_args=(100.0/DAYS_IN_YEAR,)

results=pddf_rand_data.apply(function_to_apply, axis=0, args=function_args)

hist(results, 100)

This graph answers the question "Over a 10 year period, how bad should I expect my typical worst 1 in 250 business day (once a year) loss to be?" (with the usual caveats). As an exercise for the reader, you can try to reproduce similar results for different percentile points and different return periods.
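For example, a sketch of that exercise for several percentile points at once, reusing the pddf_rand_data frame from above:

import numpy as np
import matplotlib.pyplot as plt

for percentile_point in [0.1, 1.0, 5.0]:
    results = pddf_rand_data.apply(np.percentile, axis=0, args=(percentile_point,))
    plt.hist(results, 100, alpha=0.5, label="%.1f percentile" % percentile_point)
plt.legend()
plt.show()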

Drawdowns

Average drawdown

results=[x.avg_drawdown() for x in acccurves_rand_data]

hist(results, 100)

So most of the time an average drawdown of around 10 - 20% is expected. However there are some terrifying random equity curves where your average drawdown is over 50%.

Really horrific drawdowns

results=[x.worst_drawdown() for x in acccurves_rand_data]

hist(results, 100)

So over 10 years you'll probably get a worst drawdown of around 40%. You might look at the backtest and think "I can live with that". However it's not impossible that you'll lose three quarters of your capital at times if you're unlucky. You really do have to be prepared to lose all your money.
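The methods avg_drawdown and worst_drawdown belong to the account curve objects in my code; a standalone sketch of the underlying calculation, on a vector of percentage returns, might look like this:

import numpy as np

def drawdown_series(perc_returns):
    # Additive account curve, and its fall from the high water mark
    curve = np.cumsum(perc_returns)
    high_water = np.maximum.accumulate(curve)
    return curve - high_water  # zero at new highs, negative in a drawdown

def avg_drawdown(perc_returns):
    dd = drawdown_series(perc_returns)
    return dd[dd < 0].mean()  # average depth, ignoring days at new highs

def worst_drawdown(perc_returns):
    return drawdown_series(perc_returns).min()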

Clearly I could go on and calculate all kinds of fancy statistics; but I hope you get the idea, and that the code is clear enough.

What's next

As promised there will be two more posts in this series. In the next post I'll look at using random backtest returns to see if 'trading the equity curve' is a good idea.  Finally I'll explore why you should use robust portfolio optimisation techniques rather than any alternative.
