Kurtosis and expected returns
In my last post, I stated my intention to write a series of posts about skew.
Slight change of plan, since one loyal reader suggested that I write about kurtosis. I thought that might be fun, since I haven't thought about kurtosis much, and the literature on kurtosis isn't as well developed. It turns out that considering both together leads to some very interesting results.
The plan is to basically repeat my previous analysis of skew for kurtosis. Then my next posts on this subject will talk about both skew AND kurtosis. Hope that makes some kind of sense.
Not "everything you always wanted to know about kurtosis, but were afraid to ask", but enough to understand this post
The first four moments of a distribution are:
- mean
- standard deviation
- skew
- kurtosis
In layman's terms, these define:
- how high or low the "center"* of a distribution is
- how wide the distribution is
- how symmetric the distribution is, or isn't
- whether the distribution is mostly lumped into the middle, or spreads out to the edges
* note I've used the comically vague term 'center' here to avoid any mean vs median arguments
High kurtosis then means extreme events are more common than a vanilla Gaussian distribution would suggest. High kurtosis means fat tails. High kurtosis means every financial time series, ever.
Interpreting the first 3 moments is pretty easy, kurtosis less so. In this post I'm going to be using the standard kurtosis measure used in pandas. For reasons too boring to go into, the kurtosis of a normal distribution is 3. The pandas measure looks at excess kurtosis, so a figure of 0 means 'the normal amount of kurtosis'*, and any positive number means more than that. What does a kurtosis of 1.0 mean? Or 10.0? No idea really, since I don't have any intuition for the figure - one of the reasons to do this post is to get some.
* It's worth checking this yourself by generating random Gaussian data and measuring the kurtosis. I leave this as an exercise for the reader.
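A minimal sketch of that exercise, assuming numpy and pandas are available. The seed, the sample size and the Student's t comparison are my own illustrative choices, not anything from the original analysis:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility only

# One million draws from a standard normal distribution
gaussian_returns = pd.Series(rng.standard_normal(1_000_000))

# pandas reports *excess* kurtosis, so Gaussian data should come out close to 0
gaussian_kurtosis = gaussian_returns.kurt()

# A fat-tailed comparison (hypothetical): Student's t with 5 degrees of freedom
fat_tailed_returns = pd.Series(rng.standard_t(df=5, size=1_000_000))
fat_tailed_kurtosis = fat_tailed_returns.kurt()

print(f"Gaussian excess kurtosis:     {gaussian_kurtosis:.3f}")
print(f"Student-t(5) excess kurtosis: {fat_tailed_kurtosis:.3f}")
```

The Gaussian figure should sit very close to zero, while the fat-tailed figure should be clearly positive. Sample kurtosis of fat-tailed data is itself very noisy, so the second number will move around a lot from seed to seed.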
I won't be repeating the code since it is identical to the previous post, but with the string 'skew' replaced with 'kurtosis'.
Niggle: Not quite true, thanks to the bizarre and ever changing pandas API. These will all give averages over the whole Series:
percentage_returns[code].mean()
percentage_returns[code].std()
percentage_returns[code].skew()
percentage_returns[code].kurtosis(0) # omitting the 0 will give a rolling figure(!)
percentage_returns[code].kurt() # same as kurtosis(0)
Variance in kurtosis estimates
So how accurately can we measure kurtosis? Here are some bootstrapped distributions of the sample kurtosis estimate. Firstly, here it is for EURUSD, which has a relatively low kurtosis:

Now here are US 2 year bonds, which have a relatively high kurtosis:

Somewhat unsurprisingly, the higher the kurtosis the wider the estimate. Big kurtosis means outliers; resampling means we will sometimes catch only a few of the outliers and sometimes get more than our fair share of them: so a large variation in possible kurtosis.
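The resampling idea can be sketched roughly as follows. This is not the code behind the plots: the function name `bootstrap_kurtosis`, the number of resamples and the synthetic return series are all assumptions made for illustration, with Gaussian and Student's t draws standing in for a low and a high kurtosis instrument.

```python
import numpy as np
import pandas as pd

def bootstrap_kurtosis(returns: pd.Series, n_resamples: int = 500, seed: int = 0) -> pd.Series:
    """Resample returns with replacement and estimate excess kurtosis each time."""
    rng = np.random.default_rng(seed)
    values = returns.dropna().to_numpy()
    estimates = [
        pd.Series(rng.choice(values, size=len(values), replace=True)).kurt()
        for _ in range(n_resamples)
    ]
    return pd.Series(estimates)

# Synthetic stand-ins for real daily returns (roughly ten years of data each)
rng = np.random.default_rng(1)
low_kurtosis_returns = pd.Series(rng.standard_normal(2500))
high_kurtosis_returns = pd.Series(rng.standard_t(df=3, size=2500))

low_estimates = bootstrap_kurtosis(low_kurtosis_returns)
high_estimates = bootstrap_kurtosis(high_kurtosis_returns)

# Compare the spread of the two bootstrapped distributions
print(low_estimates.quantile([0.05, 0.5, 0.95]))
print(high_estimates.quantile([0.05, 0.5, 0.95]))
```

The fat-tailed series should show a much wider spread of estimates, because each resample picks up a different number of the extreme observations.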
Let's do a boxplot for everything:

Unlike skew there are no obvious patterns here; assets we'd expect to have similar kurtosis (like US bond futures) are all over the shop: 20 years are low, 5 years have a bit more, 10 years a bit more again, and 2 years have absolutely loads of the stuff.
Some of this is due to time varying regimes, a problem I will leave until later in the post. For example here are US 2 year daily returns:

Pretty crazy in the financial crisis, and then they calm down somewhat. Measuring kurtosis post 2009 is likely to give a totally different answer.
Do assets with generally higher kurtosis have higher returns?
It was easy to tell a story about skew preference; negative skew is clearly a bad thing and we should be rewarded for holding it.
Should we get paid for high kurtosis? There are two possibilities. If the kurtosis is coming from a positive fat tail we might expect people to overpay for the chance to 'win a lottery': they will prefer high kurtosis, and there will be higher expected returns for low kurtosis assets. But if the kurtosis is coming from a negative fat tail then people will dislike it a lot.
Anyway, on average we're not paid for kurtosis:
![]()
X-axis: kurtosis over whole sample. Y-axis: average daily return
Ignoring the two vol markets, whose kurtosis is nowhere near as bad as their skew, the relationship looks ... slightly negative?
Here's a boxplot showing the distribution of resampled daily returns for high kurtosis (over 5.5) versus low kurtosis (under 5.5) instruments:

It does look like low kurtosis is better than high, suggesting the 'lottery ticket preference' is holding up here: people overpay for high kurtosis. But we need to condition on skew to determine whether that hypothesis is correct or not.
Incidentally if you are worried about the vol markets, VIX comes in as low kurtosis and V2X as high.
Does the above finding still hold for risk adjusted returns, i.e. Sharpe Ratios?

Does an asset with currently higher kurtosis outperform one which has lower current kurtosis? (time series forecasting)
As with skew I'm going to measure kurtosis over different time periods; from a week up to a year of historical returns. Then I will do a t-test to see if assets that currently have higher kurtosis than the global median (about 5.5) outperform those with lower kurtosis.
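A rough sketch of this kind of test, under several assumptions of mine: it uses a single synthetic return series rather than the pooled set of instruments, the next day's return as the 'future' return, the series' own in-sample median of trailing kurtosis as the cutoff instead of the global median, and a hand-rolled Welch t-statistic. The function names and the lookbacks (in business days) are hypothetical.

```python
import numpy as np
import pandas as pd

def welch_t_statistic(sample_a: pd.Series, sample_b: pd.Series) -> float:
    """Welch's t-statistic for the difference in means (a minus b)."""
    a, b = sample_a.dropna(), sample_b.dropna()
    standard_error = np.sqrt(a.var() / len(a) + b.var() / len(b))
    return float((a.mean() - b.mean()) / standard_error)

def high_vs_low_kurtosis_t_stat(returns: pd.Series, lookback: int) -> float:
    """Compare next-day returns following high versus low trailing kurtosis."""
    trailing_kurtosis = returns.rolling(lookback).kurt()
    cutoff = trailing_kurtosis.median()  # in-sample median, a stand-in cutoff
    next_day_return = returns.shift(-1)  # return earned after kurtosis is observed
    high = next_day_return[trailing_kurtosis > cutoff]
    low = next_day_return[trailing_kurtosis <= cutoff]
    return welch_t_statistic(high, low)

# Synthetic fat-tailed daily returns standing in for real data
rng = np.random.default_rng(7)
synthetic_returns = pd.Series(0.01 * rng.standard_t(df=4, size=3000))

lookbacks = [5, 10, 21, 63, 126, 252]  # roughly a week up to a year
t_stats = {
    lookback: high_vs_low_kurtosis_t_stat(synthetic_returns, lookback)
    for lookback in lookbacks
}

for lookback, t_stat in t_stats.items():
    print(f"lookback {lookback:>3} days: t-statistic {t_stat:+.2f}")
```

A positive t-statistic here means returns following high trailing kurtosis were higher on average. On independent synthetic data like this there is no real effect to find, so the numbers should just be noise.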
First the Sharpe Ratios, conditional on recent kurtosis:

It looks like, unlike skew, the preference for high kurtosis is something that shows up most strongly at short horizons.
Now the t-statistics, comparing low and high kurtosis:

Basically noise.
How do current skew and kurtosis forecast future returns? (time series forecasting)
As I mentioned above there is a big difference between high kurtosis coming from positive returns, and the same from negative returns. Perhaps we will see something more interesting if we look at the combination of skew and kurtosis.
Same as before, different frequencies, but this time we look at both skew and kurtosis preceding the date when we estimate a forward looking SR. First Sharpe Ratios:

This is without doubt the most interesting graph so far.
(Sure, but quite a low bar to beat...)
Remember we had two hypotheses about kurtosis:
- People dislike lumpy returns, and want to avoid them. High kurtosis should always pay more than low kurtosis.
- People only dislike lumpy returns if they're negative. They're happy to pay more for lottery tickets. High kurtosis should outperform low kurtosis for positive skew assets. For negative skew assets the relationship should be reversed
Let's summarise the findings of this graph and the previous post:
- Negative skew* assets outperform, most strongly at longer horizons (from my previous blog post).
- High kurtosis assets underperform, most strongly at shorter horizons (discussed earlier)
- Within assets that have high kurtosis, at short horizons positive skew is rewarded. At longer horizons there is nothing meaningful.
- Within assets that have low kurtosis, at longer horizons negative skew is rewarded. At shorter horizons there is nothing meaningful.
- Within assets that have positive skew, at short horizons high kurtosis is rewarded
- Within assets that have negative skew, kurtosis is irrelevant
* by the way I'm using zero as my skew cutoff here for simplicity, as in the previous post I found it didn't make much difference. For kurtosis there is no 'natural' cutoff, so I'm sticking to the historical sample median of around 5.5
Or to put it another way:
- The dominance of negative skew assets at longer horizons is only relevant for assets with low kurtosis.
- The outperformance of high kurtosis assets at shorter horizons is only relevant for assets with positive skew.
I decided to boil the above down to two simple trading rules:
The skew rule
{(Skew - Average skew) / Sigma [Skew]}
* sign (Kurtosis - Average kurtosis)
The kurtosis rule
{(Kurtosis - Average kurtosis) / Sigma [Kurtosis]}
* sign (Skew - Average skew)
... where for this specification the average is across all instruments and all past data (currently done entirely in sample, but in a trading rule it would be based on an expanding window), and sigma is a standard deviation based on the past history of all instruments (the sigma is not important now, but will be when we come to design trading rules to ensure we have properly normalised forecasts).
This has the advantage of being fairly parsimonious and symmetrical, albeit a bit non linear. There is still potentially an issue with implicit fitting, but we will deal with that in later posts.
Under these conditions we would have the following positions in both rules:
- High kurtosis, high skew: Long (profitable at short horizons)
- High kurtosis, low skew: Short (does relatively badly at short horizons)
- Low kurtosis, high skew: Short (does relatively badly, especially at long horizons)
- Low kurtosis, low skew: Long (does relatively well at long horizons)
(It's possible to combine these into a single rule, but I like the idea of having a skew and a kurtosis rule, as the effects work differently at different horizons)
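The two rules and the four positions above can be sketched as follows. This is an illustration only: the function name `conditioned_rule`, the instrument labels A to D and their skew and kurtosis figures are made up, and the averages and sigmas are taken over this tiny cross section rather than over all instruments and all past data.

```python
import numpy as np
import pandas as pd

def conditioned_rule(primary: pd.Series, conditioner: pd.Series) -> pd.Series:
    """Normalise `primary` by its mean and sigma, then flip the sign
    according to whether `conditioner` is above or below its own mean."""
    normalised = (primary - primary.mean()) / primary.std()
    return normalised * np.sign(conditioner - conditioner.mean())

# Hypothetical figures, one instrument per skew / kurtosis combination
skew = pd.Series({"A": 1.0, "B": -1.0, "C": 1.0, "D": -1.0})
kurtosis = pd.Series({"A": 8.0, "B": 8.0, "C": 3.0, "D": 3.0})

skew_rule = conditioned_rule(skew, kurtosis)      # skew, conditioned on kurtosis
kurtosis_rule = conditioned_rule(kurtosis, skew)  # kurtosis, conditioned on skew

positions = pd.DataFrame({"skew rule": skew_rule, "kurtosis rule": kurtosis_rule})
print(positions)
```

Instrument A (high kurtosis, high skew) and D (low kurtosis, low skew) come out long in both rules, while B and C come out short in both, matching the list of positions.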
Let's look at the t-statistics:

(For example 'pos skew rule' is the kurtosis rule applied when skew is positive and so on; really this should be 'pos skew, kurtosis rule' but you get the idea).
Here a positive t-statistic means a rule is working. It looks like all the rules work pretty well at a one month frequency, with the skew rule working especially well for longer periods when kurtosis is low.
Does an asset with different skew / kurtosis than normal perform better than average (normalised time series)?
We can modify the rules above so that instead of using the average across all assets we use the average for a given instrument (we can also adjust the standard deviation once we get to producing actual forecasts).
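One way this per-instrument version might look, as a sketch under assumptions: the function name `own_history_rule`, the 63 day estimation window and the `min_periods` value are hypothetical, the returns are synthetic, and I have also used the instrument's own expanding sigma for normalisation, which goes a step beyond what is described here.

```python
import numpy as np
import pandas as pd

def own_history_rule(primary: pd.Series, conditioner: pd.Series,
                     min_periods: int = 20) -> pd.Series:
    """Demean by the instrument's own expanding-window history, so that
    each value is compared only with data available at that date."""
    primary_mean = primary.expanding(min_periods=min_periods).mean()
    primary_sigma = primary.expanding(min_periods=min_periods).std()
    conditioner_mean = conditioner.expanding(min_periods=min_periods).mean()
    normalised = (primary - primary_mean) / primary_sigma
    return normalised * np.sign(conditioner - conditioner_mean)

# Synthetic fat-tailed daily returns for a single instrument
rng = np.random.default_rng(3)
synthetic_returns = pd.Series(0.01 * rng.standard_t(df=4, size=1000))

# Trailing estimates over roughly three months of daily returns
rolling_skew = synthetic_returns.rolling(63).skew()
rolling_kurtosis = synthetic_returns.rolling(63).kurt()

skew_rule_forecast = own_history_rule(rolling_skew, rolling_kurtosis)
kurtosis_rule_forecast = own_history_rule(rolling_kurtosis, rolling_skew)

print(skew_rule_forecast.dropna().tail())
print(kurtosis_rule_forecast.dropna().tail())
```

Using an expanding window keeps the demeaning backward looking, at the cost of noisy averages early in the history.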

Interestingly the rules appear to be bad at the original sweet spot, although skew conditioned on low kurtosis still does very well at longer horizons.
Now let's demean by the current average across all assets:


Ouch. A pretty poor performance. The skew rules (pink and green) in particular are very sensitive to frequency.
Finally let's do the same thing, but this time demean by the current median skew and kurtosis for a given asset class.


Again, not really the best result.
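The two cross sectional demeaning variants above can be sketched like this. The instrument names, the asset class mapping and the kurtosis figures are invented for illustration, and the real analysis may organise its data differently.

```python
import pandas as pd

# Hypothetical kurtosis estimates: rows are dates, columns are instruments
kurtosis = pd.DataFrame(
    {"US2": [10.0, 12.0], "US10": [4.0, 6.0], "EURUSD": [2.0, 3.0], "GBPUSD": [4.0, 5.0]},
    index=pd.to_datetime(["2019-01-01", "2019-01-02"]),
)

# Hypothetical mapping from instrument to asset class
asset_class = pd.Series(
    {"US2": "Bonds", "US10": "Bonds", "EURUSD": "FX", "GBPUSD": "FX"}
)

# Variant 1: subtract the current average across all assets on each date
demeaned_all_assets = kurtosis.sub(kurtosis.mean(axis=1), axis=0)

# Variant 2: subtract the current median within each asset class on each date.
# Transposing lets us group the instruments (now rows) by asset class.
class_median = kurtosis.T.groupby(asset_class).transform("median").T
demeaned_by_class = kurtosis - class_median

print(demeaned_all_assets)
print(demeaned_by_class)
```

The same pattern applies to skew. With only a handful of instruments in an asset class the median is dominated by one or two markets, which may be part of why these variants are noisier.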
Summary
- It's hard to estimate kurtosis with any certainty, even harder when kurtosis is large (outliers)
- Unexpectedly we don't get paid for owning assets with high kurtosis
- .... and then it gets complicated
Yes, there are an awful lot of results in this post!
The key finding is that, as you would expect, skew and kurtosis have more forecasting power when they're conditioned on each other. Generally we want to own instruments which have had high kurtosis and relatively positive skew: these are lottery tickets which for some reason the market undervalues. We also want to own instruments that have low kurtosis and relatively negative skew; here we get rewarded for negative skew without suffering too many outliers. Instruments where skew and kurtosis are in opposite directions are less attractive.
These results do not persist that well when we use different demeaning methods, unlike in skew world where they hold up pretty well.
It's worth reflecting on what I have done so far. In the last post I considered four different skew trading rules (outright, time series demean, cross sectional demean, asset class cross sectional demean). In this one I've effectively come up with another 8: 4 for skew conditioned on kurtosis, 4 for kurtosis conditioned on skew. That's a total of 12 different trading rules, each of which potentially has 6 different variations for different lookbacks.
Though it would be tempting to select a few of these for further testing that would be implicit fitting; I would be doing so based on the analysis I have done so far having looked at all the data. Instead the right and proper thing will be to take forward all 12 rules into an analysis where their risk weights are fitted systematically in a backward looking framework. So that's the next post.
