A Monte-Carlo of I.I.D. Normal Innovations

by Graham Giller March 31, 2009 00:50

When talking about the SPX data, I glibly asserted that the data was evidently not I.I.D. normal. I then proceeded to show how the Generalized Error Distribution can be used to describe the data quite well and to reject the hypothesis that the data is I.I.D. Normal with a reasonable degree of confidence.

It occurred to me that some readers of this blog might be a little less familiar with eyeballing financial data sets, so it might be interesting to generate such a sample path in a Monte-Carlo simulation of the process.

GARCH Analysis of IID Normal Data

The above plot has four panes. In the upper left we show the aggregation (cumulative sum) of about 2500 IID Normal draws (this is approximately the number of business days in a decade). There is no drift and no heteroskedasticity in this data set; it is a pure random walk. Below that is the time series of estimated GARCH volatility. You can see that the process essentially discovers the population variance of 1, but it is occasionally kicked away from that value by outlying innovations. In the upper right is the time series of innovations. In my normal analysis this is after standardizing by dividing each innovation by the standard deviation forecast, using the GARCH model, from the prior data. For this particular dataset, the standardization has no particular impact. The series of well defined fuzz essentially represents the time series of a homoskedastic dataset, and its visual appearance is in stark contrast to that of a heteroskedastic series, such as that presented in the earlier post about the GED.
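A minimal sketch of how such a sample path could be generated, using only the Python standard library. The function name, the seed and the use of Python itself are assumptions made for illustration; this is not the code that produced the plot.

```python
import random
import statistics

def simulate_random_walk(n=2500, seed=0):
    """Draw n IID N(0, 1) innovations and accumulate them into a driftless walk."""
    rng = random.Random(seed)          # fixed seed (an assumption) for repeatability
    innovations = [rng.gauss(0.0, 1.0) for _ in range(n)]
    path, level = [], 0.0
    for e in innovations:
        level += e                     # running sum of the innovations
        path.append(level)
    return innovations, path

innovations, path = simulate_random_walk()
sample_var = statistics.pvariance(innovations)   # should sit close to the population value of 1
print(f"draws={len(path)} sample variance={sample_var:.3f} final level={path[-1]:.2f}")
```

With about 2500 draws the sample variance should land within a few percent of 1, which is the value the GARCH estimate in the plot hovers around.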

At the bottom, I have replaced the histogram of the standardized innovations with a plot illustrating the empirical distribution function and comparing it to the cumulative distribution function of the Normal Distribution. The maximum distance between these two curves, after scaling, is the test statistic used in the Kolmogorov-Smirnov test, which is a powerful, distribution free, bin free test for univariate distribution identification. We see a p-Value of 21% for this data, which represents the probability of finding a maximum distance at least as large as that of the sample if the null hypothesis is true. This clearly cannot be used to reject the null hypothesis in this case.
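The Kolmogorov-Smirnov comparison described above can be sketched as follows. This is an illustration under stated assumptions: the null is the standard Normal, the p-value uses the asymptotic Kolmogorov series (reasonable for samples of this size), and the function names are hypothetical.

```python
import math
import random

def normal_cdf(x):
    """Cumulative distribution function of the standard Normal."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def ks_test_normal(sample):
    """Return (D, p): the maximum EDF-to-CDF distance and its asymptotic p-value."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x)
        # The EDF jumps from i/n to (i+1)/n at x, so check both sides of the step.
        d = max(d, (i + 1) / n - f, f - i / n)
    lam = math.sqrt(n) * d             # the scaled distance
    # Asymptotic Kolmogorov distribution: p = 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 lam^2)
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, p))

rng = random.Random(0)
draws = [rng.gauss(0.0, 1.0) for _ in range(2500)]
d_null, p_null = ks_test_normal(draws)                     # data that obey the null
d_alt, p_alt = ks_test_normal([x + 1.0 for x in draws])    # same data shifted by 1
print(f"null: D={d_null:.4f} p={p_null:.3f}   shifted: D={d_alt:.4f} p={p_alt:.3g}")
```

The shifted sample is included only to show what a clear rejection looks like next to data that genuinely follow the null.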

 


If Not Normal then What?

by Graham Giller March 24, 2009 20:54
In the previous post we illustrated the evident abnormality of financial data by examining the longitudinal returns of the S&P 500 Index.

 

I used the Generalized Error Distribution as it can be smoothly transformed from a Normal Distribution into a leptokurtotic distribution, and that allowed me to use the Maximum Likelihood Ratio Test to distinguish between the null hypothesis (that the data is I.I.D. Normal) and the alternative hypothesis (that it is not).

I subscribe to the theory that if something is right you should be able to draw the same conclusions via various methods and data sets. So I am going to look again at the likely models for the innovations of financial data (we're taking a GARCH(1,1) model as given); but, this time, I decided to look at the S&P Goldman Sachs Commodity Index and to use a test based on Pearson's χ² Test. (In the following the data is actually based on the first deliverable contract on the GSCI traded at the CME.)

Before that, however, we should discuss what the possible options are for the PDF of the process innovations. The candidates are:

  • The Normal Distribution
  • Levy Flight
  • The Generalized Error Distribution
  • Student's t Distribution
  • something else…

 

Benoît Mandelbrot discussed the long tails in the distribution of the changes in commodity prices in his early papers on finance (these are collected in Fractals and Scaling In Finance). He found that the tails appeared to exhibit a scaling property and advocated the Levy Flight as it is a stable distribution with this scaling property. His work was decades pre-GARCH and the distribution has many undesirable properties, such as a divergent variance. I will not model it here.

Clearly we view the Normal Distribution as a non-starter. The GED and the Student's t Distribution both provide excellent matches to data and can provide quite leptokurtotic forms. Student's t has some difficulties for other parts of finance theory — only moments of order less than the degrees of freedom parameter exist which makes it difficult to use the negative exponential utility. GED does not have problems with negative exponential utility.

After a long-winded discussion, we will advance rapidly through the results. A GARCH(1,1) model was fitted to the daily changes of the first deliverable GSCI future (combined to produce a synthetic series). n.b. I haven't reserved an out-of-sample period for the data as I view the utility of empirical GARCH as so well established that I don't think we really need to test for it. The analysis includes a histogram of the daily innovations extracted from the data. This histogram is fitted to a p.d.f. form using a maximum likelihood method and Poisson statistics for the bin counts. After the fit, a d.o.f. corrected χ² statistic is computed.

GARCH(1,1) models for GSCI and various PDFs (3 pages)

The results are: for the Normal Distribution, χ²/dof = 766.024145/98 with a p-Value of less than 0.00000001; for the GED, 126.231861/97 with a p-Value of 0.02468288; and for the Student-t, 101.748435/97 with a p-Value of 0.35078429. Remember that for a χ² distribution with n degrees of freedom the mean is n and the variance is 2n. The Normal Distribution is convincingly rejected; the Student t is the best fit and the GED is ok — neither can be rejected at a reasonable confidence level such as 0.001.
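As a rough cross-check, the quoted p-Values can be approximated from the χ² statistics and degrees of freedom alone. The sketch below uses the Wilson-Hilferty cube-root normal approximation to the χ² upper tail rather than an exact routine, so small differences in the last digits are to be expected; the function name is hypothetical.

```python
import math

def chi2_sf_wilson_hilferty(x, dof):
    """Approximate P(chi-square with dof degrees of freedom > x)."""
    c = 2.0 / (9.0 * dof)
    # (X/dof)^(1/3) is approximately Normal with mean 1 - c and variance c
    z = ((x / dof) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Statistics and degrees of freedom as quoted in the text above.
fits = {
    "Normal": (766.024145, 98),
    "GED": (126.231861, 97),
    "Student-t": (101.748435, 97),
}
p_values = {name: chi2_sf_wilson_hilferty(x, dof) for name, (x, dof) in fits.items()}
for name, p in p_values.items():
    print(f"{name:10s} p ~ {p:.4g}")
```

The approximation loses accuracy far out in the tail, so for the Normal fit it should only be read as confirming that the p-Value is vanishingly small.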

 


Why Use the Generalized Error Distribution?

by Graham Giller March 19, 2009 13:53

This post is to address the question why use the Generalized Error Distribution? The subject, the evident abnormality of financial data, should be very familiar to the intended audience of this blog; but I'm going to summarize some basic facts here as there have been requests as to why the GED should be used.

Firstly, longitudinal returns of financial asset prices are evidently not described by the Normal Distribution. Many statements one hears along the lines of "a once in a hundred years" event are made in the context of comparing the scale of a realized event with its expected rate under the normal distribution. However, financial data are so clearly non-normal (more specifically not identically and independently distributed, or I.I.D., normal) that only a naive analyst would even start off an argument by discussing that hypothesis.

Abnormality of S&P 500 Returns

Even without doing any statistical tests, a cursory analysis of the time series of daily S&P 500 Index returns (the upper panel in the above figure) would suggest that the returns are not homoskedastic — or constant in variance.

The lower panel shows the best fit of the normal distribution form to a histogram of daily index returns. The fit is clearly poor, and the data shows the pattern typical of leptokurtotic data. There is a deficit of events in the sides of the distribution (in the region around ±1σ) and an excess in the centre and in the tails.

Since the data seems heteroskedastic, and since there seem to be episodes of heteroskedasticity, this data is clearly a candidate to try to fit a GARCH model to. It is possible for a GARCH model with normally distributed innovations to give rise, on its own, to the leptokurtotic distribution we observe in the histogram, so we should test for that.
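The point that normally distributed innovations can still produce fat-tailed returns can be illustrated with a small simulation. The GARCH(1,1) parameters below (omega, alpha, beta) are arbitrary assumptions chosen for illustration, not estimates from the SPX data.

```python
import math
import random

def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate GARCH(1,1) returns driven by IID standard Normal innovations."""
    rng = random.Random(seed)
    var = omega / (1.0 - alpha - beta)     # start at the unconditional variance
    returns = []
    for _ in range(n):
        r = math.sqrt(var) * rng.gauss(0.0, 1.0)
        returns.append(r)
        var = omega + alpha * r * r + beta * var   # next period's conditional variance
    return returns

def excess_kurtosis(xs):
    """Sample excess kurtosis (zero for a Normal distribution)."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return m4 / (m2 * m2) - 3.0

returns = simulate_garch11(50000, omega=0.05, alpha=0.10, beta=0.85)
kurt = excess_kurtosis(returns)
print(f"excess kurtosis of simulated returns: {kurt:.2f}")
```

The unconditional returns show positive excess kurtosis even though every innovation is Normal, which is why the distribution of the standardized innovations, rather than of the raw returns, is the thing to test.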

I'm interested in specifying the process distribution correctly because it directly affects the relative weighting of the various data periods in any regression analysis we do. Ordinary least squares is only the correct estimation procedure when the underlying data are i.i.d. normal. This procedure assumes that deviations at the level of 3σ–5σ, or more, are highly significant and will cause the estimated parameters to be chosen to explain these particular realizations more than those in the lower range.

In the case of the data above, the regression will listen strongly to the current period, although the process realization now may not be that characteristic of the entire period. One might argue that we should just replace OLS with generalized least squares which, if we weight with the appropriate covariance matrix, is equivalent to maximum likelihood estimation, which is a very powerful technique. However, this does not circumvent the problem of estimation based on the normal distribution treating 3σ–5σ residuals as highly significant whereas, under a leptokurtotic distribution, they are not particularly so.
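To put rough numbers on how differently the two kinds of distribution treat large residuals, the sketch below compares two-sided tail probabilities at 3σ, 4σ and 5σ. The Laplace distribution is used here only as a convenient stand-in for a leptokurtotic distribution; that choice is an assumption of this example and is not a distribution fitted anywhere in this post.

```python
import math

def normal_two_sided_tail(k):
    """P(|x| > k sigma) for a Normal distribution."""
    return math.erfc(k / math.sqrt(2.0))

def laplace_two_sided_tail(k):
    """P(|x| > k sigma) for a Laplace distribution scaled to unit variance."""
    # Variance of Laplace(b) is 2 b^2, so sigma = 1 means b = 1/sqrt(2),
    # and P(|x| > k) = exp(-k / b) = exp(-sqrt(2) k).
    return math.exp(-math.sqrt(2.0) * k)

tails = {k: (normal_two_sided_tail(k), laplace_two_sided_tail(k)) for k in (3, 4, 5)}
for k, (p_norm, p_lap) in tails.items():
    print(f"{k} sigma: Normal {p_norm:.2e}  Laplace {p_lap:.2e}  ratio {p_lap / p_norm:.0f}")
```

An estimator built on the Normal likelihood treats a 5σ residual as something close to a one-in-a-million event, whereas under the fatter-tailed alternative it is merely uncommon.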

The GED is useful because it can be smoothly transformed from a Normal distribution into a leptokurtotic distribution ("fat tails") or even into a platykurtotic distribution ("thin tails"). This allows us to use the maximum likelihood ratio test to test the hypothesis as to whether the GARCH process innovations are IID normal.
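A hedged sketch of such a likelihood ratio test follows. It assumes the common textbook parameterization f(x) = β/(2αΓ(1/β)) exp(−(|x|/α)^β) with zero location, in which β = 2 is the Normal case and β = 1 the Laplace case; the shape parameter used in the results quoted in this post is scaled differently. The grid search and the function names are likewise assumptions of this illustration.

```python
import math
import random

def ged_profile_loglik(xs, beta):
    """GED log-likelihood at shape beta, with the scale set to its MLE."""
    n = len(xs)
    s = sum(abs(x) ** beta for x in xs)
    alpha = (beta * s / n) ** (1.0 / beta)       # closed-form MLE of the scale
    # At the MLE, sum((|x|/alpha)^beta) equals n / beta.
    return n * (math.log(beta) - math.log(2.0 * alpha) - math.lgamma(1.0 / beta)) - n / beta

def lr_test_normal_vs_ged(xs):
    """Likelihood ratio test of the Normal null (beta = 2) within the GED family."""
    grid = [0.5 + 0.01 * i for i in range(351)] + [2.0]   # crude grid over the shape
    best_beta = max(grid, key=lambda b: ged_profile_loglik(xs, b))
    lr = 2.0 * (ged_profile_loglik(xs, best_beta) - ged_profile_loglik(xs, 2.0))
    lr = max(lr, 0.0)
    p = math.erfc(math.sqrt(lr / 2.0))           # upper tail of chi-square with 1 d.o.f.
    return best_beta, lr, p

# Synthetic fat-tailed data (Laplace draws), used only to exercise the test.
rng = random.Random(1)
sample = [rng.expovariate(1.0) * rng.choice((-1, 1)) for _ in range(2000)]
beta_hat, lr_stat, p_value = lr_test_normal_vs_ged(sample)
print(f"beta_hat={beta_hat:.2f} LR={lr_stat:.1f} p={p_value:.3g}")
```

Because the Normal is nested inside the GED family, twice the log-likelihood difference can be compared with a χ² distribution with one degree of freedom, which is what makes the test convenient.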

Results of MLR test for IID Normal SPX innovations

This test convincingly rejects the null hypothesis that the GARCH process innovations are normally distributed (shape=1). The estimated shape parameter, which controls the kurtosis of the distribution, is also approximately 6σ from the null hypothesis value.

In another post I will go into more depth about the various distributional choices that are available once one rejects the Normal.

 


Rounding --- An Implicit Buy High, Sell Low Strategy

by Graham Giller March 13, 2009 14:10
Last year, before the crash of the emerging markets – pro articulum in general – Prof. Jeremy Siegel was featured in an advert played regularly on CNBC for Wisdom Tree, talking about the inherent "buy high, sell low" strategy embedded in cap. weighted indices.

The basic problem is that when the price of a subset of the index increases then their weight relative to the rest of the index also increases. The index tracking investor is then required to buy more of those components, at their new higher price. If their prices should subsequently decline, then the index tracking investor will be required to sell a little of the investment, for the same reasoning as before, at the new lower price.

Unfortunately, stocks do regularly go up and down relative to each other and so the logic embedded in the previous paragraph represents an embedded buy high – sell low strategy which is overlaid over the basic strategy represented by the index. This is one of the defects of cap. weighted indices and will lead a fund manager that attempts to track such an index to underperform through no fault of their own.

The Markowitz Portfolio is constructed to be Mean-Variance efficient and weights components so that the expected risk-adjusted profit from each position is equal. However, cap. weighting doesn't follow any utility driven formalism and it explicitly contradicts known facts about the market (it overweights large cap. stocks whereas academic research by Fama and French indicates that small cap. stocks consistently outperform).

The adverts caught my attention because I had just tackled a similar buy high – sell low defect in the basket I own to track the Compact Model Portfolio. The portfolio that tracks the CMP Index is equally weighted, meaning that we allocate the same fraction of the overall equity to each individual investment.

Now equal weighting also has an embedded strategy, but in this case it is reversion rather than momentum. With an equal weighted basket, every time returns occur we need to reduce the position in the stocks that outperformed and increase the position in the stocks that underperformed, in order that we maintain the equal weighting. This is an embedded sell high – buy low strategy.
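The embedded reversion trade can be seen in a two-stock toy example. The share counts and prices below are made up for illustration.

```python
def rebalance_to_equal_weight(shares, prices):
    """Return the share trades that restore equal dollar weights at the given prices."""
    values = [s * p for s, p in zip(shares, prices)]
    target = sum(values) / len(values)          # equal dollar allocation per stock
    return [target / p - s for s, p in zip(shares, prices)]

# Two hypothetical stocks, 100 shares of each, both bought at $50.
shares = [100.0, 100.0]
new_prices = [55.0, 45.0]       # stock 0 outperformed, stock 1 underperformed
trades = rebalance_to_equal_weight(shares, new_prices)
print([round(t, 2) for t in trades])
```

The rebalance sells about 9 shares of the winner and buys about 11 shares of the loser, which is the sell high, buy low behaviour described above.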

I was aware of this, but as I watched my basket I realized that I kept repeating the opposite. On the daily rebalance, the strategy would buy some more of a stock that went up at the end of the day and then, the next day, if it lost money, it would sell at a loss. This was repeated again and again.

I finally realized that this was because I was rounding my position into round lots, of a given size. The conventional algorithm for rounding positive numbers is to add one half and then truncate to an integer. The number of lots to hold in a given company is the fraction of the capital allocated to that company divided by the product of the price and the lot size. Following conventional ½ rounding we tend to round up after we've made money and round down after we've lost money. This is an embedded buy high – sell low strategy.

I solved this by rounding against it. I round up on a losing day and round down on a winning day. i.e.

shares=lotsize×⌊capital/(lotsize×price)−½sign δprice⌋.

This seems to work.

n.b. The notation ⌊x⌋ means floor(x) which means the largest integer less than or equal to x. 
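A direct transcription of the rounding rule into code, next to the conventional ½ rounding for comparison. The function names and the example numbers are illustrative assumptions, and a zero price change is treated here as sign 0.

```python
import math

def sign(x):
    """Return +1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)

def conventional_shares(capital, price, lot_size):
    """Conventional rounding: add one half, then truncate to a whole number of lots."""
    return lot_size * math.floor(capital / (lot_size * price) + 0.5)

def contrarian_shares(capital, price, lot_size, delta_price):
    """The rule as written above: lotsize * floor(capital/(lotsize*price) - 0.5*sign(delta_price))."""
    lots = math.floor(capital / (lot_size * price) - 0.5 * sign(delta_price))
    return lot_size * lots

# $10,000 allocated to a $27 stock traded in lots of 100: the ideal holding is about 3.7 lots.
up_day = contrarian_shares(10000.0, 27.0, 100, delta_price=+1.0)
down_day = contrarian_shares(10000.0, 27.0, 100, delta_price=-1.0)
plain = conventional_shares(10000.0, 27.0, 100)
print(up_day, down_day, plain)
```

In this example the position is 300 shares after a winning day and 400 after a losing day, while conventional rounding would hold 400 in both cases.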

A Brief Summary of the Compact Model Portfolio

by Graham Giller March 09, 2009 22:39
In the very first post on this blog, I referred to the Compact Model Portfolio, which is a strategy I have researched and traded for a while.

The premise of the strategy is that the stock market is a voting mechanism for trading ideas (not a radical theory, I admit.) However, in this case we specifically assume that the market is able to pick the winning stocks but that it is not very good at trading them. i.e. That we can infer from market activity what the stocks we should own are, but that the market is not sufficiently efficient to eliminate the excess return that accrues to the owner of those stocks. I call this a semi-efficient markets approach.

The strategy is described in detail in the document linked to above and I do trade based upon this strategy. I find it appealing because it is a different way of looking at the market to the paradigm followed by traditional alpha trading. Basically, we examine the dollar volume of each stock in the market and use this to create a ranking based upon the market's interest in each company. We then cherry pick this ranking for a subset of stocks to hold in a portfolio.

Technically we are assuming that the ordinal ranks are efficiently expressed but that the cardinal ranks are not — that the market can pick the stocks to own but that it doesn't do a good job of trading them.

The purpose of this post is not to recommend this strategy as an investment vehicle for the general public. It is to highlight a different way of looking at things. Quantitative traders can get trapped into thought ruts, particularly if their methodology leads to some success for them. It's to answer the question "what does the market think I should hold?" This way, the analyst has a base portfolio to compare their holdings to that is not wedded to implicit biases and scales in the way the major market indices (as currently composed) are.

This portfolio might also not be a suitable starting point for many investors. Based on its current composition, the market is currently interested in holding SDS and SKF, which are ultrashort (i.e. two times leveraged) exchange traded funds. As a result of this holding, the index is currently profitable for this year. I recently added a link to the historical holdings of the Compact Model Portfolio. This allows one to follow the market's preferences as they change (it should be noted that the rankings involve a time scale of several months, so members will not change rapidly; they will drift up and down).

UPDATE (14:30 EDT): Of course, on the day I post this SKF and SDS are having a terrible day (at the time of writing SKF is down 21% and SDS down 11%). I hope this underscores the point that this post and the data associated with it should not be taken as blind investment calls; you should always verify that an investment is right for you.


Refreshed Hedge Fund Data

by Graham Giller March 07, 2009 23:24
We have a new month and new data for the dynamic trading risk factor. Last month's forecast was for a profit of 96 bp; however, the realization was a loss of 96 bp.

Dynamic Trading Risk Factor


The data series estimates update slightly to a sample mean of 43 bp/month drift (which has a t-Statistic of 2.30 and a p-value of 0.024). The sample standard deviation is 187 bp/month and the simple Sharpe ratio (the monthly mean divided by the monthly standard deviation, times the square root of 12) is 0.80.
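The arithmetic behind these summary figures can be checked from the rounded numbers quoted here, together with the 98 monthly observations mentioned in this post. Because the inputs are rounded, the recomputed t-Statistic differs slightly from the quoted 2.30. The quoted Sharpe ratio of 0.80 corresponds to the monthly mean-to-standard-deviation ratio scaled by the square root of 12.

```python
import math

mean_bp = 43.0      # sample mean, bp/month (rounded, from the text)
sd_bp = 187.0       # sample standard deviation, bp/month (rounded, from the text)
n_months = 98       # number of monthly data points mentioned in this post

t_stat = mean_bp / (sd_bp / math.sqrt(n_months))     # mean divided by its standard error
sharpe = (mean_bp / sd_bp) * math.sqrt(12.0)         # annualized from monthly figures

print(f"t-Statistic ~ {t_stat:.2f}, annualized Sharpe ratio ~ {sharpe:.2f}")
```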

I don't want to read a lot into the estimated form for the fit of a Generalized Error Distribution (after all, there are only 98 data points in total), but we note that it has a spectral index of approximately 0.5, which is indicative of a platykurtotic distribution, i.e. one with censored tails. This should be viewed skeptically as it is inconsistent with the sample excess kurtosis of 3.73.

The forecast for March'09 is a loss of 9 bp.  


About the Author

Graham Giller
Dr. Giller holds a doctorate from Oxford University in experimental elementary particle physics. His field of research was statistical astronomy using high energy cosmic rays. After leaving Oxford, he worked in the Process Driven Trading Group at Morgan Stanley, as a strategy researcher and portfolio manager. He then ran a CTA/CPO firm which concentrated on trading eurodollar futures using statistical models. From 2004, he has managed a private family investment office. In 2009, he joined a California based hedge fund startup, concentrating on high frequency alpha and volatility forecasting. My updated resume is on LinkedIn.


Disclaimer

Nothing on this site should be construed as a recommendation to buy or sell any specific security nor as a solicitation of an order to buy or sell any specific security. Before making any trade for any reason you should consult your own financial advisor. The author may hold long or short positions in any of the securities discussed either before or after publication of an article mentioning such a security.

Copyright Notice

All posts on this blog are © Copyright property of Giller Investments (New Jersey), LLC. All comments are the property of their respective authors and neither the author of this blog nor any entity associated with him is responsible for, or accepts any responsibility for, their content. Offensive comments and spam may be removed at the author's discretion.

Data provided on this blog or through links to this blog are either property of Giller Investments (New Jersey), LLC or publicly available or derived from data that is publicly available. Any data that is proprietary to Giller Investments (New Jersey), LLC is published here for the public interest and may be reproduced for private research or in public forums provided that suitable attribution and acknowledgement of ownership is made.
