Further Comparison of OLS and Support Vector Machine Models Out-of-Sample

by Graham Giller August 26, 2010 11:23

In the prior post we noted the outperformance of Support Vector Regression over OLS models out-of-sample. The Machine Learning community refers to this as a superior ability to generalize. I think that the enhanced statistical reliability, coupled with the fact that the univariate response model found by the SVM departs markedly from our prior prejudices regarding smooth and low order responses, is quite a striking result.

In this post we seek to replicate the response function of the SVM with a high-order polynomial model. This is to investigate whether the superior forecasting skill out-of-sample arises from the lower-order “wriggles” in the SVM response function or from the higher-order “kinks.” This is interesting because we can certainly replicate the lower-order features via classical linear methods, but it is unlikely that we can do such a thing for the higher-order features of the response. Thus we define our linear polynomial model as

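[Equation image not preserved: the response is modelled as a linear combination of the Legendre polynomials Pn(x) with coefficients βn, for n up to N.]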

Here the Pn(x) are Legendre Polynomials of order n. These are orthogonal on [-1,1], so they are a useful basis in which to express our response function. Because the functions are orthogonal, the estimators should be independent in expectation. In addition, following Vapnik, we reject the Occam's Razor driven methodology of standard classical statistical analysis, which seeks a parsimonious model (what my ex-boss, Peter Muller, used to refer to as “Keep It Simple Stupid”), and instead choose N large enough to match the testing set R² of the SVM.
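A minimal sketch of how such a fit could be set up in R follows. It is not the code used for the chart: the function name legendre_basis, the synthetic data, and the choice N = 5 are illustrative assumptions. The basis is built with Bonnet's recurrence, (n+1) P[n+1](x) = (2n+1) x P[n](x) - n P[n-1](x), and the coefficients are then estimated with lm.

```r
# Build the Legendre polynomials P0..PN at the points x (assumed to lie in [-1, 1])
legendre_basis <- function(x, N) {
  stopifnot(N >= 1)
  P <- matrix(0, nrow = length(x), ncol = N + 1)
  P[, 1] <- 1                      # P0(x) = 1
  P[, 2] <- x                      # P1(x) = x
  if (N >= 2) {
    for (n in 1:(N - 1)) {
      # column n + 2 holds P[n+1], column n + 1 holds P[n], column n holds P[n-1]
      P[, n + 2] <- ((2 * n + 1) * x * P[, n + 1] - n * P[, n]) / (n + 1)
    }
  }
  colnames(P) <- paste0("P", 0:N)
  P
}

# Synthetic stand-in data (an assumption, not the Compact Model Portfolio series)
set.seed(1)
x <- runif(500, -1, 1)
y <- sin(3 * x) + rnorm(500, sd = 0.1)

# P0 is dropped because lm supplies its own intercept
B <- legendre_basis(x, 5)
legmod <- lm(y ~ B[, -1])
print(summary(legmod)$r.squared)
```

Raising N in the call to legendre_basis and re-scoring on held-out data is one way to search for the order that matches a target testing set R².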

Comparison of SVM and 30th Order Legendre Polynomial Models for Daily Range

The above chart illustrates a 30th order Legendre polynomial model replicating the response of the Support Vector Machine and exhibiting an equivalent out-of-sample forecasting skill. From the point-of-view of classical inference, there is no way an analyst would ever suggest using such a high order model on this data, and the t-statistics for the βn coefficients are all small, yet this is the type of functional response picked out by the Support Vector Machine!

Comparison of SVR and OLS Models for Daily Range

by Graham Giller August 25, 2010 14:20

Continuing the recent theme on the application of Machine Learning and Interior Analysis, here we investigate the utility of Support Vector Regression methods versus Ordinary Least Squares. The job is to predict the daily range of the top ranked stock in the Compact Model Portfolio from the prior value of that metric. By Daily Range we mean the ratio of the difference between the Closing Price and Opening Price to the difference between the Highest Price and Lowest Price. I chose this particular metric because it is an interior metric but it doesn't use any microstructure information. It is also frequently discussed in non-academic literature.

The data used is the daily range computed for the top ranked stock in the Compact Model Portfolio, with the period 01/03/2001 – 12/31/2009 used as the training data and the period 01/03/2010 – 08/24/2010 used as the testing data. This division is a simple binary sample cross-validation technique. The analysis was performed in R, and the code is appended to this post.

Before discussing the chart above, which exhibits many interesting features, let's talk about the methods. On the training set, the ksvm procedure was used to execute an ε-insensitive regression and allowed to use its default methods and tolerances. The OLS procedure lm was similarly run without user tuning. I then used both models to predict responses in the testing set and used an OLS regression of the response onto the forecasts as a simple methodology for evaluating the quality of the systems. Notable differences were found. The out-of-sample β was established to be 2.01417 ± 1.30812 for the OLS model and 1.04366 ± 0.43193 for the SVR model. The R² values were 0.01526 and 0.03676, respectively. Thus the SVR model is a much more accurate performer out-of-sample, as it is advertised to be.
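The evaluation step described here can be sketched on its own as a small helper. This is an illustration under assumptions: the name evaluate_forecast and the synthetic data are not part of the appended listing, which calls summary(lm(...)) directly.

```r
# Regress the realized response onto the forecast and report the slope,
# its standard error, and R-squared
evaluate_forecast <- function(actual, forecast) {
  fit <- summary(lm(actual ~ forecast))
  c(beta = fit$coefficients[2, "Estimate"],
    se   = fit$coefficients[2, "Std. Error"],
    r2   = fit$r.squared)
}

# Synthetic example (assumed data): the forecast understates the response
# by a factor of two, so the fitted slope should land in the region of 2
set.seed(2)
forecast <- rnorm(160, sd = 0.1)
actual <- 2 * forecast + rnorm(160, sd = 0.5)
print(evaluate_forecast(actual, forecast))
```

A slope close to 1 says the forecast is correctly scaled, while a slope well above 1 says the forecast understates the size of the response.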

The chart exhibits the out-of-sample predictor and response data as well as the OLS regression line and the SVR model. We see that the SVR model contains numerous wriggles and kinks, yet my instinct is to reject the information content of these features, on the assumption that they indicate a need to tune the kernel used by the system. However, intuition is not necessarily truth, so we are in need of a procedure to establish where the superior predictive power of this model comes from. Does it come from some simple non-linearity in response that the algorithm has picked up, or is it actually due to the more funky nature of the model? One way to establish this would be to see if we can create some kind of piecewise linear model that does as well as the SVR.
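One possible form for such a piecewise linear model is a regression on hinge terms, sketched below. The helper name hinge_basis, the knot positions, and the synthetic data are all assumptions made for illustration; the post itself does not specify how the piecewise model would be built.

```r
# Piecewise linear basis: x itself plus one hinge term max(x - k, 0) per knot k
hinge_basis <- function(x, knots) {
  H <- matrix(sapply(knots, function(k) pmax(x - k, 0)), nrow = length(x))
  colnames(H) <- paste0("hinge", seq_along(knots))
  cbind(x = x, H)
}

# Synthetic stand-in data (assumed): a response with a single kink at zero
set.seed(3)
x <- runif(400, -1, 1)
y <- 0.5 * pmax(x, 0) + rnorm(400, sd = 0.05)

# Knot positions are a free choice; these three are illustrative only
pwmod <- lm(y ~ hinge_basis(x, knots = c(-0.5, 0, 0.5)))
print(coef(pwmod))
```

Each hinge coefficient is the change in slope at its knot, so a model of this kind can reproduce kinks without any smooth wriggles.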


# kernlab supplies ksvm(), used below for the eps-insensitive support vector regression
library(kernlab)
training<-read.table("CMP_Training.txt",header=TRUE)
testing<-read.table("CMP_Testing.txt",header=TRUE)
# daily range = (close - open)/(high - low); a trailing "/" is needed so that R continues onto the next line
training$DailyRange<-(training$ClosingPrice-training$OpeningPrice)/(training$HighestPrice-training$LowestPrice)
training$PriorDailyRange<-(training$PriorClosingPrice-training$PriorOpeningPrice)/
    (training$PriorHighestPrice-training$PriorLowestPrice)
testing$DailyRange<-(testing$ClosingPrice-testing$OpeningPrice)/(testing$HighestPrice-testing$LowestPrice)
testing$PriorDailyRange<-(testing$PriorClosingPrice-testing$PriorOpeningPrice)/
    (testing$PriorHighestPrice-testing$PriorLowestPrice)
names(training)
# flag the rows where both the response and the predictor are defined
training$sample<-!(is.na(training$DailyRange)|is.na(training$PriorDailyRange))
testing$sample<-!(is.na(testing$DailyRange)|is.na(testing$PriorDailyRange))
# rest() is not defined in the listing as posted; it is assumed here to restrict a testing vector to the flagged rows
rest<-function(x) x[testing$sample]
# OLS: fit on the training set, then regress the testing response onto the forecast
summary(linmod<-lm(DailyRange~PriorDailyRange,data=training,subset=training$sample))
summary(lm(testing$DailyRange~predict(linmod,newdata=testing)))
# SVR: same evaluation, predicting on the flagged testing rows so that the lengths match rest()
print(svrmod<-ksvm(DailyRange~PriorDailyRange,type='eps-svr',data=training))
svrpred<-predict(svrmod,newdata=testing[testing$sample,])
summary(lm(rest(testing$DailyRange)~svrpred))
isort<-order(rest(testing$PriorDailyRange))
# scatter of the out-of-sample data, with the OLS line in red and the SVR response in blue
plot(testing$PriorDailyRange[testing$sample],testing$DailyRange[testing$sample],axes=TRUE,
    xlab='Prior Daily Range',ylab='Daily Range',main='Comparison of SVR and OLS Models for Daily Range',
    sub=paste('Data: Compact Model Portfolio, Top Ranked Stock; Resolution: Daily; Training:',
    training$MarkDate[1],'--',training$MarkDate[length(training$MarkDate)],'Testing:',testing$MarkDate[1],
    '--',testing$MarkDate[length(testing$MarkDate)]))
lines(c(-1.1,1.1),c(0,0),col='gray')
lines(c(0,0),c(-1.1,1.1),col='gray')
abline(linmod,col='red')
lines(rest(testing$PriorDailyRange)[isort],svrpred[isort],col='blue')

Analysis of the Relative Performance of Compact Model Portfolio Members

by Graham Giller June 24, 2010 12:20

We have discussed the composition and aggregate performance of the Compact Model Portfolio in other articles in this blog. Briefly, it is composed using a ranking that results from the application of time series analysis methodologies to the total value traded rather than the stock price.

Analysis of the Relative Performance of Compact Model Portfolio Members

In the above chart we exhibit a simple conditional analysis of the average return over the prior decade of portfolio members versus their ranking number. In these series, when the stock rank changes we follow the rank and not the stock. Thus the performance of the stock ranked “1” is the performance of the first ranked stock through time and not the performance of the stock currently ranked “1” (which is currently AAPL). In the chart we do see an out-performance for the higher ranked stocks, but its statistical significance is only notable when the general trend is estimated. If each rank were considered separately, we would not accept the hypothesis that the stock has significantly outperformed the benchmark.


Using the Bootstrap to Understand the Effect of Leverage on Drawdowns

by Graham Giller April 29, 2010 15:57

Following on from the prior article, we will now study the effect of leverage on the severity of drawdowns. This is done for a series with a positive mean daily return over the last decade — so it should produce profitable investments in expectation. We will investigate the effect of leverage on the maximum drawdown in the series by using the bootstrap to elucidate the mean relationship rather than looking at one specific realization.

Our approach is as before, with the exception that we pick the leverage for each trial at random between zero and four. Four is the maximum leverage permitted to a Pattern Day Trader, and so that seems a sensible limit for our analysis.
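A rough sketch of one such simulation in R is given below. It rests on several assumptions that the post does not state: returns are compounded daily, financing costs are ignored, a levered daily loss is capped at the capital at risk, and the series daily is a synthetic stand-in for the portfolio's returns. The helper name max_drawdown is likewise illustrative.

```r
# Maximum drawdown of a compounded return series, as a fraction of the running peak
max_drawdown <- function(returns) {
  wealth <- cumprod(1 + returns)
  peak <- cummax(c(1, wealth))[-1]   # running peak, counting the initial capital of 1
  max(1 - wealth / peak)
}

# Hypothetical stand-in for roughly nine years of daily returns with a positive mean
set.seed(42)
daily <- rnorm(2250, mean = 0.0005, sd = 0.01)

# Each trial draws a leverage uniformly on [0, 4] and a resample with replacement
trials <- t(replicate(200, {
  leverage <- runif(1, 0, 4)
  resampled <- sample(daily, replace = TRUE)
  levered <- pmax(leverage * resampled, -1)   # a loss cannot exceed the capital at risk
  c(leverage = leverage, drawdown = max_drawdown(levered))
}))
print(head(trials))
```

Plotting the drawdown column against the leverage column gives a scatter of the kind a curve can then be fitted to.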

Bootstrap Analysis of the Effect of Leverage on Maximum Drawdown

The above chart shows the observed relationship between the leverage used and the maximum drawdown within the nine years of trading represented by each simulation. The curve fitted is to the expression below (and is done by non-linear least squares).

[Equation image not preserved: the fitted expression for the maximum drawdown M as a function of the leverage L.]

(Here M is the maximum drawdown and L is the leverage; I don't assert that this relationship is anything other than a convenient representation of the data.) We see that with a standard margin account (2× leverage permitted), we expect a maximum drawdown of around 75% of capital and the probability of a drawdown exceeding 50% of capital is of order unity. Near the higher levels of leverage, complete drawdown (maximum drawdown exceeding 99% of capital) is increasingly certain. We shall present an empirical model for these probabilities in the next post.

Bootstrapping the Historical Performance of the Compact Model Portfolio

by Graham Giller April 28, 2010 13:45

The Bootstrap, invented by Bradley Efron, is a technique for simulating the sampling distribution of a statistic. It attempts to solve the following problem: the empirical p.d.f. of a dataset clearly rejects common parametric representations, or the statistic we are computing has a population distribution that is analytically difficult or impossible to compute; however, the statistic is useful and we need to estimate its sampling distribution to place confidence limits on the observed value.

The method is discussed in many places, such as Efron's excellent little book The Jackknife, the Bootstrap, and Other Resampling Plans, but I will summarize it briefly: we simulate data drawn from the empirical distribution function of the data by resampling the actual data with replacement. This is clearly not as good as sampling from the population distribution function, but there are strong theorems governing the convergence of the e.d.f. to the p.d.f. and it does allow us to produce Monte Carlo simulations of data with all of the measured properties of the sample (although the procedure is a little more complicated in the presence of serially correlated data). Note that the replacement is an important step: it means that the properties of the simulations we create do not exactly match the actual sample, and that allows us to estimate quantities such as the bias of an estimator.
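The resampling step can be sketched in a few lines of R. The example below is illustrative only: the series observed is synthetic, and the Sharpe Ratio is annualized with an assumed 252 trading days and a zero risk-free rate.

```r
# Annualized Sharpe Ratio of daily returns (252 trading days and a zero
# risk-free rate are assumptions made for this sketch)
sharpe <- function(r) sqrt(252) * mean(r) / sd(r)

# Hypothetical stand-in for the observed daily returns
set.seed(11)
observed <- rnorm(2250, mean = 0.0005, sd = 0.01)

# Resample the observed returns with replacement and recompute the statistic each time
boot <- replicate(1000, sharpe(sample(observed, replace = TRUE)))

estimate <- sharpe(observed)
std_error <- sd(boot)                        # bootstrap standard error
bias <- mean(boot) - estimate                # bootstrap estimate of the bias
interval <- quantile(boot, c(0.025, 0.975))  # simple percentile interval
print(c(estimate = estimate, std_error = std_error, bias = bias))
print(interval)
```

Plain resampling of this kind treats the daily returns as independent; serially correlated data would call for a block-based variant instead.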

Bootstrap Analysis of the Compact Model Portfolio

The above charts show our use of The Bootstrap to analyze the series of daily returns of the Compact Model Portfolio. The upper chart shows five simulated total return time series (black) and the actual total return time series (red). The returns are accumulated and the dispersion of the final states due to a fortunate run of returns is very evident. The histograms show the distributions of the maximum drawdown and Sharpe Ratio for each simulated series. These are both popular metrics for quant. traders and are examples of statistics with awkward sampling distributions that a traditional analysis only gives us one opportunity to compute from historical data. The maximum drawdown histogram is fitted to the Gamma Distribution, and the Sharpe Ratio histogram to the Student's t Distribution. We learn from these charts that the standard deviation of the Sharpe Ratio is approximately equal to its sample value and that the probability of a maximum drawdown exceeding 25% of capital is close to unity. It would not be possible to obtain this information via other methods.

Finally, I would like to acknowledge Greg Laughlin for stimulating my interest in using the Bootstrap method.

Does the Compact Model Traded Portfolio Track the Compact Model Portfolio Index

by Graham Giller April 27, 2010 10:54

I recently changed several things regarding the Compact Model Portfolio (in particular the hedging strategy). This change went live, at a new brokerage, on the 9th of April. To check that things are ok, we need to verify that the return series from the traded portfolio is not significantly distinguishable from the index, i.e. we need to investigate whether the traded portfolio is accurately tracking the desired index performance.

We have a small data set, so far, of only 12 days, so we need to be careful about our inferences. However, we have three tools to use:

  • we can look at the time series of the total returns of both series and see if they appear similar (an eyeball or ballpark test);
  • we can perform a linear regression of the traded portfolio daily returns onto the index portfolio daily returns, and apply statistical tests to investigate the null hypothesis (α,β) = (0,1), i.e. perfect tracking;
  • we can use the two-sided Kolmogorov-Smirnov test to investigate whether the empirical distributions of the respective daily return series are consistent with each other.
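The second and third tools can be sketched in R as follows. The post does not say which statistical test was applied to the regression, so the joint F-test used here (comparing the unrestricted fit against the restricted model traded = index) is an assumption, as are the helper name tracking_tests and the synthetic 12 day sample.

```r
# Joint test of (alpha, beta) = (0, 1) plus a two-sided Kolmogorov-Smirnov test
tracking_tests <- function(traded, index) {
  n <- length(traded)
  fit <- lm(traded ~ index)
  rss1 <- sum(residuals(fit)^2)        # unrestricted residual sum of squares
  rss0 <- sum((traded - index)^2)      # residuals under the null: alpha = 0, beta = 1
  f <- ((rss0 - rss1) / 2) / (rss1 / (n - 2))
  list(coef = coef(fit),
       f_stat = f,
       f_pvalue = pf(f, 2, n - 2, lower.tail = FALSE),
       ks_pvalue = ks.test(traded, index)$p.value)
}

# Synthetic 12 day sample (assumed data): the traded series is the index plus small noise
set.seed(7)
index <- rnorm(12, mean = 0.001, sd = 0.01)
traded <- index + rnorm(12, mean = 0, sd = 0.001)
res <- tracking_tests(traded, index)
print(res)
```

Large p-values from both tests mean the null hypothesis of perfect tracking cannot be rejected, although with only 12 observations neither test has much power.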

Before presenting our results, let's discuss our expectations for the alternate hypothesis (that the traded portfolio does not track the index portfolio). The traded portfolio has frictions (brokerage fees, financing fees etc.) that are not represented in the ideal portfolio. In addition, the treatment of dividends is different. The ideal portfolio receives the dividend on the ex-dividend date, which is done to prevent spurious returns due to the expected ex-dividend drop, whereas the traded portfolio will experience the ex-dividend drop and then receive the dividend income on the settlement date. The former effect should lead to a linear-regression α of less than zero. The latter effect should be a depression of the β below unity.

Compact Model Portfolio Verification Tests

For the brief data sample we have, we cannot reject the null hypothesis. However, we have now established the toolset needed to investigate this issue, and will return to it after more time has elapsed.

About the Author

GRAHAM GILLER
Dr. Giller holds a doctorate from Oxford University in experimental elementary particle physics. His field of research was statistical astronomy using high energy cosmic rays. After leaving Oxford, he worked in the Process Driven Trading Group at Morgan Stanley, as a strategy researcher and portfolio manager. He then ran a CTA/CPO firm which concentrated on trading eurodollar futures using statistical models. From 2004, he has managed a private family investment office. In 2009, he joined a California based hedge fund startup, concentrating on high frequency alpha and volatility forecasting. A detailed resume is available.

Disclaimer

Nothing on this site should be construed as a recommendation to buy or sell any specific security nor as a solicitation of an order to buy or sell any specific security. Before making any trade for any reason you should consult your own financial advisor. The author may hold long or short positions in any of the securities discussed either before or after publication of an article mentioning such a security.

Copyright Notice

All posts on this blog are © Copyright property of Giller Investments (New Jersey), LLC. All comments are the property of their respective authors and neither the author of this blog nor any entity associated with him is responsible for or accepts any responsibility for their content. Offensive comments and spam may be removed at the author's discretion.

Data provided on this blog or through links to this blog are either property of Giller Investments (New Jersey), LLC or publicly available or derived from data that is publicly available. Any data that is proprietary to Giller Investments (New Jersey), LLC is published here for the public interest and may be reproduced for private research or in public forums provided that suitable attribution and acknowledgement of ownership is made.
