Comparison of SVR and OLS Models for Daily Range

by Graham Giller August 25, 2010 14:20

Continuing the recent theme on the application of Machine Learning and Interior Analysis, here we investigate the utility of Support Vector Regression methods versus Ordinary Least Squares. The job is to predict the daily range of the top ranked stock in the Compact Model Portfolio from the prior value of that metric. By Daily Range we mean the ratio of the difference between the Closing Price and Opening Price to the difference between the Highest Price and Lowest Price. I chose this particular metric because it is an interior metric but it doesn't use any microstructure information. It is also frequently discussed in non-academic literature.
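As a small illustration of the definition above, here is a minimal sketch of the metric in R. The function name, argument names, and prices are made up for the example and are not taken from the post's data. Because the open and close always lie between the low and the high, the ratio is bounded between -1 and +1, and it is undefined on a day where the high equals the low.

```r
# Daily Range as defined in the text: (Close - Open) / (High - Low).
# Function and argument names are illustrative only.
daily_range <- function(open, high, low, close) {
  spread <- high - low
  # A day with High == Low has no range, so the ratio is undefined there.
  ifelse(spread > 0, (close - open) / spread, NA_real_)
}

# Made-up prices for three days (not data from the post).
open  <- c(10.0, 10.5, 10.2)
high  <- c(10.8, 10.6, 10.2)
low   <- c( 9.8, 10.0, 10.2)
close <- c(10.6, 10.1, 10.2)

dr <- daily_range(open, high, low, close)
print(dr)  # an up day, a down day, and an undefined (NA) day
```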

The data used is the daily range computed for the top ranked stock in the Compact Model Portfolio, with the period 01/03/2001 – 12/31/2009 used as the training data and the period 01/03/2010 – 08/24/2010 used as the testing data. This division is a simple hold-out (binary sample) cross-validation technique. The analysis was performed in R, and the code is appended to this post.
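For readers who keep everything in one table rather than in separate training and testing files, a chronological hold-out split might look like the sketch below. The data frame `bars`, the cutoff date, and the random values are hypothetical; only the column names MarkDate and DailyRange echo the appended code.

```r
# Hypothetical daily table: weekdays only, with a random stand-in metric.
set.seed(1)
dates <- seq(as.Date("2009-11-02"), as.Date("2010-02-26"), by = "day")
dates <- dates[as.POSIXlt(dates)$wday %in% 1:5]  # keep Monday to Friday
bars <- data.frame(MarkDate = dates,
                   DailyRange = runif(length(dates), -1, 1))

# Everything before the cutoff trains the model; everything after tests it.
cutoff <- as.Date("2010-01-01")
training <- bars[bars$MarkDate <  cutoff, ]
testing  <- bars[bars$MarkDate >= cutoff, ]

print(c(training = nrow(training), testing = nrow(testing)))
```

Splitting on the date rather than at random keeps every test observation strictly later than every training observation, which is what an out-of-sample forecast test needs.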

Before discussing the chart, which exhibits many interesting features, let's talk about the methods. On the training set, the ksvm procedure was used to execute an ε-insensitive regression and allowed to use its default methods and tolerances. The OLS procedure lm was similarly run without user tuning. I then used both models to predict responses in the testing set and used an OLS regression of the response onto the forecasts as a simple methodology for evaluating the quality of the systems. Notable differences were found. The out-of-sample β was established to be 2.01417 ± 1.30812 for the OLS model and 1.04366 ± 0.43193 for the SVR model. The R²'s were 0.01526 and 0.03676, respectively. Thus the SVR model is the more accurate performer out-of-sample, with a slope much closer to one and more than twice the R², as it is advertised to be.
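The evaluation step can be sketched in a few lines of R. The helper names and the synthetic data below are assumptions for illustration, not the author's code. The second half is a consistency check rather than part of the original analysis: in a one-predictor regression, R² = t² / (t² + df) with t = β / se, so the quoted β, standard errors, and the two fit statistics 0.01526 and 0.03676 (read here as R² values) should all imply the same residual degrees of freedom.

```r
# Score a forecast by regressing the realised response onto it.
# A well-calibrated forecast should give a slope near 1.
evaluate_forecast <- function(response, forecast) {
  fit <- summary(lm(response ~ forecast))
  co <- coef(fit)["forecast", ]
  list(beta = unname(co["Estimate"]),
       se   = unname(co["Std. Error"]),
       r2   = fit$r.squared,
       df   = fit$df[2])        # residual degrees of freedom
}

# Synthetic forecast and response, purely to exercise the function.
set.seed(42)
forecast <- rnorm(200)
response <- 0.5 * forecast + rnorm(200)
ev <- evaluate_forecast(response, forecast)

# Check the identity R^2 = t^2 / (t^2 + df) on the synthetic fit.
t2 <- (ev$beta / ev$se)^2
identity_gap <- abs(ev$r2 - t2 / (t2 + ev$df))

# Invert the identity for the figures quoted in the text.
implied_df <- function(beta, se, r2) (beta / se)^2 * (1 - r2) / r2
df_ols <- implied_df(2.01417, 1.30812, 0.01526)
df_svr <- implied_df(1.04366, 0.43193, 0.03676)
print(c(OLS = df_ols, SVR = df_svr))  # both come out close to 153
```

Both sets of figures point to about 153 residual degrees of freedom, which supports reading the two fit statistics as R² values from regressions on the same test sample.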

The chart exhibits the out-of-sample predictor and response data as well as the OLS regression line and the SVR model. We see that the SVR model contains numerous wriggles and kinks, yet my instinct is to reject the information content of these features, on the assumption that they indicate a need to tune the kernel used by the system. However, intuition is not necessarily truth, so we need a procedure to establish where the superior predictive power of this model comes from. Does it come from some simple non-linearity in response that the algorithm has picked up, or is it actually due to the more funky nature of the model? One way to establish this would be to see if we can create some kind of piecewise linear model that does as well as the SVR.
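One possible form of such a piecewise linear model is a "hinge" regression with a single knot, sketched below on synthetic data. The knot location at zero, the variable names, and the simulated slopes are all assumptions chosen for illustration. The comparison reuses the same device as above, an out-of-sample regression of the response onto each model's forecast.

```r
# Synthetic data with a different slope on either side of zero.
set.seed(7)
n <- 400
x <- runif(n, -1, 1)
y <- ifelse(x < 0, 0.05 * x, 0.30 * x) + rnorm(n, sd = 0.5)
train <- data.frame(x = x[1:300],   y = y[1:300])
test  <- data.frame(x = x[301:400], y = y[301:400])

# Straight line versus a line allowed to change slope at x = 0.
# pmax(x, 0) is zero left of the knot and equal to x right of it,
# so the fitted curve stays continuous at the knot.
linear <- lm(y ~ x, data = train)
hinge  <- lm(y ~ x + pmax(x, 0), data = train)  # knot at 0 is an assumption

# Out-of-sample R^2 from regressing the response onto the forecast.
oos_r2 <- function(model) {
  summary(lm(test$y ~ predict(model, newdata = test)))$r.squared
}
print(c(linear = oos_r2(linear), hinge = oos_r2(hinge)))
```

If a model this simple matched the SVR's out-of-sample fit on the real data, that would suggest the wriggles and kinks carry no additional information; if it fell short, the extra structure would deserve a closer look.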


# kernlab provides ksvm(); library() stops with an error if it is not installed
library(kernlab)
training<-read.table("CMP_Training.txt",header=TRUE)
testing<-read.table("CMP_Testing.txt",header=TRUE)
# Daily Range = (Close - Open)/(High - Low), for the current and the prior day
training$DailyRange<-(training$ClosingPrice-training$OpeningPrice)/(training$HighestPrice-training$LowestPrice)
# the division operator must end the line so that R continues the expression onto the next one
training$PriorDailyRange<-(training$PriorClosingPrice-training$PriorOpeningPrice)/
    (training$PriorHighestPrice-training$PriorLowestPrice)
testing$DailyRange<-(testing$ClosingPrice-testing$OpeningPrice)/(testing$HighestPrice-testing$LowestPrice)
testing$PriorDailyRange<-(testing$PriorClosingPrice-testing$PriorOpeningPrice)/
    (testing$PriorHighestPrice-testing$PriorLowestPrice)
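names(training)
# flag the rows where both the response and the predictor are available
training$sample<-!(is.na(training$DailyRange)|is.na(training$PriorDailyRange))
testing$sample<-!(is.na(testing$DailyRange)|is.na(testing$PriorDailyRange))
# OLS fit on the training set, then regress the out-of-sample response onto its forecast
summary(linmod<-lm(DailyRange~PriorDailyRange,data=training,subset=training$sample))
summary(lm(testing$DailyRange~predict(linmod,newdata=testing)))
# eps-insensitive SVR with the kernlab defaults
print(svrmod<-ksvm(DailyRange~PriorDailyRange,type='eps-svr',data=training))
# the original called rest() here, which is not defined in this listing;
# selecting the complete rows with testing$sample is assumed to be the intent
summary(lm(testing$DailyRange[testing$sample]~predict(svrmod,newdata=testing[testing$sample,])))
isort<-order(testing$PriorDailyRange[testing$sample])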
plot(testing$PriorDailyRange[testing$sample],testing$DailyRange[testing$sample],axes=TRUE,
    xlab='Prior Daily Range',ylab='Daily Range',main='Comparison of SVR and OLS Models for Daily Range',
    sub=paste('Data: Compact Model Portfolio, Top Ranked Stock; Resolution: Daily; Training:',
    training$MarkDate[1],'--',training$MarkDate[length(training$MarkDate)],'Testing:',testing$MarkDate[1],
    '--',testing$MarkDate[length(testing$MarkDate)]))
lines(c(-1.1,1.1),c(0,0),col='gray')
lines(c(0,0),c(-1.1,1.1),col='gray')
abline(linmod,col='red')
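# rest() is not defined in this listing; the complete-row mask is assumed to be the intent
lines(testing$PriorDailyRange[testing$sample][isort],predict(svrmod,newdata=testing[testing$sample,])[isort],col='blue')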
 

About the Author

GRAHAM GILLER
Dr. Giller holds a doctorate from Oxford University in experimental elementary particle physics. His field of research was statistical astronomy using high energy cosmic rays. After leaving Oxford, he worked in the Process Driven Trading Group at Morgan Stanley, as a strategy researcher and portfolio manager. He then ran a CTA/CPO firm which concentrated on trading eurodollar futures using statistical models. Since 2004, he has managed a private family investment office. In 2009, he joined a California-based hedge fund startup, concentrating on high frequency alpha and volatility forecasting. My updated resume is on LinkedIn.
