Further Comparison of OLS and Support Vector Machine Models Out-of-Sample

by Graham Giller August 26, 2010 11:23

In the prior post we noted that Support Vector Regression outperformed OLS models out-of-sample. The Machine Learning community refers to this as a superior ability to generalize. I find it quite striking that this enhanced statistical reliability comes from a univariate response model that departs so far from our prior prejudices in favor of smooth, low-order responses.

In this post we seek to replicate the response function of the SVM with a high-order polynomial model. This is to investigate whether the superior forecasting skill out-of-sample arises from the lower-order “wriggles” in the SVM response function or from the higher-order “kinks.” This is interesting because we can certainly replicate the lower-order features via classical linear methods, but it is unlikely that we can do such a thing for the higher-order features of the response. Thus we define our linear polynomial model as

[Equation image not recovered: a linear combination of the Legendre polynomials Pn(x), with coefficients βn, for orders n up to N.]

Here the Pn(x) are Legendre polynomials of order n. These are orthogonal on [-1,1], so they form a useful basis in which to express our response function. Because the functions are orthogonal, the estimators should be independent in expectation. In addition, following Vapnik, we reject the Occam's Razor driven methodology of standard classical statistical analysis, which seeks a parsimonious model (what my ex-boss, Peter Muller, used to refer to as “Keep It Simple Stupid”), and instead choose N large enough to match the testing set R² of the SVM.
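The procedure described above might be sketched as follows. This is a minimal illustration in Python with NumPy, not the code used for the post: the function names, the synthetic data, and the target R² of 0.8 are all assumptions made for the example, and the predictor is assumed to have been rescaled onto [-1,1] first.

```python
import numpy as np
from numpy.polynomial import legendre


def scale_to_unit_interval(x, lo, hi):
    """Map x from [lo, hi] onto [-1, 1], where the P_n are orthogonal."""
    return 2.0 * (np.asarray(x, dtype=float) - lo) / (hi - lo) - 1.0


def fit_legendre_ols(x, y, order):
    """OLS estimates of beta_0..beta_order in y = sum_n beta_n P_n(x) + noise."""
    design = legendre.legvander(x, order)  # column n holds P_n(x)
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def r_squared(x, y, beta):
    """R^2 of the fitted Legendre series on the sample (x, y)."""
    resid = y - legendre.legval(x, beta)
    return 1.0 - np.sum(resid**2) / np.sum((y - np.mean(y)) ** 2)


def smallest_order_matching(x_train, y_train, x_test, y_test,
                            target_r2, max_order=40):
    """Raise N until the test-set R^2 reaches target_r2 (hypothetical helper)."""
    for order in range(1, max_order + 1):
        beta = fit_legendre_ols(x_train, y_train, order)
        if r_squared(x_test, y_test, beta) >= target_r2:
            return order, beta
    return None, None  # target not reached within max_order


# Synthetic stand-in data (NOT the daily range data from the post).
rng = np.random.default_rng(0)
x_raw = rng.uniform(0.0, 10.0, 600)
y = np.sin(3.0 * x_raw) + 0.3 * rng.normal(size=600)
x = scale_to_unit_interval(x_raw, 0.0, 10.0)

# 0.8 is a placeholder for the SVM's testing set R^2.
order, beta = smallest_order_matching(x[:400], y[:400], x[400:], y[400:], 0.8)
print("smallest order reaching the target:", order)
```

Building the design matrix with `legvander` keeps the fit an ordinary least-squares problem in the coefficients, so the only tuning choice left is the order N.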

Comparison of SVM and 30th Order Legendre Polynomial Models for Daily Range

The above chart illustrates a 30th order Legendre polynomial model that replicates the response of the Support Vector Machine and exhibits equivalent out-of-sample forecasting skill. From the point of view of classical inference, no analyst would ever suggest using such a high-order model on this data, and the t-statistics for the βn coefficients are all small. Yet this is the type of functional response picked out by the Support Vector Machine!
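For readers who want to check the t-statistics of such a fit themselves, the following is a minimal sketch, assuming the textbook OLS covariance with homoskedastic errors. The function name `legendre_t_statistics` and the synthetic data are hypothetical and are not taken from the post.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_t_statistics(x, y, order):
    """Classical OLS t-statistics for the coefficients of a Legendre series.

    Assumes x already lies in [-1, 1] and that errors are homoskedastic.
    """
    design = legendre.legvander(x, order)  # column n holds P_n(x)
    n_obs, n_par = design.shape
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = resid @ resid / (n_obs - n_par)  # unbiased residual variance
    cov = sigma2 * np.linalg.pinv(design.T @ design)
    std_err = np.sqrt(np.diag(cov))
    return beta, beta / std_err


# Synthetic example: only the first-order term is truly present.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 500)
y = 2.0 * x + 0.1 * rng.normal(size=500)
beta, t_stats = legendre_t_statistics(x, y, 5)
print(np.round(t_stats, 2))
```

In a high-order fit, each coefficient can carry a small t-statistic even when the series as a whole tracks the response, which is the situation the chart describes.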

 


About the Author

GRAHAM GILLER
Dr. Giller holds a doctorate from Oxford University in experimental elementary particle physics. His field of research was statistical astronomy using high energy cosmic rays. After leaving Oxford, he worked in the Process Driven Trading Group at Morgan Stanley as a strategy researcher and portfolio manager. He then ran a CTA/CPO firm which concentrated on trading eurodollar futures using statistical models. Since 2004, he has managed a private family investment office. In 2009, he joined a California-based hedge fund startup, concentrating on high frequency alpha and volatility forecasting. His updated resume is on LinkedIn.
