Kernel Density Estimation with Error Bands

by Graham Giller August 07, 2009 12:42

As I mentioned in my post discussing Kernel Density Estimators for the Dynamic Trading Risk Factor, one nice property of histogramming is that the sampling errors for the density estimate that the histogram represents are well understood and straightforward to compute. Computing the sampling distribution for the estimator is considerably more complicated for kernel density estimators.
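As a minimal sketch of why the histogram case is straightforward: assuming independent observations, the count in each bin is binomially distributed, so the standard error of the density estimate in that bin follows directly from the count. The function name and the simulated sample below are illustrative assumptions, not the factor-return data used for the chart.

```python
import numpy as np

def histogram_density_with_errors(data, bins):
    """Histogram density estimate with binomial standard errors per bin."""
    data = np.asarray(data, dtype=float)
    n = data.size
    counts, edges = np.histogram(data, bins=bins)
    widths = np.diff(edges)
    p = counts / n                      # estimated probability of each bin
    density = p / widths                # density estimate within each bin
    # Binomial variance of a count is n p (1 - p); scale it to density units.
    stderr = np.sqrt(n * p * (1.0 - p)) / (n * widths)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, density, stderr

# Illustrative data only: 1000 draws from a standard normal.
rng = np.random.default_rng(0)
sample = rng.standard_normal(1000)
centers, density, stderr = histogram_density_with_errors(sample, bins=20)
```

For bins with a small probability the binomial error is close to the familiar Poisson value, the square root of the count divided by n times the bin width.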

Dynamic Trading Risk Factor - Kernel Density Estimate with Errors

The above chart was prepared in Mathematica. On the laptop I have Mathematica 6 installed on, the k.d.e. chart takes about 30 seconds to run (Mathematica is a symbolic computation system and, as such, will always execute on the slower side). The computation of the error bands, however, took several hours: I set the job up at midnight and looked at the results over breakfast, and the resulting data is not particularly profound!

The expression for the mean square error of the point estimator is

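In one standard notation (an assumption here, since the notation of the original rendering may differ), with $K_h(u) = h^{-1} K(u/h)$ the kernel scaled by the bandwidth $h$, $n$ the sample size and $*$ denoting convolution, the exact mean square error at a point $x$ can be written as:

```latex
\mathrm{MSE}\{\hat{f}(x;h)\}
  = \frac{1}{n}\left\{ (K_h^2 * f)(x) - (K_h * f)^2(x) \right\}
  + \left\{ (K_h * f)(x) - f(x) \right\}^2
```

The first term is the variance of the estimator and the second is its squared bias. Both involve the true density $f$.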

However, this expression is written in terms of the true population density and, since the entire premise of kernel density estimation is to estimate the density, we clearly do not know that. As the procedure is relatively efficient when compared to histogramming, and asymptotically unbiased as the bandwidth shrinks with increasing sample size, we can assume that the density estimate converges in expectation to the true population density. This justifies the step of replacing f(x) with its estimator in the above expression, which was what I did to compute the error bands.
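The plug-in step can be sketched numerically. This is not the Mathematica code used for the chart. It is a minimal illustration that assumes a Gaussian kernel (the post does not say which kernel was used) and takes the mean square error as variance plus squared bias, with every occurrence of the true density replaced by the estimate. With a Gaussian kernel the resulting convolutions have closed forms, so no numerical integration is needed. The function names, the rule-of-thumb bandwidth and the simulated sample are all assumptions made for the example.

```python
import numpy as np

def gauss(u, s):
    """Normal density with mean 0 and standard deviation s, evaluated at u."""
    return np.exp(-0.5 * (u / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def kde_with_plugin_bands(data, grid, h):
    """Gaussian k.d.e. on grid, plus a plug-in root mean square error."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = data.size
    d = grid[:, None] - data[None, :]          # all grid-to-data differences
    f_hat = gauss(d, h).mean(axis=1)           # the density estimate itself
    # Kernel convolved with f_hat: each component widens to sd h * sqrt(2).
    smooth1 = gauss(d, h * np.sqrt(2.0)).mean(axis=1)
    # Squared kernel convolved with f_hat. For a Gaussian kernel the square
    # is a normal density with sd h / sqrt(2), scaled by 1 / (2 h sqrt(pi)),
    # so each component has sd h * sqrt(1.5) after the convolution.
    smooth2 = gauss(d, h * np.sqrt(1.5)).mean(axis=1) / (2.0 * h * np.sqrt(np.pi))
    variance = np.maximum(smooth2 - smooth1 ** 2, 0.0) / n
    bias = smooth1 - f_hat
    return f_hat, np.sqrt(variance + bias ** 2)

# Illustrative data only: 500 draws from a standard normal.
rng = np.random.default_rng(0)
sample = rng.standard_normal(500)
h = 1.06 * sample.std(ddof=1) * sample.size ** (-0.2)   # rule-of-thumb bandwidth
grid = np.linspace(-5.0, 5.0, 201)
f_hat, rmse = kde_with_plugin_bands(sample, grid, h)
lower, upper = np.clip(f_hat - rmse, 0.0, None), f_hat + rmse
```

Because the arithmetic is vectorised and purely numeric, a calculation of this kind runs in well under a second for a few hundred observations.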

I started this analysis to look at whether there was evidence for clustering of factor returns in the region of 2%, but I am not seeing that in these procedures with a standard choice of bandwidth. The book I've been working from, Wand and Jones's Kernel Smoothing (Monographs on Statistics and Applied Probability), is silent on the sampling distributions for the estimators. Although I drew error bands on the plot, I don't actually know the probability that the true density lies within those bands at a given point. We can reach for the Central Limit Theorem and suggest that it is in the region of 68% of the probability mass, as for a one standard deviation band around a normally distributed estimate, but that is truthfully just a guess. I think I have to go back to histogramming, with an arbitrary bin-width, to assess whether the clustering is statistically significant.
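One way to probe the 68% guess is by simulation: draw repeated samples from a density that is known, build the band at a single point each time, and count how often the true value falls inside it. The sketch below does this under stated assumptions that are mine rather than the post's: a standard normal population, a Gaussian kernel, a rule-of-thumb bandwidth, and a band of plus or minus one plug-in root mean square error. It says nothing directly about the factor-return data, and the function names are hypothetical.

```python
import numpy as np

def gauss(u, s):
    """Normal density with mean 0 and standard deviation s, evaluated at u."""
    return np.exp(-0.5 * (u / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def kde_point_and_rmse(data, x0, h):
    """Gaussian k.d.e. at x0 and its plug-in root mean square error."""
    n = data.size
    d = x0 - data
    f_hat = gauss(d, h).mean()
    smooth1 = gauss(d, h * np.sqrt(2.0)).mean()
    smooth2 = gauss(d, h * np.sqrt(1.5)).mean() / (2.0 * h * np.sqrt(np.pi))
    variance = max(smooth2 - smooth1 ** 2, 0.0) / n
    bias = smooth1 - f_hat
    return f_hat, np.sqrt(variance + bias ** 2)

def pointwise_coverage(x0=0.5, n=500, trials=400, seed=0):
    """Fraction of simulated samples whose band at x0 contains the truth."""
    rng = np.random.default_rng(seed)
    true_density = gauss(x0, 1.0)       # known because we chose the population
    hits = 0
    for _ in range(trials):
        sample = rng.standard_normal(n)
        h = 1.06 * sample.std(ddof=1) * n ** (-0.2)
        f_hat, rmse = kde_point_and_rmse(sample, x0, h)
        hits += int(abs(f_hat - true_density) <= rmse)
    return hits / trials

coverage = pointwise_coverage()
print(f"Estimated pointwise coverage of the band: {coverage:.2f}")
```

The printed fraction is only an estimate for this particular population, point and bandwidth, so it can indicate whether 68% is plausible but cannot establish it in general.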

 


About the Author

GRAHAM GILLER
Dr. Giller holds a doctorate from Oxford University in experimental elementary particle physics. His field of research was statistical astronomy using high energy cosmic rays. After leaving Oxford, he worked in the Process Driven Trading Group at Morgan Stanley, as a strategy researcher and portfolio manager. He then ran a CTA/CPO firm which concentrated on trading eurodollar futures using statistical models. Since 2004, he has managed a private family investment office. In 2009, he joined a California-based hedge fund startup, concentrating on high-frequency alpha and volatility forecasting. My updated resume is on LinkedIn.

Disclaimer

Nothing on this site should be construed as a recommendation to buy or sell any specific security nor as a solicitation of an order to buy or sell any specific security. Before making any trade for any reason you should consult your own financial advisor. The author may hold long or short positions in any of the securities discussed either before or after publication of an article mentioning such a security.

Copyright Notice

All posts on this blog are © Copyright property of Giller Investments (New Jersey), LLC. All comments are the property of their respective authors, and neither the author of this blog nor any entity associated with him is responsible for or accepts any responsibility for their content. Offensive comments and spam may be removed at the author's discretion.

Data provided on this blog or through links to this blog are either property of Giller Investments (New Jersey), LLC or publicly available or derived from data that is publicly available. Any data that is proprietary to Giller Investments (New Jersey), LLC is published here for the public interest and may be reproduced for private research or in public forums provided that suitable attribution and acknowledgement of ownership is made.
