Concrete question: let’s say that I have N n-dimensional data points. I want to find an estimate for my (N+1)st data point, which has attributes (a1, a2, …, an). I identify the k nearest neighbors of this (N+1)st point and produce an estimate from them. How do I produce a confidence interval for this estimate? Is it as simple as computing the sample standard deviation of the k points used and building a confidence interval around the estimate obtained for the (N+1)st point? Or is there an additional penalty involved?
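To make the procedure in this question concrete, here is a minimal sketch of the naive approach it describes. Several things are assumptions rather than part of the question: Euclidean distance, an unweighted mean of the neighbors' target values as the estimate, a normal-approximation interval built from the standard error of that mean, and the hypothetical names `knn_estimate_with_naive_interval` and `z`. It only shows the "simple" version being asked about; it does not settle whether an extra penalty is needed.

```python
import numpy as np

def knn_estimate_with_naive_interval(X, y, query, k=5, z=1.96):
    """Naive k-NN estimate plus an interval from the neighbors' spread.

    Assumptions (not from the question): Euclidean distance, unweighted
    mean as the estimate, and z = 1.96 for a roughly 95% normal interval.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    query = np.asarray(query, dtype=float)

    # Distance from the query point to each of the N stored points.
    dists = np.linalg.norm(X - query, axis=1)

    # Indices of the k closest points.
    idx = np.argsort(dists)[:k]
    neighbors = y[idx]

    # Estimate: mean of the neighbors' target values.
    estimate = neighbors.mean()

    # Sample standard deviation (ddof=1) of the k values, turned into a
    # standard error of the mean.
    s = neighbors.std(ddof=1)
    half_width = z * s / np.sqrt(k)

    return estimate, (estimate - half_width, estimate + half_width)
```

Note that this treats the k neighbors as if they were independent draws at the query point itself. The neighbors actually sit at different locations, so the interval reflects their spread but not any bias from averaging over a neighborhood, which is one reason the naive interval may be too optimistic.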

My question is about the number of classes. Say we have data from only one class of a two-class problem (a normal class and an abnormal class). Given a test point and the normal-class data points, is it possible to use 1-NN to decide whether the test point is a member of the normal class? I think this might also be called an outlier detection problem.
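One common way to frame this is to compare the test point's distance to its nearest normal-class neighbor against a threshold. The sketch below is only an illustration of that idea: the Euclidean metric, the rule for picking the threshold (a quantile of each normal point's distance to its own nearest neighbor), and the hypothetical names `fit_threshold` and `is_normal` are all assumptions, not something stated in the question.

```python
import numpy as np

def fit_threshold(normal, quantile=0.95):
    """Pick a distance cutoff from the normal-class data alone.

    Assumption: use the given quantile of each normal point's distance
    to its nearest *other* normal point.
    """
    normal = np.asarray(normal, dtype=float)
    # Pairwise Euclidean distances between all normal points.
    diffs = normal[:, None, :] - normal[None, :, :]
    d = np.linalg.norm(diffs, axis=2)
    # Ignore each point's zero distance to itself.
    np.fill_diagonal(d, np.inf)
    nearest = d.min(axis=1)
    return float(np.quantile(nearest, quantile))

def is_normal(normal, point, threshold):
    """Return True if the point's 1-NN distance is within the threshold."""
    normal = np.asarray(normal, dtype=float)
    point = np.asarray(point, dtype=float)
    nearest = np.linalg.norm(normal - point, axis=1).min()
    return bool(nearest <= threshold)
```

For example, with four normal points on the corners of a unit square, `fit_threshold` returns 1.0, so a point at the center is accepted and a point far away is flagged. The pairwise-distance step uses O(N²) memory, so this form is only suitable for small data sets.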

