Appendix J: Including measurement error in non-parametric likelihood functions

This derivation details how to include measurement device error in the likelihood function of MAD when it is inferred in one dimension, using a Gaussian kernel density estimate, a fixed bandwidth, and Gaussian error. All other assumptions are stated as they are introduced. The dimensionality of the parameter space is of little importance in this calculation and is generically represented by a one-dimensional θ, but the derivation extends to any MAD parameterization.

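To include measurement device error in the likelihood function, the following formulation is required:

$$f(z_b \mid \theta) = \int_{-\infty}^{\infty} f(z_b \mid \theta, \varepsilon)\, f(\varepsilon \mid \theta)\, d\varepsilon \tag{1}$$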

where zb is Type-B data and ε is the measurement device error. From here on, f(zb│θ) is referred to simply as the ‘likelihood function’ and f(zb│θ,ε) as the ‘error-conditional likelihood function’. Finally, f(ε│θ)=f(ε), because the measurement device error is assumed to be statistically independent of the MAD parameters.

As mentioned at the outset, the measurement device error is assumed to be Gaussian with zero mean, such that ε ∼ N(0, σε^2), where σε^2 is the measurement device error variance.

Non-parametrically, the error-conditional likelihood function f(zb│θ,ε) can be estimated with

$$f(z_b \mid \theta, \varepsilon) \approx \hat{f}_n(z_b \mid \theta, \varepsilon) \tag{2}$$

where the hat accent indicates a non-parametrically (NP) inferred probability distribution function (PDF) and n is the number of realizations of the dependent random variable used for the inference.

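Using a kernel with fixed bandwidth, Equation 2 can be further expanded as

$$\hat{f}_n(z_b \mid \theta, \varepsilon) = \frac{1}{n h} \sum_{i=1}^{n} K\!\left(\frac{z_b - \varepsilon - \tilde{z}_b^{(i)}}{h}\right) \tag{3}$$

where K() is a kernel function, which will be defined as Gaussian next but in general has several specific properties (it integrates to unity, has a finite second moment, and is always non-negative [Scott & Sain, 2004]). h is called the bandwidth and is a fixed property calculated from the realizations of the experiment for the Type-B data $\tilde{z}_b^{(i)}$, where the tilde accent and the superscript (i) indicate the ith realization. Note that h is not a function of ε in this formulation. The error is taken to be additive here; the sign convention for ε does not affect the final result, because f(ε) is symmetric about zero.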
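The fixed-bandwidth kernel estimate described above can be sketched in a few lines of Python. This is a minimal illustration, not the MAD implementation: the function names, the sample realizations, and the bandwidth value are hypothetical, the kernel is already taken to be Gaussian, and the error is assumed to be additive so that the kernel is evaluated at zb − ε.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel: non-negative and integrates to one."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def error_conditional_likelihood(z_b, eps, realizations, h):
    """Fixed-bandwidth kernel estimate of f(z_b | theta, eps).

    Assumption (for illustration): the error is additive, so each kernel is
    centred on a simulated realization and evaluated at z_b - eps.
    """
    n = len(realizations)
    total = sum(gaussian_kernel((z_b - eps - z_i) / h) for z_i in realizations)
    return total / (n * h)


# Hypothetical simulated Type-B realizations for a single parameter set theta.
realizations = [0.8, 1.1, 1.3, 0.9, 1.0]
h = 0.25  # fixed bandwidth, chosen here only for illustration

print(error_conditional_likelihood(1.05, 0.0, realizations, h))
```

Because h is fixed, the value of ε only shifts the point at which the estimate is evaluated; it does not change the shape of the estimated density.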

 

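With the Gaussian kernel $K(u) = (2\pi)^{-1/2} \exp(-u^2/2)$, Equation 3 is fully expanded to

$$\hat{f}_n(z_b \mid \theta, \varepsilon) = \frac{1}{n h \sqrt{2\pi}} \sum_{i=1}^{n} \exp\left[-\frac{\left(z_b - \varepsilon - \tilde{z}_b^{(i)}\right)^2}{2h^2}\right] \tag{4}$$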

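We now return to Equation 1. Applying Equation 2, substituting Equation 4, and using the assumed normality of f(ε) yields

$$f(z_b \mid \theta) \approx \int_{-\infty}^{\infty} \frac{1}{n h \sqrt{2\pi}} \sum_{i=1}^{n} \exp\left[-\frac{\left(z_b - \varepsilon - \tilde{z}_b^{(i)}\right)^2}{2h^2}\right] \frac{1}{\sigma_\varepsilon \sqrt{2\pi}} \exp\left[-\frac{\varepsilon^2}{2\sigma_\varepsilon^2}\right] d\varepsilon \tag{5}$$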

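The summation and the constants that do not depend on ε can be moved outside the integral, leaving:

$$f(z_b \mid \theta) \approx \frac{1}{2\pi\, n\, h\, \sigma_\varepsilon} \sum_{i=1}^{n} \int_{-\infty}^{\infty} \exp\left[-\frac{\left(z_b - \varepsilon - \tilde{z}_b^{(i)}\right)^2}{2h^2} - \frac{\varepsilon^2}{2\sigma_\varepsilon^2}\right] d\varepsilon \tag{6}$$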

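Expanding the first squared term in the exponential argument, moving the terms independent of ε outside the integral, and collecting terms of the same order in ε inside the integrand gives:

$$f(z_b \mid \theta) \approx \frac{1}{2\pi\, n\, h\, \sigma_\varepsilon} \sum_{i=1}^{n} \exp\left[-\frac{\left(z_b - \tilde{z}_b^{(i)}\right)^2}{2h^2}\right] \int_{-\infty}^{\infty} \exp\left[-\frac{h^2 + \sigma_\varepsilon^2}{2 h^2 \sigma_\varepsilon^2}\, \varepsilon^2 + \frac{z_b - \tilde{z}_b^{(i)}}{h^2}\, \varepsilon\right] d\varepsilon \tag{7}$$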

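To carry out the integration, consider the following identity for exponentials with quadratic arguments, valid for a > 0:

$$\int_{-\infty}^{\infty} \exp\left(-a x^2 + b x\right) dx = \sqrt{\frac{\pi}{a}}\, \exp\left(\frac{b^2}{4a}\right) \tag{8}$$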
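The closed form sqrt(π/a)·exp(b²/(4a)) for the integral of exp(−a x² + b x) over the real line can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, and the values of h, σε, and the offset between zb and a realization are arbitrary choices used to build plausible coefficients a and b.

```python
import math


def gaussian_integral_closed_form(a, b):
    """Closed form of the integral of exp(-a*x**2 + b*x) over the real line (a > 0)."""
    return math.sqrt(math.pi / a) * math.exp(b * b / (4.0 * a))


def gaussian_integral_numeric(a, b, half_width=12.0, steps=20_000):
    """Midpoint-rule approximation of the same integral.

    The integrand peaks at x = b / (2a) with width 1 / sqrt(2a), so the grid
    spans half_width such widths on either side of the peak.
    """
    centre = b / (2.0 * a)
    scale = 1.0 / math.sqrt(2.0 * a)
    lower = centre - half_width * scale
    dx = 2.0 * half_width * scale / steps
    total = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * dx
        total += math.exp(-a * x * x + b * x)
    return total * dx


# Illustrative values only: bandwidth, error standard deviation, and offset.
h, sigma_eps, offset = 0.3, 0.4, 0.5
a = (h**2 + sigma_eps**2) / (2.0 * h**2 * sigma_eps**2)
b = offset / h**2

print(gaussian_integral_closed_form(a, b))
print(gaussian_integral_numeric(a, b))
```

The two printed values should agree to many digits, since the midpoint rule converges very quickly for smooth, rapidly decaying integrands.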

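With x = ε, the following coefficients can be read off from the integrand of Equation 7 (b depends on the realization index i):

$$a = \frac{h^2 + \sigma_\varepsilon^2}{2 h^2 \sigma_\varepsilon^2}, \qquad b = \frac{z_b - \tilde{z}_b^{(i)}}{h^2}$$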

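Using Equation 8, Equation 7 can be rewritten as

$$f(z_b \mid \theta) \approx \frac{1}{n \sqrt{2\pi \left(h^2 + \sigma_\varepsilon^2\right)}} \sum_{i=1}^{n} \exp\left[-\frac{\left(z_b - \tilde{z}_b^{(i)}\right)^2}{2h^2} + \frac{\sigma_\varepsilon^2 \left(z_b - \tilde{z}_b^{(i)}\right)^2}{2 h^2 \left(h^2 + \sigma_\varepsilon^2\right)}\right] \tag{9}$$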

The summand in Equation 9 is now free of ε and depends on the measurement device error only through σε. The goal is to rewrite the likelihood function in the form of Equations 3 and 4. In fact, the exponential argument can be formulated in terms of a rescaled bandwidth hRS that is a function of h and σε (the subscript ‘RS’ is shorthand for ‘rescaled’).

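Rearranging, and noting that 1 − σε^2/(h^2 + σε^2) = h^2/(h^2 + σε^2), Equation 9 becomes

$$f(z_b \mid \theta) \approx \frac{1}{n \sqrt{2\pi} \sqrt{h^2 + \sigma_\varepsilon^2}} \sum_{i=1}^{n} \exp\left[-\frac{\left(z_b - \tilde{z}_b^{(i)}\right)^2}{2 \left(h^2 + \sigma_\varepsilon^2\right)}\right] \tag{10}$$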

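If we define $h_{RS} = \sqrt{h^2 + \sigma_\varepsilon^2}$, Equation 10 becomes

$$f(z_b \mid \theta) \approx \frac{1}{n\, h_{RS} \sqrt{2\pi}} \sum_{i=1}^{n} \exp\left[-\frac{\left(z_b - \tilde{z}_b^{(i)}\right)^2}{2 h_{RS}^2}\right] = \frac{1}{n\, h_{RS}} \sum_{i=1}^{n} K\!\left(\frac{z_b - \tilde{z}_b^{(i)}}{h_{RS}}\right) \tag{11}$$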

Thus, we have shown that integrating over the Gaussian measurement device error distribution, and so accounting directly for the measurement device error in the error-conditional likelihood function, is identical to simply rescaling the bandwidth: the square of the rescaled bandwidth is the sum of the square of the original bandwidth and the measurement device error variance.
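The equivalence can be checked numerically with a short Python sketch. It is a minimal illustration under stated assumptions rather than the MAD implementation: the function names and the sample realizations are hypothetical, the error is taken to be additive, and the integral over ε is approximated with a simple midpoint rule. With h = 0.3 and σε = 0.4, the rescaled bandwidth is sqrt(0.3² + 0.4²) = 0.5.

```python
import math


def gaussian_pdf(x, sd):
    """Zero-mean Gaussian density with standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def kde(z, realizations, bandwidth):
    """Fixed-bandwidth Gaussian kernel density estimate evaluated at z."""
    return sum(gaussian_pdf(z - z_i, bandwidth) for z_i in realizations) / len(realizations)


def likelihood_by_integration(z_b, realizations, h, sigma_eps, half_width=8.0, steps=4000):
    """Midpoint-rule integral of the error-conditional estimate times f(eps).

    Assumes additive error, so the conditional estimate is the KDE at z_b - eps.
    """
    lower = -half_width * sigma_eps
    d_eps = 2.0 * half_width * sigma_eps / steps
    total = 0.0
    for k in range(steps):
        eps = lower + (k + 0.5) * d_eps
        total += kde(z_b - eps, realizations, h) * gaussian_pdf(eps, sigma_eps)
    return total * d_eps


def likelihood_rescaled(z_b, realizations, h, sigma_eps):
    """Same likelihood obtained by rescaling the bandwidth to sqrt(h^2 + sigma_eps^2)."""
    h_rs = math.sqrt(h * h + sigma_eps * sigma_eps)
    return kde(z_b, realizations, h_rs)


# Hypothetical simulated Type-B realizations and illustrative settings.
realizations = [0.8, 1.1, 1.3, 0.9, 1.0]
h, sigma_eps, z_b = 0.3, 0.4, 1.05

print(likelihood_by_integration(z_b, realizations, h, sigma_eps))
print(likelihood_rescaled(z_b, realizations, h, sigma_eps))
```

The two printed values should agree closely, which suggests that in practice the integral never needs to be evaluated: under these assumptions, replacing h with the rescaled bandwidth is sufficient.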

Last edited Nov 8, 2013 at 9:33 PM by frystacka, version 4
