A hypothesis test for the median of a single sample
Assume that the random variable X has a continuous distribution with pdf f. If the distribution is skewed, the median (theta.5) is generally a more appropriate location parameter than the mean. If the distribution is symmetric and the mean exists, then the two parameters are equal.
Suppose we wish to test H0: theta.5 = thetaH versus Ha: theta.5 > thetaH.
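Under H0, X satisfies P(X > thetaH) = P(X < thetaH) = .5. Thus, in a sample of size n, the number of X's that are greater than thetaH follows a binomial distribution with parameters n and p = .5. Large counts favor Ha, so the p-value is the upper-tail probability of observing at least that many X's greater than thetaH. We can calculate this p-value directly from the binomial distribution, or in large samples use a normal approximation with mean = n/2 and variance = n/4.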
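The test described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name sign_test_upper is hypothetical, dropping observations exactly equal to thetaH is an assumption (such ties have probability zero for continuous X), and the continuity correction in the normal approximation is also an assumption that the text does not specify.

```python
from math import comb, sqrt
from statistics import NormalDist


def sign_test_upper(xs, theta_h):
    """Test H0: median = theta_h versus Ha: median > theta_h.

    Returns (s, n, p_exact, p_normal), where s is the number of
    observations greater than theta_h among the n retained observations.
    """
    # Assumption: drop exact ties with theta_h (probability zero if X is continuous).
    kept = [x for x in xs if x != theta_h]
    n = len(kept)
    if n == 0:
        raise ValueError("no observations differ from theta_h")
    s = sum(x > theta_h for x in kept)

    # Exact p-value: P(B >= s) for B ~ Binomial(n, .5).
    p_exact = sum(comb(n, k) for k in range(s, n + 1)) / 2 ** n

    # Normal approximation with mean n/2 and variance n/4.
    # Assumption: a continuity correction of 0.5 is applied.
    z = (s - 0.5 - n / 2) / sqrt(n / 4)
    p_normal = 1 - NormalDist().cdf(z)
    return s, n, p_exact, p_normal


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(sign_test_upper(sample, 2.5))
```

For the sample shown, 8 of the 10 observations exceed 2.5, so the exact p-value is (45 + 10 + 1)/1024 = 56/1024.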
A confidence interval for the median of a single sample
The order statistics from a sample satisfy X(1) < X(2) < ... < X(n). To develop a confidence interval for the median, we wish to find order statistics X(a) and X(b), with a < b, such that P(X(a) < theta.5 < X(b)) = 1 - alpha. Because the binomial distribution is discrete, this can usually be achieved only approximately, so in practice a and b are chosen so that the probability is at least 1 - alpha. When X(a) < theta.5 < X(b), at least a of the observations are less than theta.5, and at most b-1 of the observations are less than theta.5. Because the number of observations less than theta.5 follows a binomial distribution with parameters n and p = .5, the probability that we want to calculate is given by:
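P(X(a) < theta.5 < X(b)) = sum_{k=a}^{b-1} C(n,k) (.5)^n ,
where C(n,k) is the binomial coefficient, or in large samples can be calculated using the normal approximation to the binomial distribution.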
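As a rough sketch of how a and b might be chosen in practice, the Python function below searches for the narrowest symmetric interval whose exact binomial coverage is at least 1 - alpha. The function name median_ci and the symmetric choice b = n + 1 - a are assumptions made for illustration; the notes do not prescribe a particular rule for selecting a and b.

```python
from math import comb


def median_ci(xs, alpha=0.05):
    """Order-statistic confidence interval for the median.

    Returns (X(a), X(b), a, b, coverage), with a and b 1-indexed as in the text.
    """
    xs = sorted(xs)
    n = len(xs)

    def coverage(a, b):
        # P(a <= B <= b-1) for B ~ Binomial(n, .5).
        return sum(comb(n, k) for k in range(a, b)) / 2 ** n

    best = None
    # Assumption: symmetric interval, b = n + 1 - a. Coverage shrinks as a grows,
    # so keep the largest a that still attains at least 1 - alpha.
    for a in range(1, n // 2 + 1):
        b = n + 1 - a
        if coverage(a, b) >= 1 - alpha:
            best = (a, b)
        else:
            break
    if best is None:
        raise ValueError("sample too small for the requested confidence level")

    a, b = best
    return xs[a - 1], xs[b - 1], a, b, coverage(a, b)


if __name__ == "__main__":
    print(median_ci(range(1, 11)))
```

With n = 10 and alpha = .05, the search stops at a = 2 and b = 9, where the coverage is 1002/1024 (about .979); moving to a = 3 and b = 8 would drop the coverage to 912/1024 (about .891).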
The empirical cumulative distribution function (ecdf) and a confidence interval for the ecdf at a single point
The ecdf at x, ecdf(x), is the proportion of observations <= x. Since the number of observations less than or equal to x is a binomial random variable with parameters n and p = F(x), ecdf(x) has mean F(x) and standard deviation sqrt(F(x)(1 - F(x))/n), which can be estimated by substituting ecdf(x) for F(x). We can therefore develop a confidence interval for F evaluated at x, using either the binomial distribution or (in large samples) a normal approximation to the binomial distribution.
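The large-sample version of this interval can be sketched in Python as follows. The function name ecdf_ci is hypothetical, and two choices here are assumptions rather than something the notes specify: the standard deviation is estimated by plugging ecdf(x) in for F(x), and the interval is clipped to [0, 1].

```python
from math import sqrt
from statistics import NormalDist


def ecdf_ci(xs, x, alpha=0.05):
    """Normal-approximation confidence interval for F(x).

    Returns (f_hat, se, lower, upper), where f_hat = ecdf(x).
    """
    n = len(xs)
    # ecdf(x): proportion of observations <= x.
    f_hat = sum(v <= x for v in xs) / n
    # Estimated standard deviation of ecdf(x), with ecdf(x) in place of F(x).
    se = sqrt(f_hat * (1 - f_hat) / n)
    # Two-sided standard normal critical value.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Assumption: clip the interval so it stays inside [0, 1].
    lower = max(0.0, f_hat - z * se)
    upper = min(1.0, f_hat + z * se)
    return f_hat, se, lower, upper


if __name__ == "__main__":
    print(ecdf_ci([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 4))
```

When ecdf(x) is close to 0 or 1, or n is small, the normal approximation is poor and an interval based directly on the binomial distribution is the safer choice.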