

We can then assess the overall shape, observing how these values tail off, a process that is much more instructive than arguing about ‘p-values’.

Most traditional discussions of confidence intervals assume that intervals are approximately Normal, an assumption Wallis (2021: 297) calls the ‘Normal fallacy’. This conceptual error has dogged discussion of confidence intervals in the statistics literature, and deeply affects how people reason about intervals. In my book, I point out that even the simplest interval about the single proportion p cannot be Normal. Instead we discover that it is profoundly shaped by the boundaries of the probabilistic range P = [0, 1].

Elsewhere on this blog, I have developed the implications of this argument and plotted more distributions.

I show the shape of the distribution for the Wilson score interval based on p (Wilson 1927) and other related distributions. Only when n is large and p central does this distribution approximate to the Normal.

In this blog post we will refer to the interval p ∈ (w⁻, w⁺) in terms of a functional notation. Wallis (2021: 111) proposes two Wilson functions with three parameters.

Lower bound w⁻ = WilsonLower(p, n, α/2) = p′ – e′,

where n is the sample size, p = f/n is the observed proportion and α/2 is the error level for each tail. Each bound should be treated separately, although they converge at p where α = 1.

In ‘An algebra of intervals’ I remarked that identifying a formula for a confidence interval for a metric involves a process of analytical reduction. This is a process of formulating the metric into the simplest possible combination of independent parameters.

As a general rule, before we carry out any computation of intervals, we must first identify whether terms are truly independent. If they are not, we should attempt to simplify the equation into one consisting only of independent proportions (or monotonic functions of them, like odds). This analytical reduction should be the first step, simplifying the formula as far as possible into one consisting only of independent parameters.

As we shall see, identifying ‘the simplest possible combination of independent parameters’ is not always straightforward. Sometimes alternative approaches appear to obtain different results, and we need to consider which is correct.

Percentage difference is an extremely commonly cited measure, but it is one that I have avoided for a number of reasons. We read statements like ‘student numbers rose by 30%’, or ‘the rate of infection fell by 10%’.

Percentage difference is commonplace, and deserves a brief discussion. However, it has some important defects as a measure of change.

Percentage difference d% = (p₂ – p₁) / p₁ = d / p₁, (1)

where, for a change over time, following a treatment, or if one sample is subject to another condition (e.g. an interrogative proportion is compared to a declarative one), p₁ is an initial ‘before’ proportion, and p₂ a subsequent ‘after’ proportion. The resulting score ranges over d% ∈ [–1, ∞], where –1 corresponds to a complete 100% decline (to p₂ = 0) and ∞ to an infinite increase (from p₁ = 0).
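To make the range of percentage difference concrete, here is a minimal Python sketch of Equation (1). The function name `percentage_difference` is my own label for illustration. Returning NaN when both proportions are zero is an assumption on my part, since the text only discusses an increase from p₁ = 0.

```python
import math


def percentage_difference(p1: float, p2: float) -> float:
    """Equation (1): d% = (p2 - p1) / p1 for two proportions in [0, 1]."""
    if not (0.0 <= p1 <= 1.0 and 0.0 <= p2 <= 1.0):
        raise ValueError("p1 and p2 must be proportions in [0, 1]")
    if p1 == 0.0:
        # Any rise from a zero baseline is an infinite increase.
        # Assumption: 0 -> 0 is treated as undefined (NaN).
        return math.inf if p2 > 0.0 else math.nan
    return (p2 - p1) / p1


if __name__ == "__main__":
    print(percentage_difference(0.20, 0.26))  # a rise of roughly 30%
    print(percentage_difference(0.50, 0.00))  # complete decline
    print(percentage_difference(0.00, 0.10))  # increase from zero
```

The second and third calls sit at the two ends of the range: a fall to p₂ = 0 gives the lower limit, and any rise from p₁ = 0 is unbounded.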

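The functional notation for the Wilson bounds can be sketched in Python as follows. This is an illustration under stated assumptions, not the book's implementation: the names `wilson_lower` and `wilson_upper` simply mirror WilsonLower, the third argument is the per-tail error level α/2 as in the text, and the expressions for p′ and e′ are the standard Wilson (1927) score interval centre and width. The upper bound p′ + e′ is not quoted in the text above and is supplied from that standard formula.

```python
import math
from statistics import NormalDist


def _centre_and_width(p: float, n: float, tail: float) -> tuple[float, float]:
    """Return (p', e') for the Wilson score interval; tail is alpha/2."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion in [0, 1]")
    if not 0.0 < tail <= 0.5:
        raise ValueError("tail (alpha/2) must lie in (0, 0.5]")
    # Critical value of the standard Normal distribution for one tail.
    z = NormalDist().inv_cdf(1.0 - tail)
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return centre, width


def wilson_lower(p: float, n: float, tail: float) -> float:
    """Lower bound w- = p' - e'."""
    centre, width = _centre_and_width(p, n, tail)
    return centre - width


def wilson_upper(p: float, n: float, tail: float) -> float:
    """Upper bound w+ = p' + e' (assumed by symmetry with the lower bound)."""
    centre, width = _centre_and_width(p, n, tail)
    return centre + width


if __name__ == "__main__":
    p, n = 0.1, 10
    lo, hi = wilson_lower(p, n, 0.025), wilson_upper(p, n, 0.025)
    # With a skewed p and small n the interval is visibly asymmetric about p.
    print(f"w- = {lo:.4f}, w+ = {hi:.4f}")
    print(f"p - w- = {p - lo:.4f}, w+ - p = {hi - p:.4f}")
```

Setting the tail level to 0.5 (α = 1) makes the critical value zero, so both functions return p. This matches the remark that the two bounds converge at p.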