Probit

Mathematical function, inverse of the standard normal cumulative distribution function

In probability theory and statistics, the probit function is the quantile function associated with the standard normal distribution. It has applications in data analysis and machine learning, in particular exploratory statistical graphics and specialized regression modeling of binary response variables.

Mathematically, the probit is the inverse of the cumulative distribution function of the standard normal distribution, which is denoted as $\Phi (z)$, so the probit is defined as

\operatorname {probit} (p)=\Phi ^{-1}(p)\quad {\text{for}}\quad p\in (0,1).

Largely because of the central limit theorem, the standard normal distribution plays a fundamental role in probability theory and statistics. Since the standard normal distribution places approximately 95% of its probability between −1.96 and 1.96 and is symmetric around zero, it follows that, to three decimal places,

\Phi (-1.96)=0.025=1-\Phi (1.96).

The probit function performs the inverse computation: given a cumulative probability, it returns the corresponding value of a standard normal random variable. Continuing the example,

\operatorname {probit} (0.025)=-1.96=-\operatorname {probit} (0.975).

In general,

\Phi (\operatorname {probit} (p))=p

and

\operatorname {probit} (\Phi (z))=z.
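The identities above can be checked numerically. The following is a minimal sketch in Python, assuming the standard-library class `statistics.NormalDist` as the source of $\Phi$ and its inverse; the helper names `probit` and `phi_cdf` are illustrative and do not come from the article.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def probit(p: float) -> float:
    """Quantile function of the standard normal distribution, for 0 < p < 1."""
    return STD_NORMAL.inv_cdf(p)


def phi_cdf(z: float) -> float:
    """Cumulative distribution function Phi(z) of the standard normal."""
    return STD_NORMAL.cdf(z)


if __name__ == "__main__":
    # The two tail quantiles are negatives of each other, close to -1.96 and 1.96.
    print(probit(0.025), probit(0.975))
    # Round trips: Phi(probit(p)) recovers p, and probit(Phi(z)) recovers z.
    print(phi_cdf(probit(0.3)))
    print(probit(phi_cdf(1.25)))
```

The round trips hold only up to floating-point rounding, so comparisons should use a tolerance rather than exact equality.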

Computation

The normal distribution CDF and its inverse are not available in closed form, and computation requires careful use of numerical procedures. However, the functions are widely available in software for statistics and probability modeling, and in spreadsheets. In Microsoft Excel, for example, the probit function is available as norm.s.inv(p). In computing environments where numerical implementations of the inverse error function are available, the probit function may be obtained as

\operatorname {probit} (p)={\sqrt {2}}\,\operatorname {erf} ^{-1}(2p-1).

An example is MATLAB, where an 'erfinv' function is available. Mathematica provides 'InverseErf'. Other environments implement the probit function directly, as shown in the following session in the R programming language.

> qnorm(0.025)
[1] -1.959964
> pnorm(-1.96)
[1] 0.02499790

Details for computing the inverse error function can be found at . Wichura gives a fast algorithm for computing the probit function to 16 decimal places; this is used in R to generate random variates for the normal distribution.^[6]
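Where no inverse error function is available, the relation $\operatorname{probit}(p)=\sqrt{2}\,\operatorname{erf}^{-1}(2p-1)$ can still be illustrated by inverting `math.erf` numerically. The sketch below uses plain Newton iteration, which is a swapped-in technique chosen for brevity; it is not Wichura's algorithm and not what R or MATLAB do internally. The names `erfinv_newton` and `probit_via_erfinv` are hypothetical.

```python
import math
from statistics import NormalDist


def erfinv_newton(y: float, tol: float = 1e-15, max_iter: int = 100) -> float:
    """Solve erf(x) = y for x by Newton's method (illustrative only)."""
    if not -1.0 < y < 1.0:
        raise ValueError("y must lie strictly between -1 and 1")
    x = 0.0
    for _ in range(max_iter):
        # Newton step f(x)/f'(x) with f(x) = erf(x) - y and
        # f'(x) = 2/sqrt(pi) * exp(-x**2).
        step = (math.erf(x) - y) * math.sqrt(math.pi) / 2.0 * math.exp(x * x)
        x -= step
        if abs(step) <= tol * max(1.0, abs(x)):
            break
    return x


def probit_via_erfinv(p: float) -> float:
    """probit(p) = sqrt(2) * erfinv(2p - 1), for 0 < p < 1."""
    return math.sqrt(2.0) * erfinv_newton(2.0 * p - 1.0)


if __name__ == "__main__":
    for p in (0.025, 0.5, 0.9, 0.975):
        # Compare with the standard library's own normal quantile function.
        print(p, probit_via_erfinv(p), NormalDist().inv_cdf(p))
```

This approach loses accuracy in the extreme tails, where `2p - 1` is so close to ±1 that the subtraction `erf(x) - y` cancels most significant digits. That is one reason dedicated algorithms are preferred in practice.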

An ordinary differential equation for the probit function

Another means of computation is based on forming a non-linear ordinary differential equation (ODE) for probit, as per the Steinbrecher and Shaw method.^[7] Abbreviating the probit function as $w(p)$, the ODE is

{\frac {dw}{dp}}={\frac {1}{f(w)}}

where $f(w)$ is the probability density function of the distribution, evaluated at $w$.

In the case of the Gaussian:

{\frac {dw}{dp}}={\sqrt {2\pi }}\ e^{\frac {w^{2}}{2}}

Differentiating again:

{\frac {d^{2}w}{dp^{2}}}=w\left({\frac {dw}{dp}}\right)^{2}

with the centre (initial) conditions

w\left(1/2\right)=0,

w'\left(1/2\right)={\sqrt {2\pi }}.
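As a rough numerical check of this formulation, the first-order equation $dw/dp=\sqrt{2\pi}\,e^{w^{2}/2}$ can be integrated from the centre condition $w(1/2)=0$ with a classical fourth-order Runge-Kutta scheme. This is a generic integrator chosen for illustration, not the series method of Steinbrecher and Shaw; the function name `probit_ode` and the default step count are assumptions.

```python
import math
from statistics import NormalDist

SQRT_2PI = math.sqrt(2.0 * math.pi)


def probit_ode(p: float, steps: int = 2000) -> float:
    """Integrate dw/dp = sqrt(2*pi) * exp(w**2 / 2) from p = 1/2, w = 0, using RK4."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")

    def rhs(w: float) -> float:
        # Right-hand side 1 / f(w) for the standard normal density f.
        return SQRT_2PI * math.exp(0.5 * w * w)

    h = (p - 0.5) / steps  # signed step: negative when p < 1/2
    w = 0.0
    for _ in range(steps):
        k1 = rhs(w)
        k2 = rhs(w + 0.5 * h * k1)
        k3 = rhs(w + 0.5 * h * k2)
        k4 = rhs(w + h * k3)
        w += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return w


if __name__ == "__main__":
    for p in (0.025, 0.25, 0.5, 0.75, 0.975):
        # Second column: ODE solution; third column: library quantile for comparison.
        print(p, probit_ode(p), NormalDist().inv_cdf(p))
```

Because the right-hand side grows like $e^{w^{2}/2}$, a fixed step size becomes inadequate as $p$ approaches 0 or 1; an adaptive integrator would be the natural refinement.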

This equation may be solved by several methods, including the classical power series approach. From this, solutions of arbitrarily high accuracy may be developed based on Steinbrecher's approach to the series for the inverse error function. The power series solution is given by

w(p)={\sqrt {\frac {\pi }{2}}}\sum _{k=0}^{\infty }{\frac {d_{k}}{(2k+1)}}(2p-1)^{(2k+1)}

where the coefficients $d_{k}$ satisfy the non-linear recurrence

d_{k+1}={\frac {\pi }{4}}\sum _{j=0}^{k}{\frac {d_{j}d_{k-j}}{(j+1)(2j+1)}}

with $d_{0}=1$. In this form the ratio $d_{k+1}/d_{k}\rightarrow 1$ as $k\rightarrow \infty$.
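The recurrence and the series can be evaluated directly. The sketch below truncates the sum after a fixed number of terms; the truncation length `n_terms` and the function names are assumptions made for illustration, and the result is compared against the standard library's quantile function.

```python
import math
from statistics import NormalDist


def probit_series_coefficients(n_terms: int) -> list[float]:
    """Return d_0 .. d_{n_terms-1} from the non-linear recurrence with d_0 = 1."""
    d = [1.0]
    for k in range(n_terms - 1):
        # d_{k+1} = (pi/4) * sum_{j=0}^{k} d_j * d_{k-j} / ((j+1) * (2j+1))
        total = sum(d[j] * d[k - j] / ((j + 1) * (2 * j + 1)) for j in range(k + 1))
        d.append(math.pi / 4.0 * total)
    return d


def probit_series(p: float, n_terms: int = 200) -> float:
    """Truncated power series for probit(p) in powers of (2p - 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    x = 2.0 * p - 1.0
    d = probit_series_coefficients(n_terms)
    total = sum(d[k] / (2 * k + 1) * x ** (2 * k + 1) for k in range(n_terms))
    return math.sqrt(math.pi / 2.0) * total


if __name__ == "__main__":
    for p in (0.5, 0.75, 0.9, 0.975):
        print(p, probit_series(p), NormalDist().inv_cdf(p))
```

Since $d_{k+1}/d_{k}\rightarrow 1$, the terms shrink roughly like $(2p-1)^{2k+1}$, so convergence is fast near $p=1/2$ and slow in the tails, where many more terms would be needed for the same accuracy.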

Logit

Figure: Comparison of the logit function with a scaled probit (i.e. the inverse CDF of the normal distribution), comparing $\operatorname {logit} (x)$ vs. $\Phi ^{-1}(x)/{\sqrt {\frac {\pi }{8}}}$, which makes the slopes the same at the origin.

Closely related to the probit function (and probit model) are the logit function and logit model. The inverse of the logistic function is given by

\operatorname {logit} (p)=\log \left({\frac {p}{1-p}}\right).

Analogously to the probit model, we may assume that such a quantity is related linearly to a set of predictors. This yields the logit model, which is the basis in particular of the logistic regression model, the most prevalent form of regression analysis for categorical response data. In current statistical practice, probit and logit regression models are often handled as cases of the generalized linear model.
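The scaling by $\sqrt{\pi/8}$ used when comparing the two functions can be checked numerically: the logit has slope 4 at $p=1/2$, the probit has slope $\sqrt{2\pi}$ there, and $\sqrt{2\pi}/\sqrt{\pi/8}=4$. The following sketch estimates both slopes by central differences; the helper names and the step size `h` are illustrative choices.

```python
import math
from statistics import NormalDist


def logit(p: float) -> float:
    """Log-odds of p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def scaled_probit(p: float) -> float:
    """Probit divided by sqrt(pi/8), so its slope at p = 1/2 matches the logit's."""
    return NormalDist().inv_cdf(p) / math.sqrt(math.pi / 8.0)


def slope_at_half(func, h: float = 1e-5) -> float:
    """Central-difference estimate of the derivative of func at p = 1/2."""
    return (func(0.5 + h) - func(0.5 - h)) / (2.0 * h)


if __name__ == "__main__":
    print(slope_at_half(logit), slope_at_half(scaled_probit))
    # Away from p = 1/2 the two curves separate, with the logit having heavier tails.
    for p in (0.6, 0.9, 0.99):
        print(p, logit(p), scaled_probit(p))
```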

This article uses material from the Wikipedia article Probit, and is written by contributors. Text is available under a CC BY-SA 4.0 International License; additional terms may apply. Images, videos and audio are available under their respective licenses.

[1] Bliss, C. I. (1934). "The method of probits". Science. 79 (2037): 38–39. Bibcode:1934Sci....79...38B. doi:10.1126/science.79.2037.38. JSTOR 1659792. PMID 17813446.

[2] Bliss 1934, p. 39.

[3] Finney, D. J. (1947). Probit Analysis (1st ed.). Cambridge University Press, Cambridge, UK.

[4] Finney, D. J. (1971). Probit Analysis (3rd ed.). Cambridge University Press, Cambridge, UK. ISBN 0-521-08041-X. OCLC 174198382.

[5] Collett, D. (1991). Modelling Binary Data. Chapman and Hall / CRC.

[6] Wichura, M. J. (1988). "Algorithm AS241: The Percentage Points of the Normal Distribution". Applied Statistics. 37 (3). Blackwell Publishing: 477–484. doi:10.2307/2347330. JSTOR 2347330.

[7] Steinbrecher, G.; Shaw, W. T. (2008). "Quantile mechanics". European Journal of Applied Mathematics. 19 (2): 87–112. doi:10.1017/S0956792508007341. S2CID 6899308.
