Lp norm linear regression
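To find the parameters $\boldsymbol{\beta} = (\beta_1, \dots, \beta_k)^T$ which minimize the $L_p$ norm for the linear regression problem,

$$\underset{\boldsymbol{\beta}}{\arg\min}\; \|\mathbf{y} - X\boldsymbol{\beta}\|_p = \underset{\boldsymbol{\beta}}{\arg\min} \sum_{i=1}^{n} \left|y_i - X_i \boldsymbol{\beta}\right|^p,$$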
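the IRLS algorithm at step $t + 1$ involves solving the weighted linear least squares problem:[4]

$$\boldsymbol{\beta}^{(t+1)} = \underset{\boldsymbol{\beta}}{\arg\min} \sum_{i=1}^{n} w_i^{(t)} \left|y_i - X_i \boldsymbol{\beta}\right|^2 = \left(X^T W^{(t)} X\right)^{-1} X^T W^{(t)} \mathbf{y},$$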
where $W^{(t)}$ is the diagonal matrix of weights, usually with all elements set initially to:

$$w_i^{(0)} = 1$$
and updated after each iteration to:

$$w_i^{(t)} = \left|y_i - X_i \boldsymbol{\beta}^{(t)}\right|^{p-2}.$$
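
A brief sketch of why weights of the form $|y_i - X_i \boldsymbol{\beta}^{(t)}|^{p-2}$ are a natural choice: each term of the $L_p$ objective can be written as a squared residual multiplied by a factor, and IRLS holds that factor fixed at the previous iterate so that the remaining problem is an ordinary weighted least squares problem. The approximation sign below marks that substitution and is meant as an illustration rather than a convergence proof.

$$\left|y_i - X_i \boldsymbol{\beta}\right|^p = \left|y_i - X_i \boldsymbol{\beta}\right|^{p-2} \left(y_i - X_i \boldsymbol{\beta}\right)^2 \approx \underbrace{\left|y_i - X_i \boldsymbol{\beta}^{(t)}\right|^{p-2}}_{w_i^{(t)}} \left(y_i - X_i \boldsymbol{\beta}\right)^2$$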
In the case $p = 1$, this corresponds to least absolute deviation regression (in this case, the problem would be better approached by linear programming methods,[5] so that the result would be exact), and the formula is:

$$w_i^{(t)} = \frac{1}{\left|y_i - X_i \boldsymbol{\beta}^{(t)}\right|}.$$
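To avoid division by zero, the weights must be regularized, so in practice the formula is:

$$w_i^{(t)} = \frac{1}{\max\left\{\delta, \left|y_i - X_i \boldsymbol{\beta}^{(t)}\right|\right\}},$$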
where $\delta$ is some small value, like 0.0001.[5] Note that the use of $\delta$ in the weighting function is equivalent to the Huber loss function in robust estimation.[6]
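
The iteration described in this section can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not a reference implementation: the function name `irls_lp`, the stopping rule, and the parameters `max_iter` and `tol` are assumptions introduced here, and the floor `delta` is applied for every `p`, not only for `p = 1`.

```python
import numpy as np

def irls_lp(X, y, p=1.0, delta=1e-4, max_iter=100, tol=1e-8):
    """Approximate argmin_beta ||y - X beta||_p by IRLS (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.ones(X.shape[0])        # initial weights w_i^(0) = 1
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        # Weighted least squares step: solve (X^T W X) beta = X^T W y.
        XtW = X.T * w              # equivalent to X.T @ np.diag(w)
        beta_new = np.linalg.solve(XtW @ X, XtW @ y)
        # Weight update |r_i|^(p-2), with |r_i| floored at delta so that
        # a zero residual cannot cause a division by zero when p < 2.
        r = np.abs(y - X @ beta_new)
        w = np.maximum(delta, r) ** (p - 2)
        # Assumed stopping rule: relative change in the coefficients.
        converged = np.linalg.norm(beta_new - beta) <= tol * (1 + np.linalg.norm(beta))
        beta = beta_new
        if converged:
            break
    return beta

# Usage: points on the line y = 1 + 2x with one gross outlier.
x = np.arange(10.0)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x
y[5] += 50.0
print(irls_lp(X, y, p=1.0))   # L1 fit, largely insensitive to the outlier
print(irls_lp(X, y, p=2.0))   # weights stay at 1, i.e. ordinary least squares
```

With `p = 2` the weights remain equal to one, so the sketch reduces to ordinary least squares; with `p = 1` the outlier is progressively down-weighted and the fit should land close to the line through the remaining points, up to an error on the order of `delta`.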