Linear regression minimising MAD in sklearn [on hold]

The standard sklearn linear regression class finds an approximate linear relationship between variate and covariates that minimises the mean squared error (MSE). Specifically, let $N$ be the number of observations and, for simplicity, let us ignore the intercept. Let $y_j$ be the variate value of the $j$-th observation and $x_{1,j}, \dots, x_{n,j}$ be the values of the $n$ covariates of the $j$-th observation. The linear relationship is of the form
$$ y = \beta_1 x_1 + \dots + \beta_n x_n,$$
where the coefficients $\beta_1, \dots, \beta_n$ are given by
$$\beta_1, \dots, \beta_n = \underset{\tilde\beta_1, \dots, \tilde\beta_n}{\mathrm{argmin}} \left( \sum_{j = 1}^N \left( y_j - \tilde\beta_1 x_{1, j} - \dots - \tilde\beta_n x_{n, j}\right)^2 \right).$$



I now wish to find the coefficients that minimise the mean absolute deviation (MAD) instead of the mean squared error. Namely, I want the coefficients given by
$$\beta_1, \dots, \beta_n = \underset{\tilde\beta_1, \dots, \tilde\beta_n}{\mathrm{argmin}} \left( \sum_{j = 1}^N \left| y_j - \tilde\beta_1 x_{1, j} - \dots - \tilde\beta_n x_{n, j}\right| \right).$$



I understand that, in sharp contrast to the MSE case, the non-differentiability of the absolute value function at $0$ means there is no closed-form solution in the MAD case. But the latter is still a convex optimisation problem, and, according to this answer, it can be solved by means of linear programming.
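For concreteness, here is a minimal sketch of that linear-programming reformulation, assuming scipy is available (the data and variable names are purely illustrative): each residual is bounded by an auxiliary variable $t_j \ge |y_j - \sum_i \tilde\beta_i x_{i,j}|$, and $\sum_j t_j$ is minimised.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
N, n = 100, 3                                   # observations, covariates
X = rng.normal(size=(N, n))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.laplace(scale=0.1, size=N)

# Decision variables z = [beta (n entries), t (N entries)]; minimise sum(t)
# subject to |y - X beta| <= t, written as two sets of linear constraints:
#   X beta - t <= y   and   -X beta - t <= -y
c = np.concatenate([np.zeros(n), np.ones(N)])
A_ub = np.block([[X, -np.eye(N)], [-X, -np.eye(N)]])
b_ub = np.concatenate([y, -y])
bounds = [(None, None)] * n + [(0, None)] * N   # beta free, t >= 0

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
beta_hat = res.x[:n]                            # the LAD coefficients
```

With enough observations, `beta_hat` recovers `beta_true` up to the noise level.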



Is it possible to implement this linear regression in sklearn? What about using other statistics toolkits?













put on hold as off-topic by Peter Flom 1 hour ago


This question appears to be off-topic. The users who voted to close gave this specific reason:


  • "This question appears to be off-topic because EITHER it is not about statistics, machine learning, data analysis, data mining, or data visualization, OR it focuses on programming, debugging, or performing routine operations within a statistical computing platform. If the latter, you could try the support links we maintain." – Peter Flom

If this question can be reworded to fit the rules in the help center, please edit the question.









  • I just nominated this for reopening. Yes, the question is about how to perform a task in sklearn or Python in general. But it needs statistical expertise to understand or answer, which is explicitly on-topic. – Stephan Kolassa, 38 mins ago










  • @StephanKolassa I agree with you - the question should be reopened. – James Phillips, 6 mins ago
















regression multiple-regression scikit-learn






edited 1 hour ago by Stephan Kolassa
asked 3 hours ago by Giovanni De Gaetano (new contributor)
1 Answer
The expected MAD is minimized by the median of the distribution (Hanley, 2001, The American Statistician). Therefore, you are looking for a model that will yield the conditional median, instead of the conditional mean.
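This fact is easy to check numerically; the following sketch (illustrative, using only numpy) scans candidate centres and confirms that the sample median minimises the mean absolute deviation:

```python
import numpy as np

rng = np.random.default_rng(7)
z = rng.lognormal(size=1000)          # a skewed sample, so mean != median

def mean_abs_dev(c):
    """Mean absolute deviation of the sample around a candidate centre c."""
    return np.mean(np.abs(z - c))

# The sample median does at least as well as the sample mean on MAD.
assert mean_abs_dev(np.median(z)) <= mean_abs_dev(np.mean(z))

# Scanning a grid of candidate centres, the minimiser lands (up to grid
# resolution) on the sample median.
grid = np.linspace(z.min(), z.max(), 1001)
best = grid[np.argmin([mean_abs_dev(c) for c in grid])]
```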



This is a special case of quantile regression, specifically for the 50% quantile. Roger Koenker is the main guru for quantile regression; see in particular his eponymous book.



There are ways to do quantile regression in Python. This tutorial may be helpful. If you are open to using R, you can use the quantreg package.






  • In Python it is available via statsmodels: statsmodels.org/dev/generated/… – Tim, 2 hours ago










  • Thanks! It is an easy way to look at the problem indeed... – Giovanni De Gaetano, 2 hours ago

















