Slow Query with LIMIT
We have this query:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
It runs fine under 0.1ms. But when we add LIMIT:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 5
It runs over 20,000ms. Some notes:
- The messages table has over 300,000 rows, not that big.
- We have indexes on all 3 columns in the WHERE condition.
- The account_id and created_at conditions filter out 3,000 messages from the 300,000 rows. Out of the 3,000 messages in that account_id, there are only 3 unread messages.
So we isolated each WHERE condition and found out about a certain behaviour -- we tweaked the LIMIT to correspond with the amount of unread messages:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 3
It runs under 0.100ms. Any explanation for the difference in performance?
postgresql select postgresql-performance postgresql-9.2 offset-fetch
bumped to the homepage by Community♦ 9 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
add a comment |
We have this query:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
It runs fine under 0.1ms. But when we add LIMIT:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 5
It runs over 20,000ms. Some notes:
- The messages table has over 300,000 rows, not that big.
- We have indexes on all 3 columns in the WHERE condition.
- The account_id and created_at conditions filter out 3,000 messages from the 300,000 rows. Out of the 3,000 messages in that account_id, there are only 3 unread messages.
So we isolated each WHERE condition and found out about a certain behaviour -- we tweaked the LIMIT to correspond with the amount of unread messages:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 3
It runs under 0.100ms. Any explanation for the difference in performance?
postgresql select postgresql-performance postgresql-9.2 offset-fetch
bumped to the homepage by Community♦ 9 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Key to understanding are the respective query plans. RunEXPLAIN ANALYZE
post the output on explain.depesz.com and add the links to your question. You'll see different plans for the two queries.
– Erwin Brandstetter
Oct 24 '14 at 3:29
can you post an ... explain (costs true, buffers true, analyze true) SELECT ... of both queries? it will reveal the problem. fixing it should be easy them. many thanks, hans
– Hans-Jürgen Schönig
Oct 27 '14 at 11:50
add a comment |
We have this query:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
It runs fine under 0.1ms. But when we add LIMIT:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 5
It runs over 20,000ms. Some notes:
- The messages table has over 300,000 rows, not that big.
- We have indexes on all 3 columns in the WHERE condition.
- The account_id and created_at conditions filter out 3,000 messages from the 300,000 rows. Out of the 3,000 messages in that account_id, there are only 3 unread messages.
So we isolated each WHERE condition and found out about a certain behaviour -- we tweaked the LIMIT to correspond with the amount of unread messages:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 3
It runs under 0.100ms. Any explanation for the difference in performance?
postgresql select postgresql-performance postgresql-9.2 offset-fetch
We have this query:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
It runs fine under 0.1ms. But when we add LIMIT:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 5
It runs over 20,000ms. Some notes:
- The messages table has over 300,000 rows, not that big.
- We have indexes on all 3 columns in the WHERE condition.
- The account_id and created_at conditions filter out 3,000 messages from the 300,000 rows. Out of the 3,000 messages in that account_id, there are only 3 unread messages.
So we isolated each WHERE condition and found out about a certain behaviour -- we tweaked the LIMIT to correspond with the amount of unread messages:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 3
It runs under 0.100ms. Any explanation for the difference in performance?
postgresql select postgresql-performance postgresql-9.2 offset-fetch
postgresql select postgresql-performance postgresql-9.2 offset-fetch
edited Aug 8 '18 at 19:51
Evan Carroll
32.1k969219
32.1k969219
asked Oct 24 '14 at 1:32
Ace SubidoAce Subido
111
111
bumped to the homepage by Community♦ 9 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
bumped to the homepage by Community♦ 9 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Key to understanding are the respective query plans. RunEXPLAIN ANALYZE
post the output on explain.depesz.com and add the links to your question. You'll see different plans for the two queries.
– Erwin Brandstetter
Oct 24 '14 at 3:29
can you post an ... explain (costs true, buffers true, analyze true) SELECT ... of both queries? it will reveal the problem. fixing it should be easy them. many thanks, hans
– Hans-Jürgen Schönig
Oct 27 '14 at 11:50
add a comment |
Key to understanding are the respective query plans. RunEXPLAIN ANALYZE
post the output on explain.depesz.com and add the links to your question. You'll see different plans for the two queries.
– Erwin Brandstetter
Oct 24 '14 at 3:29
can you post an ... explain (costs true, buffers true, analyze true) SELECT ... of both queries? it will reveal the problem. fixing it should be easy them. many thanks, hans
– Hans-Jürgen Schönig
Oct 27 '14 at 11:50
Key to understanding are the respective query plans. Run
EXPLAIN ANALYZE
post the output on explain.depesz.com and add the links to your question. You'll see different plans for the two queries.– Erwin Brandstetter
Oct 24 '14 at 3:29
Key to understanding are the respective query plans. Run
EXPLAIN ANALYZE
post the output on explain.depesz.com and add the links to your question. You'll see different plans for the two queries.– Erwin Brandstetter
Oct 24 '14 at 3:29
can you post an ... explain (costs true, buffers true, analyze true) SELECT ... of both queries? it will reveal the problem. fixing it should be easy them. many thanks, hans
– Hans-Jürgen Schönig
Oct 27 '14 at 11:50
can you post an ... explain (costs true, buffers true, analyze true) SELECT ... of both queries? it will reveal the problem. fixing it should be easy them. many thanks, hans
– Hans-Jürgen Schönig
Oct 27 '14 at 11:50
add a comment |
1 Answer
1
active
oldest
votes
Adding LIMIT
to the query often makes Postgres choose a different query plan. If statistics or cost estimates stray too far from actual data distribution / actual costs, you may end up with a slower query, even though Postgres reckoned it would be faster. Irregular data distribution may cause the query planner to misjudge the selectivity of WHERE
conditions.
In particular, increasing the STATISTICS
target for status
, account_id
, created_at
and id
might help.
Details:
- How can I avoid PostgreSQL sometimes choosing a bad query plan for one of two nearly identical queries?
Aside from that, a (partial) multi column index may help. Depends on the complete situation. Which of your WHERE
conditions are constant? How can they vary? Is some additional maintenance cost and disk space acceptable for a special index?
For example:
CREATE INDEX messages_special_idx ON messages (account_id, created_at, id)
WHERE status = 'unread'
AND created_at >= '2014-01-01 01:00:00';
Run ANALYZE messages;
before you try your queries again.
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f80988%2fslow-query-with-limit%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Adding LIMIT
to the query often makes Postgres choose a different query plan. If statistics or cost estimates stray too far from actual data distribution / actual costs, you may end up with a slower query, even though Postgres reckoned it would be faster. Irregular data distribution may cause the query planner to misjudge the selectivity of WHERE
conditions.
In particular, increasing the STATISTICS
target for status
, account_id
, created_at
and id
might help.
Details:
- How can I avoid PostgreSQL sometimes choosing a bad query plan for one of two nearly identical queries?
Aside from that, a (partial) multi column index may help. Depends on the complete situation. Which of your WHERE
conditions are constant? How can they vary? Is some additional maintenance cost and disk space acceptable for a special index?
For example:
CREATE INDEX messages_special_idx ON messages (account_id, created_at, id)
WHERE status = 'unread'
AND created_at >= '2014-01-01 01:00:00';
Run ANALYZE messages;
before you try your queries again.
add a comment |
Adding LIMIT
to the query often makes Postgres choose a different query plan. If statistics or cost estimates stray too far from actual data distribution / actual costs, you may end up with a slower query, even though Postgres reckoned it would be faster. Irregular data distribution may cause the query planner to misjudge the selectivity of WHERE
conditions.
In particular, increasing the STATISTICS
target for status
, account_id
, created_at
and id
might help.
Details:
- How can I avoid PostgreSQL sometimes choosing a bad query plan for one of two nearly identical queries?
Aside from that, a (partial) multi column index may help. Depends on the complete situation. Which of your WHERE
conditions are constant? How can they vary? Is some additional maintenance cost and disk space acceptable for a special index?
For example:
CREATE INDEX messages_special_idx ON messages (account_id, created_at, id)
WHERE status = 'unread'
AND created_at >= '2014-01-01 01:00:00';
Run ANALYZE messages;
before you try your queries again.
add a comment |
Adding LIMIT
to the query often makes Postgres choose a different query plan. If statistics or cost estimates stray too far from actual data distribution / actual costs, you may end up with a slower query, even though Postgres reckoned it would be faster. Irregular data distribution may cause the query planner to misjudge the selectivity of WHERE
conditions.
In particular, increasing the STATISTICS
target for status
, account_id
, created_at
and id
might help.
Details:
- How can I avoid PostgreSQL sometimes choosing a bad query plan for one of two nearly identical queries?
Aside from that, a (partial) multi column index may help. Depends on the complete situation. Which of your WHERE
conditions are constant? How can they vary? Is some additional maintenance cost and disk space acceptable for a special index?
For example:
CREATE INDEX messages_special_idx ON messages (account_id, created_at, id)
WHERE status = 'unread'
AND created_at >= '2014-01-01 01:00:00';
Run ANALYZE messages;
before you try your queries again.
Adding LIMIT
to the query often makes Postgres choose a different query plan. If statistics or cost estimates stray too far from actual data distribution / actual costs, you may end up with a slower query, even though Postgres reckoned it would be faster. Irregular data distribution may cause the query planner to misjudge the selectivity of WHERE
conditions.
In particular, increasing the STATISTICS
target for status
, account_id
, created_at
and id
might help.
Details:
- How can I avoid PostgreSQL sometimes choosing a bad query plan for one of two nearly identical queries?
Aside from that, a (partial) multi column index may help. Depends on the complete situation. Which of your WHERE
conditions are constant? How can they vary? Is some additional maintenance cost and disk space acceptable for a special index?
For example:
CREATE INDEX messages_special_idx ON messages (account_id, created_at, id)
WHERE status = 'unread'
AND created_at >= '2014-01-01 01:00:00';
Run ANALYZE messages;
before you try your queries again.
edited Dec 28 '17 at 7:17
answered Oct 24 '14 at 3:53
Erwin BrandstetterErwin Brandstetter
92.1k9175288
92.1k9175288
add a comment |
add a comment |
Thanks for contributing an answer to Database Administrators Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f80988%2fslow-query-with-limit%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Key to understanding are the respective query plans. Run
EXPLAIN ANALYZE
post the output on explain.depesz.com and add the links to your question. You'll see different plans for the two queries.– Erwin Brandstetter
Oct 24 '14 at 3:29
can you post an ... explain (costs true, buffers true, analyze true) SELECT ... of both queries? it will reveal the problem. fixing it should be easy them. many thanks, hans
– Hans-Jürgen Schönig
Oct 27 '14 at 11:50