Slow Query with LIMIT












2















We have this query:



SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC


It runs fine under 0.1ms. But when we add LIMIT:



SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 5


It runs over 20,000ms. Some notes:




  • The messages table has over 300,000 rows, not that big.

  • We have indexes on all 3 columns in the WHERE condition.

  • The account_id and created_at conditions filter out 3,000 messages from the 300,000 rows. Out of the 3,000 messages in that account_id, there are only 3 unread messages.


So we isolated each WHERE condition and found out about a certain behaviour -- we tweaked the LIMIT to correspond with the amount of unread messages:



SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 3


It runs under 0.100ms. Any explanation for the difference in performance?










share|improve this question
















bumped to the homepage by Community 9 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
















  • Key to understanding are the respective query plans. Run EXPLAIN ANALYZE post the output on explain.depesz.com and add the links to your question. You'll see different plans for the two queries.

    – Erwin Brandstetter
    Oct 24 '14 at 3:29













  • can you post an ... explain (costs true, buffers true, analyze true) SELECT ... of both queries? it will reveal the problem. fixing it should be easy them. many thanks, hans

    – Hans-Jürgen Schönig
    Oct 27 '14 at 11:50
















2















We have this query:



SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC


It runs fine under 0.1ms. But when we add LIMIT:



SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 5


It runs over 20,000ms. Some notes:




  • The messages table has over 300,000 rows, not that big.

  • We have indexes on all 3 columns in the WHERE condition.

  • The account_id and created_at conditions filter out 3,000 messages from the 300,000 rows. Out of the 3,000 messages in that account_id, there are only 3 unread messages.


So we isolated each WHERE condition and found out about a certain behaviour -- we tweaked the LIMIT to correspond with the amount of unread messages:



SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 3


It runs under 0.100ms. Any explanation for the difference in performance?










share|improve this question
















bumped to the homepage by Community 9 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
















  • Key to understanding are the respective query plans. Run EXPLAIN ANALYZE post the output on explain.depesz.com and add the links to your question. You'll see different plans for the two queries.

    – Erwin Brandstetter
    Oct 24 '14 at 3:29













  • can you post an ... explain (costs true, buffers true, analyze true) SELECT ... of both queries? it will reveal the problem. fixing it should be easy them. many thanks, hans

    – Hans-Jürgen Schönig
    Oct 27 '14 at 11:50














2












2








2








We have this query:



SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC


It runs fine under 0.1ms. But when we add LIMIT:



SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 5


It runs over 20,000ms. Some notes:




  • The messages table has over 300,000 rows, not that big.

  • We have indexes on all 3 columns in the WHERE condition.

  • The account_id and created_at conditions filter out 3,000 messages from the 300,000 rows. Out of the 3,000 messages in that account_id, there are only 3 unread messages.


So we isolated each WHERE condition and found out about a certain behaviour -- we tweaked the LIMIT to correspond with the amount of unread messages:



SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 3


It runs under 0.100ms. Any explanation for the difference in performance?










share|improve this question
















We have this query:



SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC


It runs fine under 0.1ms. But when we add LIMIT:



SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 5


It runs over 20,000ms. Some notes:




  • The messages table has over 300,000 rows, not that big.

  • We have indexes on all 3 columns in the WHERE condition.

  • The account_id and created_at conditions filter out 3,000 messages from the 300,000 rows. Out of the 3,000 messages in that account_id, there are only 3 unread messages.


So we isolated each WHERE condition and found out about a certain behaviour -- we tweaked the LIMIT to correspond with the amount of unread messages:



SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 3


It runs under 0.100ms. Any explanation for the difference in performance?







postgresql select postgresql-performance postgresql-9.2 offset-fetch






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Aug 8 '18 at 19:51









Evan Carroll

32.1k969219




32.1k969219










asked Oct 24 '14 at 1:32









Ace SubidoAce Subido

111




111





bumped to the homepage by Community 9 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.







bumped to the homepage by Community 9 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.















  • Key to understanding are the respective query plans. Run EXPLAIN ANALYZE post the output on explain.depesz.com and add the links to your question. You'll see different plans for the two queries.

    – Erwin Brandstetter
    Oct 24 '14 at 3:29













  • can you post an ... explain (costs true, buffers true, analyze true) SELECT ... of both queries? it will reveal the problem. fixing it should be easy them. many thanks, hans

    – Hans-Jürgen Schönig
    Oct 27 '14 at 11:50



















  • Key to understanding are the respective query plans. Run EXPLAIN ANALYZE post the output on explain.depesz.com and add the links to your question. You'll see different plans for the two queries.

    – Erwin Brandstetter
    Oct 24 '14 at 3:29













  • can you post an ... explain (costs true, buffers true, analyze true) SELECT ... of both queries? it will reveal the problem. fixing it should be easy them. many thanks, hans

    – Hans-Jürgen Schönig
    Oct 27 '14 at 11:50

















Key to understanding are the respective query plans. Run EXPLAIN ANALYZE post the output on explain.depesz.com and add the links to your question. You'll see different plans for the two queries.

– Erwin Brandstetter
Oct 24 '14 at 3:29







Key to understanding are the respective query plans. Run EXPLAIN ANALYZE post the output on explain.depesz.com and add the links to your question. You'll see different plans for the two queries.

– Erwin Brandstetter
Oct 24 '14 at 3:29















can you post an ... explain (costs true, buffers true, analyze true) SELECT ... of both queries? it will reveal the problem. fixing it should be easy them. many thanks, hans

– Hans-Jürgen Schönig
Oct 27 '14 at 11:50





can you post an ... explain (costs true, buffers true, analyze true) SELECT ... of both queries? it will reveal the problem. fixing it should be easy them. many thanks, hans

– Hans-Jürgen Schönig
Oct 27 '14 at 11:50










1 Answer
1






active

oldest

votes


















0














Adding LIMIT to the query often makes Postgres choose a different query plan. If statistics or cost estimates stray too far from actual data distribution / actual costs, you may end up with a slower query, even though Postgres reckoned it would be faster. Irregular data distribution may cause the query planner to misjudge the selectivity of WHERE conditions.



In particular, increasing the STATISTICS target for status, account_id, created_at and id might help.



Details:




  • How can I avoid PostgreSQL sometimes choosing a bad query plan for one of two nearly identical queries?


Aside from that, a (partial) multi column index may help. Depends on the complete situation. Which of your WHERE conditions are constant? How can they vary? Is some additional maintenance cost and disk space acceptable for a special index?



For example:



CREATE INDEX messages_special_idx ON messages (account_id, created_at, id)
WHERE status = 'unread'
AND created_at >= '2014-01-01 01:00:00';


Run ANALYZE messages; before you try your queries again.






share|improve this answer

























    Your Answer








    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "182"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f80988%2fslow-query-with-limit%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    0














    Adding LIMIT to the query often makes Postgres choose a different query plan. If statistics or cost estimates stray too far from actual data distribution / actual costs, you may end up with a slower query, even though Postgres reckoned it would be faster. Irregular data distribution may cause the query planner to misjudge the selectivity of WHERE conditions.



    In particular, increasing the STATISTICS target for status, account_id, created_at and id might help.



    Details:




    • How can I avoid PostgreSQL sometimes choosing a bad query plan for one of two nearly identical queries?


    Aside from that, a (partial) multi column index may help. Depends on the complete situation. Which of your WHERE conditions are constant? How can they vary? Is some additional maintenance cost and disk space acceptable for a special index?



    For example:



    CREATE INDEX messages_special_idx ON messages (account_id, created_at, id)
    WHERE status = 'unread'
    AND created_at >= '2014-01-01 01:00:00';


    Run ANALYZE messages; before you try your queries again.






    share|improve this answer






























      0














      Adding LIMIT to the query often makes Postgres choose a different query plan. If statistics or cost estimates stray too far from actual data distribution / actual costs, you may end up with a slower query, even though Postgres reckoned it would be faster. Irregular data distribution may cause the query planner to misjudge the selectivity of WHERE conditions.



      In particular, increasing the STATISTICS target for status, account_id, created_at and id might help.



      Details:




      • How can I avoid PostgreSQL sometimes choosing a bad query plan for one of two nearly identical queries?


      Aside from that, a (partial) multi column index may help. Depends on the complete situation. Which of your WHERE conditions are constant? How can they vary? Is some additional maintenance cost and disk space acceptable for a special index?



      For example:



      CREATE INDEX messages_special_idx ON messages (account_id, created_at, id)
      WHERE status = 'unread'
      AND created_at >= '2014-01-01 01:00:00';


      Run ANALYZE messages; before you try your queries again.






      share|improve this answer




























        0












        0








        0







        Adding LIMIT to the query often makes Postgres choose a different query plan. If statistics or cost estimates stray too far from actual data distribution / actual costs, you may end up with a slower query, even though Postgres reckoned it would be faster. Irregular data distribution may cause the query planner to misjudge the selectivity of WHERE conditions.



        In particular, increasing the STATISTICS target for status, account_id, created_at and id might help.



        Details:




        • How can I avoid PostgreSQL sometimes choosing a bad query plan for one of two nearly identical queries?


        Aside from that, a (partial) multi column index may help. Depends on the complete situation. Which of your WHERE conditions are constant? How can they vary? Is some additional maintenance cost and disk space acceptable for a special index?



        For example:



        CREATE INDEX messages_special_idx ON messages (account_id, created_at, id)
        WHERE status = 'unread'
        AND created_at >= '2014-01-01 01:00:00';


        Run ANALYZE messages; before you try your queries again.






        share|improve this answer















        Adding LIMIT to the query often makes Postgres choose a different query plan. If statistics or cost estimates stray too far from actual data distribution / actual costs, you may end up with a slower query, even though Postgres reckoned it would be faster. Irregular data distribution may cause the query planner to misjudge the selectivity of WHERE conditions.



        In particular, increasing the STATISTICS target for status, account_id, created_at and id might help.



        Details:




        • How can I avoid PostgreSQL sometimes choosing a bad query plan for one of two nearly identical queries?


        Aside from that, a (partial) multi column index may help. Depends on the complete situation. Which of your WHERE conditions are constant? How can they vary? Is some additional maintenance cost and disk space acceptable for a special index?



        For example:



        CREATE INDEX messages_special_idx ON messages (account_id, created_at, id)
        WHERE status = 'unread'
        AND created_at >= '2014-01-01 01:00:00';


        Run ANALYZE messages; before you try your queries again.







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited Dec 28 '17 at 7:17

























        answered Oct 24 '14 at 3:53









        Erwin BrandstetterErwin Brandstetter

        92.1k9175288




        92.1k9175288






























            draft saved

            draft discarded




















































            Thanks for contributing an answer to Database Administrators Stack Exchange!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f80988%2fslow-query-with-limit%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            SQL Server 17 - Attemping to backup to remote NAS but Access is denied

            Always On Availability groups resolving state after failover - Remote harden of transaction...

            Restoring from pg_dump with foreign key constraints