Suboptimal query plan when updating partitioned table













Background




  • I have a simple CTE used to update a declaratively-partitioned table.

  • The subquery runs quickly (1.7 s per EXPLAIN ANALYZE) and returns 3,769 records (CTE y in the query below).

  • The UPDATE modifies only non-indexed columns of the partitioned table. A few characteristics of the table:

    • Contains 21 million records

    • 184 partitions (yes, too many); each child partition has its own primary key, foreign keys, and indexes

    • fillfactor=100 (I plan to lower it so that HOT updates can stay on the same page, but I am not certain that the query plan is affected by the lack of free page space)
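
For reference, the fillfactor change mentioned in the last bullet could look like the sketch below. The partition name parent_table_p001 and the value 90 are placeholders, not taken from the actual schema. The setting only affects pages written afterwards, so existing pages keep their current packing until the table is rewritten.

```sql
-- Hypothetical child partition name and value; repeat for each partition.
ALTER TABLE parent_table_p001 SET (fillfactor = 90);

-- Optional and heavy: rewrites the table so existing pages honor the setting.
-- Holds an ACCESS EXCLUSIVE lock on the partition while it runs.
VACUUM FULL parent_table_p001;
```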




The problem arises in the UPDATE step. The abridged EXPLAIN output below shows the part of the plan that runs after the CTE subquery: a hash join over a nested loop, repeated for each child partition:



-> Hash Join   /* Run for each child partition, based on CTE PK = child table PK */
     -> Nested Loop
          -> CTE Scan
          -> Append
               -> Index Scan   /* for index on each child partition */
               ...             /* Index scans (for each child partition) */
     -> Hash
          -> Seq Scan   /* on child table */
...   /* Hash Joins (for each child partition) */


Query



The following UPDATE statement is the one causing the issue. It makes two window-function calls on a value from parent_table; because the calls cannot be nested in a single SQL statement, two CTEs are used. It then updates the same parent_table with the result (the functions are expensive, so the result is stored in the table itself).



WITH x AS (
    SELECT t."p1", t."p2", f(t."b1") OVER "win_x" AS "c1"
    FROM parent_table AS "t"
    WHERE t."p1" IN ('val1','val2')
    WINDOW "win_x" AS (PARTITION BY "p1" ORDER BY "p1","p2")
), y AS (
    SELECT x."p1", x."p2", f(x."c1") OVER "win_y" AS "c2"
    FROM x
    WINDOW "win_y" AS (PARTITION BY "p1" ORDER BY "p1","p2")
)
UPDATE parent_table AS "t2"
SET ("a1")=(y."c2")
FROM y INNER JOIN parent_table AS "t" USING ("p1","p2")
WHERE t2."p1"=y."p1" AND t2."p2"=y."p2";
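
One variant that may be worth comparing under EXPLAIN is sketched below. It assumes that ("p1","p2") uniquely identifies a row in parent_table, as the plan comment about the child-table PK suggests. Here the FROM clause lists only y, so parent_table appears once as the update target instead of being joined a second time. Whether this actually changes the per-partition plan on 10.3 is not confirmed here.

```sql
-- Sketch only: same CTEs and names (f, "b1", "a1") as in the query above.
-- Assumes ("p1","p2") is unique in parent_table, so the extra join to
-- parent_table AS "t" adds no rows and can be dropped.
EXPLAIN
WITH x AS (
    SELECT t."p1", t."p2", f(t."b1") OVER "win_x" AS "c1"
    FROM parent_table AS "t"
    WHERE t."p1" IN ('val1','val2')
    WINDOW "win_x" AS (PARTITION BY "p1" ORDER BY "p1","p2")
), y AS (
    SELECT x."p1", x."p2", f(x."c1") OVER "win_y" AS "c2"
    FROM x
    WINDOW "win_y" AS (PARTITION BY "p1" ORDER BY "p1","p2")
)
UPDATE parent_table AS "t2"
SET "a1" = y."c2"          -- plain single-column assignment
FROM y                     -- no second reference to parent_table
WHERE t2."p1" = y."p1" AND t2."p2" = y."p2";
```

EXPLAIN without ANALYZE only plans the statement and does not execute the UPDATE, so the two plans can be compared without modifying any rows.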


Question



How can I perform the UPDATE without the nested loop hash joins for each of the 184 child tables?



System Info



Postgres version 10.3









postgresql

asked 47 secs ago by WheeWhee





















