Poor SQL Server Performance With Nested “OR” Conditions In LEFT OUTER JOIN












1















I have a tricky bit of SQL that's giving me a bit of trouble. I'd like to introduce several new columns to a key table whose components may or may not be used based on user configuration (Dimension's 1-10). For architectural reasons these are stored as "UNIQUEIDENTIFIER"s.



However upon the introduction of these new dimension columns in my DB, I'm now facing performance problems in several other stored procs that manipulate data in this key table, one of which I will include below. It appears as though the method that I have implemented to support these new columns is inefficient and I'm not exactly sure why that is the case and the best way to remedy it.



(Included SQL)



--In real usage this would be set by user configuration
DECLARE @Dimension1Enabled BIT = 1
DECLARE @Dimension2Enabled BIT = 1
DECLARE @Dimension3Enabled BIT = 1
DECLARE @Dimension4Enabled BIT = 1
DECLARE @Dimension5Enabled BIT = 1
DECLARE @Dimension6Enabled BIT = 1
DECLARE @Dimension7Enabled BIT = 1
DECLARE @Dimension8Enabled BIT = 1
DECLARE @Dimension9Enabled BIT = 1
DECLARE @Dimension10Enabled BIT = 1

/*
--Here's the table definition for reference
CREATE TABLE [dbo].[Table1](
[Column1] [int] NOT NULL,
[Column2] [uniqueidentifier] NOT NULL,
[Column3] [uniqueidentifier] NOT NULL,
[Column4] [char](3) NOT NULL,
[Dimension1] [uniqueidentifier] NULL,
[Dimension2] [uniqueidentifier] NULL,
[Dimension3] [uniqueidentifier] NULL,
[Dimension4] [uniqueidentifier] NULL,
[Dimension5] [uniqueidentifier] NULL,
[Dimension6] [uniqueidentifier] NULL,
[Dimension7] [uniqueidentifier] NULL,
[Dimension8] [uniqueidentifier] NULL,
[Dimension9] [uniqueidentifier] NULL,
[Dimension10] [uniqueidentifier] NULL,
[Period] [int] NOT NULL,
[Amt] [numeric](28, 12) NOT NULL,
[EndingBal] [numeric](28, 12) NOT NULL,
[PlanningSourceID] [int] NOT NULL,
[Column5] [uniqueidentifier] NULL
) ON [PRIMARY]

--Simply contains 0-12
CREATE TABLE [dbo].[Table2] (
[FiscalPeriod] [int] NOT NULL
,[FiscalPeriodDescription] [varchar](50) NOT NULL
,[FiscalPeriodDescriptionMonthName] [varchar](20) NULL
,[FiscalPeriodName] [varchar](255) NULL
)
*/

SELECT pfd.*
FROM (
SELECT PFDE1.Column1
,PFDE1.Column2
,PFDE1.Column3
,PFDE1.Column4
,PFDE1.Dimension1
,PFDE1.Dimension2
,PFDE1.[Dimension3]
,PFDE1.[Dimension4]
,PFDE1.[Dimension5]
,PFDE1.[Dimension6]
,PFDE1.[Dimension7]
,PFDE1.[Dimension8]
,PFDE1.[Dimension9]
,PFDE1.[Dimension10]
,p.FiscalPeriod AS Period
,0 AS Amt
,0 AS EndingBal
,PFDE1.Column5
FROM (
SELECT DISTINCT PFDE1.Column1
,PFDE1.Column2
,PFDE1.Column3
,PFDE1.Column4
,PFDE1.Dimension1
,PFDE1.Dimension2
,PFDE1.[Dimension3]
,PFDE1.[Dimension4]
,PFDE1.[Dimension5]
,PFDE1.[Dimension6]
,PFDE1.[Dimension7]
,PFDE1.[Dimension8]
,PFDE1.[Dimension9]
,PFDE1.[Dimension10]
,PFDE1.Column5
FROM [dbo].[Table1] PFDE1
) PFDE1
CROSS JOIN [dbo].[Table2] p
) pfd
LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = pfd.Column1
AND PFDE.Column2 = pfd.Column2
AND PFDE.Column3 = pfd.Column3
AND PFDE.Column4 = pfd.Column4
--This section causes great slowness
AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = pfd.Dimension1)
AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = pfd.Dimension2)
AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = pfd.Dimension3)
AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = pfd.Dimension4)
AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = pfd.Dimension5)
AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = pfd.Dimension6)
AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = pfd.Dimension7)
AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = pfd.Dimension8)
AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = pfd.Dimension9)
AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = pfd.Dimension10)
AND PFDE.Period = pfd.Period
AND PFDE.Column5 = pfd.Column5
WHERE pfde.Column1 IS NULL


With 50,000 records in Table1 the performance of this query is quite poor, several minutes at least. Remove the "@DimensionXEnabled = 0 OR" condition though and the query will run in several seconds.



I feel I have several options to optimize this, I could do one of the following:




  1. Change the "Dimension" columns to NOT NULL and populate with 0 Guid Values - I believe this would have a performance impact because now I'm coping a large amount of data with each insert. I've tested this option and it does work better but there is a definite performance impact and I'd like to avoid it if possible.


  2. Convert the query to dynamic SQL and generate the join conditions dynamically - Would need to do this in 100+ other places and have a DB full of dynamic SQL, not looking forward to this option either.


  3. Wrap the "Dimension1 = Dimension1" in ISNULL statements rather than based the join condition off of a variable - I believe this would result in a inefficient join statement as the query would no longer be able to use indexes.



I feel that the SQL optimizer is making bad decisions with this query, if I write the query without the OR condition it run's quickly, even though the results are exactly the same. The query plans look almost identical other than that in the long running join, most of the processing time is used in the LEFT OUTER JOIN statements merge:



(With "OR" Conditions)



enter image description here



(Without "OR" Conditions)



enter image description here



I'm unsure how to proceed, my question is, is there a better option?



Update 1



I've been doing more research into this, attempting various ways to get around the poor performance, so far I haven't found any good solutions just more information:




  1. Tried column "IS NULL" without success (PFDE.Dimension1 IS NULL OR PFDE.Dimension1 = pfd.Dimension1), ran even slower, that doesn't seem to do it.


  2. Tired join "Hints" (HASH, MERGE, LOOP), no performance impact here, in fact in some cases they just ran slower.


  3. Tried option 3, actually this seems to run very quick, I think because Table2 doesn't have any indexes, this approach could be a problem in the future though...


  4. I've tried dynamic SQL (@AaronBertrand recommendation), actually the results were somewhat underwhelming (30 sec+), not sure why it performed so poorly, but that might be something that I could improve with some further optimizations, I'm looking into this more..


  5. I tried the "OPTION (RECOMPILE);" mentioned in this post. Saw definite improvements (~15 secs), still not as fast as not dimensions but maybe something else I can iterate on.











share|improve this question





























    1















    I have a tricky bit of SQL that's giving me a bit of trouble. I'd like to introduce several new columns to a key table whose components may or may not be used based on user configuration (Dimension's 1-10). For architectural reasons these are stored as "UNIQUEIDENTIFIER"s.



    However upon the introduction of these new dimension columns in my DB, I'm now facing performance problems in several other stored procs that manipulate data in this key table, one of which I will include below. It appears as though the method that I have implemented to support these new columns is inefficient and I'm not exactly sure why that is the case and the best way to remedy it.



    (Included SQL)



    --In real usage this would be set by user configuration
    DECLARE @Dimension1Enabled BIT = 1
    DECLARE @Dimension2Enabled BIT = 1
    DECLARE @Dimension3Enabled BIT = 1
    DECLARE @Dimension4Enabled BIT = 1
    DECLARE @Dimension5Enabled BIT = 1
    DECLARE @Dimension6Enabled BIT = 1
    DECLARE @Dimension7Enabled BIT = 1
    DECLARE @Dimension8Enabled BIT = 1
    DECLARE @Dimension9Enabled BIT = 1
    DECLARE @Dimension10Enabled BIT = 1

    /*
    --Here's the table definition for reference
    CREATE TABLE [dbo].[Table1](
    [Column1] [int] NOT NULL,
    [Column2] [uniqueidentifier] NOT NULL,
    [Column3] [uniqueidentifier] NOT NULL,
    [Column4] [char](3) NOT NULL,
    [Dimension1] [uniqueidentifier] NULL,
    [Dimension2] [uniqueidentifier] NULL,
    [Dimension3] [uniqueidentifier] NULL,
    [Dimension4] [uniqueidentifier] NULL,
    [Dimension5] [uniqueidentifier] NULL,
    [Dimension6] [uniqueidentifier] NULL,
    [Dimension7] [uniqueidentifier] NULL,
    [Dimension8] [uniqueidentifier] NULL,
    [Dimension9] [uniqueidentifier] NULL,
    [Dimension10] [uniqueidentifier] NULL,
    [Period] [int] NOT NULL,
    [Amt] [numeric](28, 12) NOT NULL,
    [EndingBal] [numeric](28, 12) NOT NULL,
    [PlanningSourceID] [int] NOT NULL,
    [Column5] [uniqueidentifier] NULL
    ) ON [PRIMARY]

    --Simply contains 0-12
    CREATE TABLE [dbo].[Table2] (
    [FiscalPeriod] [int] NOT NULL
    ,[FiscalPeriodDescription] [varchar](50) NOT NULL
    ,[FiscalPeriodDescriptionMonthName] [varchar](20) NULL
    ,[FiscalPeriodName] [varchar](255) NULL
    )
    */

    SELECT pfd.*
    FROM (
    SELECT PFDE1.Column1
    ,PFDE1.Column2
    ,PFDE1.Column3
    ,PFDE1.Column4
    ,PFDE1.Dimension1
    ,PFDE1.Dimension2
    ,PFDE1.[Dimension3]
    ,PFDE1.[Dimension4]
    ,PFDE1.[Dimension5]
    ,PFDE1.[Dimension6]
    ,PFDE1.[Dimension7]
    ,PFDE1.[Dimension8]
    ,PFDE1.[Dimension9]
    ,PFDE1.[Dimension10]
    ,p.FiscalPeriod AS Period
    ,0 AS Amt
    ,0 AS EndingBal
    ,PFDE1.Column5
    FROM (
    SELECT DISTINCT PFDE1.Column1
    ,PFDE1.Column2
    ,PFDE1.Column3
    ,PFDE1.Column4
    ,PFDE1.Dimension1
    ,PFDE1.Dimension2
    ,PFDE1.[Dimension3]
    ,PFDE1.[Dimension4]
    ,PFDE1.[Dimension5]
    ,PFDE1.[Dimension6]
    ,PFDE1.[Dimension7]
    ,PFDE1.[Dimension8]
    ,PFDE1.[Dimension9]
    ,PFDE1.[Dimension10]
    ,PFDE1.Column5
    FROM [dbo].[Table1] PFDE1
    ) PFDE1
    CROSS JOIN [dbo].[Table2] p
    ) pfd
    LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = pfd.Column1
    AND PFDE.Column2 = pfd.Column2
    AND PFDE.Column3 = pfd.Column3
    AND PFDE.Column4 = pfd.Column4
    --This section causes great slowness
    AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = pfd.Dimension1)
    AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = pfd.Dimension2)
    AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = pfd.Dimension3)
    AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = pfd.Dimension4)
    AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = pfd.Dimension5)
    AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = pfd.Dimension6)
    AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = pfd.Dimension7)
    AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = pfd.Dimension8)
    AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = pfd.Dimension9)
    AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = pfd.Dimension10)
    AND PFDE.Period = pfd.Period
    AND PFDE.Column5 = pfd.Column5
    WHERE pfde.Column1 IS NULL


    With 50,000 records in Table1 the performance of this query is quite poor, several minutes at least. Remove the "@DimensionXEnabled = 0 OR" condition though and the query will run in several seconds.



    I feel I have several options to optimize this, I could do one of the following:




    1. Change the "Dimension" columns to NOT NULL and populate with 0 Guid Values - I believe this would have a performance impact because now I'm coping a large amount of data with each insert. I've tested this option and it does work better but there is a definite performance impact and I'd like to avoid it if possible.


    2. Convert the query to dynamic SQL and generate the join conditions dynamically - Would need to do this in 100+ other places and have a DB full of dynamic SQL, not looking forward to this option either.


    3. Wrap the "Dimension1 = Dimension1" in ISNULL statements rather than based the join condition off of a variable - I believe this would result in a inefficient join statement as the query would no longer be able to use indexes.



    I feel that the SQL optimizer is making bad decisions with this query, if I write the query without the OR condition it run's quickly, even though the results are exactly the same. The query plans look almost identical other than that in the long running join, most of the processing time is used in the LEFT OUTER JOIN statements merge:



    (With "OR" Conditions)



    enter image description here



    (Without "OR" Conditions)



    enter image description here



    I'm unsure how to proceed, my question is, is there a better option?



    Update 1



    I've been doing more research into this, attempting various ways to get around the poor performance, so far I haven't found any good solutions just more information:




    1. Tried column "IS NULL" without success (PFDE.Dimension1 IS NULL OR PFDE.Dimension1 = pfd.Dimension1), ran even slower, that doesn't seem to do it.


    2. Tired join "Hints" (HASH, MERGE, LOOP), no performance impact here, in fact in some cases they just ran slower.


    3. Tried option 3, actually this seems to run very quick, I think because Table2 doesn't have any indexes, this approach could be a problem in the future though...


    4. I've tried dynamic SQL (@AaronBertrand recommendation), actually the results were somewhat underwhelming (30 sec+), not sure why it performed so poorly, but that might be something that I could improve with some further optimizations, I'm looking into this more..


    5. I tried the "OPTION (RECOMPILE);" mentioned in this post. Saw definite improvements (~15 secs), still not as fast as not dimensions but maybe something else I can iterate on.











    share|improve this question



























      1












      1








      1








      I have a tricky bit of SQL that's giving me a bit of trouble. I'd like to introduce several new columns to a key table whose components may or may not be used based on user configuration (Dimension's 1-10). For architectural reasons these are stored as "UNIQUEIDENTIFIER"s.



      However upon the introduction of these new dimension columns in my DB, I'm now facing performance problems in several other stored procs that manipulate data in this key table, one of which I will include below. It appears as though the method that I have implemented to support these new columns is inefficient and I'm not exactly sure why that is the case and the best way to remedy it.



      (Included SQL)



      --In real usage this would be set by user configuration
      DECLARE @Dimension1Enabled BIT = 1
      DECLARE @Dimension2Enabled BIT = 1
      DECLARE @Dimension3Enabled BIT = 1
      DECLARE @Dimension4Enabled BIT = 1
      DECLARE @Dimension5Enabled BIT = 1
      DECLARE @Dimension6Enabled BIT = 1
      DECLARE @Dimension7Enabled BIT = 1
      DECLARE @Dimension8Enabled BIT = 1
      DECLARE @Dimension9Enabled BIT = 1
      DECLARE @Dimension10Enabled BIT = 1

      /*
      --Here's the table definition for reference
      CREATE TABLE [dbo].[Table1](
      [Column1] [int] NOT NULL,
      [Column2] [uniqueidentifier] NOT NULL,
      [Column3] [uniqueidentifier] NOT NULL,
      [Column4] [char](3) NOT NULL,
      [Dimension1] [uniqueidentifier] NULL,
      [Dimension2] [uniqueidentifier] NULL,
      [Dimension3] [uniqueidentifier] NULL,
      [Dimension4] [uniqueidentifier] NULL,
      [Dimension5] [uniqueidentifier] NULL,
      [Dimension6] [uniqueidentifier] NULL,
      [Dimension7] [uniqueidentifier] NULL,
      [Dimension8] [uniqueidentifier] NULL,
      [Dimension9] [uniqueidentifier] NULL,
      [Dimension10] [uniqueidentifier] NULL,
      [Period] [int] NOT NULL,
      [Amt] [numeric](28, 12) NOT NULL,
      [EndingBal] [numeric](28, 12) NOT NULL,
      [PlanningSourceID] [int] NOT NULL,
      [Column5] [uniqueidentifier] NULL
      ) ON [PRIMARY]

      --Simply contains 0-12
      CREATE TABLE [dbo].[Table2] (
      [FiscalPeriod] [int] NOT NULL
      ,[FiscalPeriodDescription] [varchar](50) NOT NULL
      ,[FiscalPeriodDescriptionMonthName] [varchar](20) NULL
      ,[FiscalPeriodName] [varchar](255) NULL
      )
      */

      SELECT pfd.*
      FROM (
      SELECT PFDE1.Column1
      ,PFDE1.Column2
      ,PFDE1.Column3
      ,PFDE1.Column4
      ,PFDE1.Dimension1
      ,PFDE1.Dimension2
      ,PFDE1.[Dimension3]
      ,PFDE1.[Dimension4]
      ,PFDE1.[Dimension5]
      ,PFDE1.[Dimension6]
      ,PFDE1.[Dimension7]
      ,PFDE1.[Dimension8]
      ,PFDE1.[Dimension9]
      ,PFDE1.[Dimension10]
      ,p.FiscalPeriod AS Period
      ,0 AS Amt
      ,0 AS EndingBal
      ,PFDE1.Column5
      FROM (
      SELECT DISTINCT PFDE1.Column1
      ,PFDE1.Column2
      ,PFDE1.Column3
      ,PFDE1.Column4
      ,PFDE1.Dimension1
      ,PFDE1.Dimension2
      ,PFDE1.[Dimension3]
      ,PFDE1.[Dimension4]
      ,PFDE1.[Dimension5]
      ,PFDE1.[Dimension6]
      ,PFDE1.[Dimension7]
      ,PFDE1.[Dimension8]
      ,PFDE1.[Dimension9]
      ,PFDE1.[Dimension10]
      ,PFDE1.Column5
      FROM [dbo].[Table1] PFDE1
      ) PFDE1
      CROSS JOIN [dbo].[Table2] p
      ) pfd
      LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = pfd.Column1
      AND PFDE.Column2 = pfd.Column2
      AND PFDE.Column3 = pfd.Column3
      AND PFDE.Column4 = pfd.Column4
      --This section causes great slowness
      AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = pfd.Dimension1)
      AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = pfd.Dimension2)
      AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = pfd.Dimension3)
      AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = pfd.Dimension4)
      AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = pfd.Dimension5)
      AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = pfd.Dimension6)
      AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = pfd.Dimension7)
      AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = pfd.Dimension8)
      AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = pfd.Dimension9)
      AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = pfd.Dimension10)
      AND PFDE.Period = pfd.Period
      AND PFDE.Column5 = pfd.Column5
      WHERE pfde.Column1 IS NULL


      With 50,000 records in Table1 the performance of this query is quite poor, several minutes at least. Remove the "@DimensionXEnabled = 0 OR" condition though and the query will run in several seconds.



      I feel I have several options to optimize this, I could do one of the following:




      1. Change the "Dimension" columns to NOT NULL and populate with 0 Guid Values - I believe this would have a performance impact because now I'm coping a large amount of data with each insert. I've tested this option and it does work better but there is a definite performance impact and I'd like to avoid it if possible.


      2. Convert the query to dynamic SQL and generate the join conditions dynamically - Would need to do this in 100+ other places and have a DB full of dynamic SQL, not looking forward to this option either.


      3. Wrap the "Dimension1 = Dimension1" in ISNULL statements rather than based the join condition off of a variable - I believe this would result in a inefficient join statement as the query would no longer be able to use indexes.



      I feel that the SQL optimizer is making bad decisions with this query, if I write the query without the OR condition it run's quickly, even though the results are exactly the same. The query plans look almost identical other than that in the long running join, most of the processing time is used in the LEFT OUTER JOIN statements merge:



      (With "OR" Conditions)



      enter image description here



      (Without "OR" Conditions)



      enter image description here



      I'm unsure how to proceed, my question is, is there a better option?



      Update 1



      I've been doing more research into this, attempting various ways to get around the poor performance, so far I haven't found any good solutions just more information:




      1. Tried column "IS NULL" without success (PFDE.Dimension1 IS NULL OR PFDE.Dimension1 = pfd.Dimension1), ran even slower, that doesn't seem to do it.


      2. Tired join "Hints" (HASH, MERGE, LOOP), no performance impact here, in fact in some cases they just ran slower.


      3. Tried option 3, actually this seems to run very quick, I think because Table2 doesn't have any indexes, this approach could be a problem in the future though...


      4. I've tried dynamic SQL (@AaronBertrand recommendation), actually the results were somewhat underwhelming (30 sec+), not sure why it performed so poorly, but that might be something that I could improve with some further optimizations, I'm looking into this more..


      5. I tried the "OPTION (RECOMPILE);" mentioned in this post. Saw definite improvements (~15 secs), still not as fast as not dimensions but maybe something else I can iterate on.











      share|improve this question
















      I have a tricky bit of SQL that's giving me a bit of trouble. I'd like to introduce several new columns to a key table whose components may or may not be used based on user configuration (Dimension's 1-10). For architectural reasons these are stored as "UNIQUEIDENTIFIER"s.



      However upon the introduction of these new dimension columns in my DB, I'm now facing performance problems in several other stored procs that manipulate data in this key table, one of which I will include below. It appears as though the method that I have implemented to support these new columns is inefficient and I'm not exactly sure why that is the case and the best way to remedy it.



      (Included SQL)



      --In real usage this would be set by user configuration
      DECLARE @Dimension1Enabled BIT = 1
      DECLARE @Dimension2Enabled BIT = 1
      DECLARE @Dimension3Enabled BIT = 1
      DECLARE @Dimension4Enabled BIT = 1
      DECLARE @Dimension5Enabled BIT = 1
      DECLARE @Dimension6Enabled BIT = 1
      DECLARE @Dimension7Enabled BIT = 1
      DECLARE @Dimension8Enabled BIT = 1
      DECLARE @Dimension9Enabled BIT = 1
      DECLARE @Dimension10Enabled BIT = 1

      /*
      --Here's the table definition for reference
      CREATE TABLE [dbo].[Table1](
      [Column1] [int] NOT NULL,
      [Column2] [uniqueidentifier] NOT NULL,
      [Column3] [uniqueidentifier] NOT NULL,
      [Column4] [char](3) NOT NULL,
      [Dimension1] [uniqueidentifier] NULL,
      [Dimension2] [uniqueidentifier] NULL,
      [Dimension3] [uniqueidentifier] NULL,
      [Dimension4] [uniqueidentifier] NULL,
      [Dimension5] [uniqueidentifier] NULL,
      [Dimension6] [uniqueidentifier] NULL,
      [Dimension7] [uniqueidentifier] NULL,
      [Dimension8] [uniqueidentifier] NULL,
      [Dimension9] [uniqueidentifier] NULL,
      [Dimension10] [uniqueidentifier] NULL,
      [Period] [int] NOT NULL,
      [Amt] [numeric](28, 12) NOT NULL,
      [EndingBal] [numeric](28, 12) NOT NULL,
      [PlanningSourceID] [int] NOT NULL,
      [Column5] [uniqueidentifier] NULL
      ) ON [PRIMARY]

      --Simply contains 0-12
      CREATE TABLE [dbo].[Table2] (
      [FiscalPeriod] [int] NOT NULL
      ,[FiscalPeriodDescription] [varchar](50) NOT NULL
      ,[FiscalPeriodDescriptionMonthName] [varchar](20) NULL
      ,[FiscalPeriodName] [varchar](255) NULL
      )
      */

      SELECT pfd.*
      FROM (
      SELECT PFDE1.Column1
      ,PFDE1.Column2
      ,PFDE1.Column3
      ,PFDE1.Column4
      ,PFDE1.Dimension1
      ,PFDE1.Dimension2
      ,PFDE1.[Dimension3]
      ,PFDE1.[Dimension4]
      ,PFDE1.[Dimension5]
      ,PFDE1.[Dimension6]
      ,PFDE1.[Dimension7]
      ,PFDE1.[Dimension8]
      ,PFDE1.[Dimension9]
      ,PFDE1.[Dimension10]
      ,p.FiscalPeriod AS Period
      ,0 AS Amt
      ,0 AS EndingBal
      ,PFDE1.Column5
      FROM (
      SELECT DISTINCT PFDE1.Column1
      ,PFDE1.Column2
      ,PFDE1.Column3
      ,PFDE1.Column4
      ,PFDE1.Dimension1
      ,PFDE1.Dimension2
      ,PFDE1.[Dimension3]
      ,PFDE1.[Dimension4]
      ,PFDE1.[Dimension5]
      ,PFDE1.[Dimension6]
      ,PFDE1.[Dimension7]
      ,PFDE1.[Dimension8]
      ,PFDE1.[Dimension9]
      ,PFDE1.[Dimension10]
      ,PFDE1.Column5
      FROM [dbo].[Table1] PFDE1
      ) PFDE1
      CROSS JOIN [dbo].[Table2] p
      ) pfd
      LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = pfd.Column1
      AND PFDE.Column2 = pfd.Column2
      AND PFDE.Column3 = pfd.Column3
      AND PFDE.Column4 = pfd.Column4
      --This section causes great slowness
      AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = pfd.Dimension1)
      AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = pfd.Dimension2)
      AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = pfd.Dimension3)
      AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = pfd.Dimension4)
      AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = pfd.Dimension5)
      AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = pfd.Dimension6)
      AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = pfd.Dimension7)
      AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = pfd.Dimension8)
      AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = pfd.Dimension9)
      AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = pfd.Dimension10)
      AND PFDE.Period = pfd.Period
      AND PFDE.Column5 = pfd.Column5
      WHERE pfde.Column1 IS NULL


      With 50,000 records in Table1 the performance of this query is quite poor, several minutes at least. Remove the "@DimensionXEnabled = 0 OR" condition though and the query will run in several seconds.



      I feel I have several options to optimize this, I could do one of the following:




      1. Change the "Dimension" columns to NOT NULL and populate with 0 Guid Values - I believe this would have a performance impact because now I'm coping a large amount of data with each insert. I've tested this option and it does work better but there is a definite performance impact and I'd like to avoid it if possible.


      2. Convert the query to dynamic SQL and generate the join conditions dynamically - Would need to do this in 100+ other places and have a DB full of dynamic SQL, not looking forward to this option either.


      3. Wrap the "Dimension1 = Dimension1" in ISNULL statements rather than based the join condition off of a variable - I believe this would result in a inefficient join statement as the query would no longer be able to use indexes.



      I feel that the SQL optimizer is making bad decisions with this query, if I write the query without the OR condition it run's quickly, even though the results are exactly the same. The query plans look almost identical other than that in the long running join, most of the processing time is used in the LEFT OUTER JOIN statements merge:



      (With "OR" Conditions)



      enter image description here



      (Without "OR" Conditions)



      enter image description here



      I'm unsure how to proceed, my question is, is there a better option?



      Update 1



      I've been doing more research into this, attempting various ways to get around the poor performance, so far I haven't found any good solutions just more information:




      1. Tried column "IS NULL" without success (PFDE.Dimension1 IS NULL OR PFDE.Dimension1 = pfd.Dimension1), ran even slower, that doesn't seem to do it.


      2. Tired join "Hints" (HASH, MERGE, LOOP), no performance impact here, in fact in some cases they just ran slower.


      3. Tried option 3, actually this seems to run very quick, I think because Table2 doesn't have any indexes, this approach could be a problem in the future though...


      4. I've tried dynamic SQL (@AaronBertrand recommendation), actually the results were somewhat underwhelming (30 sec+), not sure why it performed so poorly, but that might be something that I could improve with some further optimizations, I'm looking into this more..


      5. I tried the "OPTION (RECOMPILE);" mentioned in this post. Saw definite improvements (~15 secs), still not as fast as not dimensions but maybe something else I can iterate on.








      sql-server database-design performance query-performance optimization






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Jan 17 at 22:48







      David Rogers

















      asked Jan 17 at 15:54









      David RogersDavid Rogers

      1095




      1095






















          2 Answers
          2






          active

          oldest

          votes


















          2














          Sounds like a kitchen sink query. You can make this work well for most cases with dynamic SQL, since you can generate a different execution plan for each combination of parameters.



          DECLARE @sql nvarchar(max) = N'
          SELECT DISTINCT PFDE1.Column1
          ,PFDE1.Column2
          ,PFDE1.Column3
          ,PFDE1.Column4
          ,PFDE1.Dimension1
          ,PFDE1.Dimension2
          ...
          ,PFDE1.[Dimension10]
          ,PFDE1.Column5
          FROM [dbo].[Table1] PFDE1
          ) PFDE1
          CROSS JOIN [dbo].[Table2] p
          ) pfd
          LEFT JOIN [dbo].[Table1] pfde
          ON PFDE.Column1 = pfd.Column1
          AND PFDE.Column2 = pfd.Column2
          AND PFDE.Column3 = pfd.Column3
          AND PFDE.Column4 = pfd.Column4'

          + CASE WHEN @Dimension1Enabled = 1 THEN
          N' AND PFDE.Dimension1 = pfd.Dimension1' ELSE N'' END
          + CASE WHEN @Dimension2Enabled = 1 THEN
          N' AND PFDE.Dimension2 = pfd.Dimension2' ELSE N'' END

          + ...

          + CASE WHEN @Dimension10Enabled = 1 THEN
          N' AND PFDE.Dimension10 = pfd.Dimension10' ELSE '' END

          + N' AND PFDE.Period = pfd.Period
          AND PFDE.Column5 = pfd.Column5
          WHERE pfde.Column1 IS NULL;';

          EXEC sys.sp_executesql @sql;


          I'd also drop the DISTINCT - if it's really necessary because of duplicates then there is an underlying problem to be fixed as well.






          share|improve this answer
























          • Yep, this would be a good implementation of option 2, but I'm kinda hoping to avoiding this as if I followed this approach here I'd have to resort to this everywhere else I implement this, there are 100+ places and I'd prefer not to maintain a DB full of dynamic SQL...

            – David Rogers
            Jan 17 at 16:21











          • Well 1 and 3 don't sound like viable options either, since 1 is a performance hit and 3 is not going to change anything (they boil down to the same logic), so... sometimes you just have to pick your poison. A pattern is just a pattern and if dealing with users screaming is more important than a little extra code, you need to make a choice.

            – Aaron Bertrand
            Jan 17 at 16:35













          • Well each has it's advantages and disadvantages, the issue is I'm not just making a decision for this one query, I'm making a decision for 10+ tables, 100+ procs, and many, many other associated items, it will take over a month to implement and test it properly, and once it's done I'll have to live with the consequences in over 100 production environment for years. So this question really is only the tip of the iceberg, a small manifestation of a broader architectural decision that I need to make and then live with. Though I accept that I may need to go with more than one option...

            – David Rogers
            Jan 17 at 23:37











          • Well we as a peer group can tell you your options, and tell you which ones we think are best, but we can't crowd-source your decision for you. As for the testing mentioned in your update, did you look at any execution plans? Any query that takes 15 or 20 seconds is already 100 times worse than it probably should be. Why is it taking so long? What are the wait types? Is it being blocked? Are you returning 40 billion rows to an SSMS grid over a slow network connection? If you read the kitchen sink article I posted, dynamic SQL can help for presence/absence of parameters, and option (recompile)

            – Aaron Bertrand
            Jan 18 at 12:16













          • ...can help for data skew. You can also combine them. But given a 15 second query, using dynamic SQL or not is unlikely to fix that, there is something else bigger going on here.

            – Aaron Bertrand
            Jan 18 at 12:20





















          0














          IMO,Biggest mistake is the place where you are using CROSS JOIN.



          First self join [dbo].[Table1] and filter the resultset then with this resultset use CROSS JOIN.It will give tremendous boost .



          SELECT pfd.*
          FROM (
          SELECT PFDE1.Column1
          ,PFDE1.Column2
          ,PFDE1.Column3
          ,PFDE1.Column4
          ,PFDE1.Dimension1
          ,PFDE1.Dimension2
          ,PFDE1.[Dimension3]
          ,PFDE1.[Dimension4]
          ,PFDE1.[Dimension5]
          ,PFDE1.[Dimension6]
          ,PFDE1.[Dimension7]
          ,PFDE1.[Dimension8]
          ,PFDE1.[Dimension9]
          ,PFDE1.[Dimension10]
          ,p.FiscalPeriod AS Period
          ,0 AS Amt
          ,0 AS EndingBal
          ,PFDE1.Column5
          FROM (
          SELECT DISTINCT PFDE1.Column1
          ,PFDE1.Column2
          ,PFDE1.Column3
          ,PFDE1.Column4
          ,PFDE1.Dimension1
          ,PFDE1.Dimension2
          ,PFDE1.[Dimension3]
          ,PFDE1.[Dimension4]
          ,PFDE1.[Dimension5]
          ,PFDE1.[Dimension6]
          ,PFDE1.[Dimension7]
          ,PFDE1.[Dimension8]
          ,PFDE1.[Dimension9]
          ,PFDE1.[Dimension10]
          ,PFDE1.Column5
          FROM [dbo].[Table1] PFDE1
          LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = PFDE1.Column1
          AND PFDE.Column2 = PFDE1.Column2
          AND PFDE.Column3 = PFDE1.Column3
          AND PFDE.Column4 = PFDE1.Column4
          AND PFDE.Period = PFDE1.Period
          AND PFDE.Column5 = PFDE1.Column5
          AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = PFDE1.Dimension1)
          AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = PFDE1.Dimension2)
          AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = PFDE1.Dimension3)
          AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = PFDE1.Dimension4)
          AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = PFDE1.Dimension5)
          AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = PFDE1.Dimension6)
          AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = PFDE1.Dimension7)
          AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = PFDE1.Dimension8)
          AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = PFDE1.Dimension9)
          AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = PFDE1.Dimension10)

          WHERE pfde.Column1 IS NULL
          ) PFDE1
          CROSS JOIN [dbo].[Table2] p
          ) pfd


          Hope my point is clear,it is untested.



          No doubts Dynamic Sql will further improve it, EXEC sys.sp_executesql @sql




          I'd also drop the DISTINCT - if it's really necessary because of
          duplicates then there is an underlying problem to be fixed as well.






          share























            Your Answer








            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "182"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: false,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: null,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














            draft saved

            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f227393%2fpoor-sql-server-performance-with-nested-or-conditions-in-left-outer-join%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown

























            2 Answers
            2






            active

            oldest

            votes








            2 Answers
            2






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            2














            Sounds like a kitchen sink query. You can make this work well for most cases with dynamic SQL, since you can generate a different execution plan for each combination of parameters.



            DECLARE @sql nvarchar(max) = N'
            SELECT DISTINCT PFDE1.Column1
            ,PFDE1.Column2
            ,PFDE1.Column3
            ,PFDE1.Column4
            ,PFDE1.Dimension1
            ,PFDE1.Dimension2
            ...
            ,PFDE1.[Dimension10]
            ,PFDE1.Column5
            FROM [dbo].[Table1] PFDE1
            ) PFDE1
            CROSS JOIN [dbo].[Table2] p
            ) pfd
            LEFT JOIN [dbo].[Table1] pfde
            ON PFDE.Column1 = pfd.Column1
            AND PFDE.Column2 = pfd.Column2
            AND PFDE.Column3 = pfd.Column3
            AND PFDE.Column4 = pfd.Column4'

            + CASE WHEN @Dimension1Enabled = 1 THEN
            N' AND PFDE.Dimension1 = pfd.Dimension1' ELSE N'' END
            + CASE WHEN @Dimension2Enabled = 1 THEN
            N' AND PFDE.Dimension2 = pfd.Dimension2' ELSE N'' END

            + ...

            + CASE WHEN @Dimension10Enabled = 1 THEN
            N' AND PFDE.Dimension10 = pfd.Dimension10' ELSE '' END

            + N' AND PFDE.Period = pfd.Period
            AND PFDE.Column5 = pfd.Column5
            WHERE pfde.Column1 IS NULL;';

            EXEC sys.sp_executesql @sql;


            I'd also drop the DISTINCT - if it's really necessary because of duplicates then there is an underlying problem to be fixed as well.






            share|improve this answer
























            • Yep, this would be a good implementation of option 2, but I'm kinda hoping to avoiding this as if I followed this approach here I'd have to resort to this everywhere else I implement this, there are 100+ places and I'd prefer not to maintain a DB full of dynamic SQL...

              – David Rogers
              Jan 17 at 16:21











            • Well 1 and 3 don't sound like viable options either, since 1 is a performance hit and 3 is not going to change anything (they boil down to the same logic), so... sometimes you just have to pick your poison. A pattern is just a pattern and if dealing with users screaming is more important than a little extra code, you need to make a choice.

              – Aaron Bertrand
              Jan 17 at 16:35













            • Well each has it's advantages and disadvantages, the issue is I'm not just making a decision for this one query, I'm making a decision for 10+ tables, 100+ procs, and many, many other associated items, it will take over a month to implement and test it properly, and once it's done I'll have to live with the consequences in over 100 production environment for years. So this question really is only the tip of the iceberg, a small manifestation of a broader architectural decision that I need to make and then live with. Though I accept that I may need to go with more than one option...

              – David Rogers
              Jan 17 at 23:37











            • Well we as a peer group can tell you your options, and tell you which ones we think are best, but we can't crowd-source your decision for you. As for the testing mentioned in your update, did you look at any execution plans? Any query that takes 15 or 20 seconds is already 100 times worse than it probably should be. Why is it taking so long? What are the wait types? Is it being blocked? Are you returning 40 billion rows to an SSMS grid over a slow network connection? If you read the kitchen sink article I posted, dynamic SQL can help for presence/absence of parameters, and option (recompile)

              – Aaron Bertrand
              Jan 18 at 12:16













            • ...can help for data skew. You can also combine them. But given a 15 second query, using dynamic SQL or not is unlikely to fix that, there is something else bigger going on here.

              – Aaron Bertrand
              Jan 18 at 12:20


















            2














            Sounds like a kitchen sink query. You can make this work well for most cases with dynamic SQL, since you can generate a different execution plan for each combination of parameters.



            DECLARE @sql nvarchar(max) = N'
            SELECT DISTINCT PFDE1.Column1
            ,PFDE1.Column2
            ,PFDE1.Column3
            ,PFDE1.Column4
            ,PFDE1.Dimension1
            ,PFDE1.Dimension2
            ...
            ,PFDE1.[Dimension10]
            ,PFDE1.Column5
            FROM [dbo].[Table1] PFDE1
            ) PFDE1
            CROSS JOIN [dbo].[Table2] p
            ) pfd
            LEFT JOIN [dbo].[Table1] pfde
            ON PFDE.Column1 = pfd.Column1
            AND PFDE.Column2 = pfd.Column2
            AND PFDE.Column3 = pfd.Column3
            AND PFDE.Column4 = pfd.Column4'

            + CASE WHEN @Dimension1Enabled = 1 THEN
            N' AND PFDE.Dimension1 = pfd.Dimension1' ELSE N'' END
            + CASE WHEN @Dimension2Enabled = 1 THEN
            N' AND PFDE.Dimension2 = pfd.Dimension2' ELSE N'' END

            + ...

            + CASE WHEN @Dimension10Enabled = 1 THEN
            N' AND PFDE.Dimension10 = pfd.Dimension10' ELSE '' END

            + N' AND PFDE.Period = pfd.Period
            AND PFDE.Column5 = pfd.Column5
            WHERE pfde.Column1 IS NULL;';

            EXEC sys.sp_executesql @sql;


            I'd also drop the DISTINCT - if it's really necessary because of duplicates then there is an underlying problem to be fixed as well.






            share|improve this answer
























            • Yep, this would be a good implementation of option 2, but I'm kinda hoping to avoiding this as if I followed this approach here I'd have to resort to this everywhere else I implement this, there are 100+ places and I'd prefer not to maintain a DB full of dynamic SQL...

              – David Rogers
              Jan 17 at 16:21











            • Well 1 and 3 don't sound like viable options either, since 1 is a performance hit and 3 is not going to change anything (they boil down to the same logic), so... sometimes you just have to pick your poison. A pattern is just a pattern and if dealing with users screaming is more important than a little extra code, you need to make a choice.

              – Aaron Bertrand
              Jan 17 at 16:35













            • Well each has it's advantages and disadvantages, the issue is I'm not just making a decision for this one query, I'm making a decision for 10+ tables, 100+ procs, and many, many other associated items, it will take over a month to implement and test it properly, and once it's done I'll have to live with the consequences in over 100 production environment for years. So this question really is only the tip of the iceberg, a small manifestation of a broader architectural decision that I need to make and then live with. Though I accept that I may need to go with more than one option...

              – David Rogers
              Jan 17 at 23:37











            • Well we as a peer group can tell you your options, and tell you which ones we think are best, but we can't crowd-source your decision for you. As for the testing mentioned in your update, did you look at any execution plans? Any query that takes 15 or 20 seconds is already 100 times worse than it probably should be. Why is it taking so long? What are the wait types? Is it being blocked? Are you returning 40 billion rows to an SSMS grid over a slow network connection? If you read the kitchen sink article I posted, dynamic SQL can help for presence/absence of parameters, and option (recompile)

              – Aaron Bertrand
              Jan 18 at 12:16













            • ...can help for data skew. You can also combine them. But given a 15 second query, using dynamic SQL or not is unlikely to fix that, there is something else bigger going on here.

              – Aaron Bertrand
              Jan 18 at 12:20
















            2












            2








            2







            Sounds like a kitchen sink query. You can make this work well for most cases with dynamic SQL, since you can generate a different execution plan for each combination of parameters.



            DECLARE @sql nvarchar(max) = N'
            SELECT DISTINCT PFDE1.Column1
            ,PFDE1.Column2
            ,PFDE1.Column3
            ,PFDE1.Column4
            ,PFDE1.Dimension1
            ,PFDE1.Dimension2
            ...
            ,PFDE1.[Dimension10]
            ,PFDE1.Column5
            FROM [dbo].[Table1] PFDE1
            ) PFDE1
            CROSS JOIN [dbo].[Table2] p
            ) pfd
            LEFT JOIN [dbo].[Table1] pfde
            ON PFDE.Column1 = pfd.Column1
            AND PFDE.Column2 = pfd.Column2
            AND PFDE.Column3 = pfd.Column3
            AND PFDE.Column4 = pfd.Column4'

            + CASE WHEN @Dimension1Enabled = 1 THEN
            N' AND PFDE.Dimension1 = pfd.Dimension1' ELSE N'' END
            + CASE WHEN @Dimension2Enabled = 1 THEN
            N' AND PFDE.Dimension2 = pfd.Dimension2' ELSE N'' END

            + ...

            + CASE WHEN @Dimension10Enabled = 1 THEN
            N' AND PFDE.Dimension10 = pfd.Dimension10' ELSE '' END

            + N' AND PFDE.Period = pfd.Period
            AND PFDE.Column5 = pfd.Column5
            WHERE pfde.Column1 IS NULL;';

            EXEC sys.sp_executesql @sql;


            I'd also drop the DISTINCT - if it's really necessary because of duplicates then there is an underlying problem to be fixed as well.






            share|improve this answer













            Sounds like a kitchen sink query. You can make this work well for most cases with dynamic SQL, since you can generate a different execution plan for each combination of parameters.



            DECLARE @sql nvarchar(max) = N'
            SELECT DISTINCT PFDE1.Column1
            ,PFDE1.Column2
            ,PFDE1.Column3
            ,PFDE1.Column4
            ,PFDE1.Dimension1
            ,PFDE1.Dimension2
            ...
            ,PFDE1.[Dimension10]
            ,PFDE1.Column5
            FROM [dbo].[Table1] PFDE1
            ) PFDE1
            CROSS JOIN [dbo].[Table2] p
            ) pfd
            LEFT JOIN [dbo].[Table1] pfde
            ON PFDE.Column1 = pfd.Column1
            AND PFDE.Column2 = pfd.Column2
            AND PFDE.Column3 = pfd.Column3
            AND PFDE.Column4 = pfd.Column4'

            + CASE WHEN @Dimension1Enabled = 1 THEN
            N' AND PFDE.Dimension1 = pfd.Dimension1' ELSE N'' END
            + CASE WHEN @Dimension2Enabled = 1 THEN
            N' AND PFDE.Dimension2 = pfd.Dimension2' ELSE N'' END

            + ...

            + CASE WHEN @Dimension10Enabled = 1 THEN
            N' AND PFDE.Dimension10 = pfd.Dimension10' ELSE '' END

            + N' AND PFDE.Period = pfd.Period
            AND PFDE.Column5 = pfd.Column5
            WHERE pfde.Column1 IS NULL;';

            EXEC sys.sp_executesql @sql;


            I'd also drop the DISTINCT - if it's really necessary because of duplicates then there is an underlying problem to be fixed as well.







            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered Jan 17 at 16:16









            Aaron BertrandAaron Bertrand

            150k18284482




            150k18284482













            • Yep, this would be a good implementation of option 2, but I'm kinda hoping to avoiding this as if I followed this approach here I'd have to resort to this everywhere else I implement this, there are 100+ places and I'd prefer not to maintain a DB full of dynamic SQL...

              – David Rogers
              Jan 17 at 16:21











            • Well 1 and 3 don't sound like viable options either, since 1 is a performance hit and 3 is not going to change anything (they boil down to the same logic), so... sometimes you just have to pick your poison. A pattern is just a pattern and if dealing with users screaming is more important than a little extra code, you need to make a choice.

              – Aaron Bertrand
              Jan 17 at 16:35













            • Well each has it's advantages and disadvantages, the issue is I'm not just making a decision for this one query, I'm making a decision for 10+ tables, 100+ procs, and many, many other associated items, it will take over a month to implement and test it properly, and once it's done I'll have to live with the consequences in over 100 production environment for years. So this question really is only the tip of the iceberg, a small manifestation of a broader architectural decision that I need to make and then live with. Though I accept that I may need to go with more than one option...

              – David Rogers
              Jan 17 at 23:37











            • Well we as a peer group can tell you your options, and tell you which ones we think are best, but we can't crowd-source your decision for you. As for the testing mentioned in your update, did you look at any execution plans? Any query that takes 15 or 20 seconds is already 100 times worse than it probably should be. Why is it taking so long? What are the wait types? Is it being blocked? Are you returning 40 billion rows to an SSMS grid over a slow network connection? If you read the kitchen sink article I posted, dynamic SQL can help for presence/absence of parameters, and option (recompile)

              – Aaron Bertrand
              Jan 18 at 12:16













            • ...can help for data skew. You can also combine them. But given a 15 second query, using dynamic SQL or not is unlikely to fix that, there is something else bigger going on here.

              – Aaron Bertrand
              Jan 18 at 12:20





















            • Yep, this would be a good implementation of option 2, but I'm kinda hoping to avoiding this as if I followed this approach here I'd have to resort to this everywhere else I implement this, there are 100+ places and I'd prefer not to maintain a DB full of dynamic SQL...

              – David Rogers
              Jan 17 at 16:21











            • Well 1 and 3 don't sound like viable options either, since 1 is a performance hit and 3 is not going to change anything (they boil down to the same logic), so... sometimes you just have to pick your poison. A pattern is just a pattern and if dealing with users screaming is more important than a little extra code, you need to make a choice.

              – Aaron Bertrand
              Jan 17 at 16:35













            • Well each has it's advantages and disadvantages, the issue is I'm not just making a decision for this one query, I'm making a decision for 10+ tables, 100+ procs, and many, many other associated items, it will take over a month to implement and test it properly, and once it's done I'll have to live with the consequences in over 100 production environment for years. So this question really is only the tip of the iceberg, a small manifestation of a broader architectural decision that I need to make and then live with. Though I accept that I may need to go with more than one option...

              – David Rogers
              Jan 17 at 23:37











            • Well we as a peer group can tell you your options, and tell you which ones we think are best, but we can't crowd-source your decision for you. As for the testing mentioned in your update, did you look at any execution plans? Any query that takes 15 or 20 seconds is already 100 times worse than it probably should be. Why is it taking so long? What are the wait types? Is it being blocked? Are you returning 40 billion rows to an SSMS grid over a slow network connection? If you read the kitchen sink article I posted, dynamic SQL can help for presence/absence of parameters, and option (recompile)

              – Aaron Bertrand
              Jan 18 at 12:16













            • ...can help for data skew. You can also combine them. But given a 15 second query, using dynamic SQL or not is unlikely to fix that, there is something else bigger going on here.

              – Aaron Bertrand
              Jan 18 at 12:20



















            Yep, this would be a good implementation of option 2, but I'm kinda hoping to avoiding this as if I followed this approach here I'd have to resort to this everywhere else I implement this, there are 100+ places and I'd prefer not to maintain a DB full of dynamic SQL...

            – David Rogers
            Jan 17 at 16:21





            Yep, this would be a good implementation of option 2, but I'm kinda hoping to avoiding this as if I followed this approach here I'd have to resort to this everywhere else I implement this, there are 100+ places and I'd prefer not to maintain a DB full of dynamic SQL...

            – David Rogers
            Jan 17 at 16:21













            Well 1 and 3 don't sound like viable options either, since 1 is a performance hit and 3 is not going to change anything (they boil down to the same logic), so... sometimes you just have to pick your poison. A pattern is just a pattern and if dealing with users screaming is more important than a little extra code, you need to make a choice.

            – Aaron Bertrand
            Jan 17 at 16:35







            Well 1 and 3 don't sound like viable options either, since 1 is a performance hit and 3 is not going to change anything (they boil down to the same logic), so... sometimes you just have to pick your poison. A pattern is just a pattern and if dealing with users screaming is more important than a little extra code, you need to make a choice.

            – Aaron Bertrand
            Jan 17 at 16:35















            Well each has it's advantages and disadvantages, the issue is I'm not just making a decision for this one query, I'm making a decision for 10+ tables, 100+ procs, and many, many other associated items, it will take over a month to implement and test it properly, and once it's done I'll have to live with the consequences in over 100 production environment for years. So this question really is only the tip of the iceberg, a small manifestation of a broader architectural decision that I need to make and then live with. Though I accept that I may need to go with more than one option...

            – David Rogers
            Jan 17 at 23:37





            Well each has it's advantages and disadvantages, the issue is I'm not just making a decision for this one query, I'm making a decision for 10+ tables, 100+ procs, and many, many other associated items, it will take over a month to implement and test it properly, and once it's done I'll have to live with the consequences in over 100 production environment for years. So this question really is only the tip of the iceberg, a small manifestation of a broader architectural decision that I need to make and then live with. Though I accept that I may need to go with more than one option...

            – David Rogers
            Jan 17 at 23:37













            Well we as a peer group can tell you your options, and tell you which ones we think are best, but we can't crowd-source your decision for you. As for the testing mentioned in your update, did you look at any execution plans? Any query that takes 15 or 20 seconds is already 100 times worse than it probably should be. Why is it taking so long? What are the wait types? Is it being blocked? Are you returning 40 billion rows to an SSMS grid over a slow network connection? If you read the kitchen sink article I posted, dynamic SQL can help for presence/absence of parameters, and option (recompile)

            – Aaron Bertrand
            Jan 18 at 12:16







            Well we as a peer group can tell you your options, and tell you which ones we think are best, but we can't crowd-source your decision for you. As for the testing mentioned in your update, did you look at any execution plans? Any query that takes 15 or 20 seconds is already 100 times worse than it probably should be. Why is it taking so long? What are the wait types? Is it being blocked? Are you returning 40 billion rows to an SSMS grid over a slow network connection? If you read the kitchen sink article I posted, dynamic SQL can help for presence/absence of parameters, and option (recompile)

            – Aaron Bertrand
            Jan 18 at 12:16















            ...can help for data skew. You can also combine them. But given a 15 second query, using dynamic SQL or not is unlikely to fix that, there is something else bigger going on here.

            – Aaron Bertrand
            Jan 18 at 12:20







            ...can help for data skew. You can also combine them. But given a 15 second query, using dynamic SQL or not is unlikely to fix that, there is something else bigger going on here.

            – Aaron Bertrand
            Jan 18 at 12:20















            0














            IMO,Biggest mistake is the place where you are using CROSS JOIN.



            First self join [dbo].[Table1] and filter the resultset then with this resultset use CROSS JOIN.It will give tremendous boost .



            SELECT pfd.*
            FROM (
            SELECT PFDE1.Column1
            ,PFDE1.Column2
            ,PFDE1.Column3
            ,PFDE1.Column4
            ,PFDE1.Dimension1
            ,PFDE1.Dimension2
            ,PFDE1.[Dimension3]
            ,PFDE1.[Dimension4]
            ,PFDE1.[Dimension5]
            ,PFDE1.[Dimension6]
            ,PFDE1.[Dimension7]
            ,PFDE1.[Dimension8]
            ,PFDE1.[Dimension9]
            ,PFDE1.[Dimension10]
            ,p.FiscalPeriod AS Period
            ,0 AS Amt
            ,0 AS EndingBal
            ,PFDE1.Column5
            FROM (
            SELECT DISTINCT PFDE1.Column1
            ,PFDE1.Column2
            ,PFDE1.Column3
            ,PFDE1.Column4
            ,PFDE1.Dimension1
            ,PFDE1.Dimension2
            ,PFDE1.[Dimension3]
            ,PFDE1.[Dimension4]
            ,PFDE1.[Dimension5]
            ,PFDE1.[Dimension6]
            ,PFDE1.[Dimension7]
            ,PFDE1.[Dimension8]
            ,PFDE1.[Dimension9]
            ,PFDE1.[Dimension10]
            ,PFDE1.Column5
            FROM [dbo].[Table1] PFDE1
            LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = PFDE1.Column1
            AND PFDE.Column2 = PFDE1.Column2
            AND PFDE.Column3 = PFDE1.Column3
            AND PFDE.Column4 = PFDE1.Column4
            AND PFDE.Period = PFDE1.Period
            AND PFDE.Column5 = PFDE1.Column5
            AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = PFDE1.Dimension1)
            AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = PFDE1.Dimension2)
            AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = PFDE1.Dimension3)
            AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = PFDE1.Dimension4)
            AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = PFDE1.Dimension5)
            AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = PFDE1.Dimension6)
            AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = PFDE1.Dimension7)
            AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = PFDE1.Dimension8)
            AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = PFDE1.Dimension9)
            AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = PFDE1.Dimension10)

            WHERE pfde.Column1 IS NULL
            ) PFDE1
            CROSS JOIN [dbo].[Table2] p
            ) pfd


            Hope my point is clear,it is untested.



            No doubts Dynamic Sql will further improve it, EXEC sys.sp_executesql @sql




            I'd also drop the DISTINCT - if it's really necessary because of
            duplicates then there is an underlying problem to be fixed as well.






            share




























              0














              IMO,Biggest mistake is the place where you are using CROSS JOIN.



              First self join [dbo].[Table1] and filter the resultset then with this resultset use CROSS JOIN.It will give tremendous boost .



              SELECT pfd.*
              FROM (
              SELECT PFDE1.Column1
              ,PFDE1.Column2
              ,PFDE1.Column3
              ,PFDE1.Column4
              ,PFDE1.Dimension1
              ,PFDE1.Dimension2
              ,PFDE1.[Dimension3]
              ,PFDE1.[Dimension4]
              ,PFDE1.[Dimension5]
              ,PFDE1.[Dimension6]
              ,PFDE1.[Dimension7]
              ,PFDE1.[Dimension8]
              ,PFDE1.[Dimension9]
              ,PFDE1.[Dimension10]
              ,p.FiscalPeriod AS Period
              ,0 AS Amt
              ,0 AS EndingBal
              ,PFDE1.Column5
              FROM (
              SELECT DISTINCT PFDE1.Column1
              ,PFDE1.Column2
              ,PFDE1.Column3
              ,PFDE1.Column4
              ,PFDE1.Dimension1
              ,PFDE1.Dimension2
              ,PFDE1.[Dimension3]
              ,PFDE1.[Dimension4]
              ,PFDE1.[Dimension5]
              ,PFDE1.[Dimension6]
              ,PFDE1.[Dimension7]
              ,PFDE1.[Dimension8]
              ,PFDE1.[Dimension9]
              ,PFDE1.[Dimension10]
              ,PFDE1.Column5
              FROM [dbo].[Table1] PFDE1
              LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = PFDE1.Column1
              AND PFDE.Column2 = PFDE1.Column2
              AND PFDE.Column3 = PFDE1.Column3
              AND PFDE.Column4 = PFDE1.Column4
              AND PFDE.Period = PFDE1.Period
              AND PFDE.Column5 = PFDE1.Column5
              AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = PFDE1.Dimension1)
              AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = PFDE1.Dimension2)
              AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = PFDE1.Dimension3)
              AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = PFDE1.Dimension4)
              AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = PFDE1.Dimension5)
              AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = PFDE1.Dimension6)
              AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = PFDE1.Dimension7)
              AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = PFDE1.Dimension8)
              AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = PFDE1.Dimension9)
              AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = PFDE1.Dimension10)

              WHERE pfde.Column1 IS NULL
              ) PFDE1
              CROSS JOIN [dbo].[Table2] p
              ) pfd


              Hope my point is clear,it is untested.



              No doubts Dynamic Sql will further improve it, EXEC sys.sp_executesql @sql




              I'd also drop the DISTINCT - if it's really necessary because of
              duplicates then there is an underlying problem to be fixed as well.






              share


























                0












                0








                0







                IMO,Biggest mistake is the place where you are using CROSS JOIN.



                First self join [dbo].[Table1] and filter the resultset then with this resultset use CROSS JOIN.It will give tremendous boost .



                SELECT pfd.*
                FROM (
                SELECT PFDE1.Column1
                ,PFDE1.Column2
                ,PFDE1.Column3
                ,PFDE1.Column4
                ,PFDE1.Dimension1
                ,PFDE1.Dimension2
                ,PFDE1.[Dimension3]
                ,PFDE1.[Dimension4]
                ,PFDE1.[Dimension5]
                ,PFDE1.[Dimension6]
                ,PFDE1.[Dimension7]
                ,PFDE1.[Dimension8]
                ,PFDE1.[Dimension9]
                ,PFDE1.[Dimension10]
                ,p.FiscalPeriod AS Period
                ,0 AS Amt
                ,0 AS EndingBal
                ,PFDE1.Column5
                FROM (
                SELECT DISTINCT PFDE1.Column1
                ,PFDE1.Column2
                ,PFDE1.Column3
                ,PFDE1.Column4
                ,PFDE1.Dimension1
                ,PFDE1.Dimension2
                ,PFDE1.[Dimension3]
                ,PFDE1.[Dimension4]
                ,PFDE1.[Dimension5]
                ,PFDE1.[Dimension6]
                ,PFDE1.[Dimension7]
                ,PFDE1.[Dimension8]
                ,PFDE1.[Dimension9]
                ,PFDE1.[Dimension10]
                ,PFDE1.Column5
                FROM [dbo].[Table1] PFDE1
                LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = PFDE1.Column1
                AND PFDE.Column2 = PFDE1.Column2
                AND PFDE.Column3 = PFDE1.Column3
                AND PFDE.Column4 = PFDE1.Column4
                AND PFDE.Period = PFDE1.Period
                AND PFDE.Column5 = PFDE1.Column5
                AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = PFDE1.Dimension1)
                AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = PFDE1.Dimension2)
                AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = PFDE1.Dimension3)
                AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = PFDE1.Dimension4)
                AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = PFDE1.Dimension5)
                AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = PFDE1.Dimension6)
                AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = PFDE1.Dimension7)
                AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = PFDE1.Dimension8)
                AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = PFDE1.Dimension9)
                AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = PFDE1.Dimension10)

                WHERE pfde.Column1 IS NULL
                ) PFDE1
                CROSS JOIN [dbo].[Table2] p
                ) pfd


                Hope my point is clear,it is untested.



                No doubts Dynamic Sql will further improve it, EXEC sys.sp_executesql @sql




                I'd also drop the DISTINCT - if it's really necessary because of
                duplicates then there is an underlying problem to be fixed as well.






                share













                IMO,Biggest mistake is the place where you are using CROSS JOIN.



                First self join [dbo].[Table1] and filter the resultset then with this resultset use CROSS JOIN.It will give tremendous boost .



                SELECT pfd.*
                FROM (
                SELECT PFDE1.Column1
                ,PFDE1.Column2
                ,PFDE1.Column3
                ,PFDE1.Column4
                ,PFDE1.Dimension1
                ,PFDE1.Dimension2
                ,PFDE1.[Dimension3]
                ,PFDE1.[Dimension4]
                ,PFDE1.[Dimension5]
                ,PFDE1.[Dimension6]
                ,PFDE1.[Dimension7]
                ,PFDE1.[Dimension8]
                ,PFDE1.[Dimension9]
                ,PFDE1.[Dimension10]
                ,p.FiscalPeriod AS Period
                ,0 AS Amt
                ,0 AS EndingBal
                ,PFDE1.Column5
                FROM (
                SELECT DISTINCT PFDE1.Column1
                ,PFDE1.Column2
                ,PFDE1.Column3
                ,PFDE1.Column4
                ,PFDE1.Dimension1
                ,PFDE1.Dimension2
                ,PFDE1.[Dimension3]
                ,PFDE1.[Dimension4]
                ,PFDE1.[Dimension5]
                ,PFDE1.[Dimension6]
                ,PFDE1.[Dimension7]
                ,PFDE1.[Dimension8]
                ,PFDE1.[Dimension9]
                ,PFDE1.[Dimension10]
                ,PFDE1.Column5
                FROM [dbo].[Table1] PFDE1
                LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = PFDE1.Column1
                AND PFDE.Column2 = PFDE1.Column2
                AND PFDE.Column3 = PFDE1.Column3
                AND PFDE.Column4 = PFDE1.Column4
                AND PFDE.Period = PFDE1.Period
                AND PFDE.Column5 = PFDE1.Column5
                AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = PFDE1.Dimension1)
                AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = PFDE1.Dimension2)
                AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = PFDE1.Dimension3)
                AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = PFDE1.Dimension4)
                AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = PFDE1.Dimension5)
                AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = PFDE1.Dimension6)
                AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = PFDE1.Dimension7)
                AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = PFDE1.Dimension8)
                AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = PFDE1.Dimension9)
                AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = PFDE1.Dimension10)

                WHERE pfde.Column1 IS NULL
                ) PFDE1
                CROSS JOIN [dbo].[Table2] p
                ) pfd


                Hope my point is clear,it is untested.



                No doubts Dynamic Sql will further improve it, EXEC sys.sp_executesql @sql




                I'd also drop the DISTINCT - if it's really necessary because of
                duplicates then there is an underlying problem to be fixed as well.







                share











                share


                share










                answered 6 mins ago









                KumarHarshKumarHarsh

                81358




                81358






























                    draft saved

                    draft discarded




















































                    Thanks for contributing an answer to Database Administrators Stack Exchange!


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid



                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.


                    To learn more, see our tips on writing great answers.




                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f227393%2fpoor-sql-server-performance-with-nested-or-conditions-in-left-outer-join%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    Popular posts from this blog

                    SQL Server 17 - Attemping to backup to remote NAS but Access is denied

                    Always On Availability groups resolving state after failover - Remote harden of transaction...

                    Restoring from pg_dump with foreign key constraints