Poor SQL Server Performance With Nested “OR” Conditions In LEFT OUTER JOIN

I have a tricky bit of SQL that's giving me a bit of trouble. I'd like to introduce several new columns to a key table whose components may or may not be used based on user configuration (Dimension's 1-10). For architectural reasons these are stored as "UNIQUEIDENTIFIER"s.

However upon the introduction of these new dimension columns in my DB, I'm now facing performance problems in several other stored procs that manipulate data in this key table, one of which I will include below. It appears as though the method that I have implemented to support these new columns is inefficient and I'm not exactly sure why that is the case and the best way to remedy it.

(Included SQL)

--In real usage this would be set by user configuration

DECLARE @Dimension1Enabled BIT = 1

DECLARE @Dimension2Enabled BIT = 1

DECLARE @Dimension3Enabled BIT = 1

DECLARE @Dimension4Enabled BIT = 1

DECLARE @Dimension5Enabled BIT = 1

DECLARE @Dimension6Enabled BIT = 1

DECLARE @Dimension7Enabled BIT = 1

DECLARE @Dimension8Enabled BIT = 1

DECLARE @Dimension9Enabled BIT = 1

DECLARE @Dimension10Enabled BIT = 1



/*

--Here's the table definition for reference

CREATE TABLE [dbo].[Table1](

    [Column1] [int] NOT NULL,

    [Column2] [uniqueidentifier] NOT NULL,

    [Column3] [uniqueidentifier] NOT NULL,

    [Column4] [char](3) NOT NULL,

    [Dimension1] [uniqueidentifier] NULL,

    [Dimension2] [uniqueidentifier] NULL,

    [Dimension3] [uniqueidentifier] NULL,

    [Dimension4] [uniqueidentifier] NULL,

    [Dimension5] [uniqueidentifier] NULL,

    [Dimension6] [uniqueidentifier] NULL,

    [Dimension7] [uniqueidentifier] NULL,

    [Dimension8] [uniqueidentifier] NULL,

    [Dimension9] [uniqueidentifier] NULL,

    [Dimension10] [uniqueidentifier] NULL,

    [Period] [int] NOT NULL,

    [Amt] [numeric](28, 12) NOT NULL,

    [EndingBal] [numeric](28, 12) NOT NULL,

    [PlanningSourceID] [int] NOT NULL,

    [Column5] [uniqueidentifier] NULL

) ON [PRIMARY]



--Simply contains 0-12

CREATE TABLE [dbo].[Table2] (

    [FiscalPeriod] [int] NOT NULL

    ,[FiscalPeriodDescription] [varchar](50) NOT NULL

    ,[FiscalPeriodDescriptionMonthName] [varchar](20) NULL

    ,[FiscalPeriodName] [varchar](255) NULL

)

*/



SELECT pfd.*

FROM (

    SELECT PFDE1.Column1

        ,PFDE1.Column2

        ,PFDE1.Column3

        ,PFDE1.Column4

        ,PFDE1.Dimension1

        ,PFDE1.Dimension2

        ,PFDE1.[Dimension3]

        ,PFDE1.[Dimension4]

        ,PFDE1.[Dimension5]

        ,PFDE1.[Dimension6]

        ,PFDE1.[Dimension7]

        ,PFDE1.[Dimension8]

        ,PFDE1.[Dimension9]

        ,PFDE1.[Dimension10]

        ,p.FiscalPeriod AS Period

        ,0 AS Amt

        ,0 AS EndingBal

        ,PFDE1.Column5

    FROM (

        SELECT DISTINCT PFDE1.Column1

            ,PFDE1.Column2

            ,PFDE1.Column3

            ,PFDE1.Column4

            ,PFDE1.Dimension1

            ,PFDE1.Dimension2

            ,PFDE1.[Dimension3]

            ,PFDE1.[Dimension4]

            ,PFDE1.[Dimension5]

            ,PFDE1.[Dimension6]

            ,PFDE1.[Dimension7]

            ,PFDE1.[Dimension8]

            ,PFDE1.[Dimension9]

            ,PFDE1.[Dimension10]

            ,PFDE1.Column5

        FROM [dbo].[Table1] PFDE1

        ) PFDE1

    CROSS JOIN [dbo].[Table2] p

    ) pfd

LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = pfd.Column1

    AND PFDE.Column2 = pfd.Column2

    AND PFDE.Column3 = pfd.Column3

    AND PFDE.Column4 = pfd.Column4

    --This section causes great slowness

    AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = pfd.Dimension1)

    AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = pfd.Dimension2)

    AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = pfd.Dimension3)

    AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = pfd.Dimension4)

    AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = pfd.Dimension5)

    AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = pfd.Dimension6)

    AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = pfd.Dimension7)

    AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = pfd.Dimension8)

    AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = pfd.Dimension9)

    AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = pfd.Dimension10) 

    AND PFDE.Period = pfd.Period

    AND PFDE.Column5 = pfd.Column5

WHERE pfde.Column1 IS NULL

With 50,000 records in Table1 the performance of this query is quite poor, several minutes at least. Remove the "@DimensionXEnabled = 0 OR" condition though and the query will run in several seconds.

I feel I have several options to optimize this, I could do one of the following:

Change the "Dimension" columns to NOT NULL and populate with 0 Guid Values - I believe this would have a performance impact because now I'm coping a large amount of data with each insert. I've tested this option and it does work better but there is a definite performance impact and I'd like to avoid it if possible.

Convert the query to dynamic SQL and generate the join conditions dynamically - Would need to do this in 100+ other places and have a DB full of dynamic SQL, not looking forward to this option either.

Wrap the "Dimension1 = Dimension1" in ISNULL statements rather than based the join condition off of a variable - I believe this would result in a inefficient join statement as the query would no longer be able to use indexes.

I feel that the SQL optimizer is making bad decisions with this query, if I write the query without the OR condition it run's quickly, even though the results are exactly the same. The query plans look almost identical other than that in the long running join, most of the processing time is used in the LEFT OUTER JOIN statements merge:

(With "OR" Conditions)

enter image description here

(Without "OR" Conditions)

enter image description here

I'm unsure how to proceed, my question is, is there a better option?

Update 1

I've been doing more research into this, attempting various ways to get around the poor performance, so far I haven't found any good solutions just more information:

Tried column "IS NULL" without success (PFDE.Dimension1 IS NULL OR PFDE.Dimension1 = pfd.Dimension1), ran even slower, that doesn't seem to do it.

Tired join "Hints" (HASH, MERGE, LOOP), no performance impact here, in fact in some cases they just ran slower.

Tried option 3, actually this seems to run very quick, I think because Table2 doesn't have any indexes, this approach could be a problem in the future though...

I've tried dynamic SQL (@AaronBertrand recommendation), actually the results were somewhat underwhelming (30 sec+), not sure why it performed so poorly, but that might be something that I could improve with some further optimizations, I'm looking into this more..

I tried the "OPTION (RECOMPILE);" mentioned in this post. Saw definite improvements (~15 secs), still not as fast as not dimensions but maybe something else I can iterate on.

edited Jan 17 at 22:48

asked Jan 17 at 15:54

David Rogers

1095

add a comment |

(Included SQL)

--In real usage this would be set by user configuration

DECLARE @Dimension1Enabled BIT = 1

DECLARE @Dimension2Enabled BIT = 1

DECLARE @Dimension3Enabled BIT = 1

DECLARE @Dimension4Enabled BIT = 1

DECLARE @Dimension5Enabled BIT = 1

DECLARE @Dimension6Enabled BIT = 1

DECLARE @Dimension7Enabled BIT = 1

DECLARE @Dimension8Enabled BIT = 1

DECLARE @Dimension9Enabled BIT = 1

DECLARE @Dimension10Enabled BIT = 1



/*

--Here's the table definition for reference

CREATE TABLE [dbo].[Table1](

    [Column1] [int] NOT NULL,

    [Column2] [uniqueidentifier] NOT NULL,

    [Column3] [uniqueidentifier] NOT NULL,

    [Column4] [char](3) NOT NULL,

    [Dimension1] [uniqueidentifier] NULL,

    [Dimension2] [uniqueidentifier] NULL,

    [Dimension3] [uniqueidentifier] NULL,

    [Dimension4] [uniqueidentifier] NULL,

    [Dimension5] [uniqueidentifier] NULL,

    [Dimension6] [uniqueidentifier] NULL,

    [Dimension7] [uniqueidentifier] NULL,

    [Dimension8] [uniqueidentifier] NULL,

    [Dimension9] [uniqueidentifier] NULL,

    [Dimension10] [uniqueidentifier] NULL,

    [Period] [int] NOT NULL,

    [Amt] [numeric](28, 12) NOT NULL,

    [EndingBal] [numeric](28, 12) NOT NULL,

    [PlanningSourceID] [int] NOT NULL,

    [Column5] [uniqueidentifier] NULL

) ON [PRIMARY]



--Simply contains 0-12

CREATE TABLE [dbo].[Table2] (

    [FiscalPeriod] [int] NOT NULL

    ,[FiscalPeriodDescription] [varchar](50) NOT NULL

    ,[FiscalPeriodDescriptionMonthName] [varchar](20) NULL

    ,[FiscalPeriodName] [varchar](255) NULL

)

*/



SELECT pfd.*

FROM (

    SELECT PFDE1.Column1

        ,PFDE1.Column2

        ,PFDE1.Column3

        ,PFDE1.Column4

        ,PFDE1.Dimension1

        ,PFDE1.Dimension2

        ,PFDE1.[Dimension3]

        ,PFDE1.[Dimension4]

        ,PFDE1.[Dimension5]

        ,PFDE1.[Dimension6]

        ,PFDE1.[Dimension7]

        ,PFDE1.[Dimension8]

        ,PFDE1.[Dimension9]

        ,PFDE1.[Dimension10]

        ,p.FiscalPeriod AS Period

        ,0 AS Amt

        ,0 AS EndingBal

        ,PFDE1.Column5

    FROM (

        SELECT DISTINCT PFDE1.Column1

            ,PFDE1.Column2

            ,PFDE1.Column3

            ,PFDE1.Column4

            ,PFDE1.Dimension1

            ,PFDE1.Dimension2

            ,PFDE1.[Dimension3]

            ,PFDE1.[Dimension4]

            ,PFDE1.[Dimension5]

            ,PFDE1.[Dimension6]

            ,PFDE1.[Dimension7]

            ,PFDE1.[Dimension8]

            ,PFDE1.[Dimension9]

            ,PFDE1.[Dimension10]

            ,PFDE1.Column5

        FROM [dbo].[Table1] PFDE1

        ) PFDE1

    CROSS JOIN [dbo].[Table2] p

    ) pfd

LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = pfd.Column1

    AND PFDE.Column2 = pfd.Column2

    AND PFDE.Column3 = pfd.Column3

    AND PFDE.Column4 = pfd.Column4

    --This section causes great slowness

    AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = pfd.Dimension1)

    AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = pfd.Dimension2)

    AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = pfd.Dimension3)

    AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = pfd.Dimension4)

    AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = pfd.Dimension5)

    AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = pfd.Dimension6)

    AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = pfd.Dimension7)

    AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = pfd.Dimension8)

    AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = pfd.Dimension9)

    AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = pfd.Dimension10) 

    AND PFDE.Period = pfd.Period

    AND PFDE.Column5 = pfd.Column5

WHERE pfde.Column1 IS NULL

With 50,000 records in Table1 the performance of this query is quite poor, several minutes at least. Remove the "@DimensionXEnabled = 0 OR" condition though and the query will run in several seconds.

I feel I have several options to optimize this, I could do one of the following:

Change the "Dimension" columns to NOT NULL and populate with 0 Guid Values - I believe this would have a performance impact because now I'm coping a large amount of data with each insert. I've tested this option and it does work better but there is a definite performance impact and I'd like to avoid it if possible.

Convert the query to dynamic SQL and generate the join conditions dynamically - Would need to do this in 100+ other places and have a DB full of dynamic SQL, not looking forward to this option either.

Wrap the "Dimension1 = Dimension1" in ISNULL statements rather than based the join condition off of a variable - I believe this would result in a inefficient join statement as the query would no longer be able to use indexes.

(With "OR" Conditions)

enter image description here

(Without "OR" Conditions)

enter image description here

I'm unsure how to proceed, my question is, is there a better option?

Update 1

I've been doing more research into this, attempting various ways to get around the poor performance, so far I haven't found any good solutions just more information:

Tried column "IS NULL" without success (PFDE.Dimension1 IS NULL OR PFDE.Dimension1 = pfd.Dimension1), ran even slower, that doesn't seem to do it.

Tired join "Hints" (HASH, MERGE, LOOP), no performance impact here, in fact in some cases they just ran slower.

Tried option 3, actually this seems to run very quick, I think because Table2 doesn't have any indexes, this approach could be a problem in the future though...

I've tried dynamic SQL (@AaronBertrand recommendation), actually the results were somewhat underwhelming (30 sec+), not sure why it performed so poorly, but that might be something that I could improve with some further optimizations, I'm looking into this more..

I tried the "OPTION (RECOMPILE);" mentioned in this post. Saw definite improvements (~15 secs), still not as fast as not dimensions but maybe something else I can iterate on.

edited Jan 17 at 22:48

asked Jan 17 at 15:54

David Rogers

1095

add a comment |

(Included SQL)

--In real usage this would be set by user configuration

DECLARE @Dimension1Enabled BIT = 1

DECLARE @Dimension2Enabled BIT = 1

DECLARE @Dimension3Enabled BIT = 1

DECLARE @Dimension4Enabled BIT = 1

DECLARE @Dimension5Enabled BIT = 1

DECLARE @Dimension6Enabled BIT = 1

DECLARE @Dimension7Enabled BIT = 1

DECLARE @Dimension8Enabled BIT = 1

DECLARE @Dimension9Enabled BIT = 1

DECLARE @Dimension10Enabled BIT = 1



/*

--Here's the table definition for reference

CREATE TABLE [dbo].[Table1](

    [Column1] [int] NOT NULL,

    [Column2] [uniqueidentifier] NOT NULL,

    [Column3] [uniqueidentifier] NOT NULL,

    [Column4] [char](3) NOT NULL,

    [Dimension1] [uniqueidentifier] NULL,

    [Dimension2] [uniqueidentifier] NULL,

    [Dimension3] [uniqueidentifier] NULL,

    [Dimension4] [uniqueidentifier] NULL,

    [Dimension5] [uniqueidentifier] NULL,

    [Dimension6] [uniqueidentifier] NULL,

    [Dimension7] [uniqueidentifier] NULL,

    [Dimension8] [uniqueidentifier] NULL,

    [Dimension9] [uniqueidentifier] NULL,

    [Dimension10] [uniqueidentifier] NULL,

    [Period] [int] NOT NULL,

    [Amt] [numeric](28, 12) NOT NULL,

    [EndingBal] [numeric](28, 12) NOT NULL,

    [PlanningSourceID] [int] NOT NULL,

    [Column5] [uniqueidentifier] NULL

) ON [PRIMARY]



--Simply contains 0-12

CREATE TABLE [dbo].[Table2] (

    [FiscalPeriod] [int] NOT NULL

    ,[FiscalPeriodDescription] [varchar](50) NOT NULL

    ,[FiscalPeriodDescriptionMonthName] [varchar](20) NULL

    ,[FiscalPeriodName] [varchar](255) NULL

)

*/



SELECT pfd.*

FROM (

    SELECT PFDE1.Column1

        ,PFDE1.Column2

        ,PFDE1.Column3

        ,PFDE1.Column4

        ,PFDE1.Dimension1

        ,PFDE1.Dimension2

        ,PFDE1.[Dimension3]

        ,PFDE1.[Dimension4]

        ,PFDE1.[Dimension5]

        ,PFDE1.[Dimension6]

        ,PFDE1.[Dimension7]

        ,PFDE1.[Dimension8]

        ,PFDE1.[Dimension9]

        ,PFDE1.[Dimension10]

        ,p.FiscalPeriod AS Period

        ,0 AS Amt

        ,0 AS EndingBal

        ,PFDE1.Column5

    FROM (

        SELECT DISTINCT PFDE1.Column1

            ,PFDE1.Column2

            ,PFDE1.Column3

            ,PFDE1.Column4

            ,PFDE1.Dimension1

            ,PFDE1.Dimension2

            ,PFDE1.[Dimension3]

            ,PFDE1.[Dimension4]

            ,PFDE1.[Dimension5]

            ,PFDE1.[Dimension6]

            ,PFDE1.[Dimension7]

            ,PFDE1.[Dimension8]

            ,PFDE1.[Dimension9]

            ,PFDE1.[Dimension10]

            ,PFDE1.Column5

        FROM [dbo].[Table1] PFDE1

        ) PFDE1

    CROSS JOIN [dbo].[Table2] p

    ) pfd

LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = pfd.Column1

    AND PFDE.Column2 = pfd.Column2

    AND PFDE.Column3 = pfd.Column3

    AND PFDE.Column4 = pfd.Column4

    --This section causes great slowness

    AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = pfd.Dimension1)

    AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = pfd.Dimension2)

    AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = pfd.Dimension3)

    AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = pfd.Dimension4)

    AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = pfd.Dimension5)

    AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = pfd.Dimension6)

    AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = pfd.Dimension7)

    AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = pfd.Dimension8)

    AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = pfd.Dimension9)

    AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = pfd.Dimension10) 

    AND PFDE.Period = pfd.Period

    AND PFDE.Column5 = pfd.Column5

WHERE pfde.Column1 IS NULL

With 50,000 records in Table1 the performance of this query is quite poor, several minutes at least. Remove the "@DimensionXEnabled = 0 OR" condition though and the query will run in several seconds.

I feel I have several options to optimize this, I could do one of the following:

Change the "Dimension" columns to NOT NULL and populate with 0 Guid Values - I believe this would have a performance impact because now I'm coping a large amount of data with each insert. I've tested this option and it does work better but there is a definite performance impact and I'd like to avoid it if possible.

Convert the query to dynamic SQL and generate the join conditions dynamically - Would need to do this in 100+ other places and have a DB full of dynamic SQL, not looking forward to this option either.

Wrap the "Dimension1 = Dimension1" in ISNULL statements rather than based the join condition off of a variable - I believe this would result in a inefficient join statement as the query would no longer be able to use indexes.

(With "OR" Conditions)

enter image description here

(Without "OR" Conditions)

enter image description here

I'm unsure how to proceed, my question is, is there a better option?

Update 1

I've been doing more research into this, attempting various ways to get around the poor performance, so far I haven't found any good solutions just more information:

Tried column "IS NULL" without success (PFDE.Dimension1 IS NULL OR PFDE.Dimension1 = pfd.Dimension1), ran even slower, that doesn't seem to do it.

Tired join "Hints" (HASH, MERGE, LOOP), no performance impact here, in fact in some cases they just ran slower.

Tried option 3, actually this seems to run very quick, I think because Table2 doesn't have any indexes, this approach could be a problem in the future though...

I've tried dynamic SQL (@AaronBertrand recommendation), actually the results were somewhat underwhelming (30 sec+), not sure why it performed so poorly, but that might be something that I could improve with some further optimizations, I'm looking into this more..

I tried the "OPTION (RECOMPILE);" mentioned in this post. Saw definite improvements (~15 secs), still not as fast as not dimensions but maybe something else I can iterate on.

edited Jan 17 at 22:48

asked Jan 17 at 15:54

David Rogers

1095

(Included SQL)

--In real usage this would be set by user configuration

DECLARE @Dimension1Enabled BIT = 1

DECLARE @Dimension2Enabled BIT = 1

DECLARE @Dimension3Enabled BIT = 1

DECLARE @Dimension4Enabled BIT = 1

DECLARE @Dimension5Enabled BIT = 1

DECLARE @Dimension6Enabled BIT = 1

DECLARE @Dimension7Enabled BIT = 1

DECLARE @Dimension8Enabled BIT = 1

DECLARE @Dimension9Enabled BIT = 1

DECLARE @Dimension10Enabled BIT = 1



/*

--Here's the table definition for reference

CREATE TABLE [dbo].[Table1](

    [Column1] [int] NOT NULL,

    [Column2] [uniqueidentifier] NOT NULL,

    [Column3] [uniqueidentifier] NOT NULL,

    [Column4] [char](3) NOT NULL,

    [Dimension1] [uniqueidentifier] NULL,

    [Dimension2] [uniqueidentifier] NULL,

    [Dimension3] [uniqueidentifier] NULL,

    [Dimension4] [uniqueidentifier] NULL,

    [Dimension5] [uniqueidentifier] NULL,

    [Dimension6] [uniqueidentifier] NULL,

    [Dimension7] [uniqueidentifier] NULL,

    [Dimension8] [uniqueidentifier] NULL,

    [Dimension9] [uniqueidentifier] NULL,

    [Dimension10] [uniqueidentifier] NULL,

    [Period] [int] NOT NULL,

    [Amt] [numeric](28, 12) NOT NULL,

    [EndingBal] [numeric](28, 12) NOT NULL,

    [PlanningSourceID] [int] NOT NULL,

    [Column5] [uniqueidentifier] NULL

) ON [PRIMARY]



--Simply contains 0-12

CREATE TABLE [dbo].[Table2] (

    [FiscalPeriod] [int] NOT NULL

    ,[FiscalPeriodDescription] [varchar](50) NOT NULL

    ,[FiscalPeriodDescriptionMonthName] [varchar](20) NULL

    ,[FiscalPeriodName] [varchar](255) NULL

)

*/



SELECT pfd.*

FROM (

    SELECT PFDE1.Column1

        ,PFDE1.Column2

        ,PFDE1.Column3

        ,PFDE1.Column4

        ,PFDE1.Dimension1

        ,PFDE1.Dimension2

        ,PFDE1.[Dimension3]

        ,PFDE1.[Dimension4]

        ,PFDE1.[Dimension5]

        ,PFDE1.[Dimension6]

        ,PFDE1.[Dimension7]

        ,PFDE1.[Dimension8]

        ,PFDE1.[Dimension9]

        ,PFDE1.[Dimension10]

        ,p.FiscalPeriod AS Period

        ,0 AS Amt

        ,0 AS EndingBal

        ,PFDE1.Column5

    FROM (

        SELECT DISTINCT PFDE1.Column1

            ,PFDE1.Column2

            ,PFDE1.Column3

            ,PFDE1.Column4

            ,PFDE1.Dimension1

            ,PFDE1.Dimension2

            ,PFDE1.[Dimension3]

            ,PFDE1.[Dimension4]

            ,PFDE1.[Dimension5]

            ,PFDE1.[Dimension6]

            ,PFDE1.[Dimension7]

            ,PFDE1.[Dimension8]

            ,PFDE1.[Dimension9]

            ,PFDE1.[Dimension10]

            ,PFDE1.Column5

        FROM [dbo].[Table1] PFDE1

        ) PFDE1

    CROSS JOIN [dbo].[Table2] p

    ) pfd

LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = pfd.Column1

    AND PFDE.Column2 = pfd.Column2

    AND PFDE.Column3 = pfd.Column3

    AND PFDE.Column4 = pfd.Column4

    --This section causes great slowness

    AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = pfd.Dimension1)

    AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = pfd.Dimension2)

    AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = pfd.Dimension3)

    AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = pfd.Dimension4)

    AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = pfd.Dimension5)

    AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = pfd.Dimension6)

    AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = pfd.Dimension7)

    AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = pfd.Dimension8)

    AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = pfd.Dimension9)

    AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = pfd.Dimension10) 

    AND PFDE.Period = pfd.Period

    AND PFDE.Column5 = pfd.Column5

WHERE pfde.Column1 IS NULL

With 50,000 records in Table1 the performance of this query is quite poor, several minutes at least. Remove the "@DimensionXEnabled = 0 OR" condition though and the query will run in several seconds.

I feel I have several options to optimize this, I could do one of the following:

Change the "Dimension" columns to NOT NULL and populate with 0 Guid Values - I believe this would have a performance impact because now I'm coping a large amount of data with each insert. I've tested this option and it does work better but there is a definite performance impact and I'd like to avoid it if possible.

Convert the query to dynamic SQL and generate the join conditions dynamically - Would need to do this in 100+ other places and have a DB full of dynamic SQL, not looking forward to this option either.

Wrap the "Dimension1 = Dimension1" in ISNULL statements rather than based the join condition off of a variable - I believe this would result in a inefficient join statement as the query would no longer be able to use indexes.

(With "OR" Conditions)

enter image description here

(Without "OR" Conditions)

enter image description here

I'm unsure how to proceed, my question is, is there a better option?

Update 1

I've been doing more research into this, attempting various ways to get around the poor performance, so far I haven't found any good solutions just more information:

Tried column "IS NULL" without success (PFDE.Dimension1 IS NULL OR PFDE.Dimension1 = pfd.Dimension1), ran even slower, that doesn't seem to do it.

Tired join "Hints" (HASH, MERGE, LOOP), no performance impact here, in fact in some cases they just ran slower.

Tried option 3, actually this seems to run very quick, I think because Table2 doesn't have any indexes, this approach could be a problem in the future though...

I've tried dynamic SQL (@AaronBertrand recommendation), actually the results were somewhat underwhelming (30 sec+), not sure why it performed so poorly, but that might be something that I could improve with some further optimizations, I'm looking into this more..

I tried the "OPTION (RECOMPILE);" mentioned in this post. Saw definite improvements (~15 secs), still not as fast as not dimensions but maybe something else I can iterate on.

sql-server database-design performance query-performance optimization

edited Jan 17 at 22:48

asked Jan 17 at 15:54

David Rogers

1095

edited Jan 17 at 22:48

asked Jan 17 at 15:54

David Rogers

1095

edited Jan 17 at 22:48

asked Jan 17 at 15:54

David Rogers

1095

asked Jan 17 at 15:54

David Rogers

1095

asked Jan 17 at 15:54

David Rogers

1095

add a comment |

2 Answers
2

active

oldest

votes

Sounds like a kitchen sink query. You can make this work well for most cases with dynamic SQL, since you can generate a different execution plan for each combination of parameters.

DECLARE @sql nvarchar(max) = N'

    SELECT DISTINCT PFDE1.Column1

        ,PFDE1.Column2

        ,PFDE1.Column3

        ,PFDE1.Column4

        ,PFDE1.Dimension1

        ,PFDE1.Dimension2

        ...

        ,PFDE1.[Dimension10]

        ,PFDE1.Column5

    FROM [dbo].[Table1] PFDE1

    ) PFDE1

CROSS JOIN [dbo].[Table2] p

) pfd

LEFT JOIN [dbo].[Table1] pfde 

ON PFDE.Column1 = pfd.Column1

AND PFDE.Column2 = pfd.Column2

AND PFDE.Column3 = pfd.Column3

AND PFDE.Column4 = pfd.Column4'



+ CASE WHEN @Dimension1Enabled = 1 THEN 

  N' AND PFDE.Dimension1 = pfd.Dimension1' ELSE N'' END

+ CASE WHEN @Dimension2Enabled = 1 THEN 

  N' AND PFDE.Dimension2 = pfd.Dimension2' ELSE N'' END



+ ...



+ CASE WHEN @Dimension10Enabled = 1 THEN 

  N' AND PFDE.Dimension10 = pfd.Dimension10' ELSE '' END



+ N'    AND PFDE.Period = pfd.Period

  AND PFDE.Column5 = pfd.Column5

  WHERE pfde.Column1 IS NULL;';



EXEC sys.sp_executesql @sql;

I'd also drop the DISTINCT - if it's really necessary because of duplicates then there is an underlying problem to be fixed as well.

answered Jan 17 at 16:16

Aaron Bertrand♦

150k18284482

Yep, this would be a good implementation of option 2, but I'm kinda hoping to avoiding this as if I followed this approach here I'd have to resort to this everywhere else I implement this, there are 100+ places and I'd prefer not to maintain a DB full of dynamic SQL...

– David Rogers
Jan 17 at 16:21

Well 1 and 3 don't sound like viable options either, since 1 is a performance hit and 3 is not going to change anything (they boil down to the same logic), so... sometimes you just have to pick your poison. A pattern is just a pattern and if dealing with users screaming is more important than a little extra code, you need to make a choice.

– Aaron Bertrand♦
Jan 17 at 16:35

Well each has it's advantages and disadvantages, the issue is I'm not just making a decision for this one query, I'm making a decision for 10+ tables, 100+ procs, and many, many other associated items, it will take over a month to implement and test it properly, and once it's done I'll have to live with the consequences in over 100 production environment for years. So this question really is only the tip of the iceberg, a small manifestation of a broader architectural decision that I need to make and then live with. Though I accept that I may need to go with more than one option...

– David Rogers
Jan 17 at 23:37

Well we as a peer group can tell you your options, and tell you which ones we think are best, but we can't crowd-source your decision for you. As for the testing mentioned in your update, did you look at any execution plans? Any query that takes 15 or 20 seconds is already 100 times worse than it probably should be. Why is it taking so long? What are the wait types? Is it being blocked? Are you returning 40 billion rows to an SSMS grid over a slow network connection? If you read the kitchen sink article I posted, dynamic SQL can help for presence/absence of parameters, and option (recompile)

– Aaron Bertrand♦
Jan 18 at 12:16

...can help for data skew. You can also combine them. But given a 15 second query, using dynamic SQL or not is unlikely to fix that, there is something else bigger going on here.

– Aaron Bertrand♦
Jan 18 at 12:20

add a comment |

IMO,Biggest mistake is the place where you are using CROSS JOIN.

First self join [dbo].[Table1] and filter the resultset then with this resultset use CROSS JOIN.It will give tremendous boost .

SELECT pfd.*

FROM (

    SELECT PFDE1.Column1

        ,PFDE1.Column2

        ,PFDE1.Column3

        ,PFDE1.Column4

        ,PFDE1.Dimension1

        ,PFDE1.Dimension2

        ,PFDE1.[Dimension3]

        ,PFDE1.[Dimension4]

        ,PFDE1.[Dimension5]

        ,PFDE1.[Dimension6]

        ,PFDE1.[Dimension7]

        ,PFDE1.[Dimension8]

        ,PFDE1.[Dimension9]

        ,PFDE1.[Dimension10]

        ,p.FiscalPeriod AS Period

        ,0 AS Amt

        ,0 AS EndingBal

        ,PFDE1.Column5

    FROM (

        SELECT DISTINCT PFDE1.Column1

            ,PFDE1.Column2

            ,PFDE1.Column3

            ,PFDE1.Column4

            ,PFDE1.Dimension1

            ,PFDE1.Dimension2

            ,PFDE1.[Dimension3]

            ,PFDE1.[Dimension4]

            ,PFDE1.[Dimension5]

            ,PFDE1.[Dimension6]

            ,PFDE1.[Dimension7]

            ,PFDE1.[Dimension8]

            ,PFDE1.[Dimension9]

            ,PFDE1.[Dimension10]

            ,PFDE1.Column5

        FROM [dbo].[Table1] PFDE1

    LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = PFDE1.Column1

    AND PFDE.Column2 = PFDE1.Column2

    AND PFDE.Column3 = PFDE1.Column3

    AND PFDE.Column4 = PFDE1.Column4

    AND PFDE.Period = PFDE1.Period

    AND PFDE.Column5 = PFDE1.Column5

    AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = PFDE1.Dimension1)

    AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = PFDE1.Dimension2)

    AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = PFDE1.Dimension3)

    AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = PFDE1.Dimension4)

    AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = PFDE1.Dimension5)

    AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = PFDE1.Dimension6)

    AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = PFDE1.Dimension7)

    AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = PFDE1.Dimension8)

    AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = PFDE1.Dimension9)

    AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = PFDE1.Dimension10) 



WHERE pfde.Column1 IS NULL

        ) PFDE1

    CROSS JOIN [dbo].[Table2] p

    ) pfd

Hope my point is clear,it is untested.

No doubts Dynamic Sql will further improve it, EXEC sys.sp_executesql @sql

I'd also drop the DISTINCT - if it's really necessary because of
duplicates then there is an underlying problem to be fixed as well.

answered 6 mins ago

KumarHarsh

81358

add a comment |

Your Answer

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f227393%2fpoor-sql-server-performance-with-nested-or-conditions-in-left-outer-join%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

Sounds like a kitchen sink query. You can make this work well for most cases with dynamic SQL, since you can generate a different execution plan for each combination of parameters.

DECLARE @sql nvarchar(max) = N'

    SELECT DISTINCT PFDE1.Column1

        ,PFDE1.Column2

        ,PFDE1.Column3

        ,PFDE1.Column4

        ,PFDE1.Dimension1

        ,PFDE1.Dimension2

        ...

        ,PFDE1.[Dimension10]

        ,PFDE1.Column5

    FROM [dbo].[Table1] PFDE1

    ) PFDE1

CROSS JOIN [dbo].[Table2] p

) pfd

LEFT JOIN [dbo].[Table1] pfde 

ON PFDE.Column1 = pfd.Column1

AND PFDE.Column2 = pfd.Column2

AND PFDE.Column3 = pfd.Column3

AND PFDE.Column4 = pfd.Column4'



+ CASE WHEN @Dimension1Enabled = 1 THEN 

  N' AND PFDE.Dimension1 = pfd.Dimension1' ELSE N'' END

+ CASE WHEN @Dimension2Enabled = 1 THEN 

  N' AND PFDE.Dimension2 = pfd.Dimension2' ELSE N'' END



+ ...



+ CASE WHEN @Dimension10Enabled = 1 THEN 

  N' AND PFDE.Dimension10 = pfd.Dimension10' ELSE '' END



+ N'    AND PFDE.Period = pfd.Period

  AND PFDE.Column5 = pfd.Column5

  WHERE pfde.Column1 IS NULL;';



EXEC sys.sp_executesql @sql;

I'd also drop the DISTINCT - if it's really necessary because of duplicates then there is an underlying problem to be fixed as well.

answered Jan 17 at 16:16

Aaron Bertrand♦

150k18284482

Yep, this would be a good implementation of option 2, but I'm kinda hoping to avoiding this as if I followed this approach here I'd have to resort to this everywhere else I implement this, there are 100+ places and I'd prefer not to maintain a DB full of dynamic SQL...

– David Rogers
Jan 17 at 16:21

Well 1 and 3 don't sound like viable options either, since 1 is a performance hit and 3 is not going to change anything (they boil down to the same logic), so... sometimes you just have to pick your poison. A pattern is just a pattern and if dealing with users screaming is more important than a little extra code, you need to make a choice.

– Aaron Bertrand♦
Jan 17 at 16:35

Well each has it's advantages and disadvantages, the issue is I'm not just making a decision for this one query, I'm making a decision for 10+ tables, 100+ procs, and many, many other associated items, it will take over a month to implement and test it properly, and once it's done I'll have to live with the consequences in over 100 production environment for years. So this question really is only the tip of the iceberg, a small manifestation of a broader architectural decision that I need to make and then live with. Though I accept that I may need to go with more than one option...

– David Rogers
Jan 17 at 23:37

Well we as a peer group can tell you your options, and tell you which ones we think are best, but we can't crowd-source your decision for you. As for the testing mentioned in your update, did you look at any execution plans? Any query that takes 15 or 20 seconds is already 100 times worse than it probably should be. Why is it taking so long? What are the wait types? Is it being blocked? Are you returning 40 billion rows to an SSMS grid over a slow network connection? If you read the kitchen sink article I posted, dynamic SQL can help for presence/absence of parameters, and option (recompile)

– Aaron Bertrand♦
Jan 18 at 12:16

...can help for data skew. You can also combine them. But given a 15 second query, using dynamic SQL or not is unlikely to fix that, there is something else bigger going on here.

– Aaron Bertrand♦
Jan 18 at 12:20

add a comment |

Sounds like a kitchen sink query. You can make this work well for most cases with dynamic SQL, since you can generate a different execution plan for each combination of parameters.

DECLARE @sql nvarchar(max) = N'

    SELECT DISTINCT PFDE1.Column1

        ,PFDE1.Column2

        ,PFDE1.Column3

        ,PFDE1.Column4

        ,PFDE1.Dimension1

        ,PFDE1.Dimension2

        ...

        ,PFDE1.[Dimension10]

        ,PFDE1.Column5

    FROM [dbo].[Table1] PFDE1

    ) PFDE1

CROSS JOIN [dbo].[Table2] p

) pfd

LEFT JOIN [dbo].[Table1] pfde 

ON PFDE.Column1 = pfd.Column1

AND PFDE.Column2 = pfd.Column2

AND PFDE.Column3 = pfd.Column3

AND PFDE.Column4 = pfd.Column4'



+ CASE WHEN @Dimension1Enabled = 1 THEN 

  N' AND PFDE.Dimension1 = pfd.Dimension1' ELSE N'' END

+ CASE WHEN @Dimension2Enabled = 1 THEN 

  N' AND PFDE.Dimension2 = pfd.Dimension2' ELSE N'' END



+ ...



+ CASE WHEN @Dimension10Enabled = 1 THEN 

  N' AND PFDE.Dimension10 = pfd.Dimension10' ELSE '' END



+ N'    AND PFDE.Period = pfd.Period

  AND PFDE.Column5 = pfd.Column5

  WHERE pfde.Column1 IS NULL;';



EXEC sys.sp_executesql @sql;

I'd also drop the DISTINCT - if it's really necessary because of duplicates then there is an underlying problem to be fixed as well.

answered Jan 17 at 16:16

Aaron Bertrand♦

150k18284482

Yep, this would be a good implementation of option 2, but I'm kinda hoping to avoiding this as if I followed this approach here I'd have to resort to this everywhere else I implement this, there are 100+ places and I'd prefer not to maintain a DB full of dynamic SQL...

– David Rogers
Jan 17 at 16:21

Well 1 and 3 don't sound like viable options either, since 1 is a performance hit and 3 is not going to change anything (they boil down to the same logic), so... sometimes you just have to pick your poison. A pattern is just a pattern and if dealing with users screaming is more important than a little extra code, you need to make a choice.

– Aaron Bertrand♦
Jan 17 at 16:35

Well each has it's advantages and disadvantages, the issue is I'm not just making a decision for this one query, I'm making a decision for 10+ tables, 100+ procs, and many, many other associated items, it will take over a month to implement and test it properly, and once it's done I'll have to live with the consequences in over 100 production environment for years. So this question really is only the tip of the iceberg, a small manifestation of a broader architectural decision that I need to make and then live with. Though I accept that I may need to go with more than one option...

– David Rogers
Jan 17 at 23:37

Well we as a peer group can tell you your options, and tell you which ones we think are best, but we can't crowd-source your decision for you. As for the testing mentioned in your update, did you look at any execution plans? Any query that takes 15 or 20 seconds is already 100 times worse than it probably should be. Why is it taking so long? What are the wait types? Is it being blocked? Are you returning 40 billion rows to an SSMS grid over a slow network connection? If you read the kitchen sink article I posted, dynamic SQL can help for presence/absence of parameters, and option (recompile)

– Aaron Bertrand♦
Jan 18 at 12:16

...can help for data skew. You can also combine them. But given a 15 second query, using dynamic SQL or not is unlikely to fix that, there is something else bigger going on here.

– Aaron Bertrand♦
Jan 18 at 12:20

add a comment |

Sounds like a kitchen sink query. You can make this work well for most cases with dynamic SQL, since you can generate a different execution plan for each combination of parameters.

DECLARE @sql nvarchar(max) = N'

    SELECT DISTINCT PFDE1.Column1

        ,PFDE1.Column2

        ,PFDE1.Column3

        ,PFDE1.Column4

        ,PFDE1.Dimension1

        ,PFDE1.Dimension2

        ...

        ,PFDE1.[Dimension10]

        ,PFDE1.Column5

    FROM [dbo].[Table1] PFDE1

    ) PFDE1

CROSS JOIN [dbo].[Table2] p

) pfd

LEFT JOIN [dbo].[Table1] pfde 

ON PFDE.Column1 = pfd.Column1

AND PFDE.Column2 = pfd.Column2

AND PFDE.Column3 = pfd.Column3

AND PFDE.Column4 = pfd.Column4'



+ CASE WHEN @Dimension1Enabled = 1 THEN 

  N' AND PFDE.Dimension1 = pfd.Dimension1' ELSE N'' END

+ CASE WHEN @Dimension2Enabled = 1 THEN 

  N' AND PFDE.Dimension2 = pfd.Dimension2' ELSE N'' END



+ ...



+ CASE WHEN @Dimension10Enabled = 1 THEN 

  N' AND PFDE.Dimension10 = pfd.Dimension10' ELSE '' END



+ N'    AND PFDE.Period = pfd.Period

  AND PFDE.Column5 = pfd.Column5

  WHERE pfde.Column1 IS NULL;';



EXEC sys.sp_executesql @sql;

I'd also drop the DISTINCT - if it's really necessary because of duplicates then there is an underlying problem to be fixed as well.

answered Jan 17 at 16:16

Aaron Bertrand♦

150k18284482

Sounds like a kitchen sink query. You can make this work well for most cases with dynamic SQL, since you can generate a different execution plan for each combination of parameters.

DECLARE @sql nvarchar(max) = N'

    SELECT DISTINCT PFDE1.Column1

        ,PFDE1.Column2

        ,PFDE1.Column3

        ,PFDE1.Column4

        ,PFDE1.Dimension1

        ,PFDE1.Dimension2

        ...

        ,PFDE1.[Dimension10]

        ,PFDE1.Column5

    FROM [dbo].[Table1] PFDE1

    ) PFDE1

CROSS JOIN [dbo].[Table2] p

) pfd

LEFT JOIN [dbo].[Table1] pfde 

ON PFDE.Column1 = pfd.Column1

AND PFDE.Column2 = pfd.Column2

AND PFDE.Column3 = pfd.Column3

AND PFDE.Column4 = pfd.Column4'



+ CASE WHEN @Dimension1Enabled = 1 THEN 

  N' AND PFDE.Dimension1 = pfd.Dimension1' ELSE N'' END

+ CASE WHEN @Dimension2Enabled = 1 THEN 

  N' AND PFDE.Dimension2 = pfd.Dimension2' ELSE N'' END



+ ...



+ CASE WHEN @Dimension10Enabled = 1 THEN 

  N' AND PFDE.Dimension10 = pfd.Dimension10' ELSE '' END



+ N'    AND PFDE.Period = pfd.Period

  AND PFDE.Column5 = pfd.Column5

  WHERE pfde.Column1 IS NULL;';



EXEC sys.sp_executesql @sql;

I'd also drop the DISTINCT - if it's really necessary because of duplicates then there is an underlying problem to be fixed as well.

answered Jan 17 at 16:16

Aaron Bertrand♦

150k18284482

answered Jan 17 at 16:16

Aaron Bertrand♦

150k18284482

answered Jan 17 at 16:16

Aaron Bertrand♦

150k18284482

answered Jan 17 at 16:16

Aaron Bertrand♦

150k18284482

Yep, this would be a good implementation of option 2, but I'm kinda hoping to avoiding this as if I followed this approach here I'd have to resort to this everywhere else I implement this, there are 100+ places and I'd prefer not to maintain a DB full of dynamic SQL...

– David Rogers
Jan 17 at 16:21

Well 1 and 3 don't sound like viable options either, since 1 is a performance hit and 3 is not going to change anything (they boil down to the same logic), so... sometimes you just have to pick your poison. A pattern is just a pattern and if dealing with users screaming is more important than a little extra code, you need to make a choice.

– Aaron Bertrand♦
Jan 17 at 16:35

Well each has it's advantages and disadvantages, the issue is I'm not just making a decision for this one query, I'm making a decision for 10+ tables, 100+ procs, and many, many other associated items, it will take over a month to implement and test it properly, and once it's done I'll have to live with the consequences in over 100 production environment for years. So this question really is only the tip of the iceberg, a small manifestation of a broader architectural decision that I need to make and then live with. Though I accept that I may need to go with more than one option...

– David Rogers
Jan 17 at 23:37

Well we as a peer group can tell you your options, and tell you which ones we think are best, but we can't crowd-source your decision for you. As for the testing mentioned in your update, did you look at any execution plans? Any query that takes 15 or 20 seconds is already 100 times worse than it probably should be. Why is it taking so long? What are the wait types? Is it being blocked? Are you returning 40 billion rows to an SSMS grid over a slow network connection? If you read the kitchen sink article I posted, dynamic SQL can help for presence/absence of parameters, and option (recompile)

– Aaron Bertrand♦
Jan 18 at 12:16

...can help for data skew. You can also combine them. But given a 15 second query, using dynamic SQL or not is unlikely to fix that, there is something else bigger going on here.

– Aaron Bertrand♦
Jan 18 at 12:20

add a comment |

Yep, this would be a good implementation of option 2, but I'm kinda hoping to avoiding this as if I followed this approach here I'd have to resort to this everywhere else I implement this, there are 100+ places and I'd prefer not to maintain a DB full of dynamic SQL...

– David Rogers
Jan 17 at 16:21

Well 1 and 3 don't sound like viable options either, since 1 is a performance hit and 3 is not going to change anything (they boil down to the same logic), so... sometimes you just have to pick your poison. A pattern is just a pattern and if dealing with users screaming is more important than a little extra code, you need to make a choice.

– Aaron Bertrand♦
Jan 17 at 16:35

Well each has it's advantages and disadvantages, the issue is I'm not just making a decision for this one query, I'm making a decision for 10+ tables, 100+ procs, and many, many other associated items, it will take over a month to implement and test it properly, and once it's done I'll have to live with the consequences in over 100 production environment for years. So this question really is only the tip of the iceberg, a small manifestation of a broader architectural decision that I need to make and then live with. Though I accept that I may need to go with more than one option...

– David Rogers
Jan 17 at 23:37

Well we as a peer group can tell you your options, and tell you which ones we think are best, but we can't crowd-source your decision for you. As for the testing mentioned in your update, did you look at any execution plans? Any query that takes 15 or 20 seconds is already 100 times worse than it probably should be. Why is it taking so long? What are the wait types? Is it being blocked? Are you returning 40 billion rows to an SSMS grid over a slow network connection? If you read the kitchen sink article I posted, dynamic SQL can help for presence/absence of parameters, and option (recompile)

– Aaron Bertrand♦
Jan 18 at 12:16

...can help for data skew. You can also combine them. But given a 15 second query, using dynamic SQL or not is unlikely to fix that, there is something else bigger going on here.

– Aaron Bertrand♦
Jan 18 at 12:20

Yep, this would be a good implementation of option 2, but I'm kinda hoping to avoiding this as if I followed this approach here I'd have to resort to this everywhere else I implement this, there are 100+ places and I'd prefer not to maintain a DB full of dynamic SQL...

– David Rogers
Jan 17 at 16:21

Well 1 and 3 don't sound like viable options either, since 1 is a performance hit and 3 is not going to change anything (they boil down to the same logic), so... sometimes you just have to pick your poison. A pattern is just a pattern and if dealing with users screaming is more important than a little extra code, you need to make a choice.

– Aaron Bertrand♦
Jan 17 at 16:35

Well each has it's advantages and disadvantages, the issue is I'm not just making a decision for this one query, I'm making a decision for 10+ tables, 100+ procs, and many, many other associated items, it will take over a month to implement and test it properly, and once it's done I'll have to live with the consequences in over 100 production environment for years. So this question really is only the tip of the iceberg, a small manifestation of a broader architectural decision that I need to make and then live with. Though I accept that I may need to go with more than one option...

– David Rogers
Jan 17 at 23:37

Well we as a peer group can tell you your options, and tell you which ones we think are best, but we can't crowd-source your decision for you. As for the testing mentioned in your update, did you look at any execution plans? Any query that takes 15 or 20 seconds is already 100 times worse than it probably should be. Why is it taking so long? What are the wait types? Is it being blocked? Are you returning 40 billion rows to an SSMS grid over a slow network connection? If you read the kitchen sink article I posted, dynamic SQL can help for presence/absence of parameters, and option (recompile)

– Aaron Bertrand♦
Jan 18 at 12:16

...can help for data skew. You can also combine them. But given a 15 second query, using dynamic SQL or not is unlikely to fix that, there is something else bigger going on here.

– Aaron Bertrand♦
Jan 18 at 12:20

add a comment |

IMO,Biggest mistake is the place where you are using CROSS JOIN.

First self join [dbo].[Table1] and filter the resultset then with this resultset use CROSS JOIN.It will give tremendous boost .

SELECT pfd.*

FROM (

    SELECT PFDE1.Column1

        ,PFDE1.Column2

        ,PFDE1.Column3

        ,PFDE1.Column4

        ,PFDE1.Dimension1

        ,PFDE1.Dimension2

        ,PFDE1.[Dimension3]

        ,PFDE1.[Dimension4]

        ,PFDE1.[Dimension5]

        ,PFDE1.[Dimension6]

        ,PFDE1.[Dimension7]

        ,PFDE1.[Dimension8]

        ,PFDE1.[Dimension9]

        ,PFDE1.[Dimension10]

        ,p.FiscalPeriod AS Period

        ,0 AS Amt

        ,0 AS EndingBal

        ,PFDE1.Column5

    FROM (

        SELECT DISTINCT PFDE1.Column1

            ,PFDE1.Column2

            ,PFDE1.Column3

            ,PFDE1.Column4

            ,PFDE1.Dimension1

            ,PFDE1.Dimension2

            ,PFDE1.[Dimension3]

            ,PFDE1.[Dimension4]

            ,PFDE1.[Dimension5]

            ,PFDE1.[Dimension6]

            ,PFDE1.[Dimension7]

            ,PFDE1.[Dimension8]

            ,PFDE1.[Dimension9]

            ,PFDE1.[Dimension10]

            ,PFDE1.Column5

        FROM [dbo].[Table1] PFDE1

    LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = PFDE1.Column1

    AND PFDE.Column2 = PFDE1.Column2

    AND PFDE.Column3 = PFDE1.Column3

    AND PFDE.Column4 = PFDE1.Column4

    AND PFDE.Period = PFDE1.Period

    AND PFDE.Column5 = PFDE1.Column5

    AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = PFDE1.Dimension1)

    AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = PFDE1.Dimension2)

    AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = PFDE1.Dimension3)

    AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = PFDE1.Dimension4)

    AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = PFDE1.Dimension5)

    AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = PFDE1.Dimension6)

    AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = PFDE1.Dimension7)

    AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = PFDE1.Dimension8)

    AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = PFDE1.Dimension9)

    AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = PFDE1.Dimension10) 



WHERE pfde.Column1 IS NULL

        ) PFDE1

    CROSS JOIN [dbo].[Table2] p

    ) pfd

Hope my point is clear,it is untested.

No doubts Dynamic Sql will further improve it, EXEC sys.sp_executesql @sql

I'd also drop the DISTINCT - if it's really necessary because of
duplicates then there is an underlying problem to be fixed as well.

answered 6 mins ago

KumarHarsh

81358

add a comment |

IMO,Biggest mistake is the place where you are using CROSS JOIN.

First self join [dbo].[Table1] and filter the resultset then with this resultset use CROSS JOIN.It will give tremendous boost .

SELECT pfd.*

FROM (

    SELECT PFDE1.Column1

        ,PFDE1.Column2

        ,PFDE1.Column3

        ,PFDE1.Column4

        ,PFDE1.Dimension1

        ,PFDE1.Dimension2

        ,PFDE1.[Dimension3]

        ,PFDE1.[Dimension4]

        ,PFDE1.[Dimension5]

        ,PFDE1.[Dimension6]

        ,PFDE1.[Dimension7]

        ,PFDE1.[Dimension8]

        ,PFDE1.[Dimension9]

        ,PFDE1.[Dimension10]

        ,p.FiscalPeriod AS Period

        ,0 AS Amt

        ,0 AS EndingBal

        ,PFDE1.Column5

    FROM (

        SELECT DISTINCT PFDE1.Column1

            ,PFDE1.Column2

            ,PFDE1.Column3

            ,PFDE1.Column4

            ,PFDE1.Dimension1

            ,PFDE1.Dimension2

            ,PFDE1.[Dimension3]

            ,PFDE1.[Dimension4]

            ,PFDE1.[Dimension5]

            ,PFDE1.[Dimension6]

            ,PFDE1.[Dimension7]

            ,PFDE1.[Dimension8]

            ,PFDE1.[Dimension9]

            ,PFDE1.[Dimension10]

            ,PFDE1.Column5

        FROM [dbo].[Table1] PFDE1

    LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = PFDE1.Column1

    AND PFDE.Column2 = PFDE1.Column2

    AND PFDE.Column3 = PFDE1.Column3

    AND PFDE.Column4 = PFDE1.Column4

    AND PFDE.Period = PFDE1.Period

    AND PFDE.Column5 = PFDE1.Column5

    AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = PFDE1.Dimension1)

    AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = PFDE1.Dimension2)

    AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = PFDE1.Dimension3)

    AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = PFDE1.Dimension4)

    AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = PFDE1.Dimension5)

    AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = PFDE1.Dimension6)

    AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = PFDE1.Dimension7)

    AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = PFDE1.Dimension8)

    AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = PFDE1.Dimension9)

    AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = PFDE1.Dimension10) 



WHERE pfde.Column1 IS NULL

        ) PFDE1

    CROSS JOIN [dbo].[Table2] p

    ) pfd

Hope my point is clear,it is untested.

No doubts Dynamic Sql will further improve it, EXEC sys.sp_executesql @sql

I'd also drop the DISTINCT - if it's really necessary because of
duplicates then there is an underlying problem to be fixed as well.

answered 6 mins ago

KumarHarsh

81358

add a comment |

IMO,Biggest mistake is the place where you are using CROSS JOIN.

First self join [dbo].[Table1] and filter the resultset then with this resultset use CROSS JOIN.It will give tremendous boost .

SELECT pfd.*

FROM (

    SELECT PFDE1.Column1

        ,PFDE1.Column2

        ,PFDE1.Column3

        ,PFDE1.Column4

        ,PFDE1.Dimension1

        ,PFDE1.Dimension2

        ,PFDE1.[Dimension3]

        ,PFDE1.[Dimension4]

        ,PFDE1.[Dimension5]

        ,PFDE1.[Dimension6]

        ,PFDE1.[Dimension7]

        ,PFDE1.[Dimension8]

        ,PFDE1.[Dimension9]

        ,PFDE1.[Dimension10]

        ,p.FiscalPeriod AS Period

        ,0 AS Amt

        ,0 AS EndingBal

        ,PFDE1.Column5

    FROM (

        SELECT DISTINCT PFDE1.Column1

            ,PFDE1.Column2

            ,PFDE1.Column3

            ,PFDE1.Column4

            ,PFDE1.Dimension1

            ,PFDE1.Dimension2

            ,PFDE1.[Dimension3]

            ,PFDE1.[Dimension4]

            ,PFDE1.[Dimension5]

            ,PFDE1.[Dimension6]

            ,PFDE1.[Dimension7]

            ,PFDE1.[Dimension8]

            ,PFDE1.[Dimension9]

            ,PFDE1.[Dimension10]

            ,PFDE1.Column5

        FROM [dbo].[Table1] PFDE1

    LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = PFDE1.Column1

    AND PFDE.Column2 = PFDE1.Column2

    AND PFDE.Column3 = PFDE1.Column3

    AND PFDE.Column4 = PFDE1.Column4

    AND PFDE.Period = PFDE1.Period

    AND PFDE.Column5 = PFDE1.Column5

    AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = PFDE1.Dimension1)

    AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = PFDE1.Dimension2)

    AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = PFDE1.Dimension3)

    AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = PFDE1.Dimension4)

    AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = PFDE1.Dimension5)

    AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = PFDE1.Dimension6)

    AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = PFDE1.Dimension7)

    AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = PFDE1.Dimension8)

    AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = PFDE1.Dimension9)

    AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = PFDE1.Dimension10) 



WHERE pfde.Column1 IS NULL

        ) PFDE1

    CROSS JOIN [dbo].[Table2] p

    ) pfd

Hope my point is clear,it is untested.

No doubts Dynamic Sql will further improve it, EXEC sys.sp_executesql @sql

I'd also drop the DISTINCT - if it's really necessary because of
duplicates then there is an underlying problem to be fixed as well.

answered 6 mins ago

KumarHarsh

81358

IMO,Biggest mistake is the place where you are using CROSS JOIN.

First self join [dbo].[Table1] and filter the resultset then with this resultset use CROSS JOIN.It will give tremendous boost .

SELECT pfd.*

FROM (

    SELECT PFDE1.Column1

        ,PFDE1.Column2

        ,PFDE1.Column3

        ,PFDE1.Column4

        ,PFDE1.Dimension1

        ,PFDE1.Dimension2

        ,PFDE1.[Dimension3]

        ,PFDE1.[Dimension4]

        ,PFDE1.[Dimension5]

        ,PFDE1.[Dimension6]

        ,PFDE1.[Dimension7]

        ,PFDE1.[Dimension8]

        ,PFDE1.[Dimension9]

        ,PFDE1.[Dimension10]

        ,p.FiscalPeriod AS Period

        ,0 AS Amt

        ,0 AS EndingBal

        ,PFDE1.Column5

    FROM (

        SELECT DISTINCT PFDE1.Column1

            ,PFDE1.Column2

            ,PFDE1.Column3

            ,PFDE1.Column4

            ,PFDE1.Dimension1

            ,PFDE1.Dimension2

            ,PFDE1.[Dimension3]

            ,PFDE1.[Dimension4]

            ,PFDE1.[Dimension5]

            ,PFDE1.[Dimension6]

            ,PFDE1.[Dimension7]

            ,PFDE1.[Dimension8]

            ,PFDE1.[Dimension9]

            ,PFDE1.[Dimension10]

            ,PFDE1.Column5

        FROM [dbo].[Table1] PFDE1

    LEFT JOIN [dbo].[Table1] pfde ON PFDE.Column1 = PFDE1.Column1

    AND PFDE.Column2 = PFDE1.Column2

    AND PFDE.Column3 = PFDE1.Column3

    AND PFDE.Column4 = PFDE1.Column4

    AND PFDE.Period = PFDE1.Period

    AND PFDE.Column5 = PFDE1.Column5

    AND (@Dimension1Enabled = 0 OR PFDE.Dimension1 = PFDE1.Dimension1)

    AND (@Dimension2Enabled = 0 OR PFDE.Dimension2 = PFDE1.Dimension2)

    AND (@Dimension3Enabled = 0 OR PFDE.Dimension3 = PFDE1.Dimension3)

    AND (@Dimension4Enabled = 0 OR PFDE.Dimension4 = PFDE1.Dimension4)

    AND (@Dimension5Enabled = 0 OR PFDE.Dimension5 = PFDE1.Dimension5)

    AND (@Dimension6Enabled = 0 OR PFDE.Dimension6 = PFDE1.Dimension6)

    AND (@Dimension7Enabled = 0 OR PFDE.Dimension7 = PFDE1.Dimension7)

    AND (@Dimension8Enabled = 0 OR PFDE.Dimension8 = PFDE1.Dimension8)

    AND (@Dimension9Enabled = 0 OR PFDE.Dimension9 = PFDE1.Dimension9)

    AND (@Dimension10Enabled = 0 OR PFDE.Dimension10 = PFDE1.Dimension10) 



WHERE pfde.Column1 IS NULL

        ) PFDE1

    CROSS JOIN [dbo].[Table2] p

    ) pfd

Hope my point is clear,it is untested.

No doubts Dynamic Sql will further improve it, EXEC sys.sp_executesql @sql

I'd also drop the DISTINCT - if it's really necessary because of
duplicates then there is an underlying problem to be fixed as well.

answered 6 mins ago

KumarHarsh

81358

answered 6 mins ago

KumarHarsh

81358

answered 6 mins ago

KumarHarsh

81358

answered 6 mins ago

KumarHarsh

81358

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Database Administrators Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Xrhrft

Poor SQL Server Performance With Nested “OR” Conditions In LEFT OUTER JOIN

Update 1

Update 1

Update 1

Update 1

2 Answers
2

Your Answer

Post as a guest

2 Answers
2

2 Answers
2

Post as a guest

Popular posts from this blog

ف. موراي أبراهام

صرب

كأس إنترتوتو

Poor SQL Server Performance With Nested “OR” Conditions In LEFT OUTER JOIN

Update 1

Update 1

Update 1

Update 1

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

2 Answers 2

2 Answers 2

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Popular posts from this blog

ف. موراي أبراهام

صرب

كأس إنترتوتو

2 Answers
2

2 Answers
2

2 Answers
2