LEFT JOIN performance for WHERE on secondary table












0















I've been developing for years and through out the carrier I've been using only LEFT JOIN as it provides a more logical sense of mapping my one to many data.



However recently I ran across a performance thought:



Say I have table user with millions of rows and table gender with only 2 rows



user                                gender
===================== ================
id name genderId id gender show
===================== ================
1 Sam 1 1 Male 0
2 Samantha 2 2 Female 1
3 Another millions of records


If I'm doing



select * from user u left join gender g on u.genderId = g.id where g.show = 1;


Does SQL engine joins the whole million user records first, then scan the million rows of records for join projected value?



Or is the engine smart enough to query and join only the rows with correct gender? Or is RIGHT JOIN the correct method to provide best performance?



If the behavior is dependent on engine type, let's talk about MySQL InnoDB.










share|improve this question
















bumped to the homepage by Community 4 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.











  • 2





    Show the EXPLAIN query output. Even if you have a gender.id pinned at 2, genderId in user may not be indexed, and even if it was it would be a large proportion of a secondary index, and you still need to look up name and id, so it might not be used. LEFT vs RIGHT won't matter at all. LEFT JOIN should be used when there isn't always RHS match, rather than just one to many, even a basic INNER JOIN does many to many.

    – danblack
    Oct 19 '18 at 1:59













  • Your join is INNER JOIN in practice, because a condition by right table presents in WHERE clause.

    – Akina
    Oct 19 '18 at 5:27






  • 2





    Your where condition on the outer joined table turns the outer join back into an inner join

    – a_horse_with_no_name
    Oct 19 '18 at 6:40











  • DBMS engines are not smart enough to fetch only required entries. Bad news is that nobody is smart enough. Mankind is not able to operate sets at all. All we can do is to emulate the sets by lists and perform the search instead of fetch. Therefore the correct question should sound like that: "Why this query over that tables with that indexing and that cardinalities is slow?"

    – Kondybas
    Oct 19 '18 at 12:01











  • What do you want to happen when user.gender_id = 3 and there is no corresponding row in the table gender.

    – Rick James
    Oct 22 '18 at 4:16
















0















I've been developing for years and through out the carrier I've been using only LEFT JOIN as it provides a more logical sense of mapping my one to many data.



However recently I ran across a performance thought:



Say I have table user with millions of rows and table gender with only 2 rows



user                                gender
===================== ================
id name genderId id gender show
===================== ================
1 Sam 1 1 Male 0
2 Samantha 2 2 Female 1
3 Another millions of records


If I'm doing



select * from user u left join gender g on u.genderId = g.id where g.show = 1;


Does SQL engine joins the whole million user records first, then scan the million rows of records for join projected value?



Or is the engine smart enough to query and join only the rows with correct gender? Or is RIGHT JOIN the correct method to provide best performance?



If the behavior is dependent on engine type, let's talk about MySQL InnoDB.










share|improve this question
















bumped to the homepage by Community 4 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.











  • 2





    Show the EXPLAIN query output. Even if you have a gender.id pinned at 2, genderId in user may not be indexed, and even if it was it would be a large proportion of a secondary index, and you still need to look up name and id, so it might not be used. LEFT vs RIGHT won't matter at all. LEFT JOIN should be used when there isn't always RHS match, rather than just one to many, even a basic INNER JOIN does many to many.

    – danblack
    Oct 19 '18 at 1:59













  • Your join is INNER JOIN in practice, because a condition by right table presents in WHERE clause.

    – Akina
    Oct 19 '18 at 5:27






  • 2





    Your where condition on the outer joined table turns the outer join back into an inner join

    – a_horse_with_no_name
    Oct 19 '18 at 6:40











  • DBMS engines are not smart enough to fetch only required entries. Bad news is that nobody is smart enough. Mankind is not able to operate sets at all. All we can do is to emulate the sets by lists and perform the search instead of fetch. Therefore the correct question should sound like that: "Why this query over that tables with that indexing and that cardinalities is slow?"

    – Kondybas
    Oct 19 '18 at 12:01











  • What do you want to happen when user.gender_id = 3 and there is no corresponding row in the table gender.

    – Rick James
    Oct 22 '18 at 4:16














0












0








0








I've been developing for years and through out the carrier I've been using only LEFT JOIN as it provides a more logical sense of mapping my one to many data.



However recently I ran across a performance thought:



Say I have table user with millions of rows and table gender with only 2 rows



user                                gender
===================== ================
id name genderId id gender show
===================== ================
1 Sam 1 1 Male 0
2 Samantha 2 2 Female 1
3 Another millions of records


If I'm doing



select * from user u left join gender g on u.genderId = g.id where g.show = 1;


Does SQL engine joins the whole million user records first, then scan the million rows of records for join projected value?



Or is the engine smart enough to query and join only the rows with correct gender? Or is RIGHT JOIN the correct method to provide best performance?



If the behavior is dependent on engine type, let's talk about MySQL InnoDB.










share|improve this question
















I've been developing for years and through out the carrier I've been using only LEFT JOIN as it provides a more logical sense of mapping my one to many data.



However recently I ran across a performance thought:



Say I have table user with millions of rows and table gender with only 2 rows



user                                gender
===================== ================
id name genderId id gender show
===================== ================
1 Sam 1 1 Male 0
2 Samantha 2 2 Female 1
3 Another millions of records


If I'm doing



select * from user u left join gender g on u.genderId = g.id where g.show = 1;


Does SQL engine joins the whole million user records first, then scan the million rows of records for join projected value?



Or is the engine smart enough to query and join only the rows with correct gender? Or is RIGHT JOIN the correct method to provide best performance?



If the behavior is dependent on engine type, let's talk about MySQL InnoDB.







mysql






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Oct 19 '18 at 6:40









a_horse_with_no_name

40.2k777112




40.2k777112










asked Oct 19 '18 at 1:27









Chor Wai ChunChor Wai Chun

1012




1012





bumped to the homepage by Community 4 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.







bumped to the homepage by Community 4 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.










  • 2





    Show the EXPLAIN query output. Even if you have a gender.id pinned at 2, genderId in user may not be indexed, and even if it was it would be a large proportion of a secondary index, and you still need to look up name and id, so it might not be used. LEFT vs RIGHT won't matter at all. LEFT JOIN should be used when there isn't always RHS match, rather than just one to many, even a basic INNER JOIN does many to many.

    – danblack
    Oct 19 '18 at 1:59













  • Your join is INNER JOIN in practice, because a condition by right table presents in WHERE clause.

    – Akina
    Oct 19 '18 at 5:27






  • 2





    Your where condition on the outer joined table turns the outer join back into an inner join

    – a_horse_with_no_name
    Oct 19 '18 at 6:40











  • DBMS engines are not smart enough to fetch only required entries. Bad news is that nobody is smart enough. Mankind is not able to operate sets at all. All we can do is to emulate the sets by lists and perform the search instead of fetch. Therefore the correct question should sound like that: "Why this query over that tables with that indexing and that cardinalities is slow?"

    – Kondybas
    Oct 19 '18 at 12:01











  • What do you want to happen when user.gender_id = 3 and there is no corresponding row in the table gender.

    – Rick James
    Oct 22 '18 at 4:16














  • 2





    Show the EXPLAIN query output. Even if you have a gender.id pinned at 2, genderId in user may not be indexed, and even if it was it would be a large proportion of a secondary index, and you still need to look up name and id, so it might not be used. LEFT vs RIGHT won't matter at all. LEFT JOIN should be used when there isn't always RHS match, rather than just one to many, even a basic INNER JOIN does many to many.

    – danblack
    Oct 19 '18 at 1:59













  • Your join is INNER JOIN in practice, because a condition by right table presents in WHERE clause.

    – Akina
    Oct 19 '18 at 5:27






  • 2





    Your where condition on the outer joined table turns the outer join back into an inner join

    – a_horse_with_no_name
    Oct 19 '18 at 6:40











  • DBMS engines are not smart enough to fetch only required entries. Bad news is that nobody is smart enough. Mankind is not able to operate sets at all. All we can do is to emulate the sets by lists and perform the search instead of fetch. Therefore the correct question should sound like that: "Why this query over that tables with that indexing and that cardinalities is slow?"

    – Kondybas
    Oct 19 '18 at 12:01











  • What do you want to happen when user.gender_id = 3 and there is no corresponding row in the table gender.

    – Rick James
    Oct 22 '18 at 4:16








2




2





Show the EXPLAIN query output. Even if you have a gender.id pinned at 2, genderId in user may not be indexed, and even if it was it would be a large proportion of a secondary index, and you still need to look up name and id, so it might not be used. LEFT vs RIGHT won't matter at all. LEFT JOIN should be used when there isn't always RHS match, rather than just one to many, even a basic INNER JOIN does many to many.

– danblack
Oct 19 '18 at 1:59







Show the EXPLAIN query output. Even if you have a gender.id pinned at 2, genderId in user may not be indexed, and even if it was it would be a large proportion of a secondary index, and you still need to look up name and id, so it might not be used. LEFT vs RIGHT won't matter at all. LEFT JOIN should be used when there isn't always RHS match, rather than just one to many, even a basic INNER JOIN does many to many.

– danblack
Oct 19 '18 at 1:59















Your join is INNER JOIN in practice, because a condition by right table presents in WHERE clause.

– Akina
Oct 19 '18 at 5:27





Your join is INNER JOIN in practice, because a condition by right table presents in WHERE clause.

– Akina
Oct 19 '18 at 5:27




2




2





Your where condition on the outer joined table turns the outer join back into an inner join

– a_horse_with_no_name
Oct 19 '18 at 6:40





Your where condition on the outer joined table turns the outer join back into an inner join

– a_horse_with_no_name
Oct 19 '18 at 6:40













DBMS engines are not smart enough to fetch only required entries. Bad news is that nobody is smart enough. Mankind is not able to operate sets at all. All we can do is to emulate the sets by lists and perform the search instead of fetch. Therefore the correct question should sound like that: "Why this query over that tables with that indexing and that cardinalities is slow?"

– Kondybas
Oct 19 '18 at 12:01





DBMS engines are not smart enough to fetch only required entries. Bad news is that nobody is smart enough. Mankind is not able to operate sets at all. All we can do is to emulate the sets by lists and perform the search instead of fetch. Therefore the correct question should sound like that: "Why this query over that tables with that indexing and that cardinalities is slow?"

– Kondybas
Oct 19 '18 at 12:01













What do you want to happen when user.gender_id = 3 and there is no corresponding row in the table gender.

– Rick James
Oct 22 '18 at 4:16





What do you want to happen when user.gender_id = 3 and there is no corresponding row in the table gender.

– Rick James
Oct 22 '18 at 4:16










1 Answer
1






active

oldest

votes


















0














Remember how SQL works: construct a set of tuples (rows) based on the given tables and joins, then filter those based on any where clause.

Because you have a where condition on the "right" table, your outer join is effectively converted into an inner join.



To retain the "outer-ness" of the join, put the filter condition into the join construct, rather than in the where clause:



select * 
from user u
left join gender g
on u.genderId = g.id
and g.show = 1
[where ...]





share|improve this answer
























  • that;s no a requirement of how it works, the planner is free to reorganise the query so long as the same set of results come out (modulo extensions like incomplete group bys)

    – Jasen
    Oct 19 '18 at 11:26











  • @Jasen By index hints we can force the planner to do exactly we want.

    – Kondybas
    Oct 19 '18 at 11:53











Your Answer








StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f220533%2fleft-join-performance-for-where-on-secondary-table%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









0














Remember how SQL works: construct a set of tuples (rows) based on the given tables and joins, then filter those based on any where clause.

Because you have a where condition on the "right" table, your outer join is effectively converted into an inner join.



To retain the "outer-ness" of the join, put the filter condition into the join construct, rather than in the where clause:



select * 
from user u
left join gender g
on u.genderId = g.id
and g.show = 1
[where ...]





share|improve this answer
























  • that;s no a requirement of how it works, the planner is free to reorganise the query so long as the same set of results come out (modulo extensions like incomplete group bys)

    – Jasen
    Oct 19 '18 at 11:26











  • @Jasen By index hints we can force the planner to do exactly we want.

    – Kondybas
    Oct 19 '18 at 11:53
















0














Remember how SQL works: construct a set of tuples (rows) based on the given tables and joins, then filter those based on any where clause.

Because you have a where condition on the "right" table, your outer join is effectively converted into an inner join.



To retain the "outer-ness" of the join, put the filter condition into the join construct, rather than in the where clause:



select * 
from user u
left join gender g
on u.genderId = g.id
and g.show = 1
[where ...]





share|improve this answer
























  • that;s no a requirement of how it works, the planner is free to reorganise the query so long as the same set of results come out (modulo extensions like incomplete group bys)

    – Jasen
    Oct 19 '18 at 11:26











  • @Jasen By index hints we can force the planner to do exactly we want.

    – Kondybas
    Oct 19 '18 at 11:53














0












0








0







Remember how SQL works: construct a set of tuples (rows) based on the given tables and joins, then filter those based on any where clause.

Because you have a where condition on the "right" table, your outer join is effectively converted into an inner join.



To retain the "outer-ness" of the join, put the filter condition into the join construct, rather than in the where clause:



select * 
from user u
left join gender g
on u.genderId = g.id
and g.show = 1
[where ...]





share|improve this answer













Remember how SQL works: construct a set of tuples (rows) based on the given tables and joins, then filter those based on any where clause.

Because you have a where condition on the "right" table, your outer join is effectively converted into an inner join.



To retain the "outer-ness" of the join, put the filter condition into the join construct, rather than in the where clause:



select * 
from user u
left join gender g
on u.genderId = g.id
and g.show = 1
[where ...]






share|improve this answer












share|improve this answer



share|improve this answer










answered Oct 19 '18 at 11:18









Phill W.Phill W.

80633




80633













  • that;s no a requirement of how it works, the planner is free to reorganise the query so long as the same set of results come out (modulo extensions like incomplete group bys)

    – Jasen
    Oct 19 '18 at 11:26











  • @Jasen By index hints we can force the planner to do exactly we want.

    – Kondybas
    Oct 19 '18 at 11:53



















  • that;s no a requirement of how it works, the planner is free to reorganise the query so long as the same set of results come out (modulo extensions like incomplete group bys)

    – Jasen
    Oct 19 '18 at 11:26











  • @Jasen By index hints we can force the planner to do exactly we want.

    – Kondybas
    Oct 19 '18 at 11:53

















that;s no a requirement of how it works, the planner is free to reorganise the query so long as the same set of results come out (modulo extensions like incomplete group bys)

– Jasen
Oct 19 '18 at 11:26





that;s no a requirement of how it works, the planner is free to reorganise the query so long as the same set of results come out (modulo extensions like incomplete group bys)

– Jasen
Oct 19 '18 at 11:26













@Jasen By index hints we can force the planner to do exactly we want.

– Kondybas
Oct 19 '18 at 11:53





@Jasen By index hints we can force the planner to do exactly we want.

– Kondybas
Oct 19 '18 at 11:53


















draft saved

draft discarded




















































Thanks for contributing an answer to Database Administrators Stack Exchange!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f220533%2fleft-join-performance-for-where-on-secondary-table%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

SQL Server 17 - Attemping to backup to remote NAS but Access is denied

Always On Availability groups resolving state after failover - Remote harden of transaction...

Restoring from pg_dump with foreign key constraints