How to speed up query with anti-joins
I have a query with two anti-joins (against UserEmails, 1M+ rows, and Subscriptions, <100k rows), two filter conditions, and a sort. I created a partial index covering the two conditions and the sort, which sped up the query by 50%. Both anti-joins are backed by indexes. The query is still too slow, though: about 4 seconds on production.
Here is the query:
SELECT
    "Users"."firstName",
    "Users"."lastName",
    "Users"."email",
    "Users"."id"
FROM
    "Users"
WHERE
    NOT EXISTS (
        SELECT 1
        FROM "UserEmails"
        WHERE "UserEmails"."userId" = "Users"."id"
    )
    AND NOT EXISTS (
        SELECT 1
        FROM "Subscriptions"
        WHERE "Subscriptions"."userId" = "Users"."id"
    )
    AND "isEmailVerified" = TRUE
    AND "emailUnsubscribeDate" IS NULL
ORDER BY
    "Users"."createdAt" DESC
LIMIT 100;
Here is the EXPLAIN ANALYZE output:
Limit  (cost=1.28..177.77 rows=100 width=49) (actual time=6171.121..6171.850 rows=100 loops=1)
  ->  Nested Loop Anti Join  (cost=1.28..4665810.76 rows=2643614 width=49) (actual time=6171.119..6171.807 rows=100 loops=1)
        ->  Nested Loop Anti Join  (cost=0.86..3470216.17 rows=2707688 width=49) (actual time=0.809..6062.152 rows=28607 loops=1)
              ->  Index Scan using users_email_subscribers_idx on "Users"  (cost=0.43..1844686.50 rows=3312999 width=49) (actual time=0.055..2342.793 rows=1186607 loops=1)
              ->  Index Only Scan using "UserEmails_userId_emailId_key" on "UserEmails"  (cost=0.43..0.49 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=1186607)
                    Index Cond: ("userId" = "Users".id)
                    Heap Fetches: 1153034
        ->  Index Only Scan using "Subscriptions_userId_type_key" on "Subscriptions"  (cost=0.42..0.44 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=28607)
              Index Cond: ("userId" = "Users".id)
              Heap Fetches: 28507
Planning time: 2.346 ms
Execution time: 6171.963 ms
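One thing that stands out in the plan: the Heap Fetches counts are almost as high as the loop counts (1153034 of 1186607 probes on UserEmails, 28507 of 28607 on Subscriptions). That usually means the visibility map for those tables is largely unset, so the index-only scans still have to visit the heap. A minimal sketch of what could be tried, assuming the tables have not been vacuumed recently and that a manual VACUUM is acceptable on this system:

```sql
-- Assumption: neither table has been vacuumed recently.
-- VACUUM refreshes the visibility map, which is what lets an
-- Index Only Scan skip the heap; ANALYZE refreshes planner statistics.
VACUUM (ANALYZE) "UserEmails";
VACUUM (ANALYZE) "Subscriptions";
```

Re-running the query under EXPLAIN (ANALYZE, BUFFERS) afterwards would show whether Heap Fetches drops. This would not reduce the number of index probes themselves (still one per scanned "Users" row), so it may only recover part of the time.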
And here is the index that improved the speed by 50%:
CREATE INDEX "users_email_subscribers_idx" ON "public"."Users" USING btree("createdAt" DESC) WHERE "isEmailVerified" = TRUE AND "emailUnsubscribeDate" IS NULL;
EDIT:
I should also mention that users_email_subscribers_idx shows up as an Index Scan rather than an Index Only Scan, likely because the index is being updated regularly.
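Another possible explanation for the plain Index Scan is that the index only contains "createdAt", while the query also selects "firstName", "lastName", "email" and "id", so those values have to come from the heap either way. A hedged sketch of a wider partial index that contains every selected column; the index name is hypothetical, and whether the planner actually switches to an Index Only Scan depends on the PostgreSQL version and on the visibility map of "Users":

```sql
-- Hypothetical index name; same predicate as users_email_subscribers_idx.
-- "createdAt" stays first so the ORDER BY ... DESC LIMIT 100 can still
-- walk the index in order; the remaining columns are only there so the
-- selected values can be read from the index itself.
CREATE INDEX "users_email_subscribers_covering_idx"
    ON "public"."Users" USING btree (
        "createdAt" DESC,
        "id",
        "email",
        "firstName",
        "lastName"
    )
    WHERE "isEmailVerified" = TRUE
      AND "emailUnsubscribeDate" IS NULL;
```

The trade-off is a larger index and more write overhead on "Users". It also would not change the fact that roughly 1.18M candidate rows are probed against "UserEmails" before 100 survivors are found.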
postgresql index query-performance join
asked 2 mins ago by Garrett