Slow running Oracle query caused by unnecessary full table scan
I'm by no means an expert with Oracle, but I know enough to analyze explain plans and perform basic optimizations. This time, however, Oracle has stumped me with its explain plan.
Basically, I've got a single table that is very large (27 million rows) with an index on ScheduleID. What I'm trying to do is get all the records whose ScheduleID matches one of the ScheduleIDs in a given list (currently stored in a temp table). The temp table will probably have around 100 rows, although it could have up to 500. Each row in the temp table will match up to around 10 records in the large table, so we are talking about the query returning about 5000 rows max (out of 27 million).
Now, for some reason, Oracle thinks that the best plan is a hash join with a full table scan of the large table. The query is shown below.
select scheduleintervalid, io_scheduleinterval.version
from io_scheduleinterval
inner join io_querytemp tmp on tmp.data = scheduleid
This doesn't make any sense to me. It is not taking advantage of the index on ScheduleID, which I would think would make the query significantly faster. I understand that a hash join can't use the index, but wouldn't a nested loop join be much faster? Am I missing something here?
Is there something else I can do to speed up the query (I've tried things like using a dynamic IN clause)?
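One thing worth trying is forcing the nested-loop plan with optimizer hints, just to compare the actual cost against Oracle's chosen plan. This is only a sketch: the index name `IO_SCHEDINT_SCHEDID_IX` is hypothetical and would need to be replaced with the real index name.

```sql
-- Hypothetical test: drive a nested loop from the small temp table and
-- probe the large table via its ScheduleID index (index name assumed).
select /*+ LEADING(tmp) USE_NL(io_scheduleinterval)
           INDEX(io_scheduleinterval IO_SCHEDINT_SCHEDID_IX) */
       scheduleintervalid, io_scheduleinterval.version
from io_scheduleinterval
inner join io_querytemp tmp on tmp.data = scheduleid
```

If Oracle still refuses the nested loop even with hints, that is usually a sign the index cannot be used at all (for example because of an implicit type conversion on the indexed column), rather than a costing decision.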
Thanks for the help!
Tags: oracle, query-performance
Is the index on scheduleid valid/usable? Do you have histograms on that column on either table? (Check dba_tab_col_statistics, suitably filtered. Posting the output of that might be interesting, but be careful: low_value and high_value can be decoded if that data is potentially sensitive.) Are there non-null constraints on both columns?
– Mat
Mar 30 '15 at 14:47
There are not non-null constraints on both columns. There is a non-null constraint on IO_ScheduleInterval.ScheduleID, but there is no such constraint on IO_QueryTemp.Data (although there is no reason there couldn't be). Could that potentially cause performance issues?
– David R.
Mar 30 '15 at 14:56
In case I wasn't clear, IO_QueryTemp is a GTT and has no indexes or constraints.
– David R.
Mar 30 '15 at 14:59
Asked Mar 30 '15 at 12:32 by David R.; edited Jun 5 '15 at 19:55 by blubb.
1 Answer
Answer left as comments by the question author
I think I may have found the issue, and it definitely makes sense. The type of the data column on the temp table was nvarchar2(100), while the type of the scheduleid column was char(22).

So apparently, with the different types, the index cannot be used: Oracle applies an implicit conversion when comparing the two, which defeats the index on scheduleid. When I added a to_char to the data column, it was finally able to use the index.

This temp table is also used in some other joins, where the data is of type varchar(100). In hindsight, maybe I could have had multiple columns on the temp table for different types, but what I have now works.
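Based on the author's description, the fix would take roughly this form (a sketch; the original post does not show the exact rewritten query):

```sql
-- Cast the nvarchar2 temp-table column instead of the indexed char column,
-- so the conversion happens on the non-indexed side and the index stays usable.
select scheduleintervalid, io_scheduleinterval.version
from io_scheduleinterval
inner join io_querytemp tmp on to_char(tmp.data) = scheduleid
```

The general rule this illustrates: an implicit or explicit conversion applied to the indexed column disables the index, so any needed conversion should be placed on the other side of the comparison.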