Containment check tstzrange @> timestamptz not using btree or gist index
Schema:
Column | Type
----------------------+--------------------------
id | integer
event_id | integer
started_at | timestamp with time zone
ended_at | timestamp with time zone
created_at | timestamp with time zone
"event_seat_state_lookup_pkey" PRIMARY KEY, btree (id)
"event_seating_lookup_created_at_idx" btree (created_at)
"event_seating_lookup_created_at_idx2" gist (created_at)
Query:
SELECT id
FROM event_seating_lookup esl1
WHERE
tstzrange(now() - interval '1 hour', now() + interval '1 hour', '[)') @> esl1.created_at;
Explain analyze:
Table with <100k rows.
Seq Scan on event_seating_lookup esl1 (cost=0.00..1550.30 rows=148 width=4) (actual time=0.013..19.956 rows=29103 loops=1)
Filter: (tstzrange((now() - '01:00:00'::interval), (now() + '01:00:00'::interval), '[)'::text) @> created_at)
Buffers: shared hit=809
Planning Time: 0.110 ms
Execution Time: 21.942 ms
Table with 1M+ rows:
Seq Scan on event_seating_lookup esl1 (cost=10000000000.00..10000042152.75 rows=5832 width=4) (actual time=0.009..621.895 rows=1166413 loops=1)
Filter: (tstzrange((now() - '01:00:00'::interval), (now() + '01:00:00'::interval), '[)'::text) @> created_at)
Buffers: shared hit=12995
Planning Time: 0.092 ms
Execution Time: 697.927 ms
I have tried:
VACUUM FULL event_seating_lookup;
VACUUM event_seating_lookup;
VACUUM ANALYZE event_seating_lookup;
SET enable_seqscan = OFF;
Problem:
event_seating_lookup_created_at_idx
or event_seating_lookup_created_at_idx2
indexes are not being used.
Notes:
- PostgreSQL 11.1.
btree_gist
extension is installed.- I have tried equivalent setup with
created_at timestamp without time zone
and usingtsrange
; same result. - I understand that rewriting the query with
>=
,<
checks would make it use the btree index. The question is what is the reason the index is not used with thetstzrange
containment operator and if there is a way to make it work.
postgresql index btree gist-index postgresql-11
add a comment |
Schema:
Column | Type
----------------------+--------------------------
id | integer
event_id | integer
started_at | timestamp with time zone
ended_at | timestamp with time zone
created_at | timestamp with time zone
"event_seat_state_lookup_pkey" PRIMARY KEY, btree (id)
"event_seating_lookup_created_at_idx" btree (created_at)
"event_seating_lookup_created_at_idx2" gist (created_at)
Query:
SELECT id
FROM event_seating_lookup esl1
WHERE
tstzrange(now() - interval '1 hour', now() + interval '1 hour', '[)') @> esl1.created_at;
Explain analyze:
Table with <100k rows.
Seq Scan on event_seating_lookup esl1 (cost=0.00..1550.30 rows=148 width=4) (actual time=0.013..19.956 rows=29103 loops=1)
Filter: (tstzrange((now() - '01:00:00'::interval), (now() + '01:00:00'::interval), '[)'::text) @> created_at)
Buffers: shared hit=809
Planning Time: 0.110 ms
Execution Time: 21.942 ms
Table with 1M+ rows:
Seq Scan on event_seating_lookup esl1 (cost=10000000000.00..10000042152.75 rows=5832 width=4) (actual time=0.009..621.895 rows=1166413 loops=1)
Filter: (tstzrange((now() - '01:00:00'::interval), (now() + '01:00:00'::interval), '[)'::text) @> created_at)
Buffers: shared hit=12995
Planning Time: 0.092 ms
Execution Time: 697.927 ms
I have tried:
VACUUM FULL event_seating_lookup;
VACUUM event_seating_lookup;
VACUUM ANALYZE event_seating_lookup;
SET enable_seqscan = OFF;
Problem:
event_seating_lookup_created_at_idx
or event_seating_lookup_created_at_idx2
indexes are not being used.
Notes:
- PostgreSQL 11.1.
btree_gist
extension is installed.- I have tried equivalent setup with
created_at timestamp without time zone
and usingtsrange
; same result. - I understand that rewriting the query with
>=
,<
checks would make it use the btree index. The question is what is the reason the index is not used with thetstzrange
containment operator and if there is a way to make it work.
postgresql index btree gist-index postgresql-11
add a comment |
Schema:
Column | Type
----------------------+--------------------------
id | integer
event_id | integer
started_at | timestamp with time zone
ended_at | timestamp with time zone
created_at | timestamp with time zone
"event_seat_state_lookup_pkey" PRIMARY KEY, btree (id)
"event_seating_lookup_created_at_idx" btree (created_at)
"event_seating_lookup_created_at_idx2" gist (created_at)
Query:
SELECT id
FROM event_seating_lookup esl1
WHERE
tstzrange(now() - interval '1 hour', now() + interval '1 hour', '[)') @> esl1.created_at;
Explain analyze:
Table with <100k rows.
Seq Scan on event_seating_lookup esl1 (cost=0.00..1550.30 rows=148 width=4) (actual time=0.013..19.956 rows=29103 loops=1)
Filter: (tstzrange((now() - '01:00:00'::interval), (now() + '01:00:00'::interval), '[)'::text) @> created_at)
Buffers: shared hit=809
Planning Time: 0.110 ms
Execution Time: 21.942 ms
Table with 1M+ rows:
Seq Scan on event_seating_lookup esl1 (cost=10000000000.00..10000042152.75 rows=5832 width=4) (actual time=0.009..621.895 rows=1166413 loops=1)
Filter: (tstzrange((now() - '01:00:00'::interval), (now() + '01:00:00'::interval), '[)'::text) @> created_at)
Buffers: shared hit=12995
Planning Time: 0.092 ms
Execution Time: 697.927 ms
I have tried:
VACUUM FULL event_seating_lookup;
VACUUM event_seating_lookup;
VACUUM ANALYZE event_seating_lookup;
SET enable_seqscan = OFF;
Problem:
event_seating_lookup_created_at_idx
or event_seating_lookup_created_at_idx2
indexes are not being used.
Notes:
- PostgreSQL 11.1.
btree_gist
extension is installed.- I have tried equivalent setup with
created_at timestamp without time zone
and usingtsrange
; same result. - I understand that rewriting the query with
>=
,<
checks would make it use the btree index. The question is what is the reason the index is not used with thetstzrange
containment operator and if there is a way to make it work.
postgresql index btree gist-index postgresql-11
Schema:
Column | Type
----------------------+--------------------------
id | integer
event_id | integer
started_at | timestamp with time zone
ended_at | timestamp with time zone
created_at | timestamp with time zone
"event_seat_state_lookup_pkey" PRIMARY KEY, btree (id)
"event_seating_lookup_created_at_idx" btree (created_at)
"event_seating_lookup_created_at_idx2" gist (created_at)
Query:
SELECT id
FROM event_seating_lookup esl1
WHERE
tstzrange(now() - interval '1 hour', now() + interval '1 hour', '[)') @> esl1.created_at;
Explain analyze:
Table with <100k rows.
Seq Scan on event_seating_lookup esl1 (cost=0.00..1550.30 rows=148 width=4) (actual time=0.013..19.956 rows=29103 loops=1)
Filter: (tstzrange((now() - '01:00:00'::interval), (now() + '01:00:00'::interval), '[)'::text) @> created_at)
Buffers: shared hit=809
Planning Time: 0.110 ms
Execution Time: 21.942 ms
Table with 1M+ rows:
Seq Scan on event_seating_lookup esl1 (cost=10000000000.00..10000042152.75 rows=5832 width=4) (actual time=0.009..621.895 rows=1166413 loops=1)
Filter: (tstzrange((now() - '01:00:00'::interval), (now() + '01:00:00'::interval), '[)'::text) @> created_at)
Buffers: shared hit=12995
Planning Time: 0.092 ms
Execution Time: 697.927 ms
I have tried:
VACUUM FULL event_seating_lookup;
VACUUM event_seating_lookup;
VACUUM ANALYZE event_seating_lookup;
SET enable_seqscan = OFF;
Problem:
event_seating_lookup_created_at_idx
or event_seating_lookup_created_at_idx2
indexes are not being used.
Notes:
- PostgreSQL 11.1.
btree_gist
extension is installed.- I have tried equivalent setup with
created_at timestamp without time zone
and usingtsrange
; same result. - I understand that rewriting the query with
>=
,<
checks would make it use the btree index. The question is what is the reason the index is not used with thetstzrange
containment operator and if there is a way to make it work.
postgresql index btree gist-index postgresql-11
postgresql index btree gist-index postgresql-11
edited Jan 12 at 1:18
Erwin Brandstetter
93.2k9178292
93.2k9178292
asked Jan 11 at 12:03
GajusGajus
326112
326112
add a comment |
add a comment |
2 Answers
2
active
oldest
votes
As far as my research goes, postgresql is not able to rewrite containment check into an expression that could be matched using btree index, i.e.
esl1.created_at >= now() - interval '1 hour' AND
esl1.created_at < now() + interval '1 hour'
when written such way, the query is executed using indexes:
Index Scan using event_seating_lookup_created_at_idx on event_seating_lookup esl1 (cost=0.44..12623.56 rows=18084 width=4) (actual time=0.013..57.084 rows=70149 loops=1)
Index Cond: ((created_at >= (now() - '01:00:00'::interval)) AND (created_at < (now() + '01:00:00'::interval)))
Planning Time: 0.223 ms
Execution Time: 62.209 ms
As I prefer syntax of the containment query over the latter form, I researched possible alternatives. What came up is that I can write a procedure that will inline the condition for me:
CREATE OR REPLACE FUNCTION in_range(timestamptz, tstzrange)
RETURNS boolean AS $$
SELECT
(
$1 >= lower($2) AND
$1 <= upper($2) AND
(
upper_inc($2) OR
$1 < upper($2)
) AND
(
lower_inc($2) OR
$1 > lower($2)
)
)
$$
language sql;
The reason for first matching on the $1 >= lower($2) AND $1 <= upper($2)
condition and then checking the upper_inc
and lower_inc
constraints is to benefit from range scan first and then filter the result.
This is similar to what PostGis does (example functionST_DWithin()
): rewrite functions with the right operators to target certain indexes. So the general approach can make sense. Just make sure the function can be inlined. See: dba.stackexchange.com/a/212199/3684. And declare itIMMUTABLE
since it is. (Not sure about your handling of bounds ...)
– Erwin Brandstetter
Jan 11 at 18:36
Regardingimmutable
attribute. Thats what I did at first, but then I got feedback saying that "you should usually avoid declaring inlinable scalar sql funcs as immutable. wiki.postgresql.org/wiki/Inlining_of_SQL_functions read it; declaring an sql function immutable can only make it less inlinable. for scalar functions, that is; and once the function is inlined, what matters for const-simplification is the operators and functions inside the body, which are not affected by an "immutable" declaration on the function itself"
– Gajus
Jan 11 at 18:42
While the feedback you got is true, it is also not applicable to your case. Since everything in the function body is immutable (incl.upper()
,upper_inc()
,lower()
,lower_inc()
- I checked!) you should declare it as such. That's not to help inlining, but it allows various other optimizations.
– Erwin Brandstetter
Jan 11 at 18:52
I added an alternative version of your smart function to my answer.
– Erwin Brandstetter
Jan 11 at 20:27
@ErwinBrandstetter 'RhodiumToad> gajus: I don't do stackexchange, but feel free to respond to the [Erwin Brandstetter] pointing out that your advice came from a postgresql developer, and that he can learn why he's wrong by coming to the irc channel [#postgresql on Freenode]'
– Gajus
Jan 12 at 17:26
|
show 1 more comment
The question is what is the reason the index is not used with the
tstzrange
containment operator and if there is a way to make it work.
The reason is quite trivial. B-tree indexes do no support the containment operator @>
. Neither for range types like tstzrange
nor for any other type (including array types).
The manual:
... a btree operator class must provide five comparison operators,
<
,<=
,
=
,>=
and>
.
And GiST indexes do not support <
, <=
, > =
, >=
and >
. See these chapters of the manual:
- Built-in Operator Classes for GiST indexes
- Behavior of B-Tree Operator Classes
In Postgres indexes are bound to operators (which are implemented for certain types), not data types alone or functions or anything else. Related:
- Optimizing queries on a range of timestamps (two columns)
- Can PostgreSQL index array columns?
The GiST index event_seating_lookup_created_at_idx2
you have is pointless. It is created on the timestamptz
column created_at
. Such a GiST index would be useful for a range type (the opposite direction of logic).
Creating a GiST index on a timestamptz
column is only possible since you installed the additional btree_gist
extension to allow such useless indexes. (There are useful applications for multicolumn indexes or exclusion constraints ...) In stock Postgres you'd get an error:
ERROR: data type timestamp with time zone has no default operator class for access method "gist"
So while it's logically valid and technically possible to use a btree index (or a GiST index) for the query, the case is not implemented: no index support for timestamptz <@ tstzrange
(where timestamptz
would be the indexed expression!). It can be solved with <
, <=
, >
, >=
more efficiently. So I guess no developer felt (or will feel) the need to implement it.
Function to rewrite expression with favorable operators
Your implementation makes sense to me. It does what you want and due to function inlining makes use of a plain btree index on the timestamp column - event_seating_lookup_created_at_idx
in your example. For calls with a constant range (like in a single function call) I suggest this modified version:
CREATE OR REPLACE FUNCTION in_range(timestamptz, tstzrange)
RETURNS boolean AS
$func$
SELECT CASE WHEN lower_inc($2) THEN $1 >= lower($2) ELSE $1 > lower($2) END
AND CASE WHEN upper_inc($2) THEN $1 <= upper($2) ELSE $1 < upper($2) END
$func$ LANGUAGE sql IMMUTABLE;
Declare it IMMUTABLE
, because it actually is. Not to help function inlining (can even prevent it if the declaration is false) but for other gains. Related:
- Can declaring function volatility IMMUTABLE harm performance?
- PostgreSQL Inline Function Behavior
It can be inlined and uses the index just like your version. The difference: this one suppresses a redundant index condition for exclusive bounds. Fortunately, your consideration in this regard is slightly off target:
The reason for first matching on
the $1 >= lower($2) AND $1 <= upper($2)
condition and then checking theupper_inc
andlower_inc
constraints is to benefit from range scan first and then filter the result.
Postgres 11 (at least) is even smarter than that. I see no Filter
step for your version. For default [)
bounds (like in your example), I get this query plan (line breaks in conditions added by me):
-> Index Only Scan using foo_idx on foo (actual rows=5206 loops=1)
Index Cond: ((datetime >= '2018-09-05 22:00:00+00'::timestamp with time zone)
AND (datetime <= '2018-09-05 22:30:00+00'::timestamp with time zone)
AND (datetime < '2018-09-05 22:30:00+00'::timestamp with time zone))
An actual Filter
step could add a more substantial cost for corner cases with many hits on an excluded bound. Those would be fetched from the index and then discarded. Quite relevant for time ranges where values often end up on bounds - like timestamps on the full hour.
The actual difference is see is minor, but why not take it?
-> Index Only Scan using foo_idx on foo (actual rows=5206 loops=1)
Index Cond: ((datetime >= '2018-09-05 22:00:00+00'::timestamp with time zone)
AND (datetime < '2018-09-05 22:30:00+00'::timestamp with time zone))
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f226889%2fcontainment-check-tstzrange-timestamptz-not-using-btree-or-gist-index%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
As far as my research goes, postgresql is not able to rewrite containment check into an expression that could be matched using btree index, i.e.
esl1.created_at >= now() - interval '1 hour' AND
esl1.created_at < now() + interval '1 hour'
when written such way, the query is executed using indexes:
Index Scan using event_seating_lookup_created_at_idx on event_seating_lookup esl1 (cost=0.44..12623.56 rows=18084 width=4) (actual time=0.013..57.084 rows=70149 loops=1)
Index Cond: ((created_at >= (now() - '01:00:00'::interval)) AND (created_at < (now() + '01:00:00'::interval)))
Planning Time: 0.223 ms
Execution Time: 62.209 ms
As I prefer syntax of the containment query over the latter form, I researched possible alternatives. What came up is that I can write a procedure that will inline the condition for me:
CREATE OR REPLACE FUNCTION in_range(timestamptz, tstzrange)
RETURNS boolean AS $$
SELECT
(
$1 >= lower($2) AND
$1 <= upper($2) AND
(
upper_inc($2) OR
$1 < upper($2)
) AND
(
lower_inc($2) OR
$1 > lower($2)
)
)
$$
language sql;
The reason for first matching on the $1 >= lower($2) AND $1 <= upper($2)
condition and then checking the upper_inc
and lower_inc
constraints is to benefit from range scan first and then filter the result.
This is similar to what PostGis does (example functionST_DWithin()
): rewrite functions with the right operators to target certain indexes. So the general approach can make sense. Just make sure the function can be inlined. See: dba.stackexchange.com/a/212199/3684. And declare itIMMUTABLE
since it is. (Not sure about your handling of bounds ...)
– Erwin Brandstetter
Jan 11 at 18:36
Regardingimmutable
attribute. Thats what I did at first, but then I got feedback saying that "you should usually avoid declaring inlinable scalar sql funcs as immutable. wiki.postgresql.org/wiki/Inlining_of_SQL_functions read it; declaring an sql function immutable can only make it less inlinable. for scalar functions, that is; and once the function is inlined, what matters for const-simplification is the operators and functions inside the body, which are not affected by an "immutable" declaration on the function itself"
– Gajus
Jan 11 at 18:42
While the feedback you got is true, it is also not applicable to your case. Since everything in the function body is immutable (incl.upper()
,upper_inc()
,lower()
,lower_inc()
- I checked!) you should declare it as such. That's not to help inlining, but it allows various other optimizations.
– Erwin Brandstetter
Jan 11 at 18:52
I added an alternative version of your smart function to my answer.
– Erwin Brandstetter
Jan 11 at 20:27
@ErwinBrandstetter 'RhodiumToad> gajus: I don't do stackexchange, but feel free to respond to the [Erwin Brandstetter] pointing out that your advice came from a postgresql developer, and that he can learn why he's wrong by coming to the irc channel [#postgresql on Freenode]'
– Gajus
Jan 12 at 17:26
|
show 1 more comment
As far as my research goes, postgresql is not able to rewrite containment check into an expression that could be matched using btree index, i.e.
esl1.created_at >= now() - interval '1 hour' AND
esl1.created_at < now() + interval '1 hour'
when written such way, the query is executed using indexes:
Index Scan using event_seating_lookup_created_at_idx on event_seating_lookup esl1 (cost=0.44..12623.56 rows=18084 width=4) (actual time=0.013..57.084 rows=70149 loops=1)
Index Cond: ((created_at >= (now() - '01:00:00'::interval)) AND (created_at < (now() + '01:00:00'::interval)))
Planning Time: 0.223 ms
Execution Time: 62.209 ms
As I prefer syntax of the containment query over the latter form, I researched possible alternatives. What came up is that I can write a procedure that will inline the condition for me:
CREATE OR REPLACE FUNCTION in_range(timestamptz, tstzrange)
RETURNS boolean AS $$
SELECT
(
$1 >= lower($2) AND
$1 <= upper($2) AND
(
upper_inc($2) OR
$1 < upper($2)
) AND
(
lower_inc($2) OR
$1 > lower($2)
)
)
$$
language sql;
The reason for first matching on the $1 >= lower($2) AND $1 <= upper($2)
condition and then checking the upper_inc
and lower_inc
constraints is to benefit from range scan first and then filter the result.
This is similar to what PostGis does (example functionST_DWithin()
): rewrite functions with the right operators to target certain indexes. So the general approach can make sense. Just make sure the function can be inlined. See: dba.stackexchange.com/a/212199/3684. And declare itIMMUTABLE
since it is. (Not sure about your handling of bounds ...)
– Erwin Brandstetter
Jan 11 at 18:36
Regardingimmutable
attribute. Thats what I did at first, but then I got feedback saying that "you should usually avoid declaring inlinable scalar sql funcs as immutable. wiki.postgresql.org/wiki/Inlining_of_SQL_functions read it; declaring an sql function immutable can only make it less inlinable. for scalar functions, that is; and once the function is inlined, what matters for const-simplification is the operators and functions inside the body, which are not affected by an "immutable" declaration on the function itself"
– Gajus
Jan 11 at 18:42
While the feedback you got is true, it is also not applicable to your case. Since everything in the function body is immutable (incl.upper()
,upper_inc()
,lower()
,lower_inc()
- I checked!) you should declare it as such. That's not to help inlining, but it allows various other optimizations.
– Erwin Brandstetter
Jan 11 at 18:52
I added an alternative version of your smart function to my answer.
– Erwin Brandstetter
Jan 11 at 20:27
@ErwinBrandstetter 'RhodiumToad> gajus: I don't do stackexchange, but feel free to respond to the [Erwin Brandstetter] pointing out that your advice came from a postgresql developer, and that he can learn why he's wrong by coming to the irc channel [#postgresql on Freenode]'
– Gajus
Jan 12 at 17:26
|
show 1 more comment
As far as my research goes, postgresql is not able to rewrite containment check into an expression that could be matched using btree index, i.e.
esl1.created_at >= now() - interval '1 hour' AND
esl1.created_at < now() + interval '1 hour'
when written such way, the query is executed using indexes:
Index Scan using event_seating_lookup_created_at_idx on event_seating_lookup esl1 (cost=0.44..12623.56 rows=18084 width=4) (actual time=0.013..57.084 rows=70149 loops=1)
Index Cond: ((created_at >= (now() - '01:00:00'::interval)) AND (created_at < (now() + '01:00:00'::interval)))
Planning Time: 0.223 ms
Execution Time: 62.209 ms
As I prefer syntax of the containment query over the latter form, I researched possible alternatives. What came up is that I can write a procedure that will inline the condition for me:
CREATE OR REPLACE FUNCTION in_range(timestamptz, tstzrange)
RETURNS boolean AS $$
SELECT
(
$1 >= lower($2) AND
$1 <= upper($2) AND
(
upper_inc($2) OR
$1 < upper($2)
) AND
(
lower_inc($2) OR
$1 > lower($2)
)
)
$$
language sql;
The reason for first matching on the $1 >= lower($2) AND $1 <= upper($2)
condition and then checking the upper_inc
and lower_inc
constraints is to benefit from range scan first and then filter the result.
As far as my research goes, postgresql is not able to rewrite containment check into an expression that could be matched using btree index, i.e.
esl1.created_at >= now() - interval '1 hour' AND
esl1.created_at < now() + interval '1 hour'
when written such way, the query is executed using indexes:
Index Scan using event_seating_lookup_created_at_idx on event_seating_lookup esl1 (cost=0.44..12623.56 rows=18084 width=4) (actual time=0.013..57.084 rows=70149 loops=1)
Index Cond: ((created_at >= (now() - '01:00:00'::interval)) AND (created_at < (now() + '01:00:00'::interval)))
Planning Time: 0.223 ms
Execution Time: 62.209 ms
As I prefer syntax of the containment query over the latter form, I researched possible alternatives. What came up is that I can write a procedure that will inline the condition for me:
CREATE OR REPLACE FUNCTION in_range(timestamptz, tstzrange)
RETURNS boolean AS $$
SELECT
(
$1 >= lower($2) AND
$1 <= upper($2) AND
(
upper_inc($2) OR
$1 < upper($2)
) AND
(
lower_inc($2) OR
$1 > lower($2)
)
)
$$
language sql;
The reason for first matching on the $1 >= lower($2) AND $1 <= upper($2)
condition and then checking the upper_inc
and lower_inc
constraints is to benefit from range scan first and then filter the result.
answered Jan 11 at 15:58
GajusGajus
326112
326112
This is similar to what PostGis does (example functionST_DWithin()
): rewrite functions with the right operators to target certain indexes. So the general approach can make sense. Just make sure the function can be inlined. See: dba.stackexchange.com/a/212199/3684. And declare itIMMUTABLE
since it is. (Not sure about your handling of bounds ...)
– Erwin Brandstetter
Jan 11 at 18:36
Regardingimmutable
attribute. Thats what I did at first, but then I got feedback saying that "you should usually avoid declaring inlinable scalar sql funcs as immutable. wiki.postgresql.org/wiki/Inlining_of_SQL_functions read it; declaring an sql function immutable can only make it less inlinable. for scalar functions, that is; and once the function is inlined, what matters for const-simplification is the operators and functions inside the body, which are not affected by an "immutable" declaration on the function itself"
– Gajus
Jan 11 at 18:42
While the feedback you got is true, it is also not applicable to your case. Since everything in the function body is immutable (incl.upper()
,upper_inc()
,lower()
,lower_inc()
- I checked!) you should declare it as such. That's not to help inlining, but it allows various other optimizations.
– Erwin Brandstetter
Jan 11 at 18:52
I added an alternative version of your smart function to my answer.
– Erwin Brandstetter
Jan 11 at 20:27
@ErwinBrandstetter 'RhodiumToad> gajus: I don't do stackexchange, but feel free to respond to the [Erwin Brandstetter] pointing out that your advice came from a postgresql developer, and that he can learn why he's wrong by coming to the irc channel [#postgresql on Freenode]'
– Gajus
Jan 12 at 17:26
|
show 1 more comment
This is similar to what PostGis does (example functionST_DWithin()
): rewrite functions with the right operators to target certain indexes. So the general approach can make sense. Just make sure the function can be inlined. See: dba.stackexchange.com/a/212199/3684. And declare itIMMUTABLE
since it is. (Not sure about your handling of bounds ...)
– Erwin Brandstetter
Jan 11 at 18:36
Regardingimmutable
attribute. Thats what I did at first, but then I got feedback saying that "you should usually avoid declaring inlinable scalar sql funcs as immutable. wiki.postgresql.org/wiki/Inlining_of_SQL_functions read it; declaring an sql function immutable can only make it less inlinable. for scalar functions, that is; and once the function is inlined, what matters for const-simplification is the operators and functions inside the body, which are not affected by an "immutable" declaration on the function itself"
– Gajus
Jan 11 at 18:42
While the feedback you got is true, it is also not applicable to your case. Since everything in the function body is immutable (incl.upper()
,upper_inc()
,lower()
,lower_inc()
- I checked!) you should declare it as such. That's not to help inlining, but it allows various other optimizations.
– Erwin Brandstetter
Jan 11 at 18:52
I added an alternative version of your smart function to my answer.
– Erwin Brandstetter
Jan 11 at 20:27
@ErwinBrandstetter 'RhodiumToad> gajus: I don't do stackexchange, but feel free to respond to the [Erwin Brandstetter] pointing out that your advice came from a postgresql developer, and that he can learn why he's wrong by coming to the irc channel [#postgresql on Freenode]'
– Gajus
Jan 12 at 17:26
This is similar to what PostGis does (example function
ST_DWithin()
): rewrite functions with the right operators to target certain indexes. So the general approach can make sense. Just make sure the function can be inlined. See: dba.stackexchange.com/a/212199/3684. And declare it IMMUTABLE
since it is. (Not sure about your handling of bounds ...)– Erwin Brandstetter
Jan 11 at 18:36
This is similar to what PostGis does (example function
ST_DWithin()
): rewrite functions with the right operators to target certain indexes. So the general approach can make sense. Just make sure the function can be inlined. See: dba.stackexchange.com/a/212199/3684. And declare it IMMUTABLE
since it is. (Not sure about your handling of bounds ...)– Erwin Brandstetter
Jan 11 at 18:36
Regarding
immutable
attribute. Thats what I did at first, but then I got feedback saying that "you should usually avoid declaring inlinable scalar sql funcs as immutable. wiki.postgresql.org/wiki/Inlining_of_SQL_functions read it; declaring an sql function immutable can only make it less inlinable. for scalar functions, that is; and once the function is inlined, what matters for const-simplification is the operators and functions inside the body, which are not affected by an "immutable" declaration on the function itself"– Gajus
Jan 11 at 18:42
Regarding
immutable
attribute. Thats what I did at first, but then I got feedback saying that "you should usually avoid declaring inlinable scalar sql funcs as immutable. wiki.postgresql.org/wiki/Inlining_of_SQL_functions read it; declaring an sql function immutable can only make it less inlinable. for scalar functions, that is; and once the function is inlined, what matters for const-simplification is the operators and functions inside the body, which are not affected by an "immutable" declaration on the function itself"– Gajus
Jan 11 at 18:42
While the feedback you got is true, it is also not applicable to your case. Since everything in the function body is immutable (incl.
upper()
, upper_inc()
, lower()
, lower_inc()
- I checked!) you should declare it as such. That's not to help inlining, but it allows various other optimizations.– Erwin Brandstetter
Jan 11 at 18:52
While the feedback you got is true, it is also not applicable to your case. Since everything in the function body is immutable (incl.
upper()
, upper_inc()
, lower()
, lower_inc()
- I checked!) you should declare it as such. That's not to help inlining, but it allows various other optimizations.– Erwin Brandstetter
Jan 11 at 18:52
I added an alternative version of your smart function to my answer.
– Erwin Brandstetter
Jan 11 at 20:27
I added an alternative version of your smart function to my answer.
– Erwin Brandstetter
Jan 11 at 20:27
@ErwinBrandstetter 'RhodiumToad> gajus: I don't do stackexchange, but feel free to respond to the [Erwin Brandstetter] pointing out that your advice came from a postgresql developer, and that he can learn why he's wrong by coming to the irc channel [#postgresql on Freenode]'
– Gajus
Jan 12 at 17:26
@ErwinBrandstetter 'RhodiumToad> gajus: I don't do stackexchange, but feel free to respond to the [Erwin Brandstetter] pointing out that your advice came from a postgresql developer, and that he can learn why he's wrong by coming to the irc channel [#postgresql on Freenode]'
– Gajus
Jan 12 at 17:26
|
show 1 more comment
The question is what is the reason the index is not used with the
tstzrange
containment operator and if there is a way to make it work.
The reason is quite trivial. B-tree indexes do no support the containment operator @>
. Neither for range types like tstzrange
nor for any other type (including array types).
The manual:
... a btree operator class must provide five comparison operators,
<
,<=
,
=
,>=
and>
.
And GiST indexes do not support <
, <=
, > =
, >=
and >
. See these chapters of the manual:
- Built-in Operator Classes for GiST indexes
- Behavior of B-Tree Operator Classes
In Postgres indexes are bound to operators (which are implemented for certain types), not data types alone or functions or anything else. Related:
- Optimizing queries on a range of timestamps (two columns)
- Can PostgreSQL index array columns?
The GiST index event_seating_lookup_created_at_idx2
you have is pointless. It is created on the timestamptz
column created_at
. Such a GiST index would be useful for a range type (the opposite direction of logic).
Creating a GiST index on a timestamptz
column is only possible since you installed the additional btree_gist
extension to allow such useless indexes. (There are useful applications for multicolumn indexes or exclusion constraints ...) In stock Postgres you'd get an error:
ERROR: data type timestamp with time zone has no default operator class for access method "gist"
So while it's logically valid and technically possible to use a btree index (or a GiST index) for the query, the case is not implemented: no index support for timestamptz <@ tstzrange
(where timestamptz
would be the indexed expression!). It can be solved with <
, <=
, >
, >=
more efficiently. So I guess no developer felt (or will feel) the need to implement it.
Function to rewrite expression with favorable operators
Your implementation makes sense to me. It does what you want and due to function inlining makes use of a plain btree index on the timestamp column - event_seating_lookup_created_at_idx
in your example. For calls with a constant range (like in a single function call) I suggest this modified version:
CREATE OR REPLACE FUNCTION in_range(timestamptz, tstzrange)
RETURNS boolean AS
$func$
SELECT CASE WHEN lower_inc($2) THEN $1 >= lower($2) ELSE $1 > lower($2) END
AND CASE WHEN upper_inc($2) THEN $1 <= upper($2) ELSE $1 < upper($2) END
$func$ LANGUAGE sql IMMUTABLE;
Declare it IMMUTABLE
, because it actually is. Not to help function inlining (can even prevent it if the declaration is false) but for other gains. Related:
- Can declaring function volatility IMMUTABLE harm performance?
- PostgreSQL Inline Function Behavior
It can be inlined and uses the index just like your version. The difference: this one suppresses a redundant index condition for exclusive bounds. Fortunately, your consideration in this regard is slightly off target:
The reason for first matching on
the $1 >= lower($2) AND $1 <= upper($2)
condition and then checking theupper_inc
andlower_inc
constraints is to benefit from range scan first and then filter the result.
Postgres 11 (at least) is even smarter than that. I see no Filter
step for your version. For default [)
bounds (like in your example), I get this query plan (line breaks in conditions added by me):
-> Index Only Scan using foo_idx on foo (actual rows=5206 loops=1)
Index Cond: ((datetime >= '2018-09-05 22:00:00+00'::timestamp with time zone)
AND (datetime <= '2018-09-05 22:30:00+00'::timestamp with time zone)
AND (datetime < '2018-09-05 22:30:00+00'::timestamp with time zone))
An actual Filter
step could add a more substantial cost for corner cases with many hits on an excluded bound. Those would be fetched from the index and then discarded. Quite relevant for time ranges where values often end up on bounds - like timestamps on the full hour.
The actual difference is see is minor, but why not take it?
-> Index Only Scan using foo_idx on foo (actual rows=5206 loops=1)
Index Cond: ((datetime >= '2018-09-05 22:00:00+00'::timestamp with time zone)
AND (datetime < '2018-09-05 22:30:00+00'::timestamp with time zone))
add a comment |
The question is what is the reason the index is not used with the
tstzrange
containment operator and if there is a way to make it work.
The reason is quite trivial. B-tree indexes do no support the containment operator @>
. Neither for range types like tstzrange
nor for any other type (including array types).
The manual:
... a btree operator class must provide five comparison operators,
<
,<=
,
=
,>=
and>
.
And GiST indexes do not support <
, <=
, > =
, >=
and >
. See these chapters of the manual:
- Built-in Operator Classes for GiST indexes
- Behavior of B-Tree Operator Classes
In Postgres indexes are bound to operators (which are implemented for certain types), not data types alone or functions or anything else. Related:
- Optimizing queries on a range of timestamps (two columns)
- Can PostgreSQL index array columns?
The GiST index event_seating_lookup_created_at_idx2
you have is pointless. It is created on the timestamptz
column created_at
. Such a GiST index would be useful for a range type (the opposite direction of logic).
Creating a GiST index on a timestamptz
column is only possible since you installed the additional btree_gist
extension to allow such useless indexes. (There are useful applications for multicolumn indexes or exclusion constraints ...) In stock Postgres you'd get an error:
ERROR: data type timestamp with time zone has no default operator class for access method "gist"
So while it's logically valid and technically possible to use a btree index (or a GiST index) for the query, the case is not implemented: no index support for timestamptz <@ tstzrange
(where timestamptz
would be the indexed expression!). It can be solved with <
, <=
, >
, >=
more efficiently. So I guess no developer felt (or will feel) the need to implement it.
Function to rewrite expression with favorable operators
Your implementation makes sense to me. It does what you want and due to function inlining makes use of a plain btree index on the timestamp column - event_seating_lookup_created_at_idx
in your example. For calls with a constant range (like in a single function call) I suggest this modified version:
CREATE OR REPLACE FUNCTION in_range(timestamptz, tstzrange)
RETURNS boolean AS
$func$
SELECT CASE WHEN lower_inc($2) THEN $1 >= lower($2) ELSE $1 > lower($2) END
AND CASE WHEN upper_inc($2) THEN $1 <= upper($2) ELSE $1 < upper($2) END
$func$ LANGUAGE sql IMMUTABLE;
Declare it IMMUTABLE
, because it actually is. Not to help function inlining (can even prevent it if the declaration is false) but for other gains. Related:
- Can declaring function volatility IMMUTABLE harm performance?
- PostgreSQL Inline Function Behavior
It can be inlined and uses the index just like your version. The difference: this one suppresses a redundant index condition for exclusive bounds. Fortunately, your consideration in this regard is slightly off target:
The reason for first matching on
the $1 >= lower($2) AND $1 <= upper($2)
condition and then checking theupper_inc
andlower_inc
constraints is to benefit from range scan first and then filter the result.
Postgres 11 (at least) is even smarter than that. I see no Filter
step for your version. For default [)
bounds (like in your example), I get this query plan (line breaks in conditions added by me):
-> Index Only Scan using foo_idx on foo (actual rows=5206 loops=1)
Index Cond: ((datetime >= '2018-09-05 22:00:00+00'::timestamp with time zone)
AND (datetime <= '2018-09-05 22:30:00+00'::timestamp with time zone)
AND (datetime < '2018-09-05 22:30:00+00'::timestamp with time zone))
An actual Filter
step could add a more substantial cost for corner cases with many hits on an excluded bound. Those would be fetched from the index and then discarded. Quite relevant for time ranges where values often end up on bounds - like timestamps on the full hour.
The actual difference is see is minor, but why not take it?
-> Index Only Scan using foo_idx on foo (actual rows=5206 loops=1)
Index Cond: ((datetime >= '2018-09-05 22:00:00+00'::timestamp with time zone)
AND (datetime < '2018-09-05 22:30:00+00'::timestamp with time zone))
add a comment |
The question is what is the reason the index is not used with the
tstzrange
containment operator and if there is a way to make it work.
The reason is quite trivial. B-tree indexes do no support the containment operator @>
. Neither for range types like tstzrange
nor for any other type (including array types).
The manual:
... a btree operator class must provide five comparison operators,
<
,<=
,
=
,>=
and>
.
And GiST indexes do not support <
, <=
, > =
, >=
and >
. See these chapters of the manual:
- Built-in Operator Classes for GiST indexes
- Behavior of B-Tree Operator Classes
In Postgres indexes are bound to operators (which are implemented for certain types), not data types alone or functions or anything else. Related:
- Optimizing queries on a range of timestamps (two columns)
- Can PostgreSQL index array columns?
The GiST index event_seating_lookup_created_at_idx2
you have is pointless. It is created on the timestamptz
column created_at
. Such a GiST index would be useful for a range type (the opposite direction of logic).
Creating a GiST index on a timestamptz
column is only possible since you installed the additional btree_gist
extension to allow such useless indexes. (There are useful applications for multicolumn indexes or exclusion constraints ...) In stock Postgres you'd get an error:
ERROR: data type timestamp with time zone has no default operator class for access method "gist"
So while it's logically valid and technically possible to use a btree index (or a GiST index) for the query, the case is not implemented: no index support for timestamptz <@ tstzrange
(where timestamptz
would be the indexed expression!). It can be solved with <
, <=
, >
, >=
more efficiently. So I guess no developer felt (or will feel) the need to implement it.
Function to rewrite expression with favorable operators
Your implementation makes sense to me. It does what you want and due to function inlining makes use of a plain btree index on the timestamp column - event_seating_lookup_created_at_idx
in your example. For calls with a constant range (like in a single function call) I suggest this modified version:
CREATE OR REPLACE FUNCTION in_range(timestamptz, tstzrange)
RETURNS boolean AS
$func$
SELECT CASE WHEN lower_inc($2) THEN $1 >= lower($2) ELSE $1 > lower($2) END
AND CASE WHEN upper_inc($2) THEN $1 <= upper($2) ELSE $1 < upper($2) END
$func$ LANGUAGE sql IMMUTABLE;
Declare it IMMUTABLE
, because it actually is. Not to help function inlining (can even prevent it if the declaration is false) but for other gains. Related:
- Can declaring function volatility IMMUTABLE harm performance?
- PostgreSQL Inline Function Behavior
It can be inlined and uses the index just like your version. The difference: this one suppresses a redundant index condition for exclusive bounds. Fortunately, your consideration in this regard is slightly off target:
The reason for first matching on
the $1 >= lower($2) AND $1 <= upper($2)
condition and then checking theupper_inc
andlower_inc
constraints is to benefit from range scan first and then filter the result.
Postgres 11 (at least) is even smarter than that. I see no Filter
step for your version. For default [)
bounds (like in your example), I get this query plan (line breaks in conditions added by me):
-> Index Only Scan using foo_idx on foo (actual rows=5206 loops=1)
Index Cond: ((datetime >= '2018-09-05 22:00:00+00'::timestamp with time zone)
AND (datetime <= '2018-09-05 22:30:00+00'::timestamp with time zone)
AND (datetime < '2018-09-05 22:30:00+00'::timestamp with time zone))
An actual Filter
step could add a more substantial cost for corner cases with many hits on an excluded bound. Those would be fetched from the index and then discarded. Quite relevant for time ranges where values often end up on bounds - like timestamps on the full hour.
The actual difference is see is minor, but why not take it?
-> Index Only Scan using foo_idx on foo (actual rows=5206 loops=1)
Index Cond: ((datetime >= '2018-09-05 22:00:00+00'::timestamp with time zone)
AND (datetime < '2018-09-05 22:30:00+00'::timestamp with time zone))
The question is what is the reason the index is not used with the
tstzrange
containment operator and if there is a way to make it work.
The reason is quite trivial. B-tree indexes do no support the containment operator @>
. Neither for range types like tstzrange
nor for any other type (including array types).
The manual:
... a btree operator class must provide five comparison operators,
<
,<=
,
=
,>=
and>
.
And GiST indexes do not support <
, <=
, > =
, >=
and >
. See these chapters of the manual:
- Built-in Operator Classes for GiST indexes
- Behavior of B-Tree Operator Classes
In Postgres indexes are bound to operators (which are implemented for certain types), not data types alone or functions or anything else. Related:
- Optimizing queries on a range of timestamps (two columns)
- Can PostgreSQL index array columns?
The GiST index event_seating_lookup_created_at_idx2
you have is pointless. It is created on the timestamptz
column created_at
. Such a GiST index would be useful for a range type (the opposite direction of logic).
Creating a GiST index on a timestamptz
column is only possible since you installed the additional btree_gist
extension to allow such useless indexes. (There are useful applications for multicolumn indexes or exclusion constraints ...) In stock Postgres you'd get an error:
ERROR: data type timestamp with time zone has no default operator class for access method "gist"
So while it's logically valid and technically possible to use a btree index (or a GiST index) for the query, the case is not implemented: no index support for timestamptz <@ tstzrange
(where timestamptz
would be the indexed expression!). It can be solved with <
, <=
, >
, >=
more efficiently. So I guess no developer felt (or will feel) the need to implement it.
Function to rewrite expression with favorable operators
Your implementation makes sense to me. It does what you want and due to function inlining makes use of a plain btree index on the timestamp column - event_seating_lookup_created_at_idx
in your example. For calls with a constant range (like in a single function call) I suggest this modified version:
CREATE OR REPLACE FUNCTION in_range(timestamptz, tstzrange)
RETURNS boolean AS
$func$
SELECT CASE WHEN lower_inc($2) THEN $1 >= lower($2) ELSE $1 > lower($2) END
AND CASE WHEN upper_inc($2) THEN $1 <= upper($2) ELSE $1 < upper($2) END
$func$ LANGUAGE sql IMMUTABLE;
Declare it IMMUTABLE
, because it actually is. Not to help function inlining (can even prevent it if the declaration is false) but for other gains. Related:
- Can declaring function volatility IMMUTABLE harm performance?
- PostgreSQL Inline Function Behavior
It can be inlined and uses the index just like your version. The difference: this one suppresses a redundant index condition for exclusive bounds. Fortunately, your consideration in this regard is slightly off target:
The reason for first matching on
the $1 >= lower($2) AND $1 <= upper($2)
condition and then checking theupper_inc
andlower_inc
constraints is to benefit from range scan first and then filter the result.
Postgres 11 (at least) is even smarter than that. I see no Filter
step for your version. For default [)
bounds (like in your example), I get this query plan (line breaks in conditions added by me):
-> Index Only Scan using foo_idx on foo (actual rows=5206 loops=1)
Index Cond: ((datetime >= '2018-09-05 22:00:00+00'::timestamp with time zone)
AND (datetime <= '2018-09-05 22:30:00+00'::timestamp with time zone)
AND (datetime < '2018-09-05 22:30:00+00'::timestamp with time zone))
An actual Filter
step could add a more substantial cost for corner cases with many hits on an excluded bound. Those would be fetched from the index and then discarded. Quite relevant for time ranges where values often end up on bounds - like timestamps on the full hour.
The actual difference is see is minor, but why not take it?
-> Index Only Scan using foo_idx on foo (actual rows=5206 loops=1)
Index Cond: ((datetime >= '2018-09-05 22:00:00+00'::timestamp with time zone)
AND (datetime < '2018-09-05 22:30:00+00'::timestamp with time zone))
edited 28 mins ago
answered Jan 11 at 17:44
Erwin BrandstetterErwin Brandstetter
93.2k9178292
93.2k9178292
add a comment |
add a comment |
Thanks for contributing an answer to Database Administrators Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f226889%2fcontainment-check-tstzrange-timestamptz-not-using-btree-or-gist-index%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown