Can Postgres use an index-only scan for this query with joined tables?
This is a follow-up to: Do covering indexes in PostgreSQL help JOIN columns?
Consider the inverse of the schema in the other question, where you filter in the joined-on table:
    CREATE TABLE thing_types(
      id INTEGER PRIMARY KEY
    , first_lvl_type TEXT
    , second_lvl_type TEXT
    );
    CREATE TABLE things(
      id INTEGER PRIMARY KEY
    , thing_type INTEGER REFERENCES thing_types(id)
    , t1c1 INTEGER
    );
And a query like so:
    SELECT things.t1c1
    FROM things
    JOIN thing_types ON things.thing_type = thing_types.id
    WHERE thing_types.first_lvl_type = 'Book'
    AND thing_types.second_lvl_type = 'Biography';
Is it madness to have an index like:
    CREATE INDEX ON thing_types(first_lvl_type, second_lvl_type, id);
which covers the primary key for use in that join? Will the index be used as a covering index to help the JOIN in the above query? Should I change my indexing strategy to cover the primary key more often when I know the table is going to be JOINed on like this?
postgresql index optimization index-tuning
edited Aug 1 '18 at 21:41 by Erwin Brandstetter
asked Nov 5 '17 at 16:42 by ldrg
2 Answers
If additional preconditions for an index-only scan are met, it makes perfect sense to append the column id as trailing column to the index (not as leading column):
    CREATE INDEX ON thing_types(first_lvl_type, second_lvl_type, id);
Postgres 11 introduces actual covering indexes with the INCLUDE keyword:
    CREATE INDEX ON thing_types(first_lvl_type, second_lvl_type) INCLUDE (id);
Only a small benefit for your case, but it's a great option to add columns to a UNIQUE or PK index or constraint.
About index-only scans:
- The Postgres Wiki
- The manual
The most important precondition: The visibility map of table thing_types has to show most or all pages as "visible" to all transactions. I.e. the table is either read-only, or your autovacuum settings are aggressive enough to continuously clean up after writes to the table.
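One way to get a rough idea of the visibility map's state is to compare relallvisible with relpages in pg_class. This is only a sketch: both columns are estimates that VACUUM and ANALYZE refresh, and the percentage column is a made-up helper for readability.

```sql
-- Rough share of pages in thing_types that are marked all-visible.
-- relpages and relallvisible are estimates, not exact live counts.
SELECT relname
     , relpages
     , relallvisible
     , round(100.0 * relallvisible / NULLIF(relpages, 0), 1) AS pct_all_visible
FROM   pg_class
WHERE  relname = 'thing_types';

-- Update the visibility map and planner statistics by hand,
-- then run the query above again to see the difference.
VACUUM (ANALYZE) thing_types;
```

If the percentage stays low on a table with regular writes, index-only scans will keep visiting the heap, whatever the index looks like.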
Every additional index adds costs. Mostly to write performance, but also through side effects like exhausted cache capacities. (Multiple queries using the same indexes have a better chance for them to reside in cache.) So it's also a question of size. id is typically a very small column (integer or bigint), which makes it a good candidate for the use case.
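To put numbers on the size question, the statistics views can list each index on the table with its on-disk size and how often it has been scanned. A minimal sketch, assuming the indexes live on thing_types and that statistics collection is enabled:

```sql
-- Size and usage of every index on thing_types, largest first.
SELECT indexrelname                                 AS index_name
     , pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
     , idx_scan                                     AS scans_so_far
FROM   pg_stat_user_indexes
WHERE  relname = 'thing_types'
ORDER  BY pg_relation_size(indexrelid) DESC;
```

Comparing the index with and without the trailing id column shows how much the extra column really costs in space.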
In particular, adding a column to an index disables the option for H.O.T. (heap-only tuple) updates involving the column. But since id is indexed anyway and typically not updated (being the PK), this is not a problem in this case. Related:
- Redundant data in update statements
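If you want to see whether updates on the table still qualify as H.O.T. updates after changing indexes, the cumulative counters in pg_stat_user_tables give a hint. This is a sketch, not a precise measurement: the counters accumulate since the last statistics reset.

```sql
-- Total updates vs. heap-only (H.O.T.) updates on thing_types.
SELECT relname
     , n_tup_upd     AS updates_total
     , n_tup_hot_upd AS updates_hot
FROM   pg_stat_user_tables
WHERE  relname = 'thing_types';
```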
If you actually get index-only scans out of these indexes most of the time, it typically makes sense to use them. Test with EXPLAIN.
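For the query from the question, such a test could look like the sketch below. ANALYZE and BUFFERS are optional; note that ANALYZE actually executes the query.

```sql
-- Show the chosen plan together with actual run-time figures.
EXPLAIN (ANALYZE, BUFFERS)
SELECT things.t1c1
FROM   things
JOIN   thing_types ON things.thing_type = thing_types.id
WHERE  thing_types.first_lvl_type = 'Book'
AND    thing_types.second_lvl_type = 'Biography';
```

In the output, look for an "Index Only Scan" node on thing_types and its "Heap Fetches" figure. A high number of heap fetches means the visibility map is not helping much. On very small tables the planner may well prefer a sequential scan, so test with realistic data volumes.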
There were limitations for partial indexes in older versions. Quoting the release notes of Postgres 9.6:
> Allow use of an index-only scan on a partial index when the index's WHERE clause references columns that are not indexed (Tomas Vondra, Kyotaro Horiguchi)
> For example, an index defined by CREATE INDEX tidx_partial ON t(b) WHERE a > 0 can now be used for an index-only scan by a query that specifies WHERE a > 0 and does not otherwise use a. Previously this was disallowed because a is not listed as an index column.
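The case from the release notes can be reproduced with a small test table. The table definition and the sample data below are assumptions made up for illustration; only the index definition and the WHERE clause come from the quote.

```sql
-- Hypothetical table using the names from the release notes.
CREATE TABLE t (a integer, b integer);

-- Sample data: roughly 1 % of the rows satisfy a > 0.
INSERT INTO t (a, b)
SELECT CASE WHEN g % 100 = 0 THEN 1 ELSE 0 END, g
FROM   generate_series(1, 100000) AS g;

-- Partial index: a appears only in the predicate, not as an index column.
CREATE INDEX tidx_partial ON t (b) WHERE a > 0;

-- Set the visibility map and statistics so an index-only scan is possible.
VACUUM (ANALYZE) t;

-- The query repeats the index predicate and otherwise reads only b.
EXPLAIN SELECT b FROM t WHERE a > 0;
```

Whether the plan shows an index-only scan depends on the Postgres version and the planner's cost estimates, so treat the result as something to check rather than a given.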
edited 9 mins ago
answered Nov 5 '17 at 17:08 by Erwin Brandstetter
You need to try it and see for your specific query plan. You're making a lot of blanket assumptions about the advice given and even the potential for it to be useful to your query.
- Size.
- PostgreSQL major version.
- Configuration for costs.
- Staleness and accuracy of statistics.
All of those things matter.
Not to be vague here, but I could conjure a few examples to show you this.
Generally, I wouldn't index something that is already indexed in the table. If for no other reason than that every index covering a given column is one more index that has to be updated when you change the row.
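The factors listed above can all be inspected from a psql session. A rough sketch, assuming the two tables from the question; the selection of cost settings is just a common starting set, not a complete list:

```sql
-- PostgreSQL version in use.
SHOW server_version;

-- Planner cost settings that influence index vs. sequential scans.
SELECT name, setting
FROM   pg_settings
WHERE  name IN ('seq_page_cost', 'random_page_cost',
                'cpu_tuple_cost', 'cpu_index_tuple_cost',
                'effective_cache_size');

-- Table size plus the last time statistics and vacuum information were refreshed.
SELECT relname
     , pg_size_pretty(pg_total_relation_size(relid)) AS total_size
     , n_live_tup
     , last_analyze
     , last_autoanalyze
     , last_vacuum
     , last_autovacuum
FROM   pg_stat_user_tables
WHERE  relname IN ('things', 'thing_types');
```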
answered Nov 6 '17 at 20:06 by Evan Carroll