Can Postgres use an index-only scan for this query with joined tables?


This is a follow-up to: Do covering indexes in PostgreSQL help JOIN columns?



Consider the inverse of the schema in the other question where you filter in the joined-on table:



CREATE TABLE thing_types(
  id INTEGER PRIMARY KEY
, first_lvl_type TEXT
, second_lvl_type TEXT
);

CREATE TABLE things(
  id INTEGER PRIMARY KEY
, thing_type INTEGER REFERENCES thing_types(id)
, t1c1 INTEGER
);


And a query like so:



SELECT things.t1c1
FROM things
JOIN thing_types ON things.thing_type = thing_types.id
WHERE thing_types.first_lvl_type = 'Book'
AND thing_types.second_lvl_type = 'Biography';


Is it madness to have an index like:



CREATE INDEX ON thing_types(first_lvl_type, second_lvl_type, id);


which covers the primary key for use in that join? Will the index be used as a covering index to help the JOIN in the above query? Should I change my indexing strategy to cover the primary key more often when I know the table is going to be JOINed on like this?










      postgresql index optimization index-tuning






edited Aug 1 '18 at 21:41 by Erwin Brandstetter

asked Nov 5 '17 at 16:42 by ldrg






















          2 Answers
If the additional preconditions for an index-only scan are met, it makes perfect sense to append the column id as a trailing column to the index (not as a leading column):

CREATE INDEX ON thing_types(first_lvl_type, second_lvl_type, id);

Postgres 11 introduces actual covering indexes with the INCLUDE keyword:

CREATE INDEX ON thing_types(first_lvl_type, second_lvl_type) INCLUDE (id);

This is only a small benefit for your case, but it's a great option for adding columns to a UNIQUE or PK index or constraint.
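To illustrate the UNIQUE case, here is a minimal sketch. The table demo_codes and its columns are hypothetical and not part of the question's schema; the syntax requires Postgres 11 or later.

```sql
-- Hypothetical table for illustration only.
CREATE TABLE demo_codes (
  code  text NOT NULL
, label text
  -- Uniqueness is enforced on code alone; label is only stored as payload.
, CONSTRAINT demo_codes_code_uni UNIQUE (code) INCLUDE (label)
);

-- Both columns are available from the index, so this query is a
-- candidate for an index-only scan if the visibility map allows it.
SELECT label
FROM   demo_codes
WHERE  code = 'abc';
```

Appending label as a regular key column instead would change what the constraint enforces (uniqueness of the pair), which is why INCLUDE matters more here than in the question's plain index.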



About index-only scans:

• The Postgres Wiki
• The manual

The most important precondition: the visibility map of table thing_types has to show most or all pages as "visible" to all transactions. That is, the table is either read-only, or your autovacuum settings are aggressive enough to continuously clean up after writes to the table.
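One way to get a rough picture of the visibility map is to compare relallvisible with relpages in pg_class. This is a sketch, assuming the table name from the question; both numbers are estimates maintained by VACUUM and ANALYZE, so they can lag behind reality.

```sql
-- VACUUM updates the visibility map; ANALYZE refreshes planner statistics.
VACUUM (ANALYZE) thing_types;

-- relallvisible: pages marked all-visible in the visibility map (estimate).
SELECT relpages
     , relallvisible
     , round(100.0 * relallvisible / NULLIF(relpages, 0), 1) AS pct_all_visible
FROM   pg_class
WHERE  oid = 'thing_types'::regclass;
```

A percentage close to 100 suggests index-only scans can skip most heap visits; a low value means many rows still have to be checked in the heap.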



Every additional index adds costs, mostly to write performance, but also through side effects like exhausted cache capacity. (Multiple queries using the same index give it a better chance to stay in cache.) So it's also a question of size. id is typically a very small column (integer or bigint), which makes it a good candidate for this use case.
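To put numbers on the size question, the statistics views can list each index on the table with its on-disk size and how often it has been scanned. A sketch, assuming the table name from the question:

```sql
-- One row per index on thing_types, largest first.
-- idx_scan counts index scans started since statistics were last reset.
SELECT indexrelname
     , pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
     , idx_scan
FROM   pg_stat_user_indexes
WHERE  relname = 'thing_types'
ORDER  BY pg_relation_size(indexrelid) DESC;
```

Comparing the size before and after appending id gives a direct measure of what the extra column costs.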



In particular, adding a column to an index disables the option for HOT (heap-only tuple) updates involving that column. But since id is indexed anyway and typically not updated (being the PK), this is not a problem in this case. Related:

• Redundant data in update statements
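Whether HOT updates actually happen on a table can be read from the cumulative statistics. A sketch, assuming the table name from the question; the counters accumulate since the last statistics reset.

```sql
-- n_tup_hot_upd is the subset of n_tup_upd that needed no index maintenance.
SELECT n_tup_upd
     , n_tup_hot_upd
     , round(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 1) AS pct_hot
FROM   pg_stat_user_tables
WHERE  relname = 'thing_types';
```

If the HOT share drops noticeably after adding a column to an index, that column is probably being updated more often than assumed.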


If you actually get index-only scans out of these indexes most of the time, it typically makes sense to use them. Test with EXPLAIN.
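Such a test could look like this for the query from the question. Note that EXPLAIN with ANALYZE actually executes the statement.

```sql
-- ANALYZE runs the query and reports real row counts and timings;
-- BUFFERS adds how many pages were read or found in cache.
EXPLAIN (ANALYZE, BUFFERS)
SELECT things.t1c1
FROM   things
JOIN   thing_types ON things.thing_type = thing_types.id
WHERE  thing_types.first_lvl_type = 'Book'
AND    thing_types.second_lvl_type = 'Biography';
```

In the plan, look for an "Index Only Scan" node on thing_types and its "Heap Fetches" count; a count near zero means the visibility map did its job. On a small thing_types table the planner may well prefer a sequential scan regardless of the index. Only the thing_types side is affected by the index discussed here: t1c1 still has to come from things.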



There were limitations for partial indexes in older versions. Quoting the release notes of Postgres 9.6:

• Allow use of an index-only scan on a partial index when the index's WHERE clause references columns that are not indexed (Tomas Vondra, Kyotaro Horiguchi)

  For example, an index defined by CREATE INDEX tidx_partial ON t(b) WHERE a > 0 can now be used for an index-only scan by a query that specifies WHERE a > 0 and does not otherwise use a. Previously this was disallowed because a is not listed as an index column.
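The release-note example can be tried out as follows. The table definition and the generated test data are assumptions made for this sketch; only the index definition and the WHERE clause come from the quote.

```sql
-- Hypothetical table and data: about 9 % of the rows have a > 0.
CREATE TABLE t (a int, b int);

INSERT INTO t (a, b)
SELECT g % 100 - 90, g
FROM   generate_series(1, 100000) AS g;

-- Partial index from the release notes: a appears only in the predicate.
CREATE INDEX tidx_partial ON t(b) WHERE a > 0;

-- Set visibility map bits and refresh statistics before testing.
VACUUM (ANALYZE) t;

-- The query repeats the index predicate and does not otherwise use a.
EXPLAIN
SELECT b
FROM   t
WHERE  a > 0;
```

On Postgres 9.6 or later the planner is allowed to pick an index-only scan on tidx_partial for this query; whether it does still depends on its cost estimates.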










edited 9 mins ago

answered Nov 5 '17 at 17:08 by Erwin Brandstetter

























You need to try it and see for your specific query plan. You're making a lot of blanket assumptions about the advice given and even the potential for it to be useful to your query.

• Size
• PostgreSQL major version
• Configuration for costs
• Staleness and accuracy of statistics

All of those things matter.
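The four factors in the list can each be inspected from a psql session. A sketch, assuming the table names from the question; which cost settings matter most is a judgment call, and the three shown here are only common examples.

```sql
-- PostgreSQL version
SELECT version();

-- Size: table including its indexes and TOAST data
SELECT pg_size_pretty(pg_total_relation_size('thing_types')) AS thing_types_size
     , pg_size_pretty(pg_total_relation_size('things'))      AS things_size;

-- Cost configuration (a selection of planner settings)
SHOW random_page_cost;
SHOW seq_page_cost;
SHOW effective_cache_size;

-- Staleness of statistics: last (auto)analyze and rows modified since then
SELECT relname, last_analyze, last_autoanalyze, n_mod_since_analyze
FROM   pg_stat_user_tables
WHERE  relname IN ('things', 'thing_types');
```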



Not to be vague here, but I could conjure a few examples to show you this.

Generally, I wouldn't index something that is already indexed in the table, if for no other reason than this: every additional index that covers a column is one more index that has to be updated when you change the row.







answered Nov 6 '17 at 20:06 by Evan Carroll





























