Window aggregate resulting in slow query on postgres 9.6 The 2019 Stack Overflow Developer...

Loose spokes after only a few rides

What to do when moving next to a bird sanctuary with a loosely-domesticated cat?

Protecting Dualbooting Windows from dangerous code (like rm -rf)

Did 3000BC Egyptians use meteoric iron weapons?

Resizing object distorts it (Illustrator CC 2018)

Is flight data recorder erased after every flight?

Deal with toxic manager when you can't quit

Why do UK politicians seemingly ignore opinion polls on Brexit?

Have you ever entered Singapore using a different passport or name?

Identify boardgame from Big movie

If I score a critical hit on an 18 or higher, what are my chances of getting a critical hit if I roll 3d20?

Is this app Icon Browser Safe/Legit?

Worn-tile Scrabble

Am I thawing this London Broil safely?

Is there any way to tell whether the shot is going to hit you or not?

What is the motivation for a law requiring 2 parties to consent for recording a conversation

Did Scotland spend $250,000 for the slogan "Welcome to Scotland"?

Pokemon Turn Based battle (Python)

If a Druid sees an animal’s corpse, can they wild shape into that animal?

What is the accessibility of a package's `Private` context variables?

Why didn't the Event Horizon Telescope team mention Sagittarius A*?

What does Linus Torvalds mean when he says that Git "never ever" tracks a file?

Do these rules for Critical Successes and Critical Failures seem Fair?

Why did Acorn's A3000 have red function keys?



Window aggregate resulting in slow query on postgres 9.6



The 2019 Stack Overflow Developer Survey Results Are InPostgres query plan has high rowcount, slow executionQuery with merge join horrendously slowVery Slow QuerySlow postgres query that uses quick sort instead of hash-aggregatePostgresSQL recursive query running slow with low load and almost no dataWhy is the Postgresql planner choosing the slower merge join (instead of hash join)Nested Loop with Materialize resulting in very slow query on postgresql 9.5Optimizing slow performance of simple SELECT querySlow query performance when searching for a particular value, but fast with a different value on the same columnPostgreSQL: Slow JOIN between multiple tables.





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ margin-bottom:0;
}







1















I'm using Postgresql 9.6 and I'm running into a performance issue. I have a relatively simple query:



explain analyze
SELECT * FROM (
SELECT A.person_id, O.session_id, timestamp_ AS step_ts_min ,rank() over (partition by A.person_id order by timestamp_) as rn
FROM event_page O
JOIN alias A ON (O.person_alias = A.alias)
WHERE O.timestamp_ between '2017-02-18 00:00:00.000' AND '2017-03-13 00:00:00.000'
and O.location_host like '2016.myhost.ca'
) as innerstep WHERE rn=1


The event_page table is actually a view over monthly tables union-ed together, you can see that in the plan.



This query takes over 2.5 minutes to run and you can see the lions share of that time is used by quicksort, along with 1.8G of ram!!!



Here is the execution plan: https://explain.depesz.com/s/v51h



Is there a way to improve performance and optimize the query?










share|improve this question














bumped to the homepage by Community 8 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.






















    1















    I'm using Postgresql 9.6 and I'm running into a performance issue. I have a relatively simple query:



    explain analyze
    SELECT * FROM (
    SELECT A.person_id, O.session_id, timestamp_ AS step_ts_min ,rank() over (partition by A.person_id order by timestamp_) as rn
    FROM event_page O
    JOIN alias A ON (O.person_alias = A.alias)
    WHERE O.timestamp_ between '2017-02-18 00:00:00.000' AND '2017-03-13 00:00:00.000'
    and O.location_host like '2016.myhost.ca'
    ) as innerstep WHERE rn=1


    The event_page table is actually a view over monthly tables union-ed together, you can see that in the plan.



    This query takes over 2.5 minutes to run and you can see the lions share of that time is used by quicksort, along with 1.8G of ram!!!



    Here is the execution plan: https://explain.depesz.com/s/v51h



    Is there a way to improve performance and optimize the query?










    share|improve this question














    bumped to the homepage by Community 8 mins ago


    This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.


















      1












      1








      1








      I'm using Postgresql 9.6 and I'm running into a performance issue. I have a relatively simple query:



      explain analyze
      SELECT * FROM (
      SELECT A.person_id, O.session_id, timestamp_ AS step_ts_min ,rank() over (partition by A.person_id order by timestamp_) as rn
      FROM event_page O
      JOIN alias A ON (O.person_alias = A.alias)
      WHERE O.timestamp_ between '2017-02-18 00:00:00.000' AND '2017-03-13 00:00:00.000'
      and O.location_host like '2016.myhost.ca'
      ) as innerstep WHERE rn=1


      The event_page table is actually a view over monthly tables union-ed together, you can see that in the plan.



      This query takes over 2.5 minutes to run and you can see the lions share of that time is used by quicksort, along with 1.8G of ram!!!



      Here is the execution plan: https://explain.depesz.com/s/v51h



      Is there a way to improve performance and optimize the query?










      share|improve this question














      I'm using Postgresql 9.6 and I'm running into a performance issue. I have a relatively simple query:



      explain analyze
      SELECT * FROM (
      SELECT A.person_id, O.session_id, timestamp_ AS step_ts_min ,rank() over (partition by A.person_id order by timestamp_) as rn
      FROM event_page O
      JOIN alias A ON (O.person_alias = A.alias)
      WHERE O.timestamp_ between '2017-02-18 00:00:00.000' AND '2017-03-13 00:00:00.000'
      and O.location_host like '2016.myhost.ca'
      ) as innerstep WHERE rn=1


      The event_page table is actually a view over monthly tables union-ed together, you can see that in the plan.



      This query takes over 2.5 minutes to run and you can see the lions share of that time is used by quicksort, along with 1.8G of ram!!!



      Here is the execution plan: https://explain.depesz.com/s/v51h



      Is there a way to improve performance and optimize the query?







      postgresql query-performance execution-plan window-functions






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked Mar 20 '17 at 18:20









      maxTrialfiremaxTrialfire

      385310




      385310





      bumped to the homepage by Community 8 mins ago


      This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.







      bumped to the homepage by Community 8 mins ago


      This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
























          1 Answer
          1






          active

          oldest

          votes


















          0














          Your window function has to sort 10,453,164 rows. Look at cutting that down. Sorting 10.2 million rows in 158 seconds isn't too bad.



          Sort (cost=3,047,865.12..3,073,554.61 rows=10,275,796 width=55) (actual time=144,515.353..148,038.268 rows=10,453,164 loops=1)
          Sort Key: a.person_id, event_page_2016_8.timestamp_
          Sort Method: quicksort Memory: 1863192kB


          I also think there is something wrong with your partitioning.



          ->  Index Only Scan using ep_2016_8_host_ts_a_ses on event_page_2016_8  (cost=0.55..4.57 rows=1 width=42) (actual time=0.016..0.016 rows=0 loops=1)
          Index Cond: ((location_host = '2016.myhost.ca'::text) AND (timestamp_ >= '2017-02-18 00:00:00'::timestamp without time zone) AND (timestamp_ <= '2017-03-13 00:00:00'::timestamp without time zone))


          Why does this query search a table called ep_2016_8_host_ts_a_ses when the timestamp clearly exists in 2017-02, and 2017-03. The query planner is supposed to know better. Look up constraint exclusion



          My guess is here that the query can't make use of the partitions index to sort the table. It should be able to -- if all partitions were indexed by timestamp_ I would think it could walk through the indexes in parallel and get just the first rank fairly easily. I could be wrong though. I may have to play with this later. In the mean time, try getting constraint exclusion working and giving it another go.






          share|improve this answer


























          • Thanks Evan. I didnt know about constraint exclusion. Seems like it would be better than my current approach. That being said those index scan take almost no time. I'm concerned that sorting 10.2M rows in 158 seconds using 2GB of RAM "isn't too bad". I'd say its very bad.

            – maxTrialfire
            Mar 20 '17 at 22:17












          Your Answer








          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "182"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f167669%2fwindow-aggregate-resulting-in-slow-query-on-postgres-9-6%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          0














          Your window function has to sort 10,453,164 rows. Look at cutting that down. Sorting 10.2 million rows in 158 seconds isn't too bad.



          Sort (cost=3,047,865.12..3,073,554.61 rows=10,275,796 width=55) (actual time=144,515.353..148,038.268 rows=10,453,164 loops=1)
          Sort Key: a.person_id, event_page_2016_8.timestamp_
          Sort Method: quicksort Memory: 1863192kB


          I also think there is something wrong with your partitioning.



          ->  Index Only Scan using ep_2016_8_host_ts_a_ses on event_page_2016_8  (cost=0.55..4.57 rows=1 width=42) (actual time=0.016..0.016 rows=0 loops=1)
          Index Cond: ((location_host = '2016.myhost.ca'::text) AND (timestamp_ >= '2017-02-18 00:00:00'::timestamp without time zone) AND (timestamp_ <= '2017-03-13 00:00:00'::timestamp without time zone))


          Why does this query search a table called ep_2016_8_host_ts_a_ses when the timestamp clearly exists in 2017-02, and 2017-03. The query planner is supposed to know better. Look up constraint exclusion



          My guess is here that the query can't make use of the partitions index to sort the table. It should be able to -- if all partitions were indexed by timestamp_ I would think it could walk through the indexes in parallel and get just the first rank fairly easily. I could be wrong though. I may have to play with this later. In the mean time, try getting constraint exclusion working and giving it another go.






          share|improve this answer


























          • Thanks Evan. I didnt know about constraint exclusion. Seems like it would be better than my current approach. That being said those index scan take almost no time. I'm concerned that sorting 10.2M rows in 158 seconds using 2GB of RAM "isn't too bad". I'd say its very bad.

            – maxTrialfire
            Mar 20 '17 at 22:17
















          0














          Your window function has to sort 10,453,164 rows. Look at cutting that down. Sorting 10.2 million rows in 158 seconds isn't too bad.



          Sort (cost=3,047,865.12..3,073,554.61 rows=10,275,796 width=55) (actual time=144,515.353..148,038.268 rows=10,453,164 loops=1)
          Sort Key: a.person_id, event_page_2016_8.timestamp_
          Sort Method: quicksort Memory: 1863192kB


          I also think there is something wrong with your partitioning.



          ->  Index Only Scan using ep_2016_8_host_ts_a_ses on event_page_2016_8  (cost=0.55..4.57 rows=1 width=42) (actual time=0.016..0.016 rows=0 loops=1)
          Index Cond: ((location_host = '2016.myhost.ca'::text) AND (timestamp_ >= '2017-02-18 00:00:00'::timestamp without time zone) AND (timestamp_ <= '2017-03-13 00:00:00'::timestamp without time zone))


          Why does this query search a table called ep_2016_8_host_ts_a_ses when the timestamp clearly exists in 2017-02, and 2017-03. The query planner is supposed to know better. Look up constraint exclusion



          My guess is here that the query can't make use of the partitions index to sort the table. It should be able to -- if all partitions were indexed by timestamp_ I would think it could walk through the indexes in parallel and get just the first rank fairly easily. I could be wrong though. I may have to play with this later. In the mean time, try getting constraint exclusion working and giving it another go.






          share|improve this answer


























          • Thanks Evan. I didnt know about constraint exclusion. Seems like it would be better than my current approach. That being said those index scan take almost no time. I'm concerned that sorting 10.2M rows in 158 seconds using 2GB of RAM "isn't too bad". I'd say its very bad.

            – maxTrialfire
            Mar 20 '17 at 22:17














          0












          0








          0







          Your window function has to sort 10,453,164 rows. Look at cutting that down. Sorting 10.2 million rows in 158 seconds isn't too bad.



          Sort (cost=3,047,865.12..3,073,554.61 rows=10,275,796 width=55) (actual time=144,515.353..148,038.268 rows=10,453,164 loops=1)
          Sort Key: a.person_id, event_page_2016_8.timestamp_
          Sort Method: quicksort Memory: 1863192kB


          I also think there is something wrong with your partitioning.



          ->  Index Only Scan using ep_2016_8_host_ts_a_ses on event_page_2016_8  (cost=0.55..4.57 rows=1 width=42) (actual time=0.016..0.016 rows=0 loops=1)
          Index Cond: ((location_host = '2016.myhost.ca'::text) AND (timestamp_ >= '2017-02-18 00:00:00'::timestamp without time zone) AND (timestamp_ <= '2017-03-13 00:00:00'::timestamp without time zone))


          Why does this query search a table called ep_2016_8_host_ts_a_ses when the timestamp clearly exists in 2017-02, and 2017-03. The query planner is supposed to know better. Look up constraint exclusion



          My guess is here that the query can't make use of the partitions index to sort the table. It should be able to -- if all partitions were indexed by timestamp_ I would think it could walk through the indexes in parallel and get just the first rank fairly easily. I could be wrong though. I may have to play with this later. In the mean time, try getting constraint exclusion working and giving it another go.






          share|improve this answer















          Your window function has to sort 10,453,164 rows. Look at cutting that down. Sorting 10.2 million rows in 158 seconds isn't too bad.



          Sort (cost=3,047,865.12..3,073,554.61 rows=10,275,796 width=55) (actual time=144,515.353..148,038.268 rows=10,453,164 loops=1)
          Sort Key: a.person_id, event_page_2016_8.timestamp_
          Sort Method: quicksort Memory: 1863192kB


          I also think there is something wrong with your partitioning.



          ->  Index Only Scan using ep_2016_8_host_ts_a_ses on event_page_2016_8  (cost=0.55..4.57 rows=1 width=42) (actual time=0.016..0.016 rows=0 loops=1)
          Index Cond: ((location_host = '2016.myhost.ca'::text) AND (timestamp_ >= '2017-02-18 00:00:00'::timestamp without time zone) AND (timestamp_ <= '2017-03-13 00:00:00'::timestamp without time zone))


          Why does this query search a table called ep_2016_8_host_ts_a_ses when the timestamp clearly exists in 2017-02, and 2017-03. The query planner is supposed to know better. Look up constraint exclusion



          My guess is here that the query can't make use of the partitions index to sort the table. It should be able to -- if all partitions were indexed by timestamp_ I would think it could walk through the indexes in parallel and get just the first rank fairly easily. I could be wrong though. I may have to play with this later. In the mean time, try getting constraint exclusion working and giving it another go.







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Mar 20 '17 at 19:16

























          answered Mar 20 '17 at 19:06









          Evan CarrollEvan Carroll

          33.5k1076233




          33.5k1076233













          • Thanks Evan. I didnt know about constraint exclusion. Seems like it would be better than my current approach. That being said those index scan take almost no time. I'm concerned that sorting 10.2M rows in 158 seconds using 2GB of RAM "isn't too bad". I'd say its very bad.

            – maxTrialfire
            Mar 20 '17 at 22:17



















          • Thanks Evan. I didnt know about constraint exclusion. Seems like it would be better than my current approach. That being said those index scan take almost no time. I'm concerned that sorting 10.2M rows in 158 seconds using 2GB of RAM "isn't too bad". I'd say its very bad.

            – maxTrialfire
            Mar 20 '17 at 22:17

















          Thanks Evan. I didnt know about constraint exclusion. Seems like it would be better than my current approach. That being said those index scan take almost no time. I'm concerned that sorting 10.2M rows in 158 seconds using 2GB of RAM "isn't too bad". I'd say its very bad.

          – maxTrialfire
          Mar 20 '17 at 22:17





          Thanks Evan. I didnt know about constraint exclusion. Seems like it would be better than my current approach. That being said those index scan take almost no time. I'm concerned that sorting 10.2M rows in 158 seconds using 2GB of RAM "isn't too bad". I'd say its very bad.

          – maxTrialfire
          Mar 20 '17 at 22:17


















          draft saved

          draft discarded




















































          Thanks for contributing an answer to Database Administrators Stack Exchange!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f167669%2fwindow-aggregate-resulting-in-slow-query-on-postgres-9-6%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          Anexo:Material bélico de la Fuerza Aérea de Chile Índice Aeronaves Defensa...

          Always On Availability groups resolving state after failover - Remote harden of transaction...

          update json value to null Announcing the arrival of Valued Associate #679: Cesar Manara ...