How to optimize an ordered date range query with several OR filters (for streaming)?Optimizing ORDER BY in a...

Stream.findFirst different than Optional.of?

What happens if you declare more than $10,000 at the US border?

The Longest Chess Game

Coworker is trying to get me to sign his petition to run for office. How to decline politely?

Badly designed reimbursement form. What does that say about the company?

我可不觉得 - agree or disagree?

Why is Bernie Sanders maximum accepted donation on actblue $5600?

In a world with multiracial creatures, what word can be used instead of mankind?

How to encircle section of matrix in LaTeX?

Found a major flaw in paper from home university – to which I would like to return

Have the conservatives lost the working majority and if so, what does this mean?

Boss asked me to sign a resignation paper without a date on it along with my new contract

Ramanujan's radical and how we define an infinite nested radical

Current measurement op-amp calculation

Discouraging missile alpha strikes

Which was the first story to feature space elevators?

How can I differentiate duration vs starting time

How can a kingdom keep the secret of a missing monarch from the public?

Define function that behaves almost identically to Mathematica function

Why would you use 2 alternate layout buttons instead of 1, when only one can be selected at once

Are all power cords made equal?

How do I know my password or backup information is not being shared when creating a new wallet?

How to not forget my phone in the bathroom?

How can changes in personality/values of a person who turned into a vampire be explained?

How to optimize an ordered date range query with several OR filters (for streaming)?

Optimizing ORDER BY in a full text search queryPostgreSQL 9.2 (PostGIS) performance problemHow to index WHERE (start_date >= '2013-12-15')How can I speed up a Postgres query containing lots of Joins with an ILIKE conditionpostgres explain plan with giant gaps between operationsSlow fulltext search due to wildly inaccurate row estimatespostgresql 9.2 hash join issueSorting killing my postgresql queryWhy is this query with WHERE, ORDER BY and LIMIT so slow?How to index two tables for JOINed query optimisation

I have a table of data which is be outputted into a stream, so the upper bound of the cost doesn't matter as much as the lower bound. The table contains about 200M rows. The question is how to structurize indexes and queries with order by such that Postgres would use the index instead of doing additional sorting.

Each entry is characterized by 5 fields each of which is not unique on its own:

date of occurrence (timestamp)

num (order within a timestamp)

type1

type2

id_in_type2 (never used without type2)

A typical query for this data is bounded by date (say, from 1 day before now until now) and zero or more (about 5 actually) filters by any combination of type1, type2 and id_in_type2 connected by OR, and expects a stream of data (it doesn't matter how long it takes to output it completely, but it should start streaming as fast as possible), example:

SELECT *

FROM MY_TABLE

WHERE date > ?

AND (

  (type1 = ?)

  OR (type2 = ?)

  OR (type2 = ? AND associated_id = ?)

  OR ...

)

ORDER BY date, num;

When there are no additional filters or there is just one, a simple index by (date, num, type1, type2, associated_id) works perfectly - since the results are stored along the sort order of the index, there's no problem to start outputting the results.

However, when things get more complicated, no combination of structures of my indexes and my queries seem to yield satisfying results.
Here's roughly what I tried so far:

Indexes: I tried having a separate index for (date, num) and 3 separate ones for other fields (like in the example below), I tried adding date to filters 2-4, i tried putting date and num as first fields in the index as well as the last ones.

CREATE INDEX my_table_1 ON my_table (date, num);

CREATE INDEX my_table_2 ON my_table (type1);

CREATE INDEX my_table_3 ON my_table (type2, id_in_type2);

CREATE INDEX my_table_4 ON my_table (type1, type2, id_in_type2);

Queries: I tried to select by date range and then by filters, I tried putting the filter by date to the end, I tried splitting the filters to separate ones and make a union.

No matter what I do, it won't apply the filters along the (date, num) index order, it will always do the sorting, yielding lower cost about the same as the upper cost.

Oh yes the table is also partitioned by date.

Any ideas on structurizing the indexes? Maybe I'd be better off returning all results with matching date and make the filtering in the app that's reading the stream?

Here's an execution plan for a naively constructed query with index setting as shown above (I ran this one on a test DB which is not partitioned and has 2M rows total, I'm presuming that on production table numbers will be different but orders of magnitude of difference between lower cost and upper cost are going to be about the same)

explain analyze

select * from my_table

where date >= now() - '1 day'::interval

  and (

    (type2 = 'T2A' and type1 = 'T1A')

    or (type2 = 'T2B' and type1 = 'T1B1')

    or (type1 = 'T1C')

    or (type2 = 'T2B' and type1 = 'T1B2')

  )

order by date, num;



Sort  (cost=157714.25..158431.63 rows=286949 width=262) (actual time=2153.327..2313.075 rows=600314 loops=1)

  Sort Key: date, num

  Sort Method: external merge  Disk: 144376kB

  ->  Bitmap Heap Scan on my_table  (cost=16602.76..96390.20 rows=286949 width=262) (actual time=199.193..1281.214 rows=600314 loops=1)

        Recheck Cond: (((type1)::text = 'T1A'::text) OR (((type1)::text = 'T1B1'::text) AND ((type2)::text = 'T2B'::text)) OR ((type1)::text = 'T1C'::text) OR (((type1)::text = 'T1B2'::text) AND ((type2)::text = 'T2B'::text)))

        Rows Removed by Index Recheck: 455892

        Filter: ((date >= (now() - '1 day'::interval)) AND ((((type2)::text = 'T2A'::text) AND ((type1)::text = 'T1A'::text)) OR (((type2)::text = 'T2B'::text) AND ((type1)::text = 'T1B1'::text)) OR ((type1)::text = 'T1C'::text) OR (((type2)::text = 'T2B'::text) AND ((type1)::text = 'T1B2'::text))))

        Rows Removed by Filter: 2

        Heap Blocks: exact=42763 lossy=26546

        ->  BitmapOr  (cost=16602.76..16602.76 rows=299384 width=0) (actual time=188.980..188.980 rows=0 loops=1)

              ->  Bitmap Index Scan on my_table_2  (cost=0.00..4.44 rows=1 width=0) (actual time=0.037..0.037 rows=30 loops=1)

                    Index Cond: ((type1)::text = 'T1A'::text)

              ->  Bitmap Index Scan on my_table_4  (cost=0.00..8094.98 rows=148643 width=0) (actual time=93.562..93.562 rows=300240 loops=1)

                    Index Cond: (((type1)::text = 'T1B1'::text) AND ((type2)::text = 'T2B'::text))

              ->  Bitmap Index Scan on my_table_2  (cost=0.00..4.44 rows=1 width=0) (actual time=0.059..0.059 rows=28 loops=1)

                    Index Cond: ((type1)::text = 'T1C'::text)

              ->  Bitmap Index Scan on my_table_4  (cost=0.00..8211.96 rows=150741 width=0) (actual time=95.315..95.315 rows=300018 loops=1)

                    Index Cond: (((type1)::text = 'T1B2'::text) AND ((type2)::text = 'T2B'::text))

Planning time: 0.435 ms

Execution time: 2356.305 ms

The query may return 100s of thousands of rows.

edited Apr 23 '17 at 17:49

Paul White♦

52.2k14278450

asked Nov 16 '16 at 8:51

Sergey

1162

bumped to the homepage by Community♦ 3 mins ago

This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.

Please also show the execution plan for the desirable query plan, where you just specify date and chop off all the OR conditions.

– jjanes
Nov 17 '16 at 21:02

Put all 4 fields in the index starting with the date,num sort key.

– eckes
Apr 23 '17 at 13:58

I am not sure if PostgreSQL will start sending data to the client before the server has completed... So, you can speed-up the query, but I am not sure at all that the server will start sending any data to the client until it has processed all the query.

– joanolo
Apr 23 '17 at 20:31

add a comment |

Each entry is characterized by 5 fields each of which is not unique on its own:

date of occurrence (timestamp)

num (order within a timestamp)

type1

type2

id_in_type2 (never used without type2)

SELECT *

FROM MY_TABLE

WHERE date > ?

AND (

  (type1 = ?)

  OR (type2 = ?)

  OR (type2 = ? AND associated_id = ?)

  OR ...

)

ORDER BY date, num;

However, when things get more complicated, no combination of structures of my indexes and my queries seem to yield satisfying results.
Here's roughly what I tried so far:

CREATE INDEX my_table_1 ON my_table (date, num);

CREATE INDEX my_table_2 ON my_table (type1);

CREATE INDEX my_table_3 ON my_table (type2, id_in_type2);

CREATE INDEX my_table_4 ON my_table (type1, type2, id_in_type2);

Queries: I tried to select by date range and then by filters, I tried putting the filter by date to the end, I tried splitting the filters to separate ones and make a union.

No matter what I do, it won't apply the filters along the (date, num) index order, it will always do the sorting, yielding lower cost about the same as the upper cost.

Oh yes the table is also partitioned by date.

Any ideas on structurizing the indexes? Maybe I'd be better off returning all results with matching date and make the filtering in the app that's reading the stream?

explain analyze

select * from my_table

where date >= now() - '1 day'::interval

  and (

    (type2 = 'T2A' and type1 = 'T1A')

    or (type2 = 'T2B' and type1 = 'T1B1')

    or (type1 = 'T1C')

    or (type2 = 'T2B' and type1 = 'T1B2')

  )

order by date, num;



Sort  (cost=157714.25..158431.63 rows=286949 width=262) (actual time=2153.327..2313.075 rows=600314 loops=1)

  Sort Key: date, num

  Sort Method: external merge  Disk: 144376kB

  ->  Bitmap Heap Scan on my_table  (cost=16602.76..96390.20 rows=286949 width=262) (actual time=199.193..1281.214 rows=600314 loops=1)

        Recheck Cond: (((type1)::text = 'T1A'::text) OR (((type1)::text = 'T1B1'::text) AND ((type2)::text = 'T2B'::text)) OR ((type1)::text = 'T1C'::text) OR (((type1)::text = 'T1B2'::text) AND ((type2)::text = 'T2B'::text)))

        Rows Removed by Index Recheck: 455892

        Filter: ((date >= (now() - '1 day'::interval)) AND ((((type2)::text = 'T2A'::text) AND ((type1)::text = 'T1A'::text)) OR (((type2)::text = 'T2B'::text) AND ((type1)::text = 'T1B1'::text)) OR ((type1)::text = 'T1C'::text) OR (((type2)::text = 'T2B'::text) AND ((type1)::text = 'T1B2'::text))))

        Rows Removed by Filter: 2

        Heap Blocks: exact=42763 lossy=26546

        ->  BitmapOr  (cost=16602.76..16602.76 rows=299384 width=0) (actual time=188.980..188.980 rows=0 loops=1)

              ->  Bitmap Index Scan on my_table_2  (cost=0.00..4.44 rows=1 width=0) (actual time=0.037..0.037 rows=30 loops=1)

                    Index Cond: ((type1)::text = 'T1A'::text)

              ->  Bitmap Index Scan on my_table_4  (cost=0.00..8094.98 rows=148643 width=0) (actual time=93.562..93.562 rows=300240 loops=1)

                    Index Cond: (((type1)::text = 'T1B1'::text) AND ((type2)::text = 'T2B'::text))

              ->  Bitmap Index Scan on my_table_2  (cost=0.00..4.44 rows=1 width=0) (actual time=0.059..0.059 rows=28 loops=1)

                    Index Cond: ((type1)::text = 'T1C'::text)

              ->  Bitmap Index Scan on my_table_4  (cost=0.00..8211.96 rows=150741 width=0) (actual time=95.315..95.315 rows=300018 loops=1)

                    Index Cond: (((type1)::text = 'T1B2'::text) AND ((type2)::text = 'T2B'::text))

Planning time: 0.435 ms

Execution time: 2356.305 ms

The query may return 100s of thousands of rows.

edited Apr 23 '17 at 17:49

Paul White♦

52.2k14278450

asked Nov 16 '16 at 8:51

Sergey

1162

bumped to the homepage by Community♦ 3 mins ago

This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.

Please also show the execution plan for the desirable query plan, where you just specify date and chop off all the OR conditions.

– jjanes
Nov 17 '16 at 21:02

Put all 4 fields in the index starting with the date,num sort key.

– eckes
Apr 23 '17 at 13:58

I am not sure if PostgreSQL will start sending data to the client before the server has completed... So, you can speed-up the query, but I am not sure at all that the server will start sending any data to the client until it has processed all the query.

– joanolo
Apr 23 '17 at 20:31

add a comment |

Each entry is characterized by 5 fields each of which is not unique on its own:

date of occurrence (timestamp)

num (order within a timestamp)

type1

type2

id_in_type2 (never used without type2)

SELECT *

FROM MY_TABLE

WHERE date > ?

AND (

  (type1 = ?)

  OR (type2 = ?)

  OR (type2 = ? AND associated_id = ?)

  OR ...

)

ORDER BY date, num;

However, when things get more complicated, no combination of structures of my indexes and my queries seem to yield satisfying results.
Here's roughly what I tried so far:

CREATE INDEX my_table_1 ON my_table (date, num);

CREATE INDEX my_table_2 ON my_table (type1);

CREATE INDEX my_table_3 ON my_table (type2, id_in_type2);

CREATE INDEX my_table_4 ON my_table (type1, type2, id_in_type2);

Queries: I tried to select by date range and then by filters, I tried putting the filter by date to the end, I tried splitting the filters to separate ones and make a union.

No matter what I do, it won't apply the filters along the (date, num) index order, it will always do the sorting, yielding lower cost about the same as the upper cost.

Oh yes the table is also partitioned by date.

Any ideas on structurizing the indexes? Maybe I'd be better off returning all results with matching date and make the filtering in the app that's reading the stream?

explain analyze

select * from my_table

where date >= now() - '1 day'::interval

  and (

    (type2 = 'T2A' and type1 = 'T1A')

    or (type2 = 'T2B' and type1 = 'T1B1')

    or (type1 = 'T1C')

    or (type2 = 'T2B' and type1 = 'T1B2')

  )

order by date, num;



Sort  (cost=157714.25..158431.63 rows=286949 width=262) (actual time=2153.327..2313.075 rows=600314 loops=1)

  Sort Key: date, num

  Sort Method: external merge  Disk: 144376kB

  ->  Bitmap Heap Scan on my_table  (cost=16602.76..96390.20 rows=286949 width=262) (actual time=199.193..1281.214 rows=600314 loops=1)

        Recheck Cond: (((type1)::text = 'T1A'::text) OR (((type1)::text = 'T1B1'::text) AND ((type2)::text = 'T2B'::text)) OR ((type1)::text = 'T1C'::text) OR (((type1)::text = 'T1B2'::text) AND ((type2)::text = 'T2B'::text)))

        Rows Removed by Index Recheck: 455892

        Filter: ((date >= (now() - '1 day'::interval)) AND ((((type2)::text = 'T2A'::text) AND ((type1)::text = 'T1A'::text)) OR (((type2)::text = 'T2B'::text) AND ((type1)::text = 'T1B1'::text)) OR ((type1)::text = 'T1C'::text) OR (((type2)::text = 'T2B'::text) AND ((type1)::text = 'T1B2'::text))))

        Rows Removed by Filter: 2

        Heap Blocks: exact=42763 lossy=26546

        ->  BitmapOr  (cost=16602.76..16602.76 rows=299384 width=0) (actual time=188.980..188.980 rows=0 loops=1)

              ->  Bitmap Index Scan on my_table_2  (cost=0.00..4.44 rows=1 width=0) (actual time=0.037..0.037 rows=30 loops=1)

                    Index Cond: ((type1)::text = 'T1A'::text)

              ->  Bitmap Index Scan on my_table_4  (cost=0.00..8094.98 rows=148643 width=0) (actual time=93.562..93.562 rows=300240 loops=1)

                    Index Cond: (((type1)::text = 'T1B1'::text) AND ((type2)::text = 'T2B'::text))

              ->  Bitmap Index Scan on my_table_2  (cost=0.00..4.44 rows=1 width=0) (actual time=0.059..0.059 rows=28 loops=1)

                    Index Cond: ((type1)::text = 'T1C'::text)

              ->  Bitmap Index Scan on my_table_4  (cost=0.00..8211.96 rows=150741 width=0) (actual time=95.315..95.315 rows=300018 loops=1)

                    Index Cond: (((type1)::text = 'T1B2'::text) AND ((type2)::text = 'T2B'::text))

Planning time: 0.435 ms

Execution time: 2356.305 ms

The query may return 100s of thousands of rows.

edited Apr 23 '17 at 17:49

Paul White♦

52.2k14278450

asked Nov 16 '16 at 8:51

Sergey

1162

Each entry is characterized by 5 fields each of which is not unique on its own:

date of occurrence (timestamp)

num (order within a timestamp)

type1

type2

id_in_type2 (never used without type2)

SELECT *

FROM MY_TABLE

WHERE date > ?

AND (

  (type1 = ?)

  OR (type2 = ?)

  OR (type2 = ? AND associated_id = ?)

  OR ...

)

ORDER BY date, num;

However, when things get more complicated, no combination of structures of my indexes and my queries seem to yield satisfying results.
Here's roughly what I tried so far:

CREATE INDEX my_table_1 ON my_table (date, num);

CREATE INDEX my_table_2 ON my_table (type1);

CREATE INDEX my_table_3 ON my_table (type2, id_in_type2);

CREATE INDEX my_table_4 ON my_table (type1, type2, id_in_type2);

Queries: I tried to select by date range and then by filters, I tried putting the filter by date to the end, I tried splitting the filters to separate ones and make a union.

No matter what I do, it won't apply the filters along the (date, num) index order, it will always do the sorting, yielding lower cost about the same as the upper cost.

Oh yes the table is also partitioned by date.

Any ideas on structurizing the indexes? Maybe I'd be better off returning all results with matching date and make the filtering in the app that's reading the stream?

explain analyze

select * from my_table

where date >= now() - '1 day'::interval

  and (

    (type2 = 'T2A' and type1 = 'T1A')

    or (type2 = 'T2B' and type1 = 'T1B1')

    or (type1 = 'T1C')

    or (type2 = 'T2B' and type1 = 'T1B2')

  )

order by date, num;



Sort  (cost=157714.25..158431.63 rows=286949 width=262) (actual time=2153.327..2313.075 rows=600314 loops=1)

  Sort Key: date, num

  Sort Method: external merge  Disk: 144376kB

  ->  Bitmap Heap Scan on my_table  (cost=16602.76..96390.20 rows=286949 width=262) (actual time=199.193..1281.214 rows=600314 loops=1)

        Recheck Cond: (((type1)::text = 'T1A'::text) OR (((type1)::text = 'T1B1'::text) AND ((type2)::text = 'T2B'::text)) OR ((type1)::text = 'T1C'::text) OR (((type1)::text = 'T1B2'::text) AND ((type2)::text = 'T2B'::text)))

        Rows Removed by Index Recheck: 455892

        Filter: ((date >= (now() - '1 day'::interval)) AND ((((type2)::text = 'T2A'::text) AND ((type1)::text = 'T1A'::text)) OR (((type2)::text = 'T2B'::text) AND ((type1)::text = 'T1B1'::text)) OR ((type1)::text = 'T1C'::text) OR (((type2)::text = 'T2B'::text) AND ((type1)::text = 'T1B2'::text))))

        Rows Removed by Filter: 2

        Heap Blocks: exact=42763 lossy=26546

        ->  BitmapOr  (cost=16602.76..16602.76 rows=299384 width=0) (actual time=188.980..188.980 rows=0 loops=1)

              ->  Bitmap Index Scan on my_table_2  (cost=0.00..4.44 rows=1 width=0) (actual time=0.037..0.037 rows=30 loops=1)

                    Index Cond: ((type1)::text = 'T1A'::text)

              ->  Bitmap Index Scan on my_table_4  (cost=0.00..8094.98 rows=148643 width=0) (actual time=93.562..93.562 rows=300240 loops=1)

                    Index Cond: (((type1)::text = 'T1B1'::text) AND ((type2)::text = 'T2B'::text))

              ->  Bitmap Index Scan on my_table_2  (cost=0.00..4.44 rows=1 width=0) (actual time=0.059..0.059 rows=28 loops=1)

                    Index Cond: ((type1)::text = 'T1C'::text)

              ->  Bitmap Index Scan on my_table_4  (cost=0.00..8211.96 rows=150741 width=0) (actual time=95.315..95.315 rows=300018 loops=1)

                    Index Cond: (((type1)::text = 'T1B2'::text) AND ((type2)::text = 'T2B'::text))

Planning time: 0.435 ms

Execution time: 2356.305 ms

The query may return 100s of thousands of rows.

postgresql query-performance

edited Apr 23 '17 at 17:49

Paul White♦

52.2k14278450

asked Nov 16 '16 at 8:51

Sergey

1162

edited Apr 23 '17 at 17:49

Paul White♦

52.2k14278450

asked Nov 16 '16 at 8:51

Sergey

1162

edited Apr 23 '17 at 17:49

Paul White♦

52.2k14278450

edited Apr 23 '17 at 17:49

Paul White♦

52.2k14278450

edited Apr 23 '17 at 17:49

Paul White♦

52.2k14278450

asked Nov 16 '16 at 8:51

Sergey

1162

asked Nov 16 '16 at 8:51

Sergey

1162

asked Nov 16 '16 at 8:51

Sergey

1162

bumped to the homepage by Community♦ 3 mins ago

This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.

bumped to the homepage by Community♦ 3 mins ago

This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.

Please also show the execution plan for the desirable query plan, where you just specify date and chop off all the OR conditions.

– jjanes
Nov 17 '16 at 21:02

Put all 4 fields in the index starting with the date,num sort key.

– eckes
Apr 23 '17 at 13:58

I am not sure if PostgreSQL will start sending data to the client before the server has completed... So, you can speed-up the query, but I am not sure at all that the server will start sending any data to the client until it has processed all the query.

– joanolo
Apr 23 '17 at 20:31

add a comment |

Please also show the execution plan for the desirable query plan, where you just specify date and chop off all the OR conditions.

– jjanes
Nov 17 '16 at 21:02

Put all 4 fields in the index starting with the date,num sort key.

– eckes
Apr 23 '17 at 13:58

I am not sure if PostgreSQL will start sending data to the client before the server has completed... So, you can speed-up the query, but I am not sure at all that the server will start sending any data to the client until it has processed all the query.

– joanolo
Apr 23 '17 at 20:31

Please also show the execution plan for the desirable query plan, where you just specify date and chop off all the OR conditions.

– jjanes
Nov 17 '16 at 21:02

Put all 4 fields in the index starting with the date,num sort key.

– eckes
Apr 23 '17 at 13:58

I am not sure if PostgreSQL will start sending data to the client before the server has completed... So, you can speed-up the query, but I am not sure at all that the server will start sending any data to the client until it has processed all the query.

– joanolo
Apr 23 '17 at 20:31

add a comment |

2 Answers
2

active

oldest

votes

Have you tried making 4 separate queries, and then using union all, so that you can get rid of the or statements?

Example:

select * from my_table

where date >= now() - '1 day'::interval

  and type2 = 'T2A' 

  and type1 = 'T1A'



union all

select * from my_table

where date >= now() - '1 day'::interval

  and type2 = 'T2B' 

  and type1 = 'T1B1'

...

edited Apr 23 '17 at 17:46

Paul White♦

52.2k14278450

answered Apr 23 '17 at 13:53

oraclequestions7

add a comment |

I would try a couple of things. First: instead of using an expression date >= now() - '1 day'::interval, I would compute the constant value on the server side and actually try the query like this:

explain analyze     

select * from my_table

where date >= '2017-04-22'

  and (

    (type2 = 'T2A' and type1 = 'T1A')

    or (type2 = 'T2B' and type1 = 'T1B1')

    or (type1 = 'T1C')

    or (type2 = 'T2B' and type1 = 'T1B2')

    )

order by date, num;

In some experiences, this makes a difference with regard to partitions

Secondly, I'd try to have the query in two stages, to see if this makes it easier for PostgreSQL to use the index more efficiently (if that is actually possible):

SELECT

    *

FROM

    (SELECT

        * 

    FROM

        my_table

    WHERE

      -- This where is *likely* to use index

      date >= '2017-04-22'   -- Use a CONSTANT (compute it on the client side)

      and 

      type1 IN ('T1A', 'T1B1', 'T1C', 'T1B2')

    ) AS s0

WHERE

    -- Now you actually fine-filter result

       (type1 = 'T1A'  and type2 = 'T2A')

    or (type1 = 'T1B1' and type2 = 'T2B')

    or (type1 = 'T1B2' and type2 = 'T2B')

    or (type1 = 'T1C')

ORDER BY

    date, num;

I've trie to simulate your scenario in DBFiddle, but, not knowing your specific data distribution... I don't manage to really simulate it well enough. In any case, in my simulation, the query worked differently when changing the function call with a constant.

These are a bit of "wild guesses". As already commented: I am not sure if PostgreSQL will start sending data to the client before the server has completed processing the whole query... You can probably speed-up the query, but I am not sure at all that the server will start sending any data to the client until it has processed all the query.

answered Apr 23 '17 at 21:14

joanolo

9,73342153

1

The derived table won't make a difference. It the where clause the the inner query really reduces the number of rows substantially, then putting that into a CTE might make a difference though

– a_horse_with_no_name
Apr 24 '17 at 6:13

In the query plan of PostgreSQL 9.6, it just pushes the condition and ANDs both WHEREs.

– joanolo
Apr 24 '17 at 6:55

add a comment |

Your Answer

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f155427%2fhow-to-optimize-an-ordered-date-range-query-with-several-or-filters-for-streami%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

Have you tried making 4 separate queries, and then using union all, so that you can get rid of the or statements?

Example:

select * from my_table

where date >= now() - '1 day'::interval

  and type2 = 'T2A' 

  and type1 = 'T1A'



union all

select * from my_table

where date >= now() - '1 day'::interval

  and type2 = 'T2B' 

  and type1 = 'T1B1'

...

edited Apr 23 '17 at 17:46

Paul White♦

52.2k14278450

answered Apr 23 '17 at 13:53

oraclequestions7

add a comment |

Have you tried making 4 separate queries, and then using union all, so that you can get rid of the or statements?

Example:

select * from my_table

where date >= now() - '1 day'::interval

  and type2 = 'T2A' 

  and type1 = 'T1A'



union all

select * from my_table

where date >= now() - '1 day'::interval

  and type2 = 'T2B' 

  and type1 = 'T1B1'

...

edited Apr 23 '17 at 17:46

Paul White♦

52.2k14278450

answered Apr 23 '17 at 13:53

oraclequestions7

add a comment |

Have you tried making 4 separate queries, and then using union all, so that you can get rid of the or statements?

Example:

select * from my_table

where date >= now() - '1 day'::interval

  and type2 = 'T2A' 

  and type1 = 'T1A'



union all

select * from my_table

where date >= now() - '1 day'::interval

  and type2 = 'T2B' 

  and type1 = 'T1B1'

...

edited Apr 23 '17 at 17:46

Paul White♦

52.2k14278450

answered Apr 23 '17 at 13:53

oraclequestions7

Have you tried making 4 separate queries, and then using union all, so that you can get rid of the or statements?

Example:

select * from my_table

where date >= now() - '1 day'::interval

  and type2 = 'T2A' 

  and type1 = 'T1A'



union all

select * from my_table

where date >= now() - '1 day'::interval

  and type2 = 'T2B' 

  and type1 = 'T1B1'

...

edited Apr 23 '17 at 17:46

Paul White♦

52.2k14278450

answered Apr 23 '17 at 13:53

oraclequestions7

edited Apr 23 '17 at 17:46

Paul White♦

52.2k14278450

edited Apr 23 '17 at 17:46

Paul White♦

52.2k14278450

edited Apr 23 '17 at 17:46

Paul White♦

52.2k14278450

answered Apr 23 '17 at 13:53

oraclequestions7

answered Apr 23 '17 at 13:53

oraclequestions7

answered Apr 23 '17 at 13:53

oraclequestions7

add a comment |

I would try a couple of things. First: instead of using an expression date >= now() - '1 day'::interval, I would compute the constant value on the server side and actually try the query like this:

explain analyze     

select * from my_table

where date >= '2017-04-22'

  and (

    (type2 = 'T2A' and type1 = 'T1A')

    or (type2 = 'T2B' and type1 = 'T1B1')

    or (type1 = 'T1C')

    or (type2 = 'T2B' and type1 = 'T1B2')

    )

order by date, num;

In some experiences, this makes a difference with regard to partitions

Secondly, I'd try to have the query in two stages, to see if this makes it easier for PostgreSQL to use the index more efficiently (if that is actually possible):

SELECT

    *

FROM

    (SELECT

        * 

    FROM

        my_table

    WHERE

      -- This where is *likely* to use index

      date >= '2017-04-22'   -- Use a CONSTANT (compute it on the client side)

      and 

      type1 IN ('T1A', 'T1B1', 'T1C', 'T1B2')

    ) AS s0

WHERE

    -- Now you actually fine-filter result

       (type1 = 'T1A'  and type2 = 'T2A')

    or (type1 = 'T1B1' and type2 = 'T2B')

    or (type1 = 'T1B2' and type2 = 'T2B')

    or (type1 = 'T1C')

ORDER BY

    date, num;

answered Apr 23 '17 at 21:14

joanolo

9,73342153

1

The derived table won't make a difference. It the where clause the the inner query really reduces the number of rows substantially, then putting that into a CTE might make a difference though

– a_horse_with_no_name
Apr 24 '17 at 6:13

In the query plan of PostgreSQL 9.6, it just pushes the condition and ANDs both WHEREs.

– joanolo
Apr 24 '17 at 6:55

add a comment |

I would try a couple of things. First: instead of using an expression date >= now() - '1 day'::interval, I would compute the constant value on the server side and actually try the query like this:

explain analyze     

select * from my_table

where date >= '2017-04-22'

  and (

    (type2 = 'T2A' and type1 = 'T1A')

    or (type2 = 'T2B' and type1 = 'T1B1')

    or (type1 = 'T1C')

    or (type2 = 'T2B' and type1 = 'T1B2')

    )

order by date, num;

In some experiences, this makes a difference with regard to partitions

Secondly, I'd try to have the query in two stages, to see if this makes it easier for PostgreSQL to use the index more efficiently (if that is actually possible):

SELECT

    *

FROM

    (SELECT

        * 

    FROM

        my_table

    WHERE

      -- This where is *likely* to use index

      date >= '2017-04-22'   -- Use a CONSTANT (compute it on the client side)

      and 

      type1 IN ('T1A', 'T1B1', 'T1C', 'T1B2')

    ) AS s0

WHERE

    -- Now you actually fine-filter result

       (type1 = 'T1A'  and type2 = 'T2A')

    or (type1 = 'T1B1' and type2 = 'T2B')

    or (type1 = 'T1B2' and type2 = 'T2B')

    or (type1 = 'T1C')

ORDER BY

    date, num;

answered Apr 23 '17 at 21:14

joanolo

9,73342153

1

The derived table won't make a difference. It the where clause the the inner query really reduces the number of rows substantially, then putting that into a CTE might make a difference though

– a_horse_with_no_name
Apr 24 '17 at 6:13

In the query plan of PostgreSQL 9.6, it just pushes the condition and ANDs both WHEREs.

– joanolo
Apr 24 '17 at 6:55

add a comment |

I would try a couple of things. First: instead of using an expression date >= now() - '1 day'::interval, I would compute the constant value on the server side and actually try the query like this:

explain analyze     

select * from my_table

where date >= '2017-04-22'

  and (

    (type2 = 'T2A' and type1 = 'T1A')

    or (type2 = 'T2B' and type1 = 'T1B1')

    or (type1 = 'T1C')

    or (type2 = 'T2B' and type1 = 'T1B2')

    )

order by date, num;

In some experiences, this makes a difference with regard to partitions

Secondly, I'd try to have the query in two stages, to see if this makes it easier for PostgreSQL to use the index more efficiently (if that is actually possible):

SELECT

    *

FROM

    (SELECT

        * 

    FROM

        my_table

    WHERE

      -- This where is *likely* to use index

      date >= '2017-04-22'   -- Use a CONSTANT (compute it on the client side)

      and 

      type1 IN ('T1A', 'T1B1', 'T1C', 'T1B2')

    ) AS s0

WHERE

    -- Now you actually fine-filter result

       (type1 = 'T1A'  and type2 = 'T2A')

    or (type1 = 'T1B1' and type2 = 'T2B')

    or (type1 = 'T1B2' and type2 = 'T2B')

    or (type1 = 'T1C')

ORDER BY

    date, num;

answered Apr 23 '17 at 21:14

joanolo

9,73342153

I would try a couple of things. First: instead of using an expression date >= now() - '1 day'::interval, I would compute the constant value on the server side and actually try the query like this:

explain analyze     

select * from my_table

where date >= '2017-04-22'

  and (

    (type2 = 'T2A' and type1 = 'T1A')

    or (type2 = 'T2B' and type1 = 'T1B1')

    or (type1 = 'T1C')

    or (type2 = 'T2B' and type1 = 'T1B2')

    )

order by date, num;

In some experiences, this makes a difference with regard to partitions

Secondly, I'd try to have the query in two stages, to see if this makes it easier for PostgreSQL to use the index more efficiently (if that is actually possible):

SELECT

    *

FROM

    (SELECT

        * 

    FROM

        my_table

    WHERE

      -- This where is *likely* to use index

      date >= '2017-04-22'   -- Use a CONSTANT (compute it on the client side)

      and 

      type1 IN ('T1A', 'T1B1', 'T1C', 'T1B2')

    ) AS s0

WHERE

    -- Now you actually fine-filter result

       (type1 = 'T1A'  and type2 = 'T2A')

    or (type1 = 'T1B1' and type2 = 'T2B')

    or (type1 = 'T1B2' and type2 = 'T2B')

    or (type1 = 'T1C')

ORDER BY

    date, num;

answered Apr 23 '17 at 21:14

joanolo

9,73342153

answered Apr 23 '17 at 21:14

joanolo

9,73342153

answered Apr 23 '17 at 21:14

joanolo

9,73342153

answered Apr 23 '17 at 21:14

joanolo

9,73342153

1

The derived table won't make a difference. It the where clause the the inner query really reduces the number of rows substantially, then putting that into a CTE might make a difference though

– a_horse_with_no_name
Apr 24 '17 at 6:13

In the query plan of PostgreSQL 9.6, it just pushes the condition and ANDs both WHEREs.

– joanolo
Apr 24 '17 at 6:55

add a comment |

1

The derived table won't make a difference. It the where clause the the inner query really reduces the number of rows substantially, then putting that into a CTE might make a difference though

– a_horse_with_no_name
Apr 24 '17 at 6:13

In the query plan of PostgreSQL 9.6, it just pushes the condition and ANDs both WHEREs.

– joanolo
Apr 24 '17 at 6:55

The derived table won't make a difference. It the where clause the the inner query really reduces the number of rows substantially, then putting that into a CTE might make a difference though

– a_horse_with_no_name
Apr 24 '17 at 6:13

In the query plan of PostgreSQL 9.6, it just pushes the condition and ANDs both WHEREs.

– joanolo
Apr 24 '17 at 6:55

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Database Administrators Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Sfrgttk