Very slow nested SQL query

I seem to have hit some bottlenecks around nested SQL queries.

I have three tables in PostgreSQL with the following:

Table-1: has 2443 unique rows

Table-2: has 3414569 unique rows

Table-3: has 9516434 unique rows

My Goal

For each row in Table-1, I would like to pair it with each row in
Table-2, and then use the (table-1-data, table-2-data) pair to
query Table-3, so that the SELECT'ed rows from table-3 are the ones
that contain that (table-1-data, table-2-data) pair.

Problem

On the surface, this looks like an O(n x m) operation (one lookup
per pair of rows), and running my SQL queries in a nested loop
leads to horrible performance.

For example, I start with:

SELECT row_a FROM table_1; -- query-1
SELECT row_b FROM table_2; -- query-2

Then proceed to make multiple queries for each output of query-1,
combined with each output of query-2 as shown below:

SELECT count(*)
  FROM table_3
  WHERE row_c = <single-value-from-query-1>
  AND   row_d = <single-value-from-query-2>

Needless to say, this nested loop runs very slowly as a whole:
I have 2443 x 3414569 = 8341792067 queries to make, each one
against the 9516434 rows in table-3.

I want to speed up the whole process, and I'd be grateful if you
could please point me in the right direction.
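
To put the query count quoted in the question into perspective, here is a rough back-of-the-envelope sketch. The row counts come from the question; the 1 ms per-query latency is purely an assumption for illustration, not a measurement.

```python
# Row counts quoted in the question.
TABLE_1_ROWS = 2_443
TABLE_2_ROWS = 3_414_569

# Hypothetical per-query latency (an assumption, not a measurement).
ASSUMED_SECONDS_PER_QUERY = 0.001


def loop_cost(rows_1, rows_2, seconds_per_query):
    """Return (number of queries, estimated days) when one query is issued per pair."""
    queries = rows_1 * rows_2
    days = queries * seconds_per_query / 86_400  # 86,400 seconds per day
    return queries, days


queries, days = loop_cost(TABLE_1_ROWS, TABLE_2_ROWS, ASSUMED_SECONDS_PER_QUERY)
print(f"{queries:,} queries, roughly {days:.1f} days at the assumed latency")
```

Even with an optimistic latency, issuing one statement per pair would take months, which is why a single set-based statement is the usual direction.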

edited Oct 18 '18 at 7:48 by Paul White

asked Oct 17 '18 at 23:35 by gsbabil

SELECT ... FROM table_1 JOIN table_2 ON table_1.data = table_2.data JOIN table_3 ON table_2.data = table_3.data, and use the right indexes. Please show the structure of your tables.

– danblack
Oct 17 '18 at 23:59

"I have about 2443 x 3414569 = 8341792067 queries" ... and 8341792067 output recordsets. What do you do with all of that data? Do you really need ALL of them separately? I suspect there is some underlying task, and you have chosen the wrong way to solve it...

– Akina
Oct 18 '18 at 6:39

"make multiple queries for each output" - sounds like the wrong approach. Running SQL statements in a loop typically does not scale well. Why can't you use WHERE row_c IN (... query_1 ) (and the same for the other column) and thus get rid of the looping in your code?

– a_horse_with_no_name
Oct 18 '18 at 8:15
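
The IN-based suggestion in the comment above could be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL, assumes integer columns, and the sample rows are invented. The table and column names are the ones from the question.

```python
import sqlite3


def matching_pair_counts(conn):
    """Count table_3 rows per (row_c, row_d) pair in one statement instead of a loop."""
    return conn.execute(
        """
        SELECT row_c, row_d, count(*) AS n
        FROM table_3
        WHERE row_c IN (SELECT row_a FROM table_1)
          AND row_d IN (SELECT row_b FROM table_2)
        GROUP BY row_c, row_d
        ORDER BY row_c, row_d
        """
    ).fetchall()


# In-memory database with invented sample data (assumption: integer columns).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_1 (row_a INTEGER);
    CREATE TABLE table_2 (row_b INTEGER);
    CREATE TABLE table_3 (row_c INTEGER, row_d INTEGER);
    INSERT INTO table_1 VALUES (1), (2);
    INSERT INTO table_2 VALUES (10), (20), (30);
    INSERT INTO table_3 VALUES (1, 10), (1, 10), (2, 30), (3, 10), (1, 99);
    """
)

pair_counts = matching_pair_counts(conn)
print(pair_counts)
```

This returns only the pairs that actually occur in table_3; pairs with no matching rows are absent rather than reported with a count of zero.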

Do you really need the results for all 2,443×3,414,569 combinations of (row_a, row_b) or is it all right to just get all the matching pairs, i.e. just the ones that are actually there in (row_c, row_d) of table_3?

– Andriy M
Oct 18 '18 at 13:21

@AndriyM I need to derive the exhaustive list of pairs i.e. (row_a, row_b). I'm currently looking at "CARTESIAN JOIN". Is there any faster approach? Thank you.

– gsbabil
Oct 18 '18 at 13:33
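
If the exhaustive list of pairs really is required, one possible shape is a cross join of table_1 and table_2, left-joined to counts that are aggregated from table_3 once. The sketch below is an assumption-laden illustration: sqlite3 stands in for PostgreSQL, the columns are assumed to be integers, and the sample rows are invented.

```python
import sqlite3


def exhaustive_pair_counts(conn):
    """Return every (row_a, row_b) pair with its table_3 count, 0 when there is no match."""
    return conn.execute(
        """
        SELECT t1.row_a, t2.row_b, COALESCE(c.n, 0) AS n
        FROM table_1 AS t1
        CROSS JOIN table_2 AS t2
        LEFT JOIN (
            -- Aggregate table_3 once instead of probing it per pair.
            SELECT row_c, row_d, count(*) AS n
            FROM table_3
            GROUP BY row_c, row_d
        ) AS c
          ON c.row_c = t1.row_a AND c.row_d = t2.row_b
        ORDER BY t1.row_a, t2.row_b
        """
    ).fetchall()


# In-memory database with invented sample data (assumption: integer columns).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_1 (row_a INTEGER);
    CREATE TABLE table_2 (row_b INTEGER);
    CREATE TABLE table_3 (row_c INTEGER, row_d INTEGER);
    INSERT INTO table_1 VALUES (1), (2);
    INSERT INTO table_2 VALUES (10), (20), (30);
    INSERT INTO table_3 VALUES (1, 10), (1, 10), (2, 30), (3, 10), (1, 99);
    """
)

all_pairs = exhaustive_pair_counts(conn)
print(all_pairs)
```

Note that at the sizes given in the question the result itself would still contain 2443 x 3414569 = 8341792067 rows, so the output volume remains large even though the per-pair round trips are gone.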


Tags: postgresql, query-performance

1 Answer

Thanks, everyone, for your input.

As discussed above, I was able to solve my problem using a CARTESIAN JOIN and ...

Nested SELECT like the following:

SELECT * FROM (SELECT column1, column2 FROM some_table) AS sub;

Thank you.

answered Oct 27 '18 at 18:08 by gsbabil
