Why is a temp table a more efficient solution to the Halloween Problem than an eager spool?

Consider the following query that inserts rows from a source table only if they aren't already in the target table:



INSERT INTO dbo.HALLOWEEN_IS_COMING_EARLY_THIS_YEAR WITH (TABLOCK)
SELECT maybe_new_rows.ID
FROM dbo.A_HEAP_OF_MOSTLY_NEW_ROWS maybe_new_rows
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.HALLOWEEN_IS_COMING_EARLY_THIS_YEAR halloween
    WHERE maybe_new_rows.ID = halloween.ID
)
OPTION (MAXDOP 1, QUERYTRACEON 7470);


One possible plan shape includes a merge join and an eager spool. The eager spool operator is present to solve the Halloween Problem:



[image: first plan]



On my machine, the above code executes in about 6900 ms. Repro code to create the tables is included at the bottom of the question. If I'm dissatisfied with performance, I might try to load the rows to be inserted into a temp table instead of relying on the eager spool. Here's one possible implementation:



DROP TABLE IF EXISTS #CONSULTANT_RECOMMENDED_TEMP_TABLE;
CREATE TABLE #CONSULTANT_RECOMMENDED_TEMP_TABLE (
    ID BIGINT,
    PRIMARY KEY (ID)
);

INSERT INTO #CONSULTANT_RECOMMENDED_TEMP_TABLE WITH (TABLOCK)
SELECT maybe_new_rows.ID
FROM dbo.A_HEAP_OF_MOSTLY_NEW_ROWS maybe_new_rows
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.HALLOWEEN_IS_COMING_EARLY_THIS_YEAR halloween
    WHERE maybe_new_rows.ID = halloween.ID
)
OPTION (MAXDOP 1, QUERYTRACEON 7470);

INSERT INTO dbo.HALLOWEEN_IS_COMING_EARLY_THIS_YEAR WITH (TABLOCK)
SELECT new_rows.ID
FROM #CONSULTANT_RECOMMENDED_TEMP_TABLE new_rows
OPTION (MAXDOP 1);


The new code executes in about 4400 ms. I can get actual plans and use Actual Time Statistics™ to examine where time is spent at the operator level. Note that asking for an actual plan adds significant overhead for these queries, so the totals will not match the previous results.



╔═════════════╦═════════════╦══════════════╗
║  operator   ║ first query ║ second query ║
╠═════════════╬═════════════╬══════════════╣
║ big scan    ║ 1771        ║ 1744         ║
║ little scan ║ 163         ║ 166          ║
║ sort        ║ 531         ║ 530          ║
║ merge join  ║ 709         ║ 669          ║
║ spool       ║ 3202        ║ N/A          ║
║ temp insert ║ N/A         ║ 422          ║
║ temp scan   ║ N/A         ║ 187          ║
║ insert      ║ 3122        ║ 1545         ║
╚═════════════╩═════════════╩══════════════╝


The query plan with the eager spool seems to spend significantly more time on the insert and spool operators compared to the plan that uses the temp table.
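
To put a number on that gap, the operators that differ between the two plans can be summed straight from the table. This is only a sketch: it assumes the table values are per-operator elapsed times in milliseconds, and the timings alias and its column names are made up for illustration.

```sql
-- Sum only the operators that differ between the two plans.
-- Values are copied from the table above; alias and column names are illustrative.
SELECT
    timings.query_name,
    SUM(timings.elapsed_ms) AS differing_operator_ms
FROM (VALUES
    ('first query',  'spool',       3202),
    ('first query',  'insert',      3122),
    ('second query', 'temp insert',  422),
    ('second query', 'temp scan',    187),
    ('second query', 'insert',      1545)
) AS timings (query_name, operator_name, elapsed_ms)
GROUP BY timings.query_name
ORDER BY timings.query_name;
```

Under that assumption the eager spool plan spends 6324 on spool plus insert, against 2154 for temp insert, temp scan, and insert combined. The scans, sort, and merge join are close to identical in both plans.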



Why is the plan with the temp table more efficient? Isn't an eager spool mostly just an internal temp table anyway? I believe I am looking for answers that focus on internals. I'm able to see how the call stacks are different but can't figure out the big picture.



I am on SQL Server 2017 CU 11 in case someone wants to know. Here is code to populate the tables used in the above queries:



DROP TABLE IF EXISTS dbo.HALLOWEEN_IS_COMING_EARLY_THIS_YEAR;

CREATE TABLE dbo.HALLOWEEN_IS_COMING_EARLY_THIS_YEAR (
    ID BIGINT NOT NULL,
    PRIMARY KEY (ID)
);

INSERT INTO dbo.HALLOWEEN_IS_COMING_EARLY_THIS_YEAR WITH (TABLOCK)
SELECT TOP (20000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM master..spt_values t1
CROSS JOIN master..spt_values t2
CROSS JOIN master..spt_values t3
OPTION (MAXDOP 1);


DROP TABLE IF EXISTS dbo.A_HEAP_OF_MOSTLY_NEW_ROWS;

CREATE TABLE dbo.A_HEAP_OF_MOSTLY_NEW_ROWS (
    ID BIGINT NOT NULL
);

INSERT INTO dbo.A_HEAP_OF_MOSTLY_NEW_ROWS WITH (TABLOCK)
SELECT TOP (1900000) 19999999 + ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;
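
With the tables populated, one way to capture an actual plan with per-operator timings from T-SQL, rather than through the SSMS toolbar, is SET STATISTICS XML. This is a sketch of how such a measurement might be taken, not necessarily how the figures in the question were gathered.

```sql
-- Sketch only: returns the actual plan as XML alongside the statement's results.
-- Per-operator timings appear in the RunTimeCountersPerThread elements
-- (ActualElapsedms, ActualCPUms) of the returned showplan.
SET STATISTICS XML ON;

INSERT INTO dbo.HALLOWEEN_IS_COMING_EARLY_THIS_YEAR WITH (TABLOCK)
SELECT maybe_new_rows.ID
FROM dbo.A_HEAP_OF_MOSTLY_NEW_ROWS maybe_new_rows
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.HALLOWEEN_IS_COMING_EARLY_THIS_YEAR halloween
    WHERE maybe_new_rows.ID = halloween.ID
)
OPTION (MAXDOP 1, QUERYTRACEON 7470);

SET STATISTICS XML OFF;
```

Because the statement really inserts the rows, the target table has to be rebuilt with the repro script before each timed run. As far as I know, elapsed times in row-mode plans include time spent in child operators, so per-operator figures like those in the table would require subtracting the children.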








sql-server sql-server-2017 database-internals





asked 3 mins ago by Joe Obbish