Optimizing table/indices for getting the latest row (filtering by one additional column)Indexing -...

Science Fiction story where a man invents a machine that can help him watch history unfold

Is there an Impartial Brexit Deal comparison site?

Why is delta-v is the most useful quantity for planning space travel?

Have I saved too much for retirement so far?

How do ultrasonic sensors differentiate between transmitted and received signals?

Why isn't KTEX's runway designation 10/28 instead of 9/27?

Is there any significance to the Valyrian Stone vault door of Qarth?

Lightning Web Component - do I need to track changes for every single input field in a form

Why does this part of the Space Shuttle launch pad seem to be floating in air?

Can the harmonic series explain the origin of the major scale?

Is the next prime number always the next number divisible by the current prime number, except for any numbers previously divisible by primes?

How can a jailer prevent the Forge Cleric's Artisan's Blessing from being used?

What (else) happened July 1st 1858 in London?

What if somebody invests in my application?

Simple recursive Sudoku solver

Installing PowerShell on 32-bit Kali OS fails

word describing multiple paths to the same abstract outcome

How to deal with or prevent idle in the test team?

In Star Trek IV, why did the Bounty go back to a time when whales were already rare?

Why are on-board computers allowed to change controls without notifying the pilots?

Is a naturally all "male" species possible?

The One-Electron Universe postulate is true - what simple change can I make to change the whole universe?

What is the term when two people sing in harmony, but they aren't singing the same notes?

Can I create an upright 7-foot × 5-foot wall with the Minor Illusion spell?



Optimizing table/indices for getting the latest row (filtering by one additional column)


Indexing - Uniqueidentifier Foreign Key or Intermediary mapping table?Oracle 11. Updating BLOB field. Db file sequential read inappropriately slow?Postgresql table with one integer column, sorted index, with duplicate primary keyindex mysql concatenated columnsChanging a column from NOT NULL to NULL - What's going on under the hood?Should I add a clustered Index on a foreign key that is not unique?deteriorating stored procedure running timesConsolidating multiple related queries, each using different window and/or criteria for counting recordsINSERT/SELECT xml column from one table to anotherStrange query plan when using OR in JOIN clause - Constant scan for every row in table













5















I have a SQL Server table (SQL Server 2012 SP3 Standard edition) that stores a bunch of configuration information (basically text blobs) for different organizations. The schema is something like this:



[ConfigurationID]       INT IDENTITY (1,1) NOT NULL,
[OrganizationID] INT NOT NULL,
[TimestampUtc] DATETIME NOT NULL,
[ConfigurationData] NVARCHAR (MAX) NOT NULL,
[ChangedBy] NVARCHAR (256) NOT NULL,
[Comment] NVARCHAR (MAX) NOT NULL,
[ChangeType] INT NOT NULL


The TimestampUtc will always be increasing (I won't ever be INSERTing "back-dated" entries into the table), and the rows won't ever be UPDATEd (I'm only INSERTing new rows). For some OrganizationIDs there will be lots of rows, for some very few, and a new row for any OrganizationID may be INSERTed at any time.



If needed, I can guarantee uniqueness of TimestampUtc (but it would be great to have a solution that didn't need that).



INSERTs are relatively rare (at most dozens of times per day, but typically much less than that), reads are very frequent (essentially on every web request to my application).



My goals are:




  • Getting the ConfigurationData with the latest TimestampUtc for a given OrganizationID should be extremely fast regardless of the size of the table

  • INSERT performance doesn't matter too much, but I'd like to avoid horrible index fragmentation if at all possible (so my first idea of a unique clustered index on OrganizationID ASC, TimestampUtc DESC is probably not a great idea).


Questions



I know I can denormalize and just store the latest ConfigurationData in one table and the historical log of previous values in another table, but is it possible to meet my goals with just one table? What's the best way to do it? (I.e. what's the best index structure? do I need to change anything about the table schema, etc.?)










share|improve this question
















bumped to the homepage by Community 9 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
















  • Which version and edition of sql server are you using?

    – SqlZim
    Feb 16 '17 at 17:06











  • SQL Server 2012 SP3 Standard, I've added this to the question

    – Eugene O
    Feb 16 '17 at 17:13








  • 4





    Why do you place index fragmentation in such high regard? The index you describe sounds perfect (though if you are going after a specific OrgID, I wonder why that wouldn't be a non-clustered index, or why it has to be unique, since timestamps aren't guaranteed to be unique). Worry about fragmentation when it actually proves to be an obvious performance problem. Until then, design for workload performance, not to avoid potential fragmentation goblins.

    – Aaron Bertrand
    Feb 16 '17 at 17:42













  • With my proposed index, wouldn't it add a whole new page on literally every insert? Is this not something I should be worried about? Also, why would you recommend a non-clustered index?

    – Eugene O
    Feb 16 '17 at 18:47











  • Will your two NVARCHAR(MAX) columns potentially be very long? I assume that if so, you will not be indexing those columns. Check the discussion on NVARCHAR(MAX) at: stackoverflow.com/questions/148398/…

    – RLF
    Feb 16 '17 at 22:05


















5















I have a SQL Server table (SQL Server 2012 SP3 Standard edition) that stores a bunch of configuration information (basically text blobs) for different organizations. The schema is something like this:



[ConfigurationID]       INT IDENTITY (1,1) NOT NULL,
[OrganizationID] INT NOT NULL,
[TimestampUtc] DATETIME NOT NULL,
[ConfigurationData] NVARCHAR (MAX) NOT NULL,
[ChangedBy] NVARCHAR (256) NOT NULL,
[Comment] NVARCHAR (MAX) NOT NULL,
[ChangeType] INT NOT NULL


The TimestampUtc will always be increasing (I won't ever be INSERTing "back-dated" entries into the table), and the rows won't ever be UPDATEd (I'm only INSERTing new rows). For some OrganizationIDs there will be lots of rows, for some very few, and a new row for any OrganizationID may be INSERTed at any time.



If needed, I can guarantee uniqueness of TimestampUtc (but it would be great to have a solution that didn't need that).



INSERTs are relatively rare (at most dozens of times per day, but typically much less than that), reads are very frequent (essentially on every web request to my application).



My goals are:




  • Getting the ConfigurationData with the latest TimestampUtc for a given OrganizationID should be extremely fast regardless of the size of the table

  • INSERT performance doesn't matter too much, but I'd like to avoid horrible index fragmentation if at all possible (so my first idea of a unique clustered index on OrganizationID ASC, TimestampUtc DESC is probably not a great idea).


Questions



I know I can denormalize and just store the latest ConfigurationData in one table and the historical log of previous values in another table, but is it possible to meet my goals with just one table? What's the best way to do it? (I.e. what's the best index structure? do I need to change anything about the table schema, etc.?)










share|improve this question
















bumped to the homepage by Community 9 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
















  • Which version and edition of sql server are you using?

    – SqlZim
    Feb 16 '17 at 17:06











  • SQL Server 2012 SP3 Standard, I've added this to the question

    – Eugene O
    Feb 16 '17 at 17:13








  • 4





    Why do you place index fragmentation in such high regard? The index you describe sounds perfect (though if you are going after a specific OrgID, I wonder why that wouldn't be a non-clustered index, or why it has to be unique, since timestamps aren't guaranteed to be unique). Worry about fragmentation when it actually proves to be an obvious performance problem. Until then, design for workload performance, not to avoid potential fragmentation goblins.

    – Aaron Bertrand
    Feb 16 '17 at 17:42













  • With my proposed index, wouldn't it add a whole new page on literally every insert? Is this not something I should be worried about? Also, why would you recommend a non-clustered index?

    – Eugene O
    Feb 16 '17 at 18:47











  • Will your two NVARCHAR(MAX) columns potentially be very long? I assume that if so, you will not be indexing those columns. Check the discussion on NVARCHAR(MAX) at: stackoverflow.com/questions/148398/…

    – RLF
    Feb 16 '17 at 22:05
















5












5








5


1






I have a SQL Server table (SQL Server 2012 SP3 Standard edition) that stores a bunch of configuration information (basically text blobs) for different organizations. The schema is something like this:



[ConfigurationID]       INT IDENTITY (1,1) NOT NULL,
[OrganizationID] INT NOT NULL,
[TimestampUtc] DATETIME NOT NULL,
[ConfigurationData] NVARCHAR (MAX) NOT NULL,
[ChangedBy] NVARCHAR (256) NOT NULL,
[Comment] NVARCHAR (MAX) NOT NULL,
[ChangeType] INT NOT NULL


The TimestampUtc will always be increasing (I won't ever be INSERTing "back-dated" entries into the table), and the rows won't ever be UPDATEd (I'm only INSERTing new rows). For some OrganizationIDs there will be lots of rows, for some very few, and a new row for any OrganizationID may be INSERTed at any time.



If needed, I can guarantee uniqueness of TimestampUtc (but it would be great to have a solution that didn't need that).



INSERTs are relatively rare (at most dozens of times per day, but typically much less than that), reads are very frequent (essentially on every web request to my application).



My goals are:




  • Getting the ConfigurationData with the latest TimestampUtc for a given OrganizationID should be extremely fast regardless of the size of the table

  • INSERT performance doesn't matter too much, but I'd like to avoid horrible index fragmentation if at all possible (so my first idea of a unique clustered index on OrganizationID ASC, TimestampUtc DESC is probably not a great idea).


Questions



I know I can denormalize and just store the latest ConfigurationData in one table and the historical log of previous values in another table, but is it possible to meet my goals with just one table? What's the best way to do it? (I.e. what's the best index structure? do I need to change anything about the table schema, etc.?)










share|improve this question
















I have a SQL Server table (SQL Server 2012 SP3 Standard edition) that stores a bunch of configuration information (basically text blobs) for different organizations. The schema is something like this:



[ConfigurationID]       INT IDENTITY (1,1) NOT NULL,
[OrganizationID] INT NOT NULL,
[TimestampUtc] DATETIME NOT NULL,
[ConfigurationData] NVARCHAR (MAX) NOT NULL,
[ChangedBy] NVARCHAR (256) NOT NULL,
[Comment] NVARCHAR (MAX) NOT NULL,
[ChangeType] INT NOT NULL


The TimestampUtc will always be increasing (I won't ever be INSERTing "back-dated" entries into the table), and the rows won't ever be UPDATEd (I'm only INSERTing new rows). For some OrganizationIDs there will be lots of rows, for some very few, and a new row for any OrganizationID may be INSERTed at any time.



If needed, I can guarantee uniqueness of TimestampUtc (but it would be great to have a solution that didn't need that).



INSERTs are relatively rare (at most dozens of times per day, but typically much less than that), reads are very frequent (essentially on every web request to my application).



My goals are:




  • Getting the ConfigurationData with the latest TimestampUtc for a given OrganizationID should be extremely fast regardless of the size of the table

  • INSERT performance doesn't matter too much, but I'd like to avoid horrible index fragmentation if at all possible (so my first idea of a unique clustered index on OrganizationID ASC, TimestampUtc DESC is probably not a great idea).


Questions



I know I can denormalize and just store the latest ConfigurationData in one table and the historical log of previous values in another table, but is it possible to meet my goals with just one table? What's the best way to do it? (I.e. what's the best index structure? do I need to change anything about the table schema, etc.?)







sql-server database-design sql-server-2012 performance index






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Feb 16 '17 at 18:31









MDCCL

6,85331745




6,85331745










asked Feb 16 '17 at 16:54









Eugene OEugene O

1263




1263





bumped to the homepage by Community 9 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.







bumped to the homepage by Community 9 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.















  • Which version and edition of sql server are you using?

    – SqlZim
    Feb 16 '17 at 17:06











  • SQL Server 2012 SP3 Standard, I've added this to the question

    – Eugene O
    Feb 16 '17 at 17:13








  • 4





    Why do you place index fragmentation in such high regard? The index you describe sounds perfect (though if you are going after a specific OrgID, I wonder why that wouldn't be a non-clustered index, or why it has to be unique, since timestamps aren't guaranteed to be unique). Worry about fragmentation when it actually proves to be an obvious performance problem. Until then, design for workload performance, not to avoid potential fragmentation goblins.

    – Aaron Bertrand
    Feb 16 '17 at 17:42













  • With my proposed index, wouldn't it add a whole new page on literally every insert? Is this not something I should be worried about? Also, why would you recommend a non-clustered index?

    – Eugene O
    Feb 16 '17 at 18:47











  • Will your two NVARCHAR(MAX) columns potentially be very long? I assume that if so, you will not be indexing those columns. Check the discussion on NVARCHAR(MAX) at: stackoverflow.com/questions/148398/…

    – RLF
    Feb 16 '17 at 22:05





















  • Which version and edition of sql server are you using?

    – SqlZim
    Feb 16 '17 at 17:06











  • SQL Server 2012 SP3 Standard, I've added this to the question

    – Eugene O
    Feb 16 '17 at 17:13








  • 4





    Why do you place index fragmentation in such high regard? The index you describe sounds perfect (though if you are going after a specific OrgID, I wonder why that wouldn't be a non-clustered index, or why it has to be unique, since timestamps aren't guaranteed to be unique). Worry about fragmentation when it actually proves to be an obvious performance problem. Until then, design for workload performance, not to avoid potential fragmentation goblins.

    – Aaron Bertrand
    Feb 16 '17 at 17:42













  • With my proposed index, wouldn't it add a whole new page on literally every insert? Is this not something I should be worried about? Also, why would you recommend a non-clustered index?

    – Eugene O
    Feb 16 '17 at 18:47











  • Will your two NVARCHAR(MAX) columns potentially be very long? I assume that if so, you will not be indexing those columns. Check the discussion on NVARCHAR(MAX) at: stackoverflow.com/questions/148398/…

    – RLF
    Feb 16 '17 at 22:05



















Which version and edition of sql server are you using?

– SqlZim
Feb 16 '17 at 17:06





Which version and edition of sql server are you using?

– SqlZim
Feb 16 '17 at 17:06













SQL Server 2012 SP3 Standard, I've added this to the question

– Eugene O
Feb 16 '17 at 17:13







SQL Server 2012 SP3 Standard, I've added this to the question

– Eugene O
Feb 16 '17 at 17:13






4




4





Why do you place index fragmentation in such high regard? The index you describe sounds perfect (though if you are going after a specific OrgID, I wonder why that wouldn't be a non-clustered index, or why it has to be unique, since timestamps aren't guaranteed to be unique). Worry about fragmentation when it actually proves to be an obvious performance problem. Until then, design for workload performance, not to avoid potential fragmentation goblins.

– Aaron Bertrand
Feb 16 '17 at 17:42







Why do you place index fragmentation in such high regard? The index you describe sounds perfect (though if you are going after a specific OrgID, I wonder why that wouldn't be a non-clustered index, or why it has to be unique, since timestamps aren't guaranteed to be unique). Worry about fragmentation when it actually proves to be an obvious performance problem. Until then, design for workload performance, not to avoid potential fragmentation goblins.

– Aaron Bertrand
Feb 16 '17 at 17:42















With my proposed index, wouldn't it add a whole new page on literally every insert? Is this not something I should be worried about? Also, why would you recommend a non-clustered index?

– Eugene O
Feb 16 '17 at 18:47





With my proposed index, wouldn't it add a whole new page on literally every insert? Is this not something I should be worried about? Also, why would you recommend a non-clustered index?

– Eugene O
Feb 16 '17 at 18:47













Will your two NVARCHAR(MAX) columns potentially be very long? I assume that if so, you will not be indexing those columns. Check the discussion on NVARCHAR(MAX) at: stackoverflow.com/questions/148398/…

– RLF
Feb 16 '17 at 22:05







Will your two NVARCHAR(MAX) columns potentially be very long? I assume that if so, you will not be indexing those columns. Check the discussion on NVARCHAR(MAX) at: stackoverflow.com/questions/148398/…

– RLF
Feb 16 '17 at 22:05












2 Answers
2






active

oldest

votes


















0














Given the low rate of inserts, your proposed index will be completely fine, and perfect for the usage goal.



Given a fresh index with fill factor 100%, and sufficient history for each organisation to fill a page, there will be a page split on the first subsequent insert for each organisation. But then there won't be another page split for that organisation until the new page is filled up.

Even these splits and fragmentation can be mitigated by starting with a <100 fill factor, and regular reorganise.



Aaron's thought regarding using non-clustered was probably because that would allow you to cluster on the always ascending identity column, ensuring the only splits and fragmentation you will get is in a very compact separate index. But I suspect he was only mentioning that in the context of you wanting very much to avoid fragmentation, and not as something really necessary in this scenario.






share|improve this answer































    0














    If you alter your table and add a flag for the active records you can use a filtered index, in this case a unique on OrganizationID where IsActive=1. Of course you had to update the currently active record for an Organization before you can insert the new active entry.



    CREATE UNIQUE INDEX UI_test ON dbo.test (OrganizationID) WHERE IsActive=1





    share|improve this answer























      Your Answer








      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "182"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: false,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: null,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });














      draft saved

      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f164568%2foptimizing-table-indices-for-getting-the-latest-row-filtering-by-one-additional%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      2 Answers
      2






      active

      oldest

      votes








      2 Answers
      2






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      0














      Given the low rate of inserts, your proposed index will be completely fine, and perfect for the usage goal.



      Given a fresh index with fill factor 100%, and sufficient history for each organisation to fill a page, there will be a page split on the first subsequent insert for each organisation. But then there won't be another page split for that organisation until the new page is filled up.

      Even these splits and fragmentation can be mitigated by starting with a <100 fill factor, and regular reorganise.



      Aaron's thought regarding using non-clustered was probably because that would allow you to cluster on the always ascending identity column, ensuring the only splits and fragmentation you will get is in a very compact separate index. But I suspect he was only mentioning that in the context of you wanting very much to avoid fragmentation, and not as something really necessary in this scenario.






      share|improve this answer




























        0














        Given the low rate of inserts, your proposed index will be completely fine, and perfect for the usage goal.



        Given a fresh index with fill factor 100%, and sufficient history for each organisation to fill a page, there will be a page split on the first subsequent insert for each organisation. But then there won't be another page split for that organisation until the new page is filled up.

        Even these splits and fragmentation can be mitigated by starting with a <100 fill factor, and regular reorganise.



        Aaron's thought regarding using non-clustered was probably because that would allow you to cluster on the always ascending identity column, ensuring the only splits and fragmentation you will get is in a very compact separate index. But I suspect he was only mentioning that in the context of you wanting very much to avoid fragmentation, and not as something really necessary in this scenario.






        share|improve this answer


























          0












          0








          0







          Given the low rate of inserts, your proposed index will be completely fine, and perfect for the usage goal.



          Given a fresh index with fill factor 100%, and sufficient history for each organisation to fill a page, there will be a page split on the first subsequent insert for each organisation. But then there won't be another page split for that organisation until the new page is filled up.

          Even these splits and fragmentation can be mitigated by starting with a <100 fill factor, and regular reorganise.



          Aaron's thought regarding using non-clustered was probably because that would allow you to cluster on the always ascending identity column, ensuring the only splits and fragmentation you will get is in a very compact separate index. But I suspect he was only mentioning that in the context of you wanting very much to avoid fragmentation, and not as something really necessary in this scenario.






          share|improve this answer













          Given the low rate of inserts, your proposed index will be completely fine, and perfect for the usage goal.



          Given a fresh index with fill factor 100%, and sufficient history for each organisation to fill a page, there will be a page split on the first subsequent insert for each organisation. But then there won't be another page split for that organisation until the new page is filled up.

          Even these splits and fragmentation can be mitigated by starting with a <100 fill factor, and regular reorganise.



          Aaron's thought regarding using non-clustered was probably because that would allow you to cluster on the always ascending identity column, ensuring the only splits and fragmentation you will get is in a very compact separate index. But I suspect he was only mentioning that in the context of you wanting very much to avoid fragmentation, and not as something really necessary in this scenario.







          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Feb 17 '17 at 15:18









          T.H.T.H.

          81536




          81536

























              0














              If you alter your table and add a flag for the active records you can use a filtered index, in this case a unique on OrganizationID where IsActive=1. Of course you had to update the currently active record for an Organization before you can insert the new active entry.



              CREATE UNIQUE INDEX UI_test ON dbo.test (OrganizationID) WHERE IsActive=1





              share|improve this answer




























                0














                If you alter your table and add a flag for the active records you can use a filtered index, in this case a unique on OrganizationID where IsActive=1. Of course you had to update the currently active record for an Organization before you can insert the new active entry.



                CREATE UNIQUE INDEX UI_test ON dbo.test (OrganizationID) WHERE IsActive=1





                share|improve this answer


























                  0












                  0








                  0







                  If you alter your table and add a flag for the active records you can use a filtered index, in this case a unique on OrganizationID where IsActive=1. Of course you had to update the currently active record for an Organization before you can insert the new active entry.



                  CREATE UNIQUE INDEX UI_test ON dbo.test (OrganizationID) WHERE IsActive=1





                  share|improve this answer













                  If you alter your table and add a flag for the active records you can use a filtered index, in this case a unique on OrganizationID where IsActive=1. Of course you had to update the currently active record for an Organization before you can insert the new active entry.



                  CREATE UNIQUE INDEX UI_test ON dbo.test (OrganizationID) WHERE IsActive=1






                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered Jun 16 '17 at 11:03









                  Stefan WilmsStefan Wilms

                  954




                  954






























                      draft saved

                      draft discarded




















































                      Thanks for contributing an answer to Database Administrators Stack Exchange!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid



                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.


                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function () {
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f164568%2foptimizing-table-indices-for-getting-the-latest-row-filtering-by-one-additional%23new-answer', 'question_page');
                      }
                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown







                      Popular posts from this blog

                      Anexo:Material bélico de la Fuerza Aérea de Chile Índice Aeronaves Defensa...

                      Always On Availability groups resolving state after failover - Remote harden of transaction...

                      update json value to null Announcing the arrival of Valued Associate #679: Cesar Manara ...