PostgreSQL: how to partition 1000 tenants

Setup



We have a multi-tenant app with roughly 1000 customers. When a customer churns, we delete all their data after a period. A few of our tables are pretty big, and we're considering partitioning them per customer.



Problem



1000 tenants (customers) means a lot of partitions - is it reasonable to do this on PostgreSQL?



More details



Currently, tenants are separated by an account_id column that is present on every table in the database. A few tables are quite big. For example, there is an event table (the one I'm interested in partitioning) that contains audit logs and other events for everything that happens in our app.



Here are a few facts about the event table:




  • It contains about 300M rows, plus a few composite/partial indexes.

  • The count of events per account_id is very uneven: 5% of the accounts hold 50% of the data.

  • There is a timestamp field and a few others (JSONB, author_id, etc.).

  • Write operations are mostly inserts, plus deletes (per account_id). Deletes can touch millions of rows. There are no updates. Deletes of big accounts are rare and not a big performance issue at the moment.

  • Selects fetch either a specific event (by account_id + id) or all events in a given period of time. The period of time is not always set; account_id is always present in the query.
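
For concreteness, the access patterns above could be written against a table shaped roughly like this (column names here are my illustration, not the actual schema):

```sql
-- Hypothetical shape of the event table; names are illustrative only.
CREATE TABLE event (
    id         bigserial,
    account_id bigint      NOT NULL,
    author_id  bigint,
    payload    jsonb,
    created_at timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (account_id, id)
);

-- The two query shapes described above:
-- a single event:
--   SELECT * FROM event WHERE account_id = 123 AND id = 456;
-- a time window (the time filter is optional):
--   SELECT * FROM event
--    WHERE account_id = 123
--      AND created_at >= '2019-03-01' AND created_at < '2019-04-01';
```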


Possible solutions



Partitioning by account_id:



Pros:




  • Deletes will be fast, because a tenant's data can be removed with a single DROP TABLE.

  • Queries should also perform reasonably well, since every query contains WHERE account_id = 123.


Cons:




  • Are 1000 partitions too many for Postgres?

  • The uneven distribution of events per account creates a few very large partitions and many small ones.
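
A minimal sketch of what this option looks like with declarative partitioning (available since PostgreSQL 10; planning and pruning with many partitions improved substantially in versions 11 and 12, so the Postgres version matters here):

```sql
-- Sketch: LIST partitioning by tenant.
CREATE TABLE event (
    id         bigserial,
    account_id bigint      NOT NULL,
    created_at timestamptz NOT NULL
) PARTITION BY LIST (account_id);

-- One partition per tenant, typically created at signup time:
CREATE TABLE event_account_123 PARTITION OF event
    FOR VALUES IN (123);

-- When the tenant churns, the delete is a cheap metadata operation:
DROP TABLE event_account_123;
```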


Partitioning by timestamp:



Pros:




  • Recent data is accessed the most, so queries that filter by timestamp would get faster.

  • The distribution of events across partitions is more predictable and even.


Cons:




  • Deleting a single account might touch many partitions - not a big concern.

  • Queries only benefit from partition pruning when they filter by timestamp, which is not always possible.
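
The corresponding sketch for this option, again using declarative partitioning (monthly ranges are an assumption; the interval is a tuning choice):

```sql
-- Sketch: RANGE partitioning by time, one partition per month.
CREATE TABLE event (
    id         bigserial,
    account_id bigint      NOT NULL,
    created_at timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

CREATE TABLE event_2019_04 PARTITION OF event
    FOR VALUES FROM ('2019-04-01') TO ('2019-05-01');

-- Queries without a created_at filter must scan every partition,
-- so each partition still needs an index led by account_id:
CREATE INDEX ON event_2019_04 (account_id, created_at);
```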









postgresql


asked 2 mins ago by Alex Plugaru