Quickly appending data to text-column


I'm using a statement similar to the following to append data to a mediumtext column across a bunch of rows:

INSERT INTO myTable (myKey,myVal)
VALUES (1,'foo'),(69,'bar'),(1337,'baz')
ON DUPLICATE KEY UPDATE myVal=CONCAT(myVal,VALUES(myVal));


At first this is really fast. But the more data there already is the slower it gets. It seems that when appending data, the whole field is read, merged with the new bit and then inserted again.
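
As a rough illustration of why this gets slower over time: if every append rewrites the whole value, the total amount written grows with the square of the number of appends. The sketch below is only a cost model of the behaviour described above, not MySQL's storage code; the function names and the chunk size are made up for the example.

```python
def bytes_rewritten(chunk_size: int, appends: int) -> int:
    """Total bytes written if every append rewrites the whole value.

    After the i-th append the value is i * chunk_size bytes long, and the
    model assumes all of it is written again.
    """
    return sum(i * chunk_size for i in range(1, appends + 1))


def bytes_appended(chunk_size: int, appends: int) -> int:
    """Total bytes written if only the new chunk had to be written."""
    return chunk_size * appends


if __name__ == "__main__":
    # Compare both models for a growing number of 3-byte appends.
    for n in (10, 100, 1000):
        print(n, bytes_rewritten(3, n), bytes_appended(3, n))
```

Under this model the rewrite cost is chunk_size * n * (n + 1) / 2, while a true in-place append would cost chunk_size * n.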



This MySQL bug report describes exactly this issue:
https://bugs.mysql.com/bug.php?id=47937



Is there any way of making this faster?

  • My guess is the answer is NO. A database (at least all the ones I've worked with) doesn't treat text the way a Java StringBuilder does, but rather like a Java String, which suffers from exactly the same problem.

    – joanolo
    Jun 30 '17 at 2:10

  • To paraphrase the bug report from 8 years ago, "yeah, nice to have, but not likely to happen".

    – Rick James
    Jul 7 '17 at 14:14

mysql mariadb

asked Jun 30 '17 at 0:39
Clox

2 Answers

"At first this is really fast. But the more data there already is the slower it gets. It seems that when appending data, the whole field is read, merged with the new bit and then inserted again."

MySQL is an MVCC database. One consequence of MVCC is that a row must be rewritten entirely on update. It may be even more surprising to you, but it's considered an optimization if the index does not need to be written to when you're rewriting a row.

"Assuming that mysql already knows the length of the field as part of the record storage, it should be possible to append data to a field without having to read it, so while the original query reads and writes the whole existing string, it should be possible to optimise that by recognising that source and destination fields are the same and simply appending to the existing value and increasing the stored length. This would return the complexity to a more manageable O(n), and thus give reasonable performance."

That would leave others vulnerable to a "dirty read." The MySQL developers likely have too many real issues to deal with to worry about this non-issue. You can't just write to a tuple. What if it's in someone else's snapshot? What if you want to ROLLBACK later? That is simply not the way MVCC works. You copy and modify. Then you commit and flush.
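
The "copy and modify, then commit" idea can be sketched with a toy versioned row. This is only a simplified model of the argument above, assuming a snapshot is just the id of the newest committed version; the class and method names are invented for illustration, and it is not how InnoDB actually lays out data (the comments below point out that InnoDB relies on undo logs).

```python
class VersionedRow:
    """Toy model of copy-and-modify under MVCC (not InnoDB's real layout)."""

    def __init__(self, value: str) -> None:
        # Committed versions, oldest first; each entry is (commit_id, value).
        self._versions = [(0, value)]
        self._next_commit_id = 1

    def snapshot(self) -> int:
        """Return the id of the newest committed version (a reader's view)."""
        return self._versions[-1][0]

    def read(self, snapshot_id: int) -> str:
        """Return the newest version visible to the given snapshot."""
        for commit_id, value in reversed(self._versions):
            if commit_id <= snapshot_id:
                return value
        raise LookupError("no version visible to this snapshot")

    def begin_append(self, suffix: str) -> str:
        """Build a private, uncommitted copy; committed versions stay untouched."""
        return self._versions[-1][1] + suffix

    def commit(self, new_value: str) -> int:
        """Publish the private copy as a new version and return its commit id."""
        commit_id = self._next_commit_id
        self._versions.append((commit_id, new_value))
        self._next_commit_id += 1
        return commit_id


if __name__ == "__main__":
    row = VersionedRow("foo")
    reader = row.snapshot()              # a reader starts before the append
    pending = row.begin_append("bar")    # copy and modify
    print(row.read(reader))              # the reader still sees the old value
    row.commit(pending)                  # discarding `pending` instead would be a rollback
    print(row.read(reader), row.read(row.snapshot()))
```

In this model an in-place append would change the value under the first reader's feet, which is the "dirty read" concern raised above.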






  • MySQL as MVCC: that depends on the engine, and I don't think it really works like PostgreSQL but more like Oracle, which uses "undo logging" and actually keeps the old versions there, so that what it considers "the last one" is written only once.

    – joanolo
    Jun 30 '17 at 2:15

  • See: dev.mysql.com/doc/refman/5.7/en/glossary.html#glos_undo_log. I guess doing things this way avoids the need to vacuum, but I think it works in a very asymmetrical fashion, which looks awful to me. If you have two transactions going on, I don't know how it decides which version should go into the undo log and which should not.

    – joanolo
    Jun 30 '17 at 2:35

My guess is the answer is NO. You won't make this any faster.



A database (at least all the ones I've worked with) doesn't treat text the way a Java StringBuilder does, but rather like a Java String, which suffers from exactly the same problem (as would strings in most programming languages, except maybe in the V8 implementation of JavaScript). I wouldn't consider that a bug. It's a design decision. A database is not normally used in this fashion.



A StringBuilder has extra room for extra text, which is appended at the end of the already used space. When it actually runs out of space, it allocates a big chunk for further filling. A String is immutable and does not have extra room to add new text "at the end".
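
The difference can be sketched in a few lines. This is a toy model written in Python rather than Java, assuming a doubling growth policy; GrowableBuffer and immutable_append_cost are hypothetical names for this example, and the counters only model how many characters get copied, they do not measure real runtime behaviour.

```python
class GrowableBuffer:
    """Toy StringBuilder: keeps spare capacity and counts copied characters."""

    def __init__(self, capacity: int = 16) -> None:
        self._data = [""] * capacity   # pre-allocated slots
        self._length = 0               # slots actually in use
        self.copied = 0                # characters moved during re-allocation

    def append(self, text: str) -> None:
        needed = self._length + len(text)
        if needed > len(self._data):
            # Out of room: allocate a bigger chunk and copy the old content once.
            new_capacity = max(needed, 2 * len(self._data))
            new_data = [""] * new_capacity
            new_data[: self._length] = self._data[: self._length]
            self.copied += self._length
            self._data = new_data
        # Write the new text into the spare room at the end.
        self._data[self._length : needed] = text
        self._length = needed

    def __str__(self) -> str:
        return "".join(self._data[: self._length])


def immutable_append_cost(chunks):
    """Characters copied when each append builds a brand-new string."""
    value, copied = "", 0
    for chunk in chunks:
        copied += len(value)   # the existing content is copied every time
        value = value + chunk
    return value, copied


if __name__ == "__main__":
    chunks = ["abc"] * 100
    buf = GrowableBuffer()
    for chunk in chunks:
        buf.append(chunk)
    _, immutable_copied = immutable_append_cost(chunks)
    print(buf.copied, immutable_copied)
```

With spare capacity the old content is copied only when the buffer runs out of room, whereas the immutable variant copies everything on every append.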



Perhaps what you should do is leave your original row unmodified and instead have a secondary, related table where you store every string associated with myKey, together with a string_order column (which could be an auto_increment or a current timestamp). When needed, retrieve everything together with a GROUP_CONCAT over those rows. The related table would then work as a kind of log, into which you insert the new "events" in order, one piece at a time.
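
A minimal sketch of that log-table idea follows, using Python's built-in sqlite3 module so it can be run without a server. The table name myTable_log, the column name chunk, and the helper functions are assumptions made for this example; on MySQL or MariaDB the column would use AUTO_INCREMENT instead of SQLite's AUTOINCREMENT.

```python
import sqlite3


def open_log() -> sqlite3.Connection:
    """Create an in-memory database holding the (hypothetical) log table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE myTable_log (
            string_order INTEGER PRIMARY KEY AUTOINCREMENT,  -- insertion order
            myKey        INTEGER NOT NULL,
            chunk        TEXT    NOT NULL
        )
        """
    )
    conn.execute("CREATE INDEX idx_log_key ON myTable_log (myKey, string_order)")
    return conn


def append_chunks(conn: sqlite3.Connection, rows) -> None:
    """Appending is a plain INSERT; existing text is never read or rewritten."""
    conn.executemany("INSERT INTO myTable_log (myKey, chunk) VALUES (?, ?)", rows)
    conn.commit()


def read_value(conn: sqlite3.Connection, my_key: int) -> str:
    """Reassemble the full text for one key, in insertion order."""
    cur = conn.execute(
        "SELECT chunk FROM myTable_log WHERE myKey = ? ORDER BY string_order",
        (my_key,),
    )
    return "".join(chunk for (chunk,) in cur)


if __name__ == "__main__":
    conn = open_log()
    append_chunks(conn, [(1, "foo"), (69, "bar"), (1337, "baz")])
    append_chunks(conn, [(1, "qux")])
    print(read_value(conn, 1))
```

On MySQL the reassembly could be done server-side with something like SELECT GROUP_CONCAT(chunk ORDER BY string_order SEPARATOR '') FROM myTable_log WHERE myKey = 1. Keep in mind that GROUP_CONCAT results are truncated at group_concat_max_len, which defaults to 1024 bytes, so that setting would have to be raised for mediumtext-sized values.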






answered Jun 30 '17 at 1:18
Evan Carroll

answered Jun 30 '17 at 2:23
joanolo
