How to optimize a Full Text Search



So we have this VARCHAR(255) column called code, which is a string of varying size we use in a WHERE clause.



This search slows the query down by about 300 ms, and since the query runs a few hundred times a day, that adds up to a noticeable amount of lost time.



The statement looks something like this:



SELECT *
FROM table1 t1
WHERE t1.code LIKE '%so%'


Where 'so' is the start of the string 'something'.



I've optimized the rest of the query, so this is really the only part that is left unoptimized.



I've tried adding an index to the column. However, the input string doesn't always match the start of the stored string, so the match could be anywhere within it, which means a B-tree index can't be used.
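To illustrate why the B-tree index doesn't help here, the following is a minimal sketch that compares the plans for a prefix pattern and a leading-wildcard pattern. The index name idx_code is hypothetical; table1 and code are taken from the query above. The plan types mentioned in the comments are what MySQL typically reports, not output captured from this system.

```sql
-- idx_code is a hypothetical name for an ordinary B-tree index on code.
CREATE INDEX idx_code ON table1 (code);

-- Prefix pattern: the index is ordered by the leading characters of code,
-- so the optimizer can usually do a range scan (type = range).
EXPLAIN SELECT * FROM table1 t1 WHERE t1.code LIKE 'so%';

-- Leading wildcard: there is no fixed prefix to seek on, so every value
-- has to be examined (typically type = ALL, a full table scan).
EXPLAIN SELECT * FROM table1 t1 WHERE t1.code LIKE '%so%';
```

Comparing the type, key and rows columns of the two EXPLAIN results should show whether the index is being used at all for the '%so%' form.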



I've also tried adding a FULLTEXT index, but it didn't speed up my query; it only used more disk space.




  • Is there any way to speed up a full-text search on variable-length strings, where the searched string could be at any position within each row of the column?


The problem with the FULLTEXT index is that MATCH ... AGAINST applies a 50% threshold, which I don't really understand. The effect is that it returns an empty result set, because the search term matches too many rows.
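For context, the 50% threshold is documented for MyISAM FULLTEXT searches in natural language mode; boolean mode does not apply it, and InnoDB FULLTEXT indexes do not use it. One option that may fit substring-style lookups is the ngram parser, available for InnoDB and MyISAM from MySQL 5.7.6. The sketch below assumes such a version and the default ngram_token_size of 2; the index name ft_code_ngram is hypothetical.

```sql
-- ft_code_ngram is a hypothetical index name. WITH PARSER ngram tokenizes
-- code into overlapping character sequences instead of whole words.
ALTER TABLE table1 ADD FULLTEXT INDEX ft_code_ngram (code) WITH PARSER ngram;

-- Boolean mode avoids the MyISAM natural-language 50% threshold.
-- With ngram_token_size = 2, 'so' is exactly one indexed token.
SELECT *
FROM table1 t1
WHERE MATCH(t1.code) AGAINST('so' IN BOOLEAN MODE);
```

The results are not guaranteed to be identical to LIKE '%so%': the ngram parser drops spaces, and stopword settings can exclude some tokens, so it is worth comparing both queries on real data before switching.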







mysql index innodb optimization






edited Feb 15 '18 at 17:19 by Oreo

asked May 3 '17 at 6:25 by FMashiro





  • Please provide the entire SHOW CREATE TABLE and some examples of the 255 characters in "code", plus more examples of search strings (in addition to '%so%'); we might be able to come up with a workaround involving either the schema or the technique for searching.

    – Rick James
    May 3 '17 at 14:41











  • @RickJames Unfortunately, I don't have permission to show the SHOW CREATE TABLE output or the structure of "code". All I can say is that code is a generated string based on multiple results from our application layer. We don't store these results in another table or column.

    – FMashiro
    May 4 '17 at 8:25



















1 Answer
From "Performance analysis of MySQL's FULLTEXT indexes and LIKE queries for full text search" by Henning Koch:




FULLTEXT performs better when your text has low redundancy

FULLTEXT performance differs by a factor of 78 between a vocabulary of 1,000 words and 100,000 words. I guess that larger vocabularies result in a very wide but shallow inverted index that can quickly determine if a query has matches or not. An educated person has a passive vocabulary of 15,000 to 20,000 words, so FULLTEXT should work well for natural language texts.








edited May 26 '17 at 15:01 by Paul White

answered May 3 '17 at 7:00 by l.lijith













  • It doesn't necessarily have low redundancy; however, the strings we search for are so short and generic that nothing is returned because of the threshold used in FULLTEXT index searches.

    – FMashiro
    May 3 '17 at 7:29











  • Can you not tune the FT index threshold to give back fewer results?

    – Oreo
    Feb 15 '18 at 17:01
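On the tuning question: as far as I know, the 50% threshold itself is not exposed as a system variable, but the minimum indexed token lengths are, and they matter for a two-character term like 'so'. A minimal sketch for checking the current settings (the defaults in the comments are the documented ones and may differ on a given server):

```sql
-- Minimum word length indexed by MyISAM FULLTEXT indexes (default 4).
SHOW VARIABLES LIKE 'ft_min_word_len';

-- Minimum token length indexed by InnoDB FULLTEXT indexes (default 3).
SHOW VARIABLES LIKE 'innodb_ft_min_token_size';

-- Token size used by the ngram parser, if it is in use (default 2).
SHOW VARIABLES LIKE 'ngram_token_size';
```

With the default settings, a term as short as 'so' is below the minimum length for both engines' standard parsers and is therefore not indexed. These variables can only be set at server startup, and existing FULLTEXT indexes need to be rebuilt after changing them.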


















