How to optimize a Full Text Search
So we have this VARCHAR(255) column called code, which is a string of varying size we use in a WHERE clause.
Doing this search slows the query down by about 300ms and since the query is run a few hundred times a day, this amounts to quite some time lost.
The statement looks something like this:
SELECT *
FROM table1 t1
WHERE t1.code LIKE '%so%'
Where 'so' is the start of the string 'something'.
I've optimized the rest of the query, so this is really the only part that is left unoptimized.
I've tried adding an index to the column. However, the input string doesn't always match the start of the stored string; the match could be anywhere within it, which means a B-tree index doesn't help.
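To illustrate the point about B-tree indexes, here is a minimal sketch, assuming MySQL and a hypothetical index name idx_code (table1 and code come from the statement above). A B-tree index is ordered by the leading characters of the value, so the optimizer can typically use it for a pattern with a fixed prefix, but not for one with a leading wildcard. The exact EXPLAIN output depends on the MySQL version and on the data.

```sql
-- idx_code is a hypothetical name chosen for this illustration.
CREATE INDEX idx_code ON table1 (code);

-- Fixed prefix: the optimizer can usually do a range scan on idx_code.
EXPLAIN SELECT * FROM table1 t1 WHERE t1.code LIKE 'so%';

-- Leading wildcard: there is no prefix to seek on, so expect a full scan.
EXPLAIN SELECT * FROM table1 t1 WHERE t1.code LIKE '%so%';
```

Comparing the type and key columns of the two EXPLAIN results should show whether the index is being used in each case.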
I've also tried adding a FULLTEXT index, but it didn't speed up my query; it only used more disk space.
- Is there any way to speed up a full text search with variable-length strings, where the searched string could be at any position within each row of the column?
The problem with the FULLTEXT index is that MATCH ... AGAINST has a 50% threshold, which I don't really understand. The effect is that it returns an empty result set, because the search term matches too many rows.
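For context on the threshold: according to the MySQL manual, the 50% threshold applies to natural language searches on MyISAM FULLTEXT indexes, and boolean mode searches do not use it. Below is a minimal sketch, assuming a FULLTEXT index on code with the hypothetical name ft_code. Note that the trailing * matches words that begin with 'so', not 'so' at an arbitrary position inside a word, so this is not a drop-in replacement for LIKE '%so%'.

```sql
-- ft_code is a hypothetical name chosen for this illustration.
CREATE FULLTEXT INDEX ft_code ON table1 (code);

-- Boolean mode search; the * truncation operator matches words starting with 'so'.
SELECT *
FROM table1 t1
WHERE MATCH(t1.code) AGAINST('so*' IN BOOLEAN MODE);
```

Whether this helps depends on how the generated code strings split into words, which the question does not specify.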
mysql index innodb optimization
Please provide the entire SHOW CREATE TABLE and some examples of the 255 characters in "code", plus more examples of search strings (in addition to '%so%'); we might be able to come up with a workaround involving either the schema or the technique for searching.
– Rick James
May 3 '17 at 14:41
@RickJames Unfortunately I do not have permission to show the SHOW CREATE TABLE output or to describe the structure of "code". All I can say is that code is a generated string based on multiple results from our application layer. We don't store these results in another table or column.
– FMashiro
May 4 '17 at 8:25
edited Feb 15 '18 at 17:19 by Oreo
asked May 3 '17 at 6:25 by FMashiro
1 Answer
From Performance analysis of MySQL's FULLTEXT indexes and LIKE queries for full text search by Henning Koch:
FULLTEXT performs better when your text has low redundancy
FULLTEXT performance differs by a factor of 78 between a vocabulary of 1,000 words and 100,000 words. I guess that larger vocabularies result in a very wide but shallow inverted index that can quickly determine if a query has matches or not. An educated person has a passive vocabulary of 15,000 to 20,000 words, so FULLTEXT should work well for natural language texts.
It doesn't necessarily have low redundancy; however, the strings we search for are so short and generic that the search returns nothing because of the threshold used in FT index searches.
– FMashiro
May 3 '17 at 7:29
Can you not tune the FT index threshold to give back fewer results?
– Oreo
Feb 15 '18 at 17:01
edited May 26 '17 at 15:01 by Paul White♦
answered May 3 '17 at 7:00 by l.lijith