Sunday, March 11, 2012

Full text index ranking algorithm in SQL Server 2005

I am using a full-text index to search one table. The table can be
simplified to (DiscussionID, Discussions), where DiscussionID is the
primary key.
I use FREETEXTTABLE rather than CONTAINSTABLE here.
For example, I search the Discussions column for the keywords
'exchange server'. Interestingly, rows with an exact match for
"exchange server" rank lower, while rows with "server" rank higher.
So I changed the search string to ' "exchange server" ' (note the double
quotes, which ask for an exact match). However, this no longer returns
discussions that contain just the 'server' or 'exchange' keywords.
Does anyone know more about how the ranking algorithm works? What is the
difference between searching for single words and for phrases? I am sure
it is not as simple as ranking by the most matches. And is there a way to
supply a different ranking algorithm, or even a custom one? I know it
cannot be done in 7.0 or 2000, but I am not sure about 2005.
Thanks,
Shelly
Basically, the density (frequency) of a keyword relative to the other
keywords contributes to a higher rank. Because "server" occurs more
frequently, it gets a higher ranking.
The actual formula is Okapi BM25.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
Shelly,
You may want to review this Dec. 2003 SQL Server 2005 FTS white paper for
info on how the ranking algorithms work:
SQL Server 2005 Full-Text Search: Internals and Enhancements (published
December 2003)
http://msdn.microsoft.com/library/de...05ftsearch.asp
Ranking of CONTAINSTABLE
StatisticalWeight = Log2( ( 2 + IndexDocumentCount ) / KeyDocumentCount )
Rank = min( MaxQueryRank, HitCount * 16 * StatisticalWeight / MaxOccurrence )
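To make the two CONTAINSTABLE formulas above concrete, here is a minimal Python sketch that simply transcribes them. The function names and the default cap of 1000 for MaxQueryRank are my own assumptions for illustration; the white paper excerpt does not state the cap's value, and this is not SQL Server code.

```python
import math

def statistical_weight(index_document_count, key_document_count):
    # StatisticalWeight = Log2((2 + IndexDocumentCount) / KeyDocumentCount)
    return math.log2((2 + index_document_count) / key_document_count)

def containstable_rank(hit_count, index_document_count, key_document_count,
                       max_occurrence, max_query_rank=1000):
    # max_query_rank=1000 is an assumed default, not taken from the thread.
    weight = statistical_weight(index_document_count, key_document_count)
    # Rank = min(MaxQueryRank, HitCount * 16 * StatisticalWeight / MaxOccurrence)
    return min(max_query_rank, hit_count * 16 * weight / max_occurrence)

# Example: 6 indexed rows, the key appears in 2 of them,
# 3 hits in this row, and at most 4 occurrences in any row.
print(containstable_rank(3, 6, 2, 4))
```

Note that a key found in fewer rows gets a larger StatisticalWeight, so rare words push the rank up more than common ones.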
Ranking of FREETEXT [or FREETEXTTABLE]
Freetext ranking is based on the OKAPI BM25 ranking formula. Each term in
the query is ranked, and the values are summed. Freetext queries will add
words to the query via inflectional generation (stemmed forms of the
original query terms); these words are treated as separate terms with no
special weighting or relationship with the words from which they were
generated. Synonyms generated from the Thesaurus feature are treated as
separate, equally weighted terms.
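As a rough illustration of "each term in the query is ranked, and the values are summed", here is a small Python sketch of the common textbook form of Okapi BM25. It is an assumption-laden stand-in, not SQL Server's implementation: the function name, the whitespace tokenizer, and the parameters k1=1.2 and b=0.75 are all mine, and it does no stemming or thesaurus expansion.

```python
import math
from collections import Counter

def bm25_scores(query_terms, documents, k1=1.2, b=0.75):
    """Score each document against the query with textbook BM25 (a sketch)."""
    if not documents:
        return []
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Number of documents that contain each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            f = term_freq[term]
            if f == 0:
                continue
            # Rarer terms get a larger weight.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency, dampened and normalized by document length.
            norm = f + k1 * (1 - b + b * len(tokens) / avg_len)
            # Each term is scored independently and the values are summed.
            score += idf * f * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = ["exchange server setup", "server setup exchange", "printer driver issue"]
print(bm25_scores(["exchange", "server"], docs))
```

In this sketch the first two rows get identical scores even though only the first contains the words as an adjacent phrase. Because terms are scored one by one, nothing rewards an exact phrase match, which is consistent with what Shelly observed.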
Regards,
John
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/
