2012年3月21日星期三

Full text Pattern Search

Hi,
Would below mentioned thing is possible in sqlserver 2005 full text.
User Table - Table
Id Description - Column name
1 'Mission Impossible' - a row
I want to ge this row if the search criteria is 'pos'. it is simillar
to ' %pos%' in like search
is it possible to get that row using Contains() or Freetext(). I
tried the following, but it doesnt return any rows.
CONTAINS(Description, ' "*pos*" ')
~ Arjun
No, you can't do this directly. While you can do some suffix (end of word)
based searches, you can't do prefix (beginning of word) based searches.
It may be possible to use the thesaurus expansion feature if you know in
advance what all stem would be.
Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
<priyananth@.gmail.com> wrote in message
news:1160545045.049080.147130@.m73g2000cwd.googlegr oups.com...
> Hi,
> Would below mentioned thing is possible in sqlserver 2005 full text.
> User Table - Table
> Id Description - Column name
> 1 'Mission Impossible' - a row
> I want to ge this row if the search criteria is 'pos'. it is simillar
> to ' %pos%' in like search
> is it possible to get that row using Contains() or Freetext(). I
> tried the following, but it doesnt return any rows.
> CONTAINS(Description, ' "*pos*" ')
> ~ Arjun
>
|||Hi Hilary,
Thank you for u r reply. Would it be possible to modify the indexing
mechanism which does. or could you please explain how to do it in
thesaurus.
Thanks.
~Arjun.
Hilary Cotter wrote:[vbcol=seagreen]
> No, you can't do this directly. While you can do some suffix (end of word)
> based searches, you can't do prefix (beginning of word) based searches.
> It may be possible to use the thesaurus expansion feature if you know in
> advance what all stem would be.
> --
> Hilary Cotter
> Director of Text Mining and Database Strategy
> RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
> This posting is my own and doesn't necessarily represent RelevantNoise's
> positions, strategies or opinions.
> Looking for a SQL Server replication book?
> http://www.nwsu.com/0974973602.html
> Looking for a FAQ on Indexing Services/SQL FTS
> http://www.indexserverfaq.com
>
> <priyananth@.gmail.com> wrote in message
> news:1160545045.049080.147130@.m73g2000cwd.googlegr oups.com...
|||Hi Hilary,
Thank you for u r reply. Would it be possible to modify the indexing
mechanism which does. or could you please explain how to do it in
thesaurus.
Thanks.
~Arjun.
Hilary Cotter wrote:[vbcol=seagreen]
> No, you can't do this directly. While you can do some suffix (end of word)
> based searches, you can't do prefix (beginning of word) based searches.
> It may be possible to use the thesaurus expansion feature if you know in
> advance what all stem would be.
> --
> Hilary Cotter
> Director of Text Mining and Database Strategy
> RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
> This posting is my own and doesn't necessarily represent RelevantNoise's
> positions, strategies or opinions.
> Looking for a SQL Server replication book?
> http://www.nwsu.com/0974973602.html
> Looking for a FAQ on Indexing Services/SQL FTS
> http://www.indexserverfaq.com
>
> <priyananth@.gmail.com> wrote in message
> news:1160545045.049080.147130@.m73g2000cwd.googlegr oups.com...
|||You would have to write your own word breaker to do this, and then write a
stemmer to stem for prefixes. I still don't think it can be done.
Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
<priyananth@.gmail.com> wrote in message
news:1161075470.863928.272330@.i3g2000cwc.googlegro ups.com...
> Hi Hilary,
> Thank you for u r reply. Would it be possible to modify the indexing
> mechanism which does. or could you please explain how to do it in
> thesaurus.
> Thanks.
> ~Arjun.
>
> Hilary Cotter wrote:
>
|||I have the same problem.
It seems crippling that full text indexing does not support searching for
terms in the middle of token. Is there really that great of a demand for
searching the beginnings of words? Are there any third party products you
can recommend that will help me provide a search feature that will integrate
with SQL server?
CONTAINS term* works great but doesn't return the results I need and
'%term%' is far too slow to use in a production environment over 169000 rows
and multiple columns.
Your input is greatly appreciated.
"Hilary Cotter" wrote:

> You would have to write your own word breaker to do this, and then write a
> stemmer to stem for prefixes. I still don't think it can be done.
> --
> Hilary Cotter
> Director of Text Mining and Database Strategy
> RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
> This posting is my own and doesn't necessarily represent RelevantNoise's
> positions, strategies or opinions.
> Looking for a SQL Server replication book?
> http://www.nwsu.com/0974973602.html
> Looking for a FAQ on Indexing Services/SQL FTS
> http://www.indexserverfaq.com
>
> <priyananth@.gmail.com> wrote in message
> news:1161075470.863928.272330@.i3g2000cwc.googlegro ups.com...
>
>
|||You might want to look at Lucene. The story on its performance is complex.
Basically SQL FTS does a really good job and offers better performance than
Lucene on a single server when you get big. When you get big you start
needing many machines to scale using lucene. The other problem with lucence
is that when you want to do property searches you end up storing data in
your full-text index and you search response times decline radically. So you
need to partition which means more machines.
How about you return your results from sql fts and then trim them using the
like? You will need to use the thesaurus option to expand your search
criteria to all possible terms you might be looking for.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"Uriah" <Uriah@.discussions.microsoft.com> wrote in message
news:521E3F54-443B-4855-9A80-5DB71C3096F2@.microsoft.com...[vbcol=seagreen]
>I have the same problem.
> It seems crippling that full text indexing does not support searching for
> terms in the middle of token. Is there really that great of a demand for
> searching the beginnings of words? Are there any third party products you
> can recommend that will help me provide a search feature that will
> integrate
> with SQL server?
> CONTAINS term* works great but doesn't return the results I need and
> '%term%' is far too slow to use in a production environment over 169000
> rows
> and multiple columns.
> Your input is greatly appreciated.
> "Hilary Cotter" wrote:
|||My goal is to have a customer searching for parts over several fields and
finding any one of several tokens anywhere within the text (similar to the
LIKE '%term%' functionality). For example, they might want to search for
'56789' and get back the part with part number 'GAR56789'. I don't see how
it is possible to do that with FTS at all. If I knew what the search terms
were going to be I could certainly pre-add them to the thesaurus but the
point of the search is that I can't predict what terms they might enter.
Maybe this is where SQL server functionality breaks down and it's time to
move on to search appliance or software platform. I'll take a look at Lucene.
"Hilary Cotter" wrote:

> You might want to look at Lucene. The story on its performance is complex.
> Basically SQL FTS does a really good job and offers better performance than
> Lucene on a single server when you get big. When you get big you start
> needing many machines to scale using lucene. The other problem with lucence
> is that when you want to do property searches you end up storing data in
> your full-text index and you search response times decline radically. So you
> need to partition which means more machines.
> How about you return your results from sql fts and then trim them using the
> like? You will need to use the thesaurus option to expand your search
> criteria to all possible terms you might be looking for.
> --
> Hilary Cotter
> Looking for a SQL Server replication book?
> http://www.nwsu.com/0974973602.html
> Looking for a FAQ on Indexing Services/SQL FTS
> http://www.indexserverfaq.com
>
> "Uriah" <Uriah@.discussions.microsoft.com> wrote in message
> news:521E3F54-443B-4855-9A80-5DB71C3096F2@.microsoft.com...
>
>
|||If it is always a prefix you might want to store your content in reverse and
then reverse your query and do a wildcard.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"Uriah" <Uriah@.discussions.microsoft.com> wrote in message
news:B695EBD3-11F8-4DEA-A81D-A1828178C577@.microsoft.com...[vbcol=seagreen]
> My goal is to have a customer searching for parts over several fields and
> finding any one of several tokens anywhere within the text (similar to the
> LIKE '%term%' functionality). For example, they might want to search for
> '56789' and get back the part with part number 'GAR56789'. I don't see
> how
> it is possible to do that with FTS at all. If I knew what the search
> terms
> were going to be I could certainly pre-add them to the thesaurus but the
> point of the search is that I can't predict what terms they might enter.
> Maybe this is where SQL server functionality breaks down and it's time to
> move on to search appliance or software platform. I'll take a look at
> Lucene.
> "Hilary Cotter" wrote:

没有评论:

发表评论