Tuesday, March 27, 2012

Full-text search with languages other than English (e.g. Chinese, Japanese)

I have set up full-text search across multiple columns for Chinese,
but the results aren't really what I expected.
I set up the catalog to use a Chinese word breaker, and the columns
in the tables are all nvarchar.
When I run something like
select * from dbo.Table_1 where contains(*, '"<chinese character>"', language 1082)
the results are really inconsistent, especially with single
characters. I have also checked that these characters are not in the
noise-word file.
The results are better when the input is more than a single
character, but sometimes the query still returns nothing at all.
So I tried using "like" instead of "contains" with the same inputs,
and it returns the correct result 100% of the time.
Does anyone have experience with this? I guess it is a
language-specific issue. Do you know of any place that can offer me
some help?
Thank you in advance.
On Feb 11, 5:33 am, "Hilary Cotter" <hilary.cot...@.gmail.com> wrote:
> The Chinese word breaker makes multiple passes over the characters, pulling out characters to try to figure out which characters go together and which ones should be treated as a single "word". It even pulls out radicals.
> However, try this on http://search.microsoft.com/results.aspx?q=%E6%9F%90%E4%BA%BA%E7%82%B...
> searching on 某人為了商X或個人使用.
> Then try 某人為了商X或個人使 (note I have removed the last character)
> http://search.microsoft.com/results.aspx?q=%E6%9F%90%E4%BA%BA%E7%82%B...
> Nothing found.
> So while it works well in general, there are some inconsistencies.
> --
> Hilary Cotter
> Looking for a SQL Server replication book? http://www.nwsu.com/0974973602.html
> Looking for a FAQ on Indexing Services/SQL FTS? http://www.indexserverfaq.com
>
> <admin.on...@.gmail.com> wrote in message news:1171171251.557135.10240@.h3g2000cwc.googlegroups.com...
I see, thank you. So is there no way to improve that?
Thank you.
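One way to see the behavior described in the quoted reply is to ask SQL Server how the word breaker splits a given string. This is only a sketch under assumptions that are not in the thread: it needs SQL Server 2008 or later, where sys.dm_fts_parser exists (it is not available on SQL Server 2005), and it uses LCID 2052 (Simplified Chinese) and LCID 1028 (Traditional Chinese). The placeholder text stands in for whatever string is being searched.

```sql
-- Sketch only: sys.dm_fts_parser requires SQL Server 2008 or later.
-- Arguments: query string, LCID, stoplist id (0 = system stoplist),
-- accent sensitivity (0 = insensitive).

-- Tokens produced by the Simplified Chinese word breaker (LCID 2052).
SELECT occurrence, display_term, special_term
FROM sys.dm_fts_parser(N'"<chinese text>"', 2052, 0, 0);

-- Same string through the Traditional Chinese word breaker (LCID 1028),
-- to compare how the two split the characters.
SELECT occurrence, display_term, special_term
FROM sys.dm_fts_parser(N'"<chinese text>"', 1028, 0, 0);
```

If a single character never shows up as its own display_term, or is reported as a noise word in special_term, a CONTAINS search for that character alone would be expected to miss rows that LIKE finds.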
|||I hit the send button too quickly on my last post.
I tried a couple of the single characters that my search couldn't
handle on http://forums.asp.net, and it actually returns results
there.
I am just wondering whether there are any other settings I can try to
improve this?
Thank you.
|||You may have better success using the Traditional Chinese word breaker.
Hilary Cotter
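A minimal sketch of what trying the Traditional Chinese word breaker might look like. The LCIDs are assumptions based on the standard Windows locale IDs (1028 for Traditional Chinese, 2052 for Simplified Chinese); dbo.Table_1 and the placeholder come from the original question. The catalog views used here exist in SQL Server 2005 and later.

```sql
-- Languages (and LCIDs) that full-text search supports on this instance.
SELECT lcid, name
FROM sys.fulltext_languages
ORDER BY name;

-- Language each full-text indexed column of dbo.Table_1 was indexed with.
SELECT COL_NAME(object_id, column_id) AS column_name, language_id
FROM sys.fulltext_index_columns
WHERE object_id = OBJECT_ID(N'dbo.Table_1');

-- Query using the Traditional Chinese word breaker (LCID 1028).
-- The N prefix keeps the literal as Unicode.
SELECT *
FROM dbo.Table_1
WHERE CONTAINS(*, N'"<chinese character>"', LANGUAGE 1028);
```

The language named in the query controls how the search string is broken, while language_id shows how the column was broken at index time. If the two differ, the tokens may not line up, so it seems worth checking both before changing anything.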
|||They (forums.asp.net) may be doing a LIKE search rather than a full-text query.
Hilary Cotter
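To make the difference concrete, here is a rough side-by-side of the two kinds of query. Col1 is a hypothetical nvarchar column, since LIKE has to name a column and cannot use * the way CONTAINS does; LCID 2052 (Simplified Chinese) is likewise an assumption.

```sql
-- Substring match: compares raw characters in the column,
-- so no word breaker is involved. Col1 is a hypothetical column name.
SELECT *
FROM dbo.Table_1
WHERE Col1 LIKE N'%<chinese character>%';

-- Full-text match: can only find rows where the word breaker emitted
-- the search term as a token when the column was indexed.
SELECT *
FROM dbo.Table_1
WHERE CONTAINS(*, N'"<chinese character>"', LANGUAGE 2052);
```

This would explain why LIKE returns every row containing the character while CONTAINS can miss some. The trade-off is that a LIKE pattern with a leading % generally cannot use an ordinary index and has to scan the table.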
