Tuesday, March 27, 2012

Full-text search with languages other than English (e.g. Chinese, Japanese)

I have set up full-text search to search multiple columns containing Chinese text.

However, the search results are not what I expected.

I have set up the catalog with a Chinese word breaker, and the columns in the tables are all nvarchar.

When I run something like

select * from dbo.Table_1 where contains(*, '"<chinese character>"',language 1082)

the search results are really inconsistent, especially with single characters. I have also checked that these characters are not in the noise word file.
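One thing worth checking at this point is the language ID itself. The sketch below lists the word-breaker languages registered on the instance and then repeats the query with a Unicode literal. It assumes SQL Server 2005 or later, where sys.fulltext_languages is available. 2052 is the standard Windows LCID for Simplified Chinese and 1028 for Traditional Chinese; the search term is a placeholder, as in the query above.

```sql
-- List the languages this instance can use for full-text word breaking.
SELECT lcid, name
FROM sys.fulltext_languages
ORDER BY name;

-- Same search as above, but with an N'' (Unicode) literal and an explicit
-- Chinese LCID. 2052 = Simplified Chinese, 1028 = Traditional Chinese.
-- Replace the placeholder with the actual search term.
SELECT *
FROM dbo.Table_1
WHERE CONTAINS(*, N'"<chinese character>"', LANGUAGE 2052);
```

If 1082 does not show up as a Chinese entry in that list, the word breaker applied to the query is probably not the intended one. Also, without the N prefix a string literal is converted to the database's default code page, which can turn Chinese characters into question marks before the search even runs.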

The results are better when the input is more than a single character, but sometimes the search still returns nothing at all.

So I tried using "like" instead of "contains" with the same inputs, and it returns the correct result 100% of the time.

Does anyone have experience with this? I suspect it is an issue specific to the language. Do you know of any place that could offer me some help?

Thank you in advance.

I think you need to use the correct Chinese collation instead of relying on nvarchar alone. nvarchar only tells SQL Server that the data is Unicode rather than ASCII, and the Chinese and Japanese scripts have thousands of characters, while the Latin alphabet has just 26 letters. There are six Chinese and, I think, three Japanese collations defined in SQL Server; try the link below to choose your collation. I also think you need to make sure your Microsoft Search catalog is populated, and check the noise word file. Search for "noise words" and "Microsoft Search catalog" in SQL Server BOL (Books Online). Hope this helps.
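A minimal sketch of the population check suggested above, assuming SQL Server 2005 or later. The catalog name MyCatalog is hypothetical; dbo.Table_1 is the table from the question.

```sql
-- PopulateStatus 0 means idle (no population in progress); ItemCount is
-- the number of indexed items currently in the catalog.
-- 'MyCatalog' is a hypothetical name: substitute the real catalog name.
SELECT FULLTEXTCATALOGPROPERTY('MyCatalog', 'PopulateStatus') AS populate_status,
       FULLTEXTCATALOGPROPERTY('MyCatalog', 'ItemCount')      AS item_count;

-- Show which language (LCID) each full-text indexed column was indexed with.
SELECT c.name AS column_name, fic.language_id
FROM sys.fulltext_index_columns AS fic
JOIN sys.columns AS c
  ON c.object_id = fic.object_id
 AND c.column_id = fic.column_id
WHERE fic.object_id = OBJECT_ID('dbo.Table_1');
```

If the language_id reported for a column differs from the LANGUAGE value passed to CONTAINS, the index and the query are using different word breakers, which could explain inconsistent matches.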

http://msdn2.microsoft.com/en-us/library/ms180175.aspx

|||

Thank you so much.

But after trying lots of different collation settings, it still yields the same results.
Is there anything more that I can try?

Thank you.

|||

Out of curiosity, I tried searching this ASP.NET forum for a single Chinese character that my own search couldn't handle. The forum's search actually returns results.
So I must be doing something wrong, because I really doubt that this forum does anything special to handle Chinese characters.

Just wondering if there is anything else that I could try?

Thank you in advance.
