Monday, March 26, 2012

Full Text Search Indexing HTML - does the filter expect certain tags to be present as standard?

Hi, I was wondering if any SQL Server gurus out there could help me...

I have a table that contains text resources for my application. The resources are multilingual, and I've read that if I add an HTML language indicator meta tag, e.g.
<META NAME="MS.LOCALE" CONTENT="ES">
and store the text in a varbinary column, with a supporting Document Type column (varchar(5)) containing ".html", then the full-text indexing service should choose the appropriate language word breaker when indexing the text. (I hope this is the correct technique for the best multilingual support in a single table?)
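
For readers following along, the setup described above might look roughly like the sketch below. Only the Resource table, the Document column and the ".html" type value come from the post; the key column ResourceId, the constraint PK_Resource, the column name DocumentType and the catalog ResourceCatalog are hypothetical names chosen for illustration, and varbinary(max) is an assumption about the exact column type.

```sql
-- Minimal sketch of the described setup (hypothetical names noted above).
CREATE TABLE Resource (
    ResourceId   int IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_Resource PRIMARY KEY,   -- unique key required by the full-text index
    DocumentType varchar(5)     NOT NULL,     -- e.g. '.html'; tells the indexer which filter to load
    Document     varbinary(max) NOT NULL      -- raw bytes of the HTML fragment
);

CREATE FULLTEXT CATALOG ResourceCatalog;

-- TYPE COLUMN links the binary column to the column holding the file extension.
CREATE FULLTEXT INDEX ON Resource
    (Document TYPE COLUMN DocumentType)
    KEY INDEX PK_Resource
    ON ResourceCatalog
    WITH CHANGE_TRACKING AUTO;

-- Store one resource as single-byte (varchar) text converted to binary.
INSERT INTO Resource (DocumentType, Document)
VALUES ('.html',
        CAST('<META NAME="MS.LOCALE" CONTENT="EN">Search for keywords:' AS varbinary(max)));
```

Whether the text is converted from varchar or nvarchar changes the bytes the filter sees, so the encoding of the stored data is one thing worth checking in a setup like this.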

However, when I query this data, the result is always 0 rows (no errors are encountered), e.g.
DECLARE @SearchWord nvarchar(256)
SET @SearchWord = 'search' -- Yes, this word is definitely present in my resources.
SELECT * FROM Resource WHERE CONTAINS(Document, @SearchWord)
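
When a CONTAINS query returns no rows and no error, a few catalog queries can help narrow down whether the rows were indexed at all. The following is a sketch of such checks, not a confirmed diagnosis of this particular problem; it assumes the table is named Resource as in the query above and that the server exposes the standard full-text catalog views.

```sql
-- 1. Is a filter registered for the extension stored in the type column?
SELECT document_type, path
FROM sys.fulltext_document_types
WHERE document_type = '.html';

-- 2. Which type column and language does the index use for each indexed column?
SELECT c.name AS indexed_column,
       t.name AS type_column,
       ic.language_id
FROM sys.fulltext_index_columns AS ic
JOIN sys.columns AS c
  ON c.object_id = ic.object_id AND c.column_id = ic.column_id
LEFT JOIN sys.columns AS t
  ON t.object_id = ic.object_id AND t.column_id = ic.type_column_id
WHERE ic.object_id = OBJECT_ID('Resource');

-- 3. How many rows were indexed, and how many failed?
SELECT OBJECTPROPERTYEX(OBJECT_ID('Resource'), 'TableFulltextItemCount') AS indexed_rows,
       OBJECTPROPERTYEX(OBJECT_ID('Resource'), 'TableFulltextFailCount') AS failed_rows;
```

If the first query returns nothing, no filter is available for that extension; if the third shows zero indexed rows or a non-zero failure count, the problem is likely in population rather than in the query itself.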

I'm a little puzzled as Full Text search is working fine on another table that employs an nvarchar column (just plain text, no html).

Does the filter used for full text indexing of html expect certain tags to be present as standard? E.g. <html> and <body> tags? At present the data I have stored might look like this (no html or body wrapping tags):

Example record 1 data: <META NAME="MS.LOCALE" CONTENT="EN">Search for keywords:

Example record 2 data: <META NAME="MS.LOCALE" CONTENT="EN">Sorry no results were found for your search.

etc.

Any pointers / suggestions would be greatly appreciated. Cheers,
Gavin.

UPDATE:
I have tried wrapping the text in more conventional HTML tags and rebuilding the full-text index, but my query still never returns any rows. Example of the wrapping tried: <HTML><HEAD><META NAME="MS.LOCALE" CONTENT="EN"></HEAD><BODY>Test text.</BODY></HTML>

I've also tried stripping all HTML tags from the content and setting the Document Type column to .txt, but I still get no rows returned.

I've further isolated the problem and have started a new thread to request more specific help:
http://forums.microsoft.com/TechNet/ShowPost.aspx?PostID=1844786&SiteID=17
