Hi, I was wondering if any SQL Server gurus out there could help me...
I have a table I'm trying to apply a full text catalog to, however no results are ever returned due to the text column being cataloged being of varbinary(max) that's being populated from a converted nvarchar(max) value.
To re-create the problem quickly...
If I populate the column via
CONVERT(varbinary(max), 'test text')
then there is no problem, I get results as expected.
However if I populate the column via
CONVERT(varbinary(max), CAST('test text' as nvarchar(max)))
no results are ever returned.
Is this a bug with SQL Server 2005 Full Text Indexing? I'm happily creating full text catalogs when an nvarchar is not getting converted into a varbinary.
I'm setting the Document Type column to '.html' (I've tried changing this to '.txt' in case it was a fault with the html ifilter but the problem persists so I believe I can rule this out).
The reason I need to convert an nvarchar to varbinary is that the table holds multi-lingual text and I'm adding a html meta tag <META NAME="MS.LOCALE" CONTENT="ES"> to the beginning in order for the full text indexing word breaker to select the correct language to catalog the text with. The aim being to provide more relevant searches in users native languages (I've read a few articles that describe this technique, but it's the first time I've tried to apply it).
Any pointers / suggestions would be greatly appreciated. Cheers,
Gavin.
Below is a T-SQL script you can run to demonstrate the effect I'm experiencing...
-- Create test database
CREATE DATABASE FullTextTest
GO
USE FullTextTest
GO
-- Create test data table
CREATE TABLE TestTable
(
pk UNIQUEIDENTIFIER NOT NULL CONSTRAINT tablePK PRIMARY KEY,
varbinarycol VARBINARY(MAX),
documentExtension VARCHAR(5),
)
GO
-- The below single entry WILL BE FOUND (the text source is being entered directly)
INSERT INTO TestTable (pk, varbinarycol, documentExtension) VALUES (NEWID(), CONVERT(VARBINARY(MAX),'<META NAME="MS.LOCALE" CONTENT="EN">test entry 1'), '.html')
-- The bellow two entries below WILL NOT BE FOUND (the text source is taken from an NVARCHAR(MAX) value)
INSERT INTO TestTable (pk, varbinarycol, documentExtension) VALUES (NEWID(), CONVERT(VARBINARY(MAX), CAST('<META NAME="MS.LOCALE" CONTENT="EN">test entry 2' AS NVARCHAR(MAX))), '.html')
INSERT INTO TestTable (pk, varbinarycol, documentExtension) VALUES (NEWID(), CONVERT(VARBINARY(MAX), CAST('<META NAME="MS.LOCALE" CONTENT="EN">test entry 3' AS NVARCHAR(MAX))), '.html')
GO
-- Create the full text catalog
sp_fulltext_database 'enable'
GO
CREATE FULLTEXT CATALOG TEST AS DEFAULT
GO
CREATE FULLTEXT INDEX ON TestTable (varbinarycol TYPE COLUMN documentExtension LANGUAGE 1033)
KEY INDEX tablePK
GO
-- NOTE: You might need to give the catalog a chance to build before running the script below.
-- Now do a search that SHOULD RETURN 3 ROWS of data, but ONLY 1 ROW IS RETURNED
SELECT CAST(varbinarycol AS NVARCHAR(MAX)) FROM TestTable WHERE CONTAINS(varbinarycol, 'test')Varbinary is supported in FT see these links:
http://www.simple-talk.com/sql/learn-sql-server/sql-server-full-text-search-language-features,-part-2/ and http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/ftslesld.mspx|||Hi Satya,
I know varbinary is supported by FT, but if you run my example code you'll see it has problems when an nvarchar is being converted into varbinary. Those rows of data don't show us when you do a CONTAINS or FREETEXT search.
This is my problem as I need to take nvarchars in order allow full multi-lingual functionality support.
Regards,
Gavin.|||I've also tried adding the following extra meta tag - <META http-equiv="Content-Type" content="text/html; charset=utf-16">
But this seemed to break the Full Text Index even further. If applied to the example T-SQL I give above then even the non-nvarchar input is no longer returned:
INSERT INTO TestTable (pk, varbinarycol, documentExtension) VALUES (NEWID(), CONVERT(VARBINARY(MAX),'<META http-equiv="Content-Type" content="text/html; charset=utf-16"><META NAME="MS.LOCALE" CONTENT="EN">test entry 1'), '.html')
订阅:
博文评论 (Atom)
没有评论:
发表评论