2012年3月27日星期二

Full Text Search Setup

I am trying to setup FTS on a table in my database; the column I am
interested in FTS'ing is all image datatypes. The data in the colum is
populated via asp.net; and we store the document type as a mime type e.g.
'application/msword' as it makes it easy to get data back out from the
database.
My table has teh following columns
DocumentID UniqueIdentifier,
DocumentTypeRef int
DocumentSubject varchar(100)
DocumentValue image
DocumentDocType varchar(50)
DocumentDocType sysname -- added to use FTS would rather use vc(50) field
DocumentByteSize int
DocumentFileName varchar(100)
DocumentCreated datetime
MailMergeTemplateID uniqueidentifier
ToDoID uniqueidentifier
DocumentUploadWaiting bit
The table has 50 rows in it; in my test system. mostly word documents. I
have run teh following script to setup fts (and tried the wizard without
success)
use Activities_FTS
exec sp_fulltext_database 'enable'
exec sp_fulltext_catalog 'FTS', 'create', 'D:\\sqldata'
exec sp_fulltext_table 'tblDocuments', 'create', 'FTS', 'PK_tblDocuments'
exec sp_fulltext_column @.tabname = tblDocuments,
@.colname = DocumentValue,
@.action = 'dro',
@.Type_Colname = DocumentDocType2
exec sp_fulltext_table 'tblDocuments', 'activate'
exec sp_fulltext_table 'tblDocuments', 'start_full'
select FULLTEXTCATALOGPROPERTY('FTS', 'Populatestatus')
select * from freetexttable(tblDocuments,DocumentValue , '%%') order by
[rank] desc
This is based around the steps which worked successfully from the tutorial
on this page
http://msdn.microsoft.com/library/de...extsearch.asp.
When i run the index population check it does run for about 30 seconds; but i
cannot find out why it is not indexing the documents; or is there another
query i can use to test this. (My FTS catalog never grows above 1mb)
If you have any questions please do not hestitate to reply as i am slightly
lost now
Many thanks
Chris
Can you check what your gatherer logs report? I'm not sure if storing them
as a mime type is the best way to go.
To check your gatherer log output do this:
go to c:\program files\common files\system\mssearch\bin and copy gthrlog.vbs
to %windir% then go to your catalog location for your catalog (this should
be c:\program files\microsoft sql server\mssql\ftdata\sqlserver\gatherlogs.
Now you want to id your gatherlog for your catalog. the totally brainless
what to do this is to issue another incremental population for your catalog
and then do a dir /od. Your catalog
gather logs will then appear last in the list. The syntax is
SQLXXXXXYYYYY.?.gthr where the X is your db_id (you can determine this by
doing
this select db_id('database_name') and the YYYYYY is your catalog id, which
you can determine by doing to your database in isqlw and doing this select
id, name from syscatalogs where name ='catalog_name'.
Then with this information what you do is this (from a command prompt)
cscript gthrlog.vbs SQL0017900635.6.gthr
You will get a lot of output, but it will tell you on which row(s) the
indexer had a problem. Then what you have to do is have a look at that row
and see if you can figure out what the problem is with the data in that row
"Chris Hoare" <ChrisHoare@.discussions.microsoft.com> wrote in message
news:AB36FF24-EE13-4C7A-B8E5-E110588526CC@.microsoft.com...
>I am trying to setup FTS on a table in my database; the column I am
> interested in FTS'ing is all image datatypes. The data in the colum is
> populated via asp.net; and we store the document type as a mime type e.g.
> 'application/msword' as it makes it easy to get data back out from the
> database.
> My table has teh following columns
> DocumentID UniqueIdentifier,
> DocumentTypeRef int
> DocumentSubject varchar(100)
> DocumentValue image
> DocumentDocType varchar(50)
> DocumentDocType sysname -- added to use FTS would rather use vc(50) field
> DocumentByteSize int
> DocumentFileName varchar(100)
> DocumentCreated datetime
> MailMergeTemplateID uniqueidentifier
> ToDoID uniqueidentifier
> DocumentUploadWaiting bit
> The table has 50 rows in it; in my test system. mostly word documents. I
> have run teh following script to setup fts (and tried the wizard without
> success)
> use Activities_FTS
> exec sp_fulltext_database 'enable'
> exec sp_fulltext_catalog 'FTS', 'create', 'D:\\sqldata'
> exec sp_fulltext_table 'tblDocuments', 'create', 'FTS', 'PK_tblDocuments'
> exec sp_fulltext_column @.tabname = tblDocuments,
> @.colname = DocumentValue,
> @.action = 'dro',
> @.Type_Colname = DocumentDocType2
> exec sp_fulltext_table 'tblDocuments', 'activate'
> exec sp_fulltext_table 'tblDocuments', 'start_full'
> --
> select FULLTEXTCATALOGPROPERTY('FTS', 'Populatestatus')
> select * from freetexttable(tblDocuments,DocumentValue , '%%') order by
> [rank] desc
>
> This is based around the steps which worked successfully from the tutorial
> on this page
> http://msdn.microsoft.com/library/de...extsearch.asp.
> When i run the index population check it does run for about 30 seconds;
> but i
> cannot find out why it is not indexing the documents; or is there another
> query i can use to test this. (My FTS catalog never grows above 1mb)
> If you have any questions please do not hestitate to reply as i am
> slightly
> lost now
> Many thanks
> Chris
|||Chris,
I'm assuming that you're using SQL Server 2000 as this feature is new with
SQL Server 2000, but what OS platform is it installed on? Could you post the
full output of -- SELECT @.@.version -- as this is very helpful in
troubleshooting SQL FTS issues!
You should use the datatype of sysname (or char(3) or varchar(4)) with your
column "DocumentDocType". Could you also provide details on what values you
have populated in this column? Furthermore, how exactly did you import the
MS Word documents and what is the language of the text stored in the MS Word
documents?
Additionally, the following FREETEXTTABLE query will not return any results
as the "%" (percent) symbols are ignored by the MSSearch engine (depending
upon the OS platform):
select * from freetexttable(tblDocuments,DocumentValue , '%%') order by
[rank] desc
Instead you should use the following:
declare @.searchTerm varchar(1024)
set @.searchTerm = 'some_valid_search_word_here'
select * from freetexttable(tblDocuments, DocumentValue, @.searchTerm) order
by [rank] desc
Finally, you should review your server's Application event log for any
"Microsoft Search" or MssCi source events (warnings, informational and
errors) to determine why the initial FT Indexing is failing as this is the
only place such errors or warnings are written. Have you or anyone else
changed the SQL Server (MSSQLServer) service account &/or password via
Win2K's Component Services vs. changing this in the Enterprise Manager? If
so, then you should also review KB article 277549 (Q277549) PRB: Unable to
Build Full-Text Catalog After You Modify MSSQLServer Logon Account Through
[NT4.0) Control Panel [or Win2K Component Services] at:
http://support.microsoft.com/default...B;EN-US;277549
Regards,
John
"Chris Hoare" <ChrisHoare@.discussions.microsoft.com> wrote in message
news:AB36FF24-EE13-4C7A-B8E5-E110588526CC@.microsoft.com...
> I am trying to setup FTS on a table in my database; the column I am
> interested in FTS'ing is all image datatypes. The data in the colum is
> populated via asp.net; and we store the document type as a mime type e.g.
> 'application/msword' as it makes it easy to get data back out from the
> database.
> My table has teh following columns
> DocumentID UniqueIdentifier,
> DocumentTypeRef int
> DocumentSubject varchar(100)
> DocumentValue image
> DocumentDocType varchar(50)
> DocumentDocType sysname -- added to use FTS would rather use vc(50) field
> DocumentByteSize int
> DocumentFileName varchar(100)
> DocumentCreated datetime
> MailMergeTemplateID uniqueidentifier
> ToDoID uniqueidentifier
> DocumentUploadWaiting bit
> The table has 50 rows in it; in my test system. mostly word documents. I
> have run teh following script to setup fts (and tried the wizard without
> success)
> use Activities_FTS
> exec sp_fulltext_database 'enable'
> exec sp_fulltext_catalog 'FTS', 'create', 'D:\\sqldata'
> exec sp_fulltext_table 'tblDocuments', 'create', 'FTS', 'PK_tblDocuments'
> exec sp_fulltext_column @.tabname = tblDocuments,
> @.colname = DocumentValue,
> @.action = 'dro',
> @.Type_Colname = DocumentDocType2
> exec sp_fulltext_table 'tblDocuments', 'activate'
> exec sp_fulltext_table 'tblDocuments', 'start_full'
> --
> select FULLTEXTCATALOGPROPERTY('FTS', 'Populatestatus')
> select * from freetexttable(tblDocuments,DocumentValue , '%%') order by
> [rank] desc
>
> This is based around the steps which worked successfully from the tutorial
> on this page
>
http://msdn.microsoft.com/library/de...extsearch.asp.
> When i run the index population check it does run for about 30 seconds;
but i
> cannot find out why it is not indexing the documents; or is there another
> query i can use to test this. (My FTS catalog never grows above 1mb)
> If you have any questions please do not hestitate to reply as i am
slightly
> lost now
> Many thanks
> Chris
|||OK,
SQL Server details
Microsoft SQL Server 2000 - 8.00.760 (Intel X86)
Dec 17 2002 14:22:05
Copyright (c) 1988-2003 Microsoft Corporation
Enterprise Edition on Windows NT 5.2 (Build 3790: )
The documents were all imported using request.file object in asp.net; and
were streamed into a stored procuedre. ASP.net can open these files again
(which is why we are storing the mime type.)
The column I referenced as DocumentDocType2 is typed as 'sysname.' this is
the column that i am referencing as Documenttype for sp_fulltext_column
I have changed the contents of the documentdoctype2 column to be both
'word' & 'msword' to be safe and run the recatalog command (Originally these
were mime types 'application/msword' or similar. In both cases this was the
result;
The application log shows the following result : Event ID 2001
"One or more documents stored in image columns with extension 'word' did not
get full-text indexed because loading the filter failed with error '0x1'.
Note: These documents will not be passed to MSSearch for indexing, and
therefore this failure will not be reflected in the end of crawl summary
statistics."
Though strangely it went on to say
"The end of crawl for project <SQLServer SQL0003400005> has been detected.
The Gatherer successfully processed 50 documents totaling 0K. It failed to
filter 0 documents. 0 URLs could not be reached or were denied access."
Finally there is an event for Master merge has completed on the index.
Thanks for your help
|||You're welcome, Chris,
Thanks for the version info as that does help, especially the fact that you
have SQL Server 2000 SP3 on Win2003! You may want to review all of the below
KB articles, and especially 326502 (Q326502) as it provides coding example
on how to use ASP.NET to load MS Word files via:
Response.AddHeader("Content-Disposition", "attachment;filename=blob.doc")
Response.ContentType = "application/msword"
The most likely cause of the Full Population failing is that only valid MS
Word (doc or .doc) files are supported when properly imported into a SQL
Server 2000 table's column defined with the IMAGE (or BLOB) datatype.
Additionally, by populating the column documentdoctype2 column with both
'word' & 'msword' will not succeed as the MSSearch service does not
recognize these as valid file extensions. See SQL Server 2000 BOL title
"Filtering Supported File Types" - specifically, "Microsoft SQL ServerT
2000 includes filters for these file extensions: .doc, .xls, .ppt, .txt, and
..htm".
I grant you that the second informational MSSearch message text was somewhat
mis-leading, but the key error "One or more documents stored in image
columns with extension 'word' did not get full-text indexed because loading
the filter failed with error '0x1'" indicates the FT Population failure
because the wrong value was placed in the documentdoctype2 column.
258038 (Q258038) HOWTO: Access and Modify SQL Server BLOB Data by Using the
ADO Stream Object
http://support.microsoft.com/?kbid=258038
309158 (Q309158) HOW TO: Read and Write BLOB Data by Using ADO.NET with C#
http://support.microsoft.com/default...b;EN-US;309158
308042 (Q308042) HOW TO: Read and Write BLOB Data by Using ADO.NET with
VB.NET
http://support.microsoft.com/default...b;EN-US;308042
326502 (Q326502) HOW TO: Read and Write BLOB Data by Using ADO.NET Through
ASP.NET
http://support.microsoft.com/?id=326502
Regards,
John
"Chris Hoare" <ChrisHoare@.discussions.microsoft.com> wrote in message
news:066C2D40-1BF5-4B26-821E-69F71F39844C@.microsoft.com...
> OK,
> SQL Server details
> Microsoft SQL Server 2000 - 8.00.760 (Intel X86)
> Dec 17 2002 14:22:05
> Copyright (c) 1988-2003 Microsoft Corporation
> Enterprise Edition on Windows NT 5.2 (Build 3790: )
> The documents were all imported using request.file object in asp.net; and
> were streamed into a stored procuedre. ASP.net can open these files again
> (which is why we are storing the mime type.)
> The column I referenced as DocumentDocType2 is typed as 'sysname.' this is
> the column that i am referencing as Documenttype for sp_fulltext_column
> I have changed the contents of the documentdoctype2 column to be both
> 'word' & 'msword' to be safe and run the recatalog command (Originally
these
> were mime types 'application/msword' or similar. In both cases this was
the
> result;
> The application log shows the following result : Event ID 2001
> "One or more documents stored in image columns with extension 'word' did
not
> get full-text indexed because loading the filter failed with error '0x1'.
> Note: These documents will not be passed to MSSearch for indexing, and
> therefore this failure will not be reflected in the end of crawl summary
> statistics."
> Though strangely it went on to say
> "The end of crawl for project <SQLServer SQL0003400005> has been detected.
> The Gatherer successfully processed 50 documents totaling 0K. It failed to
> filter 0 documents. 0 URLs could not be reached or were denied access."
> Finally there is an event for Master merge has completed on the index.
> Thanks for your help
sql

没有评论:

发表评论