2012年3月21日星期三

Full Text Query Problem

Hi
We have a full text index on tblPages, the two fields that we wish to search
for 'server' are Symtoms and Whilst.
The problem is that we were running the join in the WHERE clause, but we
were not getting distinct results where 'server' was in both symtoms and
whilst.
So we have changed the join to an inner join, the only problem is that with
the current setup it is joining tblPages > A > B where as we want:
tblPages > A
> B
Any advice would be much appreciated.
Thanks
B
SELECT distinct(pageref), tblPages.Location, tblPages.Symptoms,
tblPages.Whilst, A.RANK, B.RANK
from tblPages AS tblPages
INNER JOIN containstable(tblPages, Symptoms, 'server') as A on
A.[KEY] = tblPages.PageRef
INNER JOIN containstable(tblPages, Whilst, 'server') as B ON B.[KEY]
= tblPages.PageRef"Ben" <Ben@.NoSpam.com> wrote in message
news:Ou7X2thUFHA.2420@.TK2MSFTNGP12.phx.gbl...
> Hi
> We have a full text index on tblPages, the two fields that we wish to
> search for 'server' are Symtoms and Whilst.
> The problem is that we were running the join in the WHERE clause, but we
> were not getting distinct results where 'server' was in both symtoms and
> whilst.
> So we have changed the join to an inner join, the only problem is that
> with the current setup it is joining tblPages > A > B where as we want:
> tblPages > A
> Any advice would be much appreciated.
> Thanks
> B
> SELECT distinct(pageref), tblPages.Location, tblPages.Symptoms,
> tblPages.Whilst, A.RANK, B.RANK
> from tblPages AS tblPages
> INNER JOIN containstable(tblPages, Symptoms, 'server') as A on
> A.[KEY] = tblPages.PageRef
> INNER JOIN containstable(tblPages, Whilst, 'server') as B ON
> B.[KEY] = tblPages.PageRef
DISTINCT works across all columns, you can't specify it to only be for one
column, so those parentheses around pageref actually do nothing and are
ignored by the parser. If you have non-unique pageref values in the results
then you need to use a GROUP BY clause on pageref and summation functions on
each row (eg. min, max, avg, etc) to tell SQL Server what you want returned
for each unique pageref value. However, if pageref is the primary key then
you will not need to do this using the following.
Anyway, onto the join. By specifying inner for both A and B you are limiting
the results to only those rows that have the term in both Whilst and
Symptoms. From the wording of your post you actually want rows that are in
either, for this you can use left outer joins.
SELECT distinct(pageref), tblPages.Location, tblPages.Symptoms,
tblPages.Whilst, A.RANK, B.RANK
from tblPages AS tblPages
LEFT OUTER JOIN containstable(tblPages, Symptoms, 'server') as A ON
tblPages.PageRef = A.[KEY]
LEFT OUTER JOIN containstable(tblPages, Whilst, 'server') as B ON
tblPages.PageRef = B.[KEY]
where A.[KEY] is not null OR B.[KEY] is not null
This will return all rows from tblPages where there is a match in either A
or B, or both.
You can also use EXISTS to give the same results, although I'm not sure on
the performance difference. I've read that EXISTS might be quicker on tables
with a large number of rows as rather than joining all rows to determine
which should be returned an index lookup can be performed to determine if a
row matches. I've run some tests on my own data and so far I've found left
joins and exists to be similar in performance, but on your data you may well
get different results.
Dan|||Thanks Dan
Thats ideal
B
"Daniel Crichton" <msnews@.worldofspack.co.uk> wrote in message
news:%23wzMMIiUFHA.1044@.TK2MSFTNGP10.phx.gbl...
> "Ben" <Ben@.NoSpam.com> wrote in message
> news:Ou7X2thUFHA.2420@.TK2MSFTNGP12.phx.gbl...
> DISTINCT works across all columns, you can't specify it to only be for one
> column, so those parentheses around pageref actually do nothing and are
> ignored by the parser. If you have non-unique pageref values in the
> results then you need to use a GROUP BY clause on pageref and summation
> functions on each row (eg. min, max, avg, etc) to tell SQL Server what you
> want returned for each unique pageref value. However, if pageref is the
> primary key then you will not need to do this using the following.
> Anyway, onto the join. By specifying inner for both A and B you are
> limiting the results to only those rows that have the term in both Whilst
> and Symptoms. From the wording of your post you actually want rows that
> are in either, for this you can use left outer joins.
> SELECT distinct(pageref), tblPages.Location, tblPages.Symptoms,
> tblPages.Whilst, A.RANK, B.RANK
> from tblPages AS tblPages
> LEFT OUTER JOIN containstable(tblPages, Symptoms, 'server') as A ON
> tblPages.PageRef = A.[KEY]
> LEFT OUTER JOIN containstable(tblPages, Whilst, 'server') as B ON
> tblPages.PageRef = B.[KEY]
> where A.[KEY] is not null OR B.[KEY] is not null
> This will return all rows from tblPages where there is a match in either A
> or B, or both.
> You can also use EXISTS to give the same results, although I'm not sure on
> the performance difference. I've read that EXISTS might be quicker on
> tables with a large number of rows as rather than joining all rows to
> determine which should be returned an index lookup can be performed to
> determine if a row matches. I've run some tests on my own data and so far
> I've found left joins and exists to be similar in performance, but on your
> data you may well get different results.
> Dan
>|||Hi Dan
Sorry, one more question, a little different this time.
The Containstable function works fine with single words, but problems occur
as I will allow users to add free text.
When there is more than one word then AND, OR, AND NOT is needed.
I can do some basic checking such as replacing double spaces and spaces with
AND (or OR), but if they enter a comma etc more problems occur.
Is there any solid method of doing this or should I use basic
checking/replace statments?
THanks
B
"Daniel Crichton" <msnews@.worldofspack.co.uk> wrote in message
news:%23wzMMIiUFHA.1044@.TK2MSFTNGP10.phx.gbl...
> "Ben" <Ben@.NoSpam.com> wrote in message
> news:Ou7X2thUFHA.2420@.TK2MSFTNGP12.phx.gbl...
> DISTINCT works across all columns, you can't specify it to only be for one
> column, so those parentheses around pageref actually do nothing and are
> ignored by the parser. If you have non-unique pageref values in the
> results then you need to use a GROUP BY clause on pageref and summation
> functions on each row (eg. min, max, avg, etc) to tell SQL Server what you
> want returned for each unique pageref value. However, if pageref is the
> primary key then you will not need to do this using the following.
> Anyway, onto the join. By specifying inner for both A and B you are
> limiting the results to only those rows that have the term in both Whilst
> and Symptoms. From the wording of your post you actually want rows that
> are in either, for this you can use left outer joins.
> SELECT distinct(pageref), tblPages.Location, tblPages.Symptoms,
> tblPages.Whilst, A.RANK, B.RANK
> from tblPages AS tblPages
> LEFT OUTER JOIN containstable(tblPages, Symptoms, 'server') as A ON
> tblPages.PageRef = A.[KEY]
> LEFT OUTER JOIN containstable(tblPages, Whilst, 'server') as B ON
> tblPages.PageRef = B.[KEY]
> where A.[KEY] is not null OR B.[KEY] is not null
> This will return all rows from tblPages where there is a match in either A
> or B, or both.
> You can also use EXISTS to give the same results, although I'm not sure on
> the performance difference. I've read that EXISTS might be quicker on
> tables with a large number of rows as rather than joining all rows to
> determine which should be returned an index lookup can be performed to
> determine if a row matches. I've run some tests on my own data and so far
> I've found left joins and exists to be similar in performance, but on your
> data you may well get different results.
> Dan
>|||"Ben" <Ben@.NoSpam.com> wrote in message
news:eyPByIjUFHA.2540@.tk2msftngp13.phx.gbl...
> Hi Dan
> Sorry, one more question, a little different this time.
> The Containstable function works fine with single words, but problems
> occur as I will allow users to add free text.
> When there is more than one word then AND, OR, AND NOT is needed.
> I can do some basic checking such as replacing double spaces and spaces
> with AND (or OR), but if they enter a comma etc more problems occur.
> Is there any solid method of doing this or should I use basic
> checking/replace statments?
There have been lots of posts in microsoft.public.sqlserver.fulltext with
solutions to this, many using regular expressions to quickly create clauses.
I myself use a lump of code I wrote about 10 years ago which deals with this
and parentheses and quoted phrases, but it's messy and I'd rather clean it
up before posting.
A search on google groups for fulltext parsing should pull up some useful
info, such as
http://groups.google.co.uk/group/mi...fulltext&hl=en
Dan
Dan|||Ben,
This is a somewhat common question in the fulltext newsgroup (I post replies
there often), and you may benefit from the following KB article: 246800
(Q246800) "INF: Correctly Parsing Quotation Marks in FTS Queries" at:
http://support.microsoft.com//defau...kb;EN-US;246800
You may want to alter the below Northwind FTS query to fit your environment,
as it handles FTS across (AND) columns:
use Northwind
go
SELECT e.LastName, e.FirstName, e.Title, e.Notes
from Employees AS e,
containstable(Employees, Notes, 'BA') as A,
containstable(Employees, Title, 'Sales') as B
where
A.[KEY] = e.EmployeeID and
B.[KEY] = e.EmployeeID
Note, that you can alter the AND between the KEY columns to an OR condition
and then use distinct in your select list.
Regards,
John
--
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/
"Ben" <Ben@.NoSpam.com> wrote in message
news:eyPByIjUFHA.2540@.tk2msftngp13.phx.gbl...
> Hi Dan
> Sorry, one more question, a little different this time.
> The Containstable function works fine with single words, but problems
occur
> as I will allow users to add free text.
> When there is more than one word then AND, OR, AND NOT is needed.
> I can do some basic checking such as replacing double spaces and spaces
with
> AND (or OR), but if they enter a comma etc more problems occur.
> Is there any solid method of doing this or should I use basic
> checking/replace statments?
> THanks
> B
> "Daniel Crichton" <msnews@.worldofspack.co.uk> wrote in message
> news:%23wzMMIiUFHA.1044@.TK2MSFTNGP10.phx.gbl...
we
and
one
you
Whilst
ON
A
on
far
your
>

没有评论:

发表评论