2012年3月7日星期三

Full Text - Index Population - HDD Access

Hi,
I'm indexing a single field on a table with 10 million rows in it, its
currently taking about 48 hours to build.
I've been looking into performance bottlenecks on the system and see that
this delay is being mainly caused by MSSearch writing massive amounts of
data to the FT Index location. Since the finished index is only about 400mb
I can only imagine that the population process is rewriting the data again
and again to the disk as it is updated.
Is this a fair assumption? If so is there anyway around it, maybe to tell
MSSearch to use ram until the population is build then write the index to
disk?
Thanks
Michael
No.
There is the resource usage setting which may increase performance during
the indexing process (sp_fulltext_service 'catalogname','resource_usgae',5)
The more resources you can spend to improve the indexing process the better,
normally you get the best bang for your buck by investing in the disk
subsystem. For instance placing your catalogs on a RAID 10, 1, or 01 array.
01 is not considered to be a good choice as it does not offer the the
protection 10 does.
Also your database should be raid 5 depending on the level of write activity
on it.
What type of content are you indexing. You will get best performance by
indexing text content as opposed to indexing documents in their native
format (doc, xls, pdf, etc).
Please review the full text white paper on SQL FTS performance at:
http://support.microsoft.com/support...er.asp?GSSNB=1
Save this document as a zip and extract it.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
"Mike Davies" <michael.davies@.synsoft.co.uk> wrote in message
news:ezLfihZmEHA.3824@.TK2MSFTNGP12.phx.gbl...
> Hi,
> I'm indexing a single field on a table with 10 million rows in it, its
> currently taking about 48 hours to build.
> I've been looking into performance bottlenecks on the system and see that
> this delay is being mainly caused by MSSearch writing massive amounts of
> data to the FT Index location. Since the finished index is only about
400mb
> I can only imagine that the population process is rewriting the data again
> and again to the disk as it is updated.
> Is this a fair assumption? If so is there anyway around it, maybe to tell
> MSSearch to use ram until the population is build then write the index to
> disk?
> Thanks
> Michael
>
|||Mike
What hardware is it running on?
Is the Full-Text Index on the same physical drive/lun as the database? It
should be possible to increase IO throughput by placing the Full-Text Index
on its own drive.
Are you using SCSI disks? If so are you using the faster 15k RPM disks as
apposed to 10k RPM disks.
How much RAM does the machine have?
How many processors and what spec are they?
Is SQL Server configured to leave some RAM free for the Operating System and
other Applications, i.e. if you have 2GB RAM is SQL configured to use max
1.5GB?
You might want to try SQL Server 2005 Beta 2. I understand that Full-Text
Search is much faster at indexing.
http://msdn.microsoft.com/library/en...asp?frame=true
Rich
"Hilary Cotter" wrote:

> No.
> There is the resource usage setting which may increase performance during
> the indexing process (sp_fulltext_service 'catalogname','resource_usgae',5)
> The more resources you can spend to improve the indexing process the better,
> normally you get the best bang for your buck by investing in the disk
> subsystem. For instance placing your catalogs on a RAID 10, 1, or 01 array.
> 01 is not considered to be a good choice as it does not offer the the
> protection 10 does.
> Also your database should be raid 5 depending on the level of write activity
> on it.
> What type of content are you indexing. You will get best performance by
> indexing text content as opposed to indexing documents in their native
> format (doc, xls, pdf, etc).
> Please review the full text white paper on SQL FTS performance at:
> http://support.microsoft.com/support...er.asp?GSSNB=1
> Save this document as a zip and extract it.
> --
> Hilary Cotter
> Looking for a SQL Server replication book?
> http://www.nwsu.com/0974973602.html
>
> "Mike Davies" <michael.davies@.synsoft.co.uk> wrote in message
> news:ezLfihZmEHA.3824@.TK2MSFTNGP12.phx.gbl...
> 400mb
>
>
|||Mike,
Yes, it is a fair assumption. You should review the SQL Server 2000 BOL
title "Full-text Search Recommendations" as well as the FTS Deployment white
paper. Are there workarounds? Yes. You can ensure that the disk drive that
your FT Catalog resides is separate from your database files as well as very
fast (>10,000 RPMs) and configured as RAID0 or RAID10. The Full Text
Indexing process is primarily I/O bound, but can also consume much memory
and CPU usage from time-to-time during Shadow Merges and at the end of the
process due to Master Merge.
Regards,
John
"Mike Davies" <michael.davies@.synsoft.co.uk> wrote in message
news:ezLfihZmEHA.3824@.TK2MSFTNGP12.phx.gbl...
> Hi,
> I'm indexing a single field on a table with 10 million rows in it, its
> currently taking about 48 hours to build.
> I've been looking into performance bottlenecks on the system and see that
> this delay is being mainly caused by MSSearch writing massive amounts of
> data to the FT Index location. Since the finished index is only about
400mb
> I can only imagine that the population process is rewriting the data again
> and again to the disk as it is updated.
> Is this a fair assumption? If so is there anyway around it, maybe to tell
> MSSearch to use ram until the population is build then write the index to
> disk?
> Thanks
> Michael
>
|||Hi Rich,
It's running on a single P4 Prescott at 3Ghz with 4GB of RAM. SQL Server
currently has access to all system RAM but is sat at approx 1.7Gb during the
indexing process.
Both the FullText index and the SQL database are on separate physical SATA
drives, however because of an apparent incompatibility issue with Windows
and running the two SATA drives on their own bus they are both running
through the standard IDE bus, I imagine this is not doing the performance
any favours.
Thanks for the 2005 tip, I'll give that a try.
Can you offer any other suggestions to the setup?
Thanks
Mike
"Richard Yeo" <RichardYeo@.discussions.microsoft.com> wrote in message
news:692389A4-ED05-42C7-B645-982E7E9CE989@.microsoft.com...
> Mike
> What hardware is it running on?
> Is the Full-Text Index on the same physical drive/lun as the database? It
> should be possible to increase IO throughput by placing the Full-Text
Index
> on its own drive.
> Are you using SCSI disks? If so are you using the faster 15k RPM disks as
> apposed to 10k RPM disks.
> How much RAM does the machine have?
> How many processors and what spec are they?
> Is SQL Server configured to leave some RAM free for the Operating System
and
> other Applications, i.e. if you have 2GB RAM is SQL configured to use max
> 1.5GB?
> You might want to try SQL Server 2005 Beta 2. I understand that Full-Text
> Search is much faster at indexing.
>
http://msdn.microsoft.com/library/en...asp?frame=true[vbcol=seagreen]
> Rich
>
> "Hilary Cotter" wrote:
during[vbcol=seagreen]
'catalogname','resource_usgae',5)[vbcol=seagreen]
better,[vbcol=seagreen]
array.[vbcol=seagreen]
activity[vbcol=seagreen]
http://support.microsoft.com/support...er.asp?GSSNB=1[vbcol=seagreen]
that[vbcol=seagreen]
of[vbcol=seagreen]
again[vbcol=seagreen]
tell[vbcol=seagreen]
to[vbcol=seagreen]
|||Mike
Have you implemented the /3GB switch in the OS boot.ini? If not by default
2GB is set aside for apps and 2GB for OS. By using this switch 3GB is set
aside for apps and 1GB for OS. This probably explains why SQL Server is only
using 1.7GB RAM. It would probably be using more if you had the /3GB switch
set. Try setting /3GB switch, instruct SQL Server to use Physical RAM, and
set max memory for SQL Server to be 2.5GB. This will leave 0.5GB for other
apps including MSSearch (I think), file system caching, etc.
Even if you don't have the budget for a 2 proc SQL license I would always
recommend purchasing a second physical processor and configuring SQL Server
to only use one of the processors. This leaves the second processor free for
other applications, OS, remote administration.
You might want to try the manufacturers web-site regarding BIOS / firmware
updates for your SATA / IDE issue.
Rich
"Mike Davies" wrote:

> Hi Rich,
> It's running on a single P4 Prescott at 3Ghz with 4GB of RAM. SQL Server
> currently has access to all system RAM but is sat at approx 1.7Gb during the
> indexing process.
> Both the FullText index and the SQL database are on separate physical SATA
> drives, however because of an apparent incompatibility issue with Windows
> and running the two SATA drives on their own bus they are both running
> through the standard IDE bus, I imagine this is not doing the performance
> any favours.
> Thanks for the 2005 tip, I'll give that a try.
> Can you offer any other suggestions to the setup?
> Thanks
> Mike
>
> "Richard Yeo" <RichardYeo@.discussions.microsoft.com> wrote in message
> news:692389A4-ED05-42C7-B645-982E7E9CE989@.microsoft.com...
> Index
> and
> http://msdn.microsoft.com/library/en...asp?frame=true
> during
> 'catalogname','resource_usgae',5)
> better,
> array.
> activity
> http://support.microsoft.com/support...er.asp?GSSNB=1
> that
> of
> again
> tell
> to
>
>
|||Mike,
Rich, while the /3GB switch can help SQL Server 2000, it has no affect on
the "Microsoft Search" (mssearch.exe) FT Search engine as it can only use up
to a max of 512MB of physical RAM. However, you must set the resource_usage
value to 5 (dedicated) via sp_fulltext_service 'resource_usage', 5 and you
must have at least 512Mb of RAM reserved for the MSSearch service that is
NOT used by the OS or by SQL Server.
If you can purchase an additional CPU, you can set the MSSearch service's to
use one CPU and then use sp_configure to set the affinity of SQL Server to
that cpu to avoid cpu contention during heavy MSSearch CPU usage via the
following method:
You can launch the Tskmgr.exe from the AT command on the multi-proc server
where sql server resides, as follows:
at <current_time+1min> /interactive taskmgr.exe
when it launches, you can then set "cpu affinity" for the MSSearch service
to a cpu or set of cpu's not being used by SQL Server. Then you would use
sp_configure to set SQL Server's cpu affinity to the other cpu's, thus
preventing the cpu usage of mssearch from affecting your sql server
processing. Note, this only works on multi-processor servers.
Regards,
John
"Richard Yeo" <RichardYeo@.discussions.microsoft.com> wrote in message
news:2AEB4663-795F-48F6-BA68-DEAA0D1C4BD8@.microsoft.com...
> Mike
> Have you implemented the /3GB switch in the OS boot.ini? If not by default
> 2GB is set aside for apps and 2GB for OS. By using this switch 3GB is set
> aside for apps and 1GB for OS. This probably explains why SQL Server is
only
> using 1.7GB RAM. It would probably be using more if you had the /3GB
switch
> set. Try setting /3GB switch, instruct SQL Server to use Physical RAM, and
> set max memory for SQL Server to be 2.5GB. This will leave 0.5GB for other
> apps including MSSearch (I think), file system caching, etc.
> Even if you don't have the budget for a 2 proc SQL license I would always
> recommend purchasing a second physical processor and configuring SQL
Server
> to only use one of the processors. This leaves the second processor free
for[vbcol=seagreen]
> other applications, OS, remote administration.
> You might want to try the manufacturers web-site regarding BIOS / firmware
> updates for your SATA / IDE issue.
> Rich
>
> "Mike Davies" wrote:
Server[vbcol=seagreen]
the[vbcol=seagreen]
SATA[vbcol=seagreen]
Windows[vbcol=seagreen]
performance[vbcol=seagreen]
It[vbcol=seagreen]
as[vbcol=seagreen]
System[vbcol=seagreen]
max[vbcol=seagreen]
Full-Text[vbcol=seagreen]
http://msdn.microsoft.com/library/en...asp?frame=true[vbcol=seagreen]
disk[vbcol=seagreen]
the[vbcol=seagreen]
by[vbcol=seagreen]
native[vbcol=seagreen]
http://support.microsoft.com/support...er.asp?GSSNB=1[vbcol=seagreen]
its[vbcol=seagreen]
see[vbcol=seagreen]
amounts[vbcol=seagreen]
about[vbcol=seagreen]
data[vbcol=seagreen]
to[vbcol=seagreen]
index[vbcol=seagreen]
|||"Richard Yeo" <RichardYeo@.discussions.microsoft.com> wrote in message
news:2AEB4663-795F-48F6-BA68-DEAA0D1C4BD8@.microsoft.com...
> Mike
>
<SNIP>
> Even if you don't have the budget for a 2 proc SQL license I would always
> recommend purchasing a second physical processor and configuring SQL
> Server
> to only use one of the processors. This leaves the second processor free
> for
> other applications, OS, remote administration.
>
</SNIP>
Please be careful about that because, in general, you have to purchase a
processor license for every processor that the OS can see, whether that
processor is configured to run SQL Server or not. For more information,
please refer to the SQL Server 2000 Licensing FAQ web page at
http://www.microsoft.com/sql/howtobuy/faq.asp. The third question under
Licensing is reproduced below for your benefit.
Q. Do I have to license all of the processors in a server?
A. You only have to acquire licenses for processors that are
accessible to any copy of the operating system upon which SQL Server 2000 is
installed.
Sincerely,
Stephen Dybing
This posting is provided "AS IS" with no warranties, and confers no rights.
begin 666 ts.gif
J1TE&.#EA`0`!`( ``````/___R'Y! $`````+ `````!``$```(!1 `[
`
end
|||"Richard Yeo" <RichardYeo@.discussions.microsoft.com> wrote in message
news:2AEB4663-795F-48F6-BA68-DEAA0D1C4BD8@.microsoft.com...
> Mike
> Have you implemented the /3GB switch in the OS boot.ini? If not by default
> 2GB is set aside for apps and 2GB for OS. By using this switch 3GB is set
> aside for apps and 1GB for OS. This probably explains why SQL Server is
only
> using 1.7GB RAM. It would probably be using more if you had the /3GB
switch
> set. Try setting /3GB switch, instruct SQL Server to use Physical RAM, and
> set max memory for SQL Server to be 2.5GB. This will leave 0.5GB for other
> apps including MSSearch (I think), file system caching, etc.
> Even if you don't have the budget for a 2 proc SQL license I would always
> recommend purchasing a second physical processor and configuring SQL
Server
> to only use one of the processors. This leaves the second processor free
for
> other applications, OS, remote administration.
Unless thigns have changed, last I knew, MS didn't allow that.
|||Stephen
Thanks for clarifying that.
Obviously there is a difference between what the product is capable of and
what the licensing folk think the product is or should be capable of.
In this instance I feel the Product Team got it right. In the real world
people often have to run more than one application on a server and should not
be forced to buy additional licenses.
With the move in the industry to consolidate this could a lot of unnecessary
expense.
Lets hope this license issue is addressed in the next release of SQL Server
2005.
Rich
"Stephen Dybing [MSFT]" wrote:

> "Richard Yeo" <RichardYeo@.discussions.microsoft.com> wrote in message
> news:2AEB4663-795F-48F6-BA68-DEAA0D1C4BD8@.microsoft.com...
> <SNIP>
> </SNIP>
> Please be careful about that because, in general, you have to purchase a
> processor license for every processor that the OS can see, whether that
> processor is configured to run SQL Server or not. For more information,
> please refer to the SQL Server 2000 Licensing FAQ web page at
> http://www.microsoft.com/sql/howtobuy/faq.asp. The third question under
> Licensing is reproduced below for your benefit.
> Q. Do I have to license all of the processors in a server?
> A. You only have to acquire licenses for processors that are
> accessible to any copy of the operating system upon which SQL Server 2000 is
> installed.
>
>
> --
> Sincerely,
> Stephen Dybing
> This posting is provided "AS IS" with no warranties, and confers no rights

没有评论:

发表评论