SQL Server 2005 Full Text Performance With Large Number Of Records
Dec 12, 2007
Hi
We are using the SQL Server 2005 Full Text Service. The data is not huge, but the kind of data is that each record is small and there are a large number of records. There are 35 million records now with 11 GB of data and about 1.6 GB of FT catalog on the table. This is expected to grow to at least 10 times the size of this data.
The issue is with FTS taking a long time to return results when the number of hits (rows) getting returned from FTS is large for some searches, it takes a very long time. With the same data & catalog, those full text queries for less common words return timely. The nature of the problem doesnt allow us to only have top results. We need all the results. So it’s not about the size of data but the number of results getting returned from FT. (As the catalog is inverted). The machine is dual processor with 4 GB RAM.
I am considering splitting the table and hence the catalog and using multiple servers to do full text searches in smaller catalogs. Is there any other way this issue can be solved ?
If splitting is the only way, can you give me an idea as to what is a statistical/standard limit to the number of search results/cataog size as which FTS gives good results
Hello everybody, I've got a little problem wich i'm trying to solve since 1-2 years and i hoped it would go away with SQL 2005 - but that wasn't the case :(.
Situation: I've just bought a new Server containing: SQL 2005 64 Bit Enviroment 4 GB RAM 2x AMD Opteron 2 GHz Prozeccors (Dual Core) 2x RAID Controllers (RAID 1) containing 1.1 System 1.2 Data 2.1 Transaction Logs
I've created a full-text table containing all the search terms i need to search. Table build: RecID - int - Primary Key SrcID - varchar(30) ArticleID - int - referring to an original table SearchField - varchar(150) - Containing the search terms timestamp - timestamp field
Fulltext index: RecID as Primary Key SearchField as indexed field - Wordbreaker: Neutral (containing several languages), Accent sensitivity off
Now i've got different tables imported in here resulting in a table size of ~ 13 million rows.
There is no problem with the performance on this catalog if i search a term wich isn't contained in more than 200-300 recordsets - but if i search for a term wich could occur in 200'000 upwards it gets extremely slow.
On the slow query the first records get in after no time, but until the query finished up to 60 seconds pass. The problem is that i have to sort by a ranking value wich is stored externally - so i need all results to sort them...
current (debugging) query: SELECT ArticleID FROM fullTextTable AS ft INNER JOIN CONTAINSTABLE(FullTextCatalog,SearchField,'"term*"') AS ftRes ON ftRes.[KEY]=ft.idEntry
Now if i check in the performance monitor: As soon as i run the query the 'Avg. Disk Read Queue Length' counter on disk D (SQL Data Files) jumps to the top, until the query has finished. Almost no read/write activity on C: where the Fulltext is stored...
If i rerun the query, after it finished once successfully - it takes place below 1-2 seconds, would be nice to get that result in first place :).
I've searched the forum and saw that this question has been asked several times. However, none of the responses point to any documentation about this limit. We've got different groups bickering over this ("There is no limit!" "Yes there is!" "No there isn't!") and it would help to have a link on the MSDN or a response from Microsoft to this question to put this to rest once and for all. Also, if there is a limit, is that per Sql Server instance or per physical server?
Secondly, we currently have over 400 catalogs. We've noticed it takes 4-5 minutes for new data to show up in search results. Does the number of catalogs affect performance, or is it the amount of data, irrespective of catalog count?
A New Monthly data is being loaded, checked and finally approved after 6 or 7 iteration before approval.Because of this iteration the monthly data set is being added then deleted then added then deleted few times.Because the table is big this process takes time, any thoughts on how to make the delete insert process faster.Keep in mind I cannot do much because it is a production table and is being access by other users to do other analysis.
Delete is done based on trx_date which is a year/month combo, like 201508.
The table has monthly sales by customer aggregated.
The table structure is:
CREATE TABLE [dbo].[Sales]( [batch_key] [int] NOT NULL, [Company_key] [int] NOT NULL, [customer_key] [char](22) NOT NULL, [Trx_Date] [int] NOT NULL, [account] [nvarchar](35) NOT NULL,
Hi gurus, I'm creating a web application where I will have a large number of tables (between 10k and 20k), this is done for the sake of scalability as tables will be moved to different database servers as the application grows and also for performance (smaller indexes). I'm worried though how having a large number of tables could affect the performance of SQL Server as the application will start on one single database server. I tried to find some resources on that on the internet but couldn't find any.
I would really appreciate if you can give me some advice and if you have any good links that would be great...
Hi gurus, I'm creating a web application where I will have a large number of tables (between 10k and 20k), this is done for the sake of scalability as tables will be moved to different database servers as the application grows and also for performance (smaller indexes). I'm worried though how having a large number of tables could affect the performance of SQL Server as the application will start on one single database server. I tried to find some resources on that on the internet but couldn't find any.
I would really appreciate if you can give me some advice and if you have any good links that would be great...
The DB I am working with has about 10 tables and some of the tables have 200,000 to 500,000 records. All tables have a clustered index on the primary key.
I performance during INSERT could be better I think - I add thousands of records at a time from many connections.
Is there a way to defer the update of indicies? So that, I can update the tables and then let the indicies regenerate ?
We are running SQL Server 2005 Ent Edition with SP2 on a Windows 2003 Ent. Server SP2 with Intel E6600 Dual core CPU and 4GB of RAM. We have an C# application which perform a large number of calculation that run in a loop. The application first load transactions that needs to be updated and then goes to each one of the rows, query another table get some values and update the transaction.
I have set a limit of 2GB of RAM for SQL server and when I run the application, it performs 5 records update (the process described above) per second. After roughly 10,000 records, the application slows down to about 1 record per second. I have tried to examine the activity monitor however I can't find anything that might indicate what's causing this.
I have read that there are some known issues with Hyper-Threaded CPUs however since my CPU is Dual-core, I do not know if the issue applies to those CPUs too and I have no one to disable one core in the bios.
The only thing that I have noticed is that if I change the Max Degree of Parallelism when the server slows down (I.e. From 0 to 1 and then back to 0), the server speeds up for another 10,000 records update and then slows down. Does anyone has an idea of what's causing it? What does the property change do that make the server speed up again?
If there is no solution for this problem, does anyone know if there is a stored procedure or anything else than can be used programmatically to speed up the server when it slows down? (This is not the optimal solution however I will use it as a workaround)
Hi I have created one store procedure which handles global updates I am using cursor to fetch one be one row for updating (It is required for implementing business logic)Now when i execute this store procedure ---it gives me dedlock error , I dont know why i m getting this error(Approx number of rows 1.5lakh)if then i removed unnecessary records from table (Approx -50000) it works fine,Is there any way to handle itI am calling this storeprocedure from my window service.please give me a good solution if possible
I am trying to enable database mirroring for 100 database. It goes error free till 59 databases (some times 60 databases) with the status (principal, synchronized) on principal. on the 60th or 61st database it gave the status (principal, disconnected). Also mirror starts acting abnormal. connection to mirror starts to give connection timeout and it is not enabling database mirroring on any more databases. I have SQL SERVER 2005 Enterprise with SP1 on the servers. witness is not included yet.
this are my test servers... i have more than 500 databases on my production servers.
principal and mirror both are using port 5022 for ENDPOINT communication.
I am trying to enable database mirroring for 100 database. It goes error free till 59 databases (some times 60 databases) with the status (principal, synchronized) on principal. on the 60th or 61st database it gave the status (principal, disconnected). Also mirror starts acting abnormal. connection to mirror starts to give connection timeout and it is not enabling database mirroring on any more databases. I have SQL SERVER 2005 Enterprise with SP1 on the servers. witness is not included yet.
these are my test servers... i have more than 500 databases on my production servers.
principal and mirror both are using port 5022 for ENDPOINT communication.
All of the databases are critical and all must be included in the Database Mirroring. so, after that I tried to implement database mirroring again...... System has 3 GB of RAM, SQL SERVER (Mirror) using 85 MB of RAM but still giving this error while trying to enable database mirroring for 37th Database.....
"There is insufficient system Memory to run this query"
in object explorer ,do right-click on database and is selecting preoperties and is selecting "files" page "use full-text indexing" ckeck box is disable. how can enabled this check box? thanks , mohsen
I am putting this question here but I am not limiting it to sql server 2005 express edition.
I am developing an app on a local machine (winxp with sql server 2000 personal edition) however I came to find out that full-text does not work in this setup unless I use a server type machine.
This fouls up my development somewhat and I would like to know if there is a) a work around for my sql server 200 setup b) does full-text serach work in sql server 2005 express edition which I have installed on my PC ?
I am about to heavily index a table and have to include atleast 3 to 4 olumns in the fulltext index for this table. The table is updated very frequently and the also the columns that are involve in the fulltext indexing undergo frequent updates. As of now, I can't avoid using full text indexing as these columns are very very lengthy and basically contail text. The users of the database will give some key words as the search criteria to get infomation as to what they are looking for. How frequently should I update my full text catalog. This is a scenario where the full text is operating on various tables and each of thses table might be containingaround 300,000 to 800,00 rows. I would appreciate an intelligent siggestion as I need it as soon as possible.
I have a table with 3M rows that contains a varchar(2000) field withvarious keywords. Here is the table structure:PKColumnImageIDFullTextColumnThere is an association table:ImageIDContractIDNow, I want to do a query where the ContractID = x and Contains someword in the FullTextColumn. There is an association table that mapsImages to Contracts - so I can't use the trick of putting the Contractcode in the FullTextColumn.I'm finding that first the FTS service is performing a search on theKeyword (which can take a long time if 100K rows are returned) thenjoining to the association table for the particular contract.Is there anyway to make this faster by telling the FTS service, onlysearch this subset of rows for the keyword based on the contract.Sorry if this sounds convoluted. Appreciate any help you can suggest.Thanks!
I installed SQL Server 2005 express with advanced services which is supposed to include full-text search capability but I can't get it to work. When I try to create a full-text catalog it gives me an error because it does not think the full-text service is installed or loaded. I can't seem to find a reference to the full-text search feature to enable or install it. any ideas?
the sql server documentation states that the use of wildcards is allowed by placing an '*' at the end of the search term. I can get this to work OK in the SQL Server 2005 query window, heres an example select ID, SUBSTRING(Title, 1, 100) AS Title, Implemented, Published from Table1 where contains(title,'"Therap*"') ORDER BY Title
this works OK and returns a list ot titles with the word Therapy in the title Im trying to implelemnt this functionalty in a web app with C#. The string is passed to a stored procedure. How on earth do I pass in the quotes ?? Ive tried building the string as normal then adding single quotes on the end, so I get something like retval = txt + "*"; //txt contains the partial word im searching for, then add the wildcard then retval = "'" + retval + "'"; // add the single quotes and pass txt as a string parameter to my stored procedure. It doesnt work. Can anyone tell me what im doing wrong ?? the same query works fine in the SQL query window.
I just installed SQL Server 2005 and need to create a full-text indexing. I looked up how to do it, but the full-text indexing option is ghosted so i don't even have the option to enable it...any ideas? I tried searching for hours with no luck.
Hi! I am moving this from the Transact SQL forum as I am not getting a reply. Hope it's OK! Hello: I have successfully enabled SQL Server 2005 Full Text (MSFTESQL) on my database, created the FT Catalog in Storage, and defined a FT Index for a table ( 1 table for testing).
I have also created an expansion and replacement entry in the Thesaurus (tsENU.xml) and removed the comments. I have restarted the server and the FTS Service, In addition, (per your comments to another user) I have used sqlcmd to confirm my default language of 1033.
I am unable to get any results from the Thesaurus file. I am able to run queries against the index using CONTAINS and FREETEXT. I get results - however I only get the same thing I would get from the CONTAINS portion of the query. Also ..SQLSERVER2995MSFTEUser$.. has permission on the FTData folder.
// This query //
SELECT name, description FROM table WHERE CONTAINS(description, ' FORMSOF(THESAURUS, blender) ')
// on this Thesaurus located in this C: folder MSSQL.1MSSQLFTData sENU.xml //
A search for blender returns all strings with the exact word blender in them - not even any plural forms. A search for blend or grind returns nothing. FREETEXT does a little better returning blender, blenders, blending. Still no grind,chop, or food process.
Hello, I have read on the multiple places that filter for full text search of PDF files using FTS2005 is included in the Reader 8 etc. However, I have not found any document or instruction etc on adobe documents, microsoft documents or web that details on how to actually configure the filter. Please help. thanks Kumud
1. Can SQL Server 20005 Express do full text searches?
2. If not, is there a way to use SQL Server 20005 Express to search a database column containing text data type?
Using Visual Basic 2005 Express, I would like to do a simple search with a search textbox and button that will return the entire contents of a field of database text when one or more words in the search text box are in the field of text in the database.
I have been playing in Visual Basic 2005 Express and using SQL queries (SELECT, FROM, WHERE) to output to DataGridView controls by using ID columns as filters in the query, etc. This I can do. But I have not been able to use a word or phrase in the search textbox as a filtered query to output the entire database field of text which contains the search word or phrase in the search textbox.
Thanks for any help in getting me started with this.
I am getting a really weird error message when executing a full-text query on SQL server 2005:
------------------------- Microsoft OLE DB Provider for SQL Server error '80040e14' The execution of a full-text query failed. "The form specified for the subject is not one supported or known by the specified trust provider." -------------------------
Just to give a bit of background: we recently moved our database from a machine with SQL Server 2003 to a different computer with SQL Server 2005. This is when the error started showing up.
The query is not particular complex:
SELECT * FROM myTable M INNER JOIN FREETEXTTABLE(myTable, *, 'keyword') ct ON ct.[KEY] = M.Resource_ID
It's the "Freetexttable" bit that creates the error message. I have done some research on google, but I can't seem to find a solution.
Has anybody come across this error before? Any ideas on how I could fix it?
Using SQL Server 2005 Express (Advanced SP2) I have created a Full-Text Search application in VB for distribution on CD for single PCs. Works fine on my local machine during development.
Although the SQL Server 2005 Express edition can be distributed freely, it does not seem to support Full-Text searches in the distributed version. Is this true? Or am I missing something with my deployment?
If I need another version of Sql Server for distribution of a Full-Text Search app, how do I go about obtaining the proper DB and permission for distribution? The DB size is about 600 MB.
SQL Server 2000 is pretty well documented with the limit of 256 (see http://msdn2.microsoft.com/en-us/library/aa214780(SQL.80).aspx) but I can find no documentation anywhere that discusses the limit on SQL Server 2005.
Hello, I use the full-text search utility in SQL Server 2005 to find word in PDFs document. This is my 'Documents' table: id (PK), data (VarBinary(max)), extension (nvarchar(4)) My full-text catalog on 'data' column works fine because when I search 'Microsoft', my document containing this word is returned as result.
SELECT * FROM Documents WHERE freetext([data], 'Microsoft'); 1 | 0x255044.... | .pdf
But I need to know how many times 'Microsoft' word appears in this document. Do you have any idea how can I retrieve this information? Thanks in advance!
I am installing a FTS system on an existing system (that used LIKE % queries!! hahaha)
Anyway, it is working pretty well (AND FAST!) but when I type in a common word like "damage" I get like 32,000 records. Now, the server handles those records in about one second but the ASP page that returns the results takes about one MINUTE to download. When I save the source, it is almost 12 MEGS!!
So, basically, I am streaming 12 megs across the pipe and I want to reduce that.
I would like the system to detect over maybe 500 records and cancel the search.
I have put a "TOP 500" into the search and that actually works pretty well but is there a better/smarter method?
I have this simple full text search query that works perfectly on my own computer using sql server 2005 express, however, on the production server(shared hosting)when I added the first 50+ rows, the full text search works perfect, but as the number of rows increases, the full text search can only see the first50+ rows, but not the new ones. Is there any quick solution for this or it's just a common mistake for developers for not properly indexed columns?Is there a way to re-indexed all rows without loosing data on the live server? search query: SELECT TOP 50 *FROM li_BookmarksWHERE FREETEXT(Keywords,@Keywords)
is there a rational explanation for which after some select statements, the rank returned by the full-text search engine is 0, knowing that just after the repopulation the rank is displayed correctly?
in other words, time and usage messes up the ranking. why?
Hi, My full text search on 2 millions records is taking time to show the result. I have created full text catalog in RAM drive to make the retrival process faster. But still its taking more than 1 minute to get the matching pattern. I am using SQL server 2005. I have 2 columns (id,text) in my table..
This is my unique index script
USE [SAMPLE] GO CREATE UNIQUE NONCLUSTERED INDEX [ui_productid] ON [dbo].[Products] ( [id] ASC )WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]..
This is my primary key index script.. USE [SAMPLE] GO ALTER TABLE [dbo].[Products] ADD CONSTRAINT [PK_Products] PRIMARY KEY CLUSTERED ( [id] ASC )WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]..
This is my query..
SELECT D.[id], D.productname FROM dbo.Products AS D WHERE CONTAINS(productname, 'ford')
What should i do to show the result in 3-4 seconds.
I have upgraded a SBS 2003 MSDE database (instance MSSQL$SHAREPOINT) to SQL Server 2005 Express Advanced Edition. This worked without a problem even when I enabled all the options for the upgrade including Full-text search.
I now want to have Full-text search on this instance of SQL 2005 with database name of STS_EVEREST_1.
I first tried to use the T-SQL command of "CREATE FULLTEXT CATALOG BBVisionCatalog AS DEFAULT;" I now know that the original database was created under SQL 2000 and therefore I needed to use SQL 2000 commands. So I used the following script:
USE STS_EVEREST_1 EXEC sp_fulltext_database 'enable' EXEC sp_fulltext_catalog 'BBVisionCatalog', 'create';
It produced the following ERROR messages:
(1 row(s) affected) Msg 7609, Level 17, State 2, Procedure sp_fulltext_database, Line 46 Full-Text Search is not installed, or a full-text component cannot be loaded.
I checked to see if the Microsoft Search Service was running. It was running.
For anyone with a larger number of databases (500+): How many do you have in a single instance. If you are using multiple instance on a single server, how many dbs per instance. This is why I'm asking
We are experiencing 701 "out of system memory" and temporary (usually) system freezes when the error occurs. We have 32bit 2005 version 9.00.2153.00, 32GB of memory, AWE enabled, quad dual-core 3GHz hyperthreaded server. Nether the bPool or VAS show any pressure when the "out of system memory error" occurs. Since this error usually indicates a VAS problem we tried increasing VAS to 1GB w/the -g flag. It made no difference. PSS has been working on the case for 3 weeks. They dont seem to be finding any evidince of memory pressure either. When I last spole to the escalation engineer yesterday it seemed that they are going to recommend reducing the number of databases on the server. I asked for clarification as to whether we are hitting a 32 bit barrior, an instance limitation, or both. I am awaiting the answer. How many databases do you have on your server? We had between 1700 and 1900 (the number varies) at times when the error occured. We are now at 1500, and have not had the error in the 2 days since reducing the number of databases...