SQL Server 2000: The Agent Is Suspect. No Response Within 10 Minutes
Apr 20, 2007
Hi,
I have two servers, one production server and one backup server which have transactional replication with a pull subscription.
When I configure replication, it works fine during our weekend tests under production load. After the tests, replication looks fine for a random number of days. Then, all of a sudden, an error message is displayed on one of the agents: "The agent is suspect, no response within 10 minutes." This has happened a number of times. If I remove replication and configure it again, it always works. Sometimes updating one of the tables is enough and the error message disappears. The last time (today) that did not work: the update did not replicate and the error message remained.
Has anyone experienced this same problem and found a good solution? One common factor is that the error message appears after long periods of inactivity on the servers, or perhaps after a restart, but I am not sure about that.
Question 1: How can I prevent this error message?
Question 2: Is there anything special to consider when I need to restart the servers while replication is configured, e.g. after installing updates from Windows Update?
I would be very grateful for any answers regarding this.
I have recently installed the SQL 2005 client tools with SQL Server Management Studio and am accessing databases on a SQL 2000 server. The response I am getting is extremely slow. Should I go back to the SQL 2000 client, or are there methods by which I can improve the performance?
Please assist, the issue with SQL 2005 Browser and SQL 2000 Server Service is understood.
Our problem is with networked 2000 instances and SQL Express. The SQL 2000 machines (Standard & MSDE) do not have SQL 2005 or Express installed, only 2000. When a SQL Express computer is put on the network with the SQL Browser service running, almost all SQL 2000 machines lose sight of the other SQL 2000 instances. The second the SQL Browser is turned off on the SQL Express box, the SQL 2000 machines can see each other's instances. It appears that the response from the SQL Express SQL Browser causes the SQL 2000 machine to stop listening for responses. Once in a while one of the SQL 2000 instances will show up while the SQL Browser is active on the network, and I believe that is because its response made it in before the SQL Browser response. Please help, as this does not appear to be a recognized issue. I'm assuming there aren't many sites running as many named instances on individual machines as we do.
Please note this appears to be a problem with SQL Express Browser not SQL 2005 Standard's which runs without problems on our network.
Environment: Sql Server 2000 Transaction Rep Distributor. Pub, Sub, and Distributor on separate machines.
The Distribution Agent gets: "Timeout expired (Source: ODBC SQL Server Driver (ODBC); Error number: S1T00)" and the Session Details of the Distribution Agent say "The process is running and is waiting for a response from one of the backend connections." Of course, in true MS fashion, the message does not specify which connection it is waiting on. It could be waiting on the Publisher or it could be waiting on the Subscriber. Does anyone know how to tell? It would be quite useful to know which "backend" is indicated!
We get this event like clockwork at exactly the same times early every morning, as well as at other random times during the day. I cannot see any reason for it. Do you have any useful advice?
One of my SQL databases went into suspect mode. Can anyone advise why this happened? Is it a problem with SQL Server? This has happened for the second time in 4 months.
We are consistently getting the error message below on our subscribers that have blob images. Is there a setting we can increase to keep SQL Server from throwing this error, or do you have another suggestion? Thanks in advance.
John
Error messages:
The replication agent has not logged a progress message in 10 minutes. This might indicate an unresponsive agent or high system activity. Verify that records are being replicated to the destination and that connections to the Subscriber, Publisher, and Distributor are still active.
Hi, I have more than 80 databases on my publisher (SQL Server 2000 SP4). I tried to enable transactional replication on all of those databases at once through some T-SQL programming and DTS packages. Everything works fine until the snapshot agents start to take snapshots of the publisher databases. As soon as the snapshot agents start for those 80 databases, they start giving the deadlock error. All 80 snapshot agents start at the same time.
Error Message: Transaction (Process ID xxx) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
Error Detail: Transaction (Process ID xxx) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction. (Source: Server_Distribution (Data source); Error number: 1205)
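Since the post says all 80 snapshot agents start at the same moment, one possible workaround (an assumption on my part, not something confirmed in the thread) is to spread their start times out. The Python sketch below only does the arithmetic: it produces one start time per job as an HHMMSS integer, the form SQL Server Agent job schedules use for the active start time. The function name, the 01:00:00 first start and the 90-second gap are all hypothetical.

```python
def staggered_start_times(job_count, first_start_seconds, gap_seconds):
    """Return one HHMMSS integer per job, each gap_seconds after the previous one."""
    times = []
    for i in range(job_count):
        # Seconds since midnight for job i, wrapping around at 24 hours.
        total = (first_start_seconds + i * gap_seconds) % 86400
        hours, rest = divmod(total, 3600)
        minutes, seconds = divmod(rest, 60)
        times.append(hours * 10000 + minutes * 100 + seconds)
    return times


# Hypothetical values: 80 snapshot jobs, the first at 01:00:00, one every 90 seconds.
schedule = staggered_start_times(80, 1 * 3600, 90)
print(schedule[0], schedule[1], schedule[-1])  # 10000 10130 25830
```

Each value could then be used as the start time of the corresponding snapshot agent's job schedule. Whether a 90-second gap is enough to avoid the deadlocks depends on how long each snapshot takes, which the post does not say.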
Hi, I want to put a database into suspect status. I stop the server first, rename C:\Program Files (x86)\Microsoft SQL Server\MSSQL.1\MSSQL\Data\msdbdata.mdf to msdbdata.sav, and then start the server. I use this command to check: SELECT status & 256 FROM master.dbo.sysdatabases WHERE name = database_name. If the result is 256, it means msdb is suspect, but the result is 0, the same as the normal status. Do you know how to make a database suspect this way, or do you know another way to make a database suspect? Ideally I could also bring my server back to normal with your method. Thanks
Guys, I've got an issue with a DB that turns itself suspect on just some normal queries. It's SQL 2000 8.00.2175 on Win2K3 SP1. The SQL log shows the following:
Any idea what is causing it and what can be done to fix it? We tried rebooting the instance with no luck, and we finally restored the DB from the last good backup. But we would like to find out the cause. Thanks
I have a Microsoft Cluster running on Server 2003 Enterprise. SQL 2000 8.00.2039 (SP4). 5 GB physical memory installed.
With the databases online we run a test failover from the Cluster Administrator. It takes about 30-40 seconds and completes without generating any server event log errors or SQL log errors. Everything looks good from an administrative standpoint.
However, when we test with running a series of queries to the databases, then failover, we notice that it can take up to 3 or 4 minutes before some of the databases will respond. Connections are not refused, they just sit there.
How can we troubleshoot this, or does anyone have similar experience with this scenario?
I have chosen transactional publication as my publication type. I can now successfully replicate the data from publisher to subscriber without any error.
But my question is: what is the snapshot agent used for? I went to Replication Monitor > Agents > Snapshot Agents > Start Agent, but it failed.
Was the way I started the snapshot agent incorrect? What is the effect if I drop the snapshot agent while replication and DBUpdate are running?
I've got a couple of jobs that have an ODBC connection to an AS400 machine. But when these jobs run, they never stop. I have to stop these jobs manually so that they can start again as scheduled the next day. The jobs did run all the packages successfully. Does anybody know how this is possible? It used to work fine, but for the last couple of weeks they just won't stop. I hope you can help me! :S
My database, testdb, ended up in Suspect state. The SQL log shows "I/O error 38(Reached the end of the file.) detected during read at offset xxxxxxxxxx in file '<path>\testdb_Data.MDF'" during recovery. I do not have a backup to restore the database from. So, to run DBCC CHECKDB, I tried to put the database in emergency (bypass recovery) mode using
update sysdatabases set status = 32768 where name = 'testdb'
DBCC CHECKDB showed some allocation and consistency errors and suggested "repair_allow_data_loss" as minimum repair level.
Now to run
DBCC CHECKDB('testdb', repair_allow_data_loss)
I have to put the database in SINGLE USER mode. For that I started SQL Server with the command
sqlservr.exe -c -m
Now when I try to run DBCC CHECKDB with repair option it says "Attempt to BEGIN TRANSACTION in database 'testdb' failed because database is in BYPASS RECOVERY mode."
So it seems I need to change the status of the database so that it will allow me to repair it. If I try to reset the status, the database goes into Suspect state again, and it seems DBCC commands don't run on a database in Suspect state. Does anybody know how to recover the database in this state? Is there any other way to repair it?
When I run sp_delete_job to delete a whole job, will the job steps, job schedules and job servers of that job also be deleted automatically?
So, do I need to run sp_delete_job only, or do I also need sp_delete_jobstep, sp_delete_jobschedule and sp_delete_jobserver?
While I was out of the office the LAN team moved one of my SQL Server 2000 servers to a new network domain. Since then the maintenance job has not run. The error log for the SQL Agent has the message listed in the subject line. I have not found any useful articles on the MS SQL Server site. Does anyone know what might be wrong and how to fix it? HTH -- Mark D Powell --
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[table_Data]') AND type in (N'U'))
DROP TABLE [dbo].[table_Data]
GO
/****** Object: Table [dbo].[table_Data] Script Date: 04/21/2015 22:07:49 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[table_Data]') AND type in (N'U'))
After a differential restore I start the Remedy service. It starts in a few seconds.
After a full restore the same service takes 15 minutes to start. Both are done through SQL Server Agent. Even manually restarting the service takes 15 minutes after a full restore. Why is it happening this way?
Hi, I would like to know if it's possible to use the POP3 protocol instead of a MAPI application to have SQL Server 2000 send email to an administrator when an abnormal event occurs, i.e., configure a POP profile to receive SQL Server 2000 errors.
We have replication set up on SQL Server 2000. We ran into an issue where the distribution agent goes down (distrib.exe stops running) when the network connection is broken. We would like to know:
Is it expected that the distribution agent goes down when communication fails between the distribution server and the subscriber server? If not, is there a way to programmatically control and restart the agent? Is there any stored procedure in SQL Server that can monitor replication communication error messages? Is there any stored procedure in SQL Server that can be run to restart the agent? As a best practice, what can we do to achieve an 'event-driven' kind of mechanism, so that when communication breaks, the agent is restarted by the triggered event (or at least a simple way to restart it automatically)?
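On the "simple way to automatically restart" part of the question, the general pattern is a retry loop that waits longer after each failed attempt. Here is a minimal Python sketch of that pattern only. `start_agent` is a hypothetical callable that returns True once the agent is running again; in practice it might wrap a call to msdb's sp_start_job for the distribution agent's job, but that wiring is an assumption and is not shown.

```python
import time


def restart_with_backoff(start_agent, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call start_agent() until it returns True, doubling the wait after each failure.

    Returns the attempt number that succeeded, or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if start_agent():
            return attempt
        if attempt < max_attempts:
            # Wait 1x, 2x, 4x ... base_delay between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
    return None


# Hypothetical stand-in for the real restart call: fails twice, then succeeds.
calls = []
delays = []

def fake_start_agent():
    calls.append(1)
    return len(calls) >= 3

result = restart_with_backoff(fake_start_agent, sleep=delays.append)
print(result, delays)  # 3 [1.0, 2.0]
```

The `sleep` parameter is injectable so the loop can be exercised without actually waiting. The attempt limit keeps a permanently broken link from retrying forever.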
I have a SQL Server 2000 instance running in production. Recently it has been freezing occasionally. At those times there is no response from SQL Server; even with Enterprise Manager, I cannot connect to the server.
So there is no way to fix it other than rebooting the server. After that, I checked the error log and got this info:
SQL Server terminating because of system shutdown. LogEvent: Failed to report the current event. Operating system error = 31(A device attached to the system is not functioning.).
After the reboot, I checked the error log again. There seems to be no special error except:
Attempting to initialize Distributed Transaction Coordinator. Failed to obtain TransactionDispenserInterface: Result Code = 0x8004d01b
What are the possible reasons for this issue? How can I diagnose and solve this problem?
When I try to connect to a SQL Server instance from Enterprise Manager, I get a connection timeout error. I have to change the timeout parameter from 4 (the default) to 30 for it to work. I have also noticed that some applications (like SharePoint) are having the same problem connecting to that server.
My question is:
Why is that happening?
It used to work fine, and the issue started a couple of days ago.
I have a problem using Service Broker. I send a message from server SSB1 (initiator) and I receive this message on server SSB2 (target), but I don't receive a response on SSB1...
On server SSB2, Profiler shows these messages:
- This message could not be delivered because it is a duplicate.
- Could not forward the message because forwarding is disabled in this SQL Server instance.
- The message could not be delivered because it could not be classified. Enable broker message classification trace to see the reason for the failure.
Message from SSB1 Profiler:
- This message was dropped because it could not be dispatched on time. State: 1
I am working with a client who, after every reboot of their SQL 2000 DB server, experiences slow response times for a couple of hours. The server has 12 GB RAM and a dual 3.8 processor. It is believed that the slow response occurs because queries run after the reboot are rebuilding information in memory, and once the memory is built up, performance goes back to normal, using the memory for speed. Is this an accurate assumption, or is there something else to look at after the server is rebooted?
When I run a package from a command window using dtexec, the job immediately says success. DTExec: The package execution returned DTSER_SUCCESS (0). Started: 3:37:41 PM Finished: 3:37:43 PM Elapsed: 2.719 seconds
However, the job is still in the agent and the status is executing. The implications of this are not good. Is this how the SQL Server Agent job task is supposed to work by design?
I need to know if I am able to use OLE DB connection strings without the username and password for Challenge/Response logins.
I have a web site that uses SSL and Challenge/Response to authenticate users, but my connection to the database is by embedding a generic username and password in the connection string.
I would like to leave that off and have the connection to the database use the challenge/response authentication from when they first logged into the web site. This way I can control their permissions in SQL Server.
I have 2 servers (say MAINSRV and SECSRV) running SQL2000 Standard SP3 on Windows 2000 Advanced within an NT (!) domain, and each server is linked to the other.
My problem is that if I run a query returning a few dozen rows, like:
SELECT * FROM MAINSRV.DbName.dbo.TblName TBLA WHERE Fieldx = 'anyval'
from a client connected to the SECSRV server, it takes something like 35 minutes to complete, while the same query completes in no time when run on clients connected to MAINSRV.
Even the simplest SELECT Count(*) FROM... takes more than one minute from SECSRV, while completing in a fraction of a second from MAINSRV.
I tried to change the linked server security options (SQL/Windows), but the remote query remains slow.
There are no locks active on the table, both servers have almost no load (CPU less than 10% when tested) and the query returns just a few KBytes, so communication overhead should not be the problem.
Any suggestions will be greatly appreciated, thank you!!!
One of my databases (named XYZ) shows suspect status in EM, but when I try to dig further by running the query below I get only "OK" (see query):
SELECT [name], status,
  case status
    when (status & 32) then 'Loading'
    when (status & 32) then 'Loading'
    when (status & 64) then 'Pre Recovery'
    when (status & 128) then 'Recovering'
    when (status & 256) then 'Not recoverd'
    when (status & 512) then ' Offline'
    when (status & 1024) then ' Single user'
    when (status & 1024) then ' Read only'
    when (status & 32768) then ' Emergency Mode'
    else 'OK'
  end as status
FROM master.dbo.sysdatabases
WHERE [name] NOT IN ('distribution', 'tempdb', 'model','Pubs','Northwind')
I also ran this:
select * from sysdatabases where databaseproperty(name,N'IsSuspect') <> 1
and here too I get all the databases, including "XYZ". I would expect that if "XYZ" is suspect, the result set should not include "XYZ".
If EM shows suspect status for the "XYZ" database, why doesn't it show up when I check the status column in the sysdatabases table?
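One thing that may explain the "OK" result: `CASE status WHEN (status & 256)` is an equality comparison between status and the masked value, not a test of whether the bit is set. The Python sketch below reproduces just that arithmetic. The bit values are the ones documented for sysdatabases.status in SQL Server 2000 (there, single user is 4096 and read only is 1024); the example status value and both function names are hypothetical.

```python
# Subset of the documented sysdatabases.status bits in SQL Server 2000.
STATUS_BITS = {
    32: "loading",
    64: "pre recovery",
    128: "recovering",
    256: "not recovered",
    512: "offline",
    1024: "read only",
    2048: "dbo use only",
    4096: "single user",
    32768: "emergency mode",
}


def decode_status(status):
    """Return the names of every state bit that is set in a status value."""
    return [name for bit, name in sorted(STATUS_BITS.items()) if status & bit]


def simple_case_matches(status, bit):
    """Mimic `CASE status WHEN (status & bit)`: an equality test, not a bit test."""
    return status == (status & bit)


# Hypothetical value: torn page detection (16) combined with "not recovered" (256).
example = 16 | 256
print(decode_status(example))             # ['not recovered']
print(simple_case_matches(example, 256))  # False
```

In T-SQL the equivalent bit test would be a searched CASE with conditions such as `WHEN status & 256 = 256`. This only accounts for why the CASE expression falls through to 'OK' when other option bits are set; it does not explain why the IsSuspect check also lists XYZ.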
We just moved source server to newer, bigger box ... Windows 2003 and Active Directory ... Snapshot agent worked but distribution failed ... Same login as on older machine, login is sysadm, used DCOMCNFG to allow ability to launch process ... What are we missing?
I am trying to set the account the SQL agent uses to a domain account so I can use SQL mail. The account works ok for the SQL service but not the agent.
I have tried re-installing SQL 2000, and even in the installation it will not allow the agent to use the domain account.
Does anyone have any ideas as to what could be causing this?
We have an application that has about 100 users at a time. Roughly once a day, we experience a complete slowdown on the server. All users notice it. The network seems fine because I can ping the server. Also, I can attach to drives on the server quite fast, so I don't think it's server resources. When I manage to get in and do an sp_who, certain processes are blocking others. Talking to the users who were blocking, they were not doing anything out of the ordinary; one was even doing just a select. The error log is full of 17824 and 1608 errors. Is there some configuration setting that I should change? This is getting serious!