Recovery :: OS Memory Reservation For Two Node Cluster Having Three Instances
Nov 9, 2015
We are running a 2-node Windows cluster with three SQL Server instances on it.
OS: Windows Server 2008 R2 SP1
SQL: SQL Server 2008 R2 (10.50.6529)
Currently both nodes have 256 GB of memory, and we have multiple automatic failovers of resources. What is the best practice for OS memory reservation (OS + tools) so that we can set the SQL max server memory settings accordingly?
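A common rule of thumb (not an official Microsoft formula) is to reserve roughly 10-20% of physical RAM, or at least 4-8 GB, for the OS and tools, and divide the remainder among the instances by expected workload. A minimal sketch, assuming three equally weighted instances on a 256 GB node; all values are illustrative, and you would run it per instance with that instance's share:

-- Hypothetical sizing: 256 GB node, ~16 GB reserved for OS/tools,
-- remaining ~240 GB split evenly across three instances (~80 GB each).
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 81920;  -- 80 GB for this instance
RECONFIGURE;

Keep in mind that after a failover all three instances can land on one node, so the three max values should together fit within a single node's RAM.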
What I have: a SQL Server 2012 (Standard Edition) cluster on Windows 2012 R2 with both instances running on the same node, just to save on licensing; i.e. technically it's an Active/Passive cluster.
What I am looking for: how to configure the cluster (e.g. via quorum, etc.) to force both instances to fail over together. That is, if for some reason the first instance fails over to node 2, the other instance should follow (otherwise it becomes an Active/Active cluster and a second license is required).
If there is no standard way (cluster configuration, I mean) to do it, I will have to create some custom process to monitor where each instance is running.
It is an active/passive cluster, which doesn't allow any testing. All instances have to be failed over together; we aren't allowed to fail over just one, even for testing purposes. Node 1 is the active node, and we can fail over to node 2 for 30 days free of charge, but services then have to be failed back.
We need to run the cluster with node 1 always as the primary node, and use node 2 just for failover testing or for periods of less than 30 days while performing cluster patch upgrades, etc.
Now I am sure we could fail over one instance at a time for testing and diagnosing issues; plus, if we add a new non-production instance to bring it to the same platform level as the rest of the instances, this would avoid taking production down in the failover process.
Cluster services give the high availability needed, and that is great. But I have never seen any discussion about what happens when a node fails: what do you do to get everything back to the active/passive tandem? I imagine there is not much difference in the recovery procedure for either the active or the passive node, so I'm just going to describe a scenario that we have encountered. The system hard drive (not the shared disk) on the primary node fails, and the cluster fails over to the passive node. These are the problems I have at hand:

- After installing Windows, I need to install drivers and configure the permissions to access the SAN. There is no way I can do that, since the secondary node has exclusive access to the disks.
- Imagine I got that working: is there any way to install SQL Server so it would know this server used to be the primary node and attach the databases and transaction logs automatically?
- Finally, there is no proper way to apply SQL Server 2000 Service Pack 3a. Originally, when the cluster was fully functional, the service pack was applied to the active node, which automatically upgraded the passive node. Now we have one machine without 3a and one machine with 3a already installed. See any problem?

Consider all of the above as this one big question: what is the proper procedure to restore a cluster when one of the nodes goes down, whether it's the active or the passive node?
We have 2 clusters: one running SQL Server 2008 on Windows Server 2008 R2 and one running SQL Server 2000 on Windows Server 2003. Because of a disaster with the disks, each of the passive nodes had to be rebuilt, and I've been asked to install SQL Server on those nodes.
I've not done this before. Does this mean simply adding a new node to the cluster through the wizard? Or do I need to reinstall the entire cluster?
I think SQL Server 2000 is too risky, as it's unsupported, so I'm going to resist that. But how should I approach the SQL Server 2008 instance?
In case of an unrecoverable hardware issue, I have two MSDN articles which state different things.
The first one claims you remove the node from MSCS.
[URL]
The second one claims you should remove it using SQL Server installation, yet links to the first article, which says you should do it from MSCS:
[URL]
Then this third article contradicts the second one: "To remove a node from an existing SQL Server failover cluster, you must run SQL Server Setup on the node that is to be removed from the SQL Server failover cluster instance."
[URL]
It is a hardware failure where the secondary node is inaccessible.
So what is the proper way to evict a node you cannot access due to a hardware failure?
Note: I don't plan on adding the failed node back after removing it; i.e., I am only interested in the removal part.
We have two locations in the US. I am thinking of having a 2-node SQL cluster for Lync 2010. I already have one DB server running in one location; now we have a new site where we are planning to have one more DB server for redundancy.
I have a Windows 2008 R2 AlwaysOn cluster with 3 nodes (two in the primary site and one in the DR site).
Primary Site:
- Primary Site Server1
- Primary Site Server2

DR Site 1 (to be decommed):
- DR Site Server1
Our company is planning on decommissioning the DR site. But before we do, we want to add a fourth site to the cluster, migrate the data, and then decommission the original DR site.
Is it possible to have this configuration:
Primary Site:
- Primary Site Server1
- Primary Site Server2

DR Site 1 (to be decommed):
- DR Site Server1

DR Site 2 (NEW DR Site):
- DR Site Server1
If this is possible, do I simply add the new DR site to the existing cluster (the same steps as adding the first DR node when the cluster was originally configured), or are there special steps?
My environment has a 4-node cluster: 2 nodes in the primary and 2 in the secondary data center. Storage is separate for both.
We need to set up AlwaysOn for 4 instances on the 2 nodes of the primary data center. Is there any restriction on setting up AlwaysOn for multiple instances in a cluster?
Is it possible to have more than one instance of SQL Server on a failover Active/Passive cluster? What are the concerns/ramifications if that indeed is possible?
Server: Windows Server 2008. DB Server: SQL Server 2008 (SP1).
Here is the series of events which happened.
1.) Event ID: 1135 Cluster node 'XYZ' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
2.) Event ID: 1049 Cluster IP address resource 'SQL IP Address 1 (XYZ)' cannot be brought online because a duplicate IP address '10.9.8.113' was detected on the network. Please ensure all IP addresses are unique.
3.) Event ID: 1069 Cluster resource 'SQL IP Address 1 (XYZ)' in clustered service or application 'SQL Server (MSSQLSERVER)' failed.
4.) Event ID: 1049 Cluster IP address resource 'Cluster IP Address' cannot be brought online because a duplicate IP address '10.9.8.112' was detected on the network. Please ensure all IP addresses are unique.
5.) Event ID: 1069 Cluster resource 'Cluster IP Address' in clustered service or application 'Cluster Group' failed.
6.) Event ID: 1066 Cluster disk resource 'Cluster Disk 25' indicates corruption for volume '\\?\Volume{88552e6f-aea2-11df-9790-0026b92fffa7}'. Chkdsk is being run to repair problems. The disk will be unavailable until Chkdsk completes. Chkdsk output will be logged to file 'C:\Windows\Cluster\Reports\ChkDsk_ResCluster Disk 25_Disk16Part1.log'. Chkdsk may also write information to the Application Event Log.
7.) Event ID: 1066 Cluster disk resource 'Cluster Disk 26' indicates corruption for volume '\\?\Volume{88552e05-aea2-11df-9790-0026b92fffa7}'. Chkdsk is being run to repair problems. The disk will be unavailable until Chkdsk completes. Chkdsk output will be logged to file 'C:\Windows\Cluster\Reports\ChkDsk_ResCluster Disk 26_Disk4Part1.log'. Chkdsk may also write information to the Application Event Log.
8.) Event ID: 1049 (Same message as point 2)
9.) Event ID: 1069 (Same message as point 3)
10.) Event ID : 1049 (same message as point 4)
11.) Event ID :1069 (same message as point 5)
12.) Event ID :1205 The Cluster service failed to bring clustered service or application 'Cluster Group' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.
13.) Event ID: 1069 Cluster resource 'Cluster Disk 17' in clustered service or application 'SQL Server (MSSQLSERVER)' failed.
14.) Event ID: 1049 (same message as point 2)
15.) Event ID: 1069 Cluster resource 'SQL IP Address 1 (XYZ)' in clustered service or application 'SQL Server (MSSQLSERVER)' failed.
16.) Event ID : 1205 The Cluster service failed to bring clustered service or application 'SQL Server (MSSQLSERVER)' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.
First of all, I went through all the logs and could not find the reason for the failover initiation; something should have been logged about why the failover happened. Secondly, after the failover the service was not coming online due to duplicate IP address detection.
Later, when we try to manually bring the service online from Cluster Management, it comes online successfully. I don't understand how the duplicate IP address would get resolved when we start it manually.
Lastly, we see a few errors related to the physical disk resources between failover retries. Could these be correlated with the failover errors?
On the first node, A: the server has 16 GB of physical RAM. On the second node, B: the server has 10 GB of physical RAM.
Now, this being Active/Active, Node A can fail over onto Node B on failure. The reporting server instance is configured under these two nodes, with max server memory set to 12 GB and min server memory set to 0. With this setting, whenever the cluster moves, the configuration leaves the OS starved on the 10 GB node. Am I only left with the option of switching max and min back to the defaults, or is there another alternative, such as a script which can change this setting accordingly when the cluster moves to the respective server?
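One possible approach, sketched below under the assumption of SQL Server 2005 or later: a SQL Agent job that checks which physical node the instance came up on and sets max server memory to suit that node. NODEA and NODEB are hypothetical names standing in for the 16 GB and 10 GB servers; the memory values are illustrative only.

-- Adjust max server memory based on the physical node we started on.
DECLARE @node sysname, @maxmem int;
SET @node = CAST(SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS sysname);
SET @maxmem = CASE @node
                  WHEN 'NODEA' THEN 12288  -- 12 GB on the 16 GB node
                  WHEN 'NODEB' THEN 7168   -- 7 GB on the 10 GB node
                  ELSE 7168                -- default to the conservative value
              END;
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', @maxmem;
RECONFIGURE;

Scheduling the job with the "Start automatically when SQL Server Agent starts" schedule type makes it run after every failover.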
Could someone give me an idea of what it takes to upgrade the memory in my cluster? Is it as easy as upgrading the passive node, switching the nodes, then upgrading the other server? Do I need to reconfigure the Windows cluster? And will both servers need to be down at the same time at any point?
Or is there more to the process? Thanks for any assistance.
But I'm not sure if I have to install SQL Server first on node 2, then add it to the cluster. Or does adding it to the cluster also install the software?
I'm contemplating running two availability groups on a two-node WSFC. The WSFC is set up with a file share witness (i.e. no shared storage). Can I safely run one AG with its primary on one node, and the other AG with its primary on the other node? Each AG would have a replica on the other node. This would effectively allow both servers to be in use at the same time. In a failover event, I understand that both workloads would transfer to a single server, so each box needs to be sized appropriately.
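For what it's worth, running one AG primary on each node of a two-node WSFC is a common pattern. A quick way to confirm which server currently hosts each AG's primary replica, as a sketch using the standard HADR catalog views and DMVs:

-- List each availability group and the server hosting its primary replica.
SELECT ag.name AS ag_name,
       ar.replica_server_name,
       ars.role_desc
FROM sys.dm_hadr_availability_replica_states AS ars
JOIN sys.availability_groups  AS ag ON ag.group_id  = ars.group_id
JOIN sys.availability_replicas AS ar ON ar.replica_id = ars.replica_id
WHERE ars.role_desc = 'PRIMARY';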
We are in the process of building a 3 node SQL Server Cluster (Server 2012/ SQL Server 2012), and we have configured the quorum so that all 3 nodes have a vote (no file share witness as we already have an odd number of nodes).
As I understand it, this should allow the cluster to run as long as 2 of the nodes remain online.
However, the validation report states that 2 node failures would be acceptable and, when we tested this by powering off two of the nodes, the cluster did indeed continue to run on a single node.
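The likely explanation is dynamic quorum, which Windows Server 2012 enables by default: as nodes go down, the cluster recalculates the votes of the survivors, so a sequential loss of two nodes can still leave a quorate single node. To inspect the current votes from inside SQL Server (SQL 2012 and later), a sketch:

-- Show cluster members, their state, and their current quorum votes.
-- Returns an empty set if the instance is not on a WSFC node.
SELECT member_name,
       member_type_desc,
       member_state_desc,
       number_of_quorum_votes
FROM sys.dm_hadr_cluster_members;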
I configured a Windows 2003 R2 and SQL Server 2005 two-node cluster. When I move a cluster resource from one node to another, it takes around 30 seconds to come online, so any query running during that time stops responding.
While adding a node to a SQL Server failover cluster (on the passive node), the installation fails with an error like: "The MOF compiler could not connect with the WMI server. This is either because of a semantic error such as an incompatibility with the existing WMI repository or an actual error such as the failure of the WMI server to start." We ran the commands below but didn't get any resolution; we got the same error again.
1st Method…
1. Open console command (Run->CMD with administrator privileges).
2. net stop winmgmt
3. Rename the folder %windir%\System32\Wbem\Repository to something else, for backup purposes (for example, _Repository).
I invoke the xp_cmdshell proc from inside a stored procedure on a 2-node active/passive SQL 2005 SP2 Standard cluster. Depending on which server xp_cmdshell gets executed on, I need to pass different arguments in the shell command. I thought I could use the host_name() function to get the runtime process server; however, I am finding that it's not behaving correctly. In one example I know my active node is server2, but the host_name() function is returning server1. The only thing that could possibly explain this is that the MSDTC cluster group is not always on the same active node as the SQL Server group, and in the case I am talking about, the cluster groups are in this mode (different nodes). Does xp_cmdshell get executed by the SQL active node or the MSDTC active node? And what is the best way to find out which server is going to run my xp_cmdshell?
Thanks.
Edit:
Perhaps another by-product of this is that if I run select host_name() from a Management Studio query window, I get different results depending on which server I am running Management Studio on. On server1 I get server1, and on server2 I get server2, all the while server2 is the active node. I need a different function that will always let me determine the correct server that will be running the xp_cmdshell...
Edit 2: I guess I could determine the running host inside the command shell itself, but I am curious to see if I can do it (more cleanly) from SQL.
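The confusion here is that HOST_NAME() returns the name of the client workstation issuing the query, not the server, which is why the result tracks whichever machine Management Studio runs on. xp_cmdshell executes on the node currently owning the SQL Server resource (not the MSDTC group), and SQL 2005 exposes that node directly as a server property:

-- Physical node vs. virtual network name vs. client host name.
SELECT SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS physical_node,
       SERVERPROPERTY('MachineName')                 AS virtual_name,
       HOST_NAME()                                   AS client_host;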
We have a requirement to build a SQL environment which will give us local high availability and disaster recovery to a second site. We have two sites: Site A and Site B. We are planning to have two nodes at Site A and two nodes at Site B; all four nodes will be part of the same Windows failover cluster. We will build two SQL clusters: InstanceA will be clustered between the nodes at Site A, and InstanceB will be clustered between the nodes at Site B. We will enable AlwaysOn between InstanceA and InstanceB, with InstanceA as the primary owner, so data written to InstanceA will be replicated to InstanceB. [URL] Now we also want an InstanceC at Site B, to which data will be written from the application available at Site B and replicated to an instance at Site A as a replica.
Our management wants to have two instances of SQL Server 2005 in a clustered environment.
The first instance will serve the existing application, and the second instance will serve their new application.
As Microsoft suggests, it's not good practice to have multiple instances on the same node unless you have a compelling reason for such a setup. The compelling reason management has is that some time ago they used the same instance for both applications. What happened was that the secondary application took all the resources from the primary (main) application, and the server went down (that was in the past).
So now, in clustering, they want to have the first instance (which will serve our existing application) with enough CPU cores and memory that it can run smoothly, and then have the second instance with the remaining CPU power and memory. That way, if the second instance tries to eat up all the resources, it can only eat what it has; it cannot take resources from the primary instance.
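For reference, the usual way such caps are implemented on SQL Server 2005 is per-instance max server memory plus a CPU affinity mask (Resource Governor does not exist until SQL 2008). A minimal sketch, assuming a hypothetical 8-core box with cores 0-5 reserved for the primary instance and cores 6-7 for the second; the bitmasks and memory values are illustrative only:

-- On the primary instance: cores 0-5 (bitmask 0x3F = 63), 24 GB cap.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'affinity mask', 63;
EXEC sp_configure 'max server memory (MB)', 24576;
RECONFIGURE;

-- On the second instance: cores 6-7 (bitmask 0xC0 = 192), 8 GB cap.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'affinity mask', 192;
EXEC sp_configure 'max server memory (MB)', 8192;
RECONFIGURE;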
But my idea here is this:
Initially, don't install the second instance for the new application in the clustering environment. Instead, use the existing servers, where production and DR are right now (standalone servers), for the upcoming new application.
What I mean is: wait until we are sure the existing (primary) application is comfortable in the clustering environment (in PRODUCTION); until then, run the second application on the old PROD box, with the old DR serving as its DR. By going this way, we are not initially installing two instances on the same node. When life is good for the first instance, then, based on those results, we can think about the second instance.
I wanted to know what you guys think about this. Does this idea look feasible? Please let me know.
We are planning to change all IPs of a PRODUCTION failover cluster setup. In my cluster setup we have 2 physical nodes with Windows 2008; the roles are MSDTC and SQL 2008 R2.
IP change for:
1. Both nodes (physical)
2. MSDTC
3. SQL Server
4. Windows cluster

So almost all IPs are going to change.
I'm the DBA here; I need to take care of the SQL cluster and MSDTC. I haven't performed this activity before, so I'm worried about the impacts and consequences of this change, and about the steps for performing it.
We have (had) an active/active cluster. 2 physical machines, each running their own instance, clustered together. Node1/Ins1 and Node2/Ins2.
Node2 failed and Ins2 failed over to Node1 as it should. Node2 required that we rebuild the server (rebuild = reinstall O/S). Now we need to get Node2 back into the cluster and get Ins2 failed back over to Node2.
Does anyone know, for certain, the correct steps to accomplish this? Obviously, we could back everything up, completely destroy Ins2, recreate it on Node2, and then rejoin the cluster. But I'm looking for something less destructive.
Is it possible to reinstall SQL, then rejoin the cluster, and then fail Ins2 back over to Node2? Or will there be registry conflicts?
Any help would be appreciated. Also, if you have any links to some official documentation, that would be great too.
How do I add my second (secondary) node to my AlwaysOn Availability Group after adding my head node, given that the secondary node is a virtual machine? Based on the attached file, can you tell whether this is the correct way?
Is it possible to set up a cluster with SQL Server 2005 as a single-node cluster, and then, say, in 4 months add the second node to the cluster? It's because of budget, and we do not want to set everything up again.
We have been working on a project to upgrade the servers in our 2-node SQL Server environment. I evicted a node after removing it from the instance. We added the new node under a new server name. I then started Add/Remove Programs, chose to change the SQL 2005 environment, typed the virtual server name, chose to maintain the virtual server, and picked "add new node". All seemed well; I answered all the prompts, and when the install began I got the error below.
Product: Microsoft SQL Server 2005 -- Error 1706. An installation package for the product Microsoft SQL Server 2005 cannot be found. Try the installation again using a valid copy of the installation package 'SqlRun_SQL.msi'.
So I copied the SQL Server 2005 Enterprise Edition software to the local C: drive and pointed it to the 'SqlRun_SQL.msi' in the setup folder, but I still get the error.
A Microsoft cluster (SQL failover cluster) with one node as the domain controller. The cluster was built off-site, and the domain name used is the same as our existing domain, where we eventually need to install this cluster.
We need (at least I think we need):
To remove node 2 from the "cluster domain", DCPROMO node 1, and eliminate the "cluster domain". We then need to join the cluster nodes to the existing domain. We also need to recreate the accounts and groups used during installation.
Questions:
1) What will happen to the "domain accounts" used when installing SQL 2005? (Other than that they will go away; I mean, what adverse impact will that have on the installation?)
2) Will I have to re-install SQL 2005?
3) Is my paranoia real or imagined? (Will Elvis live?)
Any prior experience with this would be greatly appreciated. In fact, a WAG is appreciated too.
Currently in my environment we are using SQL Server 2012. We set up AlwaysOn with synchronous commit. Details of the existing AlwaysOn: one primary and two secondaries.
Primary: on-premises server. Secondary1: on-premises server. Secondary2: Azure VM. Requirement: we need to add a Secondary3, a new Azure VM, to the same AG in asynchronous or synchronous mode; or else create one more AG on the same DB and add the new replica as asynchronous. Are these two options possible in this scenario? My cluster environment is manual failover only, not auto failover.
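Adding a third secondary to the existing AG is the straightforward option (SQL Server 2012 Enterprise supports up to four secondaries). The second option as stated will not work, because a database can belong to only one availability group. A minimal sketch, with AG1, SECONDARY3, and the endpoint URL as placeholders:

-- On the primary: add the new asynchronous replica.
ALTER AVAILABILITY GROUP [AG1]
ADD REPLICA ON 'SECONDARY3' WITH (
    ENDPOINT_URL = 'TCP://secondary3.example.com:5022',
    AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
    FAILOVER_MODE = MANUAL
);

-- On SECONDARY3, after restoring the databases WITH NORECOVERY:
ALTER AVAILABILITY GROUP [AG1] JOIN;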
We have a 2-node Windows Server 2012 R2 and SQL Server 2012 Enterprise cluster setup. We can switch roles and nodes from one node to another and revert back to the previous node without any issues. But we face a problem when one node is restarted: we cannot get the Cluster Service on that node to start in Failover Cluster Manager. The error details, quoted: "Cluster node NODE1 could not join the cluster because it failed to communicate over the network with any other node in the cluster. Verify the network connectivity and configuration of any network firewalls."
I checked Windows Firewall; it is off on Node1, Node2, the SAN, and the DC. I have disabled and re-enabled the internal and private networks of Node1. I have validated the cluster, and it shows no errors.
Node1: Public IP: 10.10.0.11, Subnet Mask: 255.255.255.0, Default Gateway: 10.10.0.1, Preferred DNS: 10.10.0.10 (IP of the DNS)
[code]....
Private network: not configured. Pinging each other's IPs is successful from one node to the other.
We had an existing 2-node active/active cluster, one node running a default instance of SQL Server 2005 Enterprise Edition 9.0.3152 (SP2 + hotfixes) and the other running a named instance of the same version.
We recently added 2 new nodes to the cluster; they were added successfully, and we tested cluster group failover to the new nodes successfully.
Last night we tried to install SQL Server 2005 Enterprise Edition on the new nodes.
I followed the proper procedure of modifying the installation for both instances and selecting the 2 new nodes to apply them to. This went 100%: SQL Server 2005 installed successfully for both instances on the 2 new nodes, and all log files were successful.
We then tried to apply SP2. We tried the following:
1. We ran SP2 from the active node, but when we got to the screen where you select what to apply SP2 to, we could not select anything; if you clicked on the database engine, the message said that these instances were already at a later version and we could not proceed. This is how I successfully applied SP2 to the original 2-node cluster, but it does not work for nodes added to an existing cluster.
This is also what all the documentation we could find said; refer to the SP2 release notes under the topic "Failover Cluster Installation". It is also the method we found when googling.
2. We then tried what is described in the SP2 release notes under "Rebuild a SQL Server 2005 SP2 Failover Cluster Node". We ran SP2 from the new nodes while they were passive, but when we got to the screen where you select what to apply SP2 to, we could not select the database engine; the message at the bottom said that SP2 must be run from the active node and that we were attempting it from a passive node, which is what we tried in step 1 described above.
3. This was a last resort. We were advised to try failing the instance over to the new node and then running SP2. Personally I thought this was a bad idea: one should never fail over an instance of SQL Server to a node with incompatible binary versions, and secondly, when we installed SQL Server on the new nodes, a warning popped up beforehand stating that the instances were at a later version and that the new nodes must be at this version before attempting failover. I thought that SQL would not even start; to my surprise, we successfully failed the SQL group over to the new node. When we ran SP2 it looked good, and we could select the database engine on the new node to apply SP2 to, BUT after clicking next, within a few seconds the SP2 installation just closed. NO INFORMATIONAL MESSAGES, NO ERRORS, NO WARNINGS; it just closed and never came back.
I had never seen this happen on a cluster before. Needless to say, this made me very nervous, so we failed the SQL group back to the original nodes and gave up.
PLEASE can someone tell me how to apply SP2 to 2 new nodes in a 4-node cluster? All methods described in the SP2 release notes and other documentation, as described above in steps 1 and 2, do not work!
Does anyone know of a way within T-SQL to identify the cluster node that the current SQL Server is running on?
We have a cluster of two SQL 2000 servers, one active and one passive. For some reason the two nodes are not identical, and some code needs to run differently on each node. We need a check, before running some code, to find out which node the SQL instance is running on.
At present the only way we can find this out is through the cluster management on the server.
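On SQL Server 2005 and later this is simply SERVERPROPERTY('ComputerNamePhysicalNetBIOS'), but that property does not exist in SQL 2000. A workaround often used on SQL 2000 is to read the physical computer name from the node's registry via the undocumented xp_regread; a sketch, to be tested carefully before relying on it:

-- SQL 2000: read the physical node name from the local registry.
-- xp_regread is undocumented and unsupported.
DECLARE @node sysname
EXEC master..xp_regread
    @rootkey    = 'HKEY_LOCAL_MACHINE',
    @key        = 'SYSTEM\CurrentControlSet\Control\ComputerName\ActiveComputerName',
    @value_name = 'ComputerName',
    @value      = @node OUTPUT
SELECT @node AS physical_node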