I had a hard time with my node replacement, and I am hoping to find out what
I did wrong and perhaps get answers to a few lingering questions. I want to
step through what I did.
1. I was using a whole new name and IP address for the new node, so I placed
it on the network the day before and attached it to the SAN, but did not
assign it to the storage group, attach LUNs, or join it to any cluster.
The next day I first removed SQL from the second node. It came up with the
window that lets me pick the node. However, when I did this it removed my
virtual SQL cluster name and IP, which worried me tremendously. Is this
normal? Then I evicted the node from the cluster and it seemed okay.
I then turned off node 1, added the new node to the storage group, installed
all the appropriate SAN software, and rebooted. When my new node 2 came up, I
assigned the same drive letters to the shared storage as on the first node,
then shut down the second node and restarted the first. I waited for my
drives to appear okay, then started the second node.
I had all my IPs and the heartbeat configured and everything came up fine.
Then I added the new node 2 to the MS cluster and it went well.
Next I tried to load SQL, and I was sure it wouldn't become part of any SQL
cluster because it appeared the cluster didn't exist anymore. I re-added the
SQL name and IP to the cluster resources and went ahead and added it, but it
didn't seem to work right. Every time I fail over, it works and I can use the
SQL virtual IP, but I have to re-attach the DBs every time.
I am not sure if I missed a step, or if node 1 should have been off when I
removed SQL from node 2. One thing I am questioning is whether the SQL
services should be running on both nodes, even when one is passive.
Can anyone pinpoint my issue or help me with a post-game analysis?
Scott A Cummins
Sr. Systems Engineer
Equity Analytics
( A division of Merrill Lynch)
14614 N. Keirland Blvd
Scottsdale, AZ 85254
480-998-3515
It sounds like the uninstall didn't go quite right. It should have removed
SQL from the second node but not removed any resources from the SQL group.
After that, it was pretty much hosed.
Geoff N. Hiten
Senior SQL Infrastructure Consultant
Microsoft SQL Server MVP
|||Hey Geoff,
Yeah, that is where I had my first indication I was going to have trouble. I
was able to re-add the SQL cluster virtual resources manually, and I have good
failover, the databases are attached, and everything seems to be working
well.
I actually have two lingering questions. My SQL services are running and set
to "Automatic" on both nodes. When I failed over after the original setup,
I kept having to "attach" the databases manually. I thought this was because
my services aren't actually part of the SQL cluster, so I tried to add them
as generic services, and it tells me it can't because they are not installed
services. So I haven't tried to fail them over again; Tuesday night I will
see what happens.
Also, I may have some DC trouble that I was unaware of. I have a remote one
down in Mesa for my production side, and my PDC is here in Scottsdale. When I
attached the database to my new node and tried to add users, SQL said "Cannot
determine if TMPEMWDPROD003 is part of the domain". That was approximately 24
hours after I had installed the server and added it to the domain; I believe
that had cleared up by yesterday. I may have to reconfigure my DCs; all this
was set up before I started working here.
Let me ask this: should I go into SQL before doing anything and detach all
databases before uninstalling SQL, even though they were attached to the node
I was NOT replacing?
And thanks again, Geoff.
Scott A Cummins
|||You have a split-brain SQL install. In a cluster, the services should be
MANUAL and only started up on the currently active node by the cluster
service. You may need to detach the databases, uninstall SQL completely,
and reinstall the SQL clustered instance to get everything back the way it
is supposed to be.
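As a rough illustration of that rule (services set to Manual and started only on the active node), here is a minimal Python sketch. The node names, the `ServiceState` record, and the `find_split_brain` helper are all hypothetical; it checks an inventory you type in by hand and does not query Windows or the cluster service.

```python
from dataclasses import dataclass


@dataclass
class ServiceState:
    node: str        # cluster node name (hypothetical)
    service: str     # Windows service name
    start_mode: str  # "Manual", "Automatic", or "Disabled"
    running: bool    # whether the service is currently started


def find_split_brain(states, active_node):
    """Return a list of problems for a clustered SQL instance.

    Flags any service whose start mode is not Manual, and any service
    that is running on a node other than the active one.
    """
    problems = []
    for s in states:
        if s.start_mode.lower() != "manual":
            problems.append(
                f"{s.node}: {s.service} start mode is {s.start_mode}, expected Manual"
            )
        if s.running and s.node != active_node:
            problems.append(f"{s.node}: {s.service} is running on a passive node")
    return problems


# Hand-entered inventory resembling the situation described in this thread.
inventory = [
    ServiceState("NODE1", "MSSQLSERVER", "Automatic", True),
    ServiceState("NODE2", "MSSQLSERVER", "Automatic", True),
]

for line in find_split_brain(inventory, active_node="NODE1"):
    print(line)
```

An empty result would mean the inventory matches the rule; it says nothing about whether the instance itself is registered as clustered.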
Normally when you kick a node out, you don't have to detach anything. The
installer should simply remove the node from the possible owners of the
resource group and fix the internal SQL configuration. This is where
practicing on a virtual installation really pays off.
And yes, the DC problems may have contributed to the issue.
Geoff N. Hiten
|||Geoff,
Yeah, that is what I told my boss: that I may have to uninstall SQL entirely
and "rebuild" the SQL cluster. I am going to go ahead and "evict" the other
node and replace it. As long as everything else works fine and the only
problem I have is the SQL services and "attaching" the databases every time
we fail over, I will schedule the rebuild; it should be fairly simple. If
I uninstall everything and remove the virtual SQL name and IP from AD and
DNS, it should load like a first-time SQL cluster install... I HOPE!!
Scott A Cummins
|||You really need a test platform so you have all your steps practiced. MS
Virtual Server will allow you to build a "practice" cluster and walk through
these steps so you have a proven checklist. I use it heavily to test client
scenarios before touching actual live systems. Keeps the "oops" rate down.
Geoff N. Hiten
|||Hey Geoff,
I couldn't agree more. Unfortunately, we have nothing of the sort, so I have
insisted my boss get the equipment we need to build a new one and test these
procedures. We are a poor little company with a very stingy big brother (ML).
Until then..
What I did was add the SQL services manually to the SQL cluster group and
make them dependent on my database drive, transaction log drive, SQL virtual
IP, and virtual name. I set the services on my passive node to Manual and
stopped them. I am going to fail it over tomorrow afternoon and see what
happens. I am not sure if there are any registry settings I need, so I am
going to try it. If it doesn't work, I will limp along with what I have until
I can uninstall and reinstall everything or just build a whole new cluster.
If I am lucky, it will work as though it is a cluster. I get so angry when I
look at the properties of my SQL services in SQL Server Configuration Manager
and it says CLUSTERED = NO... OHHHHHHH
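For illustration, the dependencies described above can be modelled as a small graph to see the order in which the resources would have to come online. This is only a sketch in Python (3.9 or later, for `graphlib`): the resource names are made up, the Agent-on-SQL-Server dependency is an assumption not stated in this thread, and the code models ordering only. It does not talk to the cluster.

```python
from graphlib import TopologicalSorter

# Resource -> set of resources it depends on. All names are illustrative.
dependencies = {
    "SQL Network Name": {"SQL IP Address"},
    "SQL Server": {"Data Disk", "Log Disk", "SQL IP Address", "SQL Network Name"},
    "SQL Server Agent": {"SQL Server"},  # assumed dependency
}


def online_order(deps):
    """Return one valid order in which resources can be brought online.

    Every resource appears after all of the resources it depends on.
    """
    return list(TopologicalSorter(deps).static_order())


order = online_order(dependencies)
for position, resource in enumerate(order, start=1):
    print(position, resource)
```

Any order it prints puts the disks, IP address, and network name ahead of the SQL Server service, which is the point of declaring those dependencies on the service resource.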
Well, I will let you know.
Scott A Cummins
|||Find a desktop with 1-2GB of RAM and install Virtual Server 2005. I run it
on a high-end notebook.
GNH
Geoff N. Hiten
|||Hey Geoff,
Two last things. I have been poring over TechNet and MSDN looking for where
I made my mistake (and I am sure I did), and I came across a white paper that
talks about removing a SQL node by using SQL 2005 Setup and removing the
SQL instance (at least that is what I think it is talking about) while
basically leaving the SQL application structure intact. I, of course, went to
"Add/Remove Programs", selected my node, and nuked it. Do you know anything
about this, and is it a function of Setup to let you basically remove a
clustered instance yet leave the application (and hence the cluster) intact?
This is the link to what I was reading, and as you can see in line 6... Yeah,
I never got that "Maintain the Virtual Server" option, so perhaps I wasn't as
thorough in my research as I should have been, and I paid the price.
http://msdn2.microsoft.com/en-us/library/ms191545.aspx
Also, I do have a spare Dell 2950. I am thinking of using VMware to create
two virtual servers and "cluster" them; I just need to figure out the "shared
storage" and my "heartbeat". I have never clustered VM sessions. I only have
the freeware, so if it costs money to buy a full-blown version, I will have
to go hat in hand.
Geoff, I appreciate all your help on this, and at least it wasn't a total
disaster.
Scott A Cummins
|||Scott,
Yep, step 6 is exactly what you missed. A SQL Clustered Instance maintains
internal information on what nodes it can run on. The "Maintain the Virtual
Server" option is how you update this information. The Instance stays
intact, but the allowed nodes are changed by this process. This maintains
the separation between nodes and instances.
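One way to look at the node list an instance holds is to ask the instance itself. The Python sketch below only carries two T-SQL queries as strings (`SERVERPROPERTY('IsClustered')` and the `sys.dm_os_cluster_nodes` view in SQL Server 2005) and compares a hand-entered result with the nodes you expect. The node names and the `compare_nodes` helper are hypothetical, and nothing here connects to a server.

```python
# T-SQL to run against the instance with whatever client you use.
# These strings are not executed by this script.
IS_CLUSTERED_QUERY = "SELECT SERVERPROPERTY('IsClustered') AS IsClustered;"
CLUSTER_NODES_QUERY = "SELECT NodeName FROM sys.dm_os_cluster_nodes;"


def compare_nodes(expected, reported):
    """Compare node names case-insensitively.

    Returns (missing, unexpected): nodes you expected that the instance
    does not list, and nodes the instance lists that you did not expect.
    """
    exp = {name.upper() for name in expected}
    rep = {name.upper() for name in reported}
    return sorted(exp - rep), sorted(rep - exp)


# Hypothetical values standing in for real query results.
expected_nodes = ["NODE1", "NEWNODE2"]
reported_nodes = ["node1", "OLDNODE2"]

missing, unexpected = compare_nodes(expected_nodes, reported_nodes)
print("missing from instance:", missing)
print("listed but not expected:", unexpected)
```

If the first query returns 0, the instance does not consider itself clustered at all, and the node comparison is moot.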
You can download Virtual Server 2005 R2 SP1 for free:
http://www.microsoft.com/windowsserversystem/virtualserver/
Here is how to build a SQL 2005 test cluster using Virtual Server 2005:
http://support.microsoft.com/kb/891798
So, your basic requirement of zero cost is now met. All it takes is a bit
of time.
Geoff N. Hiten