SQL Server setup fails due to partitioned network warnings from cluster service

I was building a new SQL Server 2008 R2 failover cluster recently and hit a problem that I hadn’t seen before (which is rare, as I’ve seen a lot of cluster setup problems in my time). It was unusual because the error appeared before setup actually ran, while I was still working through the dialog boxes that configure setup.

The scenario was this:

1. The cluster was fully built and validated at the Windows level, and all resources were up and OK.
2. I was about to run SQL Server setup when I noticed the network binding order was wrong.
3. I changed it and then rebooted both nodes, as I always do before a cluster setup.
4. The nodes came back online OK and all resources came up as well.
5. I ran setup, but when I reached the cluster network configuration dialog box there were no networks to select from, so I couldn’t proceed.

My first thought was that I must have done something dumb when changing the network binding order, but checks on the network adapters showed that they were all up. Going back through a few other things, I noticed that the real cause was that the cluster service was having trouble connecting to one of the networks. There were two types of entry in the cluster logs and the system event logs, one error and one warning:

Error

Cluster network ‘xxxxx’ is partitioned. Some attached failover cluster nodes cannot communicate with each other over the network. The failover cluster was not able to determine the location of the failure. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

Warning

Cluster network interface ‘xxxxx – xxxxx’ for cluster node ‘xxxxx’ on network ‘xxxxx’ is unreachable by at least one other cluster node attached to the network. The failover cluster was not able to determine the location of the failure. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

I couldn’t get to the bottom of this on my own, so I engaged the help of some network specialists. The networks appeared to be up, and we could connect to them and use them independently outside the cluster, but the cluster was convinced that they were partitioned. To cut a long story short, after checking many things we realised that one of the networks was a teamed network implemented using BASP virtual adapters. After a node rebooted, this team was not coming up fast enough: the cluster service tried to bind it in as a resource before it was ready.

The fix was simple: we set the cluster service to delayed start, which gave the network team time to come up first, and then everything was fine. We didn’t need to make any other configuration changes. Once the cluster service was happy that the network was OK, SQL Server setup was able to continue.

Good luck with your cluster builds!

Kista Arbetsmarknadsdag – Basefarm Competition winner

Those of you who came to our stand at KTH earlier in the week may have noticed (and entered) our competition to win some very cool wireless headphones by guessing the bandwidth Basefarm serves from our Stockholm data center.

As with all such calculations, there are slightly different ways to work it out depending on how often you sample, what period you average over and so on, but the network team tell me that the correct answer is 546 Gbps (averaged on a daily basis over the year).

The lucky winner was Jennie Johansson (who guessed closest with 500), so watch out for her wearing her nice new headphones in the coming days. The prize is in the post, Jennie.

Follow up to Kista Arbetsmarknadsdag

As we mentioned earlier this week, several Basefarm employees were onsite at the Kista Arbetsmarknadsdag yesterday speaking about careers in the IT sector. It was a really enjoyable day and we met some really interesting and intelligent people. We took away a good number of CVs and applications, but we’d still like to receive more.

As a reminder, if you want to apply through the website you can see the official list of open positions here (only in Swedish):

https://www.basefarm.com/sv/jobb/Lediga-tjanster-Sverige/

and you can see the referral address of applications on this page:

https://www.basefarm.com/sv/jobb/

Please note that you must send your application to the

rekrytering-se@basefarm.se

mail address to be considered for roles in Sweden.

The list of available roles does not officially mention internships and the like but, as I explained to many of the attendees yesterday, we do actively offer these types of positions. In fact we have an intern in the Windows group in Stockholm at the moment, whom we met at another careers day back in 2011. If you are interested in this type of position, send an application to the above address telling us what you are after.

We will aim to respond to all applications and CVs received, but please bear with us as we received quite a few, and please don’t be disappointed if we can’t take you on, as we have a limited number of positions available. Whatever happens, I hope that our conversations were of interest and use to everyone, and good luck with your career searches. I hope to meet some of you again at the next stage of the process in the future 🙂

Compete with Basefarm at KAM 2012: Win headphones from Beats by Dr. Dre!

This Wednesday representatives from Basefarm will attend the career day Kista arbetsmarknadsdag (KAM) in Stockholm. We will run a competition in our booth where you students can compete to win awesome Beats Pro headphones from Beats by Dr. Dre. Come by our booth to find out how to win them! Wondering how to find us? Just listen for the music and look for our Basefarm t-shirts. We have created a Spotify playlist that will be played in the booth. If you want to listen to it today, the playlist is called Musik@Basefarm.

My colleague Graham has previously written that he will give a lecture at KAM about career opportunities in the IT industry. The lecture starts at 2.30 PM, so be sure to attend if you want tips and advice for your career.

See you on Wednesday!

Basefarm will be at Kista Arbetsmarknadsdag next week – come and meet us

A selection of Basefarm employees will be at Kista Arbetsmarknadsdag next week on 28th March. Details are here:

http://kam.ictcontact.se/se/about-kam

We’ll be on the conference floor all day and available to chat, and a colleague and I will also be giving a talk about IT career development at 1430.

We regularly try to take on graduates for both internships and full-time employment (in fact we currently have an intern working in the Windows group who joined as a result of the talk that Andreas and I did last year at KYH in Stockholm), so if you’re interested in hearing what we’ve got to say, please do pop along and say hello.

What does the new Basefarm Service Desk mean?

In January, we introduced new procedures for how we handle cases at Basefarm Service Desk. As our customers, you are probably wondering what this means for you, so here is a more detailed explanation:

The new procedures mean that Basefarm Service Desk in Sweden now takes care of all cases (incident management & change management) for Swedish customers during the daytime, from 7 AM to 5 PM. At other times (weekends & nights) we have passive readiness, where Basefarm Service Desk in Sweden works in close collaboration with Basefarm Service Desk in Norway to solve the cases. Previously, Sweden was staffed from 8 AM to 11 PM on weekdays.

Overall, for our customers it means that we are always available 24/7 in both Sweden and Norway. This results in benefits for you as a customer:

  • Accessibility increases at the local level, with 24/7 support in Sweden as well
  • We are physically closer to you as a customer
  • Our new model contributes to shorter resolution times
  • We will increase our ability to work proactively

This means that Basefarm Service Desk in Sweden has a wider overall perspective and can be more proactive and resolve cases faster than before. Norway can now focus solely on cases from Norwegian customers and thus also increase availability locally.

It’s mainly our Swedish customers who will notice the changes. Basefarm Service Desk in Norway already solves cases at a local level during the daytime.

Do you have any questions regarding the changes? Post a comment below or contact your local Basefarm Service Desk.

Tracing select statements on specific objects in SQL Server without using Profiler

An application developer asked me an interesting question the other day. He wanted to know (for reasons not worth going into here) whether his application issued select statements against a specific table in his database. The database was in production and under heavy load, so although we could have run a server-side SQL Profiler trace and then read through the results, this would have been time consuming, could have generated an extremely large amount of data, and would have put quite a heavy load on the server. We also wanted to run the monitoring for a number of days, so we needed something more lightweight.

I thought about this for a while and realised that the best way to achieve it (assuming you are running SQL Server 2008 or later) would be through the SQL Server Audit feature. This uses the extended events framework as the basis for its tracing and therefore falls into the lightweight category.

Here’s an example of what I wrote, converted into simple test objects which you can try yourself. It requires a table called dbo.audit_test to be present in a database named audit_db for you to test against.


USE master ;
GO
-- Create the server audit.
CREATE SERVER AUDIT test_server_audit
TO FILE ( FILEPATH =
'C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA' ) ;
GO

-- Enable the server audit.
ALTER SERVER AUDIT test_server_audit
WITH (STATE = ON) ;
GO
-- Move to the target database.
USE audit_db ;
GO

-- Create the database audit specification.
CREATE DATABASE AUDIT SPECIFICATION audit_test_table
FOR SERVER AUDIT test_server_audit
ADD (SELECT , INSERT, UPDATE
ON dbo.audit_test BY dbo,guest, public, db_datareader)
WITH (STATE = ON) ;
GO

/*
do some work here
which will trigger the audit to record something
*/

/* uncomment these statements to turn off the audit at either DB or server level

-- turn off the database audit
USE audit_db ;
GO
ALTER DATABASE AUDIT SPECIFICATION audit_test_table
WITH (STATE = OFF) ;
GO

-- turn off the server audit
USE master ;
GO
ALTER SERVER AUDIT test_server_audit
WITH (STATE = OFF) ;
GO

*/
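
If you don’t already have the test objects, the following is a minimal sketch of how they might be set up and exercised. The column names (id, note) and the sample rows are purely illustrative assumptions, not part of the original system. The table has to exist before the database audit specification is created, so the first part would be run ahead of the audit script and the second part at the point where the script says "do some work here".

```sql
-- Part 1: run before the audit script. Creates the test database and table.
-- The columns are hypothetical; any table named dbo.audit_test will do.
USE master ;
GO
IF DB_ID('audit_db') IS NULL
    CREATE DATABASE audit_db ;
GO
USE audit_db ;
GO
IF OBJECT_ID('dbo.audit_test', 'U') IS NULL
    CREATE TABLE dbo.audit_test
    (
        id   INT IDENTITY(1,1) PRIMARY KEY,
        note VARCHAR(100) NOT NULL
    ) ;
GO

-- Part 2: run once the audit is enabled. Each statement touches the
-- audited table, so each should show up in the audit output.
USE audit_db ;
GO
INSERT INTO dbo.audit_test (note) VALUES ('first row') ;
UPDATE dbo.audit_test SET note = 'changed' WHERE note = 'first row' ;
SELECT id, note FROM dbo.audit_test ;
GO
```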

Here are the key things to note about the audit script above:

1. It actually traces three types of table access: SELECT, INSERT and UPDATE
2. It traces specific users and groups, which you can change as relevant to your own case
3. It writes the output to the default DATA directory of a default 2008 R2 install, which you can change as you see fit
4. You need to watch the potential file space this will take up, as it can be very verbose in big systems
5. Watching the file space used in real time will not work, as the audit holds most of the data in memory and flushes it when you stop the audit
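
On the file space point, one option is to cap the audit files when the server audit is created. The following is a sketch only: the audit name test_server_audit_capped and the size values are assumptions chosen for illustration, and you would pick limits that suit your own disks. MAXSIZE and MAX_ROLLOVER_FILES are options of the TO FILE clause of CREATE SERVER AUDIT.

```sql
USE master ;
GO
-- Variant of the server audit with a cap on disk usage.
-- The audit name and the size values are illustrative assumptions.
CREATE SERVER AUDIT test_server_audit_capped
TO FILE
(
    FILEPATH = 'C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA',
    MAXSIZE = 100 MB,         -- start a new file once the current one reaches 100 MB
    MAX_ROLLOVER_FILES = 10   -- retain up to 10 rolled-over files; the oldest is removed
) ;
GO
```

With these values the audit should stay in the region of 1 GB on disk, at the cost of losing the oldest records once the rollover limit is reached. A database audit specification would then reference this audit name in its FOR SERVER AUDIT clause.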

Once you have the output you need (and you have turned off the audit, don’t forget!), you simply run something like this to view the data. You’ll need to locate the exact file name, as a new file is created each time you turn the audit on or off.


SELECT COUNT(*) AS occurrences, statement
FROM sys.fn_get_audit_file ('C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\test_server_audit_7E707DDD-03F3-4FFA-B24B-BB0DDBF4D5F3_0_129714455341990000.sqlaudit',default,default)
GROUP BY statement ;
GO

As you can see, the above does a simple count and aggregate of the results, but there are many columns in the output which you can write T-SQL against (although, since it’s backed by a file, access might be slow if you have large files or slow disks).

I found this to be a very effective technique, and I saw no noticeable impact on the performance of the server.
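
To answer the original question more directly (did the application select from the table?), a query along these lines could list only the SELECT events against the test table. This is a sketch under a few assumptions: it uses a wildcard in the file name so that the exact generated name isn’t needed, it assumes the audit files are still in the DATA directory used above, and it filters on action_id 'SL', the code that SQL Server Audit records for SELECT.

```sql
-- List audited SELECT statements against dbo.audit_test, oldest first.
-- The wildcard picks up every file written by test_server_audit.
SELECT
    event_time,               -- recorded in UTC
    server_principal_name,    -- login that ran the statement
    database_name,
    schema_name,
    object_name,
    statement
FROM sys.fn_get_audit_file
(
    'C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\test_server_audit_*.sqlaudit',
    DEFAULT,
    DEFAULT
)
WHERE action_id = 'SL'            -- SELECT events only
  AND schema_name = 'dbo'
  AND object_name = 'audit_test'
ORDER BY event_time ;
GO
```

If this returns no rows while the audit was running, that suggests no audited principal selected from the table during that period.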

You can read more about SQL Server Audit at MSDN.