Issue with Windows pass through authentication

It began with an error thrown by a linked server. The linked server is on our warehouse server and connects to our SSIS server using the “login’s current security context” option.

Helpful

The error was not hugely helpful, and internet searches kept coming back with partial matches, mostly with additional text. Our old friend: “cannot generate SSPI context”.

What changed?

When a working thing becomes a non-working thing, my first question is “what changed?” In this case nothing had changed on either server, but as I scanned through the recent changes I could see that Windows patching had happened the previous night.

Windows patching and what was starting to look like some sort of AD authentication error.

To test the emerging theory I switched the linked server to use SQL authentication and it worked as expected.
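The switch itself is a one-liner. A minimal sketch, with the linked server and login names as placeholders:

exec sp_addlinkedsrvlogin
	@rmtsrvname = N'SSIS_SERVER',	-- linked server name (placeholder)
	@useself = 'FALSE',				-- stop passing through the caller's Windows credentials
	@locallogin = NULL,				-- apply to all local logins
	@rmtuser = N'test_sql_login',	-- SQL login on the remote server (placeholder)
	@rmtpassword = N'********';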

The culprit

Digging into the patching, it turned out that a bunch of domain controllers had received this patch.

The known issues include the following:

“After installing the November security updates, released November 9, 2021 on your Domain Controllers (DC) that are running a version of Windows Server, you might have authentication failures on servers relating to Kerberos Tickets acquired via S4u2self.”

A hotfix was released five days later and this fixed our linked server issue.


Query the status of SQL Agent jobs

I’ve been involved in building a solution on top of SQL Server peer-to-peer replication (SQL 2014 SP1) and slowly building up a script library for general DBA tasks within that solution. One common task is to add a new table/article to the publication.

The basic steps for adding an article are outlined here, and the end of the process requires the distribution agents to be restarted.
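The add-article step itself boils down to a call like the sketch below – the publication and table names are placeholders, and the real procedure takes many more optional parameters:

exec sp_addarticle
	@publication = N'my_p2p_publication',	-- placeholder
	@article = N'my_new_table',				-- placeholder
	@source_owner = N'dbo',
	@source_object = N'my_new_table';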

I set about building a template script that I could drop my table creates into and run; the script creates the table, adds the articles and then restarts the distribution jobs.

My first attempt to restart the jobs was:

exec msdb..sp_stop_job @job_name = 'my distribution agent job';
exec msdb..sp_start_job @job_name = 'my distribution agent job';

With this result:

Job 'my distribution agent job' stopped successfully.
Msg 22022, Level 16, State 1, Line 124
SQLServerAgent Error: Request to run job my distribution agent job (from User *****) refused because the job is already running from a request by User *****.

If you dig into the sp_stop_job procedure you find a call to sp_sqlagent_notify, which in turn calls master.dbo.xp_sqlagent_notify. This call goes out from the SQL Server service to the SQL Agent service and would appear to be asynchronous. You can see that by running something like the following, which stops the job and then queries its state immediately afterwards.

exec msdb..sp_stop_job @job_name = 'my distribution agent job';
exec msdb..sp_help_job @job_name = 'my distribution agent job', @job_aspect = 'JOB';

The @job_aspect parameter means the procedure returns only a single row, which includes a column showing the running status of the job – in this case still running.

The next thought was to send the stop job request, loop until it had completed, and then run the start. A similar issue arises when querying the status of the jobs. My first preference was to query the result from system views/tables; something like:

set nocount on;

exec msdb..sp_stop_job @job_name = 'My distribution job';

while exists(
SELECT 1
FROM msdb.dbo.sysjobs_view job
JOIN msdb.dbo.sysjobactivity activity
ON job.job_id = activity.job_id
JOIN msdb.dbo.syssessions sess
ON sess.session_id = activity.session_id
JOIN
(	SELECT MAX( agent_start_date ) AS max_agent_start_date
    FROM msdb.dbo.syssessions
) sess_max
ON sess.agent_start_date = sess_max.max_agent_start_date
WHERE run_requested_date IS NOT NULL AND stop_execution_date IS NULL AND job.name = 'My distribution job'
)
begin
	waitfor delay '00:00:01'; -- poll politely rather than spinning on the CPU
end;

exec msdb..sp_start_job @job_name = 'My distribution job';

But this suffers from the same problem – the asynchronous nature of xp_sqlagent_notify. It looks like there are still some completion actions that take place after the session has ended.

After much experimentation the most reliable way I could find of accurately determining the status (in close to real time) was by using another call out to the agent service: xp_sqlagent_enum_jobs. Loading a temp table with the results of a proc call and then querying those results was not as elegant a solution as I was looking for – but here it is:

declare @running bit = 1;
declare @jobs table(
	job_id					uniqueidentifier,
	[Last Run Date]			int,
	[Last Run Time]			int,
	[Next Run Date]			int,
	[Next Run Time]			int,
	[Next Run Schedule ID]	int,
	[Requested To Run]		int,
	[Request Source]		int,
	[Request Source ID]		varchar(100),
	[Running]				int,
	[Current Step]			int,
	[Current Retry Attempt]	int,
	[State]					int
);

declare @job_id uniqueidentifier = (select job_id from msdb..sysjobs where name = 'My distribution job');

exec msdb..sp_stop_job @job_name = 'My distribution job';
while (@running = 1)
begin
	insert @jobs
	execute master.dbo.xp_sqlagent_enum_jobs 1, 'dbo', @job_id;

	select @running = [Running] from @jobs;
	delete @jobs;

	if (@running = 1)
		waitfor delay '00:00:01'; -- poll politely rather than spinning on the CPU
end;
exec msdb..sp_start_job @job_name = 'My distribution job';

Drop-down boxes not working in SQL Server 2012 Configuration Manager

After a data centre power outage a decision was made to switch the start-up mode of a bunch of our SQL Servers to manual, so that the on-call DBA could manage the reintroduction of the servers in a controlled fashion.

I jumped onto the first of our Windows Server 2012 R2, SQL Server 2012 SP1 boxes and ran SQLServerManager11.msc to start SQL Server 2012 Configuration Manager. Select the SQL engine service, right-click and select Properties, click the Service tab and click the Start Mode drop-down:

[Screenshot: the Start Mode drop-down with no values]

The drop-down list has no values populated. (Okay – I took the screen grab after I’d changed the setting, but you get the picture.)

Strangely, when I opened SQL Server Configuration Manager from the application screen, I got the drop-down values as expected:

[Screenshot: the Start Mode drop-down populated with values]

After a bit of digging around I found that there were two versions of SQLServerManager11.msc installed: one in C:\Windows\System32 and the other in C:\Windows\SysWOW64. The one in System32 doesn’t populate the drop-down; the one in SysWOW64 does.

And the first value in the server’s PATH environment variable? %SystemRoot%\system32;

Mystery solved (sort of – I still don’t understand why two different versions of the snap-in are installed, and more to the point why one doesn’t work).

I toyed with a few different workarounds for this. I could just use the app screen to start Configuration Manager. In one experiment I added %SystemRoot%\SysWOW64; to the PATH environment variable and renamed the snap-in in the System32 folder. That worked – but felt like a hack.

Because I’m trying to learn PowerShell at the moment, and building up my script library of DBA tasks, I ended up making the configuration change with PowerShell scripts.

Check the settings of all local SQL services:

Get-WmiObject win32_service | Where-Object { $_.Name -like "*sql*" }

Change the start mode of the default SQL Server engine:

Set-Service mssqlserver -StartupType Manual

and the SQL Agent service:

Set-Service sqlserveragent -StartupType Manual

Just a tip – if you are working with named instances you need to escape the $ sign in the name:

Set-Service mssql`$[instance name] -StartupType Manual

Behaviour change for delete … output … into in SQL Server 2014

A colleague of mine discovered an interesting behaviour change from SQL Server 2012 to SQL Server 2014. We recently upgraded one of our servers from SQL 2012 to SQL 2014 and a job started failing. (Although I haven’t explicitly tested this behaviour on earlier versions, the job dates back to SQL 2005.)

The error:

Msg 544, Level 16, State 1, Line 31
Cannot insert explicit value for identity column in table ‘#temp’ when IDENTITY_INSERT is set to OFF.

Fair enough – but the job and procedure had not changed across the upgrade, so why were we suddenly getting this error?

To test and isolate the issue I ran the following on a SQL 2012 SP2 instance and then a SQL 2014 RTM instance.

Create a table with an identity column:

use tempdb
go

create table ident_table (id int identity(1,1),val varchar(255))

insert into ident_table(val)
values ('a'),('b'),('c'),('d'),('e'),('f'),('g'),('h')

Create a copy of this table using the select into syntax:

select top 0 * 
into #temp
from ident_table

Verify that the identity property has copied across:

select name, is_identity from sys.columns
where [object_id] = object_id('ident_table')

select name, is_identity from sys.columns
where [object_id] = object_id('#temp')

Now we try to populate our temp table from the original table with an explicit insert:

insert 
into #temp(id,val)
select id,val from ident_table

Which fails as expected:

Msg 544, Level 16, State 1, Line 31
Cannot insert explicit value for identity column in table ‘#temp’ when IDENTITY_INSERT is set to OFF.

Populate the table using delete … output … into … :

delete ident_table
output deleted.id, deleted.val
into #temp(id,val)

On SQL 2012 this succeeds; on 2014 it fails with the error above!

I’ve tried a few different permutations of this test, including using a user database instead of tempdb and combining tables in a user database with tables in tempdb, or temp tables. In all cases the delete … output … into syntax seems to ignore the identity column in 2012 but throws an error in 2014.

There are a number of simple options for avoiding this issue – and I actually think that the 2014 behaviour makes more sense – but this still has me searching through our database code for other uses of the syntax.
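For example, one simple option – assuming you control the creation of the copy – is to strip the identity property when creating the target table. Wrapping the column in a CAST means select … into no longer sees a direct column reference, so the property is not inherited:

-- cast(id as int) is an expression, not a direct column reference,
-- so #temp.id is created without the identity property
select top 0 cast(id as int) as id, val
into #temp
from ident_table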

I’m trying to encourage my colleague to create a Connect item for this – if/when he does, I’ll post the link to it here.


The most recent backup problem

When I first began this blog, back at the beginning of my DBA career, it was purely a place for me to keep queries, things I was learning, processes and handy links. This is one of those posts – there’s nothing new or ground-breaking here, just an elegant, simple solution to a common problem.

The problem: I’m using a simple maintenance plan to take nightly backups of some production databases to a UNC path within our network. I want to create a job that will pick these backups up and restore them onto a test server, either on demand or on a schedule. I’m keeping seven days of backups in the UNC target and I want to restore the most recent backup.

The options:

1. Use the backup history tables in msdb to retrieve the filename and path of the most recent backup (see the sketch after this list). We don’t allow TCP/IP traffic between our production and test networks, so I can’t query the production backup history tables in msdb from the test server.

2. Parse the date from the backup filename and use that. This was definitely a possibility; SQL Server maintenance plans append the date to the filename by default. But that option felt a little clunky and error-prone to me. What if I wanted to take an out-of-band backup manually – and have that picked up?

3. Use the date created property from the file system to choose the youngest backup. This was the solution I wanted.
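For reference, the msdb query behind option 1 would look something like this sketch – run against the production instance, which is exactly what I couldn’t do here:

-- Most recent full backup of the database, from the backup history tables
SELECT TOP 1 bmf.physical_device_name
FROM msdb.dbo.backupset bs
JOIN msdb.dbo.backupmediafamily bmf
	ON bs.media_set_id = bmf.media_set_id
WHERE bs.database_name = N'database' -- placeholder
	AND bs.type = 'D' -- full backups only
ORDER BY bs.backup_finish_date DESC;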

The solution: I looked for a way of finding the most recent backup using DOS commands but I couldn’t find a good single-statement solution. One of the sysadmins at work supplied this PowerShell solution:

(Get-ChildItem -Path [UNC path] | Sort CreationTime -Descending | Select Name -First 1).Name

Calling this from SQL using xp_cmdshell:

exec xp_cmdshell 'powershell.exe -c "(Get-ChildItem -Path [UNC path] | Sort CreationTime -Descending | Select Name -First 1).Name"'

Executing the above command returns a two-row result set: one null row and one with the filename and extension of the youngest file in the given location. From there it was a straightforward task to pipe the output into a table variable, get rid of the null row, and construct the restore statement.

USE [master]

-- kick any active connections off the database before the restore
ALTER DATABASE [database] SET SINGLE_USER WITH ROLLBACK IMMEDIATE; 
ALTER DATABASE [database] SET MULTI_USER; 

DECLARE @files TABLE([file] VARCHAR(255))
DECLARE @filepath VARCHAR(256)
DECLARE @sql NVARCHAR(512)

INSERT @files
EXEC xp_cmdshell 'powershell.exe -c "(Get-ChildItem -Path [UNC path] | Sort CreationTime -Descending | Select Name -First 1).Name"'

DELETE @files WHERE [file] is null

SELECT TOP 1 @filepath = 'UNC path' + [file] FROM @files;

SELECT @sql = 'RESTORE DATABASE [database] 
FROM  DISK = ''' + @filepath + ''' 
WITH  FILE = 1,  
	MOVE N''Database'' TO N''D:\Data\Database.mdf'', 
	MOVE N''Database_log'' TO N''D:\Logs\Database_log.ldf'', 
	NOUNLOAD,  REPLACE,  STATS = 5'

EXEC sys.sp_executesql @sql;

Space issues with SSIS temp files

The page came at a reasonable hour. All of the overnight ETL and processing jobs were just winding up before the business started in earnest for the day. One of the final jobs of the morning failed, a job that called an SSIS package that transferred some email logging to the data warehouse.

I jumped online and saw that one of the teams was in doing an early morning deploy. I scanned the errors from the job, noticed that the failing server was the one the deploy team were working on, and saw this message:

“The column status returned was: “DBSTATUS_UNAVAILABLE”.”

I decided that it was most likely deploy related and I’d rerun the job once the team was finished.

I did – and it failed again. I went back to the error output and, after scanning through a lot of misleading or confusing SSIS output, I found this gem:

   Description: The buffer manager cannot extend the file “C:\Users\SQLSER~1\AppData\Local\Temp\DTB{3AA45D52-FFD2-479F-AB52-0F0F82402850}.tmp” to length 3670016 bytes.  There was insufficient disk space.

I ran the job again and sure enough the temp folder filled up, and the job failed.

I was dealing with a local physical drive here – so expansion was not an option. I needed to relocate the temp files or folders.

My first idea was to edit the environment variables.

  1. Log on to the server using the account that runs SSIS.
  2. Open Control Panel > System, or type sysdm.cpl into a run prompt.
  3. Go to Advanced > Environment Variables.
  4. Edit the TMP and TEMP variables (after making a copy of the original values).
  5. Restart the SSIS service.

I wasn’t keen to restart the service because a couple of jobs were running, so idea number two was to create a symbolic link from the temp folder to another, roomier drive.

  1. Log on to the server.
  2. Rename the current temp folder to temp_old.
  3. Run the following in a command prompt: mklink /D [old temp location] [new temp location]
  4. Copy any contents from the temp_old folder to the new location.
  5. Run the job.
  6. Delete the shortcut created in the old location by the symbolic link.
  7. Rename temp_old back to temp.
  8. Copy back any files.

Luckily the SSIS jobs that were running were not making use of the temp folder so this all worked well.

Interestingly enough, out of three failures (two job runs and a step retry) the disk space error only appeared once. That error then flowed on to cause a whole bunch of other, ultimately misleading, SSIS errors.

I did manage to find a way to blame the whole thing on the deploy team – so all was not lost. It turned out that the changes they had made caused a lot more data than usual to be piped through SSIS – which meant the temp files took up more space 🙂


Another great reason to enable remote admin connections

I recently came across another great reason to enable the remote admin connections server configuration on SQL Server.

If you don’t know what the DAC is, I’m not going to go into detail here – check out these posts here, here and here.

The DAC is a great tool and a light for any DBA in dark places. By default it can only be used from the host that is running the SQL service. To enable remote access to the DAC you need to enable remote admin connections.

exec sp_configure 'remote admin connections',1
go
reconfigure
go

There’s a bunch of compelling reasons to enable this setting, not least of which is that if a server is in such a bad way that you need the DAC, there’s a good chance you either won’t be able to, or won’t want to, access the box over an RDP session.

But the best reason of all: if you have SQL Server clustered instances, the DAC will not work locally! Not from the virtual IP, not from the host node, not using IP addresses and specific ports. No how, no way.

Enabling the remote admin connections setting is dynamic, no service restart is required, and a dynamic TCP port is assigned as soon as the reconfigure is run. If you do have any security or firewall restrictions it is possible to set a static port for the DAC but this requires a registry edit and a service restart.

Edit this key:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.X\MSSQLServer\SuperSocketNetLib\AdminConnection\Tcp

Note: The “X” in “MSSQL.X” is a number that indicates the directory where the instance is installed for SQL Server 2005, or the instance name for SQL Server 2008. (from http://support.microsoft.com/kb/823938/en-us)

[Screenshot: the AdminConnection\Tcp registry key in the registry editor]

And restart the service.
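Either way, you can sanity-check the port the DAC is currently listening on from sys.tcp_endpoints – is_admin_endpoint flags the DAC endpoint:

-- show the DAC endpoint and its current port
select name, port, is_dynamic_port, state_desc
from sys.tcp_endpoints
where is_admin_endpoint = 1;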

Often when you use the DAC you’ll use a single-threaded command line tool like SQLCMD, but it is possible to use Management Studio.

You can’t connect using Object Explorer because that takes more than one connection. Remember the DAC is a single dedicated scheduler – it is only possible to have one DAC connection at a time. But you can open a query window in SSMS, right-click and choose Connect (or Change Connection if the new query window has already connected). Connect to ‘admin:[servername]\[instance name]’.

Management Studio will throw this error:

[Screenshot: the error dialog SSMS throws on a DAC connection]

But you can see at the bottom that the connection has actually worked. I suspect the error is SSMS trying to open another background connection for something.
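If you want more convincing than the status bar, you can check from the query window itself which session is connected over the DAC endpoint:

-- sessions connected over the DAC endpoint
select s.session_id, s.login_name, s.status
from sys.dm_exec_sessions s
join sys.endpoints e
	on s.endpoint_id = e.endpoint_id
where e.name = 'Dedicated Admin Connection';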


Tracking down CPU spikes using Process Explorer and DMVs

What is it about 4:45 pm on a Friday afternoon that makes CPUs spike, drives crash, databases corrupt? I wish I knew but I don’t. I do know that I have been on the receiving end a few times. Just shutting down the last applications when a call or a page comes in. Next minute you’re phoning home and dinner’s in the warmer.

On one such Friday afternoon, in the not too distant past, one of the DBA team noticed that CPU was running hot on our main production server. I could see from looking at task manager on the server that the SQL process was responsible for the CPU usage. But this is a busy OLTP production server and I needed to isolate the process or processes responsible. Looking at sys.sysprocesses it wasn’t immediately obvious what session could be responsible.
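A first pass was something like the sketch below – but the cpu column in sysprocesses is cumulative since connection, so long-running busy sessions drown out whatever is spiking right now:

-- top sessions by cumulative CPU: not conclusive on a busy OLTP box
select spid, status, loginame, hostname, cmd, cpu
from sys.sysprocesses
order by cpu desc;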

The next step was to run Process Explorer on the server to narrow down which CPUs were spiking. If you haven’t already discovered it, Process Explorer is a great free lightweight diagnostic tool from the Sysinternals team. The executable can be downloaded and run without any installation.

http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx

Open Process Explorer and click on one of the small graphs near the top to get a detailed view.

[Screenshot: the Process Explorer system information graphs]

Tick the “Show one graph per CPU” option.

[Screenshot: the “Show one graph per CPU” option]

Hovering over the individual graphs will give the CPU id.

[Screenshot: hovering over an individual graph to get the CPU id]

Note: The screenshots above are not from the actual issue. At the time I was too busy trying to fix the problem to take screenshots for later use 🙂

Using Process Explorer this way I found that CPU usage was jumping up and down, but two CPUs were sitting at 100% consistently. Armed with their ids I hit SQL Server and some DMVs.

Using sys.dm_os_schedulers with the ids of the two rogue CPUs gave me the scheduler addresses.

select scheduler_address from sys.dm_os_schedulers
where cpu_id in 
	([id1],[id2])

Putting the scheduler addresses into sys.dm_os_workers gave me task addresses.

select task_address from sys.dm_os_workers 
where scheduler_address in 
	([scheduler_address_1]
	,[scheduler_address_2])

And finally putting the task addresses into sys.dm_os_tasks gave me the session ids.

select session_id from sys.dm_os_tasks 
where task_address in 
	([task_address_1]
	,[task_address_2])

Or if you want to put it all together you get something like.

select
	s.cpu_id,
	w.[state],
	t.session_id 
from sys.dm_os_schedulers s
left join sys.dm_os_workers w
	on s.scheduler_address = w.scheduler_address
left join sys.dm_os_tasks t
	on w.task_address = t.task_address
where s.cpu_id in ([id1],[id2],...)
and w.[state] = 'RUNNING'

There’s a bunch of other columns that you might be interested in from these DMVs – but in my case I just wanted the session ids.

From there I could go back to sys.sysprocesses and see that in this case it was a system service broker process that was pegging the CPU. Restarting the service broker queue cleared the issue instantly.

ALTER QUEUE [my_queue]
WITH STATUS = OFF;

ALTER QUEUE [my_queue]
WITH STATUS = ON;

The full text error and why I was happy to see it.

The Situation: I was recently involved in the migration of a group of core databases from three SQL Server 2005 instances to three SQL Server 2012 instances. There were quite a few moving parts that needed to be considered as part of the upgrade, including replication, CLR, service broker and full text search.

The basic approach was to build the 2012 instances in a DR site while the main site continued to run on SQL 2005, then log-ship the databases from the 2005 instances to the 2012 instances. This way the upgrade of the databases from 2005 to 2012 is postponed until cut-over, when recovery is run. (As an aside, this is why you can’t use WITH STANDBY on the log restores: standby requires the database to go through recovery, and recovery triggers the upgrade.)
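The cut-over step per database then reduces to restoring the final tail-of-log backup and running recovery – which is when the upgrade happens. A minimal sketch, with placeholder names and paths:

-- final tail-of-log restore, still without recovery (no upgrade yet)
RESTORE LOG [my_database]
FROM DISK = N'\\backup_share\my_database_tail.trn' -- placeholder
WITH NORECOVERY;

-- recovery runs here, and with it the version upgrade steps
RESTORE DATABASE [my_database] WITH RECOVERY;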

The Error: Before cutting over the main SQL instances I was working on a single-instance upgrade of a few smaller databases. Still going from 2005 to 2012, still using a bunch of technologies including full text search, and still using log-shipping to migrate the databases. The morning of the cut-over arrived and I recovered one of the databases…

Converting database ‘my_database’ from version 611 to the current version 706.
Database ‘my_database’ running the upgrade step from version 611 to version 621.
Database ‘my_database’ running the upgrade step from version 621 to version 622.
Database ‘my_database’ running the upgrade step from version 622 to version 625.
Database ‘my_database’ running the upgrade step from version 625 to version 626.
Database ‘my_database’ running the upgrade step from version 626 to version 627.
Database ‘my_database’ running the upgrade step from version 627 to version 628.
Database ‘my_database’ running the upgrade step from version 628 to version 629.
Database ‘my_database’ running the upgrade step from version 629 to version 630.
Database ‘my_database’ running the upgrade step from version 630 to version 631.
Database ‘my_database’ running the upgrade step from version 631 to version 632.
Database ‘my_database’ running the upgrade step from version 632 to version 633.
Database ‘my_database’ running the upgrade step from version 633 to version 634.
Database ‘my_database’ running the upgrade step from version 634 to version 635.
Database ‘my_database’ running the upgrade step from version 635 to version 636.
Database ‘my_database’ running the upgrade step from version 636 to version 638.
Database ‘my_database’ running the upgrade step from version 638 to version 639.
Database ‘my_database’ running the upgrade step from version 639 to version 640.
Database ‘my_database’ running the upgrade step from version 640 to version 641.
Database ‘my_database’ running the upgrade step from version 641 to version 642.
Msg 3013, Level 16, State 1, Line 1
RESTORE DATABASE is terminating abnormally.
Msg 30097, Level 23, State 100, Line 1
The fulltext catalog upgrade failed because of an inconsistency in metadata between sys.master_files and sys.fulltext_catalogs for the catalog ID xx in database ID x.
Try to reattach this database. If this fails, then the catalog will need to be dropped or recreated before attach.

During the course of setting up log-shipping and planning for the migration I had recovered the databases a number of times and had never seen this error before. The database was in a recovery pending state and the only way forward seemed to be to drop the full text catalogs at the source, take a fresh full backup and restore that backup. (Don’t be fooled by the suggestion in the message to reattach the database – I couldn’t find a way to make this work in this situation.)

That approach worked for the smaller database, but it wouldn’t be an option for the main site. There we had several terabytes of database; we couldn’t drop full text, back up, copy and restore within any reasonable outage window.

The Theory: As well as log-shipping from the main site to DR, we also log-shipped into our data warehouse every night. The data warehouse server was also a SQL 2012 box. I was curious about why I hadn’t seen this error there – after all, we recovered the main database onto it every morning. Then one morning the recovery failed and we got the same error.

The theory was that if a change was made to the full text schema, it would break the recovery. I needed to be able to reproduce this reliably – and then solve it – before we attempted to migrate the main site.

The Reproduction: I set up a VM with a SQL 2005 instance and a SQL 2012 instance and log shipped from the 2005 to the 2012 instance. I made a change to one of the full text indexes (dropped a column) and allowed a log backup and restore to happen. Then I tried to recover the 2012 database and I got the error. I’d never been so happy to see an error in my life. I did it again, and got the error again. Now that I could reliably reproduce the error I could test ways of avoiding it.
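The schema change in the repro was nothing exotic – just dropping a column from one of the full text indexes, along these lines (table and column names are placeholders):

-- drop a column from an existing full text index
ALTER FULLTEXT INDEX ON [my_table] DROP ([my_column]);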

The Solution: The first and most obvious solution was to put a change freeze on any full text schema changes. These are fairly rare, so that wasn’t too much of a hardship. The other solution was to drop the full text indexes and catalogs at the source database, allow a log backup to occur (even a final tail-of-log backup), let that backup restore, and then recover the database. Then rebuild the full text catalogs and indexes.

For completeness, dropping and recreating a full text index and catalog:

-- disable and drop the index and catalog
ALTER FULLTEXT INDEX ON [my_table] DISABLE;
GO
DROP FULLTEXT INDEX ON [my_table]
GO
DROP FULLTEXT CATALOG [my_catalog];
GO

-- now recreate the catalog and index
CREATE FULLTEXT CATALOG [my_catalog]
    IN PATH N'C:\some_path' --no effect from SQL2008
    WITH ACCENT_SENSITIVITY = OFF
    AUTHORIZATION [dbo]
GO
 
CREATE FULLTEXT INDEX ON [my_table](
    [my_column], [my_other_column]
    )
KEY INDEX [PK_my_table] ON [my_catalog] -- the full text catalog, not the table
WITH CHANGE_TRACKING AUTO
GO

Migrating SQL Server logins

Once upon a migration I would use sp_help_revlogin to migrate logins and remap the SQL server logins using sp_change_users_login. During some recent migration work I realised that SQL Server Management Studio could do most of the work of sp_help_revlogin. I still needed sp_change_users_login – but I came across a couple of other tips that solved some issues along the way.

Note: The terminology can be a bit ambiguous and confusing here. When I talk about logins I mean all security principals that can be used to access the SQL Server instance, either Windows or SQL principals. When I talk about SQL server logins – I mean logins that use SQL authentication, as opposed to logins that use Windows authentication.

SSMS: To script out all logins from a SQL Server instance, simply select the Logins node under the Security node in Object Explorer and open the Object Explorer Details window ([F7]). This window allows you to shift-click to multi-select logins, then right-click to script them out as desired. One caveat: the script to create the SQL server logins will generate random passwords – these need to be replaced with the correct passwords.

[Screenshots: the Logins node in Object Explorer, and the multi-selected logins in Object Explorer Details]

Mapping SQL logins: Logins based on Windows AD accounts will remap automatically across SQL instances in the same domain because the SIDs are the same – set by Windows, not SQL Server. But a SQL server login is assigned a random SID when it is created. Because the mapping between a login and a database user is based on SIDs, the Windows logins will remap but the SQL server logins will not. The well-known procedure sp_change_users_login can be run in a database context to identify these ‘orphaned’ users and remap them.
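For completeness, the two calls – run in the context of the restored database; ‘Report’ and ‘Auto_Fix’ are the documented actions:

-- list the orphaned users in the current database
EXEC sp_change_users_login 'Report';

-- remap an orphaned user to the SQL login of the same name
EXEC sp_change_users_login 'Auto_Fix', 'my_user'; -- placeholder user name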

Creating SQL logins with specific SIDs: I had a situation recently where I migrated a data warehouse server. Production databases were restored to the server each night, data scrubbed and then set to read-only. I needed a login on the SQL Server to be mapped to a database user so that queries could be run against the warehouse. In this case the login I had was a SQL server login – so when I migrated instances the login SID did not match the SID of the user in the production databases. sp_change_users_login won’t work on a read-only database because the metadata can’t be written. My options were to remap the database users to the login after every restore, but before setting the database to read-only, or to create the login on the server with a SID matching the SID on the production boxes. It turns out that a SQL login can be created with a specific SID. The following query can be run against the source server master database and will produce create scripts for all logins, with explicit SIDs, which can be run as required on the destination server.

SELECT 
'create login [' + p.name + '] ' 
+ case 
	when p.type in('U','G') then 'from windows ' 
	else '' 
end 
+ 'with ' 
+ case 
	when p.type = 'S' then 'password = ' 
		+ master.sys.fn_varbintohexstr(l.password_hash) 
		+ ' hashed, ' 
		+ 'sid = ' 
		+ master.sys.fn_varbintohexstr(l.sid) 
		+ ', check_expiration = ' 
		+ case 
			when l.is_expiration_checked > 0 then 'ON, ' 
			else 'OFF, ' 
		end 
		+ 'check_policy = ' 
		+ case 
			when l.is_policy_checked > 0 then 'ON, ' 
			else 'OFF, ' 
		end 
	else '' 
end 
+ 'default_database = [' 
+ p.default_database_name 
+ case 
	when len(p.default_language_name) > 0 then '], default_language = [' 
		+ p.default_language_name 
		+ ']'
	else ']' 
end
FROM sys.server_principals p
	LEFT JOIN sys.sql_logins l
	ON p.principal_id = l.principal_id
WHERE p.type in('S','U','G')

Forgotten passwords: Another issue I came across in a recent migration was passwords that had fallen through the cracks and been forgotten. After spending some fruitless hours trying to get hashcat to work properly – and failing – I revisited the query above. It turns out I didn’t need to recover the passwords – I could recreate the logins using the password_hash. To convince myself of this I created a login to test with.

USE [master]
GO
CREATE LOGIN [lostpassword] 
WITH PASSWORD=N'I_f0rg0t!'
, DEFAULT_DATABASE=[master]
, CHECK_EXPIRATION=OFF
, CHECK_POLICY=OFF
GO

Run the login script from above, which for [lostpassword] gives:

create login [lostpassword] 
with password = 0x02002c4b16cfbc80f3d07ab3d67042262a87370fc7f0a6300badcddd999ec304c9e69137d5f90b307a230330a25e41ca3e9d3a2664d2782aafe1611608afedfabf67b44ef22e hashed
, sid = 0xed5cd87e54a8bc43b7ff9435f052e2cd
, check_expiration = OFF
, check_policy = OFF
, default_database = [master]
, default_language = [us_english]

Drop the login that I originally created.

DROP LOGIN [lostpassword]

And recreate the login using the generated script.

create login [lostpassword] 
with password = 0x02002c4b16cfbc80f3d07ab3d67042262a87370fc7f0a6300badcddd999ec304c9e69137d5f90b307a230330a25e41ca3e9d3a2664d2782aafe1611608afedfabf67b44ef22e hashed
, sid = 0xed5cd87e54a8bc43b7ff9435f052e2cd
, check_expiration = OFF
, check_policy = OFF
, default_database = [master]
, default_language = [us_english]

And confirm that the recreated login’s password matches the original.

SELECT name FROM sys.sql_logins 
WHERE PWDCOMPARE('I_f0rg0t!', password_hash) = 1 ;

Which it does:

[Screenshot: the query returning the recreated lostpassword login]

Another tool for the migration toolkit.
