Channel: SQLServerCentral » SQL Server 2014 » Development - SQL Server 2014 » Latest topics
Strange performance of simple query: sometimes fast, sometimes slow

I've run into a weird situation with a query that sometimes runs blazing fast (<1ms) but at other times can take around 500ms. For a less frequently executed query this wouldn't be such a problem but, unfortunately, this one's used a lot.

[code]
SELECT [ta0_astd].[ApplicationStatusTrnGuid],
       [ta0_astd].[ApplicationGuid],
       [ta0_astd].[StatusKey],
       [ta0_astd].[IsCurrent],
       [ta0_astd].[UpdatedBy],
       [ta0_astd].[UpdateDate],
       [ta0_astd].[CreatedBy],
       [ta0_astd].[CreatedDate]
FROM [app].[STATUS_TRN] AS ta0_astd
WHERE ta0_astd.[ApplicationGuid] = @ApplicationGuid;
[/code]

The primary key column is ApplicationStatusTrnGuid, and is non-clustered. The table has a clustered index on ApplicationGuid, since this is what we most commonly query against. Both of these columns are UNIQUEIDENTIFIERs. The use of UNIQUEIDENTIFIERs is obviously controversial but, in this case, it is what it is - there are simply too many dependencies in the app layer to switch to INTs.

There are around 1.6 million rows in the table, and the above query can return anywhere between 0 and 25ish rows, so it's quite selective. Execution time does not correspond to the number of rows returned: sometimes the query is slow when no rows (or only a few) are returned. Conversely, it's also often fast when a couple of dozen rows come back.

The table has a number of other indexes which have been added to cover various query scenarios.
Here's the full definition:

[code]
CREATE TABLE [app].[STATUS_TRN](
    [ApplicationStatusTrnGUID] [uniqueidentifier] ROWGUIDCOL NOT NULL DEFAULT (newsequentialid()),
    [ApplicationGUID] [uniqueidentifier] NOT NULL CONSTRAINT [DF__STATUS_TR__Appli__0D44F85C] DEFAULT ('00000000-0000-0000-0000-000000000000'),
    [StatusKey] [int] NOT NULL CONSTRAINT [DF__STATUS_TR__Statu__4737FD87] DEFAULT ((0)),
    [SourceKey] [int] NULL,
    [UpdateDate] [datetimeoffset](4) NULL DEFAULT (sysdatetimeoffset()),
    [AssignedUserKey] [int] NULL,
    [ReasonKey] [int] NULL CONSTRAINT [DF__STATUS_TR__Reaso__10216507] DEFAULT ((0)),
    [ReferRoleKey] [int] NULL,
    [UpdatedBy] [int] NULL,
    [CreatedDate] [datetimeoffset](4) NULL DEFAULT (sysdatetimeoffset()),
    [CreatedBy] [int] NULL,
    [IsCurrent] [bit] NOT NULL CONSTRAINT [DF_STATUS_TR_IsCurrent] DEFAULT ((0)),
    CONSTRAINT [PK_STATUS_TRN] PRIMARY KEY NONCLUSTERED
    (
        [ApplicationStatusTrnGUID] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

CREATE CLUSTERED INDEX [IX_STATUS_TRN_ApplicationGUID] ON [app].[STATUS_TRN]
(
    [ApplicationGUID] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_STATUS_TRN_ApplicationGUID_ApplicationStatusTrnGUID_UpdateDate_StatusKey] ON [app].[STATUS_TRN]
(
    [ApplicationGUID] ASC,
    [ApplicationStatusTrnGUID] ASC,
    [UpdateDate] ASC,
    [StatusKey] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_STATUS_TRN_StatusKey] ON [app].[STATUS_TRN]
(
    [StatusKey] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_STATUS_TRN_StatusKey_ApplicationGUID_ApplicationStatusTrnGUID_UpdateDate] ON [app].[STATUS_TRN]
(
    [StatusKey] ASC,
    [ApplicationGUID] ASC,
    [ApplicationStatusTrnGUID] ASC,
    [UpdateDate] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

ALTER TABLE [app].[STATUS_TRN] ADD CONSTRAINT [FK_STATUS_TRN_APPLICATION_MST] FOREIGN KEY([ApplicationGUID])
REFERENCES [app].[APPLICATION_MST] ([ApplicationGUID])
GO

ALTER TABLE [app].[STATUS_TRN] CHECK CONSTRAINT [FK_STATUS_TRN_APPLICATION_MST]
GO

CREATE NONCLUSTERED INDEX [IX_STATUS_TRN_ApplicationGUID_StatusKey_UpdateDate] ON [app].[STATUS_TRN]
(
    [ApplicationGUID] ASC,
    [StatusKey] ASC,
    [UpdateDate] ASC
) WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_STATUS_TRN_ApplicationGUID_with_UpdateDate] ON [app].[STATUS_TRN]
(
    [ApplicationGUID] ASC
)
INCLUDE ([UpdateDate])
WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_STATUS_TRN_IsCurrent_StatusKey_ApplicationGUID] ON [app].[STATUS_TRN]
(
    [IsCurrent] ASC,
    [StatusKey] ASC,
    [ApplicationGUID] ASC
) WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_STATUS_TRN_ApplicationGUID_IsCurrent_StatusKey] ON [app].[STATUS_TRN]
(
    [ApplicationGUID] ASC,
    [IsCurrent] ASC,
    [StatusKey] ASC
)
INCLUDE ([ApplicationStatusTrnGUID], [UpdateDate])
WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
GO
[/code]

There are 216 rows in the table that have the default ApplicationGUID value of '00000000-0000-0000-0000-000000000000'. Values of ApplicationGUID are assigned on the app.APPLICATION_MST table using newsequentialid(). Applications in this table can change status through a workflow until they reach one of several final states (Live, Cancelled, etc.).
Older applications don't tend to change status much because they're likely to already be in a final state.

The database has RCSI enabled, and is currently relatively quiet since it's only being used for testing. Operations on the table are mostly reads, along with regular INSERTs. Rows are never UPDATEd.

None of the indexes are fragmented. Here are the stats for them:

[code]
Schema  Table       Index                                                                        Avg. Frag %         Page Count
===============================================================================================================================
app     STATUS_TRN  IX_STATUS_TRN_ApplicationGUID                                                0.0267451190157796  18695
app     STATUS_TRN  PK_STATUS_TRN                                                                0.0342504852152072  8759
app     STATUS_TRN  IX_STATUS_TRN_ApplicationGUID_ApplicationStatusTrnGUID_UpdateDate_StatusKey  0.0443616360571378  11271
app     STATUS_TRN  IX_STATUS_TRN_ApplicationGUID_IsCurrent_StatusKey                            0.0436262106273449  11461
app     STATUS_TRN  IX_STATUS_TRN_ApplicationGUID_StatusKey_UpdateDate                           0.0611246943765281  8180
app     STATUS_TRN  IX_STATUS_TRN_ApplicationGUID_with_UpdateDate                                0.0540248514316586  7404
app     STATUS_TRN  IX_STATUS_TRN_IsCurrent_StatusKey_ApplicationGUID                            0.541842263696568   6644
app     STATUS_TRN  IX_STATUS_TRN_StatusKey                                                      0.419189566837448   6441
app     STATUS_TRN  IX_STATUS_TRN_StatusKey_ApplicationGUID_ApplicationStatusTrnGUID_UpdateDate  0.221906621693591   11266
[/code]

Query performance, as viewed from the client Web API application, typically looks as shown in [b]IntermittentSlowAppStatusQuery1.PNG[/b] and [b]IntermittentSlowAppStatusQuery2.PNG[/b] (attached). Note that this data was collected with ANTS Performance Profiler, so queries are shown in order of decreasing execution time, not order of execution.

[img]http://www.sqlservercentral.com/Forums/Attachment18351.aspx[/img]
[img]http://www.sqlservercentral.com/Forums/Attachment18352.aspx[/img]

You can see that roughly half the queries are very fast, whilst half are slow. The query is exactly the same every time, with only the value of the parameter varying.
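One way to check whether the slow executions are actually slow inside SQL Server is to compare the client-side timings with what the plan cache has recorded for the statement. The following is only a sketch, not something I've confirmed against this system: it assumes VIEW SERVER STATE permission, that the plan is still in cache, and that the LIKE pattern (my own guess based on the query text above) matches the statement Dapper sends.

```sql
-- Sketch: server-side timings for the STATUS_TRN lookup, taken from the plan cache.
-- Elapsed and worker times in sys.dm_exec_query_stats are reported in microseconds.
SELECT TOP (20)
       qs.execution_count,
       qs.min_elapsed_time / 1000.0                          AS min_elapsed_ms,
       qs.max_elapsed_time / 1000.0                          AS max_elapsed_ms,
       qs.total_elapsed_time / 1000.0 / qs.execution_count   AS avg_elapsed_ms,
       qs.max_worker_time / 1000.0                           AS max_cpu_ms,
       qs.max_logical_reads,
       qs.max_physical_reads,
       st.text                                               AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
-- '[[]' matches a literal '[' in a LIKE pattern; the pattern itself is an assumption.
WHERE st.text LIKE N'%[[]app].[[]STATUS_TRN]%ta0_astd%'
  AND st.text NOT LIKE N'%dm_exec_query_stats%'   -- exclude this diagnostic query
ORDER BY qs.max_elapsed_time DESC;
```

If max_elapsed_ms stays in the low single digits while the client sees around 500ms, that would point away from query execution itself and towards something between the server and the application. If max_elapsed_ms is itself in the hundreds, the server side deserves a closer look first.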
The query is generated via the [url=https://github.com/Paymentsense/Dapper.SimpleLoad]Dapper.SimpleLoad[/url] library, and executed using the [url=https://github.com/StackExchange/dapper-dot-net]Dapper microORM[/url].

The execution plan for the worst performing instance, which doesn't appear to show anything amiss, is shown in [b]IntermittentSlowAppStatusQueryExecutionPlan.PNG[/b] (also attached).

[img]http://www.sqlservercentral.com/Forums/Attachment18353.aspx[/img]

When this query is run over all values of ApplicationGUID using SQL Server Management Studio directly on the SQL Server box (i.e., RDPing into the SQL Server box and running SSMS actually on the box) the observed behaviour differs. The query is still intermittently slow, around 20-25% of the time, but the effect is much less pronounced. Generally it runs in <1ms, but when it's slow, it takes 30-170ms.

The fact that it runs slow less often and, moreover, is less slow when run on the box suggests that the observed problem in the web app may in fact have nothing to do with SQL Server or the web app, and could be an infrastructure issue. I'm particularly suspicious about a networking issue.

Here's an overview of the infrastructure...

SQL Server 2014 is running on bare metal: dual-socket, hex-core Xeon E5-2643 v3s clocked at 3.4GHz, with 128GB of RAM, of which 100GB is allocated to SQL Server.
Storage-wise we have 15k SAS internal storage divided into 4 volumes:

[b]OS[/b] - RAID 1, 275GB usable, 208GB free
[b]DATA1[/b] - RAID 10, 1.08TB usable, 917GB free
[b]LOGS[/b] - RAID 10, 1.08TB usable, 1.04TB free
[b]TEMPDB[/b] - RAID 10, 1.08TB usable, 1.02TB free

Data, logs, and tempdb are stored as the volume names indicate.

Separately we also have a SATA array in a single volume:

[b]SATA[/b] - RAID 6, 7.27TB usable, 3.20GB free - used for FILESTREAM files (customer documents), and backups

Storage connectivity is 10 gigabit iSCSI.

The client is a .NET Web API application running in IIS 8.5 on Windows Server 2012 R2 Datacenter Edition, backed by SQL Server 2014 Standard Edition.

The web servers comprise two VMs with 4 virtual cores running on VMware ESXi 6 with 12GB RAM allocated, also running Windows Server 2012 R2 Datacenter. The physical processors are 10-core Xeons (E5-2660 v2) running at 2.60GHz. They are fronted by a load balancer (I'm not sure what the spec is).

Both the web server hosts and the SQL Server box are installed in the same datacentre, in the same physical rack, with gigabit connectivity between them. They are, however, in separate firewalled VLANs.

I also plan to run SQL Profiler on the box to see if anything obvious comes up there when this query runs. I'm not expecting to see anything particularly interesting, but it pays to make sure.

Can anyone see anything obvious that might be causing this issue? Is it likely to be something to do with the VLAN or firewall configuration, or am I just clutching at straws with that?

Any help gratefully received!

Thanks,
Bart
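
As a lighter-weight alternative to the Profiler run mentioned above, an Extended Events session could capture only the slow calls. This is a rough sketch under assumptions: the session name, the 20ms threshold, and the output file name are all hypothetical, and it assumes Dapper's parameterised calls arrive as RPC events, which is why it listens for rpc_completed.

```sql
-- Sketch: capture calls touching STATUS_TRN that take longer than 20ms on the server.
-- Session name, threshold and file name are illustrative assumptions.
CREATE EVENT SESSION [StatusTrnSlowCalls] ON SERVER
ADD EVENT sqlserver.rpc_completed (
    ACTION (sqlserver.client_hostname, sqlserver.client_app_name)
    WHERE ([duration] > 20000   -- duration is in microseconds for this event
           AND [sqlserver].[like_i_sql_unicode_string]([statement], N'%STATUS_TRN%'))
)
ADD TARGET package0.event_file (
    SET filename = N'StatusTrnSlowCalls.xel', max_file_size = (50)  -- size in MB
)
WITH (MAX_DISPATCH_LATENCY = 5 SECONDS, STARTUP_STATE = OFF);
GO

-- Start capturing; stop with STATE = STOP and drop the session when finished.
ALTER EVENT SESSION [StatusTrnSlowCalls] ON SERVER STATE = START;
GO
```

If the web app reports a slow call and nothing over the threshold shows up in the session, the delay is presumably being added outside query execution on the server, which would fit the network or firewall theory.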
