New Perfmon Counters With TFS 11 Beta
“Can configure and interpret Perfmon counters”. It’s not one of the top three skills you’ll put or see on someone’s resume; however, it’s a good skill to have. I am the TFS Product Manager for Accenture’s CIO group in the Enterprise Workforce and haven’t seen Perfmon listed on a resume to date. Since I have experience using the counters, it would make for an interesting interview discussion.
In addition to my role I am also in the VS ALM Rangers program and have had the opportunity of working on the TFS Upgrade Guidance and TFS Planning Guidance. In the Planning Guidance I included a section on the Perfmon counters that are going to be in the next release of TFS this year.
Before I go much further I’d like to thank Erin Rakickas from Accenture, Grant Holliday from Microsoft, Prasanna Ramkumar from Magenic and Willy Schaub from Microsoft for their review and input on this post’s content. Initially it was looking like a dump from a help file and with their input I think it turned out much better.
I’ve used Team Foundation Server since early 2006 when I first installed it for a client. Since then I’ve worked with TFS quite a bit and have been asked to explain how TFS is performing. This has led me to look at the Perfmon counters that are available and how they may give clues into the system’s performance. In addition to the TFS Perfmon counters I’ve used SQL Server counters to get the overall view of what is going on:
- Is there a correlation between a high Current Disk Queue Length, which indicates slow disk performance, and slow query performance?
- Is the number of user connections indicating that connections are being left open and SQL is being overloaded?
I’m a numbers-collection type of person and like digging into the details to try to determine whether there’s a pattern that occurred during, or led up to, a performance slowdown.
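The kind of correlation check described in the questions above can be sketched in a few lines of Python. This is only an illustration: the sample values are made up, and it assumes you have already exported two counters (here a disk queue length and a response time) sampled at the same timestamps.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length (at least 2 samples)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series tells us nothing about correlation
    return cov / sqrt(var_x * var_y)

# Hypothetical samples, taken at the same timestamps
disk_queue_length = [0.5, 0.7, 1.2, 4.8, 6.1, 5.5, 1.0, 0.6]
avg_response_time = [0.2, 0.2, 0.3, 1.9, 2.4, 2.2, 0.4, 0.2]

r = pearson(disk_queue_length, avg_response_time)
print(f"correlation = {r:.2f}")
```

A value close to 1 suggests the two counters rise and fall together. It does not prove that one causes the other, so treat it as a pointer for where to dig next.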
Whether you are a small company running TFS or a large enterprise, there may come a time when the performance of your TFS system comes into question and you need to determine what to look at when “TFS seems to be running slow”. You should proactively establish a baseline of performance for your environment, and Perfmon counters can help. Having this baseline is extremely important for understanding what has changed between the state when the application was running properly and the state when it is not. I can attest to this from load testing a financial system a few years ago before it went live. We started capturing Perfmon counters prior to processing transactions and then ramped up the users and transactions created. We were able to identify and address memory and SAN issues before the system went live. That is much better than doing this in a production environment and affecting real users.
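As a rough sketch of what comparing against a baseline can look like, the Python below summarises healthy-period samples per counter and flags current values that sit far from them. The sample numbers are invented, and the three-standard-deviation threshold is my own assumption rather than a recommendation from the TFS team; pick a threshold that suits your environment.

```python
from statistics import mean, stdev

def build_baseline(samples):
    """Summarise healthy-period samples per counter as (mean, stdev)."""
    return {name: (mean(values), stdev(values)) for name, values in samples.items()}

def flag_deviations(baseline, current, threshold=3.0):
    """Return counters whose current value is more than `threshold`
    standard deviations away from the healthy-period mean."""
    flagged = {}
    for name, value in current.items():
        if name not in baseline:
            continue  # no baseline recorded for this counter
        avg, sd = baseline[name]
        if sd == 0:
            # Counter never moved during the baseline; any change is notable
            if value != avg:
                flagged[name] = float("inf")
            continue
        z = (value - avg) / sd
        if abs(z) > threshold:
            flagged[name] = z
    return flagged

# Hypothetical samples captured while the system was running smoothly
healthy = {
    "Average Response Time": [0.21, 0.19, 0.22, 0.20, 0.18],
    "Current Requests/Sec": [40, 42, 38, 41, 39],
}
# Hypothetical values captured while users report slowness
current = {"Average Response Time": 0.95, "Current Requests/Sec": 41}

flagged = flag_deviations(build_baseline(healthy), current)
for name, z in flagged.items():
    print(f"{name}: {z:.1f} standard deviations from baseline")
```

In this made-up case only the response time is flagged, which would point away from request volume as the cause.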
With the upcoming release of TFS there will be additional counters to assist in identifying what may be affecting TFS. My intention with this post is to raise your awareness of what options you have, not to guide you through determining which value indicates a performance issue. That task is left to you: measure what the values were when the system was running smoothly against what they are when it is not. If you need some help in setting up counter collection you can take a look at Grant Holliday’s post on Querying Perfmon data from SQL. He has another good post on Large TFS Performance characteristics. They are a couple of years old; however, they are still very relevant.
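If you log counters to a CSV file instead of SQL, a small script can turn the log into per-counter series for the comparison against your baseline. The sketch below assumes the usual Perfmon CSV layout (first column is the timestamp, every other column header is a counter path). The machine name TFSAT01, the counter paths and the values are made up for illustration.

```python
import csv
import io

def read_perfmon_csv(handle):
    """Return {counter_path: [float, ...]} from a Perfmon-style CSV log."""
    reader = csv.reader(handle)
    header = next(reader)
    counters = header[1:]  # column 0 holds the timestamp
    series = {path: [] for path in counters}
    for row in reader:
        for path, cell in zip(counters, row[1:]):
            cell = cell.strip()
            if cell:  # blank cells mean no sample was collected
                series[path].append(float(cell))
    return series

# Hypothetical log content; in practice pass an open file instead
sample_log = r'''"(PDH-CSV 4.0) (Pacific Standard Time)(480)","\\TFSAT01\TFS Services\Current Requests/Sec","\\TFSAT01\TFS Services\Average Response Time"
"04/02/2012 09:00:00.000","41.5","0.20"
"04/02/2012 09:00:15.000"," ","0.22"
"04/02/2012 09:00:30.000","39.0","0.19"
'''

series = read_perfmon_csv(io.StringIO(sample_log))
for path, values in series.items():
    print(path, values)
```

Skipping blank cells keeps a missed sample from being counted as a zero, which would otherwise drag averages down.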
TFS Perfmon counters have been around since TFS 2005. To give you an idea of how they have changed between versions I am using the TFS Services counters as an example. An X in a column indicates the counter is available in that version.
Counter Name | 2005 | 2008 | 2010 | TFS 11 Beta |
Current Events In Process | X | X | _ | _ |
Current Notifications Queued | X | X | _ | _ |
Current Events/Sec | X | X | _ | _ |
Current Link Queries/Sec | X | X | _ | _ |
Current Registration Queries/Sec | X | X | _ | _ |
Total Number of Failed retry sequences | _ | _ | _ | X |
Total Number of Throttling Events | _ | _ | _ | X |
Total Number of SQL Batches | _ | _ | _ | X |
Current SQL Executions/Sec | _ | _ | _ | X |
Current SQL Notification Queries/Sec | _ | _ | _ | X |
Current Task Executed/Sec | _ | _ | _ | X |
Active Team Project Collection Service Hosts | _ | _ | _ | X |
Active Application Service Hosts | _ | _ | _ | X |
Active Deployment Service Hosts | _ | _ | _ | X |
Current SQL Execution Retries/Sec | _ | _ | _ | X |
Current SQL Connection Retries/Sec | _ | _ | _ | X |
Current SQL Connection Failures/Sec | _ | _ | _ | X |
Average SQL Connect Time | _ | _ | _ | X |
Active SQL Connections | _ | _ | _ | X |
Average Response Time | _ | _ | X | X |
Current Requests/Sec | _ | _ | X | X |
Current Server Requests | _ | _ | X | X |
As the application has evolved so has the use of counters to assist in measuring system performance.
All of the information I am presenting below is from the TFS 11 Beta version. I hope it raises your awareness of what is going to be available and that, when you implement the next version of TFS, you take my advice and baseline the system.
TFS Proxy Server Services
The purpose of the TFS Proxy Server counters is to track the size of the cache and the downloads that are occurring. Two new Perfmon counters have been added with TFS 11 Beta. They tell you the number of files being downloaded and the rate at which they are being downloaded. If you established a baseline when the Proxy Server was performing well, you should be able to compare the download values to see if there’s a noticeable difference in the download rates. You can also find out the size of the files being downloaded and who is doing the downloads by reviewing the IIS log detail on the TFS Proxy Server.
* = New in TFS 11 Beta
Counter Name | Description |
* Current File Downloads/Sec | Current File Downloads/Sec is the rate that files are being downloaded from the proxy service |
* Current File Downloads | Current File Downloads indicates the number of files currently being downloaded from the proxy service |
Total Files in Cache | The total number of files available in the cache |
Total Cache Hits | Total number of download requests served from the file cache |
Total Download Requests | The total number of download requests that comes to the file cache |
Current Cache Size(Bytes) | Current Cache Size in Bytes |
Not every TFS installation will make use of the TFS Proxy Server. A proxy pays off at a remote location where many developers work on the same project and the connection to the TFS Server does not have much bandwidth; in that case you place the Proxy Server in the remote location. Putting your TFS Proxy Server on your TFS Server won’t help a remote location. The TFS Proxy Server only caches retrieved code. Authorization checks and check-ins still go to the TFS Server itself, so network latency will still have an impact on TFS connectivity.
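One number the proxy counters let you derive is a cache hit ratio: Total Cache Hits divided by Total Download Requests. Both counters are cumulative, so the sketch below takes two snapshots and works on the difference between them. The snapshot values are invented, and the ratio itself is my own derived figure, not a counter TFS exposes.

```python
def cache_hit_ratio(start, end):
    """Hit ratio over an interval, from two snapshots of the cumulative counters."""
    requests = end["Total Download Requests"] - start["Total Download Requests"]
    hits = end["Total Cache Hits"] - start["Total Cache Hits"]
    if requests <= 0:
        return None  # no traffic in the interval, or the counters were reset
    return hits / requests

# Hypothetical snapshots taken an hour apart
snapshot_start = {"Total Cache Hits": 12000, "Total Download Requests": 15000}
snapshot_end = {"Total Cache Hits": 12900, "Total Download Requests": 16000}

ratio = cache_hit_ratio(snapshot_start, snapshot_end)
if ratio is None:
    print("no download requests in this interval")
else:
    print(f"cache hit ratio: {ratio:.0%}")
```

A ratio that drops well below what you saw in your baseline would suggest the remote developers are pulling files the proxy has not cached yet, or that the cache is being trimmed too aggressively.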
TFS Services
The Perfmon counters for TFS Services had a huge increase in what is captured. With TFS 2010 there were only three counters and now there are 17. SQL Server and Service Host counters have made their appearance, which is due to the use of TFS in the cloud.
The SQL counters can be measured against the connections you may track in SQL’s sys.sysprocesses for open connections. Knowing if there are any problems occurring with SQL Server connectivity should help in determining the root cause of problems.
If your TFS Server has many TPCs on it then the Active Team Project Collection Service Hosts counter will have a correlation to the amount of memory in use. This counter indicates TPCs that are loaded (in use) so a decrease in the number of TPCs should show a corresponding decrease in memory in use.
Seeing a high rate of retries can be an indicator that a latency problem is causing timeouts, which in turn trigger the retries. This is where you need to get the network team involved to monitor connectivity between servers and clients.
These new counters will help in determining if there are database connection issues.
* = New in TFS 11 Beta
Counter Name | Description |
* Total Number of Failed retry sequences | Number of failed retry sequences |
* Total Number of Throttling Events | Number of Throttling events with SQL |
* Total Number of SQL Batches | Number of interactions with SQL (number of batches) |
* Current SQL Executions/Sec | Current SQL Executions is the rate at which SQL Queries are being performed |
* Current SQL Notification Queries/Sec | Current SQL Notification Queries is the rate at which Queries for SQL notifications are being performed |
* Current Task Executed/Sec | Current Task Executed is the rate at which tasks are being executed |
* Active Team Project Collection Service Hosts | Number of active team project collection service hosts |
* Active Application Service Hosts | Number of active application service hosts |
* Active Deployment Service Hosts | Number of active deployment service hosts |
* Current SQL Execution Retries/Sec | Current SQL execution retries is the rate at which SQL command execution is being retried |
* Current SQL Connection Retries/Sec | Current SQL connection retries is the rate at which SQL connection retries are being attempted |
* Current SQL Connection Failures/Sec | Current SQL connection failures is the rate at which SQL connection attempts are failing |
* Average SQL Connect Time | Average time needed to open a new SQL connection |
* Active SQL Connections | Number of SQL connections in any state used by TFS services |
Average Response Time | Average Response Time is the time, on average, that the server took to process a single request |
Current Requests/Sec | Current Requests/Sec is the rate at which requests are being processed by the server |
Current Server Requests | Current Server Requests indicates the number of currently active requests being processed by the server |
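To judge whether a retry rate is high, it can help to put it in proportion to the work being done. The sketch below divides Current SQL Execution Retries/Sec by Current SQL Executions/Sec for each sample. The sample values are made up, and the ratio is my own derived figure rather than something TFS reports.

```python
def retry_ratios(executions_per_sec, retries_per_sec):
    """Retries per SQL execution for each sampling interval."""
    return [
        (retries / executions) if executions > 0 else 0.0
        for executions, retries in zip(executions_per_sec, retries_per_sec)
    ]

# Hypothetical samples of the two counters, taken at the same timestamps
executions = [200.0, 220.0, 210.0, 205.0]
retries = [0.0, 1.1, 21.0, 41.0]

ratios = retry_ratios(executions, retries)
for sample, ratio in enumerate(ratios):
    print(f"sample {sample}: {ratio:.1%} of executions retried")
```

A ratio that climbs while the execution rate stays flat, as in this invented data, points at connectivity rather than at load.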
TFS Version Control
The Perfmon counters for TFS Version Control have actually seen a decrease in the number available, going from 11 in TFS 2010 to seven in TFS 11 Beta. The Cache counters have been removed. The Version Control counters that are left will help you identify whether there is a correlation between the number of files being downloaded and observed slowness in TFS.
Counter Name | Description |
Average Response Time | Average Response Time is the time, on average, that the Version Control service took to process a single request |
Current Requests/Sec | Current Requests/Sec is the rate at which requests are being processed by the Version Control Service |
Current Server Requests | Current Server Requests indicates the number of currently active requests being processed by the Version Control service |
Current File Downloads/Sec | Current File Downloads/Sec is the rate that files are being downloaded from the Version Control service |
Current File Downloads | Current File Downloads indicates the number of files currently being downloaded from the Version Control service |
Current File Uploads/Sec | Current File Uploads/Sec is the rate that files are being uploaded to the Version Control service |
Current File Uploads | Current File Uploads indicates the number of files currently being uploaded to the Version Control service |
TFS Work Item Tracking
With TFS 11 Beta there were three new counters added to the ones for tracking work item queries. The counters that were already there should be helpful in identifying if work item queries were impacting TFS performance.
When looking at these counters I would compare the total number of requests being processed against a baseline. Is the number of requests much higher than normal and putting a load on the application tier? That would be an indicator that you may need to add processors and/or memory. If the queries being run are a spike and not the norm, it may be something you need to discuss with the end users so you can plan appropriately.
* = New in TFS 11 Beta
Counter Name | Description |
* Latency Window Starts/Sec | Latency Window Starts/Sec is the rate at which server responses initiate latency time window per user – used if replication is enabled |
* Write Access Elevations/Sec | Write Access Elevations/Sec is the rate at which requests are elevated to Write access |
* ReadLatest Access Elevations/Sec | ReadLatest Access Elevations/Sec is the rate at which requests are elevated to ReadLatest access |
Active GetQueryAccessControlList Requests | The number of access control list query requests currently executing |
Active GetStoredQueries Requests | The number of stored queries requests currently executing |
Active GetStoredQuery Requests | The number of stored query requests currently executing |
Active GetMetadata Requests | The number of cache updates currently executing |
Active Update Requests | The number of updates currently executing |
Active GetWorkitem Requests | The number of work Item requests currently executing |
Active Paging Requests | The number of paging requests currently executing |
Active Query Requests | The number of queries currently executing |
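The difference between a one-off spike and a sustained increase in work item requests can be sketched as follows. The 1.5x factor and the 50% cut-off are assumptions I picked for illustration, and the samples are invented; tune both against your own baseline.

```python
def classify_load(samples, baseline_mean, factor=1.5, sustained_fraction=0.5):
    """Label a series as 'normal', 'spike' or 'sustained' relative to a baseline.

    A sample counts as high when it exceeds baseline_mean * factor. The load
    is 'sustained' when at least sustained_fraction of the samples are high.
    """
    if not samples:
        raise ValueError("need at least one sample")
    high = [s for s in samples if s > baseline_mean * factor]
    if not high:
        return "normal"
    if len(high) / len(samples) >= sustained_fraction:
        return "sustained"
    return "spike"

# Hypothetical Active Query Requests samples against a baseline mean of 40
one_off = [40, 41, 120, 39, 42, 40]
busy_all_day = [95, 110, 102, 98, 40, 105]

print(classify_load(one_off, baseline_mean=40))
print(classify_load(busy_all_day, baseline_mean=40))
```

A spike is the conversation-with-the-users case, while a sustained result is the one that would make me look at adding capacity to the application tier.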
TFS Lab Management
The Lab Management Perfmon counters didn’t change going from TFS 2010 to TFS 11 Beta. I haven’t used Lab Management in any of my TFS experiences. I would be interested to hear how you may have used these counters.
Counter Name | Description |
Current Requests | Number of Lab Management requests currently active inside server |
Requests/Sec | Number of Lab Management requests processed per second |
Current Lab Environment Creations | Number of lab environments Team Foundation Server is creating on this server |
Current Operations | Number of lab environment or template operations (create, start, stop, snapshot, delete, and so forth) in progress on this server |
Total Operations | Number of lab environment or template operations (create, start, stop, checkpoint, delete, and so forth) completed since the last reboot. This number also includes unsuccessful operations |
Total Lab Environment Creations | Number of lab environments created since the last reboot |
Total Lab Environment Creation Failures Due To Lack Of Resources | Number of unsuccessful lab environment creations because of a lack of resources since the last reboot |
Powershell Cmdlets/Sec | Number of PowerShell cmdlets executed per second |
Runspaces Created | Number of runspaces created |
Kvp Data Cache : # entries | Number of entries in Kvp data cache |
Kvp Data Cache : hit ratio | Hit Ratio of Kvp cache |
Kvp Data Cache : hit ratio base counter | Hit Ratio Base counter of Kvp cache |
Kvp Data Cache : # trims | Number of entries removed either invalid or to make space for others in Kvp cache |
VMMS Cache : # entries | Number of entries in the SCVMM cache |
VMMS Cache : hit ratio | Hit ratio of the SCVMM cache |
VMMS Cache : hit ratio base counter | Hit ratio base counter of the SCVMM cache |
VMMS Cache : # trims | Number of entries removed either they were not valid or to allocate space for other entries in the SCVMM cache |
CS Cache : # entries | Number of entries in ‘computer system objects’ cache |
CS Cache : hit ratio | Hit Ratio of ‘computer system objects’ cache |
CS Cache : hit ratio base counter | Hit Ratio Base counter of ‘computer system objects’ cache |
CS Cache : # trims | Number of entries removed either invalid or to make space for others in ‘computer system objects’ cache |
VMM Object Cache : VM & Template : # entries | Number of entries in the VM and Template cache |
VMM Object Cache : Location : # entries | Number of entries in the Host, Host Group, and Library Share caches |
VMM Object Cache : Snapshot : # entries | Number of entries in Snapshot cache |
VMM Object Cache : Task : # entries | Number of entries in Task cache |
VMM Object Cache : Profile : # entries | Number of entries in the Hardware profile, OS profile, and User role profile caches |
Since this information was based off of the TFS 11 Beta release I will update it as the RC and RTM versions come out with changes.
What’s Next?
As Erin stressed in his feedback to me, “Baselining your system during healthy periods is very important”, and I completely agree. With the new Perfmon counters that will be available, you should plan how to incorporate them into monitoring your TFS environment. If you have an existing environment you should get a head start and start looking at performance 🙂