Doctors in training employment data processing

Purpose

Combine data from Turas People and SWISS to report accurate placement data for medical doctors in training (DiT) in the NHSScotland Workforce official statistics publication.

Process

The two extracts from Turas People and SWISS are merged each month within NES' Workforce Data Warehouse.

The first part of this process is to apply a series of checks to the Turas People data. Records or values which fail to meet certain criteria are identified and removed. For example, records with a missing board of placement are removed, as is any record with a staff age of less than 20 or greater than 75. Further checks are then carried out to identify any system-level data issues, for example by counting missing or duplicate values in certain fields.
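The record-level and system-level checks described above could be sketched as follows. This is only an illustration, not the code used in the Workforce Data Warehouse: the field names (`board_of_placement`, `age`, `pay_number`) are hypothetical, and treating a missing age as a failed check is an assumption, since the text only gives the age bounds.

```python
from collections import Counter

MIN_AGE, MAX_AGE = 20, 75  # age bounds stated in the text


def passes_checks(record):
    """Return True if the record has a board of placement and an age in range."""
    if not record.get("board_of_placement"):
        return False  # missing board of placement: record is removed
    age = record.get("age")
    if age is None:
        return False  # assumption: a missing age also fails the check
    return MIN_AGE <= age <= MAX_AGE


def field_summary(records, field):
    """Count missing values, and distinct values that occur more than once."""
    values = [r.get(field) for r in records]
    present = [v for v in values if v not in (None, "")]
    repeated = [v for v, n in Counter(present).items() if n > 1]
    return {
        "missing": len(values) - len(present),
        "duplicated_values": len(repeated),
    }
```

A typical use would be to filter first with `[r for r in records if passes_checks(r)]` and then call `field_summary` on the fields of interest, so that the summary describes the data that go forward to matching.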

The next part of this process is to merge the Turas People data with SWISS. There is no complete, unique identifier across these data sets. The data are therefore joined based on combinations of fields, tried in the following order:

  1. Pay Number is the most reliable method for matching. If Pay Number is complete and matches in both data sets for the current census, this is used to join both data sets.
  2. If a Pay Number is missing from the Turas extract, previous Turas extracts are checked to see whether a Pay Number was previously recorded. If it was, then this Pay Number is used to join both data sets.
  3. If no previous Pay Number can be found in Turas People, then a combination of National Insurance Number, Date of Birth and Surname in both extracts for the current census is used to join the data sets. This is less reliable because the National Insurance Number could be a temporary number (which is a concatenation of date of birth and sex), and Surname is subject to spelling errors or changes across systems.
  4. The final option is to join the data sets using a combination of Date of Birth and Surname for the current census.
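The four-step cascade above can be sketched as a function that tries each key in turn and stops at the first hit. This is a minimal illustration under several assumptions: all field names and the `turas_id` key used to look up earlier extracts are hypothetical, a key is only accepted when it identifies exactly one SWISS record, and a record whose Pay Number finds no match is allowed to fall through to the later steps (the text does not say how either case is handled).

```python
# Fields making up each join key (names are hypothetical).
MATCH_KEYS = {
    "pay_number": ("pay_number",),
    "ni_dob_surname": ("ni_number", "date_of_birth", "surname"),
    "dob_surname": ("date_of_birth", "surname"),
}


def build_indexes(swiss_records):
    """Index SWISS records by each join key, skipping incomplete keys."""
    indexes = {}
    for method, fields in MATCH_KEYS.items():
        index = {}
        for rec in swiss_records:
            key = tuple(rec.get(f) for f in fields)
            if all(key):
                index.setdefault(key, []).append(rec)
        indexes[method] = index
    return indexes


def unique_hit(index, key):
    """Return the single record for a complete key, else None."""
    if not all(key):
        return None
    hits = index.get(key, [])
    return hits[0] if len(hits) == 1 else None


def match_record(turas, indexes, previous_pay_numbers):
    """Return (swiss_record, method), or (None, None) if every step fails."""
    # Steps 1 and 2: current Pay Number, else one from an earlier extract.
    pay_number = turas.get("pay_number")
    method = "pay_number"
    if not pay_number:
        pay_number = previous_pay_numbers.get(turas.get("turas_id"))
        method = "previous_pay_number"
    if pay_number:
        hit = unique_hit(indexes["pay_number"], (pay_number,))
        if hit is not None:
            return hit, method
    # Steps 3 and 4: progressively weaker combinations of fields.
    for method in ("ni_dob_surname", "dob_surname"):
        key = tuple(turas.get(f) for f in MATCH_KEYS[method])
        hit = unique_hit(indexes[method], key)
        if hit is not None:
            return hit, method
    return None, None
```

Returning the method alongside the matched record makes it straightforward to report how many records were joined at each step, which is useful given that the later steps are less reliable.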

We cannot merge approximately 150 records (2%) in Turas People with SWISS because they fail all attempted matches. There are several reasons why this might happen, for example, missing data, data entry errors in either data set (such as different spellings of names), or the systems being updated at different times.

The outputs of the checking and matching are reviewed by a data analyst.

For records that are matched, data taken from Turas People include Date of Birth, Sex, Registration Body, Specialty, Medical Grade, Group Code, Main Cost Centre, Employer Code, Base Location and Start Date.

Records in Turas People that remain unmatched are discarded since we assume that people employed in NHSScotland will be in SWISS as this is linked to payroll. If we include unmatched Turas People records, then we risk double-counting people.

Medical DiT records in SWISS that remain unmatched are used, but may report the wrong board of placement or specialty.
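The treatment of matched and unmatched records can be sketched as below. The output is built from SWISS, so unmatched Turas People records never enter it, while unmatched SWISS records pass through unchanged. This is an illustration only: the field names, the `matched` flag, and the shape of `matches` (a mapping from a SWISS record's position to its Turas People record) are all assumptions.

```python
# Fields taken from Turas People for matched records (names are hypothetical).
TURAS_FIELDS = [
    "date_of_birth", "sex", "registration_body", "specialty",
    "medical_grade", "group_code", "main_cost_centre",
    "employer_code", "base_location", "start_date",
]


def assemble_output(swiss_records, matches):
    """Build one output row per SWISS record.

    `matches` maps the position of a SWISS record to its Turas People record.
    Matched rows take the listed fields from Turas People; unmatched SWISS
    rows are kept as they are and flagged.
    """
    output = []
    for position, swiss in enumerate(swiss_records):
        row = dict(swiss)  # copy, so the SWISS extract is not modified
        turas = matches.get(position)
        if turas is not None:
            for field in TURAS_FIELDS:
                if field in turas:
                    row[field] = turas[field]
        row["matched"] = turas is not None
        output.append(row)
    return output
```

Keeping a flag on each row is one way to let users of the data see which records rely on SWISS alone and may therefore report the wrong board of placement or specialty.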