The Measurement Properties of Quantitative Sensory Testing Measuring Vibration Thresholds in the Orthopaedic Trauma Population: A Systematic Review
Abstract
Background: Neuropathic Pain (NP) is prevalent in the Orthopaedic Trauma (OT) population. Quantitative Sensory Testing (QST) measuring Vibration Thresholds (VT) is recommended to facilitate its diagnosis. However, the measurement properties of QST measuring VT in the OT population are currently unknown.
Objective: To establish the reliability, validity and responsiveness of QST measuring VT in the OT population.
Methods: Three electronic databases were systematically searched. Methodological quality was evaluated using the COSMIN (Consensus-based Standards for the Selection of Health Measurement Instruments) four-point scale (Terwee et al., 2012). The COSMIN quality criteria and levels of evidence were used to determine the strength of evidence for each measurement property (Terwee et al., 2011). A narrative synthesis was conducted due to a lack of homogeneity across studies.
Results: The search strategy returned 516 articles, of which 448 remained after duplicates were removed. Following the study selection process, four studies were included in the review. Of those, three evaluated reliability and one investigated validity. No study was retrieved for responsiveness. There was a moderate level of evidence supporting reliability and a limited level of evidence for validity.
Conclusion: Insufficient evidence is available to draw conclusions regarding the reliability, validity and responsiveness of QST measuring VT in the OT population. Further studies of high methodological quality are required to confirm its measurement properties.
Implications of key findings: The use of QST measuring VT to diagnose NP in the OT population cannot be recommended.
Table of Contents
Title page
Abstract
Table of Contents
- Introduction
- Rationale
- Objectives
- Methodology
- Protocol and registration
- Eligibility criteria
- Information sources
- Search
- Study selection
- Data collection process
- Data items
- Methodological quality of individual studies
- Summary measures
- Synthesis of results
- Results
- Study selection
- Study characteristics
- Study design
- Participants
- Measurement Properties
- Reliability
- Validity
- Responsiveness
- Quantitative Sensory Testing
- Vibration Thresholds
- Equipment
- Testing procedure
- Standardisation methods
- Methodological quality of individual studies
- Synthesis of results
- Discussion
- Summary of evidence
- Measurement Properties
- Methodological issues
- Sample size
- Standardisation
- Participants
- Clinical Implications
- Research recommendations
- Limitations
- Conclusion
- Introduction
- Rationale
Pain is prevalent in the Orthopaedic Trauma (OT) population. Presenting itself as the most common symptom in the acute phase (Clay et al., 2010), it later develops into chronic pain in 25% of cases (Castillo et al., 2006). Pain can be classified into four different states: nociceptive, inflammatory, centralised and neuropathic (Vardeh et al., 2016). Neuropathic pain (NP), caused by a lesion or disease of the somatosensory system (Jensen et al., 2011), is particularly common in the OT population (Ciaramitaro et al., 2010). The somatosensory system is responsible for carrying information related to the modalities of touch, vibration, temperature, pain and kinaesthesia via specific pathways of the peripheral and central nervous system (Arezzo, 1992). It is highly susceptible to damage when musculoskeletal injuries cause compression, stretching or ischaemia to adjacent nerves (Noble, 1998). For example, spinal fractures can lead to spinal cord injury (SCI) causing NP in 40% of patients (Siddall et al., 2003; McLain et al., 2004; Werhagen et al., 2004). Similarly, ankle fractures are frequently combined with injury to the superficial peroneal nerve resulting in significant pain (Redfern et al., 2003).
The burden associated with NP in OT is substantial (Robinson, 2000). This is due to its high prevalence and poor response to treatment in this patient group (Harden and Cohen, 2003 & Ciaramitaro et al., 2010). Indeed, seventy-two percent of SCI patients report NP as a large problem in their daily life, describing it as severe, debilitating and refractory to treatment (Werhagen et al., 2004 & Wrigley et al., 2009). The failure of NP treatment is currently attributed to inadequate diagnosis, poor understanding of the mechanisms involved and the inappropriate use of outcome measures (Harden and Cohen, 2003). Quantitative Sensory Testing (QST) has recently received attention because it has the potential to improve NP treatment by addressing these issues.
QST is a psychophysical outcome measure recommended to facilitate the diagnosis of NP (Backonja et al., 2013). It assesses the function of the somatosensory system by measuring thresholds to a range of sensory stimuli, including pressure, heat and vibration (Sia and Cros, 2003). Vibration Thresholds (VT) are particularly relevant to OT because they are strongly suggestive of NP (Backonja et al., 2013). An elevation in VT is the first sign of nerve pathology, occurring when large A-Beta fibres, responsible for mediating the sensation of vibration, are subject to ischaemia (Martina et al., 1998 & Greening et al., 2003). Thus, QST measuring VT facilitates the early diagnosis of NP in the OT population, recognised as the first step to effective treatment (Kaki et al., 2005). Two methods are commonly employed in QST: the method of limits and the method of levels. In the method of limits, the stimulus intensity is altered until the participant reports detection or disappearance of a sensation. In the method of levels, a predefined stimulus is applied and the participant reports whether the stimulus is perceived or not. The threshold is then calculated as the mean of the values obtained (Hansson et al., 2007).
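The threshold calculation for the method of limits can be sketched as follows. This is a minimal illustration only, assuming a series of ascending trials (the intensity at which vibration is first detected) and a series of descending trials (the intensity at which it disappears). The function name and the amplitude values are hypothetical and are not taken from any included study.

```python
from statistics import mean

def method_of_limits_threshold(detection_values, disappearance_values):
    """Return (mean detection value, mean disappearance value, mean of both).

    Each argument is a list of stimulus intensities recorded over repeated
    trials; the unit (assumed here to be micrometres) is hypothetical.
    """
    if not detection_values or not disappearance_values:
        raise ValueError("each series needs at least one trial")
    detection = mean(detection_values)
    disappearance = mean(disappearance_values)
    # Averaging the two means gives a single combined threshold.
    combined = mean([detection, disappearance])
    return detection, disappearance, combined

# Hypothetical example: three ascending and three descending trials
det, dis, vt = method_of_limits_threshold([1.2, 1.4, 1.3], [0.9, 1.0, 1.1])
```

Reporting the detection and disappearance means separately, as well as their average, keeps the sketch compatible with studies that report only one of the two.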
QST measuring VT offers several advantages for the management of NP over other measurement tools currently used in clinical practice. In contrast to nerve conduction studies and the standard bedside neurological exam, QST measuring VT assesses the integrity of the entire somatosensory system from receptor to cortex (Chong and Cros, 2004 & Backonja et al., 2013). It also measures both gain and mild decreases in nerve function (Hayes et al., 2002) and utilises a calibrated vibration stimulus following standardised instructions (Backonja et al., 2013). In this way, it offers a more comprehensive and accurate characterisation of somatosensory function (Martinez et al., 2008), providing vital information to inform the diagnosis and monitoring of NP as well as furthering the understanding of underlying pain mechanisms. As a result of these improvements, better treatment outcomes are expected for NP patients (Rolke et al., 2006; Yarnitsky and Granot, 2006; Hansson et al., 2006 & Uddin and MacDermid, 2016).
Despite the advantages of using QST measuring VT for the OT population, concerns about its measurement quality have hindered its use in clinical practice (Backonja et al., 2013). The quality of a measurement tool is judged by its measurement properties, which consist of reliability, validity and responsiveness (Gadotti et al., 2006). According to the COSMIN (consensus-based standards for the selection of health measurement instruments) taxonomy, reliability informs clinicians about the degree to which the measurement is free from measurement error. Validity indicates the degree to which the instrument measures the construct it purports to measure. Lastly, responsiveness determines whether the instrument is able to detect change over time (Mokkink et al., 2010). Some aspects of the QST method have been identified as impacting on its quality, such as participant mental fatigue and confusion (Backonja et al., 2013), as well as a lack of standardisation of protocols and environmental conditions (Rolke et al., 2006 & Pavlaković and Petzke, 2010).
In response to the above concerns, recent research has sought to improve the quality of QST. In 2006, the German research network acted on the lack of standardisation by providing a protocol and rater training (Rolke et al., 2006). In addition, several studies have demonstrated that QST measuring VT is reliable (ICC for intra-rater = 0.55 to 0.99; ICC for inter-rater = 0.32 to 0.88) (Peters et al., 2003), valid (Bird et al., 2006) and responsive (Chong and Cros, 2004) in the asymptomatic and diabetic populations. Indeed, due to its excellent reliability, validity and responsiveness (Bird et al., 2006 & Mythili et al., 2010) in the diagnosis and monitoring of somatosensory changes in diabetic neuropathy, several guidelines now endorse the use of QST measuring VT for diabetic patients (Kahn 1992 & Shy et al. 2003). However, it is important to note that the majority of QST research is focused on the asymptomatic population and diabetes (Greening et al., 2003 & Moloney et al., 2012). This means that only a small number of studies have investigated the measurement properties of QST measuring VT in the OT population.
These efforts to improve the quality of QST also tie in with a wider research aim to implement evidence-based practice in health-care (Fineout-Overholt et al., 2005). To facilitate evidence-based instrument selection, the COSMIN initiative has recently developed a critical appraisal tool to evaluate methodological quality in systematic reviews of measurement properties of health measurement instruments (Mokkink, 2010).
There is currently no systematic review investigating the measurement properties of QST measuring VT in the OT population. Although QST measuring VT is currently recommended in the diagnosis of NP, until its measurement properties are established in the OT population, there is a serious risk that it may render imprecise or biased results in this patient group (Terwee et al., 2007).
- Objectives
The aim of this systematic review is to establish the measurement properties of QST measuring VT in the OT population, using a validated measure of methodological quality.
The following questions will be answered:
- Is QST measuring VT a reliable tool in the OT population?
- Is QST measuring VT a valid tool in the OT population?
- Is QST measuring VT a responsive tool in the OT population?
- Methodology
- Protocol and registration
A systematic review was conducted according to the COSMIN protocol for Systematic Reviews of Measurement Properties (Terwee et al., 2011) and a predefined protocol reported in line with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement (Liberati et al., 2009).
- Eligibility criteria
Appropriate PICOS elements (Participants, Intervention, Comparators, Outcome measures and Study) were used to define clear inclusion criteria (CRD, 2009):
- Participants: Studies investigating subjects over the age of 19 with an OT injury. OT was defined as injuries to the musculoskeletal system. This may include injuries to bone (fractures and dislocation), connective tissues (sprains and strains) or soft tissue (haematomas and contusions) (Maryniak, 2011).
- Outcome Measure: Studies were included if they measured VT. This includes the Vibration Perception Threshold (VPT) (when the vibration sensation is first perceived), the Vibration Disappearance Threshold (VDT) (when the vibration sensation first disappears) or the average of both (Peters et al., 2003).
- Measurement Property: Any measurement property was included. Measurement properties include reliability, validity and responsiveness. Reliability can be measured over time (test-retest), by the same person on different occasions (intra-rater) or by different people on the same occasion (inter-rater) (Mokkink et al., 2010b). Validity encompasses: content validity (the degree to which the instrument is an adequate reflection of the construct to be measured), criterion validity (the degree to which the instrument compares to the ‘gold standard’) and construct validity (the degree to which the instrument performs against pre-defined hypotheses) (Mokkink et al., 2010b). The latter can be subdivided into convergent validity (the extent to which two or more instruments that purport to measure the same construct agree with each other) and discriminative validity (the extent to which measurement scores distinguish between individuals or populations that would be expected to differ) (McDowell, 2006, p. 711).
- Studies: All studies looking at the measurement properties of QST measuring VT in the OT population.
Exclusion Criteria:
Studies were excluded if they were not written in English or recruited participants that were either not human or under the age of 19.
- Information sources
Three electronic databases [Medline (Ovid), CINAHL Plus and AMED (EBSCO)] were systematically searched from inception to November 2017 (Terwee et al., 2011). Hand searching of key journals and scanning of the grey literature were also conducted to ensure that no relevant articles were missed (CRD, 2009). All final searches were carried out on the 19th of December 2017 by one independent reviewer.
- Search
The search strategy used relevant key words and Medical Subject Headings (MeSH) related to the PICOS elements defined above. These were combined using relevant Boolean operators (Terwee et al., 2011). The search terms were initially developed for the Medline database (see appendix 1) and revised appropriately for each database.
- Study selection
To increase the reliability of decisions, the study selection process was carried out by two independent reviewers (CRD, 2009). A practice trial was also completed to ensure consistency between reviewers (Liberati et al., 2009). As defined in the protocol, the second independent reviewer completed the required 10% of the first phase of the study selection process. Any disagreement was resolved through discussion. The first stage of the study selection process consisted of an initial screening of titles and abstracts against the predefined eligibility criteria. In the second stage, the full-text versions of the articles identified as possibly relevant in the initial screening were reassessed for eligibility (CRD, 2009).
- Data collection process
A data extraction form was piloted on one of the 4 articles and subsequent changes were indicated in the protocol (Liberati et al., 2009). As stated in the protocol, authors were not contacted to investigate missing information and the data collection was completed by one independent reviewer.
- Data items
The PICOS method was used to extract data from the selected studies (Terwee et al., 2011). Information was retrieved from each study regarding (1) the participants (general demographics and type of trauma); (2) the outcome measure (type of VT, equipment, testing procedure and standardisation methods); (3) the measurement properties (type, statistical analysis and result); (4) the study (year of publication, country and design). Studies including SCI patients with a mean participant age below 50 years old were assumed to have recruited participants with a traumatic rather than non-traumatic SCI. This is because the mean age of non-traumatic SCI patients is much higher (61 years old) than traumatic SCI patients (38 years old) (McKinley et al., 1999; New et al., 2002; Scivoletto et al., 2003; Guilcher et al., 2008 & Cosar et al., 2010).
- Methodological quality of individual studies
The methodological quality of individual studies was assessed using the validated COSMIN framework (Mokkink et al., 2010a). Using the COSMIN four-point scale and the “worst score counts” principle, each article was rated as either ‘poor’, ‘fair’, ‘good’ or ‘excellent’ (Terwee et al., 2012). As predefined in the protocol, one independent reviewer assessed methodological quality.
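The “worst score counts” principle can be illustrated with a short sketch: the overall rating for a measurement property is the lowest rating given to any checklist item. The function name and the item ratings below are hypothetical and serve only to show the logic.

```python
# Ratings ordered from lowest to highest quality on the four-point scale
RATING_ORDER = ["poor", "fair", "good", "excellent"]

def worst_score_counts(item_ratings):
    """Return the lowest rating found among the item ratings."""
    if not item_ratings:
        raise ValueError("no item ratings supplied")
    # list.index raises ValueError for a rating outside the four-point scale
    return min(item_ratings, key=RATING_ORDER.index)

# Hypothetical item ratings for one measurement property in one study
overall = worst_score_counts(["excellent", "good", "fair", "good"])
```

Under this principle a single low-rated item determines the overall score, which is why one methodological weakness (for example, a small sample) can lower the rating of an otherwise well-conducted study.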
- Summary measures
All statistical test results used to measure reliability, validity or responsiveness were extracted. Using the quality criteria developed by Terwee et al. (2011), the results of each individual study were rated as either negative (-), positive (+) or indeterminate (?) (see appendix 2). Concerning the measure of reliability, Intra-class Correlation Coefficients (ICC) were also interpreted according to the Shrout and Fleiss (1979) criteria, whereby a score <0.40 is considered poor, 0.40-0.59 is fair, 0.60-0.75 is good and >0.75 is excellent.
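The ICC interpretation bands can be expressed as a simple lookup. This is a sketch only: the function name is hypothetical, and the handling of values falling between the stated bands (for example 0.595) is an assumption, resolved here by treating each band as extending up to the start of the next.

```python
def interpret_icc(icc):
    """Map an ICC estimate to the descriptive bands stated in the text."""
    if icc > 1.0:
        raise ValueError("an ICC cannot exceed 1")
    if icc < 0.40:      # includes negative estimates
        return "poor"
    if icc < 0.60:      # 0.40-0.59
        return "fair"
    if icc <= 0.75:     # 0.60-0.75
        return "good"
    return "excellent"  # >0.75

# The ICC limits quoted in the Introduction (Peters et al., 2003)
labels = [interpret_icc(v) for v in (0.32, 0.55, 0.88, 0.99)]
```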
- Synthesis of results
The pooling of results from diverse non-randomised study types is not recommended; therefore, given the lack of homogeneity across studies, a narrative synthesis was conducted (Sterne et al., 2008). The overall strength of evidence for each measurement property was synthesised according to the “levels of evidence for the quality of the measurement property” designed by Terwee et al. (2011). Based on the results of the COSMIN quality criteria and four-point scale, the overall level of evidence for each measurement property was rated as either “strong”, “moderate”, “limited”, “conflicting” or “unknown” (table 1). Results were then summarised by measurement property.
Table 1. Levels of evidence for the quality of the measurement property (Terwee et al., 2011).
Level of evidence | Rating | Criteria
Strong | +++ or – – – | Consistent findings in multiple studies of good methodological quality OR in one study of excellent methodological quality
Moderate | ++ or – – | Consistent findings in multiple studies of fair methodological quality OR in one study of good methodological quality
Limited | + or – | One study of fair methodological quality
Conflicting | +/- | Conflicting findings
Unknown | ? | Only studies of poor methodological quality
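The decision rules in table 1 can be sketched as a small function. This is an illustrative reading of the table, not an official COSMIN implementation. It assumes that each study contributes one methodological quality rating and one result rated "+" or "-", and that studies of poor quality are disregarded whenever studies of better quality are available. The function name and the example inputs are hypothetical.

```python
def level_of_evidence(studies):
    """studies: list of (quality, result) pairs.

    quality is 'poor', 'fair', 'good' or 'excellent'; result is '+' or '-'.
    Returns (level, rating) following the rules of table 1.
    """
    for quality, result in studies:
        if quality not in ("poor", "fair", "good", "excellent"):
            raise ValueError(f"unknown quality: {quality}")
        if result not in ("+", "-"):
            raise ValueError(f"unknown result: {result}")

    # Assumption: poor-quality studies do not count towards the synthesis
    usable = [(q, r) for q, r in studies if q != "poor"]
    if not usable:
        return "unknown", "?"

    signs = {r for _, r in usable}
    if len(signs) > 1:
        return "conflicting", "+/-"
    sign = signs.pop()

    qualities = [q for q, _ in usable]
    good_or_better = sum(q in ("good", "excellent") for q in qualities)
    if "excellent" in qualities or good_or_better >= 2:
        return "strong", sign * 3
    if good_or_better == 1 or qualities.count("fair") >= 2:
        return "moderate", sign * 2
    return "limited", sign  # a single study of fair quality

# Hypothetical inputs: three fair studies with positive results
example = level_of_evidence([("fair", "+"), ("fair", "+"), ("fair", "+")])
```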
- Results
- Study selection
The search strategy returned 516 articles. After 68 duplicates were removed, 448 titles and abstracts were screened and 401 articles were excluded. The remaining 47 full-text articles were then retrieved for further review. Of these, 43 articles were excluded (see appendix 3 for primary reasons of exclusion). No additional articles were included from other sources. Therefore, 4 papers were included in this systematic review. The number of articles excluded at each phase of the study selection process is summarised using the PRISMA flow chart (figure 1).
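The arithmetic of the selection process can be checked with a short sketch that uses the counts reported in the text and in figure 1. The function and variable names are hypothetical.

```python
def prisma_flow(identified, duplicates, excluded_on_screening, excluded_full_text):
    """Derive the number of records remaining at each stage of selection."""
    total = sum(identified.values())
    screened = total - duplicates
    full_text = screened - excluded_on_screening
    included = full_text - excluded_full_text
    return {
        "identified": total,
        "screened": screened,
        "full_text": full_text,
        "included": included,
    }

# Counts as reported in this review
flow = prisma_flow(
    identified={"Medline": 415, "CINAHL": 94, "AMED": 7},
    duplicates=68,
    excluded_on_screening=401,
    excluded_full_text=43,
)
```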
- Study characteristics
- Study design
Included studies consisted of a single-blinded case-control observational study (Rushton et al., 2014), a double-blinded within-day inter- and intra-rater reliability study (Tyros et al., 2016) and two studies without a stated design (Krassoukiov et al., 1999 & Felix and Widerström-Nega, 2009).
- Participants
Studies used varying sample sizes and recruited participants with different types of OT injuries. Samples ranged from 26 to 42 participants, totalling 135 participants. Forty-six participants with Chronic Whiplash Associated Disorder grade 2 (CWAD II) were recruited across two studies (Rushton et al., 2014 & Tyros et al., 2016). The remaining studies investigated traumatic SCI, including 43 participants (Krassoukiov et al., 1999 & Felix and Widerström-Nega, 2009). These can be further separated into 31 incomplete SCI (where sensory and/or motor function below the neurological level, including the lowest sacral segment, is partially preserved) and 12 complete SCI (where sensory and motor function is absent in the lowest sacral segment) (Waters et al., 1991 & Maynard et al., 1997). Study design and participant characteristics are summarised in table 2.
Figure 1: Study selection flow diagram (from Moher et al., 2009)
Records identified through database searching (n = 516): Medline (n = 415), CINAHL (n = 94), AMED (n = 7)
Additional records identified through other sources (n = 0)
Records after duplicates removed (n = 448)
Records screened using the title and abstract (n = 448)
Records excluded (n = 401)
Full-text articles assessed for eligibility (n = 47)
Full-text articles excluded (n = 43)
Reasons for exclusion:
- Under the age of 19 (1)
- Not Orthopaedic Trauma participants (26)
- Not QST with VT (22)
- Not evaluating measurement properties (5)
Studies included in the synthesis (n = 4)
Table 2: Study design and participant characteristics
Study | Country | Design | Experimental Group | Control Group
Rushton et al., 2014 | UK | Single-blinded case-control observational study | 20 CWAD II (median age 28.5, 13 females, 7 males) | 22 controls with no history of whiplash injury/neck pain, headaches or upper quadrant injuries (median age 26, 9 females, 13 males)
Krassoukiov et al., 1999 | Canada | Not stated | 21 incomplete traumatic SCI (median age 38.9, 6 females, 15 males) | 14 healthy able-bodied subjects (median age 33.9, 8 females, 6 males)
Felix and Widerström-Nega, 2009 | USA | Not stated | 22 traumatic SCI with NP (10 incomplete, 12 complete) (median age 41.7, 3 females, 19 males) | 10 non-disabled participants (median age 30.4, 4 females, 6 males)
Tyros et al., 2016 | UK | Double-blinded within-day inter- and intra-rater reliability study | 26 CWAD II (median age 29.9, 18 females, 8 males) | No control
UK = United Kingdom; USA = United States of America; CWAD II = Chronic Whiplash Associated Disorder grade 2; SCI = Spinal Cord Injury
- Measurement Properties
- Reliability
Three studies evaluated reliability. Of those, one study looked at inter- and intra-rater reliability (Tyros et al., 2016) and two studies investigated test-retest reliability (Krassoukiov et al., 1999 & Felix and Widerström-Nega, 2009). Krassoukiov et al. (1999) did not specify which type of reliability they intended to measure. However, the testing was conducted on two separate occasions, suggesting test-retest reliability was evaluated. Time between successive testing sessions ranged from 1 day to 4 weeks (Felix and Widerström-Nega, 2009). All studies used the ICC, with variation in the model chosen. The ICC is considered the most appropriate statistic because it reflects variance due to multiple sources of error such as the instrument, the subject or the tester (Shrout and Fleiss, 1979). According to the quality criteria developed by Terwee et al. (2011) and the interpretation of ICC values by Shrout and Fleiss (1979), the results of all three studies were rated as positive and excellent. Table 3 summarises the results for each individual study.
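As an illustration of how an ICC is obtained, the sketch below computes one common form, the two-way random-effects, single-measure coefficient, written ICC(2,1) in the notation of Shrout and Fleiss (1979), from a subjects-by-raters table. It is a sketch under stated assumptions: the included studies may have used other ICC models, and the function name and the ratings are hypothetical and are not data from the included studies.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a complete table: one row per subject, one column per rater."""
    n = len(ratings)                      # number of subjects
    k = len(ratings[0]) if n else 0       # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_subjects - ss_raters

    bms = ss_subjects / (n - 1)           # between-subjects mean square
    jms = ss_raters / (k - 1)             # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)

# Hypothetical ratings: six subjects, each measured by four raters
example_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(example_ratings)
```

Because this form treats systematic differences between raters as error, it yields a lower value than a consistency-type ICC computed on the same table, which is one reason the choice of model should be reported alongside the coefficient.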
- Validity
One study evaluated discriminative validity using logistic regression analysis (Rushton et al., 2014). Its result was rated as negative (-) according to the quality criteria (Terwee et al., 2011).
- Responsiveness
No study was retrieved for responsiveness.
- Quantitative Sensory Testing
Table 4 summarises the information related to QST for each included study.
- Vibration Thresholds
The type of VT measured varied across studies. Two studies measured the VPT (Krassoukiov et al., 1999 & Felix and Widerström-Nega, 2009) and one study the VDT (Tyros et al., 2016). The remaining study measured both the VDT and the VPT (Rushton et al., 2014). All studies completed three measurements and calculated their mean value.
Table 3: Results of individual studies