INTRODUCTION
Lower back pain (LBP) remains a leading cause of disability and morbidity in today’s society. Up to 70% of the population experiences LBP at some point in their lives [5]. With an estimated societal cost of 85 billion dollars annually in the United States alone in 2008, and an expected several-fold increase in the coming decades, LBP constitutes an enormous burden on the healthcare system [7].
The aetiology of LBP is multifactorial, with both genetic and environmental factors contributing to its development. Intervertebral disc (IVD) degeneration is considered to be a crucial component in the aetiology of this condition. The IVD is formed by a gelatinous, centrally located nucleus pulposus (NP), which is surrounded by the annulus fibrosus (AF). Morphologically, the NP consists of collagen II fibres and elastin randomly arranged in a highly hydrated, aggrecan-based gel, which also contains a low concentration of chondrocyte-like cells [15]. The AF can be divided into an inner AF, which can be viewed as a transition zone, and an outer AF, which is formed by distinct, highly organized lamellae consisting of collagen I fibres intertwined with elastin, lubricin and collagen VI fibres [15]. Moreover, the IVD is bound caudally and rostrally by the IVD endplates, which separate the vertebral bodies from the IVD. The highly hydrated NP, which is constrained by both the AF and the endplates, distributes mechanical loads evenly, dissipates energy and allows for the movement of the vertebral column [15]. Deterioration in the function of the IVD is associated with changes in the content of the extracellular matrix of the NP, which occur with age and degeneration [10]. These include loss of water content, degradation of proteoglycans and collagen, and upregulation of inflammatory cytokines [10]. The deterioration of the NP leads to irreversible structural changes of the IVD and its surroundings. A common macroscopic characteristic of degeneration is the presence of clefts and tears within the IVD and the loss of demarcation between the NP and the AF [11]. As a result, the IVD loses its mechanical bearing properties [17] and the point of pressure exertion shifts from the NP to the AF [1]. This can result in NP bulging, herniation, compression syndrome and ultimately low back pain.
Several grading systems are used to assess the degree of degeneration, depending on the modality. A grading system for morphologic changes due to IVD degeneration was proposed by Thompson et al. [14] in 1990. Magnetic resonance imaging (MRI) has also been widely used to study and assess IVD degeneration. The signal intensity loss in T2-weighted images correlates with the degree of degeneration [4]. The classification of Pfirrmann et al. [9] is a popular grading system for degenerative changes of the lumbar spine observed on MRI. It is broadly used to study lumbar degeneration and has also been adopted by neurosurgeons and orthopaedic surgeons in the perioperative setting. It has been validated in multiple studies [12, 16]; however, the correlation between the macroscopic Thompson and radiologic Pfirrmann grading systems has not been studied comprehensively.
Therefore, the aim of this study was to compare the macroscopic appearance of lumbar spine specimens with their MRI appearance and to assess the reliability of the popular Pfirrmann classification of degenerative changes in the lumbar spine.
MATERIALS AND METHODS
Specimen collection
The study protocol was approved by our institutional Bioethics Committee. Moreover, the study strictly adhered to the ethical principles for medical research involving human subjects set by the Declaration of Helsinki.
Full lumbar spinal columns (vertebrae L1–S1 and the IVDs between them) were harvested from fresh cadavers through an anterior dissection. Inclusion criteria were as follows: 1) age 18–80 years, 2) possibility to dissect the lumbar column. Donors who had died from trauma or who had visible spinal trauma, spinal surgery, spinal tumours or ankylosing spondylitis were excluded from this study.
Intervertebral discs that became damaged during dissection or had artefacts in MRI scans that did not allow for full and reliable assessment were excluded from further analyses.
Magnetic resonance imaging
Magnetic resonance imaging scans of the harvested spinal columns were performed using a Philips Achieva 3.0T TX scanner. Two independent reviewers assessed IVD degeneration according to the scale of Pfirrmann et al. [9]. In summary, the Pfirrmann grading system assesses changes in T2-weighted spin-echo images on a scale from 1 to 5, with grade 1 describing a healthy disc (homogeneous, with bright hyperintense white signal intensity and normal disc height) and grade 5 describing a heavily degenerated disc (collapsed disc space, inhomogeneous, with hypointense black signal intensity) (Fig. 1) [9].
Moreover, the MRI scans were assessed for Modic type endplate changes [8]. Type 1 changes were defined as decreased signal intensity on T1-weighted images and increased signal intensity on T2-weighted images. Type 2 changes were defined as increased signal intensity on T1-weighted images and isointense or slightly increased signal intensity on T2-weighted images [8]. Type 3 changes were defined as decreased signal intensity on both T1- and T2-weighted images. Any other radiologic findings, such as Schmorl’s nodes, disc bulging or herniation, were also noted.
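The Modic definitions above amount to a lookup on the T1 and T2 signal intensities of the endplate region. The following minimal Python sketch encodes them; the function name `modic_type` and the three signal labels are hypothetical, and "slightly increased" T2 signal is folded into "increased" as a simplifying assumption. It is an illustration of the definitions, not the reviewers' actual procedure.

```python
# Hypothetical labels for signal intensity relative to normal marrow.
ALLOWED_SIGNALS = ("decreased", "isointense", "increased")


def modic_type(t1_signal, t2_signal):
    """Return the Modic type (1, 2 or 3) or None if no definition matches."""
    for signal in (t1_signal, t2_signal):
        if signal not in ALLOWED_SIGNALS:
            raise ValueError(f"unknown signal label: {signal!r}")
    if t1_signal == "decreased" and t2_signal == "increased":
        return 1
    # Assumption: "slightly increased" T2 signal is recorded as "increased".
    if t1_signal == "increased" and t2_signal in ("isointense", "increased"):
        return 2
    if t1_signal == "decreased" and t2_signal == "decreased":
        return 3
    return None  # no Modic change under the definitions used here


if __name__ == "__main__":
    print(modic_type("decreased", "increased"))
    print(modic_type("increased", "isointense"))
```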
Morphologic assessment
All vertebral columns were cut in the midsagittal plane. High-resolution images of each column were taken and used for later assessment. IVD degeneration was graded by two independent reviewers on a scale from 1 to 5 based on the criteria developed by Thompson et al. [14]. In summary, each grade is determined by assessing specific morphologic changes of the nucleus pulposus, annulus fibrosus, IVD endplates and adjacent vertebral bodies, with grade 1 describing a healthy IVD and grade 5 describing a heavily degenerated disc (Fig. 1) [14].
Moreover, any macroscopic alterations in the structure of lumbar columns were noted and included the following: osteophytes, Schmorl’s nodes, IVD clefts, tears, bulging and herniation.
Statistical analysis
All statistical analyses were conducted using STATISTICA (v.13.3) and PQStat (v.1.8.0). Frequency distribution, mean and standard deviation were used to characterise the study group and degeneration grades. Spearman’s rank correlation coefficient was used to assess the relation between age and degeneration. Moreover, weighted Cohen’s kappa coefficient was used to determine the agreement between Pfirrmann and Thompson grades. This statistic assigns weights to disagreements: the larger the difference between the two grades, the higher the weight. A kappa value of 1 indicates perfect agreement, while a value of 0 indicates agreement equivalent to chance. A p-value of < 0.05 was considered to indicate statistically significant agreement between the two scores. Subgroup analysis of the agreement between degeneration grades for specific IVD levels was also conducted.
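To illustrate how the weighted kappa statistic works, the following minimal Python sketch computes it from two lists of ordinal grades as 1 minus the ratio of weighted observed disagreement to weighted chance-expected disagreement. The function name `weighted_kappa`, the option of linear or quadratic weights and the example grades are assumptions for illustration only; the analyses in this study were run in STATISTICA and PQStat, and the weighting scheme used there is not stated.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, grades=(1, 2, 3, 4, 5), weights="linear"):
    """Weighted Cohen's kappa for two equally long lists of ordinal grades."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")
    n = len(rater_a)
    power = 1 if weights == "linear" else 2
    observed = Counter(zip(rater_a, rater_b))  # joint counts per grade pair
    marginal_a = Counter(rater_a)
    marginal_b = Counter(rater_b)
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in grades:
        for j in grades:
            weight = abs(i - j) ** power  # 0 on the diagonal, larger further away
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += weight * (marginal_a[i] / n) * (marginal_b[j] / n)
    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use a single grade")
    return 1.0 - observed_disagreement / expected_disagreement


if __name__ == "__main__":
    # Hypothetical grades for eight discs, not data from this study.
    thompson = [1, 2, 2, 3, 3, 4, 4, 5]
    pfirrmann = [2, 2, 2, 2, 3, 3, 4, 4]
    print(round(weighted_kappa(thompson, pfirrmann), 3))
```

With disagreement weights, a one-grade difference is penalised less than a two-grade difference, which is why the weighted statistic suits ordinal scales such as these better than the unweighted kappa.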
RESULTS
Study group
One hundred lumbar spine columns (L1–S1) were harvested from male cadavers. The mean donor age was 42.2 ± 12.3 years. Fifty-four IVDs could not be visualised well enough for full and reliable assessment and were therefore excluded from the analysis.
Degeneration assessment
A total of 446 IVDs were included in the analysis of the degeneration grade. Radiologic assessment using the Pfirrmann grading system classified 44.2% of discs as grade 2, 32.1% as grade 3, 16.8% as grade 4, 5.8% as grade 1 and 1.1% as grade 5. The morphologic Thompson scale graded the majority of discs as grades 2 and 3 (44.2% and 32.1%, respectively), followed by grade 4 (16.8%), grade 1 (5.8%) and grade 5 (1.1%).
There were 42 discs (9.4% of all discs) that showed Modic type endplate changes, with 8.7% of all discs graded as Modic type 2 and 0.7% as Modic type 1.
The analysis of the effect of age on degeneration revealed a significant, although moderate, positive correlation with both the Thompson (r = 0.38, p < 0.001) and Pfirrmann (r = 0.36, p < 0.001) average grade.
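For readers unfamiliar with Spearman’s rank correlation, the sketch below shows one way to compute it in plain Python: both variables are converted to ranks (ties share the average rank) and the Pearson correlation of the ranks is taken. The function names `average_ranks` and `spearman_rho` and the example ages and grades are hypothetical; this is not the software routine used in the study and it does not compute a p-value.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    mean_x = sum(rank_x) / len(rank_x)
    mean_y = sum(rank_y) / len(rank_y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    var_x = sum((a - mean_x) ** 2 for a in rank_x)
    var_y = sum((b - mean_y) ** 2 for b in rank_y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Hypothetical donor ages and average grades, not data from this study.
    ages = [23, 31, 38, 42, 47, 55, 61]
    mean_grades = [1.8, 2.4, 2.2, 2.6, 3.0, 2.8, 3.6]
    print(round(spearman_rho(ages, mean_grades), 2))
```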
Table 1 summarises subgroup analyses of the Thompson and Pfirrmann grades based on the spinal level.
Table 1. Distribution of Thompson and Pfirrmann grades (% of discs) by intervertebral disc level

| Intervertebral disc level | Grade | Thompson | Pfirrmann |
|---|---|---|---|
| L1/L2 | 1 | 7% | 4% |
| L1/L2 | 2 | 39% | 63% |
| L1/L2 | 3 | 50% | 24% |
| L1/L2 | 4 | 4% | 9% |
| L1/L2 | 5 | 0% | 0% |
| L2/L3 | 1 | 7% | 3% |
| L2/L3 | 2 | 32% | 59% |
| L2/L3 | 3 | 50% | 31% |
| L2/L3 | 4 | 10% | 7% |
| L2/L3 | 5 | 1% | 0% |
| L3/L4 | 1 | 5% | 3% |
| L3/L4 | 2 | 25% | 57% |
| L3/L4 | 3 | 45% | 23% |
| L3/L4 | 4 | 21% | 16% |
| L3/L4 | 5 | 4% | 1% |
| L4/L5 | 1 | 9% | 5% |
| L4/L5 | 2 | 21% | 31% |
| L4/L5 | 3 | 37% | 43% |
| L4/L5 | 4 | 27% | 21% |
| L4/L5 | 5 | 6% | 0% |
| L5/S1 | 1 | 6% | 13% |
| L5/S1 | 2 | 23% | 22% |
| L5/S1 | 3 | 36% | 34% |
| L5/S1 | 4 | 22% | 27% |
| L5/S1 | 5 | 13% | 4% |
Inter-grading system agreement
A total of 446 pairs of Thompson and Pfirrmann grades for specific IVDs were compared. The analysis showed a weighted Cohen’s kappa of 0.61 (p < 0.001), which suggests significant and substantial agreement between the two grading systems. The highest percentage agreement was achieved for grade 2 (67.2% of discs). All other grades showed agreement in less than half of the cases. The highest percentage disagreement was observed for Thompson grade 1, with 70.0% of discs graded as Pfirrmann grade 2. Most of the disagreements were due to a one-grade difference (91.5%), whereas only 8.5% were due to a two-grade difference.
In summary, the Pfirrmann scale tended to assign lower degeneration grades than the Thompson scale. The majority of Thompson grade 5 discs were scored as Pfirrmann grades 4–5 (83.3%). Thompson grade 4 discs were scored as Pfirrmann grades 3–4 in 86.6% of cases, Thompson grade 3 discs as Pfirrmann grades 2–3 in 87.9% of cases, and Thompson grade 2 discs as Pfirrmann grades 1–2 in 79.8% of cases.
A subgroup analysis based on the spinal level revealed weighted Cohen’s kappa values ranging from 0.40 to 0.70, with the highest value for L5/S1 discs (Table 2). Percentage agreement ranged from 41% to 56%; however, the majority of disagreements were due to a one-grade difference.
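The percentage agreement and the breakdown of disagreements by grade difference reported above can be derived from paired grades as in the following minimal Python sketch. The function name `agreement_summary`, the returned dictionary keys and the example grades are hypothetical and do not reproduce the study data.

```python
from collections import Counter


def agreement_summary(thompson, pfirrmann):
    """Summarise exact agreement and the size and direction of disagreements."""
    if len(thompson) != len(pfirrmann) or len(thompson) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    # Positive difference: Pfirrmann grade is higher than Thompson grade.
    diffs = [p - t for t, p in zip(thompson, pfirrmann)]
    n = len(diffs)
    exact = sum(d == 0 for d in diffs)
    disagreements = n - exact
    by_size = Counter(abs(d) for d in diffs if d != 0)
    return {
        "percent_agreement": 100.0 * exact / n,
        # Share of all disagreements that differ by 1 grade, 2 grades, ...
        "disagreement_by_size": {
            size: 100.0 * count / disagreements
            for size, count in sorted(by_size.items())
        },
        "pfirrmann_lower": sum(d < 0 for d in diffs),
        "pfirrmann_higher": sum(d > 0 for d in diffs),
    }


if __name__ == "__main__":
    # Hypothetical paired grades for six discs, not data from this study.
    summary = agreement_summary([1, 2, 3, 4, 5, 3], [2, 2, 2, 4, 3, 3])
    print(summary)
```

Reporting the direction of the difference separately from its size makes it possible to see whether one scale systematically scores lower than the other.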
Table 2. Agreement between Thompson and Pfirrmann grades by intervertebral disc level

| Intervertebral disc level | Weighted Cohen’s kappa coefficient | P value | Percentage agreement |
|---|---|---|---|
| L1/L2 | 0.40 | < 0.05 | 48.0% |
| L2/L3 | 0.53 | < 0.001 | 56.0% |
| L3/L4 | 0.54 | < 0.001 | 41.0% |
| L4/L5 | 0.59 | < 0.001 | 44.0% |
| L5/S1 | 0.70 | < 0.001 | 48.0% |
DISCUSSION
IVD degeneration is commonly classified using the Pfirrmann grading system when assessed with MRI. Few studies [3] have correlated the morphological appearance of degeneration with its MRI appearance in cadaveric samples. The reliability of the popular Pfirrmann scale has not so far been comprehensively validated against the morphological Thompson scale on a large sample using 3 T MRI. Therefore, the aim of our study was to assess the morphological and radiological characteristics of IVD degeneration and the correlation between the Pfirrmann and Thompson grading systems.
The results of this study showed that overall there is a significant and substantial agreement between the morphological and radiological degeneration scales. However, when analysed by IVD level, considerable variability was observed in the kappa coefficients, with values as low as 0.4. Moreover, there was more disagreement in lower grades of degeneration than in higher grades, which tended to show more agreement. This suggests that the Pfirrmann scale better reflects the stage of degenerative disc disease, and is therefore more clinically applicable, in patients with more degenerated discs. While in the vast majority of cases the disagreement between the scales was due to a one-grade difference, the fact that the Pfirrmann scale assigns lower grades than the morphological scale in the majority of cases warrants its thoughtful use in a clinical setting. Clinicians should remain careful when following up patients and relying solely on the descriptions of MRI exams in the assessment of progression from lower to higher grades. In such cases, the MRI scans themselves should always be evaluated.
The original Pfirrmann grading system was applied to 1 T MRI [9] and further analysed with 1.5 T MRI [6, 16] as well as with 3 T and high-resolution 9.4 T MRI in pre-clinical research [12]. Our study used 3 T MRI, which allowed for detailed visualisation of the spinal columns. Previous studies have repeatedly shown that T2-signal intensity loss, as one of the few radiological characteristics, is associated with morphologically observed degeneration in cadaveric samples [3, 13]. Moreover, T2-signal intensity correlates strongly with the water and proteoglycan content of the disc [2, 18]; thus, its loss should represent the chemical changes that occur within the disc during degeneration. T2-signal intensity loss is the main criterion employed in the Pfirrmann scale. In line with previous research, the results of this study showed indirectly that T2-signal intensity loss reflects the process of degeneration, especially in late stages of IVD degeneration.
The use of the Thompson grading system in the assessment of the morphology of IVD degeneration has an inherent limitation. A single sagittal section allows only a limited evaluation of the IVD and might not represent the full degree of degeneration throughout the whole IVD. Nonetheless, the midsagittal section provides visualisation of all tissues of the disc structure (NP, AF, endplates, adjacent vertebral bodies) as well as degenerative changes that occur in both the coronal and horizontal planes, and as such is most likely to yield the most representative grade of degeneration [14]. Moreover, this plane was also used in the assessment of degeneration with the Pfirrmann scale, which allowed us to compare the two grading systems directly.
A limitation of this study was the use of only male specimens. However, together with the large sample size, this provides a focused and more representative picture of IVD disease for this sex. Further studies should be performed with female specimens in order to evaluate any possible sexual dimorphism.
Moreover, the MRI was performed post-mortem, in the absence of normal tissue metabolism. However, only such a methodology allows MRI data to be compared with a full, direct macroscopic assessment, and therefore provides a reliable and comparable view.
CONCLUSIONS
With the ageing population and the increasing prevalence of IVD disease, reliable grading systems of IVD degeneration are crucial for spine surgeons in their clinical assessment. The results of this study showed that overall there was a significant and substantial agreement between the radiological Pfirrmann and morphological Thompson grading systems. Nonetheless, clinicians should remain careful when using the Pfirrmann scale, as its grades tend to deviate from the morphological assessment. Thus, knowledge of the proper assessment of MRI scans is crucial for spine surgeons.
Acknowledgements
This research was supported by governmental funds for research in 2016–2019 (Polish Ministry of Science and Higher Education, Diamond Grant, 0182/DIA/2016/45). We would like to acknowledge all the donors and their families, whose contribution allowed us to conduct this research.