Introduction
In recent years, new algorithms based on artificial intelligence (AI), more precisely on deep learning (DL), a branch of machine learning, have been developed to improve medical imaging [1]. These methods, aimed at improving image quality, have also been introduced to positron emission tomography/computed tomography (PET/CT), an imaging method widely used in many clinical applications, especially in oncology, neurology, and cardiology [2, 3]. The long path from the idea of an image enhancement method, through its validation, to the clinical implementation of different DL algorithms in PET/CT has been described elsewhere [4, 5].
Since the advent of PET, many solutions have been proposed to shorten the acquisition time and, even more importantly, to reduce the radionuclide activity injected into patients. A change in either parameter has always affected image quality, quantitative data, and the clinical confidence of the readers reporting the scans [6, 7]. The application of machine learning to nuclear medicine image analysis is very promising [8] and appears well suited to reducing the administered radiopharmaceutical activity and, consequently, patient radiation exposure without compromising the image quality and diagnostic performance of PET/CT [9]. Interest in AI-based methods is therefore growing [10], with special attention to image enhancement [11] and to the improvement of image quality in low-dose acquisitions [12].
An algorithm recently developed by Subtle Medical showed promising results in validation studies performed in Europe and on other continents [13, 14] and is now registered and commercially available. Its routine use in PET/CT imaging may reduce the acquisition time or the patient’s radiation dose. Based on these encouraging results, our institution decided to use it in its three PET/CT centers. The purpose of this study is to determine whether the published early results can be replicated in daily clinical routine and whether the DL image enhancement can be used to reduce the acquisition time or the injected radiotracer activity.
Material and methods
Consecutive, randomly assigned patients undergoing routine PET/CT with fluorine-18-fluorodeoxyglucose ([18F]FDG) for oncological indications were eligible for the study. After the initial evaluation, scans with no apparent [18F]FDG-positive lesion (normal PET/CT scans) were excluded. As a result, only patients with at least one [18F]FDG-positive pathological finding were included.
All patients were injected with the standard dose of 3.5 MBq/kg [18F]FDG, and the images were acquired approximately 60 min post-injection. The images were obtained with the GE Discovery IQ PET/CT scanner (GE Healthcare, Milwaukee, WI, USA). The scanner offers the standard OSEM reconstruction as well as the QClear™ enhanced lesion detection reconstruction [15, 16]. Patients underwent low-dose computed tomography (CT) prior to the PET acquisition, for attenuation correction and anatomical correlation of PET findings. The whole-body PET/CT exams were performed with a standard acquisition time of 1.5 min per bed position. Emission data were corrected for randoms, dead time, scatter, and attenuation and were reconstructed iteratively with an ordered-subsets expectation maximization (OSEM) algorithm. The original OSEM-reconstructed images (100% counts, 1.5 min/bed) were sent from the modality to the SubtleEdge server for processing with deep learning (DL) based software (SubtlePET™, Subtle Medical, Menlo Park, CA, USA). The SubtlePET™ software was developed using a 2.5D encoder-decoder U-Net with the main purpose of denoising the images. It employs a convolutional neural network (CNN)-based method operating on a pixel’s neighborhood to reduce noise and increase image quality [17, 18].
To simulate images obtained with a shorter acquisition time or lower [18F]FDG activity, the 1 min/bed images (equivalent to 66% of the acquired counts or 2.33 MBq/kg of injected activity) were also sent to the DL software for comparison. As the manufacturer offered two different versions of the software, both were used for processing: the original SubtlePET version 1 (SP1) and an upgraded SubtlePET version 2 (SP2).
Additionally, to test the performance of the DL processing with a reconstruction algorithm other than OSEM, the 1 min/bed images reconstructed with the QClear™ (GE, Milwaukee, USA) reconstruction algorithm were also sent for processing, but only with version 2 of the software (SP2). To summarize, the following datasets were obtained and analyzed:
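The equivalence quoted above follows from simple proportionality. The sketch below assumes, as a simplification, that detected counts scale linearly with both the acquisition time per bed and the injected activity; the function names are illustrative and not part of any vendor software.

```python
from fractions import Fraction


def count_fraction(reduced_min_per_bed, standard_min_per_bed):
    """Fraction of counts retained when the time per bed is shortened.

    Assumes counts scale linearly with acquisition time (a simplification).
    """
    return Fraction(reduced_min_per_bed) / Fraction(standard_min_per_bed)


def equivalent_activity(standard_mbq_per_kg, fraction):
    """Activity (MBq/kg) giving the same counts at the full acquisition time."""
    return float(standard_mbq_per_kg * fraction)


# Protocol values from the text: 1.5 min/bed standard, 1 min/bed simulated,
# 3.5 MBq/kg injected activity.
fraction = count_fraction(Fraction(1), Fraction(3, 2))
activity = equivalent_activity(Fraction(7, 2), fraction)
print(f"{float(fraction):.1%} of counts, {activity:.2f} MBq/kg")
```

With these inputs the retained fraction is exactly two-thirds, which the text rounds down to 66%, and the equivalent activity is about 2.33 MBq/kg.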
- 1.5 min/bed OSEM (the original PET images)
- 1 min/bed SubtlePET™ v1 over OSEM (SP1/OSEM)
- 1 min/bed SubtlePET™ v2 over OSEM (SP2/OSEM)
- 1 min/bed SubtlePET™ v2 over QClear™ (SP2/QClear).
An experienced, board-certified nuclear medicine physician reviewed the standard acquisition PET images on the Advantage Workstation (GE, Milwaukee, WI, USA) and identified possible lesions. Spherical volumes of interest (VOIs) with a 30 mm radius were drawn on the standard acquisition images over the lesions, and reference VOIs were placed in the reference organs (liver, brain, bladder, and mediastinum). The same lesion VOIs and reference VOIs were subsequently copied to and reviewed on the AI-enhanced images (SP1/OSEM, SP2/OSEM, and SP2/QClear). For each VOI, SUVmax values normalized to lean body mass were calculated for each type of reconstruction.
Differences between the standard study (1.5 min/bed OSEM) and the copied VOIs in the additional datasets were evaluated statistically, both as absolute SUVmax differences and as percentages (%): the average difference, median, and standard deviation were compared for each dataset, for each reference region, and for all lesions together. To test the diagnostic performance of AI enhancement for lesions with relatively low [18F]FDG uptake, a similar analysis was performed separately for a subgroup of scans presenting lesions with SUVmax < 4.0.
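The per-VOI comparison described above can be sketched as follows. This is a minimal illustration, assuming differences are taken as enhanced minus standard (so that a negative value means the enhanced image reads lower); the SUVmax values are hypothetical and are not study data.

```python
from statistics import mean, median, stdev


def suv_differences(reference, enhanced):
    """Return per-VOI (absolute, percentage) SUVmax differences.

    Differences are enhanced minus reference, so negative values mean the
    enhanced image reads lower than the standard 1.5 min/bed OSEM image.
    """
    if len(reference) != len(enhanced):
        raise ValueError("VOI lists must have the same length")
    absolute = [e - r for r, e in zip(reference, enhanced)]
    percent = [100.0 * (e - r) / r for r, e in zip(reference, enhanced)]
    return absolute, percent


def summarize(values):
    """Average, median and sample standard deviation of the differences."""
    return {"mean": mean(values), "median": median(values), "sd": stdev(values)}


# Hypothetical SUVmax values for four VOIs (illustration only).
osem_full = [4.0, 8.0, 2.5, 10.0]
sp_reduced = [3.6, 7.6, 2.4, 9.0]

abs_diff, pct_diff = suv_differences(osem_full, sp_reduced)
print(summarize(abs_diff))
print(summarize(pct_diff))
```

Keeping the absolute and percentage differences in separate lists makes it straightforward to summarize each reference region and the lesion group independently.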
Statistical analysis
Statistical analysis was performed for two ranges of results: 1) all measured lesions (with four methods, i.e. 1.5 min/bed OSEM and three methods for the 1 min/bed acquisition time: SP1/OSEM, SP2/OSEM, and SP2/QClear) and 2) lesions with low uptake (SUVmax < 4), also with the four methods mentioned. Descriptive statistics for categorical variables were presented as relative/absolute frequencies, and those for continuous variables as the median (range).
As all measured values had a non-normal distribution, we performed the Friedman test with calculation of the Kendall coefficient, followed by Spearman rank correlation between the base dataset (OSEM 1.5 min/bed) and the datasets reconstructed with reduced time.
For lesions with SUVmax < 4, a normal distribution was found, but without preserved sphericity. Therefore, the Friedman test was performed, followed by calculation of the Pearson linear correlation coefficient (r-Pearson). Next, the intraclass correlation coefficient (ICC) was calculated between OSEM 1.5 min/bed and each of the AI datasets. The ICC was interpreted according to the Landis interpretation scale [19] (< 0.0: poor; 0.0–0.20: slight; 0.21–0.40: fair; 0.41–0.60: moderate; 0.61–0.80: substantial; 0.81–1.00: almost-perfect reproducibility).
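The interpretation bands listed above can be expressed as a small lookup. The sketch below assumes that values below zero count as poor and that each band's upper bound is inclusive; the function name is illustrative.

```python
def landis_koch_label(coefficient):
    """Map an agreement coefficient to the descriptive bands listed above."""
    if coefficient > 1.0:
        raise ValueError("coefficient cannot exceed 1.0")
    if coefficient < 0.0:
        return "poor"
    # (inclusive upper bound, label) pairs in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if coefficient <= upper:
            return label


# Illustrative coefficients, not study results.
for value in (-0.1, 0.15, 0.55, 0.95):
    print(value, landis_koch_label(value))
```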
Results
Thirty-seven subjects (21 males, 16 females, aged 16–80 years; median age 67 years) with at least one [18F]FDG-avid lesion were qualified for the study. Patients’ weight varied from 36 to 109 kg (median 73 kg) and body mass index (BMI) ranged from 15.1 to 37.2 (median 25.2). Patient characteristics and clinical indications for the [18F]FDG PET/CT imaging are shown in Table 1.
| Patient characteristic | Number [%] (N = 37) |
| --- | --- |
| Female | 16 (43.2%) |
| Male | 21 (56.8%) |
| | Median (range) |
| Uptake time | 59 min (49–74) |
| Body height | 165 cm (151–190) |
| Body weight | 73 kg (36–109) |
| Body mass index | 25.2 kg/m2 (15.2–37.3) |
| Age | 67 years (16–80) |
| Tumor characteristic | Number [%] (N = 37) |
| Hodgkin/non-Hodgkin lymphoma | 13 (35.1%) |
| Lung cancer | 11 (29.7%) |
| Colorectal cancer | 5 (13.5%) |
| Esophageal cancer | 3 (8.1%) |
| Breast cancer | 3 (8.1%) |
| Laryngeal cancer | 2 (5.4%) |
We identified 252 VOIs altogether: 104 lesion VOIs (at least one and no more than four lesions per patient) and 148 reference VOIs (37 for each of the four reference regions: liver, brain, bladder, and mediastinum). SUVmax values obtained with all the image reconstruction and enhancement methods were compared with the SUVmax values measured on the original images (1.5 min/bed OSEM). The differences in SUVmax values, in absolute numbers and as percentages, are presented in Table 2. SUVmax values on AI-enhanced images were lower than on unenhanced standard OSEM images, but the difference was smaller with SP2 (median difference for SP1 was −11.89%, for SP2 −4.82%; Tab. 2). Still, all new reconstruction methods showed a strong positive correlation with the original OSEM 1.5 min/bed data. For the images reconstructed with QClear™, the trend was reversed: SUVmax values were higher than on unenhanced OSEM images (median 6.12%). For lesions with SUVmax values below 4.0, the decrease in measured SUVmax was much smaller with SP2 (for the liver reference median −5.6%, for lesions median −6.04%). Statistical analysis showed no difference between OSEM 1.5, SP2/OSEM, and SP2/QClear, with the highest r-Pearson and ICC coefficients for data reconstructed with SP2 (Tab. 3).
| 1 min/bed | Statistic | Reference regions: Liver | Reference regions: Brain | Reference regions: Bladder | Reference regions: Mediastinum | Lesions: SUVmax < 4 | Lesions: All SUVmax values |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SP1/OSEM | Median | −0.15 (−6.49%) | −0.17 (−2.04%) | 0.07 (0.58%) | −0.19 (−9.58%) | −0.61 (−11.51%) | −0.50 (−11.89%) |
| SP2/OSEM | Median | −0.14 (−5.60%) | −0.17 (−2.03%) | −0.15 (−0.86%) | −0.10 (−5.28%) | −0.31 (−6.04%) | −0.25 (−4.82%) |
| SP2/QClear | Median | −0.45 (−16.55%) | −0.15 (−1.81%) | 0.95 (3.62%) | −0.29 (−14.29%) | 0.63 (11.31%) | 0.38 (6.12%) |
| Dataset | Measure | Lesions: SUVmax < 4 (normal distribution) | Lesions: All SUVmax values (inconsistent with normal distribution) |
| --- | --- | --- | --- |
| SP1/OSEM | r-Pearson correlation coefficient | 0.9058 | – |
| SP1/OSEM | ICC intraclass correlation | 0.9498 | – |
| SP1/OSEM | Rank-order correlation R Spearman | – | 0.9863 |
| SP1/OSEM | Kendall coefficient | – | 0.9931 |
| SP2/OSEM | r-Pearson correlation coefficient | 0.9166 | – |
| SP2/OSEM | ICC intraclass correlation | 0.9545 | – |
| SP2/OSEM | Rank-order correlation R Spearman | – | 0.9913 |
| SP2/OSEM | Kendall coefficient | – | 0.9956 |
| SP2/QClear | r-Pearson correlation coefficient | 0.8459 | – |
| SP2/QClear | ICC intraclass correlation | 0.8746 | – |
| SP2/QClear | Rank-order correlation R Spearman | – | 0.9710 |
| SP2/QClear | Kendall coefficient | – | 0.9855 |
Statistical analysis showed that for lesions with SUVmax in the range 2.0 to 4.0, SP2 correlated strongly with OSEM: the r-Pearson correlation coefficient was 0.9166 and the ICC was 0.9545 (almost perfect reproducibility). This is also clearly visible on the Bland–Altman plot (Fig. 1).
Also for lesions with SUVmax < 4, SP2/OSEM results showed the best correlation with the original OSEM data (Fig. 2).
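For readers unfamiliar with the Bland–Altman comparison mentioned above, the sketch below shows the conventional construction: the bias is the mean of the paired differences, and the limits of agreement are the bias plus or minus 1.96 sample standard deviations. How the plot in Fig. 1 was actually generated is not described in the text, so this is only an assumed, generic version, and the input values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(reference, comparison):
    """Return (bias, lower limit, upper limit) of agreement.

    Bias is the mean of the paired differences (comparison - reference);
    the limits are the bias +/- 1.96 sample standard deviations.
    """
    if len(reference) != len(comparison):
        raise ValueError("paired lists must have the same length")
    diffs = [c - r for r, c in zip(reference, comparison)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


def plot_points(reference, comparison):
    """(pair mean, pair difference) coordinates, one point per VOI."""
    return [((r + c) / 2, c - r) for r, c in zip(reference, comparison)]


# Hypothetical SUVmax pairs (illustration only).
osem_full = [2.0, 3.0, 4.0, 5.0]
sp_reduced = [1.8, 2.9, 3.7, 4.8]

bias, lower, upper = bland_altman(osem_full, sp_reduced)
print(f"bias={bias:.3f}, limits=({lower:.3f}, {upper:.3f})")
print(plot_points(osem_full, sp_reduced))
```

A bias close to zero with narrow limits indicates that the two datasets can be used interchangeably for SUVmax measurement; a consistent negative bias would correspond to the underestimation reported in Table 2.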
In the qualitative evaluation, image quality on SP1/OSEM was generally good, with no apparent artifacts. The images were not equivalent to the original OSEM images: some lesions in the mediastinum were hard to detect, although they were still identifiable. SP2/OSEM provided more detailed images (less smoothing) of almost the same quality as the original 1.5 min/bed standard reconstruction, and much better than the 1 min/bed OSEM reconstruction (Fig. 3).
Discussion
The study was performed in parallel with the implementation of SubtlePET™ reconstruction in clinical routine. The implementation was based on a previous assessment by Katsari et al. [13], and our study was carried out in a similar timeframe to the already published work of Weyts et al. [17] and Bonardel et al. [18]. The main goal was to reduce the dose applied to the patient (by 1/3) with no negative effect on image quality and clinical confidence. A reduction of the injected [18F]FDG activity could also improve the cost-effectiveness of the PET/CT procedure. In contrast to the very encouraging published data, we received some negative feedback from physicians reporting PET/CT examinations. The physicians mostly complained of the low visibility of small changes and of lower SUVmax values, which in some cases affected the clinical interpretation of the lesions. We therefore decided to perform quantitative analyses using the normal dose (time) and to simulate a lower dose retrospectively by reconstructing the images with the time per bed shortened by 1/3, i.e. 1 min/bed instead of 1.5 min/bed.
The initial results obtained with version 1 of the software (SP1/OSEM) confirmed some of the observed problems, such as excessive smoothing of the images, lower SUVmax values, and potentially lower detectability of small [18F]FDG-avid lesions. Once the manufacturer developed a new version of the algorithm (SP2), addressing most of the raised issues, we decided to reevaluate the data with the new version.
We were able to reprocess previously acquired data with the new algorithm, applying it also to 1 min/bed data reconstructed with QClear™ reconstruction that is routinely used in our center, and to repeat all the measurements for all four types of data.
After a direct comparison of original clinical data and the reduced time DL algorithm outputs, the comparison of multiple outputs was conducted. We could see not only the impact of the AI-based algorithm on the lower count/dose or time, but also the differences between the two versions of the algorithm, and the impact of the reconstruction algorithm used on the final image.
It has to be pointed out that the DL algorithm is applied to already reconstructed and corrected (for attenuation, scatter, well counter, etc.) images, working on already statistically modified images and not on the original raw data.
The final number of patients in the cohort was slightly lower than originally planned, but it was still large enough to provide a good representation of different clinical diagnoses, ages, sexes, and body mass indices.
The main limitation of the study is its single-center, single-scanner design. We are planning to perform a larger study on data obtained from 3–4 scanners in different centers. Another limitation is related to the comparison method, as only one dose of [18F]FDG could be administered to each patient due to radiation protection restrictions. In contrast to the study by Katsari et al. [13], and similarly to Weyts et al. [17] and Bonardel et al. [18], we decided to keep the standard dose and acquisition time and to reconstruct the images retrospectively with a shorter time (1/3 reduction) to simulate a lower dose, rather than to administer lower activities and acquire for a longer time to simulate the normal dose. This also allowed us to check whether the subjects had any lesions visible on the standard OSEM images before reprocessing the images with the DL algorithm, in order to ascertain the presence of lesions suitable for evaluation.
It should also be pointed out how we, as users, classified the DL algorithm in question. While in other modalities (such as CT or MR) AI already provides some algorithms supporting the clinical decisions of the readers, the SubtlePET algorithm is still only an image enhancement tool with no support for clinical interpretation.
We also need to remember that we were working on the already approved and registered algorithm, while many other centers try to develop and test their own methods [11, 12, 20].
Conclusions
The evaluated AI-based image enhancement can be used to accelerate PET acquisitions by one-third without compromising quantitative SUVmax values and image quality compared with acquisitions of standard duration. Equivalently, it can be used to optimize PET images obtained with a radiopharmaceutical activity reduced by one-third. Consequently, routine use of the image enhancement may increase PET/CT scanner throughput or reduce the patient’s radiation burden. Furthermore, the analyzed AI enhancement can be used on PET images processed with different reconstruction algorithms. However, as shown by our experience with different AI algorithms, one should be aware of their impact on SUVmax values and lesion detectability.
Conflict of interests
The authors declare no conflict of interests.
Funding
No funding was necessary.