
H.M. Schoemans et al.
Table 3. Graft-versus-host disease (GvHD) assessment accuracy and timing results.

Measure | APP (n=37) | No APP (n=40)

Results for the complete GvHD test package (median)
Correctly diagnosed vignettes | 10 (IQR 1; range 5-10) | 6.5 (IQR 3; range 2-9)
Correctly scored vignettes | 9 (IQR 2; range 2-10) | 4.5 (IQR 3; range 1-7)

Results for acute and chronic GvHD (median)
Correctly scored acute GvHD vignettes | 4 (IQR 0; range 2-4) | 2 (IQR 2; range 0-4)
Correctly scored chronic GvHD vignettes | 5 (IQR 1; range 0-6) | 3 (IQR 2.25; range 0-5)

Time needed to complete the whole GvHD test package
Mean time to complete all vignettes (minutes) | 48.84 (Std dev: 10.3; range 31-67) | 25.27 (Std dev: 9.76; range 9-54)

n: number; IQR: interquartile range; Std dev: standard deviation. The maximum number of correct answers for the whole package was 10 (4 for acute GvHD and 6 for chronic GvHD).
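To compare the two groups on a common scale, the median scores in Table 3 can be expressed as a percentage of the maximum attainable score. The following is a minimal Python sketch of that arithmetic only; the dictionary and function names are illustrative assumptions and are not part of the study, and the numbers are copied from the table.

```python
# Maximum attainable score per vignette subset (from the Table 3 footnote).
MAX_SCORE = {"all": 10, "acute": 4, "chronic": 6}

# Median number of correctly scored vignettes per group (from Table 3).
MEDIAN_CORRECT = {
    "APP": {"all": 9, "acute": 4, "chronic": 5},
    "No APP": {"all": 4.5, "acute": 2, "chronic": 3},
}


def median_percent_correct(group: str, subset: str) -> float:
    """Return the median correct score as a percentage of the maximum."""
    return 100 * MEDIAN_CORRECT[group][subset] / MAX_SCORE[subset]


if __name__ == "__main__":
    for group in MEDIAN_CORRECT:
        for subset in MAX_SCORE:
            pct = median_percent_correct(group, subset)
            print(f"{group:7s} {subset:8s} {pct:5.1f}%")
```

Because dividing by a constant preserves rank order, the median of the per-participant percentages equals the percentage computed from the median score.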
The lack of consensus and knowledge of the most recent guidelines was perhaps due to the low number of HCT patients seen per week, and probably partly explains the lower results obtained by the group using traditional methods. However, this also highlights the need to standardize GvHD evaluation within the HCT community, as recently advocated by a panel of GvHD experts.1 It is precisely in this context of lack of confidence and expertise in GvHD assessment that e-Tools, such as the eGVHD App, have the potential to increase the quality of data collection by allowing easy, reliable, user-friendly and intuitive access to the most up-to-date guidelines to any healthcare professional. Regrettably, we were unable to test the effect of the App specifically in smaller Belgian centers, as they declined the invitation to participate in this study. We are, therefore, unable to speculate on the generalizability of this tool in centers with lower transplantation volumes.
The limited number of vignettes also makes it challenging to draw any meaningful conclusions on specific subgroups or at the organ level. The significant difference in improved accuracy for aGvHD scoring compared to cGvHD scoring is probably due simply to the fact that each of the four aGvHD severity levels was evaluated by a single clinical vignette (instead of two per severity level for cGvHD). For instance, in the ‘late acute GvHD grade II’ clinical vignette, the largely incorrect final severity evaluation reported by the “No APP” group was partially conditioned by the fact that the distinction between acute and chronic GvHD had not been made in the first place. Moreover, the MAGIC criteria were not the standard reference for aGvHD for the majority of the participants, which could explain the exceptionally poor results for the grade IV aGvHD vignette when evaluated by the “No APP” group.
The limited number of observations also restricts our ability to draw any conclusions on the potential impact of using the App in the clinical setting to decide upon starting treatment, as the threshold to start therapy is linked to much broader categories than the ones described above (typically, any grade above or equal to ‘aGvHD grade II’ or ‘cGvHD moderate’ would qualify for treatment, depending on the general health status of the patient15-17). Treatment adaptations also rely on specific response criteria,18,19 which were not investigated in this project. Future studies, therefore, need to evaluate the use and impact of the eGVHD App in the clinic. This will also allow the evaluation of the App in situations where the patient does not present with GvHD, considering that the test package studied here only evaluated the tool in the context of GvHD-afflicted patients, precluding the evaluation of detection measures such as predictive values, sensitivity and specificity.
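One way to see why a test package containing only GvHD-afflicted cases limits these detection measures is to write out their standard definitions from a 2x2 confusion table. The Python sketch below is an illustration under stated assumptions: the function names and the counts in the usage example are hypothetical and are not data from this study.

```python
from typing import Dict, Optional


def safe_ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def detection_measures(tp: int, fp: int, fn: int, tn: int) -> Dict[str, Optional[float]]:
    """Standard detection measures from true/false positive/negative counts."""
    return {
        "sensitivity": safe_ratio(tp, tp + fn),  # detected among diseased
        "specificity": safe_ratio(tn, tn + fp),  # cleared among non-diseased
        "ppv": safe_ratio(tp, tp + fp),          # diseased among test-positive
        "npv": safe_ratio(tn, tn + fn),          # non-diseased among test-negative
    }


if __name__ == "__main__":
    # Hypothetical counts: every case has GvHD, so fp and tn are zero by design.
    print(detection_measures(tp=8, fp=0, fn=2, tn=0))
```

With no GvHD-free cases, the specificity denominator is zero, and the predictive values only reflect the artificial 100% prevalence of the test set rather than the performance of the tool.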
Further limitations of this study are the lack of repeated measures and the unnatural setting of clinical vignettes, which are unable to perfectly mirror the wide variations in GvHD presentation in real life and their relative incidence. This particular experimental design was chosen to simplify logistics, optimize healthcare professional participation, avoid patient stress, and keep respondent burden to a minimum. It also allowed for multiple experts to validate the GvHD assessment. Such an expert consensus is rarely obtained in clinical practice, but was considered to be the best gold standard available to date to serve as reference for the accurate scoring during GvHD assessment.
It therefore remains to be determined whether the App will also improve accuracy when used in real-life circumstances. Yet, even in this artificial setting, the low spontaneous GvHD scoring accuracy obtained in this evaluation with traditional methods (a median of 4.5 correctly scored vignettes out of a maximum of 10) is in line with the results of a previous validation study carried out in a more real-world setting. This study included actual patient examinations and showed that only 50-75% of freshly trained clinicians actually agreed with experts on the overall severity score of the evaluated chronic GvHD patients.6 Mitchell et al. concluded that a single training session was not sufficient to achieve consistently acceptable inter-rater agreement between novice healthcare practitioners and GvHD experts. Clinical training in GvHD physical examinations may thus be necessary to achieve reproducible severity assessment with high inter-rater reliability in practice. By ensuring the systematic assessment of all organs potentially affected by GvHD, the App can also serve as a training tool, aimed at making healthcare professionals ultimately independent of technological assistance.
The eGVHD App is currently limited to a calculator func-
haematologica | 2018; 103(10)

































































































