
Impact of the eGVHD App on GvHD assessment
Figure 1. Study Design. APP: eGVHD App; GvHD: graft-versus-host disease. *E.g. own knowledge, 'fast facts' sheets, scoring sheets, standard operating procedures, copies of original guideline publications, or any other chosen resource.
Pre-test user current standard practice and technology access/acceptance
The Glucksberg [10] and the NIH 2014 criteria [9] were the GvHD assessment guidelines that healthcare professionals most frequently reported using in clinical practice (Table 2). Most professionals reported basing their usual GvHD evaluation on their own knowledge (n=44, 57%), the NIH 2014 GvHD evaluation sheet [9] (n=17, 22%), and/or a self-designed scoring paper document (n=16, 21%). The use of standard criteria to assess GvHD was reported as important (median score of 7 on a Likert scale of 1 to 10, IQR: 4, range 1-10), but performed with a relatively low level of confidence (median score of 5 on a Likert scale of 1 to 10, IQR: 4, range 1-9). The top four GvHD assessment problems spontaneously reported were: lack of knowledge or experience (n=23), time constraints (n=16), lack of data in the medical files (n=7), and the complexity of the guidelines (n=5).
During the workshop, the “No APP” group planned to rely essentially on their own knowledge (n=24, 62%), the NIH 2014 GvHD evaluation sheet [9] (n=9, 23%), the NIH 2005 GvHD evaluation sheet [11] (n=6, 15%), a self-designed scoring document (n=6, 15%), and/or other methods (n=7, 18%) (Table 2).
Accuracy of GvHD assessment
The total number of correctly evaluated clinical vignettes was higher in the “APP” group than in the “No APP” group (Table 3). More specifically, for the whole GvHD test package (maximum obtainable score: 10), participants in the “APP” group had a median of 10 correct answers for diagnosis (IQR 1; range 5-10), compared to a median of 6.5 (IQR 3; range 2-9) in the “No APP” group. For severity assessment, the “APP” group scored a median of 9 vignettes correctly (IQR 2; range 2-10) compared to a median of 4.5 (IQR 3; range 1-7) in the “No APP” group.
Individual results for each vignette are shown in Online Supplementary Table S3. Overall, the odds of being correct in the “APP” group were 6.14 times higher for diagnosis (95%CI: 2.83-13.34) and 6.29 times higher for severity scoring (95%CI: 4.32-9.15) than in the “No APP” group (P<0.001).
All pre-specified sub-analyses were performed as planned. The GvHD assessment of the “APP” group remained superior for both acute and chronic GvHD separately, with a significantly stronger effect in acute GvHD (OR=17.89, 95%CI: 8.47-37.79) compared to chronic GvHD (OR=4.34, 95%CI: 2.79-6.74) (P<0.001), and for all levels of severity scoring, except for aGvHD grade I. The effect of the App was more apparent for higher levels of severity (P=0.034) for both aGvHD and cGvHD. The strength of the effect did not significantly depend on center (Online Supplementary Figure S1) or professional background (Online Supplementary Figure S2). Similarly, neither the age of the user (Online Supplementary Figure S3), the number of GvHD patients seen per week (Online Supplementary Figure S4), nor self-reported comfort with using GvHD guidelines (Online Supplementary Figure S5) seemed to mitigate the superior performance of the “APP” group.
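As a reading aid for the odds ratios reported in this section, the following minimal sketch checks that each point estimate sits at the geometric midpoint of its 95% confidence interval and recovers the implied standard error of the log odds ratio. It assumes the intervals are Wald-type intervals built on the log-odds scale, which this section does not state; the function name `log_scale_summary` and the constant `Z_95` are illustrative and not taken from the study.

```python
import math

# Odds ratios and 95% confidence limits exactly as reported in the text:
# (point estimate, lower limit, upper limit).
reported = {
    "diagnosis (overall)": (6.14, 2.83, 13.34),
    "severity scoring (overall)": (6.29, 4.32, 9.15),
    "acute GvHD": (17.89, 8.47, 37.79),
    "chronic GvHD": (4.34, 2.79, 6.74),
}

# Standard normal quantile for a two-sided 95% interval.
Z_95 = 1.959964


def log_scale_summary(lower, upper):
    """Return (geometric midpoint of the CI, implied SE of the log odds ratio).

    ASSUMPTION: the interval is symmetric on the log scale (Wald-type),
    i.e. exp(log(OR) +/- Z_95 * SE). The source does not confirm this.
    """
    midpoint = math.sqrt(lower * upper)
    se_log_or = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return midpoint, se_log_or


for label, (odds_ratio, lower, upper) in reported.items():
    midpoint, se = log_scale_summary(lower, upper)
    print(
        f"{label}: reported OR={odds_ratio:.2f}, "
        f"CI midpoint={midpoint:.2f}, implied SE(log OR)={se:.3f}"
    )
```

If the assumption holds, the midpoint should match the reported odds ratio up to rounding, and a wider interval on the log scale (as for acute GvHD) corresponds to a larger standard error rather than to a weaker effect.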
Agreement between participant results and the expert gold standard for diagnosis and severity scoring is highlighted in the diagonal of Tables 4 and 5, showing the superior performance of the “APP” group. For diagnosis, the most consistent errors of the “No APP” group were seen for case-vignettes relating to ‘Overlap cGvHD’ and ‘Late aGvHD’, which both tended to be confused with ‘Classic cGvHD’. The highest discrepancies between the “No APP” group and expert acute GvHD severity scoring results were seen in ‘grade II’ (which tended to be graded according to the cGvHD criteria) and ‘grade IV’ aGvHD (which was essentially mistaken for ‘grade III’). Inconsistencies in chronic GvHD severity scoring were seen across all grades. The most frequent error in the “APP” group was a slight overes-
haematologica | 2018; 103(10)