The HAQ-DI is a validated assessment of function that effectively discriminates active treatment from placebo [20] and predicts key RA outcomes, including work disability and mortality [2]. It is frequently used in RA clinical trials, observational studies, and daily patient care, and is considered the gold standard measurement of function in rheumatology [1, 4]. Over the years, many approaches have been used to determine a clinically significant improvement in HAQ-DI. Many of these have relied on anchor-based assessments involving either subjective (eg, patient’s view of their overall disease status) or objective (eg, documented work disability) measures [6, 9, 12, 13, 19]. Some analyses were based on population-based means [8], whereas others were based on between-patient differences [6, 10, 12]. Reported HAQ-DI MCIDs vary widely between studies, and there is concern about the accuracy of calculations based on an ordinal rather than interval scale [14].
Our approach to determining a valid criterion for HAQ-DI improvement is different from previous efforts: our goal was to establish a change in HAQ-DI that exceeded long-term random fluctuation within an individual patient on stable therapy. Long-term changes encompass short-term measurement variability as well as nonsystematic changes in disease activity during stable therapy. The short-term test-retest reliability of the HAQ-DI is quite high, as indicated by an intraclass correlation of 0.897 (95% confidence interval, 0.855–0.927) for two assessments taken 1 to 2 days apart [21]. However, patients in rheumatology clinical care are typically seen at 3- to 6-month intervals, so long-term variability is more relevant to outcomes observed during clinical care.
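As a rough illustration of what a threshold above random within-patient fluctuation can look like, the sketch below uses a classical test theory formula for the critical difference between two scores, z × SD × √(2(1 − r)), where r is the correlation between two assessments made on stable therapy. This is an assumption for illustration only: the exact derivation used in this study is not restated in this section, and the function name, inputs, and default z value are hypothetical.

```python
import math
import statistics


def critical_difference(first, second, z=1.96):
    """Illustrative critical difference between two scores of one patient.

    Assumption (not taken from the study): classical test theory, where the
    standard error of a difference is SD * sqrt(2 * (1 - r)).
    `first` and `second` are paired scores of patients on stable therapy.
    """
    if len(first) != len(second) or len(first) < 3:
        raise ValueError("need at least three paired assessments")

    mean_a = statistics.fmean(first)
    mean_b = statistics.fmean(second)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary between patients")
    r = cov / math.sqrt(var_a * var_b)  # Pearson correlation

    sd = statistics.stdev(first)  # between-patient SD at the first assessment
    # Clamp guards against tiny negative values from floating-point rounding.
    return z * sd * math.sqrt(max(0.0, 2.0 * (1.0 - r)))
```

Under these assumptions, a change smaller than the returned value cannot be distinguished from random fluctuation between the two assessments.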
In a discovery cohort (N = 1645) of patients on stable therapy with moderate disease activity and a mean disease duration of 10.9 years, the change required to exceed normal long-term variation was a HAQ-DI improvement (decrease) of ≥0.68 points. Of the various MCIDs previously reported, the one closest to the HAQ-DI-dcrit value is the 0.74 “really important difference” determined from objective reports of work disability [9]. In the full patient cohort (N = 2740), 22.1% achieved a HAQ-DI-dcrit response at month 6 after initiation of adalimumab therapy. Approximately 70% of patients who achieved a HAQ-DI-dcrit response at month 6 retained it at months 12 and 24. The stability of the HAQ-DI-dcrit criterion over 18 months is especially noteworthy given that disease-related deterioration in function occurs over time in patients with RA [22]. In contrast, patients in the small improvement subgroup showed considerable variation in HAQ-DI responses at subsequent time points, with some improving and some deteriorating.
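The retention of a month-6 response at later visits can be computed from per-patient scores along the following lines. This is a minimal sketch: the function name and input layout are hypothetical, and it assumes that a response at a later visit is judged against the baseline score using the same ≥0.68 threshold.

```python
DCRIT = 0.68  # HAQ-DI-dcrit threshold reported in this study


def retention_rate(baseline, month6, later):
    """Share of month-6 HAQ-DI-dcrit responders who still respond at a later visit.

    Improvement is a decrease in HAQ-DI, so change = baseline - follow-up.
    Returns None when nobody responded at month 6.
    """
    responders = [
        (base, late)
        for base, mid, late in zip(baseline, month6, later)
        if base - mid >= DCRIT
    ]
    if not responders:
        return None
    retained = sum(1 for base, late in responders if base - late >= DCRIT)
    return retained / len(responders)
```

Calling the function once with month-12 scores and once with month-24 scores would give the two retention figures separately.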
Our observation that achievement of a HAQ-DI MCID of 0.22 is in some cases due to random variation, rather than an improvement in function, is in keeping with a previous study by Wolfe et al. involving 50 patients with RA followed over approximately 16 years [23]. This study found that the HAQ-DI within-patient variation between assessments (approximately one per year) was 0.436, only slightly below the between-patient variation of 0.596, and almost twice as large as an MCID of 0.22. It is likely that the extensive within-patient variation contributes to the high rates of HAQ-DI MCID achievement observed in some clinical trials. In one recent study, 43% of patients in the placebo arm of a randomized trial achieved a HAQ-DI MCID of 0.22 at 3 months (prior to being switched to active treatment) [7].
An examination of baseline patient characteristics based on the magnitude of HAQ-DI change at month 6 showed that the subgroup achieving a HAQ-DI-dcrit improvement at month 6 had a lower mean age, lower BMI, and shorter disease duration than patients in the subgroups with a small HAQ-DI improvement (between the frequently used MCID of 0.22 and 0.68) or no improvement (< 0.22). Baseline mean HAQ-DI scores were somewhat higher in the HAQ-DI-dcrit subgroup than in the other subgroups, perhaps because responder criteria are easier to achieve with high baseline disease activity [24].
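The three subgroups described above can be expressed as a simple classification rule. The sketch below is illustrative only: the function name and labels are hypothetical, and treating an improvement of exactly 0.22 as a small improvement is an assumption (in practice HAQ-DI is typically scored in steps of 0.125, so neither boundary is hit exactly).

```python
DCRIT = 0.68  # HAQ-DI-dcrit threshold from this study
MCID = 0.22   # frequently used HAQ-DI MCID


def classify_response(baseline, month6):
    """Assign a patient to a subgroup by HAQ-DI improvement at month 6.

    Improvement is a decrease in HAQ-DI, so a worsening score yields a
    negative improvement and falls into the "none" subgroup.
    """
    improvement = baseline - month6
    if improvement >= DCRIT:
        return "dcrit"
    if improvement >= MCID:
        return "small"
    return "none"
```

For example, a patient moving from 1.5 to 0.75 would be classified as a HAQ-DI-dcrit responder, whereas a move from 1.5 to 1.0 would count only as a small improvement.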
Because the derivation of the HAQ-DI-dcrit was based on statistical parameters and not on patient-centered anchors, it was critical to evaluate whether a HAQ-DI-dcrit response was associated with clinically relevant outcomes. We found that patients achieving a HAQ-DI-dcrit response at month 6 not only had higher rates of HAQ-DI remission at months 6 and 12, but also markedly higher rates of DAS28 remission and therapeutic responses for DAS28, pain, fatigue, and patient global health than patients in the other subgroups. Similarly, mean values for the objective assessments of tender and swollen joint counts were lower in the group achieving a HAQ-DI-dcrit response. It is perhaps not surprising that a more stringent functional response criterion is associated with better function at later time points. However, the association between the HAQ-DI-dcrit criterion and other outcomes, such as DAS28 remission and improvement in patient-reported outcomes, indicates that HAQ-DI-dcrit functional improvements are linked to meaningful differences in subsequent patient clinical status compared with the small improvement and no improvement groups.
Using a stepwise regression model, we identified change in pain from month 0 to month 6 as the most important predictor of change in HAQ-DI during the first 6 months of adalimumab therapy; this variable accounted for > 25% of the HAQ-DI change variance observed in this model. High baseline pain was a negative predictor for HAQ-DI improvement. Other studies concur on the impact of pain on function [16, 23, 25, 26]. Pain has been identified as the largest component of HAQ-DI [23] and an explanatory variable for all subdimensions of this functional assessment tool [26]. In addition to being correlated with function, pain is also strongly associated with DAS28; 68% of patients achieving a DAS28 therapeutic response, as assessed by the DAS28-dcrit, also achieved a significant improvement in pain [16]. Together, these data suggest that pain is an important driver of therapeutic outcomes. We further identified high baseline HAQ-DI as a positive predictor for improvements in HAQ-DI from month 0 to month 6, likely due to the greater window for improvement in patients with high baseline scores. As others have observed, one of the most important drawbacks of HAQ-DI as a functional assessment is a floor effect in which patients with low baseline HAQ-DIs cannot experience significant HAQ-DI decreases despite clinical improvement [1].
This study has several important limitations. Although the HAQ-DI-dcrit was derived from a large sample size, the discovery cohort was limited to German patients preparing to initiate adalimumab therapy. Accordingly, patients with different ethnicities or milder or earlier disease may have a different HAQ-DI-dcrit limit than the one reported here. As our data indicate, the HAQ-DI-dcrit for patients with baseline HAQ-DI < 1 is 0.597, rather than the higher number we used as a conservative value in this study. It is therefore possible that the HAQ-DI-dcrit used in the study reported here is too high for patients with milder RA. We hope our statistical methods will be applied to varied groups of patients in other countries to provide insights into variations in HAQ-DI-dcrit values in different populations and with different disease severities. In addition, it is important to note that individual patients may experience meaningful benefits with HAQ-DI improvements lower than the statistically determined HAQ-DI-dcrit. However, as we have shown in this study, on a population-wide basis lower HAQ-DI improvements may be due to random fluctuation and are unlikely to be as clinically relevant or as stable as a HAQ-DI-dcrit response. We acknowledge that patients who initiate treatment with good physical function are not well suited for this measure because of the fairly large change required to achieve a HAQ-DI-dcrit response; we excluded patients who were in functional remission (HAQ-DI < 0.5) from our analyses. As noted previously, floor effects (the inability of patients with low baseline HAQ-DIs to experience significant HAQ-DI decreases despite clinical improvement) are an issue with the HAQ-DI, and this tool is not appropriate for detecting change within the range of normal physical function [1].