Our paper was recently accepted for publication in IEEE OJEMB! 🥳 Here I will summarise our main findings and the impact of our work.
Note: This article is best viewed in ‘light mode’!
In recent years, the emergence of consumer digital technologies, such as smartphones, for healthcare applications has opened the possibility of developing rich, continuous, and objective measures of disease that can be administered remotely, outside of standard clinical settings. For instance, consumer smartphone devices can objectively characterise ambulation and fatigue in people with multiple sclerosis (PwMS), which are among the main symptoms of the disease. Importantly, it has been shown that identifying changes in PwMS impairment earlier is important for selecting and providing better therapeutic strategies.
In this work, we demonstrated how PwMS’ ambulatory patterns can be captured from smartphone inertial sensor-based measurements recorded during a daily “two-minute walk test” (2MWT), a remotely administered walking test that formed part of Roche’s seminal FLOODLIGHT study in MS. Deep networks offer state-of-the-art, data-driven approaches that are capable of learning smartphone sensor-based features capturing ambulatory information. Our study subsequently explored how deep convolutional neural networks (DCNNs) could be used to distinguish healthy controls (HC) from PwMS with mild or moderate disability, longitudinally, over the 24-week study.
Our results indicated that smartphone-based ambulatory severity outcomes could accurately estimate MS level of disability, as measured by the clinician-administered EDSS score. Furthermore, longitudinal severity outcomes were shown to accurately reflect individual participants’ level of disability over the study duration. Anecdotally, we even observed that these measurements might remotely capture MS relapse events: an increase in model-estimated symptom severity coincided with the dates on which two participants self-reported, using their smartphone application, that a relapse had occurred.
The symptoms of neurodegenerative diseases, such as multiple sclerosis (MS), frequently fluctuate over time and between patients, making it notoriously difficult to quantify the effect of therapeutic interventions and disease management techniques. Current in-clinic assessments are often too infrequent to track changes in MS impairment over time. People with MS (PwMS) experience a progressive decline in physical function and quality of life over time, which often leads to disability and difficulty performing many tasks of daily life (think about going shopping, getting dressed, taking your dog on a walk, or caring for your children or grandchildren). Currently, the gold-standard methods of measuring the impact of MS on daily life rely on clinical visits that may occur only every 3–4 months, with assessments depending on a combination of subjective clinician-determined scores; in MS, clinicians use the Expanded Disability Status Scale (EDSS).
Although MS follows a highly heterogeneous and subject-specific disease course, the disease profiles have been grouped into four clinical phenotypes based on disease progression.
In recent years, there has been a shift towards the adoption of body-worn sensors to objectively evaluate ambulatory performance in PwMS, circumventing the need for resource-intensive and expensive gait laboratory equipment, and also opening up the possibility of measuring physical function outside of standard clinical settings. These technologies can provide new data-driven metrics for clinical decision-making during in-clinic visits.
This study builds upon our previous investigations.
The FLOODLIGHT (FL) proof-of-concept (PoC) app was trialled in a 24-week, prospective study in PwMS and HCs (NCT02952911) to assess the feasibility of remote patient monitoring using smartphone (and smartwatch) devices.
The foundation of this work is a Deep Convolutional Neural Network (DCNN) that estimates participants’ MS severity from smartphone inertial sensor data recorded while participants performed a daily, at-home two-minute walk test (2MWT). The raw accelerometer sensor data from each 2MWT were partitioned into multiple vector sequences (epochs), and a DCNN was then trained to classify a given epoch as having been performed by a HC, PwMSmild or PwMSmod participant. The DCNN model implemented was previously introduced in Creagh et al. (2021).
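To make the epoch idea concrete, here is a minimal sketch of cutting one recording into fixed-length windows. The function name `partition_into_epochs`, the epoch length, and the step size are all illustrative assumptions; the windowing settings actually used in the study are not given in this post.

```python
from typing import List, Sequence


def partition_into_epochs(
    signal: Sequence[float], epoch_len: int, step: int
) -> List[List[float]]:
    """Split a 1-D signal into fixed-length epochs (windows).

    epoch_len and step are hypothetical parameters chosen for illustration.
    Trailing samples that do not fill a whole epoch are dropped.
    """
    if epoch_len <= 0 or step <= 0:
        raise ValueError("epoch_len and step must be positive")
    last_start = len(signal) - epoch_len
    return [
        list(signal[start:start + epoch_len])
        for start in range(0, last_start + 1, step)
    ]


# Toy stand-in for one accelerometer axis from a single walk test.
samples = [float(i) for i in range(10)]
epochs = partition_into_epochs(samples, epoch_len=4, step=2)
```

With a step smaller than the epoch length the windows overlap, which yields more training examples per test; whether overlap was used here is not something this sketch claims.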
The network outputs are interpreted as \(\hat y_k(\mathbf{x}_n,\mathbf{w}) = p(y_{k} = k|\mathbf{x}_n)\). As such, $\hat y_k$ can be thought of as the probability that a given epoch \(\mathbf{x}_n\) belongs to class $k$. A continuous estimate of severity (i.e., the predicted level of MS disability) can then be obtained by averaging the predicted class of every epoch in a test on a given assessment day, $d$, such that: \(\begin{align} \hat{y}_d= \frac{1}{N}\sum_{n=1}^{N} \underset{k}{\mathrm{argmax}} \; p(y_{k}=k | \mathbf{x}_n) \end{align}\) where $N$ is the number of windowed epochs for a given test date, $d$, and $k$ lies in the ordinal range $[0, 1, \dots, K]$. Therefore $\hat{y}_d$ is continuous, with $0\leq \hat{y}_d \leq K$, and can be conceptualised as a naive estimate of MS disease severity, mapping a predicted level of disability ranging from healthy to mild to moderate.
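As a rough illustration of this averaging step, the sketch below assumes that each epoch is assigned its most probable class and that classes are indexed 0 (HC), 1 (PwMSmild) and 2 (PwMSmod). The function name `daily_severity` and the probability values are made up for the example and are not study data.

```python
from typing import Sequence


def daily_severity(epoch_probs: Sequence[Sequence[float]]) -> float:
    """Mean predicted class index over all epochs of one walk test.

    epoch_probs[n][k] is the model's probability that epoch n belongs to
    class k. Each epoch contributes the index of its most probable class.
    """
    if not epoch_probs:
        raise ValueError("need at least one epoch")
    predicted = [
        max(range(len(probs)), key=lambda k: probs[k])
        for probs in epoch_probs
    ]
    return sum(predicted) / len(predicted)


# Hypothetical softmax outputs for four epochs of a single test.
probs = [
    [0.7, 0.2, 0.1],  # most probable class: 0
    [0.2, 0.5, 0.3],  # most probable class: 1
    [0.1, 0.3, 0.6],  # most probable class: 2
    [0.1, 0.6, 0.3],  # most probable class: 1
]
severity = daily_severity(probs)  # (0 + 1 + 2 + 1) / 4
```

Because the per-epoch predictions are integers, the average only becomes fine-grained when a test contains many epochs.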
Longitudinal trends of specific participants were examined as a time-series by considering the severity estimates $\hat{y}$ of repeated 2MWTs over all their available data for the duration of the FL study. The goal of this work was to perform longitudinal analysis of participants’ severity and visualise the average severity trends over time. First, missing 2MWT DCNN-based outcomes were imputed using piecewise linear interpolation (PLI), treating $\hat{y}$ as a time-series to fill in missing test severity observations on a given date. Note: imputed 2MWTs were only included when calculating the average trend estimates for individual participants, and not for model evaluation. Next, a simple trend estimation was applied to the time sequence of severity estimates ($\hat{y}$) across days ($d$) using a $d$-centred linear moving average filter (MAF). A 7-day window was used in order to capture the trends in $\hat{y}_d$ over the study duration.
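The two smoothing steps can be sketched roughly as follows. This is a simplified illustration rather than the study code: the helper names are hypothetical, gaps before the first or after the last observed test are left unfilled, and the window is simply shortened at the start and end of the series, which are both assumptions on my part.

```python
from typing import List, Optional


def interpolate_missing(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior gaps (None) by linear interpolation between neighbours."""
    filled = list(values)
    observed = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(observed, observed[1:]):
        for i in range(left + 1, right):
            frac = (i - left) / (right - left)
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled


def centred_moving_average(
    values: List[Optional[float]], window: int = 7
) -> List[Optional[float]]:
    """Average each day with its neighbours in a window centred on that day."""
    half = window // 2
    trend: List[Optional[float]] = []
    for d in range(len(values)):
        nearby = [
            v for v in values[max(0, d - half):d + half + 1] if v is not None
        ]
        trend.append(sum(nearby) / len(nearby) if nearby else None)
    return trend


# Hypothetical daily severity estimates; None marks a day without a test.
daily = [0.0, None, None, 3.0, 3.0, None, 1.0]
filled = interpolate_missing(daily)
trend = centred_moving_average(filled, window=7)
```

Keeping the imputed values out of model evaluation, as described above, means a structure like `filled` would only feed the trend line and never the classification metrics.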
In this work, we showed how a deep network classification model could (naïvely) estimate the level of participant disability from ordinal classification categories. Severity outcome estimates stratified across HC and PwMS groups and were strongly correlated with disease status ($r$: 0.75; $\rho$: 0.71, $p<0.001$), as measured by the EDSS, which is considered the ground-truth assessment in PwMS.
For instance, no misclassification of HC as PwMSmod was observed, or vice-versa, indicating that severity estimates were reflective of true disease status. More interestingly, those subjects at classification boundaries displayed severities representative of their clinical assessments. For instance, those with EDSS just above 3.5 (i.e. PwMSmod) were more often misclassified as PwMSmild than those with EDSS much greater than 3.5, implying that a reflective estimate of disease severity could be captured by transforming a DCNN model into a simple probabilistic outcome per subject.
The longitudinal patterns of healthy controls versus participants with varying manifestations of MS severity could be characterised by examining severity outcomes over the duration of the FL study for individual subjects. For instance, the figure below depicts examples of stable trends for a correctly classified HC and PwMSmod subject respectively. While both participants had some incorrect predictions, the mean severity prediction over all repeated tests reflected the participant’s true class grouping.
Evaluating subjects’ performance longitudinally suggested that severity estimates may be sensitive enough to capture MS-symptom worsening, particularly relapses. During the FL study, four participants experienced relapses which they self-reported using the application on their smartphones. Longitudinal analysis of the trajectories of daily severity estimates from these subjects revealed useful insights into the manifestation of relapses expressed in remote inertial sensor data. For instance, two subjects displayed an increased severity outcome up to and around the date of reporting a relapse (see figures below), suggesting that sensor-based ambulatory outcomes could potentially be sensitive enough to remotely capture relapse events.
We need to consider that missing data might also contain useful information relating to how a patient is feeling or their MS disability status. For example, the study participant below reported (non-relapse) adverse clinical events occurring on non-specified dates between weeks 8 and 12. Around these dates the participant’s adherence drops and they stop contributing daily 2MWTs. Obviously, this is an anecdotal example, but it poses the question: “did the adverse event affect this participant’s adherence or ability to perform the test?”
Indeed, missing data might be a marker of changes in disability status itself (i.e., that the data is not missing-at-random (MAR)). However, it is difficult to monitor patients if they’re not contributing data over long periods, and adherence will often be a problem when prescribing two minutes of walking every day for 24 weeks. Therefore, we must consider patient burden when developing our digital biomarker assessments and solutions. In my opinion, we need to consider how we can incorporate as much information as possible from passive approaches, collecting sensor data during daily life. For example, a smartwatch could unobtrusively collect sensor data characterising patients’ physical activity and ambulation. Swapping out some prescribed “active” tests, or reducing the frequency of assessments by supplementing with passive data, could help build more robust longitudinal models and help circumvent some of the problems associated with missing assessment data.
Despite the potential of smartphone-based outcomes to remotely monitor individual participants’ ambulatory function longitudinally, there are several limitations of this study which must be considered.
We believe that the work presented in this study is of considerable value, emphasising the potential of remote sensor outcomes to augment current in-clinic acquired patient information. The long-term remote monitoring of PwMS function could open up the space for true personalisation: clustering disease trajectories or similar patients, estimating the likelihood of disease progression, quantifying response to different treatments as a population or an individual, as well as catching the mutable patterns of MS disease that are only visible out-of-clinic and as a function of time. We believe that our work helps inform future digital health technology (DHT)-based study design, to better characterise the impact of MS remotely and, ultimately, to expand the use of DHT to develop more sensitive, and patient-centric, endpoints in clinical trials and real-world studies.
If you use or like our work, please consider citing us:
@article{creagh2022longitudinal,
author={Creagh, Andrew P. and Dondelinger, Frank and Lipsmeier, Florian and Lindemann, Michael and De Vos, Maarten},
journal={IEEE Open Journal of Engineering in Medicine and Biology},
title={Longitudinal Trend Monitoring of Multiple Sclerosis Ambulation Using Smartphones},
year={2022},
volume={3},
number={},
pages={202-210},
doi={10.1109/OJEMB.2022.3221306}
}