This work is the first to take advantage of recurrent neural networks to predict
influenza-like illness (ILI) dynamics from various linguistic signals extracted from social
media data. Unlike other approaches that rely on time series analysis of historical ILI
data [1, 2] and state-of-the-art machine learning models [3, 4], we build Long
Short-Term Memory (LSTM) architectures and evaluate their predictive power for
nowcasting (predicting in "real time") and forecasting (predicting the future) ILI
dynamics in the 2011–2014 influenza seasons. To build our models, we integrate
information that people post on social media, e.g., topics, stylistic and syntactic patterns,
emotions and opinions, and communication behavior. We then quantitatively evaluate
the predictive power of different social media signals and contrast the performance of
state-of-the-art regression models with that of neural networks. Finally, we combine ILI
and social media signals to build joint neural network models for ILI dynamics
prediction. Unlike the majority of existing work, we focus on developing
models for local rather than national ILI surveillance [1], and for military rather
than general populations [3], in 26 U.S. and six international locations.
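
The distinction between nowcasting and forecasting can be illustrated with a minimal sketch of how weekly data might be arranged into training pairs for a sequence model. This is not the authors' pipeline: the function name `make_windows`, the `lookback` and `horizon` parameters, and the toy values are all assumptions made for illustration. Each window of weekly social media signals is paired with the ILI value `horizon` weeks after the window's last week, so `horizon=0` corresponds to nowcasting and `horizon>0` to forecasting.

```python
from typing import List, Sequence, Tuple

# One training pair: a window of weekly feature vectors and a target ILI value.
Window = Tuple[List[Sequence[float]], float]


def make_windows(
    signals: Sequence[Sequence[float]],
    ili: Sequence[float],
    lookback: int,
    horizon: int,
) -> List[Window]:
    """Pair each `lookback`-week window of signals with the ILI value
    `horizon` weeks after the last week of the window.

    horizon == 0 -> nowcasting (target is the window's final week)
    horizon > 0  -> forecasting (target lies in the future)
    All names here are hypothetical, chosen for this sketch only.
    """
    if len(signals) != len(ili):
        raise ValueError("signals and ili must cover the same weeks")
    if lookback < 1 or horizon < 0:
        raise ValueError("lookback must be >= 1 and horizon >= 0")

    pairs: List[Window] = []
    # `end` is the index of the last week included in the input window.
    for end in range(lookback - 1, len(ili) - horizon):
        window = list(signals[end - lookback + 1 : end + 1])
        pairs.append((window, ili[end + horizon]))
    return pairs


# Toy data (made up): two social media features per week, five weeks.
weekly_signals = [[0.1, 3.0], [0.2, 2.0], [0.4, 5.0], [0.3, 4.0], [0.5, 6.0]]
weekly_ili = [1.0, 1.5, 2.5, 2.0, 3.0]

nowcast_pairs = make_windows(weekly_signals, weekly_ili, lookback=2, horizon=0)
forecast_pairs = make_windows(weekly_signals, weekly_ili, lookback=2, horizon=1)
```

In this arrangement a joint model of the kind described above could append lagged ILI values to each weekly feature vector; that extension is likewise an assumption and is not shown.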
Our approach demonstrates several advantages: (a) Neural network models learned
from social media data outperform previously used
regression models. (b) Previously under-explored language and communication behavior
features are more predictive of ILI dynamics than syntactic and stylistic signals
expressed in social media. (c) Neural network models learned exclusively from social
media signals perform comparably to or better than models learned from historical
ILI data; thus, social media signals could potentially be used to accurately
forecast ILI dynamics in regions where historical ILI data is not available. (d)
Neural network models learned from combined ILI and social media signals significantly
outperform models that rely solely on historical ILI data, which underscores the
potential of alternative public data sources for ILI dynamics prediction. (e) Location-specific
models outperform previously used location-independent models (e.g., U.S. only). (f)
Prediction results vary significantly across geolocations, depending on the amount of
social media data available and on ILI activity patterns.
Revised: March 8, 2018
Published: December 15, 2017
Citation
Volkova S., E.M. Ayton, K. Porterfield, and C.D. Corley. 2017. Forecasting Influenza-like Illness Dynamics for Military Populations using Neural Networks and Social Media. PLoS One 12, no. 12:e0188941. PNNL-SA-124148. doi:10.1371/journal.pone.0188941