When the desired pre-defined actions, behaviors, or other categories are known a priori, video classification and recognition models can be trained to detect those categories and locate them within a video. Absent that information, one may still be tasked with identifying interesting portions of a video, a process which, if done manually, is onerous and time-consuming because it requires inspecting the video itself. Recognizing high-level interesting segments within a whole video is a general area of interest due to the ubiquity of video data; however, the size of the data makes storage, retrieval, and inspection of large video collections cumbersome. This problem motivates the task of generating shortened clips that highlight the primary content of a video, relieving the burden of watching the entire video. This paper presents an unsupervised method for creating shortened clips of videos, enabling rapid review of the most interesting content within a video. Our method uses features extracted from pre-trained action recognition models as input to online moving-window robust principal component analysis to generate summaries. The procedure is evaluated on a publicly available video summarization dataset and achieves performance comparable to the state of the art in an un-augmented setting while requiring no training.
Published: September 23, 2020 | Revised: December 9, 2020
Citation
Claborne D.M., K. Pazdernik, S.J. Rysavy, and M.J. Henry. 2020. Video Summarization Using Deep Action Recognition Features and Robust Principal Components Analysis. In Proceedings of the 24th World Multi-Conference on Systemics, Cybernetics and Informatics (WMSCI 2020), September 13-16, 2020 (Online), edited by N.C. Callaos et al., 96-103. Winter Garden, Florida: International Institute of Informatics and Systemics (IIIS). PNNL-SA-152604.