Democratizing Video Human Activity Recognition
Orellana Trullols, Guillem
The exponential growth of available video sources, such as smartphones, surveillance cameras, and video-sharing platforms, together with recent advances in video encoding, storage, and computational resources, has encouraged researchers to further explore the ability of computers to understand videos. This thesis establishes a deep learning framework, based on good practices, for the field of human activity recognition in videos. The framework emphasizes the reproducibility of experiments and encourages the use of techniques that maximize the learning capabilities of video understanding models. Our main contributions are the open-sourcing of an Activity Recognition (AR) Python package and the creation of a reduced dataset called SARD, alongside an ablation study demonstrating that having access to only a small number of annotated videos does not prevent obtaining a robust video classifier.