Research by Prime Video demonstrates that it’s vital to consider both spatial and temporal features when developing a model to estimate the visual quality of production-related content.
The goal of this work is to assess the importance of spatial and temporal learning for production-related VQA. In particular, it evaluates state-of-the-art UGC video quality assessment methods on the LIVE-APV dataset, demonstrating the importance of learning contextual characteristics from each video frame as well as capturing the temporal correlations between them.
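To make the spatial-plus-temporal idea concrete, here is a minimal illustrative sketch, not the actual model from this work: toy per-frame "spatial" statistics stand in for learned frame features, and frame-to-frame differences stand in for temporal correlation. All function names and descriptors below are assumptions for illustration only.

```python
import numpy as np

def spatial_features(frame: np.ndarray) -> np.ndarray:
    """Toy per-frame spatial descriptor: per-channel mean and std of
    pixel intensities (a stand-in for learned CNN frame features)."""
    return np.concatenate([frame.mean(axis=(0, 1)), frame.std(axis=(0, 1))])

def temporal_pool(frame_feats: np.ndarray) -> np.ndarray:
    """Pool frame features over time: the mean summarizes average
    content, while mean absolute frame-to-frame differences crudely
    capture temporal variation (e.g., flicker or motion artifacts)."""
    diffs = np.abs(np.diff(frame_feats, axis=0))
    return np.concatenate([frame_feats.mean(axis=0), diffs.mean(axis=0)])

def video_descriptor(video: np.ndarray) -> np.ndarray:
    """video has shape (T, H, W, C); returns a fixed-length descriptor
    combining spatial statistics with their temporal dynamics."""
    feats = np.stack([spatial_features(f) for f in video])
    return temporal_pool(feats)

# Example: an 8-frame RGB clip of random pixels.
rng = np.random.default_rng(0)
clip = rng.random((8, 32, 32, 3))
desc = video_descriptor(clip)
print(desc.shape)  # (12,)
```

A real VQA model would replace the hand-crafted statistics with deep spatial features and a learned temporal module (e.g., a recurrent or attention-based aggregator), but the two-stage structure, per-frame features followed by temporal pooling, is the same.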