
Prime Video presents on Video/Audio Quality in Computer Vision and hosts a Grand Challenge during WACV 2023

During the Winter Conference on Applications of Computer Vision (WACV), Prime Video’s Yongjun Wu and Sriram Sethuraman discussed Video/Audio Quality in Computer Vision, and Hai Wei presented the HDR VQM Grand Challenge awards.

In January 2023, Prime Video’s Yongjun Wu (Senior Principal Engineer) and Sriram Sethuraman (Senior Principal Scientist) presented at a workshop on Video/Audio Quality in Computer Vision at the Winter Conference on Applications of Computer Vision (WACV). Prime Video also hosted a High Dynamic Range Video Quality Measurement (HDR VQM) Grand Challenge, inviting the global research community to participate. At the workshop, Prime Video’s Hai Wei (Principal Research Scientist) presented the awards to the challenge winners.

Many machine learning (ML) tasks and computer vision (CV) algorithms are susceptible to video/audio quality artifacts, yet most visual learning and vision systems assume high-quality video/audio as input. In reality, noise and distortions are common in the video/audio capture and acquisition process, and artifacts can also be introduced during compression, transcoding, transmission, decoding, or rendering. These quality issues degrade the performance of learning algorithms, systems, and applications and, ultimately, can directly affect our customers’ experience.

Because of this potential impact, video/audio quality has become increasingly important in CV products, systems, and services, but has not yet received enough attention in the general computer vision and machine learning (CV/ML) community.

Therefore, it’s important to systematically investigate learning performance on video or audio input with varied quality issues (for example, noise, distortions, and artifacts). It’s equally important to leverage advances in learning to improve state-of-the-art video/audio quality assessment technologies, and to develop new learning-based video/audio quality improvement algorithms and applications.
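
As a minimal illustration of the first point, the hedged Python sketch below injects two common degradations (sensor noise and heavy JPEG compression) into images and measures how often a classifier still predicts correctly. The `model`, `images`, and `labels` objects are hypothetical placeholders, not artifacts from the workshop.

```python
# Minimal sketch (not from the workshop) of probing a vision model's
# robustness to synthetic quality degradations. The classifier `model`
# and the evaluation set `images`/`labels` are hypothetical placeholders;
# only the distortion functions are concrete.
import io
import numpy as np
from PIL import Image

def add_gaussian_noise(img: Image.Image, sigma: float = 15.0) -> Image.Image:
    """Simulate sensor/acquisition noise."""
    arr = np.asarray(img).astype(np.float32)
    noisy = arr + np.random.normal(0.0, sigma, arr.shape)
    return Image.fromarray(np.clip(noisy, 0, 255).astype(np.uint8))

def jpeg_compress(img: Image.Image, quality: int = 10) -> Image.Image:
    """Simulate heavy compression artifacts via a low-quality JPEG round trip."""
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

def accuracy_under_distortion(model, images, labels, distort) -> float:
    """Fraction of images still classified correctly after distortion."""
    correct = sum(
        model.predict(distort(img)) == label
        for img, label in zip(images, labels)
    )
    return correct / len(images)
```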

The workshop addressed topics related to video/audio quality in CV/ML, including:

  • Evaluating video/audio quality in CV/ML use-cases, such as object detection, segmentation, tracking, and recognition.
  • Analyzing, modeling, and learning the quality impact of video/audio acquisition, compression, transcoding, transmission, decoding, rendering, or display.
  • Novel video/audio quality assessment methodologies: full-reference, reduced-reference, and no-reference.
  • Video/audio quality issues in synthesized or computer-generated video/audio data.
  • Techniques to remove artifacts, such as shadows, glare, and reflections.
  • Techniques to improve quality, such as brightening, color adjustment, sharpening, inpainting, deblurring, denoising, dehazing, deraining, and demosaicing.
  • Video/image quality improvement in resolution, frame rate, color gamut, dynamic range (Standard Dynamic Range versus High Dynamic Range), blurring, noise, or lighting.
  • Datasets, statistics, and theory of video/audio quality, along with related research, applications, and system development.

During the full-day workshop, researchers from around the world presented their most recent work on the above topics in both oral and poster formats. For the HDR VQM Grand Challenge, participants submitted novel or improved VQM models that objectively predict HDR video quality in both full-reference and no-reference use cases. HDR video datasets with human subjective quality scores (as ground truth) were shared to facilitate VQM model training and testing, and prizes were awarded to the first-, second-, and third-place teams.
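
As background on how such submissions are typically scored, the sketch below compares a model’s predicted quality scores with the subjective mean opinion scores (MOS) using Spearman rank-order (SROCC) and Pearson linear (PLCC) correlation, the standard agreement measures in video quality research. The exact metrics and data format used by the challenge are assumptions here, not its official evaluation code.

```python
# Hedged sketch of a typical VQM evaluation protocol: compare predicted
# quality against subjective MOS with rank and linear correlation.
# This is an assumed protocol, not the challenge's official scoring code.
import numpy as np
from scipy.stats import spearmanr, pearsonr

def evaluate_vqm(predicted_scores, mos):
    """Return (SROCC, PLCC) between model predictions and subjective MOS."""
    predicted = np.asarray(predicted_scores, dtype=float)
    mos = np.asarray(mos, dtype=float)
    srocc, _ = spearmanr(predicted, mos)   # monotonic (rank) agreement
    plcc, _ = pearsonr(predicted, mos)     # linear agreement
    return srocc, plcc

# Toy example: a model whose ranking of clips matches the viewers' ranking.
srocc, plcc = evaluate_vqm([0.71, 0.42, 0.93, 0.55], [68, 40, 90, 52])
print(f"SROCC={srocc:.3f}, PLCC={plcc:.3f}")
```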

This workshop was organized by the following Amazon scientists: Zongyi Liu, Yarong Feng, Yuan Ling, Hai Wei, and Yixu Chen, with organizational support from Natalie Strobach (Senior Program Manager, Research Programs – Prime Video). The open-source HDR video dataset was provided by Professor Alan Bovik’s group at the Laboratory of Image and Video Engineering, University of Texas at Austin.

For more information about the Grand Challenge and the Video/Audio Quality in Computer Vision workshop, see the WACV 2023 website.
