Publications

Language agnostic missing subtitle detection

This paper discusses the problem of missing transcription, where the subtitle blocks corresponding to some speech segments in digital entertainment content (DEC) are non-existent. We present a solution that augments the human correction process by automatically identifying the timings of the non-transcribed dialogues in a language-agnostic manner.

Honey Gupta, Mayank Sharma

Jan 02, 2023

Machine Learning

Exploring heterogeneous metadata for video recommendation with two-tower model

In this work, we propose adopting a two-tower model, in which one tower learns the user representation from watch history and the other learns effective representations for titles from their metadata.

Ainur Yessenalina, Ali Roshan Ghias

Jan 02, 2023

Computer Vision

Depth-guided sparse structure-from-motion for movies and TV shows

We propose a simple yet effective approach that uses a single-frame depth prior obtained from a pretrained network to significantly improve geometry-based SfM for our small-parallax setting.

Xiaohan Nie, Raffay Hamid

Jan 02, 2023

Computer Vision

On the importance of spatio-temporal learning for video quality assessment

The goal of this work is to assess the importance of spatial and temporal learning for production-related video quality assessment (VQA). In particular, it assesses state-of-the-art UGC video quality assessment perspectives on the LIVE-APV dataset, demonstrating the importance of learning contextual characteristics from each video frame, as well as capturing temporal correlations between frames.

Dario Fontanel, David Higham, Benoit Quentin Arthur Vallade

Jan 02, 2023

Computer Vision

A comprehensive empirical review of modern voice activity detection approaches for movies and TV shows

In this work, we present a thorough survey of DNN-based voice activity detectors (VADs) on DEC data in terms of their accuracy, area under the curve (AUC), noise sensitivity, and language-agnostic behavior.

Mayank Sharma, Raffay Hamid

Jan 02, 2023

Computer Vision

No-reference video quality assessment using space-time chips

We propose a new prototype model for no-reference video quality assessment (VQA) based on the natural statistics of space-time chips of videos. Space-time chips (ST-chips) are a new, quality-aware feature space which we define as space-time localized cuts of video data in directions that are determined by the local motion flow.

Yongjun Wu, Hai Wei

Jan 02, 2023

Machine Learning

A simple and efficient method for dubbed audio sync detection using compressive sensing

In this paper, we present a novel, accurate and efficient method for temporal sync detection between dubbed audio tracks and corresponding non-dubbed original-language audio tracks.

Avijit Vajpayee, Zhikang Zhang, Abhinav Jain, Vimal Bhat

Jan 02, 2023

Computer Vision

Subjective and objective quality assessment of high-motion sports videos at low-bitrates

We conducted the first large-scale study of medium- and low-bitrate videos from live sports for two codecs (Elemental AVC and HEVC) and created the Amazon Prime Video Low-Bitrate Sports (APV LBS) dataset.

Yongjun Wu, Hai Wei, Sriram Sethuraman

Jan 02, 2023

Computer Vision

Detection of audio-video synchronization errors via event detection

We present a new method and a large-scale database to detect audio-video synchronization (A/V sync) errors in tennis videos.

Yongjun Wu, Hai Wei, Zongyi Liu

Jan 02, 2023

Computer Vision

Towards better quality assessment of high-quality videos

To improve quality assessment of high-quality videos, a subjective study was conducted focusing on high-quality HD and UHD content with the Degradation Category Rating (DCR) protocol.

Deepthi Nandakumar, Sriram Sethuraman

Jan 02, 2023

Computer Vision

Assessment of subjective and objective quality of live streaming sports videos

We built a video quality database specifically designed for live streaming VQA research. The new video database is called the Laboratory for Image and Video Engineering (LIVE) Livestream Database. It includes 315 videos of 45 contents impaired by 6 types of distortions.

Yongjun Wu, Hai Wei, Sriram Sethuraman

Jan 02, 2023

Cloud and Scale

Differential cost analysis with simultaneous potentials and anti-potentials

We present a novel approach to differential cost analysis that, given a program revision, attempts to statically bound the difference in resource usage, or cost, between the two program versions. Differential cost analysis is particularly interesting because of the many compelling applications for it, such as detecting resource-use regressions at code-review time or proving the absence of certain side-channel vulnerabilities.

Ðorđe Žikelić, Bor-Yuh Evan Chang, Pauline Bolignano, Franco Raimondi

Jan 02, 2023

Publications

Prime Video is a great place to build and innovate at scale, but that’s only one part of the story. Our technologists also publish, teach, and engage with the worldwide research community.