Publications

Prime Video is a great place to build and innovate at scale, but that’s only one part of the story. Our technologists also publish, teach, and engage with the worldwide research community.

This paper discusses the problem of missing transcription, where the subtitle blocks corresponding to some speech segments in the DEC are missing. We present a solution that augments the human correction process by automatically identifying the timings of the non-transcribed dialogues in a language-agnostic manner.
In this work, we propose a two-tower model, in which one tower learns the user representation from their watch history, and the other tower learns effective representations for titles from metadata.
We propose a simple yet effective approach that uses a single-frame depth prior obtained from a pretrained network to significantly improve geometry-based SfM in our small-parallax setting.
The goal of this work is to assess the importance of spatial and temporal learning for production-related VQA. In particular, it evaluates state-of-the-art UGC video quality assessment approaches on the LIVE-APV dataset, demonstrating the importance of learning contextual characteristics from each video frame, as well as capturing temporal correlations between frames.
In this work, we present a thorough survey of DNN-based VADs on DEC data in terms of their accuracy, Area Under Curve (AUC), noise sensitivity, and language-agnostic behavior.
We propose a new prototype model for no-reference video quality assessment (VQA) based on the natural statistics of space-time chips of videos. Space-time chips (ST-chips) are a new, quality-aware feature space which we define as space-time localized cuts of video data in directions that are determined by the local motion flow.
In this paper, we present a novel, accurate and efficient method for temporal sync detection between dubbed audio tracks and corresponding non-dubbed original-language audio tracks.
We conducted the first large-scale study of medium and low-bitrate videos from live sports for two codecs (Elemental AVC and HEVC) and created the Amazon Prime Video Low-Bitrate Sports (APV LBS) dataset.
We present a new method and a large-scale database to detect audio-video synchronization (A/V sync) errors in tennis videos.
In this study towards better quality assessment of high-quality videos, a subjective study was conducted focusing on high-quality HD and UHD content with the Degradation Category Rating (DCR) protocol.
We built a video quality database specifically designed for live streaming VQA research. The new video database is called the Laboratory for Image and Video Engineering (LIVE) Livestream Database. It includes 315 videos of 45 contents impaired by 6 types of distortions.
We present a novel approach to differential cost analysis that, given a program revision, attempts to statically bound the difference in resource usage, or cost, between the two program versions. Differential cost analysis is particularly interesting because of the many compelling applications for it, such as detecting resource-use regressions at code-review time or proving the absence of certain side-channel vulnerabilities.
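
As an illustration of the two-tower idea mentioned in the recommendation abstract above, the following is a minimal sketch, not the authors' implementation. It assumes each tower is a single untrained linear projection, that the watch history is mean-pooled, and that relevance is a dot product. All function names, weights, and inputs here are hypothetical; the abstract does not describe the actual architecture.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Multiply matrix w (rows x cols) by vector x (length cols)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a non-empty list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def user_tower(watch_history: List[Vector], w_user: Matrix) -> Vector:
    """User tower (assumed form): pool watched-title features, then project."""
    return matvec(w_user, mean_pool(watch_history))


def title_tower(metadata: Vector, w_title: Matrix) -> Vector:
    """Title tower (assumed form): project metadata features."""
    return matvec(w_title, metadata)


def rank_titles(user_vec: Vector, title_vecs: Dict[str, Vector]) -> List[str]:
    """Return title ids sorted by descending dot-product score."""
    return sorted(title_vecs, key=lambda t: dot(user_vec, title_vecs[t]), reverse=True)


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder weights, not trained
    user = user_tower([[1.0, 0.0], [1.0, 0.0]], identity)
    titles = {
        "title_a": title_tower([1.0, 0.0], identity),
        "title_b": title_tower([0.0, 1.0], identity),
    }
    print(rank_titles(user, titles))
```

Because the two towers only interact through the final dot product, title vectors can be computed once and reused across users, which is one common reason for choosing this design.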