Mayank Sharma

Machine Learning

Prime Video uses ML-driven subtitle synchronization to ensure a smooth viewing experience

Prime Video developed a language-agnostic system to flag and automatically synchronize out-of-sync subtitles.

Mayank Sharma

Mar 08, 2023

Machine Learning

Multi-lingual multi-task speech emotion recognition using wav2vec 2.0

In this work, we present a Multi-Lingual (MLi) and Multi-Task Learning (MTL) audio-only speech emotion recognition (SER) system based on the multi-lingual pre-trained wav2vec 2.0 model.

Mayank Sharma

Jan 02, 2023

Computer Vision

CNN-based audio event recognition for automated violence classification and rating for Prime Video content

We show that (a) the audio-based approach outperforms the other baselines, (b) the benefit of the audio model is more pronounced on global multi-lingual data than on English data, and (c) the multi-modal model achieves 63% rating accuracy and makes it possible to backfill the top 90% Stream Weighted Coverage titles in the Prime Video catalog with 88% coverage at 91% accuracy.

Mayank Sharma, Xiang Hao, Raffay Hamid

Jan 02, 2023

Machine Learning

Language agnostic missing subtitle detection

This paper discusses the problem of missing transcription, where the subtitle blocks corresponding to some speech segments in the DEC do not exist. We present a solution that augments the human correction process by automatically identifying the timings of the non-transcribed dialogues in a language-agnostic manner.

Honey Gupta, Mayank Sharma

Jan 02, 2023

Computer Vision

A comprehensive empirical review of modern voice activity detection approaches for movies and TV shows

In this work, we present a thorough survey of DNN-based VADs on DEC data, evaluating their accuracy, Area Under Curve (AUC), noise sensitivity, and language-agnostic behavior.

Mayank Sharma, Raffay Hamid

Jan 02, 2023