Automatically identifying scene boundaries in movies and TV shows
Prime Video beat previous state-of-the-art work on the MovieNet dataset by 13% with a new model that is 90% smaller and 84% faster.
In June 2021 at the conference on Computer Vision and Pattern Recognition (CVPR 2021), Prime Video presented ShotCoL, a state-of-the-art self-supervised algorithm that we developed for scene boundary detection.
Scene boundary detection is the task of identifying where scenes in a movie begin and end. At Prime Video, this foundational capability is used to insert advertisements in advertising-based video-on-demand (AVOD) content at the least disruptive moments in the content’s timeline, and in applications based on cinematic content understanding, such as scene classification, video retrieval, and video summarization.
ShotCoL achieves 13% higher average precision than previous work on the publicly available MovieNet dataset, while running 84% faster and using 90% fewer model parameters. ShotCoL is also more data efficient than previous approaches: it requires 75% less labeled data during downstream evaluation to match the previous state-of-the-art performance.
ShotCoL leverages contrastive learning, a form of self-supervised learning that became popular in 2019 for image-based tasks. Contrastive learning teaches a model to distinguish between similar and dissimilar examples that are defined without human labels. While previous works rely on image-based contrastive learning, which forms a positive pair by augmenting an image (for example, flipping or rotating it), we are among the first to apply contrastive learning to videos, and specifically to movies.
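To make the contrastive objective concrete, here is a minimal NumPy sketch of the standard InfoNCE loss that most contrastive learning methods optimize: each anchor embedding should score highest against its own positive and lower against every other example in the batch. This is a generic illustration, not ShotCoL's exact implementation; the function name, batch, and temperature value are our assumptions.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE contrastive loss (illustrative sketch).

    anchors, positives: (N, D) arrays of L2-normalized embeddings, where
    positives[i] is the positive example for anchors[i].
    """
    # Pairwise cosine similarities, scaled by temperature.
    logits = anchors @ positives.T / temperature  # (N, N)
    # Row-wise log-softmax; the correct "class" for anchor i is positive i,
    # which sits on the diagonal.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.diag(log_probs).mean()

rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)

loss_matched = info_nce_loss(emb, emb)         # positives line up with anchors
loss_shuffled = info_nce_loss(emb, emb[::-1])  # positives deliberately misaligned
```

When positives line up with their anchors, the loss is low; when they are misaligned, it grows, which is exactly the pressure that pulls similar examples together in embedding space.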
Image augmentation doesn’t work well for videos because it ignores temporal context, so we designed a new contrastive learning algorithm. The key intuition behind ShotCoL is that nearby movie shots are more likely to be similar to each other than to shots farther away. We define positive pairs by searching a local neighborhood of shots and picking the most similar one; the model then learns to contrast these similar shots against randomly selected, more distant shots. This leads to a representation that effectively clusters similar shots and thereby localizes scene boundaries.
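The positive-pair selection described above can be sketched as a nearest-neighbor search over a local temporal window. The snippet below is a hypothetical illustration, not the production code: the function name `mine_positive`, the window size, and the toy two-scene embeddings are all our assumptions (in ShotCoL the embeddings come from the encoder being trained).

```python
import numpy as np

def mine_positive(shot_embs, anchor_idx, window=3):
    """Pick the positive for a shot: the most similar shot within
    +/- `window` positions on the timeline (excluding the anchor itself).

    shot_embs: (T, D) L2-normalized shot embeddings in timeline order.
    Returns the index of the chosen positive shot.
    """
    lo = max(0, anchor_idx - window)
    hi = min(len(shot_embs), anchor_idx + window + 1)
    neighbors = [i for i in range(lo, hi) if i != anchor_idx]
    sims = shot_embs[neighbors] @ shot_embs[anchor_idx]  # cosine similarities
    return neighbors[int(np.argmax(sims))]

# Toy timeline: shots 0-3 share one visual "scene", shots 4-7 another.
rng = np.random.default_rng(1)
shots = np.vstack([rng.normal([5.0, 0.0], 0.1, size=(4, 2)),
                   rng.normal([0.0, 5.0], 0.1, size=(4, 2))])
shots /= np.linalg.norm(shots, axis=1, keepdims=True)

pos = mine_positive(shots, anchor_idx=2)  # neighborhood spans both scenes
```

Even though the window around shot 2 reaches into the second scene, the mined positive comes from the same scene, because same-scene shots are the most visually similar neighbors.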
After publishing the paper at CVPR 2021, we deployed the model to drive a 43% reduction in operator handling time for advertisement cue point insertion in AVOD content. The model is one of the foundational components that break a video into its logical sub-parts to drive applications based on cinematic content understanding. You can read more about the model in our Automatically identifying scene boundaries in movies and TV shows article on Amazon Science.