How Prime Video distills time series anomalies into actionable alarms
Targeted handling of three distinct types of “special events” dramatically reduces the false-alarm rate.
Prime Video customers must be able to reliably stream content at all times on any device that supports the Prime Video application, such as mobile phones, smart TVs, or video game consoles.
For the Prime Video team, deploying and maintaining the application on such a broad scale entails custom code configurations and third-party integrations that are unique to particular geographical regions and families of devices. This diversity poses the risk of a fragmented customer experience, wherein device- or region-specific issues affect only a subset of customers.
Manually setting alarms that monitor the quality of the Prime Video application across every combination of customer activity, device type, and region is infeasible. However, the problem can be reframed as a large-scale, online, time-series anomaly detection problem, in which an automated monitoring solution alerts on-call engineers to deviations from expected behavior in observed traffic.
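To make the reframing concrete, here is a minimal sketch of online anomaly detection on a single traffic series. It is not the method Prime Video uses: the rolling z-score technique, the class name `RollingZScoreDetector`, the helper `find_anomalies`, and the `window`, `threshold`, and `min_history` parameters are all assumptions chosen for illustration. In practice, one such detector might be kept per combination of activity, device type, and region.

```python
from collections import deque
from statistics import mean, stdev


class RollingZScoreDetector:
    """Hypothetical online detector: flags a point that deviates strongly
    from the mean of a rolling window of recent observations."""

    def __init__(self, window=30, threshold=3.0, min_history=5):
        self.history = deque(maxlen=window)  # most recent observations only
        self.threshold = threshold           # z-score above which we alert
        self.min_history = min_history       # warm-up before any alerting

    def update(self, value):
        """Consume one observation and return True if it looks anomalous."""
        is_anomaly = False
        if len(self.history) >= self.min_history:
            mu = mean(self.history)
            sigma = stdev(self.history)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                is_anomaly = True
        # The point is stored either way, so the baseline adapts over time.
        self.history.append(value)
        return is_anomaly


def find_anomalies(stream, **kwargs):
    """Return the indices of the points in `stream` flagged as anomalous."""
    detector = RollingZScoreDetector(**kwargs)
    return [i for i, value in enumerate(stream) if detector.update(value)]


if __name__ == "__main__":
    # Made-up stream counts: steady traffic followed by a sudden drop.
    traffic = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101, 20, 100]
    print(find_anomalies(traffic))
```

A detector this simple would also fire on legitimate swings in customer viewing behavior, which is the kind of false alarm the targeted handling of special events is meant to suppress.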
In our article “How Prime Video distills time series anomalies into actionable alarms” on the Amazon Science website, we discuss the practical challenges that arise when applying anomaly detection to time series describing customer activity and present a selection of mitigating techniques. The proposed solutions distinguish different categories of deviations induced by fluctuating customer viewing behavior and have contributed to a significant reduction in the false alarms that would otherwise distract Prime Video engineers from meeting real customer needs.
For more detail, read the full article, “How Prime Video distills time series anomalies into actionable alarms,” on the Amazon Science website.