Elasticsearch Indexing Strategy in Asset Management Platform (AMP)

Asset Management at Netflix

At Netflix, all of our digital media assets (images, videos, text, etc.) are stored in secure storage layers. We built an asset management platform (AMP), codenamed Amsterdam, in order to easily organize and manage the metadata, schema, relations and permissions of these assets. It is also responsible for asset discovery, validation, sharing, and for triggering workflows.

Elasticsearch Integration

Elasticsearch is one of the most widely adopted distributed, open source search and analytics engines for all types of data, including textual, numerical, geospatial, structured, and unstructured data. It provides simple APIs for creating indices and for indexing and searching documents, which makes it easy to integrate. Whether you use an in-house deployment or a hosted solution, you can quickly stand up an Elasticsearch cluster and start integrating it with your application using the client for your programming language (Elasticsearch provides clients for many languages, including Java, Python, .NET, Ruby, and Perl).
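To make the three API operations mentioned above concrete, here is a minimal sketch in Python that builds (but does not send) the REST requests for creating an index, indexing a document, and searching. It uses only the standard library rather than an official client. The cluster address, the index name `assets`, and the field names are assumptions chosen for illustration and are not taken from AMP.

```python
import json
from urllib.request import Request

# Assumption: a local, unsecured cluster on the default port.
BASE_URL = "http://localhost:9200"


def build_request(method, path, body=None):
    """Build an HTTP request for the Elasticsearch REST API without sending it."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return Request(
        url=f"{BASE_URL}{path}",
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


def create_index(index, mappings):
    # PUT /<index> creates an index with the given field mappings.
    return build_request("PUT", f"/{index}", {"mappings": mappings})


def index_document(index, doc_id, document):
    # PUT /<index>/_doc/<id> stores (or overwrites) a document under that id.
    return build_request("PUT", f"/{index}/_doc/{doc_id}", document)


def search(index, query):
    # POST /<index>/_search runs a query written in the Query DSL.
    return build_request("POST", f"/{index}/_search", {"query": query})


if __name__ == "__main__":
    # Hypothetical index and fields, for illustration only.
    requests = [
        create_index("assets", {"properties": {"name": {"type": "keyword"}}}),
        index_document("assets", "1", {"name": "poster.jpg"}),
        search("assets", {"term": {"name": "poster.jpg"}}),
    ]
    for req in requests:
        print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Against a running cluster, each request could be sent with `urllib.request.urlopen(req)`. In practice one of the official language clients would usually replace this hand-built layer.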

Fig 1. Indices based on Asset Types
Fig 2. Indices based on Time Buckets
Fig 3. Snippet of the index mapping
Fig 4. Snippet of nested metadata field on a stored document
Fig 5. Search/Indexing RPS
Fig 6. CPU Spike with Old indexing strategy
Fig 7. CPU Usage with New indexing strategy

Netflix Technology Blog
