Backstage Blog


You're browsing posts of the category Search

How to Reindex One Billion Documents in One Hour at SoundCloud

March 21st, 2019 by Qaiser Abbasi

In the past, the Search Team at SoundCloud had high lead times for making updates to Elasticsearch clusters, whether we were implementing a new feature or simply fixing a bug. Both tasks required us to reindex our catalog from scratch, which means reindexing more than 720 million users, tracks, playlists, and albums. Altogether, this process took up to one week, and in one case it took almost a month to roll out a bug fix.

In this post, I would like to share the concrete Elasticsearch tweaks we made so that we can now reindex our entire catalog in one hour.


PageRank in Spark

January 24th, 2018 by Josh Devins

SoundCloud consists of hundreds of millions of tracks, people, albums, and playlists, and navigating this vast collection of music and personalities poses a large challenge, particularly with so many covers, remixes, and original works all in one place.
