Partial Outage: Index Outage on Studio page
Incident Report for Knak
Postmortem

Context

Knak uses Algolia on a few pages to make searching and finding information quicker and easier. The Studio page (formerly Email Center) relies on Algolia both to display emails and landing pages and to make those assets searchable.

Note: All email and page data is stored in the database and was unaffected. The index only stores a "summary" of that information to enable quicker searching.

Issue Timeline

April 14th, 2021 3:20PM

During a clean-up of Algolia to remove data generated by our automated tests, an engineer accidentally cleared our production index. As a result, emails and landing pages could no longer be found in the Email Center/Studio.

The issue was flagged immediately and we put up a Status Page alert to explain the issue to clients. We then built a script to re-index emails and landing pages, starting with the most recently updated. Within about 90 minutes, any emails that had been built or updated since March 1 had been restored in the index and could be found in the Studio.

Over the next few hours, we were able to restore all emails and pages created in Knak.
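
The re-indexing order described above (most recently updated first) can be illustrated with a minimal sketch. This is not the script we ran: the record shape, the `updated_at` field, and the `save_batch` callable (a stand-in for whatever call writes a batch of records to the search index) are all assumptions made for illustration.

```python
from datetime import datetime
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical shape of an index "summary" record, e.g.
# {"objectID": "email-1", "updated_at": datetime(...), "name": "..."}
Record = Dict[str, object]


def batches_newest_first(
    records: Iterable[Record], batch_size: int = 1000
) -> Iterator[List[Record]]:
    """Yield batches ordered from most to least recently updated."""
    ordered = sorted(records, key=lambda r: r["updated_at"], reverse=True)
    for start in range(0, len(ordered), batch_size):
        yield ordered[start:start + batch_size]


def reindex(
    records: Iterable[Record],
    save_batch: Callable[[List[Record]], None],
    batch_size: int = 1000,
) -> int:
    """Send every record to the index, newest first; return the count sent."""
    total = 0
    for batch in batches_newest_first(records, batch_size):
        save_batch(batch)  # assumed: writes one batch to the search index
        total += len(batch)
    return total
```

Processing the newest records first means the assets people are most likely to be working on become searchable again early, while older ones follow in later batches.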

Path Forward

Separation of Applications

All of our indexes currently live within a single Algolia application. We will be separating the production instance from the other instances to remove the risk of accidental deletion. Anyone who needs access to the production instance of Algolia will have to use a separate login, and permissions on the production index will be very limited.

Daily Backup of index

We are putting measures in place to back up the Algolia index daily. Should we ever lose an index in the future, we would be able to restore the most recent backup and would only need to re-index emails and pages updated since that backup was taken, so recovering a lost index would take 10-15 minutes rather than a few hours.
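
The recovery flow described above (restore the latest backup, then re-index only what changed afterwards) could look roughly like the following sketch. The function name, the record shape, the `updated_at` field, and the `save_batch` callable are hypothetical and stand in for the real backup store and index client.

```python
from datetime import datetime
from typing import Callable, Dict, Iterable, List

# Hypothetical summary record, e.g. {"objectID": "email-1", "updated_at": datetime(...)}
Record = Dict[str, object]


def restore_from_backup(
    backup: Iterable[Record],
    db_records: Iterable[Record],
    backup_taken_at: datetime,
    save_batch: Callable[[List[Record]], None],
) -> int:
    """Restore the backup, then re-index records updated after it was taken.

    Returns the number of records re-indexed from the database.
    """
    # Step 1: load the backed-up summaries straight into the index.
    save_batch(list(backup))

    # Step 2: only records changed since the backup need a fresh summary.
    changed = [r for r in db_records if r["updated_at"] > backup_taken_at]
    if changed:
        save_batch(changed)
    return len(changed)
```

This sketch only covers records created or updated after the backup; records deleted after the backup would need separate handling.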

Posted Apr 19, 2021 - 12:15 EDT

Resolved
This incident has been resolved.
Posted Apr 15, 2021 - 07:46 EDT
Update
Index has been fully restored.
Posted Apr 15, 2021 - 07:46 EDT
Update
We are continuing to work on a fix for this issue.
Posted Apr 15, 2021 - 07:46 EDT
Identified
The issue has been identified and a fix is being implemented.
Posted Apr 14, 2021 - 17:39 EDT
Update
Emails and pages that have been updated since the beginning of March have been reindexed.
Posted Apr 14, 2021 - 17:38 EDT
Update
We are continuing to investigate this issue.
Posted Apr 14, 2021 - 15:46 EDT
Investigating
We are currently investigating an incident where the index for emails and pages in Studio has been lost. We are rebuilding it, working backward from the most recent emails and pages. Emails and pages can still be retrieved by ID using the URL, but will not be accessible from Studio until the index is rebuilt.
Posted Apr 14, 2021 - 15:43 EDT
This incident affected: Knak App.