Message from InfluxData Founder & CTO Paul Dix: Discontinuation of InfluxDB Cloud in AWS Sydney and GCP Belgium
By Paul Dix / Company / Jul 10, 2023
July 14, 2023 update: We have successfully recovered the time series data for the GCP Belgium users and are in the process of getting that data back to them. More details in the following update: Recovery of InfluxDB Cloud Data in GCP Belgium.
Last week, InfluxData discontinued InfluxDB Cloud service in two regions: AWS Sydney and GCP Belgium. We notified users of the shutdown over the span of several months, but some users were caught by surprise and lost the data they stored in those regions.
First, we deeply apologize to the customers who were caught off guard by the discontinuation. Our goal was to ensure that all users were aware of the discontinuation and able to take appropriate action, and we clearly did not achieve that goal.
I also want to address the situation, provide more context on how we arrived at this decision and communicated it to our customers, and share what we would do differently.
Why did InfluxData shut down the regions?
InfluxData operates InfluxDB Cloud as a SaaS in multiple regions spanning AWS, Google, and Azure. Over the years, two of the regions did not get enough demand to justify the continuation of those regional services.
What process was followed during the shutdown? How were customers and users informed?
Discontinuing a service like this is never easy. We needed to inform our users and customers in those regions of the discontinuation and help them move their data. We invested significant effort into informing these users and helping them with the migration process. To that end, we:
- Laid out a schedule of email updates to all customers, sending three emails in total to the contact information we had on file. We sent those emails on February 23, April 6, and May 15, 2023.
- Reached out to customers for whom we had contact information, made sure they understood the situation and the timeline, had direct conversations with them, and assisted them in migrating.
- Updated the homepage of the UI for InfluxDB Cloud 2 in those regions with a notice that the service was going to be shut down on June 30, 2023.
- Actively supported every customer or user, large or small, who responded to any of our communications, helping them migrate their data and workloads to a different InfluxDB Cloud region.
- Continued operating the InfluxDB Cloud regions for a few extra days in case customers had not completed their migrations.
By most appearances, this effort worked. Our support team was involved in many migration efforts, and many other customers migrated without our support.
Who was impacted by the shutdown?
The vast majority of users and customers responded by moving their workloads to other InfluxDB Cloud regions. However, through our community Slack channel, Support, and forums, we soon realized that our communication had not registered with everyone: some users were caught off guard and lost access to their data.
What happens now?
Our engineering team is looking into whether they can restore the last 100 days of data for GCP Belgium. It appears at this time that for AWS Sydney users, the data is no longer available.
Users can reach out to our support team ([email protected]) and we will begin the data retrieval and recovery process.
What could InfluxData have done better?
In hindsight, our assumption that the emails, sales outreach, and web notifications would be sufficient to ensure all users were aware of and acted on the notifications was overly optimistic. If we could do it over again, we would do the following things differently:
- Create a separate category of “Service Notification” emails that customers could not opt out of. These emails would only be for critical service updates and would come from a support/formal alert alias.
- Improve email processes and clarity. We would set up the email communication to increase in frequency as the date approached, with additional reminders 24 hours ahead of the shutoff and on the day of the shutoff.
- Redouble efforts to contact users who had not reduced their reads or writes within the 30 or 45 days before the end-of-life date for the region.
- Shut down the service more gracefully. We would conduct a scream test by shutting down the services for one hour or one day to give users who did not register the notifications a chance to notice that their workloads were not running, and then turn the service back on for a short period to give those users one last chance to migrate their data.
- Implement a 30-day data retention grace period during which we would export and store customer and user data before deleting it.
- Add a banner at the top of the status.influxdata.com page as soon as the initial notifications went out, leave it there until service was terminated, and use this page as a backup communication avenue for major service updates.
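To make the idea of reminders that increase in frequency as the shutoff approaches more concrete, here is a minimal Python sketch. It is illustrative only and does not describe any tooling InfluxData uses: the function name `reminder_dates` and the "halve the remaining time" policy are assumptions chosen for the example. Only the two dates in the usage lines (the first email on February 23, 2023 and the June 30, 2023 shutdown) come from this post.

```python
from datetime import date, timedelta


def reminder_dates(first_notice: date, shutdown: date) -> list[date]:
    """Return reminder dates that grow more frequent as shutdown nears.

    Hypothetical policy (an assumption for this sketch): each reminder is
    sent halfway between the previous one and the shutdown date, followed
    by reminders 24 hours before the shutoff and on the day itself.
    """
    if first_notice >= shutdown:
        raise ValueError("first notice must precede the shutdown date")

    dates = [first_notice]
    current = first_notice
    # Keep halving the remaining time until at most two days are left.
    while (shutdown - current).days > 2:
        current += timedelta(days=(shutdown - current).days // 2)
        dates.append(current)

    # Final reminders: the day before the shutoff and the day of the shutoff.
    for final in (shutdown - timedelta(days=1), shutdown):
        if final > dates[-1]:
            dates.append(final)
    return dates


if __name__ == "__main__":
    for d in reminder_dates(date(2023, 2, 23), date(2023, 6, 30)):
        print(d.isoformat())
```

Under this assumed policy the gap between consecutive reminders never grows, so users who missed the early notices would see progressively more frequent ones in the final weeks.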
Looking ahead
I want to apologize again to our customers and community. Trusting a vendor with your data is critical, and in our handling of this situation, we damaged that trust. We are committed to learning from this and improving our processes to ensure this does not happen again. We can and must do better.
If you’ve been impacted by this, please email me personally and I will do my best to help out: [email protected]