21st June 18. A post-mortem and an apology.

22.06.18

Yesterday, TransferWise went down. Our website, apps and debit cards stopped working, and we weren’t able to send payments out. We’re extremely sorry for the inconvenience and frustration this caused.

Our teams worked tirelessly to restore service as soon as possible. Your money and your data were entirely safe throughout, and are still safe now. This was not a cyber attack.

Why did this happen?

Some of our critical infrastructure is hosted out of a data centre in Europe. Yesterday (21 June), at approximately 0930 UTC, this data centre suffered a catastrophic power failure.

In the process, a power surge destroyed some of our equipment (specifically, a remote management controller), which had to be replaced. Once it was replaced and reconfigured, we were able to access our infrastructure and begin restoring service.

This was a freak accident that should never have happened. Nonetheless, this is no excuse for your delayed payments and declined card transactions. It isn’t good enough that one power failure can stop you from moving your money across borders.

Why did it take so long?

Once power was restored and we had replaced the equipment destroyed by the power surge, we were able to access our infrastructure and kick off our recovery procedures. We have a large distributed system: carrying out a complete restart requires careful orchestration, and we always prioritise consistency of data over availability.

Faster Payment Scheme payments were restored at 1718 UTC and TransferWise Debit Card payments were restored at 2013 UTC. Full service was restored at 2300 UTC.

Although we back up all customer data to our secondary data centre, we currently don’t run an active-active setup across our data centres for all our components. This is about to change.

What are we doing about it?

Our engineering teams have been working for several months now to migrate our infrastructure to Amazon Web Services (AWS) to improve the speed and reliability of our services. As we’ve grown, we’ve optimised for building features, adding currencies, and improving your transfer speeds, but it’s clear some of our infrastructure partners will not scale further with us on our journey. Our number one motivation for this migration to AWS is to reduce the chances of this level of failure happening again to an absolute minimum.

We have a highly distributed system and migrating to new infrastructure is a big undertaking. We commit to providing more information about the migration and other stability improvements in the future.

We realise how much you, our customers, depend on TransferWise in your daily lives. This is a responsibility we don’t take lightly and we are truly sorry for the inconvenience and frustration this has caused you.

Thank you for your patience and for bearing with us. We’re absolutely committed to making this right.

– Harsh Sinha, TransferWise VP Engineering.
