Sunday, July 20, 2008

Why Amazon S3 sucks

OK, so this outage started a little before 9 AM this morning:



9:05 AM PDT We are currently experiencing elevated error rates with S3. We are investigating.
Translation: S3 is down

9:26 AM PDT We're investigating an issue affecting requests. We'll continue to post updates here.
Translation: it's still down

9:48 AM PDT Just wanted to provide an update that we are currently pursuing several paths of corrective action.
Translation: it's still down

10:12 AM PDT We are continuing to pursue corrective action.
Translation: it's still down

10:32 AM PDT A quick update that we believe this is an issue with the communication between several Amazon S3 internal components. We do not have an ETA at this time but will continue to keep you updated.
Translation: it's still down

11:01 AM PDT We're currently in the process of testing a potential solution.
Translation: it's still down

11:22 AM PDT Testing is still in progress. We're working very hard to restore service to our customers.
Translation: it's still down

11:45 AM PDT We are still in the process of testing a series of configuration changes aimed at bringing the service back online.
Translation: it's still down

12:05 PM PDT We have now restored communication between a small subset of hosts. We are working on restoring internal communication across the rest of the fleet. Once communication is fully restored, then we will work to restore request processing.
Translation: it's still down

12:25 PM PDT We have restored communication between additional hosts and are continuing this work across the rest of the fleet. Thank you for your continued patience.
Translation: it's still down

12:51 PM PDT The restored hosts are stable and we are moving forward in restoring communication between additional hosts.
Translation: it's still down

1:17 PM PDT We continue to make incremental progress and communication between additional hosts has been restored. We are continuing with the plan to restore communication across Amazon S3's large fleet of hosts.
Translation: it's still down

1:38 PM PDT At this point, we are accelerating progress on restoring internal communication as all signs continue to look good.
Translation: it's still down

2:03 PM PDT We have restored all internal communication between hosts in the EU and we are continuing to make progress in the US. Once all internal communication has been restored, we will start a multi-step process to begin accepting requests across Amazon S3 locations.
Translation: it's still down

2:19 PM PDT A quick update to let you know that we have now also restored all internal communication between hosts in our West Coast facilities in the US.
Translation: it's still down


Five and a half hours of outage... booooooooooooooooo!
