Degraded

All services are down for workspaces in us-east-1 due to AWS incident

Oct 20 at 09:01am CEST
Affected services
Real-time decision execution
User interface

Resolved
Oct 20 at 11:29am CEST

[Recovered] Between 6:52am and 9:22am UTC on Oct 20th, all US Taktile customers were impacted by a major AWS outage in the US East (N. Virginia) Region.
Taktile APIs returned 5xx error codes.

To our current understanding, no other regions were affected.

We will conduct a post-mortem analysis to understand how we can better mitigate incidents like this.

Updated
Oct 20 at 11:25am CEST

AWS is reporting signs of recovery.

We are also seeing the first signs of recovery on our end.

Updated
Oct 20 at 11:02am CEST

Still ongoing. It looks like AWS has identified a potential root cause, but services have not recovered yet.

AWS update:
"[02:01 AM PDT] We have identified a potential root cause for error rates for the DynamoDB APIs in the US-EAST-1 Region. Based on our investigation, the issue appears to be related to DNS resolution of the DynamoDB API endpoint in US-EAST-1. We are working on multiple parallel paths to accelerate recovery. This issue also affects other AWS Services in the US-EAST-1 Region. Global services or features that rely on US-EAST-1 endpoints such as IAM updates and DynamoDB Global tables may also be experiencing issues. During this time, customers may be unable to create or update Support Cases. We recommend customers continue to retry any failed requests. We will continue to provide updates as we have more information to share, or by 2:45 AM."
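For clients that see 5xx responses during an incident like this, the retry recommendation in the AWS update can be sketched as a retry loop with exponential backoff and jitter. This is a minimal illustration, not Taktile's or AWS's client code: the function name `call_with_retry`, the `send` callable, and all default values are assumptions chosen for the example.

```python
import random
import time
from typing import Callable, Tuple


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send` until it returns a non-5xx status or attempts run out.

    `send` is a hypothetical callable that performs one request and
    returns (status_code, body). It stands in for any HTTP client call.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        # Only 5xx responses are treated as retryable in this sketch.
        if status < 500 or status > 599:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, delay))
    # All attempts returned 5xx; hand the last response back to the caller.
    return status, body
```

Retrying is only safe for requests that can be repeated without side effects; requests that are not idempotent would need a deduplication mechanism on top of a loop like this.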

Updated
Oct 20 at 10:32am CEST

[Ongoing] Since ~6:15am UTC, all US Taktile customers are impacted by an ongoing AWS regional outage. Taktile APIs return 5xx error codes.

Blast radius: all APIs in us-east-1. Other regions are not affected at this time. We lack clear visibility into impact because the AWS console itself is having major issues.

Containment actions:
- Taktile status page updated, incident posted: https://status.taktile.com/incident/746742

Next steps:
- Tracking AWS service recovery via the AWS incident; we will keep updating this thread

Updated
Oct 20 at 09:58am CEST

Socure also appears to be down because of the AWS incident.

Customers using Socure connections are likely to experience issues.

Updated
Oct 20 at 09:15am CEST

AWS has confirmed the incident, and multiple other services are impacted. We are awaiting recovery and more information from their end.

Created
Oct 20 at 09:01am CEST

This appears to be an AWS incident. The on-call team is working on it.