Intermittent Database Connection Issues Inside AWS

April 23, Sun
Affected Service: Platform Admin Panel
  • Investigating
    StatusKit is currently investigating a database connection issue that is causing timeouts and 500 server errors for users of the Admin Panel and Platform. StatusKit pages are up and running, and end users should not experience problems.

    We have rebooted our RDS instances in AWS and

    Event logs show this issue starting late on 4/20/2017 and continuing through the present, 4/22/2017.

    Please report any issues to hello@statuskit.com.

    Thank you for your patience.

    18:08, Apr 22 UTC
  • Update
    A fix has been implemented and is in testing. We're monitoring the situation and event logs before further updating status. Thank you.

    20:23, Apr 22 UTC
  • Resolved
    The issue, which we believe was caused by an internal DNS timeout in our Amazon cloud when accessing the RDS database, appears to be behind us.

    At the time of this update, this exception had not been seen for 8 hours.
    https://www.screencast.com/t/NIqlAZqoIf

    RDS loads were not out of the ordinary, leading us to conclude that DNS lookups internal to our AWS environment were at fault for this intermittent issue. We made the changes mentioned above on 4/22 and have configuration changes ready to relax timeout thresholds should performance degrade in the future.

    We are tagging this as RESOLVED and will continue to monitor. Thank you!

    *** PS: We regret any performance issues with our service, but as we type this update we remind ourselves that this is why we exist: to be there for you when similar issues occur!

    18:08, Apr 23 UTC