All services operational

About this page

This page monitors all of our critical services. If there is a disruption in service, incident notes will be posted here. Subscribing to updates is the best way to stay informed about any issues affecting the availability of our services.

  • Platform

    Operational
  • Admin Panel

    Operational
  • Public Status Page

    Operational
  • Website

    Operational

Ongoing Incidents

  • Affected Service: Platform, Admin Panel, Public Status Page, Website | June 01, Mon

    Resolved
    The StatusKit server crashed or restarted in the early morning of May 30, 2020 and did not recover correctly.
    On reboot, the Docker config and nginx config did not load in the proper order. As a result, users saw either a default nginx web server message or a browser failure-to-load message, as if the service were completely offline.

    In January, the process used to change the SSL wildcard cert installed a default nginx web server config that had not previously been on the production servers. The SSL certs were correctly installed, and the service functioned normally for almost 5 months with no reboots.

    When the service crashed and restarted, the correct web service config was not loaded due to the error introduced in January.

    No customer data was impacted. No customer data was lost. Databases functioned normally throughout the incident.

    We believe we have fixed the error but will be testing it again in a future maintenance window.

    The IP address of the statuskit.com service changed as a result of the downtime. Please refresh your cache if your clients experience issues.

    During this extended downtime on May 31, 2020, we expanded the disk space available to the application, performed log maintenance, and greatly expanded the size of the database to accommodate future growth.

    01:03, Jun 01 UTC
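A post-reboot check along the following lines could help catch the failure mode described in the incident above, where the server came back up but served a default nginx page instead of the application. This is only a minimal sketch, not StatusKit's actual tooling: the function names are hypothetical, and it assumes the default message users saw was the stock nginx welcome page, which contains the text "Welcome to nginx!".

```python
import urllib.error
import urllib.request

# Text found on the stock nginx welcome page. Assumption: this is the
# "default nginx web server message" that users saw during the incident.
DEFAULT_NGINX_MARKER = "Welcome to nginx!"


def classify_response(body):
    """Return 'offline', 'default_nginx', or 'ok' for a fetched page body.

    body is None when the page could not be fetched at all.
    """
    if body is None:
        return "offline"
    if DEFAULT_NGINX_MARKER in body:
        return "default_nginx"
    return "ok"


def fetch_body(url, timeout=5.0):
    """Fetch url and return its body as text, or None on any request failure.

    HTTP error statuses (4xx/5xx) also return None, so they are reported
    as 'offline' by classify_response.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError):
        return None


def check_after_reboot(url, fetch=fetch_body):
    """Classify the page served at url. fetch is injectable for testing."""
    return classify_response(fetch(url))
```

Running `check_after_reboot` against the public page from a boot-time script or an external monitor would distinguish "nothing is answering" from "the wrong nginx config is answering", which are the two symptoms reported above.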

Recent Incidents

  • Affected Service: Platform, Admin Panel | April 23, Sun

    Investigating
    StatusKit is currently investigating a database connection issue that is resulting in timeouts and 500 server errors for users of the Admin Panel and Platform. StatusKit pages are up and running, and end users should not experience problems.

    We have rebooted our RDS instances in AWS and

    Event logs show this issue starting late on 4/20/2017 and continuing to the present, 4/22/2017.

    Please report any issues to hello@statuskit.com.

    Thank you for your patience.

    18:08, Apr 22 UTC

    Update
    A fix has been implemented and is in testing. We're monitoring the situation and event logs before further updating status. Thank you.

    20:23, Apr 22 UTC

    Resolved
    The issue, which we believe was caused by an internal DNS timeout in our Amazon cloud when accessing the RDS database, appears to be behind us.

    At the time of this update, this exception had not been seen for 8 hours.
    https://www.screencast.com/t/NIqlAZqoIf

    RDS loads were not out of the ordinary, leading us to determine that DNS lookups internal to our AWS environment were at fault for this intermittent issue. We made some changes on 4/22, mentioned above, and have configuration changes at the ready to relax timeout thresholds should performance degrade in the future.

    We are tagging this as RESOLVED and will continue to monitor. Thank you!

    *** PS: We regret any performance issues with our service, but as we type this update we remind ourselves that this is why we exist: to be here for you when similar issues occur!

    18:08, Apr 23 UTC
  • Affected Service: Admin Panel | September 22, Thu

    Identified
    Editing an incident results in an internal server error, so data cannot be updated. We’re working to resolve this issue and hope to have it all fixed soon. Thanks so much for your patience and understanding, and sorry for any inconvenience.

    15:04, Sep 22 UTC

    Fixing
    We’re working to resolve this issue now and hope to have it all fixed soon.

    15:10, Sep 22 UTC

    Fixing
    A fix has been implemented and deployed. We're monitoring the situation to make sure everything runs smoothly.

    15:18, Sep 22 UTC

    Resolved
    This incident has been resolved.

    15:39, Sep 22 UTC
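For the April 22 incident above, which was attributed to intermittent internal DNS timeouts when reaching the RDS database, one common client-side mitigation is to retry the hostname lookup with a growing delay. The sketch below is an illustration only and is not the configuration change StatusKit made; the function name, the default delays, and the hostname used in the usage note are all hypothetical.

```python
import socket
import time


def resolve_with_retry(hostname, attempts=3, base_delay=0.5,
                       resolve=socket.gethostbyname, sleep=time.sleep):
    """Resolve hostname, retrying with exponential backoff on lookup errors.

    resolve and sleep are injectable so the retry logic can be tested
    without real DNS lookups or real waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return resolve(hostname)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            last_error = exc
            # Wait base_delay, then 2x, then 4x, ... but not after the last try.
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error
```

For example, `resolve_with_retry("db.internal.example")` would tolerate up to two transient lookup failures before giving up and re-raising the last error. Raising `attempts` or `base_delay` is one way to "relax" a timeout threshold at the cost of slower failure reporting.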