nginx Docker config error - unforced downtime overnight, May 30, 2020

August 14, Fri
Affected Service: Platform Admin Panel Public Status Page Website
  • Resolved
    The StatusKit server crashed or restarted in the early morning of May 30, 2020, and did not recover correctly.
    On reboot, the Docker config and the nginx config did not load in the proper order. As a result, users of the service saw either the default nginx web server message or a browser failure-to-load message, as if the service were completely offline.

    In January, when the SSL wildcard cert was changed, the process installed a default nginx web server config that was not previously on the production servers. The SSL certs were installed correctly, and the service functioned normally for almost 5 months with no reboots.

    When the service crashed and restarted, the correct web service config was not loaded due to the error introduced in January.

    No customer data was impacted. No customer data was lost. Databases functioned normally throughout the incident.

    We believe we have fixed the error but will be testing it again in a future maintenance window.
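
    One way to catch this failure mode after a future reboot is a small smoke check that fetches the site and flags the stock nginx welcome page. The following is a minimal sketch, not the check we actually run: the function names and the example URL are hypothetical, and it assumes the default page still carries the standard "Welcome to nginx!" text.

```python
import urllib.request

# Text that appears in the stock nginx welcome page (assumed marker).
DEFAULT_NGINX_MARKER = "Welcome to nginx!"


def looks_like_default_nginx_page(body: str) -> bool:
    """Return True if the response body looks like the stock nginx page."""
    return DEFAULT_NGINX_MARKER.lower() in body.lower()


def check_site(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL responds and is not serving the default page.

    Hypothetical helper: any connection failure or HTTP error status is
    treated as a failed check, matching the "service looks offline" symptom.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return False
    return not looks_like_default_nginx_page(body)
```

    For example, running check_site("https://status.example.com") (a placeholder address) after a reboot returns False if the server is unreachable or is serving the default nginx page.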

    The IP address of the service changed as a result of the downtime. Please refresh your cache if your clients experience issues.

    During this extended downtime on May 31, 2020, we expanded the disk space available to the application, performed log maintenance, and greatly expanded the size of the database to accommodate future growth.

    01:03, Jun 01 UTC
  • Resolved

    18:34, Aug 14 UTC