Intermittent Service Outage for FormAssembly – April 1, 2016
Earlier this morning, FormAssembly experienced an intermittent service outage. The affected users were those who have opted into FormAssembly 4.x (also known as “Gemini”). No FormAssembly Enterprise customers were affected, including those who have migrated to 4.x.
The affected part of the application was limited to the management of forms (specifically, access via https://app.formassembly.com). Consequently, logging into FormAssembly and managing forms were intermittently unavailable. Form viewing and submissions appear to have been unaffected during the outage.
At 04:41 ET, the FormAssembly on-call staff was alerted to a potential issue by the monitoring systems. Customers were also reporting intermittent issues via Intercom. An investigation determined that the affected service was FormAssembly 4.x and that the outage was not total. The FormAssembly team then identified the cause: an internal DNS server that was not responding properly.
The faulty DNS server was removed, and all machines using that DNS server were updated. Because our other DNS servers were operating normally, the outage was intermittent rather than total. The service was operating normally by 06:30 ET.
What Will Be Done
Before today, we had already begun improving the existing internal DNS servers used by the FormAssembly service. As a result of this outage, we will now accelerate that process.
We will also improve our machine configurations. Before today, each machine had the same DNS configuration. This was a mistake and exposed the risk that caused today's outage. In the future, the machines will use different internal servers and different ordering to minimize the chance of this occurring again.
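
The per-machine ordering described above could be sketched as follows. This is a minimal illustration in Python, not FormAssembly's actual tooling: the function names, hostnames, and IP addresses are hypothetical, and it assumes each machine reads its resolvers from a resolv.conf-style list of nameserver lines.

```python
import hashlib


def resolver_order(hostname, servers):
    """Return the DNS servers rotated by a stable hash of the hostname.

    The same hostname always gets the same ordering, while different
    hostnames tend to start with different servers.
    """
    if not servers:
        return []
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    offset = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[offset:] + servers[:offset]


def render_resolv_conf(hostname, servers):
    """Render resolv.conf-style nameserver lines for one machine."""
    ordered = resolver_order(hostname, servers)
    return "".join(f"nameserver {ip}\n" for ip in ordered)


if __name__ == "__main__":
    # Hypothetical internal DNS servers and machine names.
    dns_servers = ["10.0.0.2", "10.0.0.3", "10.0.0.4"]
    for host in ["app-01", "app-02", "app-03"]:
        print(f"# {host}")
        print(render_resolv_conf(host, dns_servers))
```

With an ordering like this, a single misbehaving server is the first choice for only a fraction of the machines, so fewer requests have to wait for a fallback to the next server.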