West2 prxtransfer server outage
Incident Report for PRX
Resolved
West2 is now fully operational. Some files are still being downloaded, but the process should complete within the next 45 minutes. Customers geographically closer to the West2 server will begin to be routed to it in the next 10 minutes.
Posted about 3 years ago. Sep 26, 2014 - 16:04 EDT
Update
The check of the files on West2 is complete and we are re-running the sync process for the server. No corrupted files were found.

Because syncing was put on hold during this process, approximately 15 GB of updates are pending for the server, and these must be synced before it can be reactivated. Our compression and de-duplication process should make this finish relatively quickly, but the data is being transferred across the continent, so transfer speed is still severely limited.

Aside from the brief period before the server was removed from the rotation, this process should have had minimal impact on any customers. If you are still seeing impact of any sort, please contact support, as it is likely unrelated to this incident.
Posted about 3 years ago. Sep 26, 2014 - 15:36 EDT
Update
Our analysis of the nearly 40,000 files is more than halfway complete. At this time, no errors have been found.

During this analysis and maintenance, the automatic sync process has been suspended, which means that once the analysis has finished there will likely be some files that require syncing before the server can be reactivated.

East1 seems to be handling the increased load without issue.
Posted about 3 years ago. Sep 26, 2014 - 14:41 EDT
Update
As a point of clarification, the affected server will not be promoted for use by customers until its files have been verified as completely up to date. We expect this to happen sometime this evening or early tomorrow. Customers who are being rerouted to the East1 server should see little, if any, performance degradation.
Posted about 3 years ago. Sep 26, 2014 - 12:52 EDT
Monitoring
We have brought the West2 server back to operation and are currently verifying that no data was corrupted during the outage.

Once verification of the data on the server has completed, any corrupted files will be automatically replaced with the source data by the sync process. This may take anywhere from under an hour to over six hours, depending on the scale of corruption.
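
For readers curious what a verify-then-replace pass of this kind can look like, here is a minimal sketch. It is not our actual tooling: the function names, the use of SHA-256 checksums, and the assumption that the source and the server copy are both reachable as local directory trees are all hypothetical and chosen only for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_files_to_resync(source_root: Path, mirror_root: Path) -> list[str]:
    """List source files (as relative paths) that are missing from the
    mirror or whose contents differ from the source copy.

    Hypothetical helper: a real sync process would more likely compare
    against a stored manifest than re-hash the source on every run.
    """
    stale = []
    for source_file in sorted(source_root.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_root)
        mirror_file = mirror_root / relative
        if not mirror_file.is_file() or sha256_of(mirror_file) != sha256_of(source_file):
            stale.append(relative.as_posix())
    return stale
```

In a setup like this, an empty result would mean nothing needs replacing, and a longer list would mean a longer sync, which is why the time estimate depends on the scale of corruption.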

At this time, no data appears to have been corrupted. Again, no additional work should be required by customers. In the event that data corruption is found, timestamps will be updated on the files to indicate a revision.
Posted about 3 years ago. Sep 26, 2014 - 12:46 EDT
Identified
An error on the West2 FTP server is resulting in unexpected behavior for certain customers who are directed to the West2 server.

We have temporarily redirected traffic bound for the West2 server to the East1 server, which should be able to withstand the increased traffic. We are closely monitoring the situation there.

Impacted customers will be automatically switched to the new server in the next few minutes, and no further action is required for SubAuto customers.

We will provide more information as it becomes available.
Posted about 3 years ago. Sep 26, 2014 - 12:05 EDT