tag:status.prx.org,2005:/historyPRX Status - Incident History2024-03-29T03:51:20-04:00PRXtag:status.prx.org,2005:Incident/189356142023-10-26T16:18:18-04:002023-10-26T16:18:19-04:00Intermittent Errors with Dovetail Applications<p><small>Oct <var data-var='date'>26</var>, <var data-var='time'>16:18</var> EDT</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Oct <var data-var='date'>26</var>, <var data-var='time'>14:52</var> EDT</small><br><strong>Monitoring</strong> - We believe all applications have returned to normal operations. This incident will remain open for the next hour or so as workloads return to regular volumes and we confirm that all systems continue to behave correctly.</p><p><small>Oct <var data-var='date'>26</var>, <var data-var='time'>14:01</var> EDT</small><br><strong>Identified</strong> - We have been able to significantly reduce the impact of the issue we are experiencing in an underlying system, so applications should be behaving much better now. A small number of inventory forecasting features have been paused while we continue to evaluate the situation and implement fixes.</p><p><small>Oct <var data-var='date'>26</var>, <var data-var='time'>09:00</var> EDT</small><br><strong>Investigating</strong> - Starting around 9 AM ET, we've been experiencing an elevated level of errors with ad inventory management and forecasting and podcast content management. 
We are investigating the root cause of these errors, but access to Publish, Dovetail Podcasts, and Dovetail Inventory may be unstable.<br /><br />There's no evidence that serving audio files to listeners or analytics for audio downloads is impacted at this time.</p>tag:status.prx.org,2005:Incident/185385392023-09-18T22:04:02-04:002023-09-18T22:04:02-04:00AWS Networking Incident<p><small>Sep <var data-var='date'>18</var>, <var data-var='time'>22:04</var> EDT</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Sep <var data-var='date'>18</var>, <var data-var='time'>20:26</var> EDT</small><br><strong>Update</strong> - AWS has not fully resolved their networking issue in all regions, but we are no longer seeing any impact to our services, so we are considering this incident closed at this time.</p><p><small>Sep <var data-var='date'>18</var>, <var data-var='time'>15:59</var> EDT</small><br><strong>Identified</strong> - Due to an ongoing networking issue involving some underlying AWS infrastructure, we are seeing intermittent slow response times and errors for some of our services. We are continuing to evaluate how these issues are affecting all of our services, but so far we have identified some abnormalities with authenticating to FTP servers (including FTP, FTPS, and SFTP), as well as serving podcast audio. 
Based on what we have seen so far, these issues are isolated and infrequent, and should not significantly impact any users.</p>tag:status.prx.org,2005:Incident/182337392023-08-16T10:00:00-04:002023-08-23T11:04:12-04:00Media Processing Delays<p><small>Aug <var data-var='date'>16</var>, <var data-var='time'>10:00</var> EDT</small><br><strong>Resolved</strong> - Our media processing infrastructure is running behind which is causing audio, video, and image files that are uploaded to remain in the "processing" state longer than expected.</p>tag:status.prx.org,2005:Incident/178999762023-07-19T10:03:59-04:002023-07-19T10:03:59-04:00Database Maintenance<p><small>Jul <var data-var='date'>19</var>, <var data-var='time'>10:03</var> EDT</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Jul <var data-var='date'>19</var>, <var data-var='time'>09:16</var> EDT</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Jul <var data-var='date'>19</var>, <var data-var='time'>09:09</var> EDT</small><br><strong>Scheduled</strong> - Dovetail and Exchange access will be temporarily unavailable on Wednesday, July 19th, due to maintenance on our production database. The downtime will occur from 9:00 a.m. to 10:00 a.m. ET.<br /><br />During this maintenance period, the usability of Dovetail and Exchange will be affected. You will be unable to log in, upload, schedule, or publish any content. 
Additionally, any episodes previously scheduled to be published within this time frame will be delayed.<br /><br />However, we want to assure you that during this time, your podcast feeds will remain active, and listeners will still be able to download episodes as usual.</p>tag:status.prx.org,2005:Incident/175632302023-06-13T18:43:11-04:002023-06-13T18:43:25-04:00AWS Outage Causing Multiple Services to fail<p><small>Jun <var data-var='date'>13</var>, <var data-var='time'>18:43</var> EDT</small><br><strong>Resolved</strong> - All systems have recovered, and are processing normally.</p><p><small>Jun <var data-var='date'>13</var>, <var data-var='time'>17:42</var> EDT</small><br><strong>Monitoring</strong> - The systems have recovered and should be functional.<br />We are seeing a few errors still and some delayed processing, and will continue monitoring.</p><p><small>Jun <var data-var='date'>13</var>, <var data-var='time'>15:29</var> EDT</small><br><strong>Investigating</strong> - AWS is having issues with their services, impacting ours<br />https://health.aws.amazon.com/health/status<br /><br />We are investigating and will update as we know more and work to mitigate the issues.</p>tag:status.prx.org,2005:Incident/87823492021-12-08T11:00:27-05:002021-12-08T11:00:29-05:00PRX Outage<p><small>Dec <var data-var='date'> 8</var>, <var data-var='time'>11:00</var> EST</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Dec <var data-var='date'> 8</var>, <var data-var='time'>10:58</var> EST</small><br><strong>Update</strong> - This incident has been resolved. PRX Dovetail and PRX Exchange are fully operational.</p><p><small>Dec <var data-var='date'> 7</var>, <var data-var='time'>11:38</var> EST</small><br><strong>Investigating</strong> - AWS is experiencing technical difficulties which are currently impacting PRX Dovetail and PRX Exchange. At the moment producers are unable to publish episodes through Publish and upload audio on the Exchange. 
We are currently monitoring the situation.</p>tag:status.prx.org,2005:Incident/56561382020-12-08T11:55:27-05:002020-12-08T11:55:27-05:00PRX Metrics data not updating<p><small>Dec <var data-var='date'> 8</var>, <var data-var='time'>11:55</var> EST</small><br><strong>Resolved</strong> - Data recovery complete for the AWS outage timeframe.</p><p><small>Nov <var data-var='date'>26</var>, <var data-var='time'>07:34</var> EST</small><br><strong>Monitoring</strong> - Around 7:00 UTC, operation of the underlying service returned to normal, and around 10:55 UTC our processing pipeline finished working through its backlog of data that had accumulated during the incident. We are continuing to actively monitor all aspects of our platform, and are working to understand the full impact of the incident on podcast metric data. As of now, there will still be gaps or inaccuracies in the numbers reported by PRX Metrics for the period of the incident. These anomalies may appear on November 25th and/or 26th in reports and graphs, depending on which metric is being viewed and your local timezone.<br /><br />As reported previously, there was no degradation in service to audio file serving at any point during the incident.</p><p><small>Nov <var data-var='date'>25</var>, <var data-var='time'>14:29</var> EST</small><br><strong>Identified</strong> - Due to an ongoing issue with an underlying AWS service, the data processing pipeline for PRX Metrics is experiencing delays. Beginning around 13:00 UTC, downloads and other numbers in PRX Metrics may appear missing or incomplete. 
There has been no impact to the availability of audio files or downloads to listeners.</p>tag:status.prx.org,2005:Incident/25483612019-06-14T16:30:24-04:002019-06-14T16:30:24-04:00Metrics not updating for 6/14 downloads<p><small>Jun <var data-var='date'>14</var>, <var data-var='time'>16:30</var> EDT</small><br><strong>Resolved</strong> - Metrics processing is caught up, and all counts should be up to date, including on the metrics site and in ad impression reporting.</p><p><small>Jun <var data-var='date'>14</var>, <var data-var='time'>14:31</var> EDT</small><br><strong>Monitoring</strong> - We have deployed a fix to reprocess the missed data, and it is currently working through the backlog; we estimate it will take ~5 hours to catch up. We'll update if that changes or when it is complete.</p><p><small>Jun <var data-var='date'>14</var>, <var data-var='time'>11:33</var> EDT</small><br><strong>Update</strong> - Reverting the upgrade did fix the issue, and data has been processing since.<br />We have identified the exact cause of the issue with that upgrade, and are now working on reloading the missed data. We'll keep this open and update as we continue to work on the data reload.</p><p><small>Jun <var data-var='date'>14</var>, <var data-var='time'>10:26</var> EDT</small><br><strong>Identified</strong> - We're seeing an issue in our processing pipeline affecting downloads overnight.<br />We believe this was caused by an upgrade made at that time, which has been reverted, and we're now continuing to investigate what happened.</p>tag:status.prx.org,2005:Incident/19565592018-10-08T17:12:13-04:002018-10-08T17:12:13-04:00Planned Upgrades<p><small>Oct <var data-var='date'> 8</var>, <var data-var='time'>17:12</var> EDT</small><br><strong>Resolved</strong> - Marking this resolved; we will continue monitoring.</p><p><small>Oct <var data-var='date'> 8</var>, <var data-var='time'>15:48</var> EDT</small><br><strong>Update</strong> - All systems are working. 
We have some minor cleanup changes to complete, but they will not require downtime or have a significant impact on the system. We'll continue monitoring, but this maintenance work is now complete.</p><p><small>Oct <var data-var='date'> 8</var>, <var data-var='time'>14:33</var> EDT</small><br><strong>Update</strong> - All changes have been deployed, and services moved.<br />Everything seems to be working; we are continuing to monitor to make sure things are working as expected.</p><p><small>Oct <var data-var='date'> 8</var>, <var data-var='time'>11:05</var> EDT</small><br><strong>Monitoring</strong> - We are making a planned upgrade to exchange, publish, and metrics systems. We will update as we progress, and expect to be done before 3pm.</p>tag:status.prx.org,2005:Incident/11418532017-02-28T17:21:24-05:002017-02-28T17:21:25-05:00Amazon (AWS) S3 file storage errors<p><small>Feb <var data-var='date'>28</var>, <var data-var='time'>17:21</var> EST</small><br><strong>Resolved</strong> - Amazon has resolved all issues with their S3 service, and PRX services are back to normal operation.</p><p><small>Feb <var data-var='date'>28</var>, <var data-var='time'>13:12</var> EST</small><br><strong>Identified</strong> - Amazon is experiencing issues with access to files stored on their S3 service.
<br />https://status.aws.amazon.com/
<br />
<br />This is causing issues with uploading and downloading audio files and images from PRX and podcasts.</p>tag:status.prx.org,2005:Incident/11332652017-02-22T11:48:19-05:002017-02-22T11:48:20-05:00Server Maintenance<p><small>Feb <var data-var='date'>22</var>, <var data-var='time'>11:48</var> EST</small><br><strong>Resolved</strong> - Complete.</p><p><small>Feb <var data-var='date'>22</var>, <var data-var='time'>11:18</var> EST</small><br><strong>Monitoring</strong> - Provider working on some maintenance for several servers.</p>tag:status.prx.org,2005:Incident/9559962016-10-22T04:19:02-04:002016-10-22T04:19:03-04:00DDoS attack on DNS affecting all services<p><small>Oct <var data-var='date'>22</var>, <var data-var='time'>04:19</var> EDT</small><br><strong>Resolved</strong> - Attacks have not resumed.</p><p><small>Oct <var data-var='date'>21</var>, <var data-var='time'>12:56</var> EDT</small><br><strong>Update</strong> - Attacks resumed and intensified ~45 minutes ago, causing more access issues to prx sites and dovetail hosted podcasts.</p><p><small>Oct <var data-var='date'>21</var>, <var data-var='time'>09:40</var> EDT</small><br><strong>Monitoring</strong> - There is a DDoS attack against a major DNS provider; it is affecting our www.prx.org, networks.prx.org, and dovetail services.
<br />
<br />https://www.dynstatus.com/incidents/nlr4yrr162t8
<br />http://thenextweb.com/security/2016/10/21/massive-ddos-attack-dyn-dns-causing-havoc-online/
<br />
<br />It appears at this time that systems are recovering, but we will continue to monitor.</p>tag:status.prx.org,2005:Incident/5041152016-05-14T13:20:59-04:002016-05-14T13:21:00-04:00Audio serving timeouts related to ad placement requests<p><small>May <var data-var='date'>14</var>, <var data-var='time'>13:20</var> EDT</small><br><strong>Resolved</strong> - Issues have not returned, marking this resolved. I'll follow up with an explanation from the ad serving provider.</p><p><small>May <var data-var='date'>14</var>, <var data-var='time'>08:56</var> EDT</small><br><strong>Monitoring</strong> - Starting at 8 am ET, we became aware of an issue with responses for ad placements. The ad serving system's performance degraded from 0.15 to 1.5 seconds per response. It has since recovered. Looking back, it appears this issue started shortly after 7 am, but only degraded to the point of causing issues ~8:10 am. The ad serving system recovered at 8:40, and audio and ad serving is now operating normally. We'll continue to monitor this issue and follow up on what caused the ad serving issue.</p>tag:status.prx.org,2005:Incident/4455862016-03-18T15:45:38-04:002016-03-18T15:45:38-04:00Slow responding to audio requests for dovetail<p><small>Mar <var data-var='date'>18</var>, <var data-var='time'>15:45</var> EDT</small><br><strong>Resolved</strong> - Adzerk had an issue from 2:55 - 3:20 pm which resulted in extremely slow responses (2 seconds or more).
<br />http://status.adzerk.com/incidents/dcxj14cfwgxr
<br />
<br />They have fixed the problem, and things are responding normally now.</p>tag:status.prx.org,2005:Incident/4254692016-02-24T11:42:04-05:002016-02-24T11:42:04-05:00Server Upgrades<p><small>Feb <var data-var='date'>24</var>, <var data-var='time'>11:42</var> EST</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Feb <var data-var='date'>24</var>, <var data-var='time'>11:36</var> EST</small><br><strong>Verifying</strong> - Verification is currently underway for the maintenance items.</p><p><small>Feb <var data-var='date'>24</var>, <var data-var='time'>11:00</var> EST</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Feb <var data-var='date'>23</var>, <var data-var='time'>15:19</var> EST</small><br><strong>Scheduled</strong> - We are making upgrades to the OS on our servers for prx.org and networks.prx.org. There will be a period of downtime starting at ~11am, hopefully brief.</p>tag:status.prx.org,2005:Incident/3906892016-01-05T16:17:22-05:002016-01-05T16:17:23-05:00Increased traffic on www.prx.org<p><small>Jan <var data-var='date'> 5</var>, <var data-var='time'>16:17</var> EST</small><br><strong>Resolved</strong> - We applied changes that are effectively blocking this unexpected traffic.</p><p><small>Jan <var data-var='date'> 5</var>, <var data-var='time'>13:06</var> EST</small><br><strong>Monitoring</strong> - We have seen increased traffic to prx.org, currently 20x normal, seemingly generated programmatically from specific ranges.
<br />
<br />It resembles a DoS attack, but may be poorly created and misbehaving bots that are acting as one.
<br />
<br />In either case, we are taking steps to block and minimize the impact of these requests, and the site is once again responsive.</p>tag:status.prx.org,2005:Incident/3531362015-11-09T10:37:00-05:002015-11-09T10:37:29-05:00Audio file transcoding delayed<p><small>Nov <var data-var='date'> 9</var>, <var data-var='time'>10:37</var> EST</small><br><strong>Resolved</strong> - The system is fully caught up.</p><p><small>Nov <var data-var='date'> 9</var>, <var data-var='time'>07:24</var> EST</small><br><strong>Monitoring</strong> - A process for updating processing status halted unexpectedly. It is working again, but upload processing is currently delayed. We will update when the system has caught back up - could be a few hours.</p>tag:status.prx.org,2005:Incident/3469212015-10-30T02:50:02-04:002015-10-30T02:50:02-04:00Some FTP deliveries delayed from 10/27, are delivering now<p><small>Oct <var data-var='date'>30</var>, <var data-var='time'>02:50</var> EDT</small><br><strong>Resolved</strong> - Deliveries complete and working as expected</p><p><small>Oct <var data-var='date'>29</var>, <var data-var='time'>17:06</var> EDT</small><br><strong>Monitoring</strong> - We identified an issue following a release of new code that caused some deliveries starting after 11:30 AM on 10/27 to not be delivered properly.
<br />The issue has been resolved, and the deliveries are now being sent out.
<br />This incident will remain open until we confirm the affected deliveries have been fixed and successfully delivered.</p>tag:status.prx.org,2005:Incident/2994792015-08-20T15:25:08-04:002015-08-20T15:25:08-04:00Slower than usual FTP pull subauto deliveries<p><small>Aug <var data-var='date'>20</var>, <var data-var='time'>15:25</var> EDT</small><br><strong>Resolved</strong> - The backlog was eliminated at 12 noon ET; systems are operating normally.</p><p><small>Aug <var data-var='date'>20</var>, <var data-var='time'>11:12</var> EDT</small><br><strong>Monitoring</strong> - Customers with personalized deliveries (custom cart numbers and date ranges) using FTP pull may receive their deliveries up to 6 hours after they have been marked "delivered" on PRX.org. The cause of this issue has been identified and corrected and we should be working through the backlog of audio in the next few hours, depending on your geographic location. The affected files have been reviewed and broadcast schedules should not be affected. If you believe that you have not received the appropriate files in time please contact support.</p>tag:status.prx.org,2005:Incident/2375512015-05-27T13:29:30-04:002015-05-27T13:29:30-04:00Scheduled Upgrades in Datacenter<p><small>May <var data-var='date'>27</var>, <var data-var='time'>13:29</var> EDT</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>May <var data-var='date'>27</var>, <var data-var='time'>13:00</var> EDT</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. All of our colocated servers have been restarted and service should be returning to normal.
<br />
<br />Additional steps resulting in additional outages may be necessary during this window. We will provide updates here as they are made available to us.</p><p><small>May <var data-var='date'>26</var>, <var data-var='time'>13:13</var> EDT</small><br><strong>Scheduled</strong> - Our hosting provider has scheduled upgrades to some of our servers starting at Noon EDT on Wednesday, May 27th and ending at 5:00PM EDT. We expect some downtime during this maintenance window. All PRX websites will be affected.
<br />
<br />Note that FTP Pull servers (which are used for certain SubAuto deliveries) will remain available.
<br />
<br />During the window, we will introduce necessary upgrades to our software and hardware infrastructure. We will work to minimize any downtime, but users of PRX services should ensure that all critical activity which must be finished before 5:00PM EDT Wed May 27th – including uploading scheduled stories, downloading licensed pieces, and pulling stories from Networks – is completed before the start of the maintenance window at 12:00 EDT.</p>tag:status.prx.org,2005:Incident/2069482015-04-26T15:13:56-04:002018-08-01T15:37:05-04:00Delivery status updates and transcoding delayed<p><small>Apr <var data-var='date'>26</var>, <var data-var='time'>15:13</var> EDT</small><br><strong>Resolved</strong> - All servers are caught up, and to be on the safe side, all series have been reprocessed to ensure no deliveries were missed (this will result in some retries of previously successful deliveries).</p><p><small>Apr <var data-var='date'>26</var>, <var data-var='time'>13:27</var> EDT</small><br><strong>Monitoring</strong> - A server handling delivery status updates was not working correctly, and delayed notification of completed deliveries. This also affected updates of completed validations and further processing on file uploads. The system is working correctly now, and we are monitoring to make sure it catches up and completes all processing.</p>tag:status.prx.org,2005:Incident/1802342015-03-03T16:49:09-05:002015-03-03T16:49:10-05:00Database issue on primary server<p><small>Mar <var data-var='date'> 3</var>, <var data-var='time'>16:49</var> EST</small><br><strong>Resolved</strong> - The database on our primary server for prx.org became non-responsive. We have restarted it, and things have recovered. 
We will continue to monitor to make sure things are working properly.</p>tag:status.prx.org,2005:Incident/1581052015-01-09T20:36:30-05:002015-01-09T20:36:30-05:00AudioVault AVFTPServer can fail to download new files.<p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>20:36</var> EST</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>17:27</var> EST</small><br><strong>Identified</strong> - AudioVault AVFTPServer can fail to download new files.
<br />
<br />If AVFTPServer has a failed download rule from before 1/1, it will continue to look for these older files, causing it to fail to retrieve newer files from the FTP server.
<br />
<br />We have heard from several stations regarding this issue who did not get the latest audio for some PRX shows.
<br />We confirmed the issue with BE, and received from them the following solution:
<br />
<br />- On or After January 1st
<br />- Terminate AVFTPServer
<br />- Open c:\audiovau folder
<br />- Delete all AutoImport#.dir files
<br />- Re-open AVFTPServer
<br />
<br />If you have further issues or questions with AVFTPServer, please contact BE support.</p>tag:status.prx.org,2005:Incident/1353312014-11-13T17:07:50-05:002014-11-13T17:07:51-05:00Emergency maintenance for prx.org at 5 pm ET<p><small>Nov <var data-var='date'>13</var>, <var data-var='time'>17:07</var> EST</small><br><strong>Resolved</strong> - Site is back online, downtime was only a few minutes.</p><p><small>Nov <var data-var='date'>13</var>, <var data-var='time'>16:46</var> EST</small><br><strong>Monitoring</strong> - The load balancer for the prx.org website has a hardware problem and will be replaced at 5 pm ET.
<br />We expect no more than 15 minutes of downtime.</p>tag:status.prx.org,2005:Incident/1329082014-11-06T12:22:36-05:002014-11-06T12:22:36-05:00Delivery and audio processing delayed<p><small>Nov <var data-var='date'> 6</var>, <var data-var='time'>12:22</var> EST</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Nov <var data-var='date'> 6</var>, <var data-var='time'>10:46</var> EST</small><br><strong>Monitoring</strong> - We are processing, and working through the backlog. All top priority jobs are already caught up. We will continue to monitor until the backlog is completed.</p><p><small>Nov <var data-var='date'> 6</var>, <var data-var='time'>10:05</var> EST</small><br><strong>Identified</strong> - We have discovered a problem with deployment and provisioning new processing servers. A service we rely on became unavailable unexpectedly. We are updating our scripts to get rid of this dependency, and restore our processing capability.</p>