Incident log entries and details not loading for some users
Incident Report for PagerDuty

Summary

On May 11, 2017, starting at 15:00 UTC, we suffered a degradation of the data pipeline service responsible for processing incident details. This delayed the display of incident details for approximately thirty minutes. It was followed by a further thirty-minute period of severe degradation, during which incident details were not visible.

What Happened?

A transient network issue caused our data pipeline cluster to enter an inconsistent state, which was exacerbated by a bug in third-party software that we use. This delayed the processing of event details. During the initial response, restarting a subset of the cluster control hosts introduced further state inconsistency, which led to the more severe service degradation.

What Are We Doing About This?

We have updated our incident response documentation for the data pipeline so that we can restore the cluster more effectively if it enters an inconsistent state. We also plan to upgrade the cluster control software so that it is no longer affected by the aforementioned bug.

We would like to again apologize for any inconvenience this issue caused. If you have any questions, do not hesitate to contact us at support@pagerduty.com.

Posted Jun 08, 2017 - 19:12 UTC

Resolved
We have fully recovered at this time. All incident details and log entries should be displayed normally.
Posted May 11, 2017 - 16:09 UTC
Identified
We have taken steps to mitigate the impact of the aforementioned issue and are in the process of recovering.
Posted May 11, 2017 - 16:03 UTC
Update
We are still investigating the issue affecting the display of incident details and log entries. All other services are working normally.
Posted May 11, 2017 - 15:53 UTC
Investigating
Incident log entries and details are not being rendered for some customers. Notification delivery and response are working properly at this time.
Posted May 11, 2017 - 15:37 UTC