IIX Jakarta Traffic Outage — Root Cause Identified (Oct 30 2025)
Observed Evidence
Below are screenshots from the IIX monitoring portal showing the traffic patterns for multiple members.
The first graph (Huawei Cloud – Arista / 136907) shows throughput dropping to near zero between ~01:30 and ~02:30 local time and staying flat until about 08:00. Other ISPs and CSPs on the exchange showed similar patterns, which points to a common point of failure inside the IIX fabric or core switching plane rather than a problem at any single member.
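The reasoning above (many members dropping in the same time slots implies a shared fault) can be sketched as a simple correlation check. This is only an illustration: the member names, sample values, and the `drop_ratio` and `min_fraction` thresholds are made up, and it assumes throughput samples exported from an NMS on a common time grid, with the outage covering less than half of each series so that the median is a usable baseline.

```python
from statistics import median

def common_drop_slots(series, drop_ratio=0.1, min_fraction=0.8):
    """Return the slot indices where most members are far below their own baseline.

    series: dict of member name -> list of throughput samples (same slots for all).
    A member counts as "down" in a slot when its sample is below
    drop_ratio * its median throughput (assumed baseline).
    """
    if not series:
        return []
    baselines = {name: median(values) for name, values in series.items()}
    n_slots = min(len(values) for values in series.values())
    slots = []
    for i in range(n_slots):
        down = sum(
            1 for name, values in series.items()
            if values[i] < drop_ratio * baselines[name]
        )
        if down / len(series) >= min_fraction:
            slots.append(i)
    return slots

# Hypothetical samples (Mbit/s), not taken from the IIX portal.
series = {
    "member-a": [900, 950, 20, 10, 15, 920, 940],
    "member-b": [400, 420, 5, 0, 3, 410, 430],
    "member-c": [300, 310, 290, 2, 1, 305, 300],
}
print(common_drop_slots(series))
```

Slots where only one or two members dip are ignored; only slots where nearly all members collapse together are reported, which is the pattern that suggests a fault in the shared fabric.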
Operator Incident Statement
Root Cause: The service interruption was caused by looping on the IIX side due to one of its members.
Corrective Action: As a precaution, our team temporarily disabled the port to IIX.
In other words, a Layer 2 loop, and the broadcast storm that resulted from it, originated from one IIX participant and caused congestion and instability across the exchange fabric. To contain the impact, the operator that issued the statement disabled its own port to IIX until the exchange restored normal operation.
Impact Assessment
- All major cloud providers (Huawei Cloud, Akamai, Tencent Cloud, etc.) connected to IIX showed traffic loss in the same period.
- End users experienced intermittent connection failures to local Indonesian services between 01:30 and 08:00 WIB.
- Traffic was temporarily rerouted to Singapore IX and other paths, causing higher latency and packet loss for domestic traffic.
Timeline of Events (Local Time WIB)
| Time | Event |
|---|---|
| 01:30 – 02:30 | Traffic drops observed across multiple IIX members. |
| 02:45 | Operators start isolating affected ports to prevent loop propagation. |
| 03:00 – 07:30 | IIX traffic remains near zero; looping issue under investigation. |
| 08:00 | Traffic restoration visible on IIX NMS graphs for Huawei Cloud and others. |
| 09:00 | Operator confirms loop source identified and port re-enabled after stabilization. |
Technical Interpretation
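The reported loop is most likely a Layer 2 forwarding loop that produced a broadcast storm, caused by a misconfiguration or failure on a member's switch. Ethernet frames carry no TTL, so broadcast and unknown-unicast frames caught in a loop keep circulating and multiplying. Inside a large exchange like IIX, this rapidly amplifies traffic until interfaces are congested and MAC address tables become unstable, because the same source addresses are learned on different ports in quick succession. The result is wide-scale traffic loss. Temporarily disabling the affected port is a standard containment measure that lets the fabric converge and the loop clear.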
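The MAC table instability described above is often visible as "MAC flapping": one source address moving between ports many times within a few seconds. A minimal sketch of detecting it from a stream of MAC-learn events follows. The event format, port names, addresses, and the `window_s` and `max_moves` thresholds are assumptions for illustration, not the format of any particular switch or of the IIX NMS.

```python
from collections import defaultdict

def find_flapping_macs(events, window_s=10.0, max_moves=3):
    """Return the set of MACs that changed port at least max_moves times
    within window_s seconds.

    events: iterable of (timestamp_s, mac, port) tuples, sorted by timestamp.
    """
    last_port = {}             # mac -> port where it was last learned
    moves = defaultdict(list)  # mac -> timestamps of recent port changes
    flapping = set()
    for ts, mac, port in events:
        prev = last_port.get(mac)
        if prev is not None and prev != port:
            times = moves[mac]
            times.append(ts)
            # Drop port changes that fall outside the sliding window.
            while times and ts - times[0] > window_s:
                times.pop(0)
            if len(times) >= max_moves:
                flapping.add(mac)
        last_port[mac] = port
    return flapping

# Hypothetical learn events: the first MAC bounces between two ports
# (loop symptom); the second moves once after a minute (normal).
events = [
    (0.0, "aa:aa:aa:00:00:01", "eth1"),
    (0.5, "aa:aa:aa:00:00:01", "eth7"),
    (1.0, "aa:aa:aa:00:00:01", "eth1"),
    (1.5, "aa:aa:aa:00:00:01", "eth7"),
    (2.0, "bb:bb:bb:00:00:02", "eth3"),
    (60.0, "bb:bb:bb:00:00:02", "eth4"),
]
print(find_flapping_macs(events))
```

A single move is treated as normal (a member re-homing a router, for example); only repeated moves inside the window are flagged.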
Next Steps and Recommendations
- IIX to conduct a member-level audit of port loop-prevention mechanisms (STP, BPDU Guard, Loop Protection).
- All participants should enable edge-port protection features (for example BPDU Guard or loop protection) on their exchange-facing ports to mitigate future loops.
- Operators to retain NMS logs and packet captures for root-cause correlation.
- IIX should review storm-control thresholds on the exchange fabric and implement loop-detection alarms.
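As an illustration of the last recommendation, the sketch below raises an alarm when a port's broadcast rate stays above a threshold for several consecutive polling intervals. It is not a vendor storm-control feature or IIX tooling. The port names, readings, `threshold_pps`, and `consecutive` values are hypothetical, and real thresholds would need to be derived from each port's normal broadcast baseline.

```python
def storm_alarms(samples, threshold_pps=1000, consecutive=3):
    """Return ports whose broadcast rate exceeded threshold_pps for
    `consecutive` polling intervals in a row.

    samples: dict of port name -> list of broadcast packets/second readings,
    oldest first.
    """
    alarms = []
    for port, readings in sorted(samples.items()):
        run = 0  # length of the current above-threshold streak
        for pps in readings:
            run = run + 1 if pps > threshold_pps else 0
            if run >= consecutive:
                alarms.append(port)
                break
    return alarms

# Hypothetical readings: member-a shows a sustained storm,
# member-b shows a single short spike.
samples = {
    "member-a": [200, 5000, 9000, 12000],
    "member-b": [100, 2000, 150, 90],
}
print(storm_alarms(samples))
```

Requiring several consecutive readings avoids alarming on a single short burst, at the cost of a slower reaction; the right trade-off depends on the polling interval.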