Upstream (Cogent) routing issue resolved
(Update to the web only post)
BlueBox, along with our direct upstream providers, has isolated our
exposure to traffic crossing Cogent between 11:25am and 11:35am
(Pacific). Neither BlueBox nor our direct providers utilize Cogent directly,
meaning that the impact was minimal during this event.
Our graphs show an 8 to 10 percent drop in inbound traffic while Cogent
was having severe packet loss to AT&T and others. The major backbone
players seem to be stable at the moment, with some loss between Verizon
and Level 3, but not nearly at the scale Cogent experienced.
All systems are, and remain, fully functional.
Thank You,
BlueBox team
(previous web only post below)
Upstream Routing Issue
We are currently investigating an upstream routing issue in which Cogent is
dropping vast amounts of traffic. BlueBox does not utilize Cogent as a
direct provider; however, some users trying to reach BlueBox (or any other
site) via Cogent are likely to encounter packet loss. We are taking
corrective action on our side to minimize the impact.
The issue appears to be corrected now. It appears to have affected roughly
10% of incoming traffic to our Seattle Primary Datacenter for roughly
10 minutes.
(Web Post Only)
Re: lb04 dropping some packets
The lb04 issue has been resolved. It was tracked down to one instance
which was seeing unusual traffic patterns. We’ve resolved this and are
continuing to monitor the site.
(Web Only Post)
lb04 dropping some packets
One load balancer (lb04) is currently experiencing some packet drops.
No other load balancers are affected at this time.
We’re investigating and should have a resolution shortly.
Currently only a few sites are affected, but we will keep monitoring
to see if it gets worse. Should that happen, we’ll fail over to the
redundant load balancer, and we will update you all if that becomes
necessary.
(Web Only Post)
Temporary Upstream Network Issue [resolved]
One of our upstream network providers experienced a network issue at
about 12:39am PDT (GMT -7) this morning, causing a BGP change.
Customers may have seen downtime lasting 1 to 3 minutes during this
time. No residual network issues have been noticed; however, the
upstream provider is working on hardware at this time, which may
result in one more BGP change when their work has finished. Please
open a ticket with Blue Box if you have any questions.
(web only post)
DNS issues [resolved]
The primary DNS resolver used within Blue Box Group has been experiencing issues, which disrupted service for some customers. The secondary resolver should become active in such cases, but this did not happen for some customers. The primary resolver is running again, so everything should be functional. Please let us know if your site is still experiencing any issues as a result of this DNS disruption.
Datacenter Maintenance Tonight - Seattle Primary DC
Maintenance has been completed. It was non-intrusive to all services, and
all areas of focus were completed successfully.
(Web only post)
Datacenter Maintenance Tonight - Seattle Primary DC
Tonight starting at 11:00 pm Pacific we will be performing some physical
maintenance in one area of our Primary Seattle datacenter location. This
maintenance should be non-intrusive to all services and no impact is
expected. Estimated duration is 2 hours, 30 minutes, and the work should be
completed no later than 02:00 am Pacific.
(Web post only)
Network Maintenance Complete
Maintenance has been completed. It was non-intrusive to all services, and
all areas of focus were completed successfully.
(Web only post)
Network Maintenance Tonight
Tonight at 11:00pm PDT (06:00 UTC) we will be replacing a redundant stacked
device in our distribution layer which has proven to be unstable. No packet
loss, latency increases or other problems are anticipated as a result of
this work. However, because it affects the distribution layer of our
network, the potential for a brief interruption of service does
exist.
Please note that this work replaces one device in a redundant
stack; we do not anticipate any issues in performing it.
(Web only post)