Incident description

At 14:00 on Saturday, 7 February, the geant.net domain stopped being resolved by the ROOT servers.

Incident severity: CRITICAL

Data loss: NO

Affected Services 

  • Dashboard
  • mail.geant.net
  • repositories.geant.net
  • All systems trying to send email to users on dante.net and geant.net

Cause

Our DNS was sending queries for geant.net and dante.net to the external ROOT servers. This configuration had been in place for around 2 years, but the ROOT servers suddenly no longer had the records for our domains.

Bind, running on the Consul servers, was set up to route queries as follows:

  • domain service.ha.geant.net was forwarded to local consul on port 8600
  • win.dante.org.uk and geant.local were forwarded to Windows servers geantdc01.geant.local and geantdc02.geant.local
  • everything else was going to Internet ROOT servers

Resolution

Bind was changed to forward as follows:

  • domain service.ha.geant.net is still forwarded to local consul on port 8600
  • geant.local is still forwarded to geantdc01.geant.local and geantdc02.geant.local
  • win.dante.org.uk is now forwarded to am-prd-dc01.win.dante.org.uk and am-prd-dc02.win.dante.org.uk
  • geant.net, geant.org and dante.net are now forwarded to the Infoblox grid members 62.40.104.250, 62.40.116.122 and 62.40.116.114
  • everything else remains on Internet ROOT servers
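
The effect of the change can be illustrated with a small Python model of the two forwarding tables. This is only a sketch, not the actual Bind configuration: the function name forwarders_for, the table names, and the target labels such as "local consul:8600" and "ROOT" are hypothetical. It assumes that the most specific matching zone decides where a query is sent.

```python
# Hypothetical model of the forward zones listed in this report.
# Targets are labels taken from the text, not real Bind syntax.
# "ROOT" stands for normal resolution via the Internet ROOT servers.

WINDOWS_DCS = ["geantdc01.geant.local", "geantdc02.geant.local"]
INFOBLOX = ["62.40.104.250", "62.40.116.122", "62.40.116.114"]

# Forwarding table before the incident.
BEFORE = {
    "service.ha.geant.net": ["local consul:8600"],
    "win.dante.org.uk": WINDOWS_DCS,
    "geant.local": WINDOWS_DCS,
}

# Forwarding table after the fix.
AFTER = {
    "service.ha.geant.net": ["local consul:8600"],
    "geant.local": WINDOWS_DCS,
    "win.dante.org.uk": [
        "am-prd-dc01.win.dante.org.uk",
        "am-prd-dc02.win.dante.org.uk",
    ],
    "geant.net": INFOBLOX,
    "geant.org": INFOBLOX,
    "dante.net": INFOBLOX,
}


def forwarders_for(name, zones):
    """Return the targets of the most specific zone that contains name."""
    labels = name.lower().rstrip(".").split(".")
    # Try the full name first, then drop the leftmost label each time,
    # so longer (more specific) zones win over shorter ones.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in zones:
            return zones[candidate]
    # No forward zone matched: fall back to the ROOT servers.
    return ["ROOT"]


if __name__ == "__main__":
    for label, zones in (("before", BEFORE), ("after", AFTER)):
        print(label, forwarders_for("mail.geant.net", zones))
```

In this model mail.geant.net falls through to the ROOT servers with the old table and goes to the Infoblox grid members with the new one, while names under service.ha.geant.net still go to the local consul because the longer zone name matches first.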



