Recent System Failures November 21st, 2007
Dear Customer,
Over the last few days there have been a number of problems on the AGUK network.
On Sunday 18th November at around 11:30GMT the performance of the server CHIEFWIGGUM started to deteriorate. Despite efforts by engineers the server became unrecoverable. It appears the RAID system had failed and corrupted both disk drives within the machine. At 23:00GMT on the same day it was decided that the system could not be fully recovered and that instead we would begin restoring sites to alternative servers using backup data.
Posted in Network Outages by
Andy @ AGUK | 6 Comments »
Network Outage November 15th, 2007
Today 15th November 2007 there was a network outage from 08:00 - 09:25GMT.
The network is now fully restored. During the outage no email was lost, as our backup incoming gateway servers are based on an alternate network. When the network returned, the queued mail was forwarded to the main mail server.
We are waiting on a response from the network providers as to the cause of the outage and will provide further details when these become available.
We apologise for any inconvenience this may have caused.
Mail Server Issues October 5th, 2007
This morning (5th October 2007) we became aware of an issue with the mail server platform. SMTP connections to the main mail server were extremely slow or being rejected.
This meant that customers were unable to send or receive mail. The incoming gateway servers continued to work normally, so incoming mail was accepted and queued by these servers.
We managed to resolve the problem with the main mail server at around 12:10GMT. The incoming gateway servers were then able to begin forwarding the queued mail to the main mail server for delivery to individual accounts.
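The queue-then-forward behaviour described above can be illustrated with a minimal sketch. It is not the actual gateway software: the class name `GatewayQueue`, the `forward` callable, and the use of `ConnectionError` to signal an unavailable main mail server are all assumptions made for illustration.

```python
from collections import deque


class GatewayQueue:
    """Hypothetical incoming gateway: accept mail, hold it until it can be forwarded."""

    def __init__(self, forward):
        # `forward` is an assumed callable that hands one message to the main
        # mail server and raises ConnectionError while that server is unavailable.
        self._forward = forward
        self._pending = deque()

    def accept(self, message):
        """Always accept the message, then try to forward whatever is queued."""
        self._pending.append(message)
        return self.flush()

    def flush(self):
        """Forward queued mail in arrival order; return how many messages remain."""
        while self._pending:
            try:
                self._forward(self._pending[0])
            except ConnectionError:
                # Main server still unavailable: keep the message and retry later.
                return len(self._pending)
            self._pending.popleft()
        return 0
```

A message is removed from the queue only after the forward call succeeds, which is why an outage of the main server delays mail in this sketch rather than losing it.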
Spam Server Failure June 1st, 2007
Over the last two days we have become aware of an increase in the amount of spam arriving in customers' inboxes. The cause has been identified as a failure on one of our inbound gateways.
Because of how the system works, this did not affect the delivery of mail: when filtering fails, the fallback is simply to deliver all mail. The failure affected only one of our gateways, and the other gateways continued to filter correctly, so the increase in spam was small.
The problem has now been rectified and customers' inboxes should again become less burdened with spam.
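The "deliver all mail" fallback mentioned above is often called failing open. The sketch below shows the idea only; the function name, the `spam_filter` and `deliver` callables, and the "rejected"/"delivered" return values are hypothetical and do not describe the real gateway.

```python
def deliver_with_filter(message, spam_filter, deliver):
    """Filter a message, but deliver it anyway if the filter itself breaks."""
    try:
        is_spam = spam_filter(message)
    except Exception:
        # Fail open: a broken filter must never cause legitimate mail to be lost,
        # so the message is treated as clean. The cost is that spam gets through.
        is_spam = False
    if is_spam:
        return "rejected"
    deliver(message)
    return "delivered"
```

The trade-off is the one the notice describes: a filter failure shows up as extra spam rather than as missing mail.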
BART Server Outage May 14th, 2007
Today [14 May 2007] at approximately 08:15GMT we became aware of an issue affecting the server BART. The server was not responding to web or ping requests.
This server hosts web sites and the main stats server. As a result, websites hosted on this server could not be accessed, nor could the statistical data for any website hosted on the network.
The server was rebooted but failed to restart. Engineers began checking the server and discovered that the server's RAID array had become corrupt and needed to be reset and rebuilt.
POP3/IMAP Outage February 28th, 2007
We have discovered that at around 15:00GMT today the POP3/IMAP service failed. This meant that POP3/IMAP connections were being rejected by the mail server.
This was rectified at 17:04GMT. The cause of the outage is currently unknown and is being investigated. We are also investigating why our notification script, which periodically logs in to the POP3 server, failed to notify us of the error.
Further details will be forthcoming.
POP3/IMAP Server Outage February 5th, 2007
From approximately 20:00GMT Saturday 3rd February 2007 the mail server began producing errors when customers attempted to check mail via POP3 or IMAP.
We did not become aware of the problem until approximately 10:00GMT Monday 5th February 2007. The problem was then resolved by 10:05GMT the same day.
During the entire period the mail server continued to receive incoming mail and web mail access continued to be available.
Our monitoring system failed to indicate there was an error because it works by checking that the appropriate system ports are responding. In this case the ports continued to respond and the error occurred during the authentication process, so it was not detectable by our system monitoring.
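The difference between the two kinds of check can be sketched with Python's standard `socket` and `poplib` modules. This is an illustration under stated assumptions, not the monitoring system described above: the function names, the default port 110, the `pop3_factory` parameter (included only so the check can be exercised without a real server), and the idea of a dedicated test mailbox are all hypothetical.

```python
import poplib
import socket


def port_is_open(host, port=110, timeout=5.0):
    """Shallow check: passes as soon as the TCP port accepts a connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pop3_login_works(host, user, password, port=110, timeout=5.0,
                     pop3_factory=poplib.POP3):
    """Deeper check: connect, authenticate, and request mailbox status."""
    try:
        conn = pop3_factory(host, port, timeout=timeout)
    except (OSError, poplib.error_proto):
        return False
    try:
        conn.user(user)
        conn.pass_(password)  # an authentication failure raises error_proto here
        conn.stat()
        return True
    except (OSError, poplib.error_proto):
        return False
    finally:
        try:
            conn.quit()
        except (OSError, poplib.error_proto):
            pass
```

A server that accepts connections but fails during authentication would pass `port_is_open` and fail `pop3_login_works`, which is the gap this outage exposed.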
CHIEFWIGGUM Server Outage January 23rd, 2007
At approximately 12:03GMT today (Tuesday 23rd January 2007) the server CHIEFWIGGUM became unresponsive and failed to reboot to normal operation.
As a result the server was taken offline. This meant downtime for all websites hosted on CHIEFWIGGUM, and Helm control panel access was also suspended.
Service was not restored until 14:25GMT.
Total downtime 2 hours and 22 minutes.
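The downtime figure follows directly from the two timestamps above. As a small worked check, the sketch below computes it; the helper name `downtime_hm` is hypothetical, and it assumes both times fall on the same day.

```python
from datetime import datetime


def downtime_hm(start, end, fmt="%H:%M"):
    """Return (hours, minutes) between two same-day times given as HH:MM."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    total_minutes = int(delta.total_seconds()) // 60
    return divmod(total_minutes, 60)


hours, minutes = downtime_hm("12:03", "14:25")
print(f"Total downtime {hours} hours and {minutes} minutes")
# prints: Total downtime 2 hours and 22 minutes
```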
The issue appears to have been with a corrupt driver on the server which has since been restored.
Our apologies to all customers affected by the outage.
Spam Filtering Service Outage January 22nd, 2007
One of our incoming gateway servers suffered an error with the greylisting processing. This means a higher than normal amount of spam may have been received over the last 6 hours.
Due to the design of the system this did not affect delivery of any mail. Users may however have seen a slight increase in the amount of spam received.
This issue has now been resolved and the service is running as expected.
Mail Server Problems November 30th, 2006
We experienced a significant software failure with our main mail server today at approximately 15:00.
This issue caused extremely slow performance and left a large build-up of unprocessed mail on the server waiting to be delivered. This affected both incoming and outgoing mail.
We resolved the software issue at approximately 17:30 and mail is now being processed again. However, there is a backlog of over 10,000 messages, which will take at least a couple of hours to clear fully.