13 Mar Protecting your MailStore archive against disk corruption
So you are using MailStore to archive all your company email, and to manage mailbox sizes you have chosen to delete all email older than 1 year from users’ mailboxes. It’s all working great, but then you receive a support call from one of your customers who is having issues reading an email that is only in their archive. All they get is a content unavailable error when they select the message in the archive.
So what’s going on?
Well, the error suggests corruption somewhere within the MailStore archive store content folders. This may be due to local disk corruption, a power outage during an archive job or, if using network storage, a temporary network outage.
The result is the same: the data has become damaged and will need to be either recovered from a backup or removed.
MailStore makes it straightforward to find out what data is corrupted by running a manual data integrity check against the affected archive store (or all stores if you are unsure). This will point out which ‘.dat’ content files are affected and need to be restored from a backup.
But when did the corruption happen and what backup do we need?
This is the main issue administrators come up against: you don’t initially know when the corruption happened, so it’s a matter of trial and error to find an old backup that contains a healthy copy of the affected archive store data.
Planning for this potential issue
There are some great features in MailStore to help fully automate the built-in data integrity checks. Used alongside status email reports, these can highlight data corruption soon after it has happened.
The ‘Check Data Integrity’ task can check all of the storage location data MailStore currently has attached. It cross-references the database entries against the message data, checking that the database entries point to actual data on the disk. This process can be time-consuming on large archives but can be scheduled to run out of hours.
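To give a rough feel for what a cross-reference like this involves, here is a conceptual sketch in Python. It is not MailStore's actual implementation: the index layout (message ID, file name, offset, length) and the function name are assumptions made purely for illustration. It confirms that every reference points at a content file that exists and is long enough to hold the referenced bytes.

```python
import os

def find_broken_references(index, content_dir):
    """Return (message_id, reason) for every entry whose data is not on disk.

    `index` is a hypothetical list of (message_id, file_name, offset, length)
    tuples standing in for database entries; it is not a MailStore structure.
    """
    broken = []
    for message_id, file_name, offset, length in index:
        path = os.path.join(content_dir, file_name)
        # The entry points at a content file that no longer exists.
        if not os.path.isfile(path):
            broken.append((message_id, "missing file"))
            continue
        # The referenced byte range runs past the end of the file.
        if offset + length > os.path.getsize(path):
            broken.append((message_id, "truncated data"))
    return broken
```

A check of this kind has to touch every referenced file, which is one reason a full run can take a long time on a large archive.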
We recommend you configure the job to run every day, but if you find it’s taking too long to run daily, a schedule of at least once a week should be fine.
It’s worth noting that when the check only happens once a week, your backups could back up the corrupted data for a whole week before you become aware of an issue.
To help gauge this, we recommend running the data integrity check once manually over a weekend and taking a note of how long it takes. This will help you work out how often the check can run.
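The reasoning above can be sketched as a small calculation. This is only an illustration: the function names, the out-of-hours window lengths and the one-backup-per-day default are assumptions, not MailStore settings.

```python
from datetime import timedelta

def suggest_check_interval_days(measured_duration, nightly_window, weekend_window):
    """Suggest how often the integrity check can run, in days.

    Returns 1 if the measured run fits in the nightly window, 7 if it only
    fits in the weekend window, and None if it fits in neither.
    """
    if measured_duration <= nightly_window:
        return 1
    if measured_duration <= weekend_window:
        return 7
    return None

def backups_at_risk(check_interval_days, backups_per_day=1):
    """Worst-case number of backups taken between two checks, any of which
    could contain corruption that has not yet been detected."""
    return check_interval_days * backups_per_day

# Example: a manual run took 20 hours; assume 8 quiet hours each night
# and 48 quiet hours over the weekend.
interval = suggest_check_interval_days(
    timedelta(hours=20), timedelta(hours=8), timedelta(hours=48)
)
```

With these example figures the check only fits at the weekend, so up to a week of daily backups could include damaged data before the next check reports it.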
How will I know when a data integrity check has found an issue?
MailStore can send a daily status report email that summarises the jobs that have run that day, including the Check Data Integrity job. This would very quickly alert you if the integrity check had failed for any reason, prompting you to log into MailStore to investigate in more detail.
By receiving a daily alert you could revert to a previous backup very quickly with very little impact.
If administrators also deploy a good backup strategy this should protect against any data corruption that can occur on the hardware used by MailStore.
Look here for more information on configuring the Data Integrity Check and status report.
Backup, Backup and Backup
Any customer who is using MailStore will understand the importance of the data it contains. This is especially so if MailStore is being used to delete any old email in the mail platform. As a result, a good backup strategy is an absolute necessity.
So what is a good strategy?
Well, let’s look at the options.
MailStore built-in backup job
This feature can be used to take a single mirror copy of all the storage locations (where the archive email lives) along with the Master Database (the MailStore configuration). This job creates a single, regularly updated backup that can be stored on a local disk or a remote Windows share.
Positives
- Simple Mirror backup that does not need to be restored to be used.
- If stored on a remote share it’s very quick to fire up a new MailStore server in the event the primary server hardware dies.
- The backup is very quick after the first run as only changed data is updated.
- Ideal to help provide a quick standby server option.
Negatives
- No history is stored: every time the backup runs, the destination is overwritten with the current data set.
- Corruption can be copied to the destination.
- Should not be relied on as the only backup solution.
3rd Party Backup solutions like BackupAssist
MailStore integrates very well with 3rd party backup solutions through the built-in Volume Shadow Copy Service support. This allows a backup task to run even when the MailStore server is running. There will be a short pause while the database transactions are written and a VSS snapshot is taken but if scheduled to run in a quiet time there should be no interruption. There is an option to exclude the Search indexes to minimise the backup time, as these can always be recreated if needed from a healthy data set.
Positives
- Short incremental backup times (MailStore data set is made up of multiple data files that are ideally suited to both file and block based backup solutions).
- De-duplicated data means backups are also kept small allowing for multiple revisions to be kept efficiently.
- A large number of backup revisions provides a big time window to spot and resolve any potential corruption issues before backups are overwritten.
- Multiple archive stores allow for a granular restore of only the damaged stores if needed, which is especially useful if only old stores are damaged.
- Ability to take backups offsite to help meet 3-2-1 backup requirement.
Negatives
- Additional cost for the backup software.
- Some extra work to set up initially.
- Extra space required to store multiple backups.
The key requirement to fix any corruption issue is going to be the ability to revert to a backup prior to the corruption. For this reason, we would strongly recommend that you have backup revisions that go back at least several weeks if at all possible to give you enough time to spot and recover from a potential data error.
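As a minimal sketch of that selection step, the helper below picks the newest backup revision taken no later than the start of the last integrity check that passed. The function name and the idea of recording when each clean check started are assumptions for illustration; your backup software will have its own way of listing revisions.

```python
from datetime import datetime

def pick_restore_point(backup_times, last_clean_check_started):
    """Return the newest backup taken at or before the start of the last
    clean integrity check, or None if no revision goes back that far."""
    candidates = [t for t in backup_times if t <= last_clean_check_started]
    return max(candidates) if candidates else None

# Example: weekly revisions, with the last clean check starting on 10 March.
backups = [
    datetime(2024, 3, 1, 1, 0),
    datetime(2024, 3, 8, 1, 0),
    datetime(2024, 3, 15, 1, 0),
]
restore_from = pick_restore_point(backups, datetime(2024, 3, 10, 2, 0))
```

If the function returns None, the retained revisions do not reach back to a known-good state, which is exactly the situation a longer retention period is meant to avoid.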
Armed with this backup strategy, you will be in a much better place to deal with disk corruption should it happen.