27 Jan Rsync Vs. file replication – which is best for your local backups?
I often get asked by customers which type of job they should choose when backing up files across a local network from one server to another. There is a range of considerations to weigh up in order to fully answer this question, but the two areas I want to focus on in this post are cost and performance.
You may be wondering why I’d suggest using rsync for jobs running within a local network, as typically this technology is associated with bandwidth efficient backups over the Internet, and so usually in the context of an off-site solution.
This is indeed true, and rsync is still the best choice for that purpose, but I also see customers adopting rsync for internal traffic as it can provide significant cost savings in terms of licensing, particularly for multi-server setups.
Instead of purchasing a full copy of BackupAssist, if it’s only rsync jobs you’re looking to run, you can purchase the “rsync standalone” license. I won’t quote exact prices as they can change, but you’ll see the difference in costs on our Web site pricing page here. Please note, however, that the full product plus the “rsync add-on” is still the best way to buy BackupAssist if you want to complement your rsync transfer with, say, an image or file replication job.
Question – do an rsync job and a file replication job backing up the same files transfer the same amount of data over the network? And do they take roughly the same amount of time to run?
Well the simple answer is no, and to highlight the major differences I’ve set up a simple test…
I set up two jobs in BackupAssist to mirror 39GB of changing files from one server to another, one using Rsync and one using file replication to a network share. Both jobs were set up as a simple mirror that updates the backup daily. We are running a standard 100Mbps LAN with very low specification workstation PCs, but as we’re only interested in a comparison this doesn’t matter. You may well see much better results, but the relative difference should be similar.
The results are quite interesting.
The backup is a total of 39GB made up of 198,000 files, with an average of 300MB (2,000 files) changed each day.
Average time to run over a week
File replication: 33mins
Average data sent per day over the Network
File Replication: 368MB
As you would expect, the Rsync job sends less data because it only sends the parts of files that have changed, and on larger backup jobs this can make a big difference. In our example we achieved a bandwidth saving of over 83% compared with file replication.
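To make the idea of “only sending the parts that have changed” a little more concrete, here is a minimal Python sketch. It is not how rsync or BackupAssist is actually implemented: rsync uses rolling checksums so that it can cope with data shifting position inside a file, whereas this simplified version only compares fixed-position blocks. The function names and the tiny block size are my own, chosen purely for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, deliberately tiny so the example is easy to follow


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Return one digest per fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> dict[int, bytes]:
    """Map block index -> new block content for every block that differs from the old copy."""
    old_digests = block_digests(old, block_size)
    delta = {}
    for index, start in enumerate(range(0, len(new), block_size)):
        block = new[start:start + block_size]
        digest = hashlib.sha256(block).digest()
        # Send the block if the old copy has no such block or its digest differs.
        if index >= len(old_digests) or old_digests[index] != digest:
            delta[index] = block
    return delta


def apply_delta(old: bytes, delta: dict[int, bytes], new_length: int,
                block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the new file on the receiving side from its old copy plus the delta."""
    out = bytearray()
    for index, start in enumerate(range(0, new_length, block_size)):
        if index in delta:
            out += delta[index]                      # block that travelled over the network
        else:
            out += old[start:start + block_size]     # unchanged block reused locally
    return bytes(out[:new_length])


old_copy = b"aaaabbbbccccdddd"
new_copy = b"aaaabbXbccccdddd"
delta = changed_blocks(old_copy, new_copy)
sent = sum(len(block) for block in delta.values())
print(f"blocks sent: {sorted(delta)}; {sent} of {len(new_copy)} bytes")
```

In this toy case only one 4-byte block out of 16 bytes needs to travel, whereas a whole-file copy would send all 16. The price is that both copies have to be read and checksummed first, which is where the extra processing time comes from.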
However, the extra time Rsync needs in this case to analyse the files, work out what has changed and then calculate the file deltas is significant, which is why the job actually takes over twice as long.
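A quick back-of-the-envelope calculation helps show why saving bandwidth doesn’t automatically save time on a LAN. The sketch below uses only figures quoted in this post (100Mbps link, 368MB per day for file replication, a saving of over 83%) and assumes an ideal link with no protocol overhead and 1MB = 8 megabits. Those assumptions are mine, and real throughput will be lower, especially with lots of small files.

```python
LINK_MBPS = 100        # LAN speed quoted in the post
REPLICATION_MB = 368   # average data sent per day by file replication
RSYNC_SAVING = 0.83    # "over 83%" saving, so this gives an upper bound for Rsync


def wire_seconds(megabytes: float, link_mbps: float = LINK_MBPS) -> float:
    """Ideal transfer time in seconds: megabytes * 8 bits / link speed in Mbps."""
    return megabytes * 8 / link_mbps


# Upper bound on what the Rsync job sent, implied by the quoted saving.
rsync_mb_upper = REPLICATION_MB * (1 - RSYNC_SAVING)

print(f"Rsync sent at most about {rsync_mb_upper:.1f} MB per day")
print(f"File replication, ideal wire time: {wire_seconds(REPLICATION_MB):.1f} s")
print(f"Rsync, ideal wire time: {wire_seconds(rsync_mb_upper):.1f} s")
```

On these assumptions the data itself accounts for roughly half a minute of wire time for file replication and a few seconds for Rsync, so the difference in run time between the two jobs is likely to come mostly from scanning, comparing and reporting rather than from moving bytes.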
One interesting thing we noticed is that this extra time is mostly taken up generating the media usage report, so if you can live without it you can cut almost half an hour off the job time by turning this option off.
Another point to note is that for our tests we didn’t use the ‘LAN optimisation’ option within the Rsync options, because turning it on had the opposite effect to what I’d expected. More data was transmitted because it copied changed files whole, and the job took an additional half an hour to run.
So to summarise, for the fastest job possible over a LAN I’d recommend you stick to file replication, but just be aware that you will put more load on the network while the job is running. If cost is a primary concern then using Rsync is still a good option, but you may want to turn off the ‘LAN optimisation’ and ‘media report’ options to keep your jobs as short as possible.
I hope that helps you decide which is the best job for your requirements – don’t forget, there’s a 30-day trial available so there’s no need to commit to either before checking out the results for yourself.