Storage Management

Posted Jun 30, 2008  |  By: Brian Peterson

The three types of data de-duplication


In the storage business, data de-duplication is all the rage. Customers are clamoring to cash in on the savings, because de-duplication offers a number of improvements over traditional storage for backups. But with those benefits comes a confusing set of questions, the key one being: How do we choose the best de-dupe technology? In answering that question, it's important not to jump ahead to specific products. By first choosing a product type, whether host-based, VTL-based or NAS-based, you can simplify the decision process.

Here's how they break down.

Host-based data de-duplication

Host-based de-duplication requires the backup client to do a lot of the de-dupe work. In many cases, that's not a problem, especially when the client is not CPU-bound. Host-based de-dupe really helps when backup bandwidth is constrained by small wide area network (WAN) pipes or consolidated virtual servers.

Host-based data de-duplication solutions usually require you to replace traditional backup software with the de-dupe backup software, so before you recommend such a change, make sure that the benefits are significant enough.

Remote office backups to the corporate site will benefit from host-based de-duplication because it eliminates most or all of the backup hardware located at the remote site and optimises the network bandwidth required to centralise backups to corporate data centers. VMware backups benefit from host-based de-duplication by limiting the network bandwidth required to back up multiple guest machines concurrently.
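
To make the bandwidth argument concrete, here is a minimal Python sketch of how a host-based client might de-dupe before sending. It is an illustration under stated assumptions, not any vendor's implementation: the fixed 4 KB chunk size, SHA-256 fingerprints, and the names BackupServer, fingerprints and backup are all hypothetical. The client fingerprints its own data (the CPU cost mentioned above), asks the server which fingerprints it lacks, and sends only those chunks.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; an illustrative choice only


def fingerprints(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks and return (sha256 hex digest, chunk) pairs."""
    return [
        (hashlib.sha256(data[i:i + chunk_size]).hexdigest(), data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


class BackupServer:
    """Hypothetical central store keyed by chunk fingerprint."""

    def __init__(self):
        self.chunks = {}

    def missing(self, digests):
        """Return the set of digests the server does not hold yet."""
        return {d for d in digests if d not in self.chunks}

    def store(self, digest, chunk):
        self.chunks[digest] = chunk


def backup(client_data: bytes, server: BackupServer) -> int:
    """Back up client_data; return the payload bytes that crossed the network."""
    pairs = fingerprints(client_data)          # hashing happens on the client
    needed = server.missing(d for d, _ in pairs)
    sent = 0
    for digest, chunk in pairs:
        if digest in needed:
            server.store(digest, chunk)
            sent += len(chunk)
            needed.discard(digest)             # send each unique chunk only once
    return sent


# Two guest machines that share the same (made-up) OS image.
server = BackupServer()
os_image = b"A" * CHUNK_SIZE * 3 + b"B" * CHUNK_SIZE
guest1 = os_image + b"guest1-data".ljust(CHUNK_SIZE, b"\0")
guest2 = os_image + b"guest2-data".ljust(CHUNK_SIZE, b"\0")

sent1 = backup(guest1, server)  # sends the unique chunks of guest1
sent2 = backup(guest2, server)  # sends only the chunk the server has not seen
print(len(guest1), sent1, len(guest2), sent2)
```

In this toy run each guest holds 20,480 bytes, yet the second backup sends a single 4,096-byte chunk because the shared image is already on the server. The fingerprint exchange itself also uses some bandwidth, which the sketch does not count.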

Virtual tape library (VTL) data de-duplication

De-duped virtual tape libraries (VTLs) work well when the backups are localised to the data center and/or bandwidth between the client and backup storage is not an issue. Naturally, many customers will want to take advantage of de-duplication in their existing or planned virtual tape infrastructure.

VTLs are already very common in mid-sized and large enterprises and consume a significant part of many companies' overall storage budget. De-duping at the VTL should be simple for customers because almost all backup software platforms support VTLs. In addition, de-duped VTLs are a good fit for disaster recovery replication and when the customer wants to replace tape for primary backups. Given the increased efficiency and de-duped VTL-to-VTL replication, there may finally be an opportunity to show real ROI for backup to disk instead of tape.

Primary network-attached storage (NAS) data de-duplication

VTLs introduce a lot of the same challenges that physical tape presents, such as tape contention, poor cartridge utilisation and intolerance to high storage area network (SAN) latencies. In some cases, customers want the benefits of target hardware-based de-duplication without the complexity and limitations of tape. In these cases, de-duped NAS file systems may be the perfect remedy. De-duped NAS storage has some impressive cost advantages because it doesn't require SAN connections or VTL licensing in the backup software. In some cases, the de-duped NAS storage can be used for more than just backups, such as highly redundant archive data where throughput is less important than space savings.
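
The space-savings point can be illustrated with a small Python sketch of target-side de-duplication, where clients write whole files and the storage does the de-dupe work. This is a simplified model under assumptions, not how any particular NAS product works: the DedupStore class, the 4 KB chunk size and the file names are hypothetical, and each file name is assumed to be written only once.

```python
import hashlib


class DedupStore:
    """Hypothetical target-side store: clients write whole files, the store de-dupes."""

    def __init__(self, chunk_size: int = 4096):
        self.chunk_size = chunk_size
        self.chunks = {}        # fingerprint -> chunk bytes, stored once
        self.files = {}         # file name -> ordered list of fingerprints
        self.logical_bytes = 0  # bytes written by clients (assumes unique names)

    def write(self, name: str, data: bytes) -> None:
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # keep only the first copy
            recipe.append(digest)
        self.files[name] = recipe
        self.logical_bytes += len(data)

    def read(self, name: str) -> bytes:
        """Rebuild a file from its chunk recipe."""
        return b"".join(self.chunks[d] for d in self.files[name])

    def physical_bytes(self) -> int:
        return sum(len(c) for c in self.chunks.values())

    def ratio(self) -> float:
        """De-dupe ratio: logical bytes written divided by physical bytes kept."""
        return self.logical_bytes / self.physical_bytes()


# Three monthly archives with identical, made-up content.
store = DedupStore()
report = b"quarterly report ".ljust(4096, b".") * 2  # 8,192 bytes
for month in ("jan", "feb", "mar"):
    store.write(f"archive/{month}.bak", report)

print(store.logical_bytes, store.physical_bytes(), store.ratio())
```

Here 24,576 logical bytes occupy 4,096 physical bytes. Reads have to reassemble files from chunks, which hints at why such storage suits data where throughput matters less than space savings.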



Copyright © 2010 Westwick-Farrow Pty Ltd. All rights reserved.