Data Deduplication on Windows Server 2016


Data deduplication was first introduced in Windows Server 2012. Windows Server 2016 ships the third version of the deduplication component, significantly revised and improved. In this article we will look at the new deduplication features and settings, and at how they differ from previous implementations.

What’s New in Data Deduplication on Windows Server 2016

  1. The first and most important change in Windows Server 2016 data deduplication is the introduction of multi-threading. Windows Server 2012 R2 deduplication works in single-threaded mode and can't use more than one processor core per volume. This severely limits performance, and to work around the restriction you have to split disks into several smaller volumes. The maximum volume size should not exceed 10 TB.

On Windows Server 2016 the revised engine can run a deduplication job in multi-threaded mode, with each volume using multiple compute threads and I/O queues. Multi-threading and other engine changes also raised the limits on file and volume size. Because multi-threaded deduplication is faster and removes the need to partition a disk into multiple volumes, in Windows Server 2016 you can use deduplication on volumes of up to 64 TB. The maximum file size has also increased: deduplication is now supported for files of up to 1 TB.

  2. Support for virtualized backup applications. In Windows Server 2012 there was only one type of deduplication, designed primarily for ordinary file servers. Deduplication of continuously running VMs was not supported, because deduplication could not work with open files.

In Windows Server 2012 R2, deduplication started to use VSS and, as a result, to support virtual machines. A separate deduplication type exists for such workloads.

Windows Server 2016 adds a third type of deduplication, designed specifically for virtualized backup servers (e.g. DPM).

  3. Nano Server support. Nano Server is an installation option that deploys Windows Server 2016 with a minimum set of installed components. Nano Server fully supports deduplication.
  4. Support for Cluster OS Rolling Upgrade. Cluster OS Rolling Upgrade is a new feature of Windows Server 2016 that lets you upgrade the operating system on each cluster node in turn from Windows Server 2012 R2 to Windows Server 2016 without stopping the cluster. This is possible thanks to a special mixed mode of cluster operation, in which nodes running Windows Server 2012 R2 and Windows Server 2016 work side by side.

Mixed mode means that the same data may be accessed by nodes running different versions of deduplication. Deduplication in Windows Server 2016 supports this mode and provides access to deduplicated data during the cluster upgrade process.

How to Install and Enable the Deduplication Feature on Windows Server 2016

To enable deduplication, you first need to install the appropriate server role. You can use «Server Manager»: run the Add Roles and Features wizard and add the file server role with the «Data Deduplication» component.

Or execute the following PowerShell command:

Install-WindowsFeature -Name FS-Data-Deduplication -IncludeAllSubfeature -IncludeManagementTools

How to Enable and Configure Deduplication

After installing the components, you need to enable deduplication for a specific volume (or for multiple volumes). This can be done in two ways: from the graphical snap-in or with PowerShell.

To configure the component from the GUI, open Server Manager, go to File and Storage Services -> Volumes, right-click the desired volume and select Configure Data Deduplication from the menu.

Then select the desired type of deduplication (General purpose file server, for example) and press Apply. You can also specify file types that should not be deduplicated and exclude certain directories.
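The same exclusions can also be set from PowerShell once deduplication is enabled on the volume. The following is a minimal sketch using Set-DedupVolume; the extensions and the folder D:\Temp are hypothetical placeholders, and the exact value format accepted for the folder is an assumption to check against the cmdlet help on your server.

```powershell
# Skip files with these extensions (hypothetical examples)
Set-DedupVolume -Volume D: -ExcludeFileType "mp4","zip"

# Skip everything under this folder (hypothetical path)
Set-DedupVolume -Volume D: -ExcludeFolder "D:\Temp"

# Only optimize files that have not changed for at least 3 days
Set-DedupVolume -Volume D: -MinimumFileAgeDays 3

# Review the resulting settings
Get-DedupVolume -Volume D: | Format-List Volume, ExcludeFileType, ExcludeFolder, MinimumFileAgeDays
```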

Next, set up the schedule on which the deduplication job will run. Click the Set Deduplication Schedule button.

By default, background optimization is enabled, and you can configure two additional throughput optimization tasks. Few settings are available here: you can only select the days of the week, the start time and the duration.

PowerShell gives you many more options for customizing deduplication. To enable deduplication on a volume, use the following command:

Enable-DedupVolume -Volume D: -UsageType HyperV
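The -UsageType parameter is how the three deduplication types described earlier are selected. As a sketch, assuming two additional hypothetical volumes E: and F:, the other two types could be enabled like this:

```powershell
# General purpose file server (the default type)
Enable-DedupVolume -Volume E: -UsageType Default

# Virtualized backup server, for example DPM (new in Windows Server 2016)
Enable-DedupVolume -Volume F: -UsageType Backup

# Confirm which type each volume uses
Get-DedupVolume | Format-Table Volume, UsageType, Enabled
```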

List the current deduplication schedules:

Get-DedupSchedule

As you can see, in addition to background optimization there is a priority optimization job (PriorityOptimization), as well as garbage collection (GarbageCollection) and scrubbing (Scrubbing) jobs. None of these are visible in the GUI.

PowerShell allows you to fine-tune the parameters of the deduplication jobs. For example, create a new optimization task that starts at 9 PM Monday through Friday, runs for 11 hours with normal priority, and uses no more than 20% of RAM and 20% of CPU:

New-DedupSchedule -Name ThroughputOptimization -Type Optimization -Days @(1,2,3,4,5) -DurationHours 11 -Start (Get-Date "12/8/2016 9:00 PM") -Memory 20 -Cores 20 -Priority Normal

And disable the priority optimization:

Set-DedupSchedule -Name PriorityOptimization -Enabled $false

Manual deduplication run

If necessary, you can run a deduplication job manually. For example, run a full optimization of volume D with high priority:

Start-DedupJob -Volume D: -Type Optimization -Memory 75 -Cores 100 -Priority High -Full

To keep track of running deduplication jobs, use the Get-DedupJob command. Note that only one job can run at a time; the rest wait in the queue until it completes.
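As an illustration, the job queue for a volume can be inspected, and a job cancelled if it gets in the way of production load. This is a sketch only; the columns selected are a matter of preference.

```powershell
# Show running and queued jobs for volume D: with their progress
Get-DedupJob -Volume D: | Format-Table Type, ScheduleType, State, Progress, StartTime

# Cancel the jobs on volume D: if necessary
Stop-DedupJob -Volume D:
```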

Viewing deduplication state

The Data Deduplication state of a volume can be viewed using:

Get-DedupVolume -Volume D: | fl

This shows the basic parameters of the volume: its total size, used and saved space, savings rate, etc.

To check the deduplication status of the volume, use the command:

Get-DedupStatus -Volume D: | fl
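The raw output reports sizes in bytes. A small sketch that picks out a few of the more useful fields and converts the saved space to gigabytes might look like this (the SavedGB column name is just a label chosen for this example):

```powershell
Get-DedupStatus -Volume D: |
    Select-Object Volume,
                  OptimizedFilesCount,
                  InPolicyFilesCount,
                  @{ Name = 'SavedGB'; Expression = { [math]::Round($_.SavedSpace / 1GB, 2) } },
                  LastOptimizationTime
```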

How to disable data deduplication?

You can disable deduplication on a volume from the GUI or by using PowerShell. For example:

Disable-DedupVolume -Volume D:

Turning off deduplication for a volume cancels all scheduled tasks. It also prevents any deduplication jobs from running, except read-only operations (such as the Get commands) and unoptimization. The data stays in the state it was in before you turned deduplication off; only new files stop being deduplicated.

To return the data to its original state, use the unoptimization procedure. For example, the following command undoes deduplication on volume D at the highest possible speed:

Start-DedupJob -Volume D: -Type Unoptimization -Memory 100 -Cores 100 -Priority High -Full

Please note that unoptimization requires additional free space, because the data expands back to its original size. If there is not enough free space on the volume, the procedure fails.
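One way to sanity-check this in advance is to compare the free space on the volume with the space that deduplication is currently saving. The sketch below treats SavedSpace as a rough estimate of the extra space needed, which is an assumption; real requirements may be higher, so leave a safety margin.

```powershell
$status = Get-DedupStatus -Volume D:

# Rough check: free space should exceed the space currently saved by deduplication
if ($status.FreeSpace -gt $status.SavedSpace) {
    Write-Output "Free space looks sufficient for unoptimization."
} else {
    Write-Output "Not enough free space: free up or add space before running unoptimization."
}
```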