With Windows Server 2008 R2 reaching end of life in January 2020, many organizations have been migrating their workloads to Windows Server 2016 or newer. With Active Directory (AD), however, an attempt to introduce the first Windows Server 2019 or version 1709 domain controller (DC) into a domain that still uses the File Replication Service (FRS) to replicate SYSVOL content fails, because the deprecation of FRS intentionally blocks the installation. A migration to the Distributed File System Replication (DFSR) service is therefore required before the AD upgrade. My aim in this blog post is to highlight a change introduced in Windows Server 2008 R2 and 2012 (not R2) that may cause headaches if it is not catered for in your upgrade plan: how the DFSR service recovers from a dirty or unexpected shutdown (manually versus automatically).
A fellow blogger, Jose Rodrigues, posted useful information on the importance of the Microsoft Product Lifecycle Dashboard, which can help you identify products that are no longer supported or are reaching end of life, and so keep your environment supported.
FRS to DFSR Migration
The process of migrating from FRS to DFSR is pretty straightforward and not in the scope of this post. However, the following series provides good guidance on the migration journey:
- SYSVOL Migration Series: Part 1 – Introduction to the SYSVOL migration process
- SYSVOL Migration Series: Part 2 – Dfsrmig.exe: The SYSVOL migration tool
- SYSVOL Migration Series: Part 3 – Migrating to the ‘PREPARED’ state
- SYSVOL Migration Series: Part 4 – Migrating to the ‘REDIRECTED’ state
- SYSVOL Migration Series: Part 5 – Migrating to the ‘ELIMINATED’ state
It is always recommended to test any changes in a designated, isolated test environment before rolling them out in production. In my case, I am using one of the Azure Quickstart Templates (https://azure.microsoft.com/en-us/resources/templates/) to build a lab in my Azure subscription with just a few mouse clicks.
Check them out if you have an Azure subscription, as they save a lot of time and effort. If you don't already have a subscription, creating a free trial is a nice place to start.
For the purpose of this blog, I created a single-domain forest with domain controllers running Windows Server 2012.
DFSR Dirty Shutdown Recovery
The "Understanding DFSR Dirty (Unexpected) Shutdown Recovery" blog post does a great job of explaining how DFSR recovers from an unexpected shutdown. It also points out a change made to the DFS Replication (DFSR) service for Windows Server 2008 R2 through hotfix 2663685: the DFSR service no longer performs automatic recovery of the Extensible Storage Engine database after the database experiences a dirty shutdown. Instead, when the new DFSR behaviour is triggered, event ID 2213 is logged in the DFSR log, and an administrator must manually resume replication after DFSR detects the dirty shutdown. This change in behaviour also applies to Windows Server 2012, but not to later versions of Windows Server.
On a Windows Server 2012 DC that uses DFSR to replicate SYSVOL content, the registry looks like this by default: the StopReplicationOnAutoRecovery value is set to 1 (automatic recovery is turned off).
With this configuration in place, this is what would happen after an unexpected shutdown. A warning event 2213 is logged in the DFSR log indicating that the DFS Replication service stopped replication on the volume. This event contains important information on how to recover from this situation and manual intervention is required.
Manual Recovery Steps
Event 2213 suggests that the administrator performs the following actions to recover:
- Back up the files in all replicated folders on the volume. Failure to do so may result in data loss due to unexpected conflict resolution during the recovery of the replicated folders.
- To resume the replication for this volume, use the WMI method ResumeReplication of the DfsrVolumeConfig class. For example, from an elevated command prompt, type the following command:
wmic /namespace:\\root\microsoftdfs path dfsrVolumeConfig where volumeGuid="<GUID>" call ResumeReplication
Step 1 is self-explanatory. For the second step, just copy the command provided in event 2213 and paste it into an elevated command prompt.
The GUID is included in the warning event 2213 so there is no additional effort required here.
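If you would rather script this step than copy and paste by hand, the following is a minimal Python sketch, not part of the original procedure. It assumes you have already obtained the text of event 2213 by some other means, and the function names extract_volume_guid and build_resume_command are hypothetical helpers of my own. It simply finds the first GUID-shaped string in the text and fills it into the wmic command quoted above.

```python
import re
from typing import Optional

# Matches a GUID in the usual 8-4-4-4-12 hexadecimal layout.
GUID_PATTERN = re.compile(
    r"[0-9A-Fa-f]{8}-(?:[0-9A-Fa-f]{4}-){3}[0-9A-Fa-f]{12}"
)


def extract_volume_guid(event_text: str) -> Optional[str]:
    """Return the first GUID-shaped string in the event text, or None."""
    match = GUID_PATTERN.search(event_text)
    return match.group(0) if match else None


def build_resume_command(volume_guid: str) -> str:
    """Build the wmic command from event 2213 for the given volume GUID."""
    if not GUID_PATTERN.fullmatch(volume_guid):
        raise ValueError(f"not a valid GUID: {volume_guid!r}")
    return (
        "wmic /namespace:\\\\root\\microsoftdfs path dfsrVolumeConfig "
        f'where volumeGuid="{volume_guid}" call ResumeReplication'
    )


if __name__ == "__main__":
    # Placeholder text and GUID for illustration only; use the real
    # message from event 2213 on your own domain controller.
    sample = "GUID: 01234567-89AB-CDEF-0123-456789ABCDEF"
    guid = extract_volume_guid(sample)
    if guid is not None:
        print(build_resume_command(guid))
```

The sketch only builds the command string. Run the result from an elevated command prompt on the affected DC, after the backup in step 1.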
After this has been performed, the following event (2212) will be logged stating that the DFS Replication service has detected an unexpected shutdown on the volume. This event further states that the service will rebuild the database if it determines it cannot reliably recover.
This will be followed by two informational events if everything went well:
- Event 2218 – The DFS Replication service is in the second step of replication database consistency checks after an unexpected shutdown. The database will be rebuilt if it cannot be recovered.
- Event 2214 – The DFS Replication service successfully recovered from an unexpected shutdown on the volume. This can occur if the service terminated abnormally (due to a power loss, for example) or an error occurred on the volume.
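The event sequence described above (2213, then 2212, 2218 and finally 2214) can be turned into a simple status check. The Python sketch below is an illustration under my own assumptions: it expects a chronological list of DFSR event IDs that you have collected yourself, and the function name recovery_state and the state labels are hypothetical.

```python
from typing import Iterable

# Event IDs discussed in this post, mapped to a short state label.
EVENT_STATES = {
    2213: "stopped",     # replication stopped, manual resume required
    2212: "recovering",  # unexpected shutdown detected, recovery started
    2218: "recovering",  # second step of database consistency checks
    2214: "recovered",   # successfully recovered from the shutdown
}


def recovery_state(event_ids: Iterable[int]) -> str:
    """Return the state implied by the most recent relevant event.

    event_ids must be in chronological order (oldest first).
    Events not listed in EVENT_STATES are ignored.
    """
    state = "none"  # no dirty shutdown events seen
    for event_id in event_ids:
        state = EVENT_STATES.get(event_id, state)
    return state
```

For example, a volume that has logged only event 2213 reports "stopped", which is the cue to carry out the manual recovery steps.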
Recommendations for Domain Controllers
From this document, it is clear that the recommendation is to disable the Stop Replication functionality.
To enable automatic recovery, set the StopReplicationOnAutoRecovery registry value shown earlier to zero.
Once this is in place, DFSR will automatically recover from unexpected shutdowns. The same events we saw with manual recovery will be logged, but no user intervention is required with this configuration in place.
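For those who prefer to script the change, here is a hedged Python sketch that builds a reg add command for this setting. The registry path below is my assumption of where StopReplicationOnAutoRecovery lives (the DFSR service's Parameters key), so please verify it against your own registry before use. The function names are hypothetical, and by default nothing is executed.

```python
import platform
import subprocess
from typing import List

# Assumed location of the setting; confirm on your own DC first.
DFSR_PARAMETERS_KEY = r"HKLM\System\CurrentControlSet\Services\DFSR\Parameters"
VALUE_NAME = "StopReplicationOnAutoRecovery"


def build_reg_add(value: int) -> List[str]:
    """Build a 'reg add' command that sets the DWORD to 0 or 1."""
    if value not in (0, 1):
        raise ValueError("value must be 0 (auto recovery on) or 1 (off)")
    return [
        "reg", "add", DFSR_PARAMETERS_KEY,
        "/v", VALUE_NAME,
        "/t", "REG_DWORD",
        "/d", str(value),
        "/f",  # overwrite the existing value without prompting
    ]


def enable_auto_recovery(dry_run: bool = True) -> List[str]:
    """Return the command; run it only on Windows when dry_run is False."""
    command = build_reg_add(0)
    if not dry_run and platform.system() == "Windows":
        subprocess.run(command, check=True)
    return command
```

Calling enable_auto_recovery() with the default dry_run=True only returns the command so that it can be reviewed. Passing dry_run=False from an elevated session on the DC applies it.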
To sum up…
Migrating from FRS to DFSR is a straightforward process. Just add one more checkbox to your migration/upgrade plan to ensure that manual recovery does not cause unnecessary headaches as you continue the journey of retiring systems that are reaching end of life.
Till next time…