It has been a long time since I published my post about fixing the slowness of DPM 2016 with Modern Backup Storage, "[Solved] Slow/hangs ReFs/DPM 2016 with Modern Backup Storage". Unfortunately, it took us a few more months to solve the problem completely. Two Microsoft teams (DPM and Storage) worked on this case.
Basically, I ran into the same problems I had before:
- DPM jobs never completed on time
- Some DPM jobs failed
- DPM servers hung (no RDP, no WinRM, but ping still responded)
We considered reverting our backup servers to DPM 2012 R2, but that has a lot of drawbacks:
- We would hit the LDM limitation again, which was the reason we decided to migrate to DPM 2016 with Modern Backup Storage in the first place
- DPM 2012 R2 is an old product
- You need to reinstall not only DPM but the operating system as well
I don’t want to dive into the details of our communication with Microsoft. I will just post the recommendations that helped us:
If you ask me or Microsoft how much RAM you need, nobody can answer. It really depends on your workload. If you back up only a few VMs or Exchange databases, maybe 32 GB will be enough for you. In my case there are hundreds of VMs and Exchange databases, which is a completely different story. ReFS likes a lot of RAM 😊
Back to our case. Our servers were purchased with 24 GB or 32 GB of RAM. That was perfectly fine for DPM 2012 R2, but it is not enough for DPM 2016 with Modern Backup Storage. So we doubled the RAM, up to 64 GB on some servers. Maybe it’s overkill, but we had no way to estimate the correct RAM size.
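To see where a server stands before deciding on an upgrade, a quick check of installed and free memory can help. This is only a sketch using the standard Win32_OperatingSystem WMI class. It does not tell you how much RAM ReFS will want, only what is there right now.

```powershell
# Win32_OperatingSystem reports both values in kilobytes,
# so dividing by 1MB (1048576) converts them to gigabytes.
$os = Get-CimInstance -ClassName Win32_OperatingSystem

$totalGB = [math]::Round($os.TotalVisibleMemorySize / 1MB, 1)
$freeGB  = [math]::Round($os.FreePhysicalMemory / 1MB, 1)

"Total RAM: $totalGB GB, free: $freeGB GB"
```

Running it a few times while backup jobs are active gives a rough idea of how close the server gets to exhausting memory.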
Disable Storage Calculation
I don’t suggest disabling storage calculation before you increase RAM. In my case the servers became unstable and I had to enable storage calculation again. Follow this link for details: https://docs.microsoft.com/en-us/system-center/dpm/dpm-release-notes?view=sc-dpm-1801
Disable storage calculation – Program Files\Microsoft System Center 2016\DPM\DPM\bin\Manage-DPMDSStorageSizeUpdate.ps1 StopSizeAutoUpdate
Enable storage calculation – Program Files\Microsoft System Center 2016\DPM\DPM\bin\Manage-DPMDSStorageSizeUpdate.ps1 StartSizeAutoUpdate
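As a minimal sketch, the two commands above could be run from an elevated PowerShell session on the DPM server (the DPM Management Shell may be required). It assumes DPM is installed in the default location under Program Files. The -ManageStorageInfo parameter name is my reading of the linked release notes rather than something shown in the commands above, so check it against your DPM version first.

```powershell
# Assumption: default DPM 2016 install path; adjust if DPM lives elsewhere.
$script = Join-Path $env:ProgramFiles 'Microsoft System Center 2016\DPM\DPM\bin\Manage-DPMDSStorageSizeUpdate.ps1'

if (-not (Test-Path $script)) {
    throw "Script not found at $script"
}

# Disable storage calculation.
# Assumption: -ManageStorageInfo is the parameter name from the release notes.
& $script -ManageStorageInfo StopSizeAutoUpdate

# To turn it back on if the server becomes unstable:
# & $script -ManageStorageInfo StartSizeAutoUpdate
```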
Configure WMI handle count
Check your Windows event logs. You will probably see that some WMI operations have failed. Increase the WMI handle count up to 8 GB / 12 GB (the default setting for Windows 2016 is 4 GB). This article explains everything in detail: https://blogs.technet.microsoft.com/askperf/2014/08/12/wmi-how-to-troubleshoot-wmi-high-handle-count/
I suggest increasing the WMI handle count even if you don’t see any errors in the logs.
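The WMI provider host quotas are stored in the __ProviderHostQuotaConfiguration class in the root namespace. The sketch below only reads the current values. The commented-out part shows how a new value could be written, using a purely hypothetical number. Which property to raise and by how much is something to take from the linked article and your own environment, not from this example.

```powershell
# Read the current WMI provider host quotas (run in an elevated session).
$quota = Get-WmiObject -Namespace root -Class __ProviderHostQuotaConfiguration
$quota | Format-List HandlesPerHost, MemoryPerHost, MemoryAllHosts, ThreadsPerHost

# Example only: write a new quota value.
# $newHandleQuota is a hypothetical value for illustration, not a recommendation.
# $newHandleQuota = 8192
# $quota.HandlesPerHost = $newHandleQuota
# $quota.Put()
```

Recording the output before changing anything makes it easy to go back to the previous values.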
Optional: set ReFS and DPM registry keys
I think that with more RAM you don’t need to tune ReFS and can live with the default settings.
But I will post my registry settings anyway. Just remember that your environment is different from mine and you will need to adapt them:
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem]
"DisableDeleteNotification"=dword:00000000

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Data Protection Manager\Configuration\DiskStorage]
"DisableReFSStorageComputation"="1"
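Before importing anything, it may be worth checking what these two values are currently set to on your server. A small read-only sketch, using only the key paths and value names listed above:

```powershell
# Key paths and value names are the two from the registry settings above.
$keys = @(
    @{ Path = 'HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem'
       Name = 'DisableDeleteNotification' },
    @{ Path = 'HKLM:\SOFTWARE\Microsoft\Microsoft Data Protection Manager\Configuration\DiskStorage'
       Name = 'DisableReFSStorageComputation' }
)

foreach ($k in $keys) {
    # SilentlyContinue: a missing key or value returns $null instead of an error.
    $item = Get-ItemProperty -Path $k.Path -Name $k.Name -ErrorAction SilentlyContinue
    if ($null -ne $item) {
        '{0} = {1}' -f $k.Name, $item.($k.Name)
    } else {
        '{0} is not set' -f $k.Name
    }
}
```

This changes nothing; it only prints the current state so you can compare it with the settings above.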
What is the final result?
The servers don’t hang anymore. No more failed jobs. Most jobs complete on time, but some are delayed by a few hours. Maybe I need to play with the registry keys to improve the situation, but it’s much better than before.