*** Update 2 ***
Fixed an issue with SCOMprecentageCPUTime where it would also measure other processes that have healthservice as part of their name.
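The idea behind this fix can be sketched as follows. This is Python for illustration only, not the PowerShell that ships in the MP. The function name and the sample process names are hypothetical, and the handling of a "#1" style suffix assumes the usual Windows performance counter naming for duplicate process instances.

```python
def matches_process(instance_name: str, target: str = "HealthService") -> bool:
    """Return True only for the target process or its numbered duplicates.

    A substring test ("healthservice" in name) would also match unrelated
    processes, so compare the whole base name instead.
    """
    # Drop a "#1", "#2", ... suffix if present (assumed counter naming).
    base = instance_name.split("#", 1)[0]
    return base.lower() == target.lower()


# Hypothetical instance names, chosen to show the difference.
instances = ["HealthService", "HealthServiceWatcher", "MyHealthService", "HealthService#1"]
selected = [name for name in instances if matches_process(name)]
print(selected)
```

Only the exact name and its numbered duplicate are kept; the two names that merely contain healthservice are filtered out.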
*** Update ***
I have done some extensive refactoring of the code.
SCOMprecentageCPUTime could occasionally terminate with a divide-by-zero error
SCOMprecentageCPUTime input was not converted to a bool, which broke the diagnostic logic
Added the ability to run the scripts in a PS console and get detailed output for troubleshooting
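The first two fixes in the list above follow a common pattern, sketched here in Python for illustration. This is not the code from the MP: the function names, the parameter names and the exact CPU formula are assumptions. The bool pitfall is the same in both languages, since a non-empty string such as "false" evaluates as true when cast directly.

```python
def to_bool(value) -> bool:
    """Convert an override value to a real bool.

    Overrides typically arrive as text, and bool("false") is True,
    so the text has to be compared explicitly.
    """
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in ("true", "1", "yes")


def percent_cpu(delta_process_time: float, delta_elapsed: float, cores: int) -> float:
    """CPU percentage over a sample window, averaged across cores (assumed formula)."""
    # Guard: two samples with the same timestamp give a zero denominator.
    if delta_elapsed <= 0 or cores <= 0:
        return 0.0
    return 100.0 * delta_process_time / (delta_elapsed * cores)


run_diagnostic = to_bool("false")   # stays False, unlike bool("false")
usage = percent_cpu(2.0, 4.0, 2)    # 2 s of CPU over 4 s on 2 cores
safe = percent_cpu(1.0, 0.0, 4)     # no elapsed time: returns 0.0 instead of raising
```

Returning 0.0 for an empty sample window is one possible choice; skipping the sample entirely would work as well.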
I have moved the fixes for the internal MP here:
This is a temporary fix for rules and monitors in the System Center Core Monitoring MP shipped with SCOM 2016 (UR3). Issues arise when using WinRM to extract WMI information for some configurations. The issue has been reported to Microsoft, but until they release a fix this is the only workaround apart from disabling the affected rules and monitors.
Below is a summary of the MP resources that have problems:
WMI Health Monitor
Agent processor utilization
Collect agent processor utilization
The attached MP will disable all these rules and monitors and replace them with similarly named ones that have addendum appended to the name. As a bonus, I have removed the syncTime and replaced it with SpreadInitializationOverInterval for the SCOMprecentageCPUTime... rule and monitor.
Read here why sync time is bad: https://blogs.technet.microsoft.com/kevinholman/2014/01/08/using-the-sync-time-property-in-workflows-and-overrides/ and here https://systemcenterom.uservoice.com/forums/293064-general-operations-manager-feedback/suggestions/18601399-reduce-cpu-impact-for-scompercentagecputimecounter
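To illustrate the difference: with a sync time, every agent runs the workflow at the same wall-clock moment, while a spread interval gives each agent its own start offset inside the window. The Python sketch below only illustrates that idea. How SCOM actually picks the offset is not covered here, so the hash-based offset, the function name and the agent names are all assumptions.

```python
import hashlib


def spread_offset(workflow_id: str, spread_seconds: int = 300) -> int:
    """Pick a stable start offset in [0, spread_seconds) for one workflow instance.

    Hashing the id is an assumption made for this sketch; it is used only
    because it gives a repeatable, evenly scattered result.
    """
    digest = hashlib.sha256(workflow_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % spread_seconds


# Hypothetical agent names.
agents = [f"agent{i:02d}" for i in range(20)]
offsets = {agent: spread_offset(agent) for agent in agents}

for agent, offset in sorted(offsets.items(), key=lambda item: item[1]):
    print(f"{agent} first run at +{offset}s")
```

With a sync time all twenty agents would start at +0s. With the spread they start at different points in the 300 second window, which avoids a simultaneous CPU spike when many agents share the same host.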
There are overrides for all values, and the log level of the scripts can be set to Debug, Information, Warning or Error. The default is Information, which logs the start and end of the script along with its runtime. With Debug, a lot more information is written to the Operations Manager log, which is useful when troubleshooting a single agent.
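A log level setting like this is usually implemented as a threshold. The Python sketch below shows that pattern for illustration only. It assumes the four levels are ordered from Debug (most verbose) to Error (least verbose), and the names LEVELS and should_log are hypothetical rather than taken from the MP's scripts.

```python
# Assumed ordering: a lower number means more verbose.
LEVELS = {"Debug": 0, "Information": 1, "Warning": 2, "Error": 3}


def should_log(message_level: str, configured_level: str = "Information") -> bool:
    """Return True when a message is at or above the configured log level."""
    return LEVELS[message_level] >= LEVELS[configured_level]


# With the default level, debug messages are dropped.
default_debug = should_log("Debug")
# After overriding the level to Debug for one agent, they are written.
override_debug = should_log("Debug", configured_level="Debug")
# Errors are written at every level.
default_error = should_log("Error")
```

This keeps the default log volume low while still allowing a single agent to be switched to Debug with an override.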
The monitors and rules run every 300 seconds (5 minutes), and the spread interval is also set to 300 seconds. These values can be changed with overrides. The termination time is lowered to 120 seconds so that the script is not left running for too long.
The MP is called Microsoft.SystemCenter.2007.Addendum, and the source code can be found here: https://github.com/mortenlerudjordet/lerunTools/tree/master/SCOM/MPs/Microsoft.SystemCenter.2007.Addendum
These fixes are running in production, where we have a mix of OSes from 2008 R2 to 2016 and some workgroup servers. It is a single domain, so I could not test the behaviour in a forest environment.