Sunday, September 30, 2018

Exchange 2010 - OWA blank or white page after rollup installation

The installation of an Exchange rollup can break OWA, leaving the user with a blank or white page. If we examine the url, we would see something like this:

https://EX13-1.mynet.lan/owa/auth/logon.aspx?url=https://EX13-1.mynet.lan/owa/&reason=0

Please realize that, in this case, I had already changed "[...] mail.mitserv.net/owa [...]" to EX13-1.mynet.lan/owa in an effort to troubleshoot. Normally, we would not use the name of a specific server in the url, especially if we are load-balancing between two servers.

The url ends with "reason=0".

This confirms that something is wrong and gives some indication of what the solution requires (see below).
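As a small illustration, the "reason" value can be pulled out of such a URL programmatically. This is only a sketch using the Python standard library; the function name owa_logon_reason is my own and is not part of Exchange or OWA.

```python
from urllib.parse import urlsplit, parse_qs


def owa_logon_reason(url):
    """Return the value of the 'reason' query parameter, or None if absent."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("reason")
    return values[0] if values else None


# URL copied from the example above
url = ("https://EX13-1.mynet.lan/owa/auth/logon.aspx"
       "?url=https://EX13-1.mynet.lan/owa/&reason=0")
print(owa_logon_reason(url))  # prints 0
```

A check like this could be dropped into a simple monitoring script, although opening the page in a browser tells us the same thing.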

A quick search with terms...

owa reason=0

reveals that the most common cause of the problem is the installation of a rollup.

Now, before I present the solution, I'll remind the reader that the preferred method to install an Exchange rollup manually is to download the .msi file, open a command prompt as administrator, navigate to the location of the .msi file and launch it from there, so that it runs with elevated privileges. This procedure does not apply if we install the update automatically via Windows Update. I presented some rollup tips and best practices in a previous blog post:

Exchange rollup tips and best practices

When OWA breaks, we might be tempted to recreate the virtual directory, but that is not necessary in this case and may not solve the problem (to be honest, I did not test that option). When faced with the blank OWA page, we can run the PowerShell script UpdateCas.ps1 from the Exchange Management Shell. By default, it is located here:

"C:\Program Files\Microsoft\Exchange Server\V14\Bin\UpdateCas.ps1"

We should see output like this (on this server, Exchange is installed on the E: drive rather than C:):


[PS] E:\Program Files\Microsoft\Exchange Server\V14\bin>.\UpdateCas.ps1

[10:35:03] ***********************************************

[10:35:03] * UpdateCas.ps1: 9/1/2018 10:35:03 AM

Creating a new session for implicit remoting of "Get-ExchangeServer" command...

[10:35:10] Updating OWA/ECP on server EX13-1

[10:35:10] Finding ClientAccess role install path on the filesystem

[10:35:09] Updating OWA to version 14.3.409.0

[10:35:09] Copying files from 'E:\Program Files\Microsoft\Exchange Server\V14\ClientAccess\owa\Current' to 'E:\Program Files\Microsoft\Exchange Server\V14\ClientAccess\owa\14.3.409.0'

[10:35:13] Found 1 OWA virtual directories.

[10:35:13] Updating OWA virtual directories

[10:35:13] Processing virtual directory with metabase path 'IIS://EX13-1.mynet.lan/W3SVC/1/ROOT/owa'.

[10:35:13] Metabase entry 'IIS://EX13-1.mynet.lan/W3SVC/1/ROOT/owa/14.3.409.0' exists. Removing it.

[10:35:15] Creating metabase entry IIS://EX13-1.mynet.lan/W3SVC/1/ROOT/owa/14.3.409.0.

[10:35:16] Configuring metabase entry 'IIS://EX13-1.mynet.lan/W3SVC/1/ROOT/owa/14.3.409.0'.

[10:35:17] Saving changes to 'IIS://EX13-1.mynet.lan/W3SVC/1/ROOT/owa/14.3.409.0'

[10:35:17] Saving changes to 'IIS://EX13-1.mynet.lan/W3SVC/1/ROOT/owa'

[10:35:17] Update OWA done.

[10:35:17] Updating ECP to version 14.3.409.0

[10:35:17] Copying files from 'E:\Program Files\Microsoft\Exchange Server\V14\ClientAccess\ecp\Current' to 'E:\Program Files\Microsoft\Exchange Server\V14\ClientAccess\ecp\14.3.409.0'

[10:35:17] Update ECP done.


As for many others (judging from conversations on various Internet technical forums), this script solved the problem and there is once again a functional OWA page where users can log in.
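If the script output is saved to a text file, a quick check that both components finished could look like the sketch below. It assumes the "[hh:mm:ss] Update <component> done." line format seen in the output above; the function name completed_updates is hypothetical.

```python
import re

# Matches lines such as "[10:35:17] Update OWA done."
LINE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s+(.*)$")


def completed_updates(log_text):
    """Return the component names reported as 'Update <name> done.'"""
    done = []
    for raw in log_text.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue  # skip lines without a timestamp prefix
        finished = re.fullmatch(r"Update (\w+) done\.", match.group(2))
        if finished:
            done.append(finished.group(1))
    return done


# Four lines taken from the output shown above
sample = """[10:35:13] Updating OWA virtual directories
[10:35:17] Update OWA done.
[10:35:17] Updating ECP to version 14.3.409.0
[10:35:17] Update ECP done."""
print(completed_updates(sample))  # prints ['OWA', 'ECP']
```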



Friday, September 28, 2018

NLTEST - what can we do with it?

What can we do with NLTEST? This tool can be used for a variety of tasks: testing domain trusts, verifying secure channels between Active Directory domain members and domain controllers, and some other operations such as registering certain DNS records. In the lines that follow, I'll present some of the commands I find most useful as well as some other commands (NETDOM or PowerShell) that can accomplish the same tasks. As I have a single forest/domain/site, I will not demonstrate the use of NLTEST for the verification of domain trusts.

***


First of all, we can list the domain controllers of a domain:

nltest /dclist:mynet.lan


Two remarks:
  • In general, we can use the short name of the domain "mynet" or the FQDN of the domain "mynet.lan".
  • The command targets the local domain controller unless we indicate another explicitly with the parameter /server:
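For anyone scripting these checks, the two remarks above can be captured in a small helper that assembles the argument list. This is only a sketch: the helper name nltest_args and the server name DC2 are my own assumptions, and the list would still have to be passed to something like subprocess.run on a Windows machine where nltest is available.

```python
def nltest_args(switch, value=None, server=None):
    """Build an nltest argument list, optionally targeting another server."""
    args = ["nltest"]
    if server:
        # /server: comes before the operation
        args.append(f"/server:{server}")
    args.append(f"/{switch}:{value}" if value else f"/{switch}")
    return args


# Default target (the local machine), domain given as FQDN
print(nltest_args("dclist", "mynet.lan"))

# Explicit target; "DC2" is a hypothetical server name
print(nltest_args("dclist", "mynet", server="DC2"))
```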


We can also use NLTEST to validate the secure channel between a domain member and the domain with either of these two commands (indicating our domain after the colon):

nltest /sc_query:mynet
nltest /sc_verify:mynet



When a computer joins a domain, a secure channel (or "trust") between that computer and the domain is established by means of a password stored locally as an "LSA secret" and remotely (from the perspective of the local machine) in Active Directory. I have seen both commands (query and verify) presented as methods to test the condition of this channel.

With the "verify" option, "If the secure channel does not work, this parameter removes the existing channel, and then builds a new one" (according to "help"). I was not able to determine if that would make the /sc_reset command (presented below) unnecessary.

One particularity to keep in mind is that both tests will fail if run on the domain controller holding the PDCe role with the surprising error message "ERROR_NO_SUCH_DOMAIN":




This is expected and no corrective action is required.
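When these checks are scripted, the status line is the part worth parsing. The sketch below assumes the usual "Status = <decimal> <hex> <name>" layout of NLTEST status lines; the two sample strings are illustrative, typed from that layout and not captured from my servers, and the function name channel_status is hypothetical.

```python
import re

# Assumed layout of an NLTEST status line: "Status = 0 0x0 NERR_Success"
STATUS = re.compile(r"Status = (\d+) (0x[0-9a-fA-F]+) (\S+)")


def channel_status(output):
    """Return (numeric code, symbolic name) from NLTEST output, or None."""
    match = STATUS.search(output)
    if not match:
        return None
    return int(match.group(1)), match.group(3)


# Illustrative samples (assumptions, not real captures)
ok_sample = "Trusted DC Connection Status Status = 0 0x0 NERR_Success"
pdc_sample = "I_NetLogonControl failed: Status = 1355 0x54b ERROR_NO_SUCH_DOMAIN"

print(channel_status(ok_sample))   # prints (0, 'NERR_Success')
print(channel_status(pdc_sample))  # prints (1355, 'ERROR_NO_SUCH_DOMAIN')
```

A script using this would want to treat ERROR_NO_SUCH_DOMAIN on the PDCe as expected rather than as a failure.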

Various sources also mention that the query is not entirely reliable:

"The results of an Nltest /sc_query are unreliable — it returns the status of the channel when it was used last time and not the current status."


Some would also argue that the NLTEST command is not very useful to the extent that successful results from other common domain controller related commands like dcdiag and repadmin imply that the secure channel is intact.

***

If the channel is broken, we can attempt to reestablish it with this NLTEST command:

nltest /sc_reset:mynet

or

nltest /sc_reset:DC1

Remark: we indicate either the name of the domain or the name of a domain controller.


***


Other options include NETDOM, PowerShell and the GUI.

NETDOM

For a workstation or server:

netdom reset SVR1 /Domain:MYNET /UserO:admin@mynet.lan /PasswordO:*

We can abbreviate by using only the uppercase letters (yes, that is a capital "o" - not zero):

netdom reset SVR1 /D:MYNET /UO:admin@mynet.lan /PO:*
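The abbreviation rule described above (keep only the uppercase letters of each switch name) is easy to express in code. This is purely an illustration of the convention as I describe it here; the function name abbreviate is my own.

```python
def abbreviate(switch):
    """Shorten a NETDOM-style switch by keeping only its uppercase letters."""
    name = switch.lstrip("/")
    return "/" + "".join(ch for ch in name if ch.isupper())


for switch in ("/Domain", "/UserO", "/PasswordO"):
    print(switch, "->", abbreviate(switch))
# /Domain -> /D
# /UserO -> /UO
# /PasswordO -> /PO
```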

For a domain controller, I found this variation:

netdom resetpwd /s:DC1 /ud:mynet\admin /pd:*


PowerShell

Test-ComputerSecureChannel -Repair

Remark: we can use the cmdlet without the -Repair parameter to simply test the secure channel.

GUI

We reset the computer account in Active Directory and then rejoin the computer to the domain.


***


We can see the different functions of the domain controller with this command (see the "flags" section in the output):

nltest /dsgetdc:mynet

(Note that the related switch "dnsgetdc" does not seem to produce any output)



This domain controller (DC1) is, among other things, the PDC (emulator), a global catalog, a time server and offers LDAP and Kerberos services. Some of these (like LDAP and Kerberos) will be present on all domain controllers, others (like the PDCe role) will not.
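To compare the roles of several domain controllers, the "Flags" line can be turned into a set. The sketch below assumes a line that starts with "Flags:" followed by space-separated names; both sample strings are illustrative and not copied from my servers, and the function name dc_flags is hypothetical.

```python
def dc_flags(output):
    """Return the set of names on the 'Flags:' line of nltest /dsgetdc output."""
    for line in output.splitlines():
        label, sep, rest = line.strip().partition(":")
        if sep and label.strip().lower() == "flags":
            return set(rest.split())
    return set()


# Illustrative samples (assumptions, not real captures)
pdc_output = "Dom Name: MYNET\nFlags: PDC GC DS LDAP KDC TIMESERV WRITABLE"
other_output = "Dom Name: MYNET\nFlags: GC DS LDAP KDC TIMESERV WRITABLE"

print("PDC" in dc_flags(pdc_output))    # prints True
print("PDC" in dc_flags(other_output))  # prints False
# Flags present on the first DC only
print(dc_flags(pdc_output) - dc_flags(other_output))  # prints {'PDC'}
```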

Therefore, the results are slightly different on a domain controller that does not hold the PDCe role:



This command shows specifically the PDCe:

nltest /dcname:mynet




***

We should keep in mind that besides our GUI options, there are other commands that are probably more useful for displaying the "FSMO" roles or the global catalogs, for example:

netdom query fsmo

Get-ADForest | fl *master*
Get-ADDomain | fl *master*,*emulator*

dsquery server -isgc

***


NLTEST also provides a method to re-register certain DNS records that may be missing or have been deleted.

In this example, I "accidentally" delete a SRV record in DNS:







Only the SRV record for DC1 remains:




But if I run the following command...

nltest /dsregdns


The record (along with others that may be missing) is recreated.


***

The NLTEST command may be our best option because the others could potentially be much more disruptive.

We could restart the NETLOGON service...

net stop netlogon
net start netlogon

or

Restart-Service netlogon

But stopping the NETLOGON service will prevent clients from connecting and authenticating to the domain controller. Restarting the server altogether would have the same effect (and negative consequences).

One more option:

dcdiag /fix

But this command could potentially make changes that we would not need on an otherwise perfectly functional domain controller.


***

This NLTEST command verifies that all DC-specific DNS records were updated without errors:

nltest /dsquerydns




NLTEST can show us what site we are in:

nltest /dsgetsite




And lastly (for this blog post), it can verify that a user account exists in the domain:



Saturday, September 15, 2018

Windows Server 2016 - Full server recovery (domain controller)

In a previous blog post, I restored an entire Windows 2008 R2 domain controller in the context of a forest recovery (in French):


I'm working more and more with Windows Server 2016 and wanted to attempt the same operation (but only the recovery of a single server in this case). The procedure is similar, with some differences in the details.


***


I have already performed the backup using Windows Server Backup. If you are not familiar with this procedure, please consult other sources online or elsewhere. The optimal scenario (presented here) consists of restoring the backup image to identical or at least very similar hardware. I use VMware Workstation for my virtual test environment and had installed Windows Server 2016 to a virtual machine with these specifications:


For recovery operations (and for other reasons), it is a good idea to document the configuration of your domain controllers. In this case, we will restore the entire server. In other cases, we might remove the failed domain controller permanently, rebuild it and then let normal replication from other domain controllers populate its database (ntds.dit). In that situation, it is very useful (if not crucial) to know the name of the server and its TCP/IP configuration, even if much of that could be determined otherwise. We would not want to assign a name or IP address already in use, although error messages would probably alert us before we cause too much chaos.

In this scenario, we aim to restore the entire server from a backup stored in a remote location (shared folder on the network). We must first boot the server to be restored from the installation DVD - or the .iso image - used to install the OS. In this virtual environment, I will boot from an .iso image.

That objective may require us to modify the boot order. We access the BIOS of the server by entering a key combination at boot time (F2 for example). In VMware, we click on the VM tab, "Power" and finally "Power On to Firmware":



In the BIOS (which may differ from server to server), we modify the boot order so the CD-ROM (or DVD) drive is in first place:


We save settings as necessary (often F10) and exit.


If the CD/DVD is present, or if the .iso file is linked correctly, we should boot into Windows setup:




We click Next (above) but then, instead of "Install now", we click on "Repair your computer":





The graphical interface in Windows Server 2016 has a different appearance compared to previous server versions but the functionality is very similar. We click on "Troubleshoot"...




Then (under "Advanced options") "Command Prompt":




On my first attempt, I selected "System Image Recovery" right away but soon encountered an obstacle: at this point the server has no IP address (which makes it impossible to access the remote location where the server backup file is located). In fact, if we execute the command...

netsh interface ip show interface

(abbreviated in the screenshot below)

We see that there is not even an interface to configure:


Note: the lack of an IP address is not a problem if our backup file is stored on locally attached storage, an external hard drive for example.


Windows PE (WinPE) is the solution. With the following command, we initiate the detection of hardware resources and the now detected network interface can be manually configured with an IP address (and other TCP/IP parameters):

start /w wpeinit


Note: WinPE is available by booting from the DVD/.iso image (at least with Windows Server 2016) - no special installation is required.

Windows PE (WinPE)


If our server is on a subnet where DHCP is available, an IP address may be configured automatically. In my case, there is no DHCP server available and ipconfig shows an automatically assigned link-local address (APIPA):
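APIPA addresses come from the 169.254.0.0/16 range, so they are easy to recognize. As a sketch with Python's standard ipaddress module (the function name is_apipa and the first sample address are my own, not values from my lab):

```python
import ipaddress

# APIPA (automatic private IP addressing) uses the link-local range
APIPA_NETWORK = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(address):
    """Return True if the IPv4 address is in the APIPA range."""
    return ipaddress.ip_address(address) in APIPA_NETWORK


print(is_apipa("169.254.12.34"))  # hypothetical APIPA address, prints True
print(is_apipa("10.0.0.11"))      # prints False
```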




This is not a major obstacle. We can use NETSH to configure an IP address appropriate for the recovery operation, for example:

netsh int ip set addr "Ethernet0" static 10.0.0.11 255.0.0.0 10.1.1.3
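Before typing the NETSH command, it can be worth confirming that the gateway is actually on the same subnet as the address and mask we intend to assign. A minimal sketch with the standard ipaddress module follows; the function name gateway_on_link is hypothetical.

```python
import ipaddress


def gateway_on_link(address, mask, gateway):
    """Return True if the gateway lies inside the subnet of address/mask."""
    interface = ipaddress.ip_interface(f"{address}/{mask}")
    return ipaddress.ip_address(gateway) in interface.network


# Values from the netsh command above
print(gateway_on_link("10.0.0.11", "255.0.0.0", "10.1.1.3"))  # prints True

# A mismatched example with made-up values
print(gateway_on_link("192.168.1.10", "255.255.255.0", "192.168.2.1"))  # prints False
```

With a mask of 255.0.0.0, any 10.x.x.x gateway passes, which is why 10.1.1.3 works with the address 10.0.0.11 here.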



We now have these parameters:



Let's try to ping the IP address of the server holding the backup file (of course, this is documented so we know what address to target):




We can now exit from the command prompt which brings us back to our options. We select "Troubleshoot"...




And then "System Image Recovery":




Windows cannot find a system image so we must close the dialog for more options:




I choose the option "Select a system image":




And then "Advanced" so I can add a network location:



I choose the first option here:




Once again, we have to document our recovery procedure well enough to know how to access the location of the backup file. Since DNS may not be functional, I indicate the path with the IP address:
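If the recovery documentation is kept in a structured form, a small check can confirm that the recorded backup path uses an IP address and not a name that would depend on DNS. This is only a sketch; the function name unc_host_is_ip and both sample paths are hypothetical and are not the paths used in my environment.

```python
import ipaddress


def unc_host_is_ip(path):
    """Return True if the host part of a UNC path is a literal IP address."""
    if not path.startswith("\\\\"):
        return False  # not a UNC path
    host = path[2:].split("\\", 1)[0]
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


# Hypothetical documented paths
print(unc_host_is_ip(r"\\10.0.0.5\Backups"))  # prints True
print(unc_host_is_ip(r"\\FS1\Backups"))       # prints False
```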




I enter credentials granting access to the backup location:




Now I can see the available backups (only one exists in our test situation):




We click on "Next" as needed and make additional selections as necessary:




We have some more restore options:




I have no disks to exclude but this is where we would select them:





We can install additional drivers if necessary:




These are the advanced options (selected by default):



I confirm my choices...







And the restore operation starts:




The restore failed once and I had to try again:




I tried unchecking the "Format and repartition disks" option...



But that only produced a different error:




After a second try (with the box checked), I was able to continue: 





The restore completes and the computer restarts automatically after a minute (unless we cancel the restart):



The restart seems to be successful...



And I am able to log in:




Our devices are present with no error or warning icons: 





And Windows remains activated:




In reality, if this were a production domain controller, we would also want to verify Active Directory health and replication with tools like DCDIAG and REPADMIN (among others).


***


In the scenario above, I make some assumptions that should not be taken for granted in a real recovery scenario, for example, the availability of our backup location, or the ability to authenticate successfully when accessing this location. These aspects (and others) require careful consideration when constructing a real recovery plan but exceed the scope of this blog post.