Monday, August 29, 2016

Exchange 2010 (SP3) - Message Tracking - 04

In reverse order of complexity, I finished by sending a message from one internal user (Alan Reid) to another internal user (Alannah Shaw) who both have their mailbox in the same mailbox database. I then replied to the message.

As a reminder, we have two Exchange servers (EX13-1 and EX13-2) configured in a Database Availability Group (DAG) with the active copy of the (sole) mailbox database currently on EX13-2.



First message - Alan Reid to Alannah Shaw


The Message Tracking logs on EX13-2 show two events (EventID): a NOTIFYMAPI event which provides some information about the message and then a SUBMIT event.

Note: I had not included the NOTIFYMAPI events in my previous blog posts but will show one example below simply so we can see what it looks like. And once again, I have edited out some parameters (essentially those without values) for simplicity and concision.


EX13-2

So Alan Reid sends a message to Alannah Shaw: "int to int 1".

As we just said, there is a NOTIFYMAPI event and then a SUBMIT event:


[PS] C:\>get-messagetrackinglog -Server "EX13-2" -Start "8/27/2016 10:45:00 PM" -End "8/27/2016 11:00:00 PM" | fl





Note: yes, you can click to enlarge.

According to one discussion I followed, it appears that the first entry is associated with the Mailbox role that notifies the Hub Transport role of the presence of a message (in the Outbox of the user?).

Then the Mailbox role submits the message.


Note: the STOREDRIVER is in fact a component of the Transport role.
We should keep in mind that the Hub Transport role is responsible for ALL message delivery, even when a message goes from one mailbox to another in the same mailbox database.

In that context, it is interesting to observe (in the screenshot above) that the Hub Transport role on EX13-2 (where the active copy of the mailbox database is located) submits the message to EX13-1 (here again, the Hub Transport role manages the mail flow).

But what happens next? To see, we have to change our perspective to EX13-1 where the message is briefly in transit (presumably in the Transport database - yes there is a transport database also, where messages reside for a more or less short period while in transit).


EX13-1

In the two screenshots below, we see that EX13-1 receives the message from EX13-2 (EventID = RECEIVE) and then delivers the message to the recipient Alannah Shaw (EventID = DELIVER).

Since there is no DELIVER event in the EX13-2 message tracking logs, it seems clear that EX13-1 places the message in the recipient mailbox.

[PS] C:\>get-messagetrackinglog -Server "EX13-1" -Start "8/27/2016 10:45:00 PM" -End "8/27/2016 11:00:00 PM" | fl






Second Message - Alannah Shaw to Alan Reid (reply)

For the reply, the message sequence is the same. I'll post the screenshots all the same so you can compare yourself.


EX13-2

[PS] C:\>get-messagetrackinglog -Server "EX13-2" -Start "8/27/2016 11:00:00 PM" -End "8/27/2016 11:30:00 PM" | fl






EX13-1


[PS] C:\>get-messagetrackinglog -Server "EX13-1" -Start "8/27/2016 11:00:00 PM" -End "8/27/2016 11:30:00 PM" | fl





***

There is an abundance of additional information related to the Hub Transport role and mail flow online. In particular, there is a chart that helps visualize the processing of messages, the "Overview of the transport pipeline in Exchange 2010" chart available here:



This article by Jaap Weeselius also provides another perspective on the various components of the Hub Transport role (MSEXCHANGE.ORG):



Sunday, August 28, 2016

Exchange 2010 (SP3) - Message Tracking - 03

After my first blog post on message tracking, I wanted to confirm the sequence of operations when one of "our" users (test user Alan Reid in this case) sends an email to an external recipient (a Gmail user).

The results confirmed the sequence in the first blog post.

In the scenario where we have two mail servers in a Database Availability Group, and when the mailbox database holding our user's mailbox (Alan Reid) is "active" on our first Exchange server (EX13-1), the following operations take place:
  • SUBMIT (EX13-1) : EX13-1 submits the message to EX13-2

-
  • RECEIVE (EX13-2) : EX13-2 receives the message from EX13-1
  • TRANSFER (EX13-2) : EX13-2 determines the route for the message (and performs content conversion).
  • SEND (EX13-2) : EX13-2 sends the message. We can obtain details of the session with the remote server from the protocol logs.


If I move the active copy of the mailbox database to EX13-2, the process is the exact opposite: EX13-2 submits the message to EX13-1 and the RECEIVE, TRANSFER and SEND operations take place on EX13-1.

The location of the DAG Primary Active Manager (PAM) does not seem to play a role.

I sent a total of 12 test messages, with the active copy of the mailbox database on EX13-1 for the first 3 tests, on EX13-2 for the next 3 tests, on EX13-1 for the following 3 tests and on EX13-2 for the last 3 tests. I see little usefulness in posting output or screenshots for each of these tests.

***

Now I want to send a message from Gmail to Alan Reid.

What entries does that produce in the Message Tracking logs ?

Before I proceed, I want to clarify the path taken by inbound messages.

Messages from Gmail first arrive at a hardware firewall that performs 1-to-1 NAT and forwards these messages to a Citrix NetScaler VPX load balancer. This device then forwards the messages to either of the two Exchange servers (EX13-1 or EX13-2).

So...

Gmail -> [ CLOUD] -> perimeter firewall (1-to-1 NAT) -> NetScaler VPX -> Exchange servers



***


First Message (Gmail - INBD - 001)


It looks like the incoming message from Gmail passes through the Netscaler load-balancer (10.0.0.33) and then is received by EX13-2.

Since the mailbox of recipient Alan Reid is active on EX13-1, EX13-2 passes the message to EX13-1 where it it is delivered.

Get-Messagetrackinglog -Server "EX13-2" -Start "8/27/2016 5:15:00 PM" -End "8/27/2016 5:30:00 PM" | fl




Note: you can click on the images to enlarge.

Note: I have edited out some of the parameters for simplicity and concision, especially parameters with no values.

It seems (since this is the first time I analyze this) that unlike the send operation to an external recipient with the following sequence of operations...

  • SUBMIT


  • RECEIVE
  • TRANSFER
  • SEND


We have a simple Receive and Deliver combination (both happening on EX13-2):


  • RECEIVE
  • DELIVER


But which server actually delivers the message?

Apparently, EX13-1 plays no role in the delivery of the message. In any case, no operations are recorded in the logs:

[PS] C:\>Get-Messagetrackinglog -Server "EX13-1" -Start "8/27/2016 5:15:00 PM" -End "8/27/2016 5:30:00 PM" | fl
[PS] C:\>

In the example above, there is no result for EX13-1 so it seems that EX13-2 peforms the delivery operation (even though the DELIVER operation entry lists EX13-1 for the ServerHostname).



Second Message (Gmail - INBD - 002)

Same thing for second message...

[PS] C:\>get-messagetrackinglog -Server "EX13-1" -Start "8/27/2016 5:45:00 PM" -End "8/27/2016 6:00:00 PM" | fl
[PS] C:\>
[PS] C:\>get-messagetrackinglog -Server "EX13-2" -Start "8/27/2016 5:45:00 PM" -End "8/27/2016 6:00:00 PM" | fl







Third Message (Gmail - INBD - 003)

Here it looks like EX13-1 receives the message (not EX13-2, contrary to the first two messages) but then sends it to EX13-2, even though the mailbox resides on EX13-1 (it is part of a single mailbox database that is now "active" on EX13-1):

[PS] C:\>get-messagetrackinglog -Server "EX13-1" -Start "8/27/2016 5:53:00 PM" -End "8/27/2016 5:55:00 PM" | fl






So, in this case, the sequence is:

  • RECEIVE
  • ROUTING
  • SEND

And EX13-1 sends the message to EX13-2, only for it to deliver the message to the mailbox that is "active" on EX13-1 (in fact, it is the copy of mailbox database that is "active" on EX13-1):





Here (above) we see the same Receive - Deliver combination that we observed for the first two messages.


Fourth Message (Gmail - INBD - 004)

As for the 4th message, EX13-2 receives it from the NetScaler as for the Message 1 and 2 above, and then passes it to EX13-1 where it is delivered. Since the sequence is exactly the same, I will not waste space by posting screenshots that illustrate exactly the same thing.


The load balancer seems to play a role here:



It is configured to use "Round Robin" between the two Exchange servers and could deliver to either one, although the alternance between the two does not seem to be strict. One would have thought that 2 messages would go to EX13-1 and 2 messages to EX13-2 but in fact (or in this case) it is 3 messages for EX13-2 and only 1 for EX13-1.

I will not test moving the PAM. I doubt very much that this clustering component has any effect at all on the forwarding decisions of the load balancer or even on the inter-Exchange server routing.


***


I changed the active copy of the database from EX13-1 to EX13-2. For the next two test messages, I observed the same behavior.
  1. If the server that does not hold the active copy of the mailbox database receives the message directly from the load balancer, it delivers the message. This happened for the first of two test messages.
  2. If the server that DOES hold the active copy of the mailbox database receives the message directly from the load balancer, it sends it to its partner which in return delivers the message (to the mailbox currently "active" on the first server). This happened for the second of two test messages.
After some reflection, I have opted not to post the screenshots of the operations that took place on the servers after moving the active copy of the mailbox database, since it would simply be a repetition of the sequences examined earlier.




Saturday, August 20, 2016

Exchange 2010 (SP3) - Message Tracking - 02

This post continues the previous post in which we examined message tracking logs to follow the path of an outbound message sent to a Gmail recipient.

As we discussed in the previous post, when we send a message to an external recipient, there should be a SEND entry in the logs that indicates the message was sent:


Note: this screenshot is the same as the one used in the previous blog post.

If we need more details about the transaction between our Exchange server and the remote mail server (or "smart host" in the case of an external relay server situated between our server and the final recipient), we can consult the Exchange "protocol logs", in this case the SMTP protocol.

Even for what I will call a "basic" transaction, or session, the output in the protocol log is quite verbose. If TLS encryption is negociated, the content of the logs is even more plentiful.

I will opt to show screenshots of one of the sessions that took place when I sent a test message to Gmail, concentrating on these "basic" operations, and excluding the TLS negociation.

I will then analyze  the elements of each sequence of the session. The objective is to recognize what happens when this all works so we can compare that to the output of a failed communication between mail servers.

***

This is what a single session will look like in the protocol log:

[snip]

I need to clarify some points before moving forward:
  • The most observant may remark that the dates do not coincide with the dates in the previous blog post. For some reason, I ended up using logs from a different date. That does not change anything fundamentally as for the analysis of the log.
  • I have changed the external recipient address for confidentiality.
  • I have "snipped" the section pertaining to TLS negociation for both simplicity and concision.
  • Although the external recipient is a Gmail user, the external mail server referenced in the logs is in fact a "smart host" through which my outbound messages are relayed.
  • A single protocol log file normally contains numerous SMTP sessions distinguished by the "session-id". 

The log is organized in columns (or "fields") separated by commas (we could open the log file in Excel for a better view). At the top of each column is a header that is more or less descriptive of the column's function (line 5 in the first of the two screenshots above):

#Fields: date-time,connector-id,session-id,sequence-number,local-endpoint,remote-endpoint,event,data,context

Let's dissect the first couple lines. What I will call the "first" line is the sixth line from the top of the log and has the "sequence-number" of 0 (zero). As we will see, each step of the communication between the two servers is designated by a number (starting with 0, not 1). If another message had been sent, the sequence would start all over again at zero. And as we mentioned before, each inter-server communication, or "session", has its own "session-id" that identifies it from the others in a log file that could include hundreds of sessions.


date-time

2016-05-21T18:43:39.389Z

This is the date and time of the event, measured to fractions of a second (.389) and expressed in UTC (Z = Zulu or "Zulu Time").

connector-id

MYNET-SendConn-1

This is the name of the send connector, in my case: MYNET-SendConn-1.

session-id

08D381A3C5B12312

The session ID distinguishes the events related to the sending of "message A" from events related to "message B" and "message C" and so forth. 

sequence-number

0

A sequence number marks each step of the sending of the email.

local-endpoint
,,

There is no local endpoint in first line. In following lines, it will be the IP address of the local Exchange server (10.0.0.24 in my case). 

remote-endpoint

8.23.224.61:25

This is the IP address of the remote mail server or smart host (note the port 25 - SMTP). In my case, it is, in fact, a relay (No-IP).

event

*

Please see the paragraph below (protocol logging event values). In this case, the " * " designates an informational event (no data has been transfered in either direction).

data

,, (none)

No data has been transfered so far.

context

attempting to connect

This is almost self-explanatory : my server is "attempting to connect" to the remote mail server (or smart host).


Protocol logging event values

For the event field, these are the possible values:

  • +   Connect
  • -   Disconnect
  • >   Send
  • <   Receive
  • *   Information

For additional information, please consult this MSDN document:


If we look at the two screenshots above, we can see that there are 55 operations that take place during this single session, some send, some receive, and so forth. Even if we exclude the SSL entries, that is quite a lot of data to analyze. Therefore, I will limit myself to the next line.

Since most of the first fields do not change (except for the fractions of second for date and time), I'll start with the session-id to underline that fact that it does not change but that the following field, the sequence-number does change.


session-id

08D381A3C5B12312

This number will remain the same for the entire session.


sequence-number

1

Remember: we start with 0.


local-endpoint

10.0.0.24:9484

This was not present in the first sequence (0). It is the IP address of the sending server plus a randomly selected port. Note that, while the port of the destination host will always be 25 (barring some exceptional reconfiguration), the port used by the sending server can be something else. I would have thought this port would have been in the 49152–65535 "ephemeral" port range but apparently it can be selected in the registered port range (1024 - 49151).


remote-endpoint

8.23.224.61:25

This remains the same.


event

+

According to the list above, this is a "connection" event.


data and context

Both these fields are blank on the second line (of the session).


***

Below is another perspective of the session. I've copied only the event, data and context fields (columns) so we can concentrate on the "conversation" between the two end-points. Note that, for the most part, the context field is not populated, the first line, with the value "attempting to connect" being an exception:

*,,attempting to connect
+,,
<,220 smtp-auth.no-ip.com ESMTP ready,
>,EHLO mail.mitserv.net,
<,250-smtp-auth.no-ip.com,
<,250-PIPELINING,
<,250-SIZE 36700160,
<,250-ENHANCEDSTATUSCODES,
<,250-8BITMIME,
<,250-DSN,
<,250-AUTH=PLAIN LOGIN,
<,250-AUTH PLAIN LOGIN,

Note : at this point, probably because of the SSL negociation which we have omitted in our analysis, the first steps above are repeated, but this time (apparently) within the protected SSL connection:

>,EHLO mail.mitserv.net,
<,250-smtp-auth.no-ip.com,
<,250-PIPELINING,
<,250-SIZE 36700160,
<,250-ENHANCEDSTATUSCODES,
<,250-8BITMIME,
<,250-DSN,
<,250-AUTH=PLAIN LOGIN,
<,250 AUTH PLAIN LOGIN,
>,AUTH LOGIN,
<,334 <authentication information>,
>,<Binary Data>,
<,334 <authentication information>,
>,<Binary Data>,
<,235 2.0.0 OK,
*,44,sending message
>,MAIL FROM:<Alan.Reid@mitserv.net> SIZE=6147,
>,RCPT TO:<gmail.recipient@gmail.com>,
<,250 2.1.0 Ok,
<,250 2.1.5 Ok,
>,DATA,
<,354 End data with <CR><LF>.<CR><LF>,
<,250 2.0.0 Ok: queued as 1E53E4005DE,
>,QUIT,
<,221 2.0.0 Bye,
-,,Local


I will not analyze each step but will point out that all the 2xx SMTP codes (220, 250, 221) indicate that the operation was a "complete success". 3xx codes (334, 354) indicate success as well but more information is expected. 4xx codes indicate a temporary or "transient" failure (example: maximum number of recpients reached - for a single operation, but the server can try again later) and 5xx codes indicate permanent errors (example: incorrect address - the message will never arrive unless the sender corrects the address).

***


In the context of message tracking and troubleshooting, I wanted to examine the SMTP connection process between two mail servers, without necessarily dissecting each and every step in detail, so I would have a "good copy" to which I could refer when attempting to resolve a failed connection.

Like many administrators, I often juggle several projects both at work and outside of work and the time I can dedicate to the analysis of a particular operation is not unlimited. Therefore, I opted not to analyze in greater detail each of the steps presented in the previous section.

However, for those that would like more information on the SMTP protocol, you could (besides performing your own searches) consult the following documents:

How IT Works: How Simple is SMTP? (Technet)

Simple Mail Transfer Protocol (Wikipedia)

TurboSMTP - good summary of codes

SMTP - Enhanced Status Codes Registry (IANA)

SMTP - extremely detailed IETF document

Monday, August 8, 2016

Exchange 2010 (SP3) - Message Tracking - 01

From time to time, I have to track a message to determine if it was delivered and if not, where it may have been blocked. In most situations, this process goes beyond the scope of Exchange and requires that we take into account perimeter devices (such as Barracuda or Ironport) and even online mail hygiene systems like Exchange Online Protection.

In one of the more common Exchange configurations, two multi-role servers configured as a Database Availability Group (or DAG), I have observed certain behavior that I wanted to verify by sending and resending a message in various scenarios, for example :
  • Send to external recipient
  • Receive from external sender
  • Send to internal recipient
  • Receive from internal sender.
Before I begin, however, I'm allowing myself a brief review of message tracking and protocol logs. Not having analyzed some of these types of logs for a while, I wanted to take another look at some of the basic concepts. My review is not necessarily complete, so feel free to consult other sources if necessary, and official Microsoft documentation in particular.

Note: since this blog post was becoming longer and longer, and since the analysis of the protocol logs related to my test message run the risk of being long and perhaps confusing as well, I've decided to look at the message tracking logs only in this first post and reserve the exam of the protocol logs to the following post. However, I will present the protocol logs (location, status) in this first post.


***

Where do we look when tracking a message ?

There are two types of logs we can examine : message tracking logs and protocol logs:
  • Message tracking - fairly readable.
  • Protocol (SMTP) - can provide very verbose details on connections established with remote SMTP servers but can be difficult to interpret.

The message tracking logs are stored here (by default):

C:\Program Files\Microsoft\Exchange Server\v14\TransportRoles\Logs\MessageTracking

We can use two tools in the Exchange 2010 "Toolbox" to track messages: "Message Tracking" (with a web interface) and "Tracking Log Explorer" with a MMC interface. I will use the latter.


As we shall see later, we can also view the content of the message tracking logs with PowerShell.

These are the default message tracking settings (click to enlarge):


For example, the maximum age for message tracking logs is 30 days. Older log files are deleted.

Message tracking is enabled by default. We can disable message tracking with this cmdlet:

Set-MailboxServer EX13-2 -MessageTrackingLogEnabled $false

Message tracking is a feature on Exchange servers holding either the Mailbox or Hub Transport role (or the less frequently used Edge Transport role).

If the server is only a Hub Transport server, we could use this cmdlet to disable message tracking:

Set-TransportServer EX13-2 -MessageTrackingLogEnabled $false

If both roles are present, we can use either cmdlet.

In any case, there is only one set of logs.

For those allergic to the command line, we can enable and disable message tracking here as well:




The "protocol" logs (in our case, SMTP protocol logs) are stored in the following location (we have a glimpse of this path in the last screenshot):

C:\Program Files\Microsoft\Exchange Server\v14\TransportRoles\Logs\ProtocolLog

There are three subfolders at this location: SmtpReceive and SmtpSend (plus a HTTPClient folder that does not play a role in mail transport). The logs are stored in these subfolders in the form of text files (just like the message tracking logs).

Unlike message tracking, SMTP protocol logging is not enabled by default. If we forget to enable it, we will not have much to work with. Moreover, SMTP protocol logging is not per server but rather per connecter; receive connector or send connector that is. We could enable protocol logging for receive connectors but not send connectors, or only certain receive connctors.

We can enable logging for a send connector with the following cmdlet (replace the send connector name with the name of your send connector(s)):

Set-SendConnector MYNET-SendConn-1 -ProtocolLoggingLevel Verbose

We can disable logging with this cmdlet:

Set-SendConnector MYNET-SendConn-1 -ProtocolLoggingLevel None

Note: "None" and "Verbose" are the only choices here. "None" is the default value.

Once again, if we prefer the EMC, we can enable/disable logging here:


The procedure would be the same for other send connectors or for receive connectors.


***


This is my configuration for our experiments in message tracking...

We have two Exchange servers: EX13-1 and EX13-2.
  • Each has the following roles: Client Access, Hub Transport, Mailbox.
  • A DAG has been configured.
  • EX13-1 is currently the PAM and holds the "active" copy of the sole mailbox database: DB01

Here's proof:

[PS] C:\>Get-MailboxDatabase | Get-MailboxDatabaseCopyStatus | fl Name,Status

Name   : DB1\EX13-1
Status : Mounted

Name   : DB1\EX13-2
Status : Healthy

[PS] C:\>Get-DatabaseAvailabilityGroup -s | fl Name,Servers,OperationalServers,PrimaryActiveManager

Name                     : DAG1
Servers                   : {EX13-2, EX13-1}
OperationalServers   : {EX13-1, EX13-2}
PrimaryActiveManager : EX13-1


What happens when... we send a message to an external recipient (at Gmail for example)?


Message Tracking logs

For the first experiment, our test user Alan Reid sends a message to a Gmail user. Alan Reid's mailbox is on EX13-1 (DB01). He accesses his mailbox Via Outlook. We are not using Exchange cached mode so the mailbox is being accessed directly (no OST files).

I begin the message tracking on EX13-1 with the following parameters so all operations (SUBMIT, RECEIVE, SEND, etc.) are captured:



I will use the PS equivalent (visible at the bottom of the sceenshot above) to display the details of each operation.

Note: I will also edit addresses for privacy and "snip" some of the output for concision and readability (especially when the values for the parameter in question is null).

On EX13-1, the only operation recorded is a SUBMIT event (other than a NOTIFYMAPI entry with essentially no data).

Here are the details of the SUBMIT event (with, once again, some edits):

Get-Messagetrackinglog -Start "7/18/2016 11:00:00 AM" -End "7/18/2016 12:05:00 PM" | fl

Timestamp               : 7/18/2016 11:57:15 AM
ClientIp                    : fe80::a179:11fd:2f31:9329%16
ClientHostname       : EX13-1
ServerIp                    :
ServerHostname       : EX13-2
ConnectorId              :
Source                      : STOREDRIVER
EventId                     : SUBMIT
InternalMessageId    :
MessageId               : <5386E0BEF5FB[...]15075DC27@EX13-1.mynet.lan>
Recipients                : {}
RecipientStatus       : {}
MessageSubject       : OutBound-001
Sender                     : Alan.Reid@mitserv.net
ReturnPath              :
MessageInfo             : 2016-07-18T15:57:14.276Z;LSRV=EX13-1.mynet.lan:TOTAL=1


It looks like the STOREDRIVER on EX13-1 submits the message to EX13-2, the other server of the pair:
  • ClientHostname = EX13-1
  • ServerHostname = EX13-2

I also noted that for the protocol logs (SmtpSend and SmtpReceive), there is no file for the current date (2016-7-18)  on EX13-1.

So, apparently, EX13-1 has submitted the message to EX13-2 and we have to consult the message tracking logs (and possibly the protocol logs) on EX13-2 to follow  our outbound email.

This is what we have on EX13-2 :



We can see the details better with the format-list option (fl) :

Get-messagetrackinglog -Start "7/18/2016 11:00:00 AM" -End "7/18/2016 12:05:00 PM" | fl

RECEIVE


The RECEIVE entry on EX13-2 seems to correspond with the SUBMIT entry on EX13-1 : in both cases, EX13-1 is the "client" (ClientHostName) and EX13-2 is the "server" (ServerHostName).

Some more parameters now have values, notably "Recipients" and "Returnpath".

Now that EX13-2 has the message "Outbound-001", what does it do with the message?


TRANSFER



At this point, EX13-2 is peforming ROUTING and content conversion. It determines the IP address of what is (at first glance) the recipient's mail server and the IP address of this server.

In fact, "smtp-auth.no-ip.com" is a "smart host" through which messages traverse before reaching the ultimate destination, in this case Gmail.

A send connector has been selected, the message appears to have been queued ("as 054E2400836") and the recipient status is Ok. It seems like the message has been sent.

In any case, this is the last event recorded in Message Tracking :

SEND



Note: the Gmail recipient did indeed receive the message.


Protocol logs (to be continued...)

I entitled this blog post "Message Tracking" even though I'm examining not only the "message tracking" logs but also protocol logs. These can be useful if we track, for example, an outgoing message that apparently cannot reach the target mail server, or a message that an external remote server cannot send to our organization. If, for some reason, the servers cannot "talk" to each other in general, then it is clear that specific messages will not be delivered. If the servers can communicate, we can at least rule out a general communication failure as the cause of the problem.

 As I studied the protocol log file associated with the message sent above, I was reminded that it is not only challenging to read the logs but also present them. For this reason, I have decided to divide the analysis of this first (outbound) message into two parts: "Message Tracking" per se, that we have examined in this blog post, and the "dissection" of a send protocol log that will be the subject of my next blog post.

Sunday, August 7, 2016

Exchange 2010 (SP3) - PowerShell 4 and scripts

PowerShell 4.0 (PS4) is now compatible with Exchange 2010 SP3 provided that Rollup 5 (or above) is installed.

This is a welcome change for those using NetApp storage, as the most recent versions of the SnapDrive and Snap Manager for Exchange (backup) software require PS4 as a pre-requisite.

However, some users have noticed that although PS4 is now supported, scripts like the StartDagMaintenance.ps1 script will not execute:

StartDagServerMaintenance fails during checkdatabaseredundancy

Indeed, this is stated in the Exchange 2010 Supportability Matrix:

Exchange Server Supportability Matrix

Windows Management Framework 3.0 and Windows Management Framework 4.0 can be used to perform operating system-related management tasks on a computer that's running Exchange 2010 SP3 RU5 or later. However, Exchange 2010 cmdlets and Exchange 2010 scripts require Windows PowerShell 2.0. Using Exchange 2010 cmdlets and scripts with Windows Management Framework 3.0 or Windows Management Framework 4.0 isn't supported. 

Note: PowerShell is part of the Windows Management Framework (WMF). If we want to have PS4, for example, we install WMF4.

Having observed some problems with the Exchange DAG maintenance scripts myself (besides the discussion in the link above), I wanted to study the interaction between PowerShell 4 and these scripts in a test environment. If the problem is confirmed, I intend to elaborate a workaround solution (probably executing the cmdlets in the script manually).

***

Here is the status quo of my two Exchange 2010 servers...

The PowerShell version is 2:




I have installed Rollup 14 (Exchange 2010 SP3 Rollup 5 is the lowest version compatible with PS4):




The Exchange servers are named EX13-1 and EX13-2 (despite the "13" these are indeed Exchange 2010 servers).

We have a Database Availability Group (DAG) configured. EX13-1 and EX13-2 are both members. At this time, EX13-2 holds both the "active" copy of the mailbox database "DB1" and the "Primary Active Manager" (PAM) role:

Get-MailboxDatabase | Get-MailboxDatabaseCopyStatus | fl name,status

Name   : DB1\EX13-1
Status : Healthy

Name   : DB1\EX13-2
Status : Mounted


Get-DatabaseAvailabilityGroup -s | fl *primary*

PrimaryActiveManager : EX13-2



Let's verify that the DAG script works now (with PowerShell 2.0):



That seems to be the case (no news - or error messages - is good news with PowerShell).

Note: click on the images to enlarge.


Also notice the change concerning the mailbox database, now mounted on EX13-1, and the PAM:

Get-MailboxDatabase | Get-MailboxDatabaseCopyStatus | fl name,status

Name   : DB1\EX13-1
Status : Mounted

Name   : DB1\EX13-2
Status : Healthy


Get-DatabaseAvailabilityGroup -s | fl *primary*

PrimaryActiveManager : EX13-1


So the script functions perfectly fine with PowerShell 2.


***


Now we'll install PS4 and see what happens.

PowerShell 4 (PS4) is part of the Windows Management Framework 4.0 package (WMF4), so we obtain PS4 by installing WMF4 on the server.

We can download WMF4 at this link:

Windows Management Framework 4.0

This is the file that is appropriate for Windows 2008 R2 (SP1):



Note: if you have Windows 2012, there is a different file.

Once again, WMF 4.0 is presented as being compatible with Exchange 2010 SP3 (if Rollup 5 - or above - is installed):


I will not illustrate each step of the installation with screenshots. It's essentially a matter of clicking "OK", "Next", etc., and restarting the server.


What version of PowerShell do we have now?

Surprise...

In the Exchange Management Shell (EMS), it is still PS2 but in the "regular" PS, it is now PS4:



Note: the same is true on the other Exchange server after installation of WMF4.

I noted that the shortcut to the EMS does refer to PS2:

C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -version 2.0 -noexit -command ". 'C:\Program Files\Microsoft\Exchange Server\V14\bin\RemoteExchange.ps1'; Connect-ExchangeServer -auto"

Moreover, our DAG scripts seem to execute.

In fact, they do execute, consistently (numerous attempts were made), on both Exchange servers.



Note: It looks like new versions of PS uninstall previous versions - with a particular exception for PS2:


So PS4 can still run scripts in "PS2 mode" but PS2 is no longer present as an autonomous program or set of commands.


***


This is a good result but not consistent with what others have reported (even though, supposedly, they have a rollup equal or superior to RU5) and what I have observed elsewhere myself.

At this point, I am going to proceed as I would have, if the scripts did not run.

How could we perform each of the operations (manually) if the script will not execute for some reason (or not completely)?

Besides finding a workaround, useful if we encounter the problem elsewhere, this is also an opportunity to understand the function of the DAG script better.

What actually happens?

I will use these directions:

StartDagServerMaintenance.ps1 script fails in an Exchange Server 2010 environment


Step 1

We first determine if there is another "healthy" (and "non-lagged") Exchange server to which we can fail over the mailbox databases:

Get-MailboxDatabase | Get-MailboxDatabaseCopyStatus | fl name,status

Name   : DB1\EX13-1
Status : Mounted

Name   : DB1\EX13-2
Status : Healthy


Yes, the mailbox database DB1 is "mounted" on EX13-1 and we have a healthy copy on EX13-2 that we can make active (that, essentially, is the "fail over" process).


Step 2

We move the active copy of the mailboxes off the server (that will undergo maintenance) with this cmdlet:

Move-ActiveMailboxDatabase -Server EX13-1



Note: we could also use this cmdlet that specifies the database, the server on which we want to activate the database, and the override setting:

Move-ActiveMailboxDatabase -Identity 'DB1' -ActivateOnServer 'EX13-2' -MountDialOverride 'None'


Step 3

We move the PAM role with this command:

cluster dag1.mynet.lan group "Cluster Group" /moveto:EX13-2



We can confirm the PAM was moved:

Get-DatabaseAvailabilityGroup -s | fl *primary*

PrimaryActiveManager : EX13-2


Some notes on this command (not a PS cmdlet although we can execute it in the EMS):

Do I need to move my cluster core resources?




Step 4

We suspend activation on the mailbox copies on the server that will undergo maintenance. We want to prevent activation of these mailboxes while maintenance is taking place.

Here is the status quo:

Get-MailboxDatabase | Get-MailboxDatabaseCopyStatus | fl name,status

Name   : DB1\EX13-1
Status : Healthy

Name   : DB1\EX13-2
Status : Mounted


I execute this cmdlet (suspending activation only):

Get-MailboxDatabaseCopyStatus -Server EX13-1 | Suspend-MailboxDatabaseCopy -ActivationOnly:$true


I notice that the "Status" of the mailbox does not change but the "ActivationSuspended" value does:

Get-MailboxDatabase | Get-MailboxDatabaseCopyStatus | fl name,activation*

Name                : DB1\EX13-1
ActivationSuspended : True

Name                : DB1\EX13-2
ActivationSuspended : False



Step 5

I pause the cluster node (Exchange server) where maintenance will take place:

cluster dag1.mynet.lan node EX13-1 /pause




Step 6

Lastly, I block activation of the (mailbox) server itself:

Set-MailboxServer EX13-1 -DatabaseCopyAutoActivationPolicy:Blocked



Now the server in question cannot host the active copy of a mailbox database, which we want to prevent during maintenance.


***


At this point, we perform maintenance (install updates, etc.) restart the server if necessary and are ready to resume normal operations.


Step A

We resume cluster operations on the node in question:

cluster dag1.mynet.lan node EX13-1 /resume




Step B

We unblock activation on the server:

Set-MailboxServer EX13-1 -DatabaseCopyAutoActivationPolicy:Unrestricted



Step C

We resume mailbox database copy operations (in fact, it was only activation that we suspended so that is all that needs to be resumed):

Get-MailboxDatabaseCopyStatus -Server EX13-1 | Resume-MailboxDatabaseCopy


***

Even though the DAG maintenance scripts seemed to function just fine, I was able to test the manual procedure above and everything seemed to work there as well.



Monday, August 1, 2016

Office 365 - new Hybrid Configuration Wizard

This post marks a pause in my series on CentOS Linux (composed in French). I continue to work with Office 365 and wanted to run the Hybrid Configuration Wizard (HCW) included in the Exchange 2010 Management Console. This was in the context of troubleshooting another problem that I will not address here. In any event, the following error message displayed:




However, Exchange Online has clearly been added as an additional forest:




I had not executed the HCW for a while but know for a fact that it functioned correctly at one time (perhaps several months ago).

An online search led me to this Microsoft article:

You must add your Exchange Online tenant as an additional forest message

We learn that:

The Hybrid Configuration wizard that's included in the Exchange Management Console in Microsoft Exchange Server 2010 is no longer supported. Therefore, you should no longer use the old Hybrid Configuration wizard. Instead, use the Office 365 Hybrid Configuration wizard [...].

There is a direct download link for the wizard but for some reason the installation failed, supposedly due to some missing files. So I thought I would just login to my Office 365 tenant and attempt the operation from there.

***

But first, as the article explains, we must remove and then re-add the Office 365 / Exchange Online "forest" in the EMC. I right-click on the Office 365 entry and select "Remove" in the menu:




I then re-add Office 365 as an Exchange forest:




We enter a name for the "new" forest and select Exchange Online for the "FQDN" (if you click on the arrow, it is a choice - althought technically not a "FQDN"):



I enter my credentials when prompted:




And Office 365 has been re-established as an Exchange forest in the EMC:




***


Once I re-add Exchange Online as an additional forest, I login to Office 365 and find myself at this page:



As indicated by the red dots in the screenshot above, we click on the Applications icon and then on "Exchange", which leads us to this page (you can click on the image to enlarge):




As for the HCW, we click on the "Configure" button (shown above). We are then prompted to download the application:

Note: if we perform this operation on the Exchange server itself, we may have to disable the Internet Explorer Enhanced Security Configuration, at least temporarily.


We respond to any prompts by allowing the installation and once all is complete, we should see the following page:




We first specify (or detect) our on-premises Exchange servers and then select an Office 365 host. In my case, it is "Office 365 Worldwide" (there is also a special option for China and Germany):




We then enter our credentials for both on-premises Exchange and the Office 365 tenant:




Connections and credentials are validated (successfully in this case):



Now the on-premises Exchange servers will be configured to function with Exchange Online. By default, email to and from external senders passes though Exchange Online first and messages for on-premise mailboxes are then routed to these mailboxes. This requires a modification of MX records. I have not changed these records and want mail to continue to transit via my on-premises Exchange servers for the moment. Therefore, I will check the "Enable centralized mail transport" option:



We select our Hub Transport servers (if they are not already selected automatically):



We enter our public IP address (if we are running the HCW for a second time, many of these fields will already be populated):



We indicate the organization FQDN:



And we are "ready to update" (Click "Next" or "Finish" as appropriate):



The HCW is working...




And the process completes:




***

One final comment: I encountered an error on the first try because a perimeter firewall rule was blocking port 443 inbound (sometimes open, sometimes closed, depending on the experiments I conduct in my text network). Once I adjusted the rule and ran the HCW again, the configuration completed as shown above.