Hello all! Having worked for a while with various computer systems, primarily Active Directory and Exchange, I wanted to share some of my experiences with two objectives in mind: 1) obtain feedback to improve my mastery of those systems and 2) help others working on the same subject. Other posts are about CentOS, Citrix NetScaler, and VMware.
NOTE: most of my posts are in English but some others in French, with a summary in English. However, some of the CentOS blog posts lack this summary.
In reverse order of complexity, I finished by sending a message from one internal user (Alan Reid) to another internal user (Alannah Shaw) who both have their mailbox in the same mailbox database. I then replied to the message.
As a reminder, we have two Exchange servers (EX13-1 and EX13-2) configured in a Database Availability Group (DAG) with the active copy of the (sole) mailbox database currently on EX13-2.
First message - Alan Reid to Alannah Shaw
The Message Tracking logs on EX13-2 show two events (EventID): a NOTIFYMAPI event which provides some information about the message and then a SUBMIT event.
Note: I had not included the NOTIFYMAPI events in my previous blog posts but will show one example below simply so we can see what it looks like. And once again, I have edited out some parameters (essentially those without values) for simplicity and concision.
So Alan Reid sends a message to Alannah Shaw: "int to int 1".
As we just said, there is a NOTIFYMAPI event and then a SUBMIT event:
According to one discussion I followed, it appears that the first entry is associated with the Mailbox role that notifies the Hub Transport role of the presence of a message (in the Outbox of the user?).
We should keep in mind that the Hub Transport role is responsible for ALL message delivery, even when a message goes from one mailbox to another in the same mailbox database.
In that context, it is interesting to observe (in the screenshot above) that the Hub Transport role on EX13-2 (where the active copy of the mailbox database is located) submits the message to EX13-1 (here again, the Hub Transport role manages the mail flow).
But what happens next? To see, we have to change our perspective to EX13-1 where the message is briefly in transit (presumably in the Transport database - yes there is a transport database also, where messages reside for a more or less short period while in transit).
In the two screenshots below, we see that EX13-1 receives the message from EX13-2 (EventID = RECEIVE) and then delivers the message to the recipient Alannah Shaw (EventID = DELIVER).
Since there is no DELIVER event in the EX13-2 message tracking logs, it seems clear that EX13-1 places the message in the recipient mailbox.
There is an abundance of additional information related to the Hub Transport role and mail flow online. In particular, there is a chart that helps visualize the processing of messages, the "Overview of the transport pipeline in Exchange 2010" chart available here:
After my first blog post on message tracking, I wanted to confirm the sequence of operations when one of "our" users (test user Alan Reid in this case) sends an email to an external recipient (a Gmail user).
The results confirmed the sequence in the first blog post.
In the scenario where we have two mail servers in a Database Availability Group, and when the mailbox database holding our user's mailbox (Alan Reid) is "active" on our first Exchange server (EX13-1), the following operations take place:
SUBMIT (EX13-1) : EX13-1 submits the message to EX13-2
RECEIVE (EX13-2) : EX13-2 receives the message from EX13-1
TRANSFER (EX13-2) : EX13-2 determines the route for the message (and performs content conversion).
SEND (EX13-2) : EX13-2 sends the message. We can obtain details of the session with the remote server from the protocol logs.
If I move the active copy of the mailbox database to EX13-2, the process is the exact opposite: EX13-2 submits the message to EX13-1 and the RECEIVE, TRANSFER and SEND operations take place on EX13-1.
The location of the DAG Primary Active Manager (PAM) does not seem to play a role.
I sent a total of 12 test messages, with the active copy of the mailbox database on EX13-1 for the first 3 tests, on EX13-2 for the next 3 tests, on EX13-1 for the following 3 tests and on EX13-2 for the last 3 tests. I see little usefulness in posting output or screenshots for each of these tests.
Now I want to send a message from Gmail to Alan Reid.
What entries does that produce in the Message Tracking logs ?
Before I proceed, I want to clarify the path taken by inbound messages.
Messages from Gmail first arrive at a hardware firewall that performs 1-to-1 NAT and forwards these messages to a Citrix NetScaler VPX load balancer. This device then forwards the messages to either of the two Exchange servers (EX13-1 or EX13-2).
Here it looks like EX13-1 receives the message (not EX13-2, contrary to the first two messages) but then sends it to EX13-2, even though the mailbox resides on EX13-1 (it is part of a single mailbox database that is now "active" on EX13-1):
And EX13-1 sends the message to EX13-2, only for it to deliver the message to the mailbox that is "active" on EX13-1 (in fact, it is the copy of mailbox database that is "active" on EX13-1):
Here (above) we see the same Receive - Deliver combination that we observed for the first two messages.
Fourth Message (Gmail - INBD - 004)
As for the 4th message, EX13-2 receives it from the NetScaler as for the Message 1 and 2 above, and then passes it to EX13-1 where it is delivered. Since the sequence is exactly the same, I will not waste space by posting screenshots that illustrate exactly the same thing.
The load balancer seems to play a role here:
It is configured to use "Round Robin" between the two Exchange servers and could deliver to either one, although the alternance between the two does not seem to be strict. One would have thought that 2 messages would go to EX13-1 and 2 messages to EX13-2 but in fact (or in this case) it is 3 messages for EX13-2 and only 1 for EX13-1.
I will not test moving the PAM. I doubt very much that this clustering component has any effect at all on the forwarding decisions of the load balancer or even on the inter-Exchange server routing.
I changed the active copy of the database from EX13-1 to EX13-2. For the next two test messages, I observed the same behavior.
If the server that does not hold the active copy of the mailbox database receives the message directly from the load balancer, it delivers the message. This happened for the first of two test messages.
If the server that DOES hold the active copy of the mailbox database receives the message directly from the load balancer, it sends it to its partner which in return delivers the message (to the mailbox currently "active" on the first server). This happened for the second of two test messages.
After some reflection, I have opted not to post the screenshots of the operations that took place on the servers after moving the active copy of the mailbox database, since it would simply be a repetition of the sequences examined earlier.
This post continues the previous post in which we examined message tracking logs to follow the path of an outbound message sent to a Gmail recipient.
As we discussed in the previous post, when we send a message to an external recipient, there should be a SEND entry in the logs that indicates the message was sent:
Note: this screenshot is the same as the one used in the previous blog post.
If we need more details about the transaction between our Exchange server and the remote mail server (or "smart host" in the case of an external relay server situated between our server and the final recipient), we can consult the Exchange "protocol logs", in this case the SMTP protocol.
Even for what I will call a "basic" transaction, or session, the output in the protocol log is quite verbose. If TLS encryption is negociated, the content of the logs is even more plentiful.
I will opt to show screenshots of one of the sessions that took place when I sent a test message to Gmail, concentrating on these "basic" operations, and excluding the TLS negociation.
I will then analyze the elements of each sequence of the session. The objective is to recognize what happens when this all works so we can compare that to the output of a failed communication between mail servers.
This is what a single session will look like in the protocol log:
I need to clarify some points before moving forward:
The most observant may remark that the dates do not coincide with the dates in the previous blog post. For some reason, I ended up using logs from a different date. That does not change anything fundamentally as for the analysis of the log.
I have changed the external recipient address for confidentiality.
I have "snipped" the section pertaining to TLS negociation for both simplicity and concision.
Although the external recipient is a Gmail user, the external mail server referenced in the logs is in fact a "smart host" through which my outbound messages are relayed.
A single protocol log file normally contains numerous SMTP sessions distinguished by the "session-id".
The log is organized in columns (or "fields") separated by commas (we could open the log file in Excel for a better view). At the top of each column is a header that is more or less descriptive of the column's function (line 5 in the first of the two screenshots above):
Let's dissect the first couple lines. What I will call the "first" line is the sixth line from the top of the log and has the "sequence-number" of 0 (zero). As we will see, each step of the communication between the two servers is designated by a number (starting with 0, not 1). If another message had been sent, the sequence would start all over again at zero. And as we mentioned before, each inter-server communication, or "session", has its own "session-id" that identifies it from the others in a log file that could include hundreds of sessions.
This is the date and time of the event, measured to fractions of a second (.389) and expressed in UTC (Z = Zulu or "Zulu Time").
This is the name of the send connector, in my case: MYNET-SendConn-1.
The session ID distinguishes the events related to the sending of "message A" from events related to "message B" and "message C" and so forth.
A sequence number marks each step of the sending of the email.
There is no local endpoint in first line. In following lines, it will be the IP address of the local Exchange server (10.0.0.24 in my case).
This is the IP address of the remote mail server or smart host (note the port 25 - SMTP). In my case, it is, in fact, a relay (No-IP).
Please see the paragraph below (protocol logging event values). In this case, the " * " designates an informational event (no data has been transfered in either direction).
No data has been transfered so far.
attempting to connect
This is almost self-explanatory : my server is "attempting to connect" to the remote mail server (or smart host).
Protocol logging event values
For the event field, these are the possible values:
For additional information, please consult this MSDN document:
If we look at the two screenshots above, we can see that there are 55 operations that take place during this single session, some send, some receive, and so forth. Even if we exclude the SSL entries, that is quite a lot of data to analyze. Therefore, I will limit myself to the next line.
Since most of the first fields do not change (except for the fractions of second for date and time), I'll start with the session-id to underline that fact that it does not change but that the following field, the sequence-number does change.
This number will remain the same for the entire session.
Remember: we start with 0.
This was not present in the first sequence (0). It is the IP address of the sending server plus a randomly selected port. Note that, while the port of the destination host will always be 25 (barring some exceptional reconfiguration), the port used by the sending server can be something else. I would have thought this port would have been in the 49152–65535 "ephemeral" port range but apparently it can be selected in the registered port range (1024 - 49151).
This remains the same.
According to the list above, this is a "connection" event.
data and context
Both these fields are blank on the second line (of the session).
Below is another perspective of the session. I've copied only the event, data and context fields (columns) so we can concentrate on the "conversation" between the two end-points. Note that, for the most part, the context field is not populated, the first line, with the value "attempting to connect" being an exception:
I will not analyze each step but will point out that all the 2xx SMTP codes (220, 250, 221) indicate that the operation was a "complete success". 3xx codes (334, 354) indicate success as well but more information is expected. 4xx codes indicate a temporary or "transient" failure (example: maximum number of recpients reached - for a single operation, but the server can try again later) and 5xx codes indicate permanent errors (example: incorrect address - the message will never arrive unless the sender corrects the address).
In the context of message tracking and troubleshooting, I wanted to examine the SMTP connection process between two mail servers, without necessarily dissecting each and every step in detail, so I would have a "good copy" to which I could refer when attempting to resolve a failed connection.
Like many administrators, I often juggle several projects both at work and outside of work and the time I can dedicate to the analysis of a particular operation is not unlimited. Therefore, I opted not to analyze in greater detail each of the steps presented in the previous section.
However, for those that would like more information on the SMTP protocol, you could (besides performing your own searches) consult the following documents:
From time to time, I have to track a message to determine if it was delivered and if not, where it may have been blocked. In most situations, this process goes beyond the scope of Exchange and requires that we take into account perimeter devices (such as Barracuda or Ironport) and even online mail hygiene systems like Exchange Online Protection.
In one of the more common Exchange configurations, two multi-role servers configured as a Database Availability Group (or DAG), I have observed certain behavior that I wanted to verify by sending and resending a message in various scenarios, for example :
Send to external recipient
Receive from external sender
Send to internal recipient
Receive from internal sender.
Before I begin, however, I'm allowing myself a brief review of message tracking and protocol logs. Not having analyzed some of these types of logs for a while, I wanted to take another look at some of the basic concepts. My review is not necessarily complete, so feel free to consult other sources if necessary, and official Microsoft documentation in particular.
Note: since this blog post was becoming longer and longer, and since the analysis of the protocol logs related to my test message run the risk of being long and perhaps confusing as well, I've decided to look at the message tracking logs only in this first post and reserve the exam of the protocol logs to the following post. However, I will present the protocol logs (location, status) in this first post.
Where do we look when tracking a message ?
There are two types of logs we can examine : message tracking logs and protocol logs:
Message tracking - fairly readable.
Protocol (SMTP) - can provide very verbose details on connections established with remote SMTP servers but can be difficult to interpret.
The message tracking logs are stored here (by default):
There are three subfolders at this location: SmtpReceive and SmtpSend (plus a HTTPClient folder that does not play a role in mail transport). The logs are stored in these subfolders in the form of text files (just like the message tracking logs).
Unlike message tracking, SMTP protocol logging is not enabled by default. If we forget to enable it, we will not have much to work with. Moreover, SMTP protocol logging is not per server but rather per connecter; receive connector or send connector that is. We could enable protocol logging for receive connectors but not send connectors, or only certain receive connctors.
We can enable logging for a send connector with the following cmdlet (replace the send connector name with the name of your send connector(s)):
What happens when... we send a message to an external recipient (at Gmail for example)?
Message Tracking logs
For the first experiment, our test user Alan Reid sends a message to a Gmail user. Alan Reid's mailbox is on EX13-1 (DB01). He accesses his mailbox Via Outlook. We are not using Exchange cached mode so the mailbox is being accessed directly (no OST files).
I begin the message tracking on EX13-1 with the following parameters so all operations (SUBMIT, RECEIVE, SEND, etc.) are captured:
I will use the PS equivalent (visible at the bottom of the sceenshot above) to display the details of each operation.
Note: I will also edit addresses for privacy and "snip" some of the output for concision and readability (especially when the values for the parameter in question is null).
On EX13-1, the only operation recorded is a SUBMIT event (other than a NOTIFYMAPI entry with essentially no data).
Here are the details of the SUBMIT event (with, once again, some edits):
The RECEIVE entry on EX13-2 seems to correspond with the SUBMIT entry on EX13-1 : in both cases, EX13-1 is the "client" (ClientHostName) and EX13-2 is the "server" (ServerHostName).
Some more parameters now have values, notably "Recipients" and "Returnpath".
Now that EX13-2 has the message "Outbound-001", what does it do with the message?
At this point, EX13-2 is peforming ROUTING and content conversion. It determines the IP address of what is (at first glance) the recipient's mail server and the IP address of this server.
In fact, "smtp-auth.no-ip.com" is a "smart host" through which messages traverse before reaching the ultimate destination, in this case Gmail.
A send connector has been selected, the message appears to have been queued ("as 054E2400836") and the recipient status is Ok. It seems like the message has been sent.
In any case, this is the last event recorded in Message Tracking :
Note: the Gmail recipient did indeed receive the message.
Protocol logs (to be continued...)
I entitled this blog post "Message Tracking" even though I'm examining not only the "message tracking" logs but also protocol logs. These can be useful if we track, for example, an outgoing message that apparently cannot reach the target mail server, or a message that an external remote server cannot send to our organization. If, for some reason, the servers cannot "talk" to each other in general, then it is clear that specific messages will not be delivered. If the servers can communicate, we can at least rule out a general communication failure as the cause of the problem.
As I studied the protocol log file associated with the message sent above, I was reminded that it is not only challenging to read the logs but also present them. For this reason, I have decided to divide the analysis of this first (outbound) message into two parts: "Message Tracking" per se, that we have examined in this blog post, and the "dissection" of a send protocol log that will be the subject of my next blog post.
Windows Management Framework 3.0 and Windows Management Framework 4.0 can be used to perform operating system-related management tasks on a computer that's running Exchange 2010 SP3 RU5 or later. However, Exchange 2010 cmdlets and Exchange 2010 scripts require Windows PowerShell 2.0. Using Exchange 2010 cmdlets and scripts with Windows Management Framework 3.0 or Windows Management Framework 4.0 isn't supported.
Note: PowerShell is part of the Windows Management Framework (WMF). If we want to have PS4, for example, we install WMF4.
Having observed some problems with the Exchange DAG maintenance scripts myself (besides the discussion in the link above), I wanted to study the interaction between PowerShell 4 and these scripts in a test environment. If the problem is confirmed, I intend to elaborate a workaround solution (probably executing the cmdlets in the script manually).
Here is the status quo of my two Exchange 2010 servers...
The PowerShell version is 2:
I have installed Rollup 14 (Exchange 2010 SP3 Rollup 5 is the lowest version compatible with PS4):
The Exchange servers are named EX13-1 and EX13-2 (despite the "13" these are indeed Exchange 2010 servers).
We have a Database Availability Group (DAG) configured. EX13-1 and EX13-2 are both members. At this time, EX13-2 holds both the "active" copy of the mailbox database "DB1" and the "Primary Active Manager" (PAM) role:
Yes, the mailbox database DB1 is "mounted" on EX13-1 and we have a healthy copy on EX13-2 that we can make active (that, essentially, is the "fail over" process).
We move the active copy of the mailboxes off the server (that will undergo maintenance) with this cmdlet:
Move-ActiveMailboxDatabase -Server EX13-1
Note: we could also use this cmdlet that specifies the database, the server on which we want to activate the database, and the override setting: Move-ActiveMailboxDatabase -Identity 'DB1' -ActivateOnServer 'EX13-2' -MountDialOverride 'None'
We move the PAM role with this command:
cluster dag1.mynet.lan group "Cluster Group" /moveto:EX13-2
This post marks a pause in my series on CentOS Linux (composed in French). I continue to work with Office 365 and wanted to run the Hybrid Configuration Wizard (HCW) included in the Exchange 2010 Management Console. This was in the context of troubleshooting another problem that I will not address here. In any event, the following error message displayed:
However, Exchange Online has clearly been added as an additional forest:
I had not executed the HCW for a while but know for a fact that it functioned correctly at one time (perhaps several months ago).
An online search led me to this Microsoft article:
The Hybrid Configuration wizard
that's included in the Exchange Management Console in Microsoft Exchange
Server 2010 is no longer supported. Therefore, you should no longer use
the old Hybrid Configuration wizard. Instead, use the Office 365 Hybrid
Configuration wizard [...].
There is a direct download link for the wizard but for some reason the installation failed, supposedly due to some missing files. So I thought I would just login to my Office 365 tenant and attempt the operation from there.
But first, as the article explains, we must remove and then re-add the Office 365 / Exchange Online "forest" in the EMC. I right-click on the Office 365 entry and select "Remove" in the menu:
I then re-add Office 365 as an Exchange forest:
We enter a name for the "new" forest and select Exchange Online for the "FQDN" (if you click on the arrow, it is a choice - althought technically not a "FQDN"):
I enter my credentials when prompted:
And Office 365 has been re-established as an Exchange forest in the EMC:
Once I re-add Exchange Online as an additional forest, I login to Office 365 and find myself at this page:
As indicated by the red dots in the screenshot above, we click on the Applications icon and then on "Exchange", which leads us to this page (you can click on the image to enlarge):
As for the HCW, we click on the "Configure" button (shown above). We are then prompted to download the application:
Note: if we perform this operation on the Exchange server itself, we may have to disable the Internet Explorer Enhanced Security Configuration, at least temporarily.
We respond to any prompts by allowing the installation and once all is complete, we should see the following page:
We first specify (or detect) our on-premises Exchange servers and then select an Office 365 host. In my case, it is "Office 365 Worldwide" (there is also a special option for China and Germany):
We then enter our credentials for both on-premises Exchange and the Office 365 tenant:
Connections and credentials are validated (successfully in this case):
Now the on-premises Exchange servers will be configured to function with Exchange Online. By default, email to and from external senders passes though Exchange Online first and messages for on-premise mailboxes are then routed to these mailboxes. This requires a modification of MX records. I have not changed these records and want mail to continue to transit via my on-premises Exchange servers for the moment. Therefore, I will check the "Enable centralized mail transport" option:
We select our Hub Transport servers (if they are not already selected automatically):
We enter our public IP address (if we are running the HCW for a second time, many of these fields will already be populated):
We indicate the organization FQDN:
And we are "ready to update" (Click "Next" or "Finish" as appropriate):
The HCW is working...
And the process completes:
One final comment: I encountered an error on the first try because a perimeter firewall rule was blocking port 443 inbound (sometimes open, sometimes closed, depending on the experiments I conduct in my text network). Once I adjusted the rule and ran the HCW again, the configuration completed as shown above.