The Current State of SMTP STARTTLS Deployment
A lot of sensitive data is sent over email, so we encrypt emails in transit via STARTTLS when available. STARTTLS has been around for 15 years, but we’d heard that it wasn’t widely deployed. To test that perception, we decided to see how many of the notification emails we send are successfully encrypted.
We found that 76% of unique MX hostnames that receive our emails support STARTTLS. As a result, 58% of notification emails are successfully encrypted. Additionally, certificate validation passes for about half of the encrypted email, and the other half is opportunistically encrypted. 74% of hosts that support STARTTLS also provide Perfect Forward Secrecy.
It’s clear to us that STARTTLS has achieved critical mass and there is immediate value in deploying it. We encourage anyone who has not already deployed STARTTLS to at least deploy it for opportunistic encryption. As more systems support email encryption, the value increases for everyone.
Methodology
Facebook sends several billion emails to several million domains every day. These are mostly notification emails about various activities on Facebook as well as account-related emails such as registration confirmations and password resets. We used a single day’s worth of our notification email logs from our production system for this report, since our goal here is to show a snapshot of current deployments rather than configuration changes over time. These logs contain the kind of data you would expect to find in any email server logs, such as the sender and recipient, where the email came from, and where we are sending it. For the purposes of this report we only concern ourselves with the STARTTLS results, the recipient’s domain, the MX hostname we connected to, and the receiving email server’s IP address.
The majority of email addresses we send to are assumed to be for personal use. Given the large number of addresses and domains we send to, we feel that our data provides a good representative sample of personal and general purpose mailbox providers. Government and corporate email systems are likely underrepresented in this report.
Our system attempts to negotiate TLS encryption with every SMTP server it connects to that advertises the STARTTLS capability. If the negotiation is successful, we encrypt the email and send it on. If we can’t successfully negotiate, then we send the email unencrypted. We log the results in either case, including the negotiated cipher suite and attributes of the certificate presented by the server when we are successful. We then load the logs into Hadoop for further analysis. It’s also worth noting that the performance impact of enabling TLS for outbound connections was negligible.
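The delivery logic described above can be sketched with Python's standard smtplib module. This is a minimal illustration, not our production code: the `deliver` function and the `connect` factory are hypothetical names, and the sketch encrypts without validating the certificate, leaving validation and cipher suite logging out entirely.

```python
import smtplib
import ssl


def deliver(connect, sender, recipient, message):
    """Send one message, encrypting whenever the server allows it.

    `connect` is a hypothetical zero-argument factory that returns an
    smtplib.SMTP-like object already connected to the MX host.
    Returns "encrypted", "failure", or "none" so the caller can log it.
    """
    # Opportunistic context: encrypt even if the certificate would not validate.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    smtp = connect()
    smtp.ehlo()
    if not smtp.has_extn("starttls"):
        result = "none"  # STARTTLS not advertised: send in cleartext
    else:
        try:
            smtp.starttls(context=context)
            smtp.ehlo()  # capabilities must be re-read after the TLS upgrade
            result = "encrypted"
        except (ssl.SSLError, smtplib.SMTPException):
            # Assumption: a failed handshake leaves the session unusable,
            # so reconnect and fall back to cleartext delivery.
            smtp.close()
            smtp = connect()
            smtp.ehlo()
            result = "failure"
    smtp.sendmail(sender, recipient, message)
    smtp.quit()
    return result
```

With a real server, `connect` could be something like `lambda: smtplib.SMTP("mx.example.com", 25)`, where the hostname is a placeholder.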
Data and Observations
The following graphs show the log data aggregated in various ways. For graphs that show STARTTLS results, we show the relative percentages of ‘Strict’, ‘Opportunistic’, ‘Failure’, and ‘None’. These categories are defined as follows:
Strict: A TLS cipher suite was successfully negotiated and the presented certificate passed strict validation. Strict validation means that the certificate was not expired, was signed by a trusted certificate authority, and matched the hostname we connected to. We allow wildcarded certificates.
Opportunistic: A TLS cipher suite was successfully negotiated but the presented certificate did not pass strict validation for one or more reasons.
Failure: The SMTP server advertised STARTTLS, but we could not successfully negotiate a cipher suite. This could be due to a lack of acceptable cipher suites or other configuration issues. As a result, the email was sent unencrypted.
None: The SMTP server did not advertise STARTTLS. The email was sent unencrypted.
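As a rough sketch, the four categories above reduce to three yes/no questions per delivery. The function names and the tuple layout of a log record are assumptions made for illustration, not the format of our logs.

```python
from collections import Counter

LABELS = ("Strict", "Opportunistic", "Failure", "None")


def classify(advertised, negotiated, cert_valid):
    """Map one delivery attempt to a STARTTLS result category."""
    if not advertised:
        return "None"  # server never offered STARTTLS
    if not negotiated:
        return "Failure"  # offered, but no cipher suite could be agreed
    return "Strict" if cert_valid else "Opportunistic"


def result_shares(records):
    """Return the fraction of records in each category.

    `records` is an iterable of (advertised, negotiated, cert_valid)
    tuples, a hypothetical stand-in for parsed log rows.
    """
    counts = Counter(classify(*record) for record in records)
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: counts[label] / total for label in LABELS}
```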
Figure 1 – Overall STARTTLS Results
Figure 1 shows the overall results of STARTTLS behavior. From the ‘All Email’ bar on the left we can see that nearly 60% of all emails are sent via an encrypted connection, but only about 30% pass strict validation. 60% is an encouragingly high percentage, but this number is potentially skewed since the bulk of email volume is sent to a small number of large mailbox providers. We need to aggregate the data in a few different ways in order to compensate for this and get a clearer picture of STARTTLS behavior across all email systems. The other three bars in Figure 1 are based on unique counts of the following identifiers:
Domain: The domain portion of the recipient email address.
MX Hostname: The hostname returned by querying the MX record of the domain.
IP Address: The IP address of the receiving SMTP server.
The relationships between these three identifiers vary as inbound email infrastructure is deployed and configured as needed, and operators use different techniques to manage their infrastructure at different scales. For example, 25.76% of unique recipient address domains pass strict validation, while 7.97% of unique MX hostnames pass strict validation and only 6.63% of unique server IP addresses pass strict validation. This is because a single MX hostname can handle traffic for many domains and can have multiple unique IP addresses behind it, a single domain can have multiple MX hostnames, etc.
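To illustrate why the percentages differ by identifier, here is a small sketch that aggregates the same log rows by email volume and by unique identifier. The rows are invented for the example and are not our data, and the sketch assumes each identifier has a single result (the last row seen wins).

```python
from collections import Counter

# Hypothetical log rows: (domain, mx_hostname, ip, result). Not real data.
LOG = [
    ("big.example", "mx.big.example", "192.0.2.1", "Strict"),
    ("big.example", "mx.big.example", "192.0.2.1", "Strict"),
    ("big.example", "mx.big.example", "192.0.2.2", "Strict"),
    ("shop.example", "mx.host.example", "198.51.100.7", "Opportunistic"),
    ("blog.example", "mx.host.example", "198.51.100.7", "Opportunistic"),
    ("tiny.example", "mail.tiny.example", "203.0.113.9", "None"),
]

FIELDS = {"domain": 0, "mx_hostname": 1, "ip": 2}


def share_by_email(rows):
    """Fraction of emails in each result category (the 'All Email' view)."""
    counts = Counter(row[3] for row in rows)
    total = sum(counts.values())
    return {result: n / total for result, n in counts.items()}


def share_by_unique(rows, field):
    """Fraction of unique identifiers in each result category."""
    index = FIELDS[field]
    # Assumption: one result per identifier; the last row seen wins.
    latest = {row[index]: row[3] for row in rows}
    counts = Counter(latest.values())
    return {result: n / len(latest) for result, n in counts.items()}
```

In this toy data, one large provider accounts for half of the emails but only one of four domains, so the strict share falls from 0.5 by email volume to 0.25 by unique domain.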
The ‘Domain’, ‘MX Hostname’, and ‘IP Address’ bars show a higher percentage of encrypted traffic but a lower percentage of strict validations than the ‘All Email’ bar. These results show that STARTTLS support is widely deployed, but that there are also widespread issues with certificates. Also of note, in all cases the number of failures is very small.
Figure 2 – Overall reasons for strict validation failure
Figure 2 shows the top reasons why strict validation fails as a percentage of opportunistically encrypted traffic. Some reasons or combinations of reasons are not listed, such as ‘Expired and Mismatched’. Those have been omitted because they account for less than 1% for each identifier. The failure reasons are as follows:
Self Signed: The presented certificate was signed by the domain itself instead of a certificate authority.
Untrusted CA: The presented certificate was signed by a certificate authority that we consider untrustworthy.
Mismatched: The presented certificate does not match the hostname exactly or via wildcard.
Expired: The presented certificate has passed its expiration date.
Mismatched certificates are the single largest reason why strict certificate validation fails across all identifiers. 99.35% of all opportunistically encrypted emails fail validation simply because the certificate does not match the hostname; the certificates are otherwise acceptable. The next three largest categories include mismatched certificates as part of the reason, but have additional issues.
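The failure reasons above can be sketched as simple checks on certificate fields. This is a simplified illustration: the `cert` dictionary layout and both function names are hypothetical, and a real validator verifies the signature chain cryptographically instead of comparing issuer strings.

```python
from datetime import datetime, timezone


def hostname_matches(pattern, hostname):
    """Exact match, or a single left-most wildcard label (*.example.com)."""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if pattern == hostname:
        return True
    if pattern.startswith("*."):
        # The wildcard stands for exactly one label, so compare what
        # follows the first label of the hostname.
        _, _, parent = hostname.partition(".")
        return parent != "" and parent == pattern[2:]
    return False


def failure_reasons(cert, hostname, trusted_issuers, now):
    """Return the reasons a certificate would fail strict validation.

    `cert` is a hypothetical dict with "subject", "issuer", "names",
    and "not_after" keys. An empty list means strict validation passes.
    """
    reasons = []
    if cert["issuer"] == cert["subject"]:
        reasons.append("Self Signed")
    elif cert["issuer"] not in trusted_issuers:
        reasons.append("Untrusted CA")
    if not any(hostname_matches(name, hostname) for name in cert["names"]):
        reasons.append("Mismatched")
    if cert["not_after"] < now:
        reasons.append("Expired")
    return reasons
```

A certificate issued for mail.example.net by a trusted authority and presented by a server reached as mx1.example.com would yield only "Mismatched", which is the dominant case described above.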
Figure 3 – Successfully negotiated cipher suites
The strength of supported cipher suites is a common concern, as weak or vulnerable ciphers can be easily defeated. Figure 3 shows the successfully negotiated cipher suites broken down by identifier. The majority of encrypted email is sent with the ECDHE-RSA-RC4-SHA or DHE-RSA-AES256-SHA cipher suite. This is likely due to those being the preferred cipher suites of the major providers. DHE-RSA-AES128-SHA, however, is the preferred cipher suite for the largest percentage of deployments. AES128-SHA is the next most prevalent, which is concerning because it does not provide Perfect Forward Secrecy.
Figure 4 – Perfect Forward Secrecy support in negotiated cipher suites
Although the second most prevalent cipher suite does not provide Perfect Forward Secrecy, the majority of preferred cipher suites do, as shown in Figure 4.
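For readers who want to run the same check on their own logs, a minimal sketch follows. It assumes OpenSSL-style cipher suite names like the ones quoted above, where an ephemeral Diffie-Hellman key exchange shows up as an ECDHE-, DHE-, or EDH- prefix. The function names are hypothetical, and TLS 1.3 suite names use a different scheme that this sketch does not handle.

```python
PFS_PREFIXES = ("ECDHE-", "DHE-", "EDH-")


def provides_pfs(cipher_suite):
    """True if an OpenSSL-style suite name uses an ephemeral key exchange."""
    return cipher_suite.upper().startswith(PFS_PREFIXES)


def pfs_share(negotiated_suites):
    """Fraction of negotiated suites that provide Perfect Forward Secrecy."""
    suites = list(negotiated_suites)
    if not suites:
        return 0.0
    return sum(provides_pfs(name) for name in suites) / len(suites)
```

Of the four suites named in this report, `provides_pfs` returns False only for AES128-SHA.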
Conclusion
STARTTLS encryption is widely supported and has achieved critical mass despite some issues with certificate management. A system deploying STARTTLS support for the first time can expect more than half of its outbound email to be encrypted. Also, the majority of deployments provide Perfect Forward Secrecy. We see two high priority areas for improvement. First, we encourage the industry to work together to develop better tools for preventing mismatched certificates. Second, we encourage everyone to deploy support for opportunistic encryption via STARTTLS.
About The Author
Michael Adkins is a Mail Integrity Engineer at Facebook.