Why Do Certificate Revocation Checking Mechanisms Never Work?
The certificate revocation system, like any other cybersecurity mechanism, is unnecessary as long as everything goes well, but becomes essential when things go wrong, namely when a certificate has been compromised. A compromised certificate makes it possible for a malicious third party to create a service that impersonates the legitimate one, or to insert itself as a man-in-the-middle and eavesdrop on any data transferred between the parties without the knowledge of the client or the server. The purpose of the revocation system is to give the client the chance to learn that a server certificate has been revoked because it was compromised. Nothing shows better that we are facing a serious problem than the fact that several thousand certificates are revoked every day.
Unfortunately, revocation checking mechanisms face several theoretical and practical difficulties. Although the mechanisms are built on the fact of revocation, that alone is far from sufficient: there must also be a revocation checking mechanism that works under real circumstances. Each stakeholder (client application developers, server administrators, middleware solution manufacturers, certificate authorities) has its own requirements, and none of the existing revocation check mechanisms can satisfy all of them. It must be stated that when a certificate gets compromised, then until it is revoked and the clients have learned of the revocation, it should be assumed that a malicious third party, who has no authority over the host(s) the certificate belongs to, can impersonate any kind of service (web, mail, remote access, etc.) under any domain name the certificate is valid for.
Revocation Check Mechanisms
First of all, let us take stock of the possibilities in chronological order and summarize how they work and what their main issues are.
- Certificate Revocation List (CRL): The oldest mechanism, which is nothing more than a collection of the revoked but not yet expired certificates, managed and published by certificate authorities. It is pointless to keep expired certificates on the revocation list, as clients treat expired certificates as invalid regardless of whether they have been revoked. As the Let’s Encrypt CAA checking issue showed, it can happen that millions of certificates must be revoked at the same time. In that case the CRL size grows dramatically, and all client applications (e.g. browsers, mail clients, …) that want to check the revocation state of a certificate must download the CRL file.
- Online Certificate Status Protocol (OCSP): The mechanism aims to solve the size issue that CRL suffers from by making it possible to ask a central server, the OCSP responder, about the revocation state. Any type of client can ask an OCSP responder, managed by the certificate authority, whether a certificate issued by that certificate authority is revoked or not. In terms of information theory, the answer is just a bit of information, whose size is not affected by the number of certificates actually revoked, but the requests make it possible to track the client.
- OCSP Stapling: OCSP Stapling is currently the best method for checking certificate validity. With this mechanism, the server proves to the client that the certificate it presents is not revoked. The server obtains a time-stamped (“stapled”) proof of the revocation state from the certificate authority that issued the server certificate, and caches it. When a client connects, the server offers the cached proof during the TLS handshake. This mechanism transfers the responsibility for serving the certificate revocation state from the certificate authority to the server maintainers. As the mechanism is optional, whether the revocation state is served depends on the capability and the configuration of the server.
- OCSP Multi-Stapling: The mechanism is almost the same as OCSP Stapling. The difference is that with OCSP Stapling only one staple can be sent, whereas OCSP Multi-Stapling makes it possible to send multiple staples during the TLS handshake, not only for the server certificate but also for the intermediate ones. Although this mechanism makes it possible to check the whole certificate chain, the most popular cryptographic library (OpenSSL) does not support it.
The evolution of certificate revocation check mechanisms is not a success story. There are several stakeholders, and their requirements for the certificate revocation system sometimes differ. Although security should be the primary factor, operability challenges inevitably undermine this principle. Revocation check mechanisms cannot fulfill all the requirements of security professionals, software vendors and users at once, which is what causes them to fail.
As can be seen, there is no silver bullet among certificate revocation check mechanisms. Newer and newer mechanisms have appeared, but they could not solve the issues; they only replaced pre-existing issues with brand new ones.
The first issue we have to discuss is how an automated system can know where revocation state information comes from. Without this information, none of the stakeholders, whether client software (e.g. web, mail, remote access clients) or middleware devices (e.g. firewalls), can acquire the revocation state. The problem is independent of the type of client software; it depends only on the type of revocation check mechanism.
- Certificate Revocation List (CRL): There is an X.509 extension in the issued certificate, called CRL distribution points (CDP), which may contain one or more URIs pointing to the CRL managed by the issuing certificate authority (CA). The problem is that the CDP extension is non-critical, so it is not necessarily present in a certificate. The vast majority of certificate authorities fill in the extension (it is present in 2207 out of 2233 intermediate CA certificates), but not all of them.
Middleware devices could use certificate revocation lists very effectively, as they have the necessary resources to prefetch the information about which certificates have been revoked and to update it regularly. A list of the distribution points of all certificate authorities is a prerequisite, but there is no official list, even though the PKI is a highly centralized system. Mozilla has an intermediate CA certificate collection, which can be used to obtain the CDP locations of the intermediate certificates, but it cannot be used for server certificates, and the latter would be more important, as the probability of compromise is much higher for a server certificate than for an intermediate one. It should also be considered that the number of server certificates is many times the number of intermediate certificates. Alternatively, CRL fetching could be done on the fly when a connection is initiated, but it may take a relatively long time, and the client must wait until the fetch has finished, which may cause significant latency for the client.
- Online Certificate Status Protocol (OCSP): The working method is almost the same as for the CRL. There is an X.509 extension in the issued certificate, called Authority Information Access, which may contain one or more URIs pointing to the OCSP responder of the CA. The problem is that the extension is non-critical, so it is not necessarily present in a certificate. Although the vast majority of certificate authorities fill in the extension (it is present in 1994 out of 2233 intermediate CA certificates), there is still a shortfall.
Middleware devices could use OCSP responders to prefetch, cache and update the revocation information of the most popular sites, but the prerequisite, namely the OCSP responder locations of all certificate authorities, is not satisfied for lack of an official list of responder URLs. Mozilla’s intermediate CA certificate collection can again be used to obtain the OCSP responder locations of the intermediate certificates, but it cannot be used for server certificates.
- OCSP Stapling: The question of the source does not arise, as the proof of the revocation status of the server certificate is sent by the server itself, so no extra information (URI) is needed to obtain it.
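The coverage figures quoted above are easier to compare as percentages. The following is only a worked calculation on the numbers given in the text; the helper name `coverage` is made up for illustration.

```python
# Counts quoted in the text for intermediate CA certificates.
TOTAL_INTERMEDIATES = 2233
WITH_CDP = 2207    # certificates carrying a CRL distribution points extension
WITH_OCSP = 1994   # certificates carrying an OCSP responder URI


def coverage(filled: int, total: int) -> float:
    """Return the share of certificates carrying an extension, in percent."""
    return 100.0 * filled / total


cdp_coverage = coverage(WITH_CDP, TOTAL_INTERMEDIATES)
ocsp_coverage = coverage(WITH_OCSP, TOTAL_INTERMEDIATES)

print(f"CRL distribution points: {cdp_coverage:.1f}% filled, "
      f"{TOTAL_INTERMEDIATES - WITH_CDP} certificates without")
print(f"OCSP responder:          {ocsp_coverage:.1f}% filled, "
      f"{TOTAL_INTERMEDIATES - WITH_OCSP} certificates without")
```

In other words, roughly one intermediate certificate in a hundred offers no CRL location, and roughly one in ten offers no OCSP responder.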
Data integrity is an important question for revocation mechanisms, as the client must be sure that the revocation information cannot be modified by a malicious third party. The revocation information travels over the network without encryption, but this in itself is not a problem.
- Certificate Revocation List (CRL): The protocol most commonly used to fetch the CRL is HTTP, even though HTTP cannot guarantee the integrity of either the CRL or the HTTP traffic as a whole. However, the CRL is digitally signed by the certificate authority that manages and publishes it and that issued the revoked certificate, so a malicious third party cannot forge it without the client noticing, assuming the client verifies the signature. The HTTP traffic itself is not protected and may be modified in a man-in-the-middle attack. This cannot compromise the CRL directly, but it can cause disruption indirectly by modifying HTTP-related parameters, forcing clients to fall back to a soft-fail mechanism. It must be noted that the hash algorithm of the CRL’s digital signature is as important an issue as it is in certificate signing.
- Online Certificate Status Protocol (OCSP): OCSP responders also serve their responses over HTTP, just like CRL distribution points. The responses are digitally signed by the certificate authority that manages the responder and issued the certificate whose revocation state is in question, so integrity is ensured in the same way as for CRL distribution points, and it can be disrupted in the same way.
- OCSP Stapling: The OCSP staple is served by the server itself, during the cleartext part of the TLS handshake. As the staple is digitally signed by the certificate authority that issued the server certificate, integrity is ensured in the same way as for CRL distribution points and OCSP responders. In addition, the TLS handshake closes with a verification step, in which each party calculates a checksum of the handshake messages it received from the other party and compares it with the checksum sent by the other party in encrypted form, so integrity is checked twice.
Client applications may have difficulties accessing revocation state information, especially when internet access is controlled by a firewall or another type of security system, as the revocation information comes from a different server than the one the client originally wanted to access.
- Certificate Revocation List (CRL): Revocation information comes from a different IP address (the CDP’s instead of the server’s), a different port (80 instead of 443) and over a different protocol (HTTP instead of HTTPS) than the original server, so being able to access the server does not guarantee that the client can also access the CDP.
- Online Certificate Status Protocol (OCSP): The problem is exactly the same as for CRL distribution points, as OCSP responders are also on a different IP address (the OCSP responder’s instead of the server’s), a different port (80 instead of 443) and use a different protocol, so OCSP could not solve the problem that occurs with CRL distribution points.
- OCSP Stapling: As the OCSP staple is served by the server itself during the TLS handshake, the problem does not apply: the client application has already accessed the server. Only OCSP stapling fully solves the problem of access.
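When a distribution point or responder cannot be reached, the client has to decide what to do with the missing answer. The sketch below contrasts a soft-fail and a hard-fail policy. It is a simplified model, not any real client's logic: `Status`, `check_revocation` and `blocked_by_firewall` are hypothetical names, and the fetch is simulated rather than performed over the network.

```python
from enum import Enum
from typing import Callable


class Status(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    UNKNOWN = "unknown"   # revocation information could not be obtained


def check_revocation(fetch_status: Callable[[], Status], hard_fail: bool) -> bool:
    """Return True if the connection may proceed.

    fetch_status is a hypothetical callable that would contact a CDP or an
    OCSP responder; it raises OSError when the endpoint cannot be reached.
    """
    try:
        status = fetch_status()
    except OSError:
        status = Status.UNKNOWN

    if status is Status.REVOKED:
        return False
    if status is Status.UNKNOWN:
        # Soft fail accepts the certificate, hard fail rejects it.
        return not hard_fail
    return True


def blocked_by_firewall() -> Status:
    """Simulate a firewall that filters port 80 towards the responder."""
    raise ConnectionRefusedError("port 80 to the responder is filtered")


print("soft fail:", check_revocation(blocked_by_firewall, hard_fail=False))
print("hard fail:", check_revocation(blocked_by_firewall, hard_fail=True))
```

With soft fail, an attacker or a restrictive firewall that blocks the revocation endpoint effectively switches the check off; with hard fail, the same blockage makes the legitimate service unreachable.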
As the Let’s Encrypt CAA checking issue showed, it can happen that millions of certificates must be revoked at the same time, so a revocation check mechanism must be able to handle that case.
- Certificate Revocation List (CRL): The CRL size can grow dramatically, and all client applications (e.g. browsers, mail clients, …) that want to check the revocation state of a certificate must download the CRL file.
For client software, the revocation list is an unsuitable mechanism, as its size may grow suddenly. Even small revocation lists would cause significant memory usage on mobile devices, because the revocation lists of a large number of intermediate certificates would have to be kept in memory. Regular updates of the revocation lists would generate considerable traffic, which may have financial implications on mobile devices.
Major browsers do not use this mechanism at all. Chrome and Firefox have their own mechanisms, called CRLSets and OneCRL respectively, to solve the problem. Both are CRL collections, maintained by Google and Mozilla, which are updated regularly in the background without the user needing to restart or update the browser. That is fair enough as far as Chrome/Chromium and Firefox are concerned, but several other browsers and other types of clients (e.g. mail clients) do not use CRLs at all.
For middleware devices, revocation lists could be the optimal solution: they could prefetch, regularly update and cache the revocation lists, and as the vast majority of certificate authorities support revocation lists in both intermediate and root certificates, revocation lists could be used to check a full certificate chain. However, there are huge deficiencies. The first is that the most popular certificate authority does not support CRL for server certificates, so almost 200 million actively used certificates cannot be checked that way. The second is that prefetching cannot be done, as there is no official list of all CRL distribution points.
For certificate authorities, it is a comfortable mechanism, as it is not used by a large number of client applications (e.g. browsers). This means there is no huge load on CRL distribution points coming from distributed sources, a kind of load that would make it difficult to defend against a DDoS attack.
- Online Certificate Status Protocol (OCSP): The mechanism solves the size issue that CRL suffers from by making it possible to ask a central server, the OCSP responder, about the revocation state. Any type of client can ask an OCSP responder, managed by the certificate authority, whether a certificate issued by that certificate authority is revoked or not. In terms of information theory, the answer is just a bit of information, whose size is not affected by the number of certificates actually revoked.
For client software, OCSP completely solves the size problem, as a response is small enough to download even for every new connection.
For middleware devices, the size problem has never been significant, as in the typical environment where they are used the necessary bandwidth can be taken for granted. Nevertheless, OCSP is an imperfect mechanism for them, as OCSP responses cannot be prefetched and used as a database like a certificate revocation list. They can, however, be fetched on demand and cached for later use for a moderate amount of time.
For certificate authorities, it is a comfortable mechanism as in practice there is no client which uses that mechanism to check the certificate revocation, so it causes minimal load on the OCSP responder.
- OCSP Stapling: As an OCSP staple has almost the same size as an OCSP response, the size problem is solved for both client software and middleware devices, especially as the OCSP staple is served by the server itself, so it is just some extra bytes during the TLS handshake.
For certificate authorities, OCSP stapling is a demanding mechanism. If a certificate authority had 200 million active certificates, which is the actual case for Let’s Encrypt, only a third of the servers supported OCSP stapling, and the validity period of a staple were 5 minutes, the result would be about 200,000 requests per second on the OCSP responder, which clearly causes difficulties.
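The load estimate above can be reproduced as a back-of-the-envelope calculation. The input figures come from the text. Two things are simplifying assumptions added here: that each stapling server holds one certificate, and that it refreshes its staple exactly once per validity period.

```python
ACTIVE_CERTIFICATES = 200_000_000   # active certificates, as quoted in the text
STAPLING_SHARE = 1 / 3              # share of servers that support OCSP stapling
STAPLE_VALIDITY_SECONDS = 5 * 60    # validity period of a staple: 5 minutes

# Assumption: one certificate per server, one refresh per validity period.
stapled_certificates = ACTIVE_CERTIFICATES * STAPLING_SHARE
requests_per_second = stapled_certificates / STAPLE_VALIDITY_SECONDS

print(f"{requests_per_second:,.0f} requests per second")
```

Under these assumptions the result is a little over 220,000 requests per second, the same order of magnitude as the figure quoted in the text.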
We can think of certificate revocation states as a database in which the very fact of a query is sensitive data: if you query the certificate of a certain website, it is quite likely that you are visiting that site right now, so you can be tracked that way. As the overwhelming majority of web traffic is encrypted nowadays, the overwhelming majority of the traffic can be tracked that way.
- Certificate Revocation List (CRL): A certificate revocation list can be considered a database in itself, so the problem does not apply: the whole database can be downloaded and queried locally, not on the certificate authority’s side.
- Online Certificate Status Protocol (OCSP): An OCSP response contains verified information about the revocation state of a certificate which has been issued by a CA managing the OCSP responder, so the OCSP request which has been sent by the client must contain information which identifies the certificate in question. A certificate is issued for a domain that the client wants to visit, so by an OCSP request the CA learns what domain the client visited.
For client software, this is a considerable risk, as the mechanism makes the user trackable by a third party whose responsibility would be to increase the level of privacy, not to decrease it.
For a middleware device, it can be a considerable risk, as the organisation using the device can be tracked, although a specific member of the organisation cannot.
- OCSP Stapling: OCSP staple is served by the server itself, so the problem is not applicable, as the client application has already accessed the server, so it can be tracked by the server anyway.
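To see why an OCSP request reveals the visited site, it helps to look at what identifies the certificate in the request: a hash of the issuer's name, a hash of the issuer's public key, and the certificate's serial number. The sketch below models only that identifier, not the ASN.1 encoding or the network exchange. The `ISSUED` table, the byte strings and the function names are hypothetical, and SHA-256 is used purely for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CertId:
    """The fields that identify a certificate in an OCSP request."""
    issuer_name_hash: bytes
    issuer_key_hash: bytes
    serial_number: int


def make_cert_id(issuer_name: bytes, issuer_public_key: bytes,
                 serial_number: int, algorithm: str = "sha256") -> CertId:
    """Build the identifier a client would send for one certificate."""
    return CertId(
        issuer_name_hash=hashlib.new(algorithm, issuer_name).digest(),
        issuer_key_hash=hashlib.new(algorithm, issuer_public_key).digest(),
        serial_number=serial_number,
    )


# Hypothetical CA-side record: serial number -> domain the certificate was issued for.
ISSUED = {0x1001: "example.com", 0x1002: "example.org"}


def domain_seen_by_responder(cert_id: CertId) -> Optional[str]:
    """The CA knows its own serial numbers, so it can map a request to a domain."""
    return ISSUED.get(cert_id.serial_number)


request = make_cert_id(b"CN=Hypothetical CA", b"hypothetical-public-key", 0x1001)
print(domain_seen_by_responder(request))
```

The hashes hide nothing from the responder: the serial number alone is enough for the CA to look up which domain the certificate belongs to.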
Certificate revocation checking should not be limited to the server certificate itself. Revocation checking of the issuer certificates is also essential: if an intermediate certificate is compromised, the malicious third party that compromised it can issue a server certificate for any domain. Revocation checking of a root CA certificate is like a serpent eating its own tail: root certificates are self-signed, so they verify themselves.
- Certificate Revocation List (CRL): As the vast majority of the certificates, both intermediate and leaf ones, contain CRL distribution points their revocation state can be checked by CRLs, so the full certificate chain can be checked in practice.
- Online Certificate Status Protocol (OCSP): As more than ten percent of the intermediate CA certificates do not contain OCSP responders their revocation state cannot be checked by the OCSP itself. The revocation check must be combined with CRL checking if we want to be sure.
- OCSP Stapling: As discussed earlier, this mechanism depends on the certificate authority’s OCSP support, so the customers of almost ten percent of the certificate authorities are not able to offer OCSP stapling to their clients. Another major problem is that, prior to TLS 1.3, only one revocation state can be sent by the server, so in contrast to CRL and OCSP the full certificate chain cannot be validated, only the server certificate itself.
For client software, OCSP stapling might even be suitable, as intermediate certificates can be checked by mechanisms like CRLSet or OneCRL.
For middleware devices, OCSP stapling could be a perfect mechanism as server certificates could be checked by OCSP stapling, intermediate certificates could be checked by CRL, but we must not forget about the relatively low level of server support, 34–36 percent among the top 150 thousand domains, in that case.
The situation is exacerbated by the fact that more than one percent of the intermediate CA certificates contain neither CRL distribution points nor an OCSP responder, which makes a full chain revocation check impossible for these certificate authorities.
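Combining the mechanisms for a full chain check can be sketched as a per-certificate fallback: use OCSP where a responder is advertised, fall back to the CRL otherwise, and report certificates that advertise neither. This is a simplified model. The `Cert` class, the preference order and the example URLs are assumptions made for illustration, and no real certificate parsing is involved.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cert:
    """Only the revocation-related extensions of a certificate (hypothetical model)."""
    name: str
    crl_urls: List[str] = field(default_factory=list)
    ocsp_urls: List[str] = field(default_factory=list)


def pick_mechanism(cert: Cert) -> Optional[str]:
    """Prefer OCSP, fall back to CRL, return None if neither is advertised."""
    if cert.ocsp_urls:
        return "ocsp"
    if cert.crl_urls:
        return "crl"
    return None


def uncheckable(chain: List[Cert]) -> List[str]:
    """Names of certificates whose revocation state cannot be checked at all."""
    return [cert.name for cert in chain if pick_mechanism(cert) is None]


# The root is left out: it is self-signed, as discussed in the text.
chain = [
    Cert("leaf", ocsp_urls=["http://ocsp.example.test"]),
    Cert("intermediate A", crl_urls=["http://crl.example.test/a.crl"]),
    Cert("intermediate B"),
]

for cert in chain:
    print(cert.name, "->", pick_mechanism(cert))
print("cannot be checked:", uncheckable(chain))
```

A single certificate like "intermediate B" in the example is enough to leave a gap in the chain that no combination of the existing mechanisms can close.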
Certificate status information — independently from the revocation check mechanism — must be verified by the certificate authority which issued the certificate in question. Without this the status information would be unauthoritative and could be modified by a malicious third party without the knowledge of the peer which tries to verify the status information.
- Certificate Revocation List (CRL): Revocation lists can be signed by the certificate authority that issues the server certificate either directly or indirectly. When the revocation list is signed by the issuer of the server certificate, it can be verified easily, as the peer must have all the issuer certificates anyway to verify the server certificate itself. There is an additional step if the revocation list is signed indirectly, as the revocation state of the intermediate certificate that signed the revocation list also has to be checked, which means that we have to acquire a revocation list for this intermediate certificate as well.
This revocation check mechanism depends heavily on digital signatures, as the integrity of the revocation list is guaranteed by a signature. According to Qualys SSL Pulse, weak signature algorithms (e.g. MD5, SHA1) have completely disappeared as certificate signature algorithms, but they are still used as signature algorithms of revocation lists, which potentially makes it possible to compromise the revocation information.
A revocation list has a lifecycle, controlled by two fields of the list. The this update field shows the date when the revocation list was issued, and the next revocation list will be issued no later than the value of the next update field, regardless of whether the content of the list has changed. Between these two dates a revocation list is considered valid (apart from delta CRLs); however, updating the revocation list before a validity check is a must.
- Online Certificate Status Protocol (OCSP): OCSP responses, just like revocation lists, can be signed by the certificate authority that issues the server certificate either directly or indirectly, so the problem is the same as for revocation lists. There could be an intermediate certificate whose revocation state should also be checked, so an extra OCSP response or revocation list has to be acquired.
OCSP responses also depend on digital signatures, and weak signature algorithms are still used by OCSP responders, in spite of the fact that OCSP is a much more modern revocation check mechanism than the revocation list. OCSP requests and responses contain the hash of the issuer’s distinguished name and the hash of the issuer’s public key to identify the issuer, so this is another place where weak hash algorithms may be used.
OCSP responses contain this update and next update fields which, as in revocation lists, define a recommended validity interval. This interval corresponds to the this update and next update interval in revocation lists. OCSP responses whose next update value is earlier than the current time, or whose this update value is later than the current time, should be considered unreliable.
- OCSP Stapling: The situation is almost the same as for OCSP, but there is an extra issue to take into account. Although the OCSP staple is served by the server, the validation cannot be fully performed without checking the revocation state of the staple issuer, which requires acquiring an OCSP response or a revocation list from another port of another server, with notable latency.
- OCSP Multi Stapling: OCSP multi stapling cannot solve all the issues above. Although staples could be sent for the server certificate and all the intermediate certificates, the number of OCSP staples must not be higher than the number of certificates in the chain, so delegated OCSP issuer certificates are not covered. TLS 1.3 deprecates OCSP multi stapling, but allows an OCSP staple to be attached to each certificate in the chain. That seems correct at first sight, but it suffers from the same problem as OCSP multi stapling. If delegated certificates are used to issue the OCSP responses, their revocation status cannot be sent to the client as part of the TLS handshake, so a comprehensive revocation check needs either extra OCSP requests, which cause extra latency, or prefetched revocation lists, which leads OCSP stapling back to the plain old revocation list.
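The this update / next update rule described above for revocation lists and OCSP responses amounts to a time window check. The following is a minimal sketch of that rule alone; the function name `is_current` and the example timestamps are made up, and signature verification is deliberately left out.

```python
from datetime import datetime, timedelta, timezone


def is_current(this_update: datetime, next_update: datetime, now: datetime) -> bool:
    """A CRL or OCSP response is usable only if now lies inside its validity window."""
    return this_update <= now <= next_update


now = datetime(2020, 9, 8, 12, 0, tzinfo=timezone.utc)

# Issued an hour ago, valid for a week: usable.
fresh = is_current(now - timedelta(hours=1), now + timedelta(days=7), now)

# next update already passed: a newer list or response should exist.
stale = is_current(now - timedelta(days=10), now - timedelta(days=3), now)

# this update lies in the future: clock skew or manipulation.
not_yet_valid = is_current(now + timedelta(minutes=5), now + timedelta(days=7), now)

print(fresh, stale, not_yet_valid)
```

The check is only as good as the client's clock, and it says nothing about whether a newer list or response has already been published inside the window.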
- Certificate Revocation List (CRL): As CRL is the oldest revocation check mechanism (the original RFC (2459) was published more than twenty-one years ago and the latest major update was released seven years ago), cryptographic libraries support it very well.
- Online Certificate Status Protocol (OCSP): As OCSP is not a fresh story, the related RFC (6960) was released eight years ago, cryptographic library support is strong.
- OCSP Stapling: The mechanism cannot be considered new, as there is a proposed standard from eight years ago, and the technical background is the same as OCSP, so both cryptographic libraries and major web servers have been supporting the mechanism for several years, but it must be noted that early TLS versions do not support OCSP Staple extension.
For server software, OCSP stapling is a demanding mechanism, because the server has to take over part of the burden from the OCSP responder, as it has to be the reliable source of the staple.
For server maintainers, OCSP stapling is a risky mechanism. For security reasons, OCSP stapling should go together with the OCSP Must-Staple extension in the certificate, which tells the client that if a certificate with that extension is served, an OCSP staple must also be served. But if there is an outage on the OCSP responder side or any fetching problem on the server side, this causes a validation failure on the client side. Browsers, however, do soft-fail revocation checking, meaning that if the requested staple is not served, or not served fast enough, they ignore it. Whether for that reason or not, only a third of the top 150,000 sites provide OCSP stapling, and the dynamic growth of recent years (3–5 percentage points per year) seems to have come to a halt.
- OCSP Multi Stapling: Although the OCSP multi stapling RFC (6961) was published right after the OCSP RFC, the most popular cryptographic library, OpenSSL, supports it neither in its latest stable release (1.1.1g) nor in its development release (3.0 alpha3), even though one of its competitors, wolfSSL, has had OCSP multi stapling support since 2017.
Nowadays the focus is on the major browsers, but the problem is the same when we use a minor text-based web browser or a popular command line tool in an automated environment. Moreover, although the majority of internet traffic is HTTPS, we should not forget the several other protocols (FTP, IMAP, LDAP, LMTP, MySQL, NNTP, OpenVPN, POP3, PostgreSQL, RDP, Sieve, SIP, SMTP, XMPP, …) that use TLS, and some others (IPSec/IKE, WireGuard, QUIC, …) that do not, but may also be affected by the problems of certificate revocation.
To make a long story short, the revocation check system has some theoretical and several practical problems, which make the situation difficult for all the stakeholders (client, server, middleware and cryptographic library developers). No revocation check mechanism can fulfill the requirements alone, and combining them also has serious difficulties. It is in vain that TLS version 1.3 solves the theoretical problems, by making it possible to serve an OCSP staple for each certificate in the chain, if it increases the load on the shoulders of client and server software developers and decentralises the problem by involving a new player in the game, namely the system administrator, who is responsible for making this best mechanism available to the customers. The newer and newer revocation check mechanisms (OCSP, OCSP stapling and OCSP multi stapling) are in vain if all of them may lead back to the plain old CRL. The high penetration of CRL is in vain if there are certificate authorities that do not support this mechanism. The fact that PKI is highly centralized is in vain if there is no official collection of revocation check related resources. There are too many tripping hazards to do it flawlessly.
Originally published at https://pfeifferszilard.hu on September 8, 2020.