§ Trackr.Live

Public Key Infrastructure (PKI)

Public Key Infrastructure is the set of policies, procedures, software, and trusted parties that lets a public-key cryptosystem work at scale. The underlying primitives — RSA, ECDSA, Ed25519 — solve the cryptographic problem of authenticating that a specific private key was used to sign a specific message. They do not, by themselves, solve the question of whose key it is. PKI is the layer that answers that question.

The discipline is more politics than mathematics. The mathematics of public-key cryptography is rigorous and well-understood. The PKI on top of it is a thirty-year accumulation of standards, governance bodies, root programs, certificate authorities, revocation mechanisms, transparency logs, and the operational scar tissue from roughly a dozen significant historical failures. Most working systems use PKI without their operators ever needing to think about how it works, which is the goal — but when it fails, the failures tend to be both spectacular and slow to recover from.

This page is the deep-dive companion to the Cryptography umbrella overview. The scope here is the trust infrastructure itself: certificates, certificate authorities, the trust chain, revocation, Certificate Transparency, and the recurring failure patterns. Key management — which is the harder discipline that turns these certificates into running systems — has its own page.

The binding problem

The core problem PKI solves is binding a public key to an identity in a way that a third party can verify. Without that binding, a public key is just a string of bits — useful for cryptographic operations, but with no inherent connection to any particular person, server, organization, or device.

Several trust models have been proposed and deployed:

Hierarchical trust, the dominant model on the modern web, uses a small set of designated certificate authorities (CAs) that act as trusted intermediaries. The CA verifies the identity of a key holder (to varying degrees), then issues a certificate that binds the public key to the verified identity. Anyone who trusts the CA can verify any certificate the CA has issued. The trust set is small (a few hundred CAs in the public web ecosystem) and the verification mechanism is universal (every browser, every TLS client).

Web of trust, the model used by PGP and GnuPG, allows any user to sign any other user’s key. Trust accumulates from peer-to-peer signatures rather than from a central authority. Conceptually attractive, operationally a failure outside niche communities. The model places the trust-evaluation burden entirely on the verifier, which most users cannot meaningfully perform.

Trust on first use (TOFU), used by SSH and many peer-to-peer protocols, accepts the public key the first time a connection is made and stores it for subsequent verification. The first connection is unverified; every subsequent connection is verified against the stored key. Simple, robust, and depends on the assumption that the first connection is not under attack. The model works well for cases where parties expect to communicate repeatedly with the same endpoints.

Certificate pinning, used by some mobile applications and by some high-security web services, embeds expected public keys directly in the client and refuses connections to any other key. Eliminates the CA as a trust dependency at the cost of making key rotation operationally painful.

In practice, the public web runs on hierarchical PKI, and almost every other context layers something on top — TOFU for SSH, pinning for some mobile apps, private PKI for corporate networks, and so on. The rest of this page is primarily about hierarchical PKI because that is where the operational complexity and historical failure modes live.
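
The TOFU model described above can be sketched in a few lines. This is a minimal illustration rather than how SSH actually stores host keys: the `TofuStore` class, the `fingerprint` helper, and the host name are hypothetical, and the key bytes are placeholders. Pre-populating the store before any connection is made turns the same comparison into a simple form of pinning.

```python
import hashlib


def fingerprint(public_key_bytes: bytes) -> str:
    """Return a hex SHA-256 fingerprint of the raw public key bytes."""
    return hashlib.sha256(public_key_bytes).hexdigest()


class TofuStore:
    """Remembers the first key seen for each host and compares later keys to it."""

    def __init__(self):
        self._known = {}  # host name -> fingerprint recorded on first use

    def check(self, host: str, public_key_bytes: bytes) -> str:
        seen = fingerprint(public_key_bytes)
        stored = self._known.get(host)
        if stored is None:
            # First contact: nothing to compare against, so accept and remember.
            self._known[host] = seen
            return "first-use"
        if stored == seen:
            return "match"
        # The key differs from the one recorded earlier: a rotation or an attack.
        return "mismatch"


store = TofuStore()
print(store.check("host.example", b"placeholder-key-A"))  # first-use
print(store.check("host.example", b"placeholder-key-A"))  # match
print(store.check("host.example", b"placeholder-key-B"))  # mismatch
```

The sketch makes the weakness visible: the "first-use" branch accepts whatever key it is shown.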

X.509 — the certificate format

The standard certificate format is X.509, originally specified in 1988 as part of the ITU-T X.500 directory standard and refined in a long series of RFCs (RFC 5280 is the current profile for internet PKI). An X.509 certificate is a structured binary document that contains a public key, identifying information about the key holder, identifying information about the CA that issued the certificate, validity period information, and a signature from the issuer.

The key fields in an X.509 v3 certificate:

Serial number — a unique identifier within the issuing CA.
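Signature algorithm — which algorithm (and which hash) was used to sign the certificate. Typically SHA-256 with RSA or ECDSA with P-256 in publicly trusted certificates; Ed25519 appears mainly in private PKI, because the rules for public CAs do not currently permit it.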
Issuer — the distinguished name (DN) of the CA that issued this certificate.
Validity period — notBefore and notAfter dates bounding when the certificate is considered valid.
Subject — the DN of the entity the certificate identifies. For web server certificates, the DN’s common name (CN) traditionally held the hostname.
Subject public key info — the algorithm identifier and the actual public key bytes.
Extensions — a structured list of additional attributes that constrain or extend the certificate’s use.

The extensions are where the operational meaning lives. The ones that matter most:

Subject Alternative Name (SAN) — the list of hostnames (or other identifiers) the certificate is valid for. Modern browsers ignore the CN entirely and rely on the SAN extension. A certificate issued for example.com with a SAN list including example.com and www.example.com is valid for both names.
Basic Constraints — whether this is a CA certificate (allowed to issue further certificates) or an end-entity certificate (not allowed to issue). The CA flag is what distinguishes intermediate CAs from leaf certificates.
Key Usage and Extended Key Usage — what cryptographic operations the key is allowed to perform (digital signature, key encipherment, certificate signing, OCSP signing, server authentication, client authentication, code signing). These flags are enforced by clients and limit the blast radius of a compromised key.
Authority Information Access (AIA) — URLs pointing to the issuer’s certificate and to the OCSP responder, used for chain building and revocation checking.
CRL Distribution Points — URLs to the certificate revocation lists.
Signed Certificate Timestamp (SCT) list — embedded signed promises from Certificate Transparency logs that the certificate has been submitted and will be included in the log.
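
As a rough illustration of the SAN rule, the sketch below checks a hostname against a list of SAN DNS names and deliberately never consults the subject CN. The function name and its parameters are hypothetical, and the matching is simplified to exact, case-insensitive comparison; real clients also handle wildcards and internationalized names.

```python
def is_valid_for(hostname, san_dns_names, subject_cn=None):
    """Return True if hostname appears in the certificate's SAN DNS names.

    subject_cn is accepted only to show that it plays no part in the decision.
    """
    host = hostname.lower().rstrip(".")
    for entry in san_dns_names:
        if entry.lower().rstrip(".") == host:
            return True
    return False


san = ["example.com", "www.example.com"]
print(is_valid_for("www.example.com", san))   # True
print(is_valid_for("mail.example.com", san))  # False
# A matching CN does not help when the SAN list is empty.
print(is_valid_for("example.com", [], subject_cn="example.com"))  # False
```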

X.509 has historical baggage. The DN format inherited from X.500 is needlessly complex, the ASN.1 encoding (DER) is a frequent source of parser vulnerabilities, and the v1 and v2 formats are still encountered in legacy systems. The format is what we have, however, and every TLS library and every certificate-handling tool deals with it.

Certificates are typically encoded in either binary DER (the canonical format) or in PEM (Privacy-Enhanced Mail) format, which is the DER bytes base64-encoded and wrapped in -----BEGIN CERTIFICATE----- / -----END CERTIFICATE----- markers. PEM is the format you encounter when working with certificates as files; DER is what crosses the wire in TLS.
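
The relationship between the two encodings is mechanical, as the sketch below shows. The helper names are hypothetical and the sample bytes are a placeholder, not a real certificate; the functions only wrap and unwrap the base64 text and do not parse or validate the DER structure.

```python
import base64

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def der_to_pem(der: bytes) -> str:
    """Base64-encode DER bytes and wrap them in PEM markers, 64 characters per line."""
    b64 = base64.b64encode(der).decode("ascii")
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return "\n".join([PEM_HEADER, *lines, PEM_FOOTER]) + "\n"


def pem_to_der(pem: str) -> bytes:
    """Strip the PEM markers and decode the base64 body back to DER bytes."""
    lines = [line.strip() for line in pem.strip().splitlines()]
    if len(lines) < 2 or lines[0] != PEM_HEADER or lines[-1] != PEM_FOOTER:
        raise ValueError("not a PEM certificate block")
    return base64.b64decode("".join(lines[1:-1]))


placeholder_der = bytes(range(200))  # stand-in bytes, not a real certificate
pem_text = der_to_pem(placeholder_der)
assert pem_to_der(pem_text) == placeholder_der  # the round trip is lossless
```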

The trust chain

A certificate is not trusted on its own. A relying party trusts a small set of root CAs, identified by their public keys, which are typically distributed as part of an operating system or browser root store. A certificate is considered valid only if a chain of signatures connects it back to one of these trusted roots.

The chain looks like:

End-entity certificate (the cert your web server presents): signed by an intermediate CA.
Intermediate CA certificate(s): signed by another intermediate or directly by a root CA.
Root CA certificate: self-signed, but trusted because its public key is in the relying party’s root store.

The intermediate layer exists for two reasons. Operational: root CA private keys are extremely valuable and should be used as rarely as possible (typically stored offline in HSMs and brought online only for ceremonies that issue or rotate intermediates). Risk-containment: a compromise of an intermediate can be addressed by revoking the intermediate and reissuing under a different one without requiring users to update their root store.

Certificate path validation, specified in RFC 5280 section 6, is the algorithm a relying party runs to verify a chain. Roughly: start at the end-entity certificate, verify the signature against the named issuer’s public key, walk up the chain, check that each certificate is within its validity period, has appropriate Basic Constraints, has appropriate Key Usage, has not been revoked, and ends at a trusted root. Failure of any step rejects the chain.
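
The checks listed above can be laid out as a toy model. This is a sketch under strong assumptions and not the RFC 5280 algorithm: the `Cert` record and `validate_chain` function are hypothetical, signature verification is replaced by a plain issuer-name comparison, and Key Usage, name constraints, and policy processing are left out entirely.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    not_before: datetime
    not_after: datetime
    is_ca: bool   # Basic Constraints CA flag
    serial: int


def validate_chain(chain, trusted_roots, revoked, now):
    """chain[0] is the end-entity certificate; each later entry issued the one before it.

    trusted_roots is a set of root subject names; revoked is a set of
    (issuer, serial) pairs. Returns (ok, reason).
    """
    if not chain:
        return False, "empty chain"
    for i, cert in enumerate(chain):
        if not (cert.not_before <= now <= cert.not_after):
            return False, f"{cert.subject}: outside validity period"
        if (cert.issuer, cert.serial) in revoked:
            return False, f"{cert.subject}: revoked"
        if i > 0 and not cert.is_ca:
            return False, f"{cert.subject}: not a CA certificate"
        if i + 1 < len(chain) and chain[i + 1].subject != cert.issuer:
            # Stand-in for verifying the signature with the issuer's public key.
            return False, f"{cert.subject}: next certificate is not its issuer"
    if chain[-1].issuer not in trusted_roots:
        return False, "chain does not end at a trusted root"
    return True, "ok"


start = datetime(2025, 1, 1, tzinfo=timezone.utc)
end = datetime(2027, 1, 1, tzinfo=timezone.utc)
now = datetime(2026, 1, 1, tzinfo=timezone.utc)
leaf = Cert("www.example.com", "Example Intermediate CA", start, end, False, 1001)
inter = Cert("Example Intermediate CA", "Example Root CA", start, end, True, 7)
print(validate_chain([leaf, inter], {"Example Root CA"}, set(), now))  # (True, 'ok')
```

Any single failed check rejects the whole chain, which mirrors the all-or-nothing behavior described above.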

The set of trusted roots is what determines who can issue certificates that the relying party will accept. Operating systems and browsers maintain root programs with specific policies for which CAs are admitted, what audits they must pass, and what behavior gets them removed. The major root programs are:

Microsoft Root Certificate Program — applies to Windows, IE, Edge legacy.
Apple Root Certificate Program — applies to macOS, iOS, Safari.
Mozilla Root Program — applies to Firefox and is the most influential because many other systems (Linux distributions, embedded systems) inherit Mozilla’s list rather than maintain their own.
Chrome Root Program — historically inherited from the platform but increasingly independent; Chrome now ships its own root store.

A CA accepted into one program is not automatically accepted into the others, though the programs broadly coordinate through the CA/Browser Forum and have largely aligned requirements.

The CA ecosystem and how it’s governed

The set of public CAs is small in absolute terms — on the order of 100-200 organizations operate trusted CAs at any given time — and is governed by a layered structure of standards bodies, root programs, audits, and a long-running adversarial relationship between the CAs (which want loose rules to enable easier issuance) and the browser vendors (which want tight rules to limit damage when things go wrong).

The CA/Browser Forum (CAB Forum) is the multilateral body that publishes the Baseline Requirements, the minimum standards a public CA must meet to be accepted into the major root programs. The Baseline Requirements cover validation procedures, certificate contents, key management, audit requirements, incident response obligations, and dozens of other operational details. Updates happen continuously through the CAB Forum’s ballot process, and the requirements have real teeth: CAs that failed to comply have been removed from root programs on multiple occasions.

CAs are audited annually under one of several frameworks. WebTrust for CAs, originally developed by the American Institute of CPAs together with its Canadian counterpart and now administered by CPA Canada, is the dominant framework in North America. ETSI EN 319 411 is the European counterpart. The audit reports are public and are part of how root programs evaluate ongoing CA trustworthiness.

The validation levels a CA can perform have been progressively tightened over the years:

Domain Validation (DV) — proves only that the requester controls the domain name. The simplest and cheapest validation; what Let’s Encrypt issues. Modern DV is automated via the ACME protocol and produces certificates in seconds.
Organization Validation (OV) — verifies organizational identity in addition to domain control. The CA confirms the requesting organization is a real legal entity with the claimed name. Slower and more expensive than DV, and the resulting certificate includes the organization information in the subject DN.
Extended Validation (EV) — adds further organizational verification under tighter rules. EV was the format that, when first introduced, lit up the address bar in green with the organization name. The green-bar UI is now gone from all major browsers — UX research showed users didn’t notice or understand it — and EV’s market position has weakened substantially.

The ACME protocol (RFC 8555) and Let’s Encrypt deserve specific mention because they together transformed the certificate ecosystem starting in 2015. Before ACME, getting a TLS certificate was a manual process involving CSR generation, validation challenges that varied by CA, and typically a non-trivial fee. ACME automated the entire flow and Let’s Encrypt offered DV certificates at no cost. The combination drove HTTPS adoption on the open web from roughly 40% of page loads in 2015 to over 95% by 2023. Other CAs (ZeroSSL, Buypass, Google Trust Services) now offer ACME-compatible issuance as well.

The revocation problem

Certificates expire on a known schedule defined by their notAfter field. They can also be revoked before that date if the corresponding private key is compromised, if the issuing CA discovers the certificate was issued incorrectly, or if the certificate holder requests it. Revocation is structurally hard, and the revocation infrastructure has been one of the weakest parts of PKI for the entire history of the discipline.

Three revocation mechanisms have been deployed:

Certificate Revocation Lists (CRLs) are signed lists of revoked certificate serial numbers, published by the CA at a URL specified in the certificate’s CRL Distribution Points extension. Relying parties download the CRL and check whether the certificate they’re validating appears on it. The mechanism is simple but does not scale — modern CAs issue millions of certificates, and CRLs for large CAs grow to tens of megabytes. Browsers stopped doing real-time CRL checking years ago because of the download cost.

Online Certificate Status Protocol (OCSP), specified in RFC 6960, replaces the bulk-download model with a query model: the relying party asks the CA’s OCSP responder whether a specific certificate is currently valid, and the responder returns a signed answer. OCSP solves the scale problem but introduces a privacy problem (the CA now sees every certificate every browser checks) and a reliability problem (if the OCSP responder is down, what does the relying party do?). The latter has caused major outages — when an OCSP responder fails, the choice between fail-open (accept the certificate, defeating the purpose of revocation) and fail-closed (reject the certificate, breaking all access to sites under that CA) is unappetizing in both directions. Most modern clients fail open, which means OCSP provides essentially no real-time protection.

OCSP stapling, specified in RFC 6066 section 8, lets the server fetch its own OCSP response from the CA and present it to clients during the TLS handshake. This shifts the OCSP traffic from the client to the server, addressing the privacy and latency concerns. OCSP Must-Staple, specified in RFC 7633, is a certificate extension that requires clients to refuse a connection if the OCSP response is missing — which fixes the fail-open problem but at the cost of brittleness if the server fails to staple. Deployment of OCSP stapling has been uneven; deployment of Must-Staple is rare.
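
The fail-open versus fail-closed choice, and the way Must-Staple changes it, comes down to a small decision table. The sketch below is illustrative only: the `Status` enum and `accept_certificate` function are hypothetical names, and a real client would obtain the status from a CRL, an OCSP responder, or a stapled response rather than receive it as an argument.

```python
from enum import Enum


class Status(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    UNAVAILABLE = "unavailable"  # responder down, timed out, or nothing stapled


def accept_certificate(status: Status, must_staple: bool = False) -> bool:
    """Decide whether to proceed with the connection given a revocation status."""
    if status is Status.REVOKED:
        return False
    if status is Status.GOOD:
        return True
    # No answer available. A soft-fail client accepts the certificate here;
    # a Must-Staple certificate turns the missing answer into a hard failure.
    return not must_staple


print(accept_certificate(Status.UNAVAILABLE))                    # True  (fail-open)
print(accept_certificate(Status.UNAVAILABLE, must_staple=True))  # False (fail-closed)
```

The first call is the case an attacker who can block OCSP traffic relies on, which is why soft-fail checking adds little protection.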

The modern browser response has been to abandon real-time revocation checking entirely and replace it with aggregated revocation pushes: the browser vendor builds a compressed representation of all known revocations and pushes it to clients as a regular update. CRLite (Mozilla, 2017 onward) and OneCRL (Mozilla’s earlier mechanism) are the production examples. Chrome’s CRLSets does something similar with a narrower scope. The model trades real-time revocation for fast, reliable, scalable distribution of revocation data.

The underlying reality is that revocation has never really worked at internet scale. The combination of short certificate lifetimes (most public certificates now expire within 90 days, and the CAB Forum has adopted a schedule that steps the maximum lifetime down to 200 days in 2026, 100 days in 2027, and 47 days in 2029) and aggregated revocation push lists has produced a system that is good enough in practice: most compromises are caught quickly, and most compromised certificates expire on their own within months. It remains a structural weakness.

Certificate Transparency

Certificate Transparency (CT) was the major browser vendors’ response to a run of CA failures that began with DigiNotar and Comodo in 2011 and continued through multiple TURKTRUST incidents, ANSSI in 2013, India CCA, and the Symantec series ending in 2018. CT was specified initially in RFC 6962 and revised in RFC 9162.

CT works by requiring every publicly-trusted certificate to be logged in at least two CT logs before browsers will accept it. The logs are append-only, signed, publicly readable, and operated by a diverse set of organizations. The certificate carries embedded Signed Certificate Timestamps (SCTs), each a signed promise from a log to include the certificate within a bounded delay.
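
The append-only property rests on a Merkle hash tree: RFC 6962 defines a tree hash over the log entries, and the log signs the resulting root. The sketch below follows that definition (a 0x00 prefix for leaves, a 0x01 prefix for interior nodes, and a split at the largest power of two smaller than the entry count). The function names are hypothetical and the entries are placeholder bytes rather than real log entries.

```python
import hashlib


def leaf_hash(entry: bytes) -> bytes:
    """Hash a single log entry with the leaf prefix 0x00."""
    return hashlib.sha256(b"\x00" + entry).digest()


def merkle_root(entries) -> bytes:
    """Compute the Merkle tree hash over an ordered list of entries."""
    n = len(entries)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    # Split at the largest power of two strictly smaller than n.
    k = 1
    while k * 2 < n:
        k *= 2
    left = merkle_root(entries[:k])
    right = merkle_root(entries[k:])
    # Interior nodes use the prefix 0x01 so they cannot collide with leaves.
    return hashlib.sha256(b"\x01" + left + right).digest()


log = [b"cert-1", b"cert-2", b"cert-3"]
before = merkle_root(log)
after = merkle_root(log + [b"cert-4"])
print(before.hex() != after.hex())  # True: appending an entry changes the signed root
```

Because every entry feeds into the root, a log cannot quietly alter or drop an earlier entry without producing a root that disagrees with one it has already signed.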

The mechanism does not prevent CA failures; a malicious CA can still issue a fraudulent certificate. What CT provides is detection: every certificate the CA issues is publicly visible, which lets domain owners monitor CT logs for unauthorized certificates issued for their domains, and lets researchers identify CAs behaving anomalously. The shift from “trust the CAs to behave correctly” to “watch the CAs continuously” is the structural change CT enables.

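CT enforcement was rolled out gradually: Chrome required it for EV certificates from 2015 and for all newly issued publicly-trusted certificates from April 2018, and it is now effectively mandatory for any certificate that has to work in major browsers. The CT log infrastructure includes logs operated by Google, Cloudflare, DigiCert, Let’s Encrypt, Sectigo, and several others, and a separate ecosystem of monitors and auditors watches those logs.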

For domain owners, the practical CT workflow is: monitor public CT data feeds for any certificate issued naming your domain, alert on any certificate you didn’t authorize, and report unauthorized certificates to the issuing CA for revocation. Several free monitoring services (crt.sh, Cert Spotter, Facebook’s CT monitor) make this operationally tractable.
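
The alerting step of that workflow might look like the sketch below. It assumes log entries have already been fetched and reduced to dictionaries with hypothetical `issuer` and `dns_names` keys; this is not the data format of any particular monitoring service, and the issuer names are placeholders.

```python
def find_unauthorized(entries, domain, authorized_issuers):
    """Return entries that name the domain (or a subdomain) from an unexpected issuer."""
    domain = domain.lower()
    suffix = "." + domain
    alerts = []
    for entry in entries:
        names = [name.lower() for name in entry["dns_names"]]
        # "*.example.com" ends with ".example.com", so wildcards are caught too.
        covers_domain = any(n == domain or n.endswith(suffix) for n in names)
        if covers_domain and entry["issuer"] not in authorized_issuers:
            alerts.append(entry)
    return alerts


observed = [
    {"issuer": "Expected CA", "dns_names": ["example.com", "www.example.com"]},
    {"issuer": "Unknown CA", "dns_names": ["*.example.com"]},
    {"issuer": "Unknown CA", "dns_names": ["notexample.com"]},
]
for alert in find_unauthorized(observed, "example.com", {"Expected CA"}):
    print("unexpected certificate from", alert["issuer"], "for", alert["dns_names"])
```

Only the second entry is flagged: the first comes from an authorized issuer and the third names a different domain.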

The history of CA failures

A short and incomplete catalog of CA incidents worth knowing as context:

DigiNotar (2011) — Dutch CA compromised, attacker issued fraudulent certificates for Google and many other major sites. Detected by Iranian Gmail users seeing unexpected certificates. DigiNotar was removed from root programs and went bankrupt within weeks.
Comodo (2011) — Reseller account compromised, fraudulent certificates issued for Google, Yahoo, Skype, and Mozilla addons sites. Detected because Comodo identified the breach itself and reported it.
TURKTRUST (2013) — Turkish CA mistakenly issued two intermediate CA certificates to customers that should have received end-entity certificates, allowing the recipients to issue further certificates. Discovered when one of the recipients used its certificate to issue a Google certificate. Browsers blocked the mis-issued intermediates and imposed further restrictions on TURKTRUST.
ANSSI (2013) — French government CA discovered to have issued an intermediate certificate to a third party that was using it to inspect TLS traffic for a private network. The intermediate was misused to issue Google certificates that ended up in the wild.
Symantec series (2015-2018) — Symantec’s CA business (which through acquisitions included VeriSign, GeoTrust, Thawte, and RapidSSL) was found to have repeatedly issued certificates without proper validation. Google led a multi-year distrust process culminating in the gradual removal of Symantec-issued certificates from trust stores by 2018, forcing every Symantec customer to migrate to a different CA.
WoSign / StartCom (2016) — Chinese CA WoSign was found to have backdated certificates to bypass the SHA-1 deprecation, among other issues. Both WoSign and its subsidiary StartCom were distrusted.
Camerfirma (2021) — Spanish CA distrusted after a long sequence of audit failures and misissuance incidents.
TrustCor (2022) — distrusted following reports linking the CA to a US government contractor with intelligence connections.

The pattern across these incidents is consistent: a CA either fails to perform validation correctly, has its issuance infrastructure compromised, or is found to have policies that conflict with the Baseline Requirements. The detection mechanism varies — sometimes internal CA reports, sometimes affected users, increasingly Certificate Transparency monitoring. The response is removal from root programs, which forces every customer of the distrusted CA onto a different one within months.

The cumulative effect of two decades of incidents is the current PKI governance environment: short certificate lifetimes, mandatory CT logging, automated DV via ACME, and a CAB Forum that meets regularly to tighten the Baseline Requirements. The system is not pretty, but it is meaningfully more robust than it was in 2011.

Private PKI

Enterprises run their own PKI for internal purposes — authenticating servers on internal networks, authenticating users via smart cards, signing code, encrypting email — using private CAs that are trusted only within the organization. Private PKI is structurally similar to public PKI but with different operational constraints.

On Windows networks, Active Directory Certificate Services (AD CS) is the dominant private PKI implementation. AD CS issues certificates based on certificate templates that define what the certificate is for, what validation is required, and what permissions are needed to request the certificate. The system is powerful and operationally complex, and the certificate template configuration has produced a notable series of privilege escalation attacks (ESC1 through ESC15 and continuing) that exploit misconfigured templates to obtain certificates that authenticate as privileged users. AD CS hardening is its own discipline; the SpecterOps writeups on the ESC attack family are the canonical reference.

Outside Windows, HashiCorp Vault, Smallstep CA, CFSSL (Cloudflare’s open-source CA), and several cloud-managed services (AWS Private CA, Google Cloud CAS, Azure Key Vault Certificates) provide private PKI capabilities. The choice depends on the integration profile — cloud-native applications typically use the cloud provider’s offering; Kubernetes deployments often use cert-manager with an internal CA.

Mutual TLS (mTLS) authentication, in which both the client and the server present certificates, is the most common use of private PKI for service-to-service authentication. mTLS is widely deployed in service meshes (Istio, Linkerd), in zero-trust network architectures, and in API gateways. The operational cost is real — certificates have to be issued, distributed, rotated, and revoked across every service — and the tooling has matured substantially in the last five years.
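
On the server side, the difference between ordinary TLS and mTLS is largely one setting: the server must require and verify a client certificate. The sketch below uses Python's standard `ssl` module; the function name and the file-path parameters are hypothetical, and the files are loaded only when paths are supplied so the example runs without any certificates on disk.

```python
import ssl


def make_mtls_server_context(cert_file=None, key_file=None, client_ca_file=None):
    """Build a server-side TLS context that refuses clients without a valid certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # The step that makes the handshake mutual: a client certificate is mandatory.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if client_ca_file is not None:
        # Trust only the private CA that issues client certificates.
        ctx.load_verify_locations(cafile=client_ca_file)
    if cert_file is not None:
        # The server's own certificate chain and private key.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


context = make_mtls_server_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # True
```

In a real deployment all three paths would be supplied, typically by whatever tooling issues and rotates the certificates.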

Code signing PKI is a parallel ecosystem. Authenticode on Windows, Apple’s notarization service, sigstore for open-source code, and the various platform-specific app stores all run their own PKI infrastructures with their own root trust models. The principles are the same as web PKI; the operational details differ enough that code signing is treated as a separate topic in most enterprise security programs.

Post-quantum PKI

The post-quantum transition will affect PKI more visibly than it affects most other parts of the cryptographic stack. Certificates contain public keys; those public keys will need to migrate from RSA and ECDSA to ML-DSA and SLH-DSA. The chain validation logic does not need to change, but the keys, signatures, and certificate sizes change substantially.

The size impact is non-trivial. Where an ECDSA P-256 certificate carries a 64-byte signature and a 64-byte public key (two 32-byte coordinates), an ML-DSA-65 certificate carries a roughly 3.3 KB signature and a 2 KB public key. SLH-DSA signatures are larger still (7-49 KB depending on parameter set). For TLS handshakes that include the full certificate chain, the size increase translates directly to handshake latency and bandwidth, which is a meaningful operational concern for high-volume services.
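
A quick calculation shows the scale of the change for a typical chain. The sketch below counts only raw key and signature bytes, ignoring names, extensions, and ASN.1 overhead. It assumes a two-certificate chain (leaf plus one intermediate), counts the ECDSA public key as an uncompressed curve point of two 32-byte coordinates, and uses the ML-DSA-65 parameter sizes of 1952 bytes for the public key and 3309 bytes for the signature.

```python
SIZES = {
    # Raw sizes in bytes, without encoding overhead.
    "ECDSA P-256": {"public_key": 64, "signature": 64},
    "ML-DSA-65": {"public_key": 1952, "signature": 3309},
}


def chain_crypto_bytes(algorithm: str, certs_in_chain: int) -> int:
    """Bytes of public keys and signatures carried by a chain of the given length."""
    size = SIZES[algorithm]
    return certs_in_chain * (size["public_key"] + size["signature"])


classical = chain_crypto_bytes("ECDSA P-256", 2)
post_quantum = chain_crypto_bytes("ML-DSA-65", 2)
print(classical)     # 256
print(post_quantum)  # 10522
print(round(post_quantum / classical, 1))  # 41.1
```

Under these assumptions the cryptographic payload of the chain grows by roughly a factor of forty, before counting the handshake's own signature.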

The deployment pattern emerging in 2026 is hybrid certificates: certificates that carry both a classical public key (ECDSA P-256 or RSA) and a post-quantum public key (ML-DSA), with both signed by the issuing CA. Relying parties that support post-quantum verify both signatures; relying parties that don’t fall back to the classical key. This is operationally awkward (everything is larger) but provides a migration path that does not break legacy clients.

The X.509 format itself is being extended to support post-quantum keys. The IETF LAMPS working group is standardizing the algorithm identifiers (OIDs) and certificate profiles for ML-DSA and SLH-DSA, with that work continuing through 2026 and 2027. Production deployment of post-quantum PKI is still early; Cloudflare, Google, and Apple have so far mostly shipped hybrid post-quantum key exchange in selected products rather than post-quantum certificates, and the public CA ecosystem will take years to fully transition.

Standards and references

RFC 5280 — Internet X.509 PKI Certificate and CRL Profile.
RFC 6960 — OCSP (Online Certificate Status Protocol).
RFC 6962 and RFC 9162 — Certificate Transparency.
RFC 7633 — OCSP Must-Staple.
RFC 8555 — ACME (Automatic Certificate Management Environment).
CAB Forum Baseline Requirements — the operational standard for public CAs.
NIST SP 800-32 — Introduction to Public Key Technology and the Federal PKI Infrastructure.
WebTrust for CAs — the dominant audit framework.

What to actually use in 2026

For public web servers, the practical pattern is: get certificates from Let’s Encrypt or another ACME-compatible CA, automate renewal so certificates never expire in production, use ECDSA P-256 keys (public CAs do not generally issue Ed25519 certificates), enable OCSP stapling where your CA still operates an OCSP responder, monitor Certificate Transparency for unauthorized issuance against your domains, and treat certificate management as continuous infrastructure rather than as a periodic ticket.
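
Automated renewal needs a rule for when to renew. The sketch below renews once a fixed fraction of the certificate's lifetime has elapsed. The function name and the two-thirds default are assumptions chosen for illustration, not a requirement of ACME or of any CA; in practice an ACME client makes this decision for you.

```python
from datetime import datetime, timedelta, timezone


def renewal_due(not_before, not_after, now, fraction=2 / 3):
    """Return True once the given fraction of the validity period has elapsed."""
    lifetime = not_after - not_before
    return now >= not_before + lifetime * fraction


issued = datetime(2026, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(days=90)
print(renewal_due(issued, expires, issued + timedelta(days=30)))  # False
print(renewal_due(issued, expires, issued + timedelta(days=61)))  # True
```

Renewing well before notAfter leaves room for retries if the CA or the validation step is temporarily unavailable.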

For internal corporate networks: use a managed private PKI (cloud KMS-based, Vault PKI, or AD CS with very careful template review) rather than running raw OpenSSL CAs, automate certificate distribution and rotation across services, and treat mTLS as the default for service-to-service authentication where the cost is bearable.

For code signing: use the platform-native signing infrastructure for the platform you ship on (Authenticode for Windows, notarization for macOS), use sigstore for open-source supply chains, and store signing keys in HSMs or cloud KMS — never on developer workstations.

For post-quantum readiness: start watching the CAB Forum and root program discussions, plan for the size impact on TLS handshakes, and begin pilot deployments of hybrid certificates in services where the protocol allows experimentation. The PQC transition is going to be a multi-year operational program for any organization that runs significant PKI infrastructure.

The mathematics underneath PKI is solid. The governance, the revocation, and the certificate-management operations are where the work lives, and where the failures keep happening. The discipline is genuinely thirty years old and still rough around the edges; the parts of it that work well do so because they have been progressively hardened in response to specific past failures.