IPSec: When Cryptography Meets Performance Reality

Scott Morrison, November 15, 2025
Tags: IPSec, VPN, IKE, AES-GCM, encryption, performance, route-based VPN, policy-based VPN, NAT traversal, ESP, network security
IPSec's journey from CPU-crushing single-core bottleneck to multi-gigabit performance showcases how hardware acceleration (AES-NI) and better ciphers (AES-GCM) finally made strong encryption practical. Despite its complexity, rigid policy-based configurations, and NAT-breaking tendencies fixed by UDP encapsulation, IPSec remains the standard because it works, it's secure, and replacing it would be worse than living with its quirks.

IPSec is what happens when cryptographers and network engineers design a protocol together without quite understanding each other's constraints. The cryptographers wanted strong encryption, perfect forward secrecy, and authentication of every packet. The network engineers wanted something that didn't melt CPUs, could traverse NAT, and didn't require a PhD to configure. What we got is a protocol suite so complex that even experienced engineers occasionally need to consult RFCs mid-configuration, so performance-hungry that early implementations could barely push 100 Mbps, and so NAT-unfriendly that we had to bolt on UDP encapsulation just to make it work in the real world. Yet IPSec remains the standard for VPNs, site-to-site connectivity, and secure tunnels because, despite its flaws, it actually provides strong security when configured correctly. This is the story of how IPSec went from crushing CPUs to leveraging hardware acceleration, from rigid policy-based tunnels to flexible route-based designs, and from NAT-breaking mess to UDP-encapsulated workaround.

Let's explore IPSec's architecture, its performance evolution, why cipher choices matter more than you'd think, and how modern implementations finally made it fast enough for the gigabit age.

IPSec Architecture: ESP, AH, and IKE

IPSec isn't a single protocol; it's a framework comprising multiple protocols working together. This complexity is both its strength (flexibility, strong security) and its weakness (configuration nightmares, interoperability issues).

ESP: Encapsulating Security Payload

ESP (RFC 4303) provides confidentiality, authentication, and integrity for IP packets. It's the workhorse of IPSec, what most people actually use.

ESP can operate in two modes:

Transport Mode: Encrypts only the payload, leaves the original IP header intact. Used for host-to-host communication where you want the original routing to work. The problem is middleboxes (firewalls, NAT) can still see and potentially modify the IP header, reducing security.

Tunnel Mode: Encapsulates the entire original IP packet in a new IP packet, then encrypts the original packet. The outer header has the tunnel endpoint addresses, the inner header has the actual source and destination. This is what most VPNs use because it protects everything including the original IP header.

ESP provides:

  • Confidentiality: Through encryption (AES, ChaCha20, etc.)
  • Authentication: Through HMAC or AEAD modes
  • Replay Protection: Through sequence numbers
  • Integrity: Cryptographic verification that packets weren't tampered with
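
Replay protection is the least obvious item in that list, so here is a minimal sketch of the sliding-window check a receiver might run on ESP sequence numbers. It is loosely modeled on the anti-replay window described in RFC 4303, but the class name, the 64-packet window, and the omission of extended sequence numbers are simplifying assumptions. A real stack also updates the window only after the packet's integrity check has passed.

```python
class ReplayWindow:
    """Simplified anti-replay window for ESP sequence numbers (illustrative only)."""

    def __init__(self, size: int = 64):
        self.size = size
        self.highest = 0   # highest sequence number accepted so far
        self.bitmap = 0    # bit i set => sequence number (highest - i) was seen

    def check_and_update(self, seq: int) -> bool:
        """Return True if the packet is fresh, False if it is a replay or too old."""
        if seq == 0:
            return False  # ESP sequence numbers start at 1
        if seq > self.highest:
            # New highest: slide the window forward and mark this packet as seen.
            shift = seq - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False  # fell off the left edge of the window
        if self.bitmap & (1 << offset):
            return False  # already seen: replay
        self.bitmap |= 1 << offset  # late but fresh: accept and record
        return True
```

Out-of-order packets inside the window are accepted once; anything older than the window or already seen is dropped.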

AH: Authentication Header

AH (RFC 4302) provides authentication and integrity but not confidentiality. It authenticates the entire IP packet including parts of the IP header (those that don't change in transit).

In practice, almost nobody uses AH anymore because:

  • ESP can provide authentication too (when configured properly)
  • AH's authentication of the IP header breaks NAT (changing source/destination breaks the authentication)
  • If you need both confidentiality and authentication, ESP does both
  • If you only need authentication without encryption, ESP can do that too (NULL encryption)

AH exists mostly in RFCs and certification requirements. Real networks use ESP.

IKE: Internet Key Exchange

ESP and AH handle the actual data protection, but how do the two ends agree on keys, algorithms, and parameters? Enter IKE.

IKE is the control plane of IPSec. It:

  • Authenticates the peers (pre-shared keys, certificates, or EAP)
  • Negotiates encryption and authentication algorithms
  • Establishes Security Associations (SAs)
  • Generates session keys
  • Handles re-keying when SAs expire
  • Provides perfect forward secrecy through Diffie-Hellman

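IKE operates in two phases (the names come from IKEv1; IKEv2 keeps the same split but calls the results the IKE SA and Child SAs):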

Phase 1 (IKE SA): Establishes a secure, authenticated channel between peers. This SA protects Phase 2 negotiation. In IKEv1, Phase 1 can use Main Mode (six messages, slower but encrypts the peer identities) or Aggressive Mode (three messages, faster but sends identities in the clear and, with PSK authentication, exposes a hash that can be attacked offline).

Phase 2 (IPSec SA): Uses the IKE SA to negotiate the actual IPSec parameters, creating one or more IPSec SAs for protecting data traffic.

This two-phase design provides flexibility and security but adds complexity and latency.

IKEv1 vs IKEv2: Finally Fixing The Mistakes

IKEv1 (RFC 2409, 1998) was complex, had multiple modes that interoperated poorly, and required many messages to establish a tunnel. IKEv2 (first published as RFC 4306 in 2005, current version RFC 7296 from 2014) fixed many issues:

Fewer Messages: IKEv2 establishes the IKE SA and the first IPSec SA in four messages (two round trips). IKEv1 needs six messages (Main Mode) or three (Aggressive Mode, but less secure) for Phase 1 alone, plus three more for Quick Mode in Phase 2.

Built-in NAT-T: IKEv2 includes NAT traversal in the base specification. IKEv1 added it later as an extension, causing compatibility issues.

Reliability: IKEv2 includes sequence numbers and acknowledgments for its messages. IKEv1 relied on retransmissions without acknowledgment, making it less reliable over lossy links.

EAP Authentication: IKEv2 natively supports EAP (Extensible Authentication Protocol), enabling client authentication methods such as the certificate-based EAP-TLS or the password-based EAP-MSCHAPv2 (the latter being terrible, but enterprises love it).

Mobility Support (MOBIKE): Through the MOBIKE extension (RFC 4555), IKEv2 supports changing IP addresses mid-session, critical for mobile devices switching between Wi-Fi and cellular.

Simplified Configuration: Fewer options and modes reduce configuration complexity and improve interoperability.

Better Error Handling: IKEv2 provides clearer error messages and recovery mechanisms.

IKEv2 is unambiguously better than IKEv1. Yet as of 2025, IKEv1 still exists in production because:

  • Legacy devices don't support IKEv2
  • "If it ain't broke" mentality (even though it kind of is broke)
  • Configuration inertia (nobody wants to reconfigure hundreds of tunnels)
  • Some network engineers learned IKEv1 and never learned IKEv2

Modern deployments should use IKEv2 exclusively. If your vendor doesn't support it, that's a vendor problem.

Cipher Suites: More Choices Than You Want

IPSec's flexibility means dozens of possible cipher combinations. For encryption:

3DES (Triple DES): Ancient (1990s), slow, 64-bit blocks (vulnerable to birthday attacks), and a 168-bit key that delivers only about 112 bits of effective security because of meet-in-the-middle attacks. Should be dead but lingers in legacy configurations. Don't use it.
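
To put a number on the birthday problem with 64-bit blocks, here is a back-of-the-envelope calculation. It assumes the usual rule of thumb that collisions become likely after roughly 2^(n/2) blocks for an n-bit block cipher; the function names and the 1 Gbps link speed are illustrative choices, not figures from any specific deployment.

```python
def birthday_bound_bytes(block_bits: int) -> int:
    """Approximate data volume at which a ciphertext block collision becomes likely."""
    blocks = 2 ** (block_bits // 2)       # ~2^(n/2) blocks for an n-bit block
    return blocks * (block_bits // 8)     # convert blocks to bytes


def seconds_to_reach(limit_bytes: int, link_bps: float) -> float:
    """Time to push limit_bytes through a link running at link_bps bits per second."""
    return limit_bytes * 8 / link_bps


if __name__ == "__main__":
    for name, bits in (("3DES", 64), ("AES", 128)):
        limit = birthday_bound_bytes(bits)
        secs = seconds_to_reach(limit, 1e9)
        print(f"{name}: ~{limit / 2**30:.3g} GiB under one key, {secs:.3g} s at 1 Gbps")
```

For a 64-bit block the bound works out to 32 GiB, which a saturated 1 Gbps tunnel reaches in under five minutes. With 128-bit blocks the same bound is far beyond anything a single SA will carry.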

AES (Advanced Encryption Standard): The modern standard. Available in 128-bit, 192-bit, and 256-bit key sizes. AES-128 is generally sufficient; AES-256 provides no meaningful security improvement in most contexts and is somewhat slower (14 rounds per block instead of 10).

ChaCha20: A stream cipher designed by Dan Bernstein. Faster than AES on systems without AES hardware acceleration (older mobile devices). Part of the ChaCha20-Poly1305 AEAD suite.

For modes of operation:

CBC (Cipher Block Chaining): The traditional mode, used with HMAC for authentication. Vulnerable to padding oracle attacks if implemented incorrectly. Works but dated.

GCM (Galois/Counter Mode): AEAD mode providing encryption and authentication in one operation. More efficient, especially with hardware support. The modern choice.

ChaCha20-Poly1305: AEAD alternative to AES-GCM, particularly good for mobile devices.

For authentication (when not using AEAD):

  • HMAC-MD5: Broken, don't use
  • HMAC-SHA1: Deprecated, being phased out
  • HMAC-SHA2-256/384/512: Current standard
  • HMAC-SHA3: Newer, not yet widely deployed

For key exchange:

Diffie-Hellman Groups: DH Group 2 (1024-bit) is broken, Groups 14 (2048-bit) and 15 (3072-bit) are current standard, Groups 19-21 (ECC) are modern and faster.

Modern recommended configuration:

  • IKEv2
  • AES-GCM-128 or AES-GCM-256 (or ChaCha20-Poly1305)
  • HMAC-SHA2-256 as the IKE PRF (and for IKE integrity when IKE itself isn't using an AEAD cipher)
  • DH Group 19 (256-bit ECC) or Group 14 (2048-bit) minimum
  • Certificate-based authentication (or at least long random PSKs)
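
Since most networks run on vendor defaults, a small audit script can at least flag the choices this section warns against. The sketch below is hypothetical: the dictionary keys, algorithm spellings, and the `audit_proposal` function are invented for illustration and don't correspond to any vendor's configuration format. The allowed values simply mirror the recommendations above.

```python
WEAK_ENCRYPTION = {"des", "3des"}
WEAK_INTEGRITY = {"hmac-md5", "hmac-sha1"}
AEAD_CIPHERS = {"aes-gcm-128", "aes-gcm-256", "chacha20-poly1305"}
ACCEPTABLE_DH_GROUPS = {14, 15, 19, 20, 21}  # the groups named in this article


def audit_proposal(proposal: dict) -> list[str]:
    """Return a list of findings for a (hypothetical) proposal description."""
    findings = []
    if proposal.get("ike_version") != 2:
        findings.append("use IKEv2 instead of IKEv1")

    enc = str(proposal.get("encryption", "")).lower()
    if enc in WEAK_ENCRYPTION:
        findings.append(f"{enc} is obsolete; switch to AES-GCM or ChaCha20-Poly1305")
    elif enc not in AEAD_CIPHERS:
        findings.append(f"{enc or 'unknown cipher'} is not an AEAD cipher; prefer AES-GCM")

    # A separate integrity algorithm only matters when the cipher is not AEAD.
    integ = str(proposal.get("integrity", "")).lower()
    if enc not in AEAD_CIPHERS and integ in WEAK_INTEGRITY:
        findings.append(f"{integ} should be replaced with HMAC-SHA2")

    if proposal.get("dh_group") not in ACCEPTABLE_DH_GROUPS:
        findings.append("DH group should be 14 or higher, ideally ECC (19-21)")
    return findings
```

An empty list means the proposal matches the recommendations; anything else is a prompt to go read the vendor's documentation.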

The reality is most networks use whatever the vendor defaults to, which may or may not be secure.

The Performance Problem: When Encryption Meets CPU Reality

Early IPSec implementations (late 1990s, early 2000s) had terrible performance. A high-end server might push 50-100 Mbps of IPSec traffic before the CPU maxed out. Compared to line-rate gigabit Ethernet, this was pathetic.

The bottleneck was encryption. AES operations, particularly key expansion and the round functions, are computationally expensive when done in software. Every packet needs:

  1. Encryption/decryption (multiple AES rounds per block)
  2. Authentication (HMAC computation over the entire packet)
  3. Packet processing overhead
  4. Context switching and interrupt handling

The Single-Core Bottleneck

Early IPSec implementations had another problem: they were single-threaded. All IPSec processing happened on one CPU core. Even if you had a quad-core system, IPSec would max out one core and leave the others idle.

Why? Several reasons:

Sequential Processing: Each packet must be encrypted in order (to maintain sequence numbers for replay protection). Parallelizing this is non-trivial.

Locking: Shared state (SA databases, sequence number counters) required locks, serializing operations.

Hardware Limitations: Crypto accelerators (when available) often had single-queue architectures.

Software Design: Many IPSec stacks were written before multi-core systems were common and never refactored for parallelism.

This resulted in "core pinning" where network engineers would manually bind IPSec processes to specific cores to avoid cache thrashing, but this didn't solve the fundamental single-core bottleneck.

A 2005-era server with four 3 GHz CPU cores might achieve:

  • 3 Gbps of plain forwarding
  • 150 Mbps of IPSec VPN traffic

The CPU utilization would show one core at 100% and three cores mostly idle. Painful.

AES-NI: Hardware Acceleration Saves The Day

Intel's AES-NI (AES New Instructions), introduced in 2010 with Westmere processors, changed everything. AES-NI added CPU instructions specifically for AES operations:

  • AESENC/AESENCLAST: Perform one round of AES encryption
  • AESDEC/AESDECLAST: Perform one round of AES decryption
  • AESKEYGENASSIST: Assist with AES key expansion
  • AESIMC: Perform the InvMixColumns transformation

With AES-NI, AES operations became 3-7x faster. Suddenly, the CPU could handle gigabit IPSec traffic without breaking a sweat.

AMD added support for the same instructions starting with its Bulldozer processors in 2011. ARM added AES instructions to ARMv8 as an optional cryptography extension (shipping in devices from 2013). By 2015, almost all mainstream CPUs had hardware AES support.

The impact was dramatic:

  • Before AES-NI: 150 Mbps per core
  • After AES-NI: 1-2 Gbps per core
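
One way to read those throughput figures is as a per-byte CPU budget. The helper below converts a throughput into cycles spent per byte, assuming a 3 GHz core as in the 2005-era example and, for the sake of comparison, the same clock for the AES-NI case. That is a simplification, since real clock speeds and per-packet overheads vary.

```python
def cycles_per_byte(clock_hz: float, throughput_bps: float) -> float:
    """CPU cycles consumed per byte when one core is saturated at throughput_bps."""
    bytes_per_second = throughput_bps / 8
    return clock_hz / bytes_per_second


if __name__ == "__main__":
    clock = 3e9  # assumed 3 GHz core
    for label, bps in (("150 Mbps", 150e6), ("1 Gbps", 1e9), ("2 Gbps", 2e9)):
        print(f"{label}: {cycles_per_byte(clock, bps):.0f} cycles/byte")
```

At 150 Mbps a saturated 3 GHz core is spending about 160 cycles on every byte; at 1 to 2 Gbps that drops to roughly 24 to 12 cycles per byte, including everything that isn't encryption.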

Combined with other optimizations (better crypto libraries, improved network stacks), IPSec performance jumped by an order of magnitude.

AES-GCM and Multi-Core Scaling

AES-NI solved single-core performance, but IPSec still struggled with multi-core scaling. The breakthrough came from AES-GCM (Galois/Counter Mode).

Why GCM Matters for Parallelism:

Traditional CBC mode requires sequential processing when encrypting: you can't encrypt block N+1 until you've encrypted block N, because each ciphertext block feeds into the next. GCM uses counter mode for encryption, which is parallelizable. Each block is encrypted independently:



Ciphertext_block_N = Plaintext_block_N XOR AES(Key, Counter_N)

Since Counter_N is known in advance, you can compute multiple AES operations in parallel, even for the same packet.
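
The independence of counter-mode blocks can be shown with a toy example. Note the loud assumption: Python's standard library has no AES, so this sketch uses truncated SHA-256 as a stand-in for AES(Key, Counter_N). It is not real AES-CTR or GCM and must not be used for actual encryption. It only demonstrates that each block can be computed on its own, in any order, with the same result.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 16  # bytes, matching the AES block size


def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    """Stand-in for AES(Key, Counter_N): truncated SHA-256. NOT real AES."""
    data = key + nonce + counter.to_bytes(4, "big")
    return hashlib.sha256(data).digest()[:BLOCK_SIZE]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    # zip() stops at the shorter input, which handles a short final block.
    return bytes(x ^ y for x, y in zip(a, b))


def ctr_process(key: bytes, nonce: bytes, data: bytes, workers: int = 1) -> bytes:
    """Encrypt or decrypt (the same operation in counter mode), block by block."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]

    def process(indexed_block):
        index, block = indexed_block
        # Each block depends only on its own counter value, not on its neighbors.
        return xor_bytes(block, keystream_block(key, nonce, index))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(process, enumerate(blocks)))
```

Running it with one worker or several produces identical output, which is exactly the property CBC encryption lacks. In real implementations the parallelism comes from pipelining AES instructions within a core rather than from threads.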

Galois Field Multiplication: GCM's authentication uses Galois field multiplication, which can also be parallelized and has hardware acceleration (PCLMULQDQ instruction on x86).

With AES-GCM and modern hardware:

  • Multiple packets can be processed across multiple cores
  • Multiple blocks within a packet can be processed in parallel (with pipelining)
  • Authentication and encryption overlap

Modern implementations achieve:

  • 10 Gbps per core with AES-GCM-128 on recent CPUs
  • Near line-rate 10/25/40 Gbps with proper tuning
  • 100+ Gbps with specialized hardware (smartNICs, FPGAs)

The single-core bottleneck is gone. Modern IPSec implementations scale across all available cores, limited mainly by network bandwidth, not CPU.

ChaCha20-Poly1305 for Mobile

While AES-GCM is perfect for servers with AES-NI, mobile devices often lack hardware AES support (though this is improving). ChaCha20-Poly1305 was designed to be fast in software, particularly on ARM processors.

For mobile VPNs, ChaCha20-Poly1305 often performs better than AES-GCM when hardware acceleration isn't available. Both are excellent AEAD ciphers; the choice depends on your platform.

Policy-Based vs Route-Based VPNs: A Religious War

IPSec has two main approaches to determining what traffic should be encrypted:

Policy-Based VPNs

Policy-based IPSec uses "crypto ACLs" or "proxy IDs" to define interesting traffic. You configure rules like:



Traffic from 192.168.1.0/24 to 10.0.0.0/8 = encrypt with ESP
Traffic from 192.168.1.0/24 to 172.16.0.0/12 = encrypt with ESP
Everything else = don't encrypt

When traffic matches a policy, IKE negotiates a Security Association for that specific traffic selector. Each unique traffic pair can have its own SA.
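
The matching logic behind those rules can be sketched in a few lines. This is an illustration only: the `POLICIES` table and `match_policy` function are hypothetical names, real traffic selectors also cover protocols and ports, and the subnets are just the ones from the example above.

```python
import ipaddress

# (source network, destination network) pairs that count as "interesting traffic".
POLICIES = [
    (ipaddress.ip_network("192.168.1.0/24"), ipaddress.ip_network("10.0.0.0/8")),
    (ipaddress.ip_network("192.168.1.0/24"), ipaddress.ip_network("172.16.0.0/12")),
]


def match_policy(src: str, dst: str):
    """Return the index of the first matching policy, or None to send in the clear."""
    src_addr = ipaddress.ip_address(src)
    dst_addr = ipaddress.ip_address(dst)
    for index, (src_net, dst_net) in enumerate(POLICIES):
        if src_addr in src_net and dst_addr in dst_net:
            return index  # each policy would map to its own SA
    return None
```

The fragility is visible even here: both peers need mirror-image tables, and a host in 192.168.2.0/24 silently falls through to "don't encrypt" until someone edits the list on both sides.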

Advantages:

  • Granular control (different encryption for different traffic flows)
  • More efficient (only traffic matching a policy pays the IPSec cost)
  • Split-tunneling naturally supported

Disadvantages:

  • Complex configuration (ACLs must match exactly on both sides)
  • Difficult to troubleshoot (did traffic match the right ACL?)
  • Doesn't work well with dynamic routing protocols
  • Adding new subnets requires reconfiguring ACLs
  • Interoperability issues (different vendors interpret policies differently)

Route-Based VPNs

Route-based IPSec creates a virtual tunnel interface (VTI). All traffic routed to this interface gets encrypted. You use normal routing (static routes, OSPF, BGP) to determine what goes through the tunnel.



Create tunnel interface tunnel0
Bind tunnel0 to IPSec SA
Route 10.0.0.0/8 via tunnel0
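
In the route-based model the encryption decision collapses into an ordinary longest-prefix-match lookup. The sketch below assumes a simple list of (prefix, interface) routes; the interface names `tunnel0` and `eth0` and the `lookup` function are illustrative, not taken from any particular platform.

```python
import ipaddress


def lookup(routes, dst: str):
    """Longest-prefix match: return the egress interface for dst, or None."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, interface in routes:
        network = ipaddress.ip_network(prefix)
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, interface)
    return best[1] if best else None


if __name__ == "__main__":
    routes = [("0.0.0.0/0", "eth0"), ("10.0.0.0/8", "tunnel0")]
    print(lookup(routes, "10.5.5.5"))   # egress via the tunnel interface
    print(lookup(routes, "8.8.8.8"))    # egress via the default route
    # Adding a network is just another route; no traffic selectors to renegotiate.
    routes.append(("172.16.0.0/12", "tunnel0"))
    print(lookup(routes, "172.16.1.1"))
```

Anything that resolves to the tunnel interface gets encrypted, which is why dynamic routing protocols fit this model so naturally.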

Advantages:

  • Simple, clean configuration
  • Works naturally with routing protocols (OSPF over IPSec)
  • Easy to add networks (just add routes)
  • Troubleshooting is easier (standard routing tools work)
  • Better for dynamic environments

Disadvantages:

  • All traffic to the interface is encrypted (less granular)
  • Slight overhead from tunnel interface
  • Some platforms don't support it (legacy devices)

The Verdict: Route-based VPNs are generally superior for site-to-site tunnels in modern networks. The simplicity and flexibility outweigh the slightly reduced granularity. Policy-based has niche uses (very specific security requirements, legacy equipment) but route-based should be the default choice.

The networking community is slowly converging on route-based, but many enterprises still run policy-based because "that's how we've always done it."

NAT Traversal: Fixing IPSec's NAT Problem

IPSec and NAT hate each other. Here's why:

ESP Protocol: ESP uses IP protocol 50, not TCP (6) or UDP (17). Most NAT devices only understand TCP and UDP, and ESP has no port numbers for a port-translating NAT to rewrite. Result: ESP packets get dropped by NAT.

Embedded Addresses: IPSec SAs include IP addresses. When NAT changes the source IP, it breaks the SA.

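Integrity Protection: AH authenticates the outer IP header, so any address rewrite by NAT invalidates the packet. ESP does not cover the outer header, but in transport mode the TCP and UDP checksums inside the protected payload are computed over the original addresses, and NAT can't fix them up because it can't see or modify that payload.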

In the early 2000s, this was a huge problem. Enterprise VPN users behind home NAT routers couldn't connect. Multiple VPN clients behind the same NAT couldn't work (all appeared to come from the same IP).

NAT-T: UDP to the Rescue

NAT Traversal (NAT-T, RFC 3947 and 3948) solved this by encapsulating ESP in UDP:



Original: [IP Header][ESP Header][Encrypted Payload][ESP Trailer]
With NAT-T: [IP Header][UDP Header (port 4500)][ESP Header][Encrypted Payload][ESP Trailer]

NAT devices understand UDP, so they can translate the IP/UDP headers. The ESP payload inside remains unchanged.
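
The encapsulation itself is mechanically simple, as this sketch of the byte layout suggests. It builds only the UDP and ESP headers with `struct`; the ciphertext is an opaque placeholder, the function names are made up for illustration, and the zero UDP checksum and the payload classification rules reflect my reading of RFC 3948 rather than any specific implementation.

```python
import struct

NAT_T_PORT = 4500


def esp_packet(spi: int, seq: int, ciphertext: bytes) -> bytes:
    """ESP header (SPI + sequence number) followed by an opaque encrypted body."""
    if spi == 0:
        raise ValueError("SPI must be non-zero")
    return struct.pack("!II", spi, seq) + ciphertext


def udp_encapsulate(esp: bytes, src_port: int = NAT_T_PORT, dst_port: int = NAT_T_PORT) -> bytes:
    """Prepend an 8-byte UDP header; the checksum is sent as zero for IPv4."""
    length = 8 + len(esp)
    return struct.pack("!HHHH", src_port, dst_port, length, 0) + esp


def classify(udp_payload: bytes) -> str:
    """Tell apart the three kinds of traffic that share UDP port 4500."""
    if udp_payload == b"\xff":
        return "nat-keepalive"          # single 0xFF byte keeps the NAT mapping alive
    if udp_payload[:4] == b"\x00\x00\x00\x00":
        return "ike"                    # four zero bytes mark a non-ESP (IKE) message
    return "esp"                        # otherwise the first four bytes are the SPI
```

Because the NAT only ever touches the outer IP and UDP headers, the ESP packet arrives byte-for-byte intact at the peer.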

NAT-T adds:

  • Detection of NAT between peers (during IKE)
  • UDP encapsulation (port 4500) for ESP packets
  • Keep-alive packets to maintain NAT mappings
  • Handling of port changes when NAT mappings update

The Overhead: UDP encapsulation adds 8 bytes per packet. For large packets, this is negligible (about 0.6% on a 1400-byte packet). For VoIP or other small-packet applications, it's more noticeable: 8 bytes on a 200-byte packet is 4%.
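
How much those 8 bytes matter depends entirely on packet size, which is easy to check. The packet sizes below are illustrative assumptions (a near-MTU data packet, a mid-sized packet, and a small voice-sized packet), measured before the UDP header is added.

```python
UDP_HEADER_BYTES = 8


def nat_t_overhead_pct(packet_bytes: int) -> float:
    """Extra bytes from UDP encapsulation as a percentage of the original packet."""
    return 100 * UDP_HEADER_BYTES / packet_bytes


if __name__ == "__main__":
    for size in (1400, 600, 200):
        print(f"{size}-byte packet: {nat_t_overhead_pct(size):.2f}% overhead")
```

The percentage grows as packets shrink, so bulk transfers barely notice NAT-T while small-packet flows pay proportionally more.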

Why Not Do This Always?: Some purists argue pure ESP is "cleaner" without UDP overhead. In practice, NAT-T is so common that many implementations use it even when NAT isn't present, for simplicity.

Without NAT-T, IPSec in NAT environments is painful or impossible. With it, everything just works. This is why IKEv2 includes NAT-T as a core feature rather than an extension.

Split Tunneling: Security vs Usability

When connected to a VPN, where does non-VPN traffic go?

Full Tunnel: All traffic goes through the VPN. Your connection to random websites routes through your corporate network. This provides security (corporate firewall inspects everything) but kills performance (all traffic goes the long way).

Split Tunnel: Only traffic destined for corporate resources goes through VPN. Everything else uses your local Internet connection directly. Better performance, but corporate security doesn't see that traffic.

The security team wants full tunnel (inspect everything, prevent data exfiltration). Users want split tunnel (Netflix actually works). It's been a fight for 20+ years.

COVID-19 and remote work forced many enterprises toward split tunnel because their VPN concentrators couldn't handle full tunnel traffic for thousands of simultaneous users all streaming Teams calls. Turns out video conferencing through a VPN halfway across the country works poorly.

Modern approaches use split tunnel with:

  • Endpoint security (EDR, antivirus) to replace some network security
  • Cloud-based security inspection (Zscaler, Cloudflare Gateway) for Internet traffic
  • VPN only for corporate resource access

But many enterprises still mandate full tunnel because security theater.

IKE Authentication: PSK vs Certificates

IKE supports multiple authentication methods:

Pre-Shared Keys (PSK)

Both sides configure the same secret string. During IKE negotiation, they prove they know the secret without transmitting it.
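
Here is roughly what "proving you know the secret without transmitting it" looks like. The sketch follows the shape of the IKEv2 pre-shared-key AUTH computation in RFC 7296, with two stated simplifications: HMAC-SHA256 is assumed as the negotiated PRF, and the signed octets are treated as an opaque byte string rather than being assembled from the real IKE messages and nonces.

```python
import hashlib
import hmac

KEY_PAD = b"Key Pad for IKEv2"  # fixed string defined by the IKEv2 specification


def prf(key: bytes, data: bytes) -> bytes:
    """Assumed PRF for this sketch: HMAC-SHA256."""
    return hmac.new(key, data, hashlib.sha256).digest()


def psk_auth(psk: bytes, signed_octets: bytes) -> bytes:
    """AUTH = prf(prf(PSK, key pad), signed octets); the PSK never goes on the wire."""
    return prf(prf(psk, KEY_PAD), signed_octets)


def verify_psk_auth(psk: bytes, signed_octets: bytes, received_auth: bytes) -> bool:
    """The peer recomputes AUTH from its own copy of the PSK and compares."""
    return hmac.compare_digest(psk_auth(psk, signed_octets), received_auth)
```

Only the AUTH value is exchanged, and it is bound to this particular negotiation through the signed octets. The scheme is still only as strong as the PSK, which is why "password123" remains a problem.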

Advantages:

  • Simple to configure
  • No PKI infrastructure needed
  • Works everywhere

Disadvantages:

  • Key distribution problem (how do you securely share the key?)
  • Same key often reused for multiple tunnels (compromise one = compromise all)
  • Weak keys are common ("password123")
  • No per-user authentication (everyone shares the same key)
  • Revocation is hard (change key = reconfigure everything)

Reality: PSK is used far more than it should be because it's easy and enterprises don't want to deploy PKI.

Certificate-Based Authentication

Each peer has a certificate signed by a trusted CA. During IKE, they exchange certificates and verify signatures.

Advantages:

  • Strong authentication (proper PKI)
  • Per-device certificates (revoke individual devices)
  • Scales better (no manual key distribution)
  • Industry best practice

Disadvantages:

  • Requires PKI infrastructure (CA, certificate management)
  • More complex configuration
  • Certificate expiration can break tunnels (needs monitoring)
  • Troubleshooting is harder (certificate chain issues, CRL/OCSP problems)

Reality: Large enterprises use certificates for site-to-site VPNs. Small businesses use PSK. Remote access VPNs often use a hybrid (server has certificate, users authenticate with username/password via EAP).

The right answer is certificates, but the easy answer is PSK, so PSK dominates.

Perfect Forward Secrecy: PFS vs Computational Cost

Perfect Forward Secrecy means that compromise of long-term keys doesn't compromise past session keys. For IPSec, this requires running Diffie-Hellman for each session.

Without PFS: IPSec SA keys are derived from the keying material of the IKE SA (Phase 1) plus fresh nonces. If an attacker recovers that IKE SA keying material, every IPSec SA derived from it can be decrypted, including traffic captured earlier.

With PFS: Each IPSec SA gets its own ephemeral Diffie-Hellman exchange. Compromising one set of keys exposes only that SA's traffic, because the ephemeral DH secrets behind the others were destroyed.

IKE Phase 1 always uses DH (establishing the IKE SA itself). The question is whether Phase 2 (IPSec SA) uses DH or just derives keys from the Phase 1 material.

PFS Group: Configuring a DH group for Phase 2 enables PFS. This requires periodic re-keying with new DH exchanges.
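
The difference between the two Phase 2 options can be sketched as a key-derivation choice. The code follows the general shape of the IKEv2 child-key derivation in RFC 7296 (key material from SK_d and the nonces, optionally with a new DH secret mixed in), under heavy simplifications: a single HMAC-SHA256 call stands in for the iterated prf+ function, and the Diffie-Hellman exchange uses a toy 127-bit prime that is far too small for real use. All names are illustrative.

```python
import hashlib
import hmac
import secrets

TOY_PRIME = 2**127 - 1  # illustration only; real groups are 2048+ bits or elliptic curves
GENERATOR = 3


def prf(key: bytes, data: bytes) -> bytes:
    """Single HMAC-SHA256 call standing in for the iterated prf+ construction."""
    return hmac.new(key, data, hashlib.sha256).digest()


def ephemeral_dh_secret() -> bytes:
    """Both peers pick throwaway private values and arrive at the same shared secret."""
    a = secrets.randbelow(TOY_PRIME - 2) + 1
    b = secrets.randbelow(TOY_PRIME - 2) + 1
    shared_initiator = pow(pow(GENERATOR, b, TOY_PRIME), a, TOY_PRIME)
    shared_responder = pow(pow(GENERATOR, a, TOY_PRIME), b, TOY_PRIME)
    assert shared_initiator == shared_responder
    return shared_initiator.to_bytes(16, "big")


def child_keymat(sk_d: bytes, nonce_i: bytes, nonce_r: bytes, dh_secret: bytes = b"") -> bytes:
    """Without PFS dh_secret is empty; with PFS a fresh DH secret is mixed in."""
    return prf(sk_d, dh_secret + nonce_i + nonce_r)
```

Without the DH input, anyone holding SK_d and the nonces can recompute the child keys. With it, they also need an ephemeral secret that no longer exists once the exchange is finished, which is what the extra CPU cost buys.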

The Cost: DH operations are expensive (especially strong groups like DH 14+). Frequent re-keying with PFS impacts CPU, particularly on high-throughput tunnels.

The Tradeoff:

  • Enable PFS with reasonable re-key lifetime (hours, not minutes)
  • Use ECC DH groups (cheaper than traditional DH)
  • Balance security needs against performance

For site-to-site VPNs carrying sensitive data, PFS is worth it. For tunnels where performance is critical and long-term key compromise is less concerning, maybe skip it.

IPSec in the Cloud Era

Cloud providers love IPSec because it's standardized and secure, but implementing it is complex:

AWS VPN: Uses IPSec with IKEv1/IKEv2 support, route-based tunnels, BGP over IPSec for dynamic routing. Works well but requires proper configuration (PSK, cipher suites, DH groups).

Azure VPN Gateway: Similar, supports both policy and route-based. Picky about configurations, easy to create interoperability issues.

GCP Cloud VPN: Route-based, good performance with HA VPN option.

Site-to-Cloud VPN: Common for hybrid cloud. Works but adds latency and complexity compared to direct connectivity (Direct Connect, ExpressRoute).

The challenge is that cloud providers have specific requirements (supported cipher suites, IKE versions) that may not match your on-premises equipment. Reading compatibility matrices is mandatory.

Modern Alternatives: WireGuard

WireGuard is the "can we do better than IPSec" answer. It's:

  • Much simpler (4000 lines of code vs IPSec's hundreds of thousands)
  • Faster (modern crypto primitives only, no legacy cruft)
  • Easier to configure (no IKE phases or complex ACLs)
  • More auditable (smaller codebase)

WireGuard uses:

  • ChaCha20-Poly1305 for encryption/auth
  • Curve25519 for key exchange
  • BLAKE2s for hashing
  • HKDF for key derivation

Performance is excellent, configuration is trivial, and security is strong.

So why isn't everyone using WireGuard?

  • Relatively new (stable since 2020, but IPSec has 25 years)
  • Vendor support still limited (improving rapidly)
  • Enterprise features less mature (no central management, limited authentication options)
  • "Nobody got fired for buying IPSec"

WireGuard is the future for greenfield deployments and modern environments. IPSec is the present for enterprises with existing infrastructure, legacy requirements, and risk-averse security teams.

Living With IPSec's Complexity

IPSec is a monster of complexity born from the 1990s when cryptographers designed protocols without worrying too much about implementation reality. It's gotten better:

  • Hardware acceleration makes it fast enough for modern networks
  • IKEv2 fixed many of IKEv1's problems
  • Route-based tunnels simplified configuration
  • NAT-T made it work in real networks
  • Modern cipher suites (AES-GCM, ChaCha20-Poly1305) are both fast and secure

But it's still complex. A proper IPSec deployment requires understanding:

  • Encryption algorithms and their performance characteristics
  • IKE phases and negotiation
  • Policy vs route-based design decisions
  • NAT traversal implications
  • Certificate management (if doing it right)
  • Routing integration
  • Troubleshooting (parsing IKE logs is an acquired skill)

Most deployments use vendor defaults, which may or may not be secure or optimal. The vendor defaults have improved (fewer awful choices like 3DES), but you still need expertise to do it right.

The path forward for most organizations:

  • Use IKEv2 exclusively
  • Route-based tunnels for site-to-site
  • AES-GCM or ChaCha20-Poly1305
  • Strong DH groups (19+ for ECC, 14+ for traditional)
  • Certificate-based auth where possible
  • Monitor for tunnel failures and re-key events
  • Consider WireGuard for new deployments if your environment supports it

IPSec works, it's secure when configured properly, and it's fast enough with modern hardware. It's just not elegant, simple, or particularly enjoyable to troubleshoot at 3 AM when your site-to-site tunnel is flapping.

But it's what we have, and it's not going anywhere soon. Welcome to the world of encrypted tunnels, where complexity meets security meets performance, and everyone has a slightly different interpretation of the RFCs.