# IPSec Bound End-to-End Tunnel VPN with strongSwan

[toc]

---

> Copyright (c) 2025 Philippe Latu. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".
> https://inetdoc.net

### Scenario

How do you design or develop a cost-effective, self-hosted system architecture and Internet services? One answer to this question is to use a low-cost VPS cloud service as your Internet point of presence and work on a lab mockup at home. To do this, you need to establish a secure communications channel that allows inbound traffic from the Internet to reach your self-hosted systems. This requires a site-to-site VPN technology with minimal overhead that is resilient to ISPs' somewhat shifting addressing plans and supports NAT traversal.

As a longtime user of [**strongSwan**](https://strongswan.org/), I discovered an IPSec optimization I had never paid attention to: bound end-to-end tunnel (BEET) mode. In the never-ending IPSec tunnel vs. transport mode debate, I have always preferred transport mode with GRE because it allows all types of traffic to be secured, especially OSPF routing protocol neighbor exchanges based on multicast. I am not in favor of tunnel mode because of its IP header overhead and the fact that it can only route one network by default.

This document is a step-by-step guide to setting up an IPSec VPN using this compromise solution, which combines Encapsulating Security Payload (ESP) with minimal header overhead. At both ends of the VPN tunnel, we use Debian GNU/Linux Trixie systems with strongSwan version 6.0.1.
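The header-overhead argument above can be quantified with a quick back-of-the-envelope calculation in Bash. The byte counts below are the standard IPv4 and IPv6 header sizes, not measurements from this particular setup:

```bash
#!/bin/bash
# Extra per-packet cost of IPsec tunnel mode compared to transport/BEET mode:
# tunnel mode transmits a complete additional IP header, while BEET discards
# the inner header and rebuilds it from SA parameters at the far end.
IPV4_HEADER=20 # bytes, IPv4 header without options
IPV6_HEADER=40 # bytes, fixed IPv6 header
PAYLOAD=100    # bytes, e.g. a small VoIP or telemetry datagram

echo "Tunnel-mode extra overhead per IPv4 packet: ${IPV4_HEADER} bytes"
echo "Tunnel-mode extra overhead per IPv6 packet: ${IPV6_HEADER} bytes"
echo "Relative cost on a ${PAYLOAD}-byte IPv4 payload: $(( 100 * IPV4_HEADER / PAYLOAD ))%"
```

For small packets the saving is far from negligible: on a 100-byte payload, BEET avoids a 20% size increase in IPv4 and twice that in IPv6.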
![IPSec BEET encapsulation](https://md.inetdoc.net/uploads/133e18c8-983d-4f2b-8421-bfc05b2d0e3c.png)

## Part 1: IPSec BEET encapsulation

Here is a brief description of IPsec BEET (Bound End-to-End Tunnel) mode, which combines aspects of both transport and tunnel modes to provide tunnel-like semantics with reduced overhead. Its encapsulation works as follows:

- **Outgoing packet processing**

  Inner IP header verification
  : The original packet's IP addresses are checked against the BEET SA's bound inner addresses.

  Header replacement
  : - The original IP header is discarded
    - A new outer IP header (built from SA parameters) is prepended

- **Incoming packet processing**

  Decryption
  : ESP transport mode decryption occurs first

  Header restoration
  : - The outer IP header is stripped
    - The pre-negotiated inner IP header is reconstructed from SA parameters

For both inbound and outbound traffic, encryption and authentication follow the rules of ESP transport mode, allowing NAT traversal.

## Part 2: Prepare network tunnel end interfaces and routing

In this part, we start by enabling IPv4 and IPv6 routing at both ends of the IPsec VPN tunnel to allow traffic to flow in and out of dedicated interfaces.

Over two decades ago, I learned that it is preferable to use independent logical interfaces to carry the configurations of routing protocols such as BGP, or of tunnels. The goal of this approach is to avoid dependencies on the link state and addressing scheme of physical interfaces. In the context of this document, we follow this idea by setting up Linux Xfrm interfaces.

Xfrm interfaces in Linux serve as virtual endpoints for IPsec Security Associations (SAs), including those operating in Bound End-to-End Tunnel (BEET) mode. Their primary role is to decouple IPsec policy enforcement from traditional routing mechanisms, allowing for flexible and scalable tunnel management.

1. SA-to-Interface Binding
   - Xfrm interfaces associate BEET-mode SAs with a unique interface ID, allowing multiple independent tunnels to coexist without address/SPI conflicts.
   - Traffic routed to the interface is automatically subjected to the bound SA's encryption/decryption rules.
2. Header Transformation
   - On egress: inner IP headers (fixed per BEET SA) are replaced with outer headers, following BEET's transport-like encapsulation.
   - On ingress: outer headers are stripped, and inner headers are reconstructed using SA-stored addresses.
3. Simplified Policy Management
   - Policies are linked to interfaces via the `if_id` parameter, eliminating complex mark-based routing.
   - Wildcard traffic selectors (e.g., `0.0.0.0/0`) are supported, since routing decisions are handled by the interface itself. Notice that we don't use wildcard traffic selectors in this document.

### Step 1: Routing must be enabled at both ends

Here is a bash code snippet to run on both the cloud VPS instance and the lab gateway. It enables IPv4 and IPv6 routing at the kernel level and is resilient to system reboots as long as the `/etc/sysctl.d/10-routing.conf` file exists.

```bash=
cat << EOF | sudo tee /etc/sysctl.d/10-routing.conf
net.ipv4.conf.default.rp_filter=2
net.ipv4.conf.all.rp_filter=2
net.ipv4.conf.all.log_martians=1
net.ipv4.ip_forward=1
net.ipv6.conf.all.forwarding=1
EOF
sudo sysctl --system
```

### Step 2: Set up the Xfrm interface at both ends

At the time of this writing, Netplan.io doesn't manage Xfrm interfaces. Therefore, we need to drop down to systemd-networkd and create each individual configuration file for the link and network layers.

Here is the tunnel addressing plan:

|          | Cloud VPS           | Lab gateway         |
|:-------- |:------------------- |:------------------- |
| IPv4     | 10.254.0.1/30       | 10.254.0.2/30       |
| IPv6     | fdc0:beef:1::1/128  | fdc0:beef:2::1/128  |

Here is the Bash script code that creates and configures the `xfrm1` interface on the **Lab gateway**.
```bash=
sudo tee /etc/systemd/network/20-xfrm1.netdev > /dev/null <<\EOF
[NetDev]
Name=xfrm1
Kind=xfrm

[Xfrm]
InterfaceId=1
EOF

sudo tee /etc/systemd/network/20-xfrm1-assign.network > /dev/null <<\EOF
[Match]
Name=lo

[Network]
Xfrm=xfrm1
EOF

sudo tee /etc/systemd/network/20-xfrm1.network > /dev/null <<\EOF
[Match]
Name=xfrm1

[Network]
Address=10.254.0.2/30
Address=fdc0:beef:2::1/128

[Route]
Destination=fdc0:beef:1::/64

[Link]
RequiredForOnline=no
EOF

sudo systemctl restart systemd-networkd
```

After running the script, we can verify that the new interface is configured correctly.

```bash
ip a ls dev xfrm1
```

```bash=
3: xfrm1@lo: <NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/none
    inet 10.254.0.2/30 brd 10.254.0.3 scope global xfrm1
       valid_lft forever preferred_lft forever
    inet6 fdc0:beef:2::1/128 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::3dec:c20:5141:54f7/64 scope link stable-privacy proto kernel_ll
       valid_lft forever preferred_lft forever
```

At the other end of the not yet established VPN tunnel, we can apply a symmetric script that differs from the first only in its addresses.

```bash=
sudo tee /etc/systemd/network/20-xfrm1.netdev > /dev/null <<\EOF
[NetDev]
Name=xfrm1
Kind=xfrm

[Xfrm]
InterfaceId=1
EOF

sudo tee /etc/systemd/network/20-xfrm1-assign.network > /dev/null <<\EOF
[Match]
Name=lo

[Network]
Xfrm=xfrm1
EOF

sudo tee /etc/systemd/network/20-xfrm1.network > /dev/null <<\EOF
[Match]
Name=xfrm1

[Network]
Address=10.254.0.1/30
Address=fdc0:beef:1::1/128

[Route]
Destination=fdc0:beef:2::/64

[Link]
RequiredForOnline=no
EOF

sudo systemctl restart systemd-networkd
```

We can confirm that the configuration is correct after running the script.
```bash
ip a ls dev xfrm1
```

```bash=
3: xfrm1@lo: <NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/none
    inet 10.254.0.1/30 brd 10.254.0.3 scope global xfrm1
       valid_lft forever preferred_lft forever
    inet6 fdc0:beef:1::1/128 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::d0cd:8ae:3409:7010/64 scope link stable-privacy proto kernel_ll
       valid_lft forever preferred_lft forever
```

Now that the tunnel end interfaces are in place, we can move on to the authentication part.

## Part 3: Generate authentication secrets

Among the many authentication solutions available with strongSwan, we choose the Ed25519/X25519 cryptographic approach using self-signed certificates. As usual, this is a compromise choice: it provides an estimated 128-bit classical security level while maintaining compatibility with legacy systems through RFC-compliant X.509 certificates. According to the EU Cybersecurity Regulation 2025, X25519/Ed25519 is mandatory for public tenders.

### Step 1: Install `pki` tool and bypass TPM

We start by installing the simple public key infrastructure management tool provided by the `strongswan-pki` package, along with the shared libraries for TPM2 Access Broker & Resource Management.

```bash
sudo apt install strongswan-pki libtss2-tcti-tabrmd0
```

Then we make a questionable choice: we bypass the use of TPM2 in our secret management, since neither end of the mock setup used to create this document supports it. Therefore, at each end of the tunnel, we edit the `/etc/strongswan.conf` file and add the `exclude = tpm` line to the plugins section.
```bash=
# strongswan.conf - strongSwan configuration file
#
# Refer to the strongswan.conf(5) manpage for details
#
# Configuration changes should be made in the included files

charon {
        load_modular = yes
        plugins {
                include strongswan.d/charon/*.conf
                exclude = tpm # <-- ADD THIS TO BYPASS TPM
        }
}
```

If we were using modern hardware and/or a compliant cloud VPS service, we would rely on TPM2 support.

### Step 2: Create a private key and a self-signed certificate for each end

Here is a bash script, `gen-ecc-secrets.sh`, that generates a private key if it doesn't already exist, plus a self-signed certificate for the tunnel end, using the end name as the argument.

```bash=
#!/bin/bash

if [[ $(id -u) -ne 0 ]]; then
    echo "ERROR! Run this script as root."
    exit 1
fi

if [[ -z "$1" ]]; then
    echo "ERROR! Provide the tunnel end name as an argument."
    exit 1
fi

END_NAME="$1"

# Create directories if missing
mkdir -p /etc/swanctl/{private,x509} || exit 1

# Generate ECC private key using Ed25519
PRIVATE_KEY_FILE="/etc/swanctl/private/${END_NAME}-key.pem"
if [[ ! -f ${PRIVATE_KEY_FILE} ]]; then
    echo "Generating Ed25519 private key for ${END_NAME}..."
    if ! pki --gen --type ed25519 --outform pem > "${PRIVATE_KEY_FILE}"; then
        echo "ERROR! Failed to generate private key for ${END_NAME}."
        exit 1
    fi
    chmod 400 "${PRIVATE_KEY_FILE}"
fi

# Generate self-signed certificate (NOT a CA certificate)
cert_temp=$(mktemp) || exit 1
if ! pki --self --lifetime 3652 \
    --in "${PRIVATE_KEY_FILE}" \
    --type ed25519 \
    --dn "C=FR, O=inetdoc.net, CN=${END_NAME}" \
    --san "${END_NAME}.inetdoc.net" \
    --flag serverAuth \
    --flag clientAuth \
    --outform pem > "${cert_temp}"; then
    echo "ERROR! Failed to generate certificate for ${END_NAME}."
    rm -f "${cert_temp}"
    exit 1
fi
mv "${cert_temp}" "/etc/swanctl/x509/${END_NAME}-cert.pem"
chmod 600 "/etc/swanctl/x509/${END_NAME}-cert.pem"

echo "Successfully generated ECC secrets for ${END_NAME}."
exit 0
```

Once this script is copied to each end, all we need to do is run it with the tunnel end name as the argument.

- On the Lab Gateway system end

```bash
sudo bash gen-ecc-secrets.sh lab-gw
```

- On the Cloud VPS end

```bash
sudo bash gen-ecc-secrets.sh vps
```

### Step 3: Copy the self-signed certificate from one end to the other

After running the certificate creation script, each end of the tunnel has its own self-signed certificate. For mutual authentication, we need to copy the certificate from one end to the other.

- Copy **lab-gw** certificate to **vps**

  In this communication direction, we can transfer using SSH. Note the double quotes around the remote command: they are required so that the `CERT_FILE` variable is expanded locally before being sent to the remote end.

```bash
CERT_FILE="/etc/swanctl/x509/lab-gw-cert.pem"
sudo cat ${CERT_FILE} |\
  ssh vps "sudo tee ${CERT_FILE} > /dev/null &&\
  sudo chmod 600 ${CERT_FILE}"
```

- Copy **vps** certificate to **lab-gw**

  On the other tunnel end, NAT blocks direct communication. Therefore, we can only copy the contents of the certificate file and paste it into a new copy of the file.

  - On VPS host:

```bash
sudo cat /etc/swanctl/x509/vps-cert.pem
```

  - On Lab host:

```bash
sudo vim /etc/swanctl/x509/vps-cert.pem
```

When the copies are done, we will find these files on each host.

|                          | Cloud VPS | Lab gateway |
|:------------------------ |:--------- |:----------- |
| private key              | `/etc/swanctl/private/vps-key.pem` | `/etc/swanctl/private/lab-gw-key.pem` |
| self-signed certificates | `/etc/swanctl/x509/lab-gw-cert.pem` `/etc/swanctl/x509/vps-cert.pem` | `/etc/swanctl/x509/lab-gw-cert.pem` `/etc/swanctl/x509/vps-cert.pem` |

## Part 4: Create strongSwan configuration files on both ends

We are now ready to continue with the strongSwan configuration. Since we are using version 6.0.1 packages, our main configuration file is `/etc/swanctl/swanctl.conf`. Before we dive into the configuration, let's check the list of installed packages.
```bash
apt search --names-only '(strongswan|charon)' | grep install
```

```bash=
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
charon-systemd/testing,now 6.0.1-1 amd64 [installé]
libcharon-extauth-plugins/testing,now 6.0.1-1 amd64 [installé, automatique]
libcharon-extra-plugins/testing,now 6.0.1-1 amd64 [installé, automatique]
libstrongswan/testing,now 6.0.1-1 amd64 [installé, automatique]
libstrongswan-extra-plugins/testing,now 6.0.1-1 amd64 [installé]
libstrongswan-standard-plugins/testing,now 6.0.1-1 amd64 [installé, automatique]
strongswan-libcharon/testing,now 6.0.1-1 amd64 [installé, automatique]
strongswan-pki/testing,now 6.0.1-1 amd64 [installé]
strongswan-swanctl/testing,now 6.0.1-1 amd64 [installé]
```

### Step 1: Create the Lab Gateway configuration file

Here is a copy of the `/etc/swanctl/swanctl.conf` file on the **Lab Gateway** host:

```bash=
sudo cat /etc/swanctl/swanctl.conf
connections {
    vps-cloud {
        local_addrs  = XXX.XXX.XXX.XXX # <- YOUR LAB GATEWAY HOST ADDRESS HERE
        remote_addrs = AAA.BBB.CCC.DDD # <- YOUR CLOUD VPS HOST ADDRESS HERE
        local {
            auth = pubkey
            certs = /etc/swanctl/x509/lab-gw-cert.pem
            id = "C=FR, O=inetdoc.net, CN=lab-gw" # Matches certificate subject
        }
        remote {
            auth = pubkey
            certs = /etc/swanctl/x509/vps-cert.pem
            id = "C=FR, O=inetdoc.net, CN=vps" # Matches VPS certificate
        }
        children {
            beet-vpn-ipv4-self {
                local_ts  = 10.254.0.0/30 # XFRM subnet
                remote_ts = 10.254.0.0/30 # Allow XFRM-to-XFRM communication
                mode = beet
                if_id_in = 1
                if_id_out = 1
                esp_proposals = chacha20poly1305-sha384-curve25519
                set_mark_in = 0x1
                set_mark_out = 0x1
                start_action = start
                dpd_action = clear
            }
            beet-vpn-ipv4 {
                local_ts  = 192.168.1.0/24 # Home lab network
                remote_ts = 10.254.0.0/30  # XFRM subnet
                mode = beet
                if_id_in = 1
                if_id_out = 1
                esp_proposals = chacha20poly1305-sha384-curve25519
                set_mark_in = 0x1
                set_mark_out = 0x1
                start_action = start
                dpd_action = clear
            }
            beet-vpn-ipv6-self {
                local_ts  = fdc0:beef:2::/64
                remote_ts = fdc0:beef:1::/64
                mode = beet
                if_id_in = 1
                if_id_out = 1
                esp_proposals = chacha20poly1305-sha384-curve25519
                set_mark_in = 0x1
                set_mark_out = 0x1
                start_action = start
                dpd_action = clear
            }
        }
        version = 2
        mobike = no
        proposals = chacha20poly1305-sha384-curve25519
        encap = yes # Required for NAT traversal
        dpd_delay = 60s
    }
}

pools {
    ula-v6 {
        addrs = fdc0:beef:2::1/128
    }
}
```

Here is a descriptive list of the choices made to build the sections of this configuration file.

connections
: Defines VPN connection profiles. In this file, a single connection named `vps-cloud` is configured.
  - `local_addrs = XXX.XXX.XXX.XXX`
    Initiates connections from the Lab Gateway local IP address.
  - `remote_addrs = AAA.BBB.CCC.DDD`
    Specifies the remote VPN peer's IP address.

local block
: Settings for the local endpoint.
  - `auth = pubkey`
    Use public key (certificate) authentication.
  - `certs = /etc/swanctl/x509/lab-gw-cert.pem`
    Path to the local certificate file.
  - `id = "C=FR, O=inetdoc.net, CN=lab-gw"`
    Local identity, matching the certificate subject.

remote block
: Settings for the remote endpoint.
  - `auth = pubkey`
    Use public key (certificate) authentication.
  - `certs = /etc/swanctl/x509/vps-cert.pem`
    Path to the remote peer's certificate file.
  - `id = "C=FR, O=inetdoc.net, CN=vps"`
    Remote identity, matching the remote certificate.

children
: Defines individual IPsec SAs (sub-connections/tunnels) for different traffic selectors.
  - beet-vpn-ipv4-self
    - `local_ts = 10.254.0.0/30`
      Local traffic selector (subnet).
    - `remote_ts = 10.254.0.0/30`
      Remote traffic selector (same subnet, for XFRM-to-XFRM).
    - `mode = beet`
      Use BEET mode (Bound End-to-End Tunnel).
    - `if_id_in = 1`, `if_id_out = 1`
      Interface IDs for inbound and outbound traffic.
    - `esp_proposals = chacha20poly1305-sha384-curve25519`
      ESP encryption/authentication and DH group proposals.
    - `set_mark_in = 0x1`, `set_mark_out = 0x1`
      Set packet marks for inbound/outbound traffic.
    - `start_action = start`
      Automatically start this child SA.
    - `dpd_action = clear`
      Clear the SA if Dead Peer Detection (DPD) fails.
  - beet-vpn-ipv4
    - `local_ts = 192.168.1.0/24`
      Lab network.
    - `remote_ts = 10.254.0.0/30`
      Remote XFRM subnet.
    - Other parameters identical to above.
  - beet-vpn-ipv6-self
    - `local_ts = fdc0:beef:2::/64`
      Local IPv6 ULA subnet.
    - `remote_ts = fdc0:beef:1::/64`
      Remote IPv6 ULA subnet.
    - Other parameters identical to above.

Other connection-level settings
: Common parameters
  - `version = 2`
    Use the IKEv2 protocol.
  - `mobike = no`
    Disable MOBIKE (Mobility and Multihoming Protocol).
  - `proposals = chacha20poly1305-sha384-curve25519`
    IKE proposal (encryption, integrity, DH group).
  - `encap = yes`
    Enable UDP encapsulation for NAT traversal.
  - `dpd_delay = 60s`
    Dead Peer Detection interval.

pools
: Defines address pools for virtual IP assignment.
  - ula-v6
    - `addrs = fdc0:beef:2::1/128`
      Single IPv6 address to assign to clients.

This configuration establishes a secure IKEv2 VPN using certificate authentication, BEET mode tunnels for both IPv4 and IPv6, and strong cryptographic proposals (chacha20poly1305, sha384, curve25519), and it enables NAT traversal. It provides separate child SAs for different local/remote subnets and uses packet marking.

### Step 2: Create the Cloud VPS configuration file

This second configuration file is very similar to the previous one. All Traffic Selectors (TS) are swapped and the Cloud VPS address is the local one.
```bash=
connections {
    home-lab {
        local_addrs  = AAA.BBB.CCC.DDD # <-- YOUR CLOUD VPS ADDRESS HERE
        remote_addrs = %any
        local {
            auth = pubkey
            certs = /etc/swanctl/x509/vps-cert.pem
            id = "C=FR, O=inetdoc.net, CN=vps" # Matches certificate subject
        }
        remote {
            auth = pubkey
            certs = /etc/swanctl/x509/lab-gw-cert.pem
            id = "C=FR, O=inetdoc.net, CN=lab-gw" # Matches lab certificate
        }
        children {
            beet-vpn-ipv4-self {
                local_ts  = 10.254.0.0/30 # XFRM subnet
                remote_ts = 10.254.0.0/30 # Allow XFRM-to-XFRM communication
                mode = beet
                if_id_in = 1
                if_id_out = 1
                esp_proposals = chacha20poly1305-sha384-curve25519
                set_mark_in = 0x1
                set_mark_out = 0x1
                start_action = start
                dpd_action = clear
            }
            beet-vpn-ipv4 {
                local_ts  = 10.254.0.0/30  # XFRM subnet
                remote_ts = 192.168.1.0/24 # Home lab network
                mode = beet
                if_id_in = 1
                if_id_out = 1
                esp_proposals = chacha20poly1305-sha384-curve25519
                set_mark_in = 0x1
                set_mark_out = 0x1
                start_action = start
                dpd_action = clear
            }
            beet-vpn-ipv6-self {
                local_ts  = fdc0:beef:1::/64
                remote_ts = fdc0:beef:2::/64
                mode = beet
                if_id_in = 1
                if_id_out = 1
                esp_proposals = chacha20poly1305-sha384-curve25519
                set_mark_in = 0x1
                set_mark_out = 0x1
                start_action = start
                dpd_action = clear
            }
        }
        version = 2
        mobike = no
        proposals = chacha20poly1305-sha384-curve25519
        encap = yes # Enable NAT-T
        dpd_delay = 60s
    }
}

pools {
    ula-v6 {
        addrs = fdc0:beef:1::1/128
    }
}
```

## Part 5: Start the site-to-site VPN and verify that it is working

With the tunnel configuration files in place, we are ready to start the security associations and perform ICMP testing between the ends. NAT traversal forces us to start the services in a particular order:

1. Start on the cloud VPS first to listen for new security association attempts.

```bash
sudo systemctl restart strongswan
```

2. Second, start on the lab gateway to initiate outbound security associations through the ISP NAT box.
```bash
sudo systemctl restart strongswan
```

### Step 1: List the security associations

From the Lab Gateway side, we can list the security association states and attributes.

```bash
sudo swanctl --list-sas
```

```bash=
vps-cloud: #3, ESTABLISHED, IKEv2, 71dbff6ad586fdea_i 700b6f4f2c529344_r*
  local  'C=FR, O=inetdoc.net, CN=lab-gw' @ XXX.XXX.XXX.XXX[4500]
  remote 'C=FR, O=inetdoc.net, CN=vps' @ AAA.BBB.CCC.DDD[4500]
  CHACHA20_POLY1305/PRF_HMAC_SHA2_384/CURVE_25519
  established 348s ago, rekeying in 13876s
  beet-vpn-ipv4-self: #25, reqid 1, INSTALLED, TUNNEL-in-UDP, ESP:CHACHA20_POLY1305/CURVE_25519
    installed 1293s ago, rekeying in 2095s, expires in 2667s
    in  c118b489 (-|0x00000001),      0 bytes,     0 packets
    out c8dd9b29 (-|0x00000001),      0 bytes,     0 packets
    local  10.254.0.0/30
    remote 10.254.0.0/30
  beet-vpn-ipv6-self: #26, reqid 3, INSTALLED, TUNNEL-in-UDP, ESP:CHACHA20_POLY1305/CURVE_25519
    installed 957s ago, rekeying in 2374s, expires in 3003s
    in  cb14a284 (-|0x00000001),      0 bytes,     0 packets
    out c756533e (-|0x00000001),      0 bytes,     0 packets
    local  fdc0:beef:2::/64
    remote fdc0:beef:1::/64
  beet-vpn-ipv4: #27, reqid 2, INSTALLED, TUNNEL-in-UDP, ESP:CHACHA20_POLY1305/CURVE_25519
    installed 926s ago, rekeying in 2377s, expires in 3034s
    in  c90a2c29 (-|0x00000001),      0 bytes,     0 packets
    out cdc0b8a4 (-|0x00000001),      0 bytes,     0 packets
    local  192.168.1.0/24
    remote 10.254.0.0/30
```

The output of `sudo swanctl --list-sas` shows an active IKEv2 VPN tunnel with multiple child security associations (SAs). The three child SAs handle specific traffic selectors:

beet-vpn-ipv4-self
: Designates the IPv4 addresses assigned at each tunnel end.

beet-vpn-ipv6-self
: Corresponds to the IPv6 loopback addresses also assigned at each tunnel end.

beet-vpn-ipv4
: Allows IPv4 communication from the VPS to the local lab network behind the NAT ISP box.

:::success
Remember that SA addresses are used to discard or replace addresses in BEET mode encapsulation.
Therefore, it is important to check the SA traffic selectors to confirm that the VPN is working.
:::

### Step 2: List Xfrm interface policy

Another way to check the address pairing is to list the Xfrm interface policies. This provides more detailed information about the traffic rules.

```bash
sudo ip xfrm policy
```

```bash=
src fdc0:beef:1::/64 dst fdc0:beef:2::/64
	dir out priority 334463 ptype main
	tmpl src AAA.BBB.CCC.DDD dst XXX.XXX.XXX.XXX
		proto esp spi 0xcc56abfe reqid 2 mode tunnel
	if_id 0x1
src fdc0:beef:2::/64 dst fdc0:beef:1::/64
	dir fwd priority 334463 ptype main
	tmpl src XXX.XXX.XXX.XXX dst AAA.BBB.CCC.DDD
		proto esp reqid 2 mode tunnel
	if_id 0x1
src fdc0:beef:2::/64 dst fdc0:beef:1::/64
	dir in priority 334463 ptype main
	tmpl src XXX.XXX.XXX.XXX dst AAA.BBB.CCC.DDD
		proto esp reqid 2 mode tunnel
	if_id 0x1
src 10.254.0.0/30 dst 192.168.1.0/24
	dir out priority 372351 ptype main
	tmpl src AAA.BBB.CCC.DDD dst XXX.XXX.XXX.XXX
		proto esp spi 0xcd8be964 reqid 3 mode tunnel
	if_id 0x1
src 192.168.1.0/24 dst 10.254.0.0/30
	dir fwd priority 372351 ptype main
	tmpl src XXX.XXX.XXX.XXX dst AAA.BBB.CCC.DDD
		proto esp reqid 3 mode tunnel
	if_id 0x1
src 192.168.1.0/24 dst 10.254.0.0/30
	dir in priority 372351 ptype main
	tmpl src XXX.XXX.XXX.XXX dst AAA.BBB.CCC.DDD
		proto esp reqid 3 mode tunnel
	if_id 0x1
src 10.254.0.0/30 dst 10.254.0.0/30
	dir out priority 369279 ptype main
	tmpl src AAA.BBB.CCC.DDD dst XXX.XXX.XXX.XXX
		proto esp spi 0xc1443240 reqid 1 mode tunnel
	if_id 0x1
src 10.254.0.0/30 dst 10.254.0.0/30
	dir fwd priority 369279 ptype main
	tmpl src XXX.XXX.XXX.XXX dst AAA.BBB.CCC.DDD
		proto esp reqid 1 mode tunnel
	if_id 0x1
src 10.254.0.0/30 dst 10.254.0.0/30
	dir in priority 369279 ptype main
	tmpl src XXX.XXX.XXX.XXX dst AAA.BBB.CCC.DDD
		proto esp reqid 1 mode tunnel
	if_id 0x1
src ::/0 dst ::/0
	socket in priority 0 ptype main
src ::/0 dst ::/0
	socket out priority 0 ptype main
src ::/0 dst ::/0
	socket in priority 0 ptype main
src ::/0 dst ::/0
	socket out priority 0 ptype main
src 0.0.0.0/0 dst 0.0.0.0/0
	socket in priority 0 ptype main
src 0.0.0.0/0 dst 0.0.0.0/0
	socket out priority 0 ptype main
src 0.0.0.0/0 dst 0.0.0.0/0
	socket in priority 0 ptype main
src 0.0.0.0/0 dst 0.0.0.0/0
	socket out priority 0 ptype main
```

:::success
The command output lists all **Header Transformations**

- Inner IP headers (fixed per BEET SA) are replaced with outer headers on egress.
- Outer headers are stripped, and inner headers are reconstructed using SA-stored addresses on ingress.
:::

### Step 3: Run ICMP tests

To conclude this part, we will perform ICMP tests between the VPS and the Lab Gateway local network to confirm that the VPN setup is working.

1. Tunnel end addresses tests from VPS, after displaying the Xfrm interface addresses

```bash
ip a ls dev xfrm1
```

```bash=
3: xfrm1@lo: <NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/none
    inet 10.254.0.1/30 brd 10.254.0.3 scope global xfrm1
       valid_lft forever preferred_lft forever
    inet6 fdc0:beef:1::1/128 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::70b6:ade:d864:8ccf/64 scope link stable-privacy proto kernel_ll
       valid_lft forever preferred_lft forever
```

```bash
for addr in 10.254.0.2 fdc0:beef:2::1
do
  ping -c3 $addr
done
```

```bash=
PING 10.254.0.2 (10.254.0.2) 56(84) bytes of data.
64 bytes from 10.254.0.2: icmp_seq=1 ttl=64 time=14.6 ms
64 bytes from 10.254.0.2: icmp_seq=2 ttl=64 time=14.1 ms
64 bytes from 10.254.0.2: icmp_seq=3 ttl=64 time=13.6 ms

--- 10.254.0.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2004ms
rtt min/avg/max/mdev = 13.624/14.110/14.578/0.389 ms
PING fdc0:beef:2::1 (fdc0:beef:2::1) 56 data bytes
64 bytes from fdc0:beef:2::1: icmp_seq=1 ttl=64 time=13.3 ms
64 bytes from fdc0:beef:2::1: icmp_seq=2 ttl=64 time=13.4 ms
64 bytes from fdc0:beef:2::1: icmp_seq=3 ttl=64 time=13.5 ms

--- fdc0:beef:2::1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2004ms
rtt min/avg/max/mdev = 13.319/13.417/13.500/0.074 ms
```

2. Lab network addresses tests from VPS

```bash
for addr in 192.168.1.1 192.168.1.28
do
  ping -c3 $addr
done
```

```bash=
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=63 time=13.9 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=63 time=14.6 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=63 time=14.7 ms

--- 192.168.1.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 13.874/14.377/14.703/0.361 ms
PING 192.168.1.28 (192.168.1.28) 56(84) bytes of data.
64 bytes from 192.168.1.28: icmp_seq=1 ttl=63 time=13.7 ms
64 bytes from 192.168.1.28: icmp_seq=2 ttl=63 time=16.3 ms
64 bytes from 192.168.1.28: icmp_seq=3 ttl=63 time=13.8 ms

--- 192.168.1.28 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 13.745/14.618/16.344/1.220 ms
```

:::warning
The success of these last tests depends on NAT filtering rules. Therefore, we need to devote a dedicated part to this topic.
:::

## Part 6: Implement an nftables filtering ruleset at each tunnel end

Although we have verified that communication between the Xfrm tunnel end interfaces works, we need to make sure that outgoing traffic from these interfaces is translated to the tunnel end host addresses so that it can connect back and forth to neighboring hosts. Therefore, we will introduce the `nftables.conf` file with its source NAT rules for tunnel end addresses. We will also provide a complete filtering ruleset for each tunnel end, including IPSec traffic.

Let's start by verifying that the tools are installed and the nftables service is active.

```bash
sudo apt update
sudo apt install nftables
sudo systemctl enable --now nftables
systemctl status nftables
```

The last command output must show that the `nftables.service` is marked **active**. We are now ready to create or edit the two `/etc/nftables.conf` configuration files.

### Step 1: Create filtering rules on Lab Gateway side

Here is a copy of the Lab Gateway filtering ruleset file: `/etc/nftables.conf`.
```bash=
#!/usr/sbin/nft -f
flush ruleset

# Interface definitions
define GW_EXT = enp0s25
define XFRM_IF = xfrm1

table inet filter {
    chain input {
        type filter hook input priority filter; policy drop;
        ct state established,related accept
        iifname "lo" accept
        ip daddr 224.0.0.0/4 iifname $GW_EXT limit rate 2/second burst 5 packets counter packets 0 bytes 0 accept
        # ICMP
        meta l4proto { icmp, ipv6-icmp } limit rate 2/second burst 5 packets counter packets 0 bytes 0 accept
        # SSH
        meta l4proto tcp th dport { 22, 2222 } counter packets 0 bytes 0 ct state new accept
        # IPSec
        meta ipsec exists accept
        iifname $GW_EXT udp dport { 500, 4500 } counter packets 0 bytes 0 ct state new accept
    }

    chain forward {
        type filter hook forward priority filter; policy drop;
        ct state established,related accept
        # ICMP
        meta l4proto { icmp, ipv6-icmp } limit rate 2/second burst 5 packets counter packets 0 bytes 0 accept
        # Forward IPSec decrypted traffic
        oifname $XFRM_IF counter packets 0 bytes 0 ct state new accept
    }
}

table inet mangle {
    chain forward {
        type route hook output priority mangle; policy accept;
        # Clamp TCP MSS to path MTU for IPsec BEET traffic
        # oifname $XFRM_IF tcp flags syn tcp option maxseg size set rt mtu counter packets 0 bytes 0
        oifname $XFRM_IF tcp flags syn tcp option maxseg size set 1380 counter packets 0 bytes 0
    }
}

table inet nat {
    chain postrouting {
        type nat hook postrouting priority srcnat; policy accept;
        ip saddr 10.254.0.0/30 oifname $GW_EXT counter packets 0 bytes 0 masquerade
        ip6 saddr { fdc0:beef:1::/64, fdc0:beef:2::/64 } oifname $GW_EXT counter packets 0 bytes 0 masquerade
    }
}
```

The key points of these rules for the IPsec BEET VPN are:

Interface definitions
: Variables define the external gateway interface (`GW_EXT`) and the IPSec BEET tunnel interface (`XFRM_IF`), which are central to traffic routing. Note that the Xfrm framework on Linux provides a kernel-level infrastructure for transforming network packets, such as encrypting, authenticating, or compressing them.
It is primarily used to manage security associations and policies for secure communications. It is essential to have a dedicated interface to apply IPsec-specific filtering rules.

The filter table
: The default policy for the `input` and `forward` chains is to drop all traffic. Therefore, all allowed traffic must be explicitly defined by the chain ruleset.

  For the **`input`** chain, allowed traffic includes:
  * Established and related connections
  * Loopback and rate-limited ICMP/IPv6-ICMP
  * New SSH connections on ports 22/2222
  * IPSec traffic, matched via `meta ipsec exists` and the IKEv2 ports (500/4500)

  The `meta ipsec exists` acceptance rule allows only packets that have been processed by IPsec, i.e., decrypted or authenticated by the kernel's IPsec stack, to be accepted by the firewall.

  For the **`forward`** chain, we find the same rules for established and related connections, and for ICMP traffic. The most important rule in this chain is the one that accepts and counts new connection-tracked packets forwarded through the `$XFRM_IF` interface, which is typically used to forward freshly decrypted IPSec BEET VPN traffic.

The nat table
: Here we need to masquerade (source address translation) all IPv4 or IPv6 packets coming out of the VPN. This source address translation is done after the routing decision. Therefore, it is placed in the `postrouting` chain.

The mangle table
: Here we find a very critical rule for forwarding traffic from hosts that are unaware of the VPN's presence. When traffic enters the VPN, the encapsulation overhead affects the Maximum Transmission Unit (MTU) of the packets. Without a rule that alters the Maximum Segment Size (MSS) of new TCP connections according to the MTU, these connections will simply hang. This is known as MSS clamping. In the context of this BEET tunnel, the decision is made to limit the MSS to 1380 in order to seamlessly forward traffic for the lab hosts.
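As a sanity check on the 1380-byte clamp, the encapsulation overhead can be budgeted in a few lines of Bash. The per-field sizes below are typical values assumed for ChaCha20-Poly1305 ESP in UDP (NAT-T), not figures measured on this tunnel:

```bash
#!/bin/bash
# Rough per-packet overhead budget for BEET ESP-in-UDP (NAT-T).
# In BEET mode the inner IP header is not transmitted, so the on-wire
# packet is: outer IP + UDP + ESP header/IV + TCP segment + ESP trailer/ICV.
# Field sizes are illustrative assumptions, not measured on this tunnel.
LINK_MTU=1500
OUTER_IP=20    # outer IPv4 header (40 with an IPv6 outer header)
NATT_UDP=8     # UDP encapsulation header for NAT traversal (port 4500)
ESP_HDR=8      # SPI + sequence number
ESP_IV=8       # ChaCha20-Poly1305 nonce sent with each packet
ESP_TRAILER=4  # worst-case padding + pad length + next header
ESP_ICV=16     # Poly1305 integrity check value
TCP_HDR=20     # TCP header without options

OVERHEAD=$(( OUTER_IP + NATT_UDP + ESP_HDR + ESP_IV + ESP_TRAILER + ESP_ICV ))
MAX_MSS=$(( LINK_MTU - OVERHEAD - TCP_HDR ))
echo "Encapsulation overhead: ${OVERHEAD} bytes"
echo "Estimated maximum MSS : ${MAX_MSS} bytes"
echo "Clamp margin          : $(( MAX_MSS - 1380 )) bytes below the estimate"
```

Under these assumptions, the fixed 1380-byte clamp sits a few dozen bytes below the IPv4 estimate, which absorbs TCP options or the 20 extra bytes of an IPv6 outer header.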
:::warning
Note the presence of a commented rule that would achieve the same goal if dynamic path MTU discovery worked properly. Ideally, setting the MSS manually should not be necessary. To characterize the path MTU discovery failure, you can start an initial SSH connection attempt, which hangs. Then run the `tracepath` command against the same destination to determine the exact MTU value. A second SSH connection attempt will work immediately. Unfortunately, dynamic path MTU discovery is not reliable at this time, so I have decided to set a predetermined MSS value.
:::

### Step 2: Create filtering rules on the VPS side

Here is a copy of the symmetric VPS filtering ruleset file: `/etc/nftables.conf`.

```bash
#!/usr/sbin/nft -f

flush ruleset

# Interface definitions
define VPS_EXT = eth0
define XFRM_IF = xfrm1

table inet filter {
    chain input {
        type filter hook input priority filter; policy drop;
        ct state related,established accept
        iifname "lo" accept
        ip daddr 224.0.0.0/4 iifname $VPS_EXT limit rate 2/second burst 5 packets counter packets 0 bytes 0 accept
        # ICMP
        meta l4proto { icmp, ipv6-icmp } limit rate 2/second burst 5 packets counter packets 0 bytes 0 accept
        # SSH
        meta l4proto tcp th dport { 22, 2222 } counter packets 0 bytes 0 ct state new accept
        # IPSec
        meta ipsec exists accept
        iifname $VPS_EXT udp dport { 500, 4500 } counter packets 0 bytes 0 ct state new accept
    }
    chain forward {
        type filter hook forward priority filter; policy drop;
        ct state related,established accept
        # ICMP
        meta l4proto { icmp, ipv6-icmp } limit rate 2/second burst 5 packets counter packets 0 bytes 0 accept
        # IPSec
        oifname $XFRM_IF counter packets 0 bytes 0 accept
    }
}

table inet mangle {
    chain forward {
        type route hook output priority mangle; policy accept;
        # Clamp TCP MSS to path MTU for IPsec BEET traffic
        # oifname $XFRM_IF tcp flags syn tcp option maxseg size set rt mtu counter packets 0 bytes 0
        oifname $XFRM_IF tcp flags syn tcp option maxseg size set 1380 counter packets 0 bytes 0
    }
}

table inet nat {
    chain postrouting {
        type nat hook postrouting priority srcnat; policy accept;
        oifname $VPS_EXT ip saddr 10.254.0.0/30 counter packets 0 bytes 0 masquerade
        oifname $VPS_EXT ip6 saddr { fdc0:beef:1::/64, fdc0:beef:2::/64 } counter packets 0 bytes 0 masquerade
    }
}
```

The key points of this ruleset are identical to those of the previous step.

### Step 3: Apply filtering rules at each tunnel end

You can apply the `nftables.conf` file ruleset manually by running either of these two commands:

```bash
sudo nft -f /etc/nftables.conf
```

```bash
sudo systemctl restart nftables.service
```

The most interesting thing here is to watch the counters while traffic is flowing over the VPN. Let's take an example from the VPS end of the BEET tunnel. First, we read the counter value from the mangle table.

```bash
sudo nft list table inet mangle
```

```bash=
table inet mangle {
    chain forward {
        type route hook output priority mangle; policy accept;
        oifname "xfrm1" tcp flags syn tcp option maxseg size set 1380 counter packets 23 bytes 1420
    }
}
```

The packet counter in the listing above reads 23. Now make a new SSH connection to a lab host and close it.

```bash
ssh inetdoc0
```

Then read the same counter value and see that it has been incremented to 24.

```bash
sudo nft list table inet mangle
```

```bash=
table inet mangle {
    chain forward {
        type route hook output priority mangle; policy accept;
        oifname "xfrm1" tcp flags syn tcp option maxseg size set 1380 counter packets 24 bytes 1480
    }
}
```

Reading these counters confirms that the filtering rules match the expected traffic.

Just as we did before, we start a new SSH connection to a lab host after starting a traffic capture. On the lab gateway, we run:

```bash
tshark -i xfrm1
```

On the VPS, we run:

```bash
ssh inetdoc0
```

Then we read the capture results and get the MSS value.
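As an aside before looking at the capture: these counters lend themselves to scripting. The snippet below is a hypothetical helper that extracts the packet count from saved `nft list` output; the sample line is hard-coded here for illustration and mimics the mangle table listing shown above.

```bash
# Hypothetical counter extraction -- the sample line mimics one rule
# from the `nft list table inet mangle` output shown above.
sample='oifname "xfrm1" tcp flags syn tcp option maxseg size set 1380 counter packets 24 bytes 1480'
# Walk the fields and print the value that follows the "packets" keyword
pkts=$(printf '%s\n' "$sample" | awk '{for (i = 1; i <= NF; i++) if ($i == "packets") print $(i + 1)}')
echo "$pkts"
```

In practice you would feed the live output of `sudo nft list table inet mangle` into the same awk filter instead of a hard-coded sample line.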
```bash=
Capturing on 'xfrm1'
    1 0.000000000   10.254.0.1 → 192.168.1.200 TCP 60 56588 → 2222 [SYN] Seq=0 Win=64308 Len=0 MSS=1380 SACK_PERM TSval=2828489467 TSecr=0 WS=128
    2 0.000603847 192.168.1.200 → 10.254.0.1 TCP 60 2222 → 56588 [SYN, ACK] Seq=0 Ack=1 Win=65160 Len=0 MSS=1460 SACK_PERM TSval=2646856229 TSecr=2828489467 WS=128
    3 0.014737487   10.254.0.1 → 192.168.1.200 TCP 52 56588 → 2222 [ACK] Seq=1 Ack=1 Win=64384 Len=0 TSval=2828489482 TSecr=2646856229
```

The captured MSS value is 1380.

## Part 7: Improve site-to-site VPN reliability

The script presented here would be of no use over a perfect transmission channel. However, real-world networks are subject to a number of hiccups that can break IPsec security associations, even though we have added Dead Peer Detection (DPD) mechanisms to the strongSwan configuration files. The following script has been very useful when running a VPN over a lossy outdoor Wi-Fi point-to-point connection.

### Step 1: Create the VPN monitoring script

Here is the code of the `/usr/local/bin/vpn-monitor.sh` script:

```bash=
#!/bin/bash
# VPN tunnel monitoring script v1.0

# Configuration
TUNNEL_PEER="10.254.0.1"
LOG_FILE="/var/log/vpn_monitor.log"
LOCK_FILE="/tmp/vpn_monitor.lock"
PING_COUNT=3
PING_TIMEOUT=2
MAX_PACKET_LOSS=80          # Adjusted to 80% for better stability
SERVICE_RESTART_DELAY=15    # Seconds to wait after restart

# Logging function with rotation control
log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "${LOG_FILE}"
    # Rotate logs if over 10MB
    if [ -f "${LOG_FILE}" ] && [ "$(stat -c%s "${LOG_FILE}")" -gt 10485760 ]; then
        mv "${LOG_FILE}" "${LOG_FILE}.old"
    fi
}

# Check for existing lock file
if [ -f "${LOCK_FILE}" ]; then
    log_message "ERROR: Script already running. Exiting."
    exit 1
fi
trap 'rm -f "${LOCK_FILE}"' EXIT
touch "${LOCK_FILE}"

# Check permissions
if [ ! -w "${LOG_FILE}" ] && [ ! -w "$(dirname "${LOG_FILE}")" ]; then
    log_message "ERROR: Insufficient write permissions"
    exit 1
fi

# Use temp file for ping output
PING_TMP=$(mktemp)
ping -c "$PING_COUNT" -W "$PING_TIMEOUT" "$TUNNEL_PEER" > "$PING_TMP" 2>&1
ping_exit=$?

# Parse packet loss, defaulting to 100% when no summary line is found
packet_loss=$(awk -F'[ %]' '/packet loss/{print $6}' "$PING_TMP")
packet_loss=${packet_loss:-100}
rm "$PING_TMP"

# Decision logic
if [[ $ping_exit -ne 0 ]] || [[ $packet_loss -ge $MAX_PACKET_LOSS ]]; then
    log_message "CRITICAL: Tunnel degraded (Exit:$ping_exit Loss:${packet_loss}%)"
    # Check if service is already stopped
    if ! systemctl is-active --quiet strongswan; then
        log_message "Service not running. Starting..."
        systemctl start strongswan
    else
        log_message "Full service restart required"
        systemctl restart strongswan
    fi
    log_message "Waiting ${SERVICE_RESTART_DELAY}s for stabilization"
    sleep "$SERVICE_RESTART_DELAY"
else
    log_message "OK: Tunnel stable (${packet_loss}% loss)"
fi

exit 0
```

This script must be executable.

```bash
sudo chmod a+x /usr/local/bin/vpn-monitor.sh
```

### Step 2: Create the VPN monitoring systemd timer

Here we create a systemd timer to run the script every 5 minutes.

```bash=
cat << EOF | sudo tee /etc/systemd/system/vpn-monitor.service
[Unit]
Description=VPN Tunnel Monitor
After=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/vpn-monitor.sh
EOF

cat << EOF | sudo tee /etc/systemd/system/vpn-monitor.timer
[Unit]
Description=Run VPN monitor every 5 minutes

[Timer]
OnBootSec=5min
OnUnitActiveSec=5min
AccuracySec=1min

[Install]
WantedBy=timers.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now vpn-monitor.timer
```

### Step 3: Read the VPN monitoring logs

Once the timer is active, all we need to do is read the VPN monitoring logs. Here is a snippet showing that the strongSwan service has been restarted due to the loss of ICMP ping packets.
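A quick aside before reading the logs: the script's health decision rests on an awk extraction of ping's summary line, which can be verified on its own. The sample line below is hard-coded for illustration and mimics the summary printed by iputils ping:

```bash
# Standalone check of the packet-loss extraction used by vpn-monitor.sh.
# The sample line mimics the summary printed by iputils ping.
sample='3 packets transmitted, 0 received, 100% packet loss, time 2043ms'
# Split on spaces and '%' so that field 6 holds the loss percentage
loss=$(printf '%s\n' "$sample" | awk -F'[ %]' '/packet loss/{print $6}')
echo "$loss"
```

If your ping implementation formats its summary differently, adjust the field index accordingly before trusting the monitor's decisions.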
```bash=
2025-05-03 08:25:44 - OK: Tunnel stable (0% loss)
2025-05-03 08:31:14 - OK: Tunnel stable (0% loss)
2025-05-03 08:36:16 - CRITICAL: Tunnel degraded (Exit:1 Loss:100%)
2025-05-03 08:36:16 - Full service restart required
2025-05-03 08:36:17 - Waiting 15s for stabilization
2025-05-03 08:41:19 - OK: Tunnel stable (0% loss)
2025-05-03 08:46:24 - OK: Tunnel stable (0% loss)
```

## Conclusion

In this guide, we have demonstrated how to build a secure, efficient, and resilient site-to-site VPN using strongSwan's IPsec BEET mode between a cloud VPS and a home lab gateway. By leveraging BEET mode, Xfrm interfaces, modern cryptographic standards (Ed25519/X25519), and robust nftables rules, the solution achieves minimal overhead, strong security, and reliable NAT traversal. The steps covered, from configuration and authentication to firewalling and automated monitoring, ensure both IPv4 and IPv6 connectivity and ongoing tunnel health. This approach provides a flexible, self-hosted architecture that meets today's network requirements while balancing cost, security, and ease of use.

For a more complete and detailed presentation of nftables and VPN/IPsec packet flow, the following two resources are highly recommended:

- [Nftables - Netfilter and VPN/IPsec packet flow](https://thermalcircle.de/doku.php?id=blog:linux:nftables_ipsec_packet_flow)
- [Nftables - Demystifying IPsec expressions](https://thermalcircle.de/doku.php?id=blog%3Alinux%3Anftables_demystifying_ipsec_expressions)