AWS Official VPN Troubleshooting Guide

This section summarizes the troubleshooting procedure from the AWS Site-to-Site VPN documentation. Work through the steps in order, because each layer depends on the one before it.

AWS Structured Troubleshooting Framework

AWS provides two main troubleshooting flowcharts based on your routing configuration:

For BGP-Enabled Devices (Dynamic Routing)

Troubleshooting Flow: IKE → IPsec → Tunnel → BGP

For Non-BGP Devices (Static Routing)

Troubleshooting Flow: IKE → IPsec → Tunnel → Static Routes

Step-by-Step AWS Troubleshooting Process

Step 1: IKE (Internet Key Exchange) Verification

Purpose: An IKE security association (SA) is required to exchange the keys used to establish the IPsec SA.

Verification Steps:

  1. Check if IKE security association exists
  2. Review IKE configuration settings
  3. Verify encryption, authentication, PFS, and mode parameters match AWS config file

Diagnostic Commands:

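# For strongSwan/Libreswan:
sudo ipsec status
# strongSwan only (more detail):
sudo ipsec statusall

# For Cisco devices (use the ikev2 form if the tunnel runs IKEv2):
show crypto isakmp sa
show crypto ikev2 sa
show crypto ipsec sa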

# For Juniper devices:
show security ike security-associations
show security ipsec security-associations

If IKE SA doesn’t exist: Review your IKE configuration settings and ensure all parameters match the AWS-provided configuration file.
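
On a strongSwan gateway, the check for an existing IKE SA can be scripted. The following is a minimal sketch, assuming strongSwan's `ipsec status` output, which marks an active IKE SA with the word ESTABLISHED. The function name `count_ike_established` is illustrative, and other implementations format their status output differently.

```shell
# Count IKE SAs that strongSwan reports as ESTABLISHED.
# Reads `ipsec status` output on stdin so it can be exercised offline.
# Assumption: strongSwan-style output; Libreswan, Cisco and Juniper differ.
count_ike_established() {
  # grep -c prints 0 and exits non-zero when nothing matches; keep the 0.
  grep -c 'ESTABLISHED' || true
}
```

Typical use would be `sudo ipsec status | count_ike_established`. A result of 0 suggests that no IKE SA is up and the IKE parameters need review.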

Step 2: IPsec Security Association Verification

Purpose: IPsec SA is the actual tunnel that carries encrypted traffic.

Verification Steps:

  1. Query customer gateway device for active IPsec SA
  2. Ensure encryption, authentication, PFS, and mode parameters match AWS config
  3. Review IPsec configuration if no SA exists

Diagnostic Commands:

# For Libreswan:
sudo ipsec trafficstatus
sudo ipsec whack --status
# For strongSwan:
sudo ipsec statusall

# For Cisco devices:
show crypto ipsec sa
show crypto ipsec transform-set

# For Juniper devices:
show security ipsec security-associations detail
show security ipsec statistics

Step 3: Tunnel Connectivity Verification

AWS Required Firewall Rules:

Inbound Rules (from internet):

Rule  Source IP           Dest IP           Protocol     Port
I1    Tunnel1 Outside IP  Customer Gateway  UDP          500
I2    Tunnel2 Outside IP  Customer Gateway  UDP          500
I3    Tunnel1 Outside IP  Customer Gateway  IP 50 (ESP)  -
I4    Tunnel2 Outside IP  Customer Gateway  IP 50 (ESP)  -

Outbound Rules (to internet):

Rule  Source IP         Dest IP             Protocol     Port
O1    Customer Gateway  Tunnel1 Outside IP  UDP          500
O2    Customer Gateway  Tunnel2 Outside IP  UDP          500
O3    Customer Gateway  Tunnel1 Outside IP  IP 50 (ESP)  -
O4    Customer Gateway  Tunnel2 Outside IP  IP 50 (ESP)  -

NAT Traversal: If using NAT-T, also allow UDP traffic on port 4500.
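
On a Linux customer gateway, the rules above could be expressed with iptables. The sketch below only prints the commands so they can be reviewed before use. It assumes a gateway that filters with iptables on the INPUT and OUTPUT chains, and the function name `print_vpn_firewall_rules` is hypothetical.

```shell
# Print iptables commands that admit IKE (UDP 500), NAT-T (UDP 4500) and
# ESP (IP protocol 50) between this gateway and each AWS tunnel outside IP.
# Commands are printed, not executed, so they can be reviewed first.
# Arguments: one or more tunnel outside IPs from the AWS configuration file.
print_vpn_firewall_rules() {
  local ip
  for ip in "$@"; do
    echo "iptables -A INPUT -s $ip -p udp --dport 500 -j ACCEPT"
    echo "iptables -A INPUT -s $ip -p udp --dport 4500 -j ACCEPT"
    echo "iptables -A INPUT -s $ip -p esp -j ACCEPT"
    echo "iptables -A OUTPUT -d $ip -p udp --dport 500 -j ACCEPT"
    echo "iptables -A OUTPUT -d $ip -p udp --dport 4500 -j ACCEPT"
    echo "iptables -A OUTPUT -d $ip -p esp -j ACCEPT"
  done
}
```

For example, `print_vpn_firewall_rules 203.0.113.10 203.0.113.20` prints twelve rules for two placeholder tunnel addresses. The UDP 4500 rules matter only when NAT-T is in use.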

IP Connectivity Test:

# Ping virtual private gateway address from customer gateway
ping <AWS_TUNNEL_INTERFACE_IP>

# If ping fails, review tunnel interface configuration
# Verify correct IP addresses are configured

Step 4A: BGP Verification (For Dynamic Routing)

Requirements:

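  • BGP status should be “Established”; the AWS flowchart also accepts “Active”, but a peer that stays in Active is still trying to connect and exchanges no routes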
  • Allow approximately 30 seconds for BGP peering to become active
  • Customer gateway must advertise default route (0.0.0.0/0)
  • Verify both tunnels are in established state
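
When scripting this check against FRRouting, the State/PfxRcd column of `show bgp summary` is the useful field: it shows a prefix count once the session is Established and a state name otherwise. A small sketch under that assumption follows; the helper name `bgp_peer_established` is illustrative.

```shell
# Interpret one State/PfxRcd value from FRRouting's `show bgp summary`.
# A number means the session is Established (it is the count of prefixes
# received); a word such as Active, Connect or Idle means it is not up.
bgp_peer_established() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: not established
    *)           return 0 ;;   # numeric prefix count: established
  esac
}
```

Called as `bgp_peer_established Active`, the helper fails, which is a reminder that an Active peer has not yet come up.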

BGP Diagnostic Commands:

# For FRRouting:
sudo vtysh -c "show bgp summary"
sudo vtysh -c "show ip route bgp"
sudo vtysh -c "show bgp neighbors"

# For Cisco devices:
show ip bgp summary
show ip bgp neighbors
show ip route bgp

# For Juniper devices:
show bgp summary
show route protocol bgp
show bgp neighbor

Step 4B: Static Routes Verification (For Static Routing)

Customer Gateway Side:

  • Add static route to VPC CIDR with tunnels as next hop
  • Verify route table entries point to correct tunnel interfaces

AWS Side:

  • Add static route in VPC console
  • Configure virtual private gateway to route traffic to internal networks
  • Verify route propagation is enabled

Both Tunnels: Ensure both tunnels have proper static routes configured.
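
The static-route steps on both sides might be scripted as follows. This is a sketch, assuming a Linux customer gateway with a route-based tunnel interface and a VPN connection created with static routing. The function names, the `DRY_RUN` switch, and all argument values are placeholders.

```shell
# Echo each command; execute it only when DRY_RUN is unset or 0.
run() {
  echo "+ $*"
  [ "${DRY_RUN:-0}" != "0" ] || "$@"
}

# Add matching static routes on the AWS side and the customer gateway side.
# Arguments: VPN connection ID, on-premises CIDR, VPC CIDR, tunnel interface.
add_static_routes() {
  local vpn_id=$1 onprem_cidr=$2 vpc_cidr=$3 tunnel_if=$4
  # AWS side: tell the VPN connection which on-premises CIDR to route.
  run aws ec2 create-vpn-connection-route \
    --vpn-connection-id "$vpn_id" --destination-cidr-block "$onprem_cidr"
  # Customer gateway side: send VPC-bound traffic into the tunnel interface.
  run sudo ip route add "$vpc_cidr" dev "$tunnel_if"
}

# Enable route propagation from the virtual private gateway to a route table.
enable_propagation() {
  run aws ec2 enable-vgw-route-propagation \
    --route-table-id "$1" --gateway-id "$2"
}
```

Setting `DRY_RUN=1` before calling either function prints the commands without changing anything, which is a reasonable first pass on a production gateway.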

AWS CloudWatch Monitoring

Available VPN Metrics

Namespace: AWS/VPN

Key Metrics:

  • TunnelState: Tunnel status; 1 means UP and 0 means DOWN
  • TunnelDataIn: Bytes received on the AWS side of the connection
  • TunnelDataOut: Bytes sent from the AWS side of the connection

Monitoring Setup:

  1. Open CloudWatch Console
  2. Navigate to Metrics → VPN
  3. Select “VPN Tunnel Metrics”
  4. Monitor tunnel status and performance

CLI Monitoring:

# List all VPN metrics
aws cloudwatch list-metrics --namespace "AWS/VPN"

# Get tunnel state metrics
aws cloudwatch get-metric-statistics \
  --namespace AWS/VPN \
  --metric-name TunnelState \
  --dimensions Name=VpnId,Value=vpn-xxxxxxxxx \
  --start-time 2024-01-01T00:00:00Z \
  --end-time 2024-01-01T23:59:59Z \
  --period 300 \
  --statistics Average
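
The averaged TunnelState values returned by the query above need interpreting: 1 means the tunnel was up for the whole period, 0 means it was down, and a value in between means it was down for part of the period or that one of the two tunnels was down. A minimal sketch of that mapping follows; the function name and the DEGRADED label are illustrative choices.

```shell
# Map an averaged TunnelState datapoint to a label.
# 1 = up for the whole period, 0 = down, anything between = partly down.
tunnel_state_label() {
  awk -v v="$1" 'BEGIN {
    v = v + 0   # force numeric comparison
    if (v >= 1) { print "UP" } else if (v <= 0) { print "DOWN" } else { print "DEGRADED" }
  }'
}
```

Appending `--query 'Datapoints[].Average' --output text` to the `get-metric-statistics` command yields raw values that can be passed to the function one at a time.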

Common VPN Error Scenarios & AWS Solutions

Error 1: Tunnel Status DOWN

AWS Diagnosis Process:

  1. Check IKE SA: Verify pre-shared keys, IP addresses, IKE parameters
  2. Check IPsec SA: Verify IPsec parameters, encryption settings
  3. Check Connectivity: Ping tunnel interface IPs, verify firewall rules
  4. Check Routing: Verify BGP status or static route configuration
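
The four checks above are ordered, so a script can stop at the first layer that fails and point the operator at it. The following sketch shows one way to do that. The function name is hypothetical, and the check commands passed in are whatever suits the device at hand.

```shell
# Run checks in the order given (IKE, IPsec, Tunnel, Routing) and print the
# first layer whose check fails, or "none" if every check passes.
# Arguments come in pairs: LAYER_NAME 'check command' (both caller-supplied).
first_failing_layer() {
  while [ "$#" -ge 2 ]; do
    if ! eval "$2" >/dev/null 2>&1; then
      echo "$1"
      return 1
    fi
    shift 2
  done
  echo "none"
}
```

A call might pair IKE with `'sudo ipsec status | grep -q ESTABLISHED'` on strongSwan and Tunnel with a single ping to the AWS tunnel interface address. The check strings are passed to `eval`, so they should come from the script itself, never from untrusted input.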

Error 2: Authentication Failures

Common Causes & Fixes:

Pre-shared Key Mismatch:

  • Verify PSK in customer gateway matches AWS config exactly
  • Check for extra spaces or special characters
  • Ensure proper file permissions (600) for secrets file
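
Two of these checks are easy to automate. The sketch below assumes a Linux gateway with GNU `stat` and a secrets file such as `/etc/ipsec.secrets`, the conventional path for strongSwan and Libreswan. Both function names are illustrative.

```shell
# Succeed only if FILE has mode 600 (read/write for the owner, nothing else).
# Assumes GNU coreutils `stat` (Linux); BSD/macOS stat uses different flags.
secrets_perms_ok() {
  [ "$(stat -c '%a' "$1" 2>/dev/null)" = "600" ]
}

# Succeed if the key has leading or trailing whitespace, a common
# copy-and-paste cause of pre-shared key mismatches.
psk_has_stray_whitespace() {
  case "$1" in
    [[:space:]]*|*[[:space:]]) return 0 ;;
    *)                         return 1 ;;
  esac
}
```

For instance, `secrets_perms_ok /etc/ipsec.secrets || echo "fix permissions"` flags a file that other users can read.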

IP Address Mismatch:

  • Verify customer gateway public IP matches AWS config
  • Check for NAT/firewall IP translation issues
  • Confirm tunnel endpoint IPs are correct

Error 3: Intermittent Connectivity

Troubleshooting Steps:

DPD (Dead Peer Detection) Issues:

  • Verify DPD settings match AWS recommendations
  • Check DPD timeout and retry values
  • Monitor for DPD-related log messages

MTU Issues:

  • Test with different MTU sizes (try 1436 bytes)
  • Configure MSS clamping if needed
  • Check for fragmentation issues
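
MSS clamping follows directly from the tunnel MTU: for IPv4 without TCP options, the MSS is the MTU minus 40 bytes (20 for the IP header, 20 for the TCP header). The sketch below computes that value and prints an iptables clamp rule for a Linux gateway. The function names are hypothetical, and the rule is printed rather than applied.

```shell
# TCP MSS that fits inside a given IPv4 MTU:
# MTU minus 20 bytes of IPv4 header and 20 bytes of TCP header (no options).
mss_for_mtu() {
  echo $(( $1 - 40 ))
}

# Print (not run) an iptables rule that clamps the MSS of forwarded SYNs
# to the value that fits the given tunnel MTU.
print_mss_clamp_rule() {
  echo "iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss $(mss_for_mtu "$1")"
}
```

With the 1436-byte MTU suggested above, `mss_for_mtu 1436` gives 1396.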

Network Instability:

  • Monitor for packet loss
  • Check ISP connectivity stability
  • Verify redundant tunnel configuration

Error 4: BGP Routing Issues

BGP Session Not Establishing:

  • Verify BGP neighbor configuration
  • Check AS numbers (customer vs AWS)
  • Confirm BGP authentication settings

Route Advertisement Problems:

  • Verify default route (0.0.0.0/0) advertisement
  • Check route filtering and policies
  • Monitor BGP route table updates

End-to-End Connectivity Tests

# From customer network to AWS VPC
ping <AWS_PRIVATE_IP>
traceroute <AWS_PRIVATE_IP>

# From AWS VPC to customer network
ping <CUSTOMER_PRIVATE_IP>
traceroute <CUSTOMER_PRIVATE_IP>

# Tunnel interface connectivity
ping <AWS_TUNNEL_INTERFACE_IP>
ping <CUSTOMER_TUNNEL_INTERFACE_IP>

Performance Testing

# Bandwidth testing
iperf3 -c <REMOTE_IP> -t 60

# Latency testing
ping -c 100 <REMOTE_IP>

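# MTU discovery (Linux ping): -M do forbids fragmentation;
# 1472 bytes of payload + 28 bytes of headers = a 1500-byte packet.
# Lower -s until the ping succeeds to find the usable path MTU.
ping -M do -s 1472 <REMOTE_IP>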
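
Rather than lowering the payload size by hand, the MTU discovery step can be automated with a binary search. This is a sketch, assuming Linux iputils `ping` and that any payload smaller than a working one also works. `REMOTE_IP` is a variable the caller sets, and the function names are illustrative.

```shell
# probe SIZE: succeed if one unfragmented ping with SIZE payload bytes gets
# a reply. Assumes Linux iputils ping; REMOTE_IP is set by the caller.
probe() {
  ping -M do -c 1 -W 1 -s "$1" "$REMOTE_IP" >/dev/null 2>&1
}

# Binary-search the largest payload in [LO, HI] for which `probe` succeeds.
# Prints that payload size; returns 1 and prints nothing if even LO fails.
max_payload() {
  local lo=$1 hi=$2 mid
  probe "$lo" || return 1
  while [ "$lo" -lt "$hi" ]; do
    mid=$(( (lo + hi + 1) / 2 ))
    if probe "$mid"; then
      lo=$mid
    else
      hi=$(( mid - 1 ))
    fi
  done
  echo "$lo"
}
```

After setting `REMOTE_IP`, a call such as `max_payload 1200 1472` prints the largest working payload. Adding 28 bytes for the IP and ICMP headers gives the path MTU.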

Advanced Troubleshooting Techniques

Packet Capture Analysis

Capture Points:

  • Customer Gateway: Capture on external interface (encrypted traffic)
  • Internal Interface: Capture decrypted traffic
  • AWS Side: Use VPC Flow Logs

Analysis Commands:

# Wireshark filters for VPN traffic
udp.port == 500 or udp.port == 4500 or ip.proto == 50

# tcpdump for VPN traffic
sudo tcpdump -i any -n 'udp port 500 or udp port 4500 or proto 50'

# Libreswan debugging (whack is not part of current strongSwan releases)
sudo ipsec whack --debug-all

AWS Best Practices for VPN Troubleshooting

Proactive Monitoring Setup

CloudWatch Alarms:

  • Tunnel state monitoring
  • Tunnel traffic alerts (TunnelDataIn / TunnelDataOut)
  • BGP session status alerts
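
A tunnel-state alarm could be created with the AWS CLI along these lines. The sketch prints the command for review instead of running it. The alarm name, the 5-minute period, and the SNS topic ARN argument are assumptions to adapt, and `print_tunnel_alarm_cmd` is a hypothetical helper.

```shell
# Print an `aws cloudwatch put-metric-alarm` command that fires when the
# minimum TunnelState over a 5-minute period drops below 1, meaning at least
# one tunnel of the VPN connection was not up at some point in the period.
# Arguments: VPN connection ID, SNS topic ARN to notify.
print_tunnel_alarm_cmd() {
  local vpn_id=$1 sns_topic_arn=$2
  echo "aws cloudwatch put-metric-alarm" \
    "--alarm-name vpn-tunnel-down-$vpn_id" \
    "--namespace AWS/VPN --metric-name TunnelState" \
    "--dimensions Name=VpnId,Value=$vpn_id" \
    "--statistic Minimum --period 300 --evaluation-periods 1" \
    "--threshold 1 --comparison-operator LessThanThreshold" \
    "--alarm-actions $sns_topic_arn"
}
```

The printed command can be pasted into a shell once the identifiers have been checked.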

Logging Configuration:

  • Enable VPN connection logs
  • Configure CloudWatch log groups
  • Set up log analysis and alerting

Health Checks:

  • Implement automated connectivity tests
  • Monitor both tunnels to confirm that redundancy is intact
  • Set up failover testing procedures

Documentation Requirements

Maintain Records Of:

  • Customer gateway device configuration
  • Network topology and IP addressing
  • Firewall rules and security policies
  • Change management procedures
  • Incident response procedures

Device-Specific AWS Guides

AWS provides official troubleshooting guides for:

  • Cisco ASA: Comprehensive ASA-specific troubleshooting
  • Cisco IOS: With and without BGP configurations
  • Juniper JunOS: Complete JunOS troubleshooting procedures
  • Juniper ScreenOS: Legacy ScreenOS support
  • Yamaha: Router-specific configuration guidance
  • Generic Devices: Openswan, Libreswan, strongSwan support

AWS Official Resources

Community Support

  • AWS re:Post Forum: Amazon VPC category for community discussions
  • AWS Support: Technical support cases and premium consultation
  • AWS Architecture Review: Professional architecture guidance

Summary

This AWS official troubleshooting guide provides:

  • Systematic troubleshooting approach following AWS best practices
  • Step-by-step verification process from IKE to routing layers
  • Comprehensive error scenarios with official solutions
  • CloudWatch monitoring integration for proactive management
  • Device-specific guidance for major network equipment vendors
  • Advanced troubleshooting techniques for complex scenarios

Following these procedures in order helps keep Site-to-Site VPN connectivity between your on-premises network and your AWS VPC reliable.

💡 Pro Tip: Always follow the systematic approach (IKE → IPsec → Tunnel → Routing) and use AWS CloudWatch monitoring for proactive VPN management.