9 January 2023

Forti SD-WAN Hub and Spoke: Part Two

By admin@labtinker.net

Picking up from the previous post, we are now going to test the resilience of our Hub and Spoke SD-WAN topology using the tests described in the Fortigate design reference guide below:

https://docs.fortinet.com/document/fortigate/7.0.0/sd-wan-self-healing-with-bgp/559415/overview

The following diagram is from the 'Testing and Verification' section of the above post, which I'm essentially following (though I'm using port1 and port2, not port2 and port3). Having cited my source, and given that this is publicly available, I'm sure Fortinet will be OK with me reproducing the diagram here:

On the DC PC, I start a continuous ping to the B1 laptop.
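
For reference, a continuous ping from a Windows PC looks like the command below (on Linux, ping runs continuously by default); the 192.168.136.10 address of the B1 laptop is from my lab:

ping -t 192.168.136.10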

Then, checking the status of the HUB health check on the B1 Forti, both members are alive. The B2 Forti health checks are also both alive: I've illustrated the former with GUI output and the latter with CLI output.

B1 Health Check

B2 Health Check
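
For reference, the CLI check on the B2 Forti was along the lines of the command below; the health check is named 'HUB' in my lab, following the Fortinet guide:

diagnose sys sdwan health-check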

BGP Route Check

On the Hub, checking the BGP routes learnt for the B1 prefix 192.168.136.0/24, we can see it has been learnt from two different BGP peers, corresponding to the two tunnels on B1, and each route has a different route tag.
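
If you prefer the CLI, the per-prefix BGP detail on the Hub can be pulled with something like the commands below (the exact fields displayed will vary with FortiOS version):

get router info bgp network 192.168.136.0/24
get router info routing-table bgp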

The BGP Neighbor display on B1 confirms the local address assigned to the tunnels on B1:
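
The CLI equivalents on B1 are along these lines, with the neighbors command giving the fuller per-peer detail:

get router info bgp summary
get router info bgp neighbors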

Checking the SD-WAN service rules on the HUB:

We can see that Rule 10 matches traffic going to prefixes carrying BGP 'route-tag 1' and steers it down VPN1. This rule sits higher in the rule order than the rule for 'route-tag 2', so it is favoured.
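
For context, the hub-side rules follow something like the sketch below; the rule IDs, names and member indices here are from my lab rather than anything definitive, and 'diagnose sys sdwan service' is the CLI way to see which member each rule has actually selected:

config system sdwan
    config service
        edit 10
            set name "route-tag-1"
            set route-tag 1
            set priority-members 1
        next
        edit 11
            set name "route-tag-2"
            set route-tag 2
            set priority-members 2
        next
    end
end

diagnose sys sdwan service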

According to the Forti post, you can check the sessions to confirm the favoured tunnel name using the commands below:

diag sys session filter dst 192.168.136.10

diag sys session list

However, the tunnel name showed as blank when I did this, though the gateway was displayed:

So now I shut down VPN1 by the expedient of disabling the port1 interface that it hangs off on B1:
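
For reference, disabling the interface from the CLI looks like this:

config system interface
    edit port1
        set status down
    next
end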

The continuous ping failed for probably over a minute before the failover kicked in, so some tuning would be required for a production deployment, but fail over it did:
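
I haven't tuned this yet, but the obvious candidates are the SD-WAN health-check timers (and possibly the BGP timers). A sketch of tightening the health check is below; the 'HUB' name and the values are illustrative assumptions rather than tested recommendations. The interval is in milliseconds, failtime is the number of missed probes before a member is marked dead, and recoverytime the number of successful probes before it is marked alive again:

config system sdwan
    config health-check
        edit "HUB"
            set interval 500
            set failtime 3
            set recoverytime 5
        next
    end
end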

Health Check with B1 port1 interface down

On B1, the health check for VPN1 is, not unexpectedly, dead.

On B2, the health check for VPN1 is unaffected, as you might expect:

BGP Route Check with B1 port1 Interface down

Now, on the Hub, we are only seeing the route learnt via the second tunnel:

…so, as the traffic is being tagged with 'route-tag 2', it will use rule 11 in our SD-WAN rules.

On disabling the second interface (port2), so that both interfaces on B1 are down, the Forti post suggested the traffic would hit the last SD-WAN rule on the Hub. It seems moot, as the traffic will fail regardless in this scenario, though there are cases where you might want to capture this traffic, if only to steer it to a black hole. For the record, I just saw that the route was missing:

This makes sense to me, as all the BGP peers from B1 are down, so nothing is advertised.
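
As an aside, if you did want to deliberately drop traffic for the branch prefix once it becomes unreachable, one simple option (my own sketch, not something from the Fortinet guide) is a high-distance blackhole static route on the Hub; with a distance of 250 it only takes effect once the BGP routes have gone:

config router static
    edit 0
        set dst 192.168.136.0 255.255.255.0
        set blackhole enable
        set distance 250
    next
end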

Hub port1 and B1 port1 interface both down

For completeness' sake, I shut down port1 on the Hub and port1 on B1, and the ping to 192.168.136.10 survived. The route on the HUB took 'route-tag 2' and thus was steered down VPN2 on the second interface.

And the branch SD-WAN Health Check had failed…

The example by Forti left it to the reader to define their own 'standard' SD-WAN rules on the branch Fortigates using these health checks. I hadn't actually bothered to do this. In the current scenario, I guess that as the BGP neighbor has gone on the port1 interface, everything is going to get routed out of the port2 interface, where we still have a BGP neighbor adjacency.
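
If I were to add one, a minimal branch rule might look something like the sketch below. The rule name, the 'DC-LAN' address object and the member indices are assumptions for illustration, and it presumes an SLA target with id 1 has already been defined under the 'HUB' health check:

config system sdwan
    config service
        edit 1
            set name "to-dc"
            set mode sla
            set dst "DC-LAN"
            config sla
                edit "HUB"
                    set id 1
                next
            end
            set priority-members 1 2
        next
    end
end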

Incidentally, I was also able to ping from the B1 laptop to the B2 laptop, showing that ADVPN worked.
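
A quick way to check that the ADVPN shortcut actually formed (rather than the branch-to-branch traffic hairpinning via the Hub) is to list the tunnels on a branch and look for the dynamically created shortcut entries derived from the parent phase1 name:

diagnose vpn tunnel list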

To build on this lab, which I may try to do in subsequent posts, I would look at getting the failover time down and putting a second HUB in place to ensure that the branches could still reach each other using ADVPN in the event of one hub failing.