Decoding Mist AP to Mist Edge tunnels with Wireshark

I don’t have any hard statistics to back this up, but I’m willing to fake bet that at least 90% of Mist’s customer deployments use local breakout (LBO), or local bridging, as the method of offloading Wi-Fi client traffic onto the wired network. If this is your first time seeing the term LBO, it essentially means bridging client traffic directly onto the switch port that the AP is connected to. It requires an L2 network where your client VLANs are available on the switch(es) that your AP(s) are connected to.

Fig. 1 Local bridging or LBO deployment model courtesy of Mist’s website

What happens when you can’t or don’t want to configure your client VLANs locally and instead want to aggregate those VLANs somewhere else, like in your datacenter? You need some way of tunneling that client traffic from the APs to the datacenter, and there needs to be a device at the other end that can terminate those tunnels. This is the more traditional controller-based Wi-Fi deployment model, for which vendors like Aruba, Cisco, and Ruckus have solutions.

Mist also offers a solution for tunneling client traffic, in the form of their Mist Edge appliances, while still keeping the control and management plane traffic cloud-based. In our home office (HO) locations, our design dictates that we configure a single VLAN for our APs to get connectivity on the network. Our client VLANs are configured in our data centers, which rules out local bridging, so we must use Mist Edges.

Fig. 2 Tunnel client traffic to data center model courtesy of Mist’s website

Last week, we were experiencing problems at only one of our many HO sites: clients couldn’t obtain IP addresses via DHCP, and ARP replies from the VLANs’ L3 gateways were not reaching the clients either. This site uses the same Mist Edges as several other sites, with the same VLANs and the same DHCP server, and none of those sites were experiencing these problems from what we could tell.

Fig. 3 Marvis Actions correctly indicating there was a problem with DHCP at the site
Fig. 4 Site’s SLE Insights page showing many DHCP and Gateway ARP timeouts
Fig. 5 Site’s SLE Insights page showing all failures for the DHCP server in question
Fig. 6 OTA packet capture from Mist AP showing DHCP Discovers/Requests without any return Offers/Acks

We had to figure out what was different about this site compared to the others, and one way to do that was to get wired packet captures both locally at the problem site and in the datacenter where the Mist Edges reside. We were able to get those captures and ultimately determined that DHCP was working as expected and that DHCP Offers were making it back over the tunnel to the AP. The cool thing we learned along the way, from the Mist support engineer, was how to decode the L2TPv3 traffic that is used for the tunnels between their APs and Mist Edges. That’s what the rest of this post will show.

Here’s what the traffic looks like before decoding… Data plane traffic is preceded by a “D”, while control plane traffic is labeled “Control Message” followed by what the message actually is.
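To make the data vs. control distinction a little more concrete, here is a minimal Python sketch of how the two packet types can be told apart from the first bytes of the L2TPv3 header. It assumes the tunnel is carried over UDP and follows the header layout in RFC 3931 (a T bit in the first 16-bit word, with the version in the low 4 bits); whether a given Mist deployment uses UDP or IP encapsulation is an assumption on my part, and the function name and returned field names are made up for illustration.

```python
import struct


def classify_l2tpv3_udp(payload: bytes) -> dict:
    """Classify a UDP payload as L2TPv3 data or control (RFC 3931 layout).

    Hypothetical helper for illustration; assumes L2TPv3 over UDP.
    """
    if len(payload) < 8:
        raise ValueError("payload too short for an L2TPv3 header")

    # First 16 bits: flags plus version. Second 16 bits: Reserved (data)
    # or Length (control). Next 32 bits: Session ID (data) or
    # Control Connection ID (control).
    flags_ver, second, ident = struct.unpack("!HHI", payload[:8])

    version = flags_ver & 0x000F
    if version != 3:
        raise ValueError(f"not L2TPv3 (version field is {version})")

    # The T bit is the most significant bit: 1 = control, 0 = data.
    if flags_ver & 0x8000:
        return {"type": "control", "length": second, "control_connection_id": ident}

    # Anything after the session ID (optional cookie, L2-specific sublayer,
    # then the tunneled frame) depends on session settings and is not parsed.
    return {"type": "data", "session_id": ident}
```

For example, a payload starting with the bytes `00 03 00 00` would be classified as data, which corresponds to the packets Wireshark marks with a “D”. This sketch stops at the header; turning the tunneled payload back into readable ARP and DHCP frames is what the Wireshark “Decode As…” step does.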

Fig. 7 Client data traffic is marked as “D” and control plane traffic is indicated by “Control Message”
Fig. 8 Drilling down into the L2TPv3 shows data, unencrypted
Fig. 9 Right-click on the protocol you want to change decoding for and select “Decode As…”
Fig. 10 Instructions for how to configure Wireshark to decode the L2TPv3 traffic
Fig. 11 Highlighted ARP and DHCP traffic now showing in the Packet List pane
Fig. 12 Packet Details pane shows a DHCP Offer received on the AP’s switch port with the DHCP server in question

In conclusion, most deployments of Mist likely use local bridging for client traffic, which makes packet captures easier to decode and analyze. When client traffic is tunneled between the APs and Mist Edges instead, there is still a way to analyze it. I hope this helps someone who might need this in the future!

Sources: Mist Edge Getting Started Guide