All posts by donovandurand

Promiscuous mode with NSX Distributed firewall

TL;DR – When running appliances that require promiscuous mode, it’s advisable to place the appliance in the DFW exclusion list, especially if it is a security-hardened virtual appliance.

Last year I was working with a mid-sized customer that had rolled out NSX to implement micro-segmentation and was integrating it into their existing production vSphere environment. The customer had decided to implement security policy based on higher-level abstractions, i.e. security groups, but also decided they needed security policy enforced everywhere, with the Applied To column set to Distributed Firewall. Not an ideal implementation, but perhaps that discussion is for another blog post.

As this was a brownfield environment, the customer wanted to apply security policy to the existing virtual appliances. They were using F5 virtual appliances for load balancing and had a specific requirement to configure promiscuous mode on the F5 port group to support VLAN groups (https://support.f5.com/kb/en-us/products/big-ip_ltm/releasenotes/product/relnotes_ltm_ve_10_2_1.html).

Long story short, after configuring the F5 port groups in promiscuous mode they were facing intermittent packet drops across the environment. We were quickly able to identify that the impacted workloads were the ones residing on the host that had the F5 virtual appliances. As we could not power off the appliance, we decided to place the F5 appliance in the exclusion list and, like magic, everything returned to normal: no packet drops.

Now we understood what was causing the issue, but as with all things we needed to understand why this behaviour was seen. We started by picking a virtual machine on the F5 host and turning up logging on the rule that allowed traffic for a particular service, in this case SSH. We also added a tag to that rule so we could filter the logs.

We followed an SSH session that we initiated from 10.68.0.250 to 10.68.145.170. We saw the SYN from 10.68.0.250 reach 10.68.145.170, allowed by the firewall because rule 1795 permits it, but interestingly we also saw the same packet on a different filter, in this case 53210.

 

/var/log/dfwpktlog.log
T01:29:48.363Z 28341 INET match PASS domain-c7/1795 IN 52 TCP 10.68.0.250/65448->10.68.145.170/22 S 12345_ADMIN
T01:29:48.363Z 53210 INET match PASS domain-c7/1795 IN 52 TCP 10.68.0.250/65448->10.68.145.170/22 S 12345_ADMIN
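Spotting the same packet on two filters by eye gets tedious in a busy log. Below is a rough Python sketch that groups log lines by packet and reports any packet logged on more than one filter. The field positions are an assumption inferred from the two sample lines above, not from a documented log format, and the function names are made up for illustration.

```python
from collections import defaultdict

# The two sample lines from the post.
SAMPLE_LOG = """\
T01:29:48.363Z 28341 INET match PASS domain-c7/1795 IN 52 TCP 10.68.0.250/65448->10.68.145.170/22 S 12345_ADMIN
T01:29:48.363Z 53210 INET match PASS domain-c7/1795 IN 52 TCP 10.68.0.250/65448->10.68.145.170/22 S 12345_ADMIN
"""


def parse_line(line):
    """Split one log line into the fields we care about (positions are assumed)."""
    fields = line.split()
    if len(fields) < 10 or fields[3] != "match":
        return None  # not a line in the shape shown above
    return {
        "filter": fields[1],
        "action": fields[4],
        "rule": fields[5].rsplit("/", 1)[-1],
        "proto": fields[8],
        "flow": fields[9],
        "flags": fields[10] if len(fields) > 10 else "",
    }


def packets_on_multiple_filters(log_text):
    """Return {(proto, flow, flags): [filter hashes]} for packets seen on 2+ filters."""
    seen = defaultdict(set)
    for line in log_text.splitlines():
        entry = parse_line(line)
        if entry:
            seen[(entry["proto"], entry["flow"], entry["flags"])].add(entry["filter"])
    return {pkt: sorted(hashes) for pkt, hashes in seen.items() if len(hashes) > 1}


duplicates = packets_on_multiple_filters(SAMPLE_LOG)
for packet, hashes in duplicates.items():
    print(packet, "seen on filters", hashes)
```

Run against a copy of /var/log/dfwpktlog.log, anything this prints is a candidate for the duplicate-filter behaviour described here.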

 

Note: You can get the filter hash for a virtual machine by running vsipioctl getfilters. Each vNIC gets its own filter hash.

 

[[email protected]:~] vsipioctl getfilters

Filter Name : nic-11240904-eth0-vmware-sfw.2
VM UUID : 50 1e 02 7b b0 88 05 47-c2 ed dd 92 22 93 12 1d
VNIC Index : 0
Service Profile : --NOT SET--
Filter Hash : 28341

 

Looking at the output of vsipioctl getfilters we identified that the duplicate filter (53210) belonged to the F5 virtual appliance.
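On a host with many vNICs, mapping a filter hash back to its filter name can be scripted. This is a minimal sketch that parses saved vsipioctl getfilters output into a dictionary keyed by filter hash. It assumes every record follows the "Key : value" layout shown above and starts with a Filter Name line; the sample is a trimmed copy of that output and the function name is hypothetical.

```python
# Trimmed copy of the output shown in the post.
SAMPLE_OUTPUT = """\
Filter Name : nic-11240904-eth0-vmware-sfw.2
VM UUID : 50 1e 02 7b b0 88 05 47-c2 ed dd 92 22 93 12 1d
VNIC Index : 0
Filter Hash : 28341
"""


def filters_by_hash(output):
    """Return {filter_hash: {field: value}} from 'Key : value' style records."""
    result = {}
    current = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or anything without a colon
        key, value = key.strip(), value.strip()
        if key == "Filter Name":
            current = {}  # a new record starts here (assumed)
        current[key] = value
        if key == "Filter Hash":
            result[value] = current
    return result


filters = filters_by_hash(SAMPLE_OUTPUT)
print(filters["28341"]["Filter Name"])  # nic-11240904-eth0-vmware-sfw.2
```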

We also captured packets at the switchport of the virtual machine and saw 10.68.145.170 sending a SYN/ACK back to the client; this packet was also seen on both filters. The subsequent frame shows the client immediately sending a RST packet to close the connection. This was a bit strange, as the firewall rule allows this traffic and there was no other reason for the client to send a RST immediately.

[Screenshot of the packet capture]

 

Interestingly, dfwpktlog.log had recorded packets on the duplicate filter hitting a different rule (rule 1026). This rule had a REJECT action, so a RST packet would be sent out for TCP connections.


/var/log/dfwpktlog.log
T01:29:48.364Z 53210 INET match REJECT domain-c7/1026 IN 52 TCP 10.68.145.170/22->10.68.0.250/65448 SA default-deny
T01:29:48.364Z 53210 INET match REJECT domain-c7/1026 IN 40 TCP 10.68.145.170/22->10.68.0.250/65448 R default-deny

 

Why would traffic be hitting a different rule when the initial SYN was evaluated by rule 1795?

When a virtual machine is connected to a promiscuous port group, its filter is also in promiscuous mode and hence receives all traffic. In this particular case, the F5 appliance was placed in a security group that had its own security policy applied to it. Duplicated traffic delivered to this filter was subjected to that security policy. The policy had a REJECT action configured, and hence a RST packet was sent to the client.

Having identified this, we provided the customer with a couple of options to mitigate the issue.

I hope you found this article as interesting as this issue was to me. Until next time!

PowerCLI script to run commands on ESX hosts

Recently, while working with a customer on a large-scale NSX environment, we hit a product bug that required us to increase the memory allocated to the vsfwd (vShield-Stateful-Firewall) process on the ESXi hosts.

There were two main reasons we hit this issue:
1. Due to high churn in the distributed firewall, there was a significantly high number of updates for the vsfwd daemon to process.
2. A non-optimized way of allocating memory for these updates caused vsfwd to consume all of its allocated memory.

Anyway, the purpose of this article is not to talk about the issue but about a way to automate the process of updating the vsfwd memory on the ESXi hosts. As this was a large-scale environment, manually editing config files on every ESXi host to increase the memory was not ideal. Automating the process eliminates errors arising from manual intervention and keeps the configuration consistent across all hosts.

I started writing the script in Python and was going to leverage the paramiko module to SSH in and run commands. I quickly found it a bit difficult to manage the numerous host IP addresses with Python, so I switched to PowerCLI, where I could use the Get-VMHost cmdlet to get the list of hosts in a cluster.
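For anyone curious what the Python route might look like, here is a minimal sketch of running one command across a list of hosts with paramiko. It is not the PowerCLI script mentioned in this post. The host names, the helper names, and the harmless esxcli system version get command are placeholders; the actual vsfwd memory change is not shown here, and the host list has to be supplied by hand, which is exactly the part Get-VMHost makes easier.

```python
def run_over_ssh(host, command, username, password):
    """Run one command on one host via paramiko; return (exit_code, stdout text)."""
    import paramiko  # third-party module, imported lazily so the rest runs without it

    client = paramiko.SSHClient()
    # Accepting unknown host keys is a lab shortcut, not something for production.
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    try:
        client.connect(host, username=username, password=password, timeout=10)
        _stdin, stdout, _stderr = client.exec_command(command)
        output = stdout.read().decode()
        return stdout.channel.recv_exit_status(), output
    finally:
        client.close()


def run_on_hosts(hosts, command, runner):
    """Call runner(host, command) for every host; keep going if one host fails."""
    results = {}
    for host in hosts:
        try:
            results[host] = runner(host, command)
        except Exception as exc:  # record the failure instead of aborting the loop
            results[host] = (None, f"error: {exc}")
    return results


# Dry run with a stand-in runner so nothing is actually contacted.
def fake_runner(host, command):
    return 0, f"{host} ran {command}"


hosts = ["esx01.example.com", "esx02.example.com"]  # hypothetical host names
results = run_on_hosts(hosts, "esxcli system version get", fake_runner)
for host, (code, output) in results.items():
    print(host, code, output)
```

To run it for real, pass something like functools.partial(run_over_ssh, username="root", password=...) as the runner, with SSH enabled on the hosts.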

I’ve put the script up on GitHub for anyone interested – script

Until next time and here’s to learning something new each day!!!

Capturing and decoding VXLAN encapsulated packets

In this short post we will look at capturing packets that are encapsulated with the VXLAN protocol and how to decode them with Wireshark for troubleshooting and debugging purposes. This procedure is handy when you want to analyze network traffic on a logical switch or between logical switches.

In order to capture packets on ESXi we will use the pktcap-uw utility. pktcap-uw is quite a versatile tool, allowing you to capture at multiple points in the network stack and even trace packets through the stack. Further details on pktcap-uw can be found in the VMware product documentation – here

The limitation of the current version of pktcap-uw is that we need to run two sets of commands to capture both egress and ingress. With that said, let's get to it. In this environment I will capture packets on vmnic4 on the source and destination ESXi hosts.

To capture VXLAN encapsulated packets egressing uplink vmnic4 on the source host

pktcap-uw --uplink vmnic4 --dir 1 --stage 1 -o /tmp/vmnic4-tx.pcap

To capture VXLAN encapsulated packets ingressing uplink vmnic4 on the destination host

pktcap-uw --uplink vmnic4 --dir 0 --stage 0 -o /tmp/vmnic4-rx.pcap
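Since the only difference between the two captures is the --dir and --stage values, a small helper can generate both command lines when the same capture is needed on several uplinks. This is just a sketch that reproduces the two commands above; the function name and the "tx"/"rx" labels are made up for illustration.

```python
def uplink_capture_command(uplink, direction, outfile):
    """Build the pktcap-uw argument list for an uplink capture.

    direction is "tx" (egress, --dir 1 --stage 1) or "rx" (ingress,
    --dir 0 --stage 0), mirroring the two commands in the post.
    """
    options = {"tx": ("1", "1"), "rx": ("0", "0")}
    dir_value, stage_value = options[direction]
    return ["pktcap-uw", "--uplink", uplink,
            "--dir", dir_value, "--stage", stage_value, "-o", outfile]


tx_cmd = uplink_capture_command("vmnic4", "tx", "/tmp/vmnic4-tx.pcap")
rx_cmd = uplink_capture_command("vmnic4", "rx", "/tmp/vmnic4-rx.pcap")
print(" ".join(tx_cmd))
print(" ".join(rx_cmd))
```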

If you have access to the ESXi host and want to look at the packet capture with the VXLAN headers you can use the tcpdump command like so,

[Screenshot: tcpdump output]

This capture can then be imported into Wireshark and the frames decoded. When the capture is first opened, Wireshark displays only the outer source and destination IPs, which are the VXLAN endpoints. We need to map destination UDP port 8472 to the VXLAN protocol to see the inner frames.

To do so, open the capture with Wireshark and go to Analyze -> Decode As

[Screenshot: Wireshark Decode As dialog]

Once decoded, Wireshark will display the inner source and destination IP addresses and the inner protocol.
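What Wireshark does after the port mapping is fairly simple: it treats the first 8 bytes of the UDP payload as the VXLAN header and dissects the rest as an Ethernet frame. Here is a small Python sketch of that step, following the header layout in RFC 7348 (8 flag bits, 24 reserved bits, a 24-bit VNI, 8 reserved bits). The sample payload and the VNI 5001 are made up for illustration.

```python
import struct

VXLAN_HEADER_LEN = 8
VNI_VALID_FLAG = 0x08  # the "I" bit, set when the VNI field is valid


def parse_vxlan(udp_payload):
    """Return (vni, inner_frame_bytes) from the payload of a VXLAN UDP datagram."""
    if len(udp_payload) < VXLAN_HEADER_LEN:
        raise ValueError("payload shorter than a VXLAN header")
    flags_word, vni_word = struct.unpack("!II", udp_payload[:VXLAN_HEADER_LEN])
    flags = flags_word >> 24  # top byte holds the flags
    if not flags & VNI_VALID_FLAG:
        raise ValueError("VNI flag not set, probably not VXLAN")
    vni = vni_word >> 8  # upper 24 bits hold the VNI
    return vni, udp_payload[VXLAN_HEADER_LEN:]


# Made-up payload: VXLAN header for VNI 5001 followed by a dummy inner frame.
sample = bytes([0x08, 0, 0, 0]) + (5001 << 8).to_bytes(4, "big") + b"inner-ethernet-frame"
vni, inner = parse_vxlan(sample)
print(vni, inner)
```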

[Screenshot: decoded capture showing the inner frames]

I hope you find this post helpful, until next time!!

Removing IP addresses from the NSX IP pool

I was recently involved in an NSX deployment where the ESX hosts (VTEPs) were not able to communicate with each other. The NSX Manager UI showed that a few ESX hosts in the cluster were not prepared even though the entire cluster had been prepared. We took a quick look at the ESX hosts and found that the VXLAN vmk interfaces were missing but the VIBs were still installed. Re-preparing these hosts failed because no IP addresses were available in the VTEP IP pool.

To cut a long story short, we had to remove some IP addresses from the IP pool, and apparently there is no way to do this from the NSX UI without deleting and re-creating the IP pool. Even then, you can only provide a single contiguous range of IP addresses. Fortunately, there is a REST API method available to accomplish this.

So to remove an IP address from the pool we first need to find the pool-id. Using a REST client, run this GET request to get the pool-id:

https:///api/2.0/services/ipam/pools/scope/globalroot-0

The output would list all the configured IP pools. We need to look at the objectId tag to get the pool-id. Once we have the pool-id we can query the pool to verify the start and end of the IP pool

https:///api/2.0/services/ipam/pools/ipaddresspool-1

To remove an IP address from this pool, use the DELETE method along with the IP address, like so:

https:///api/2.0/services/ipam/pools/ipaddresspool-1/ipaddresses/192.168.1.10

Note: With this method you can only remove IP addresses that have been allocated and not free addresses in the pool.
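The three calls above can also be scripted. The sketch below only builds the requests and pulls objectId values out of a response; it does not send anything. The manager host name is hypothetical, the sample XML is a made-up, heavily trimmed response that only assumes objectId elements exist as described above, and the DELETE path assumes the same pools segment used by the two GET requests. Sending the requests for real would additionally need authentication and certificate handling.

```python
import urllib.request
import xml.etree.ElementTree as ET

NSX_MANAGER = "nsx-manager.example.com"  # hypothetical host name
BASE = f"https://{NSX_MANAGER}/api/2.0/services/ipam/pools"


def list_pools_request():
    """GET request that lists the IP pools in the global scope."""
    return urllib.request.Request(f"{BASE}/scope/globalroot-0", method="GET")


def release_ip_request(pool_id, ip_address):
    """DELETE request that releases one allocated address from a pool."""
    return urllib.request.Request(f"{BASE}/{pool_id}/ipaddresses/{ip_address}", method="DELETE")


def pool_ids(xml_text):
    """Collect the text of every objectId element in a pool listing."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter("objectId")]


# Made-up response, trimmed down to the one tag the post refers to.
SAMPLE_XML = (
    "<ipamAddressPools><ipamAddressPool>"
    "<objectId>ipaddresspool-1</objectId>"
    "</ipamAddressPool></ipamAddressPools>"
)

req = release_ip_request(pool_ids(SAMPLE_XML)[0], "192.168.1.10")
print(req.get_method(), req.full_url)
```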

VMware VDS link aggregation enhanced support

Recently, while working on an NSX design, we decided to use LAG from the ESX hosts to the ToR leaf switches. After configuring LACP in active mode on the physical switches we moved on to configure LACP on the distributed virtual switch. All LACP configuration for the VDS has to be done from the vSphere Web Client, but looking at the Web Client we could not find the LACP option on the VDS. After some digging around, we figured out that the distributed virtual switch had been created using the VMware C# client. A distributed virtual switch created from the C# client has only basic support for LACP.

To use the advanced LACP options, click the “Enhance” option in the distributed virtual switch Features box in the vSphere Web Client. This lets us choose the load balancing algorithm and LACP mode, and also creates the link aggregation group with the ESX uplinks.

[Screenshot: LAG configuration]

These are the enhanced LACP features that VDS supports:

  • Support for configuring multiple link aggregation groups (LAGs)
  • LAGs are represented as uplinks in the teaming and failover policy of distributed ports or port groups. You can create a distributed switch configuration that uses both LACP and existing teaming algorithms on different port groups.
  • Multiple load balancing options for LAGs.
  • Centralized switch-level configuration available under Manage > Settings > LAC