Re-Introduction to PCAP Replay and GopherCAP
A while back we introduced GopherCAP, a simple tool written in Golang that leverages Google's GoPacket library for advanced packet replay. We were challenged by a special dataset that traditional tools simply could not handle.
More than a year has passed since we posted that initial introduction article in which we emphasized PCAP replay as the first GopherCAP use case. The GopherCAP command-line tool was implemented as a Go binary that incorporates multiple subcommands. Some are meant to be used in tandem, such as map and replay commands which were introduced in our previous blog post. Others are meant to tackle specific use cases where traditional tooling falls short. This post introduces one such subcommand - filter.
Filtering PCAP Files: The Initial Problem
Consider the following tcpdump packet filtering commands. The first one reads an input PCAP and writes out packets belonging to relevant internal network segments. The second command does the same, but keeps only packets on interesting ports.
```
tcpdump -r input.pcap -w int.pcap net 192.168.0.0/16 or net 10.0.0.0/8
tcpdump -r input.pcap -w http.pcap port 80 or port 8080 or port 8088
```
These are very typical packet filtering scenarios. And they were the first things that came to mind when we wanted to filter packets. However, that was not enough for two distinct use-cases we needed to tackle.
Firstly, last year we participated in the Locked Shields 2021 cyber defense exercise, supporting the green team by capturing all PCAP data for the exercise. After the exercise, each team usually requests its PCAP data for research. However, since each team represents a NATO nation, they are only entitled to receive their own portion of the traffic. The exercise hosted 25 teams on a complex virtualized network with over 1000 network segments. Total traffic volume for the three exercise days amounted to dozens of terabytes, which had to be processed by several hundred individual filtering tasks.
Secondly, this year we have been researching SMB protocol to improve our lateral detection capabilities. For this research, we needed to construct manageable development datasets from another large PCAP collection that also exceeds several terabytes in size.
For both problems, we needed controllable concurrency, which tcpdump does not support. Furthermore, we had to deal with encapsulated packets that broke tcpdump filters. In short, we needed to modify packets while still maintaining acceptable performance on modern multithreaded systems.
The Inner-workings of the Filter Subcommand
The filter subcommand reads a YAML file in addition to the standard GopherCAP subcommand configuration. In other words, GopherCAP configuration files or command-line flags point the subcommand toward a dictionary file where keys correspond to filter names and values are lists of criteria. Packets matching those criteria are then written to new PCAP files in a subdirectory matching the filter name. Consider the following example:
GopherCAP is pointed toward the /tmp/pcap folder, where it recursively found a PCAP file from malware-traffic-analysis. Two workers were then spun up to process all packets in that file according to the rules in subnets.yml, writing output to the /tmp/output folder.
The subnets YAML file was set up thus:
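The original file is not reproduced here; based on the description below, a subnets.yml could look roughly like this (filter names come from the text, the exact schema and network values are illustrative, so check the GopherCAP documentation for the authoritative format):

```yaml
int:
  - 192.168.0.0/16
  - 10.0.0.0/8
  - 172.16.0.0/12
ipv6_link_local:
  - "fe80::/10"
```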
The configuration defines two filters. The first, called int, looks for packets to and from internal IPv4 network segments. The second, called ipv6_link_local, searches for link-local addresses: internal addresses automatically generated from the network card's MAC address to bootstrap IPv6 connectivity.
We can observe that we now have PCAPs organized into subfolders per filter, with each file containing only packets matching the corresponding subnet filter.
Furthermore, the filter YAML file can define port listings the same way as the previous example defined subnets.
To use port filtering mode, we simply need to switch mode via CLI flag, and point GopherCAP to a new filter definition.
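A port filter definition likely mirrors the subnet file, with filter names mapping to port lists. The sketch below is illustrative (the filter names and the exact schema are assumptions, not taken from GopherCAP's documentation):

```yaml
http:
  - 80
  - 8080
  - 8088
smb:
  - 139
  - 445
```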
Now let’s dive deep into how it works.
Modern packet capture servers can have several hundred CPU threads, yet disk IO is still limited. We wanted to utilize concurrency to speed up this filtering, yet not overload the storage throughput. Golang concurrency primitives, channels and goroutines, were perfect for initializing a controllable number of workers that would pick up and execute individual filtering tasks. Furthermore, those primitives coupled with contexts allow us to gracefully stop those workers.
Yet, this is not the full story. Tcpdump would still be faster than a custom Go application because it’s written in C and has been optimized for several decades. Clever scripting could create a controllable number of tcpdump processes and achieve the same result.
But it cannot get around the next issue: our filter produced no output. Neither with tcpdump, nor with the gopacket implementation that had already been used in GopherCAP with great success the year before. This was unexpected.
The problem we uncovered this year was caused by how the packets were captured. Unlike the year before, there was no physical packet broker to decapsulate ERSPAN headers from the traffic. Instead, the network team forwarded packets in a GRE tunnel directly to the capture interface of the Arkime server. This did not affect Arkime or Suricata, aside from a GRE tunnel appearing in every parsed session. Both even handled ERSPAN type 2 decapsulation with no issues.
However, this packet encapsulation had a profound effect on filtering. Simply put, the filter was applied to the outer GRE endpoint IP addresses, which sit higher in the packet layer stack. That layer needed to be stripped away before applying the filter. So we opted to implement a decapsulation feature, currently supporting GRE and ERSPAN type 2. In gopacket, this can be done using the Layers API, which lets us figure out the correct offset of the encapsulated packet.
Once the offset is found, we construct a new packet from that offset. We simply need to figure out the layer type of the interior packet and pass the payload into the new packet constructor.
The calling function needs to read packets from the source, apply the decapsulation if needed, and write the new packet to the output. Note that the trick to writing modified packets with gopacket is lifting the metadata from the original packet with an updated capture length. Otherwise the gopacket writer would reject the packet if metadata was missing or if the capture length did not match the actual payload size. Since the payload is a byte array, the length of that array translates directly to the payload size in bytes.
The initial implementation of this filter logic focused only on subnet matching, but we later added port filtering for building easily consumable R&D datasets. For that reason we had to generalize the packet matching code that initially only needed to handle a single use case. In other words, we needed to get rid of the hard-coded logic. But Go is a statically-typed language. That is a big reason why we like it: compile-time safety guarantees make large-scale code changes painless. We wanted to facilitate a new use case without converting GopherCAP into a dynamic general-purpose tool. After all, those already exist, tcpdump and tshark for example.
But how to implement a static filtering method that’s generic enough to facilitate other use-cases down the line? The answer is interfaces!
An interface is Go's way of hiding implementation details. The PCAP reader does not care how a packet is filtered; it only cares whether the packet matches. Subnet filters can then easily be implemented as a custom type around a list of network objects. The matcher attempts to extract network layer info from the packet. Source and destination, which correspond to IP addresses, are parsed into Go net.IP objects and compared against the network lists.
By comparison, a port matcher is implemented as a custom type around a set. In computing, a set is a data structure that holds unique values. In Go, it can be created with a map of truth values; missing elements return an implicit false, making membership checks easy and fast.
Gopacket exposes network port info via the TransportLayer API, though we need to check for nil values here. Matchers must compare both source and destination values. In practice, this is also much faster than subnet matching, as endpoint objects are constructed around raw byte values.
Actual user-defined numeric ports need to be converted to Endpoint byte values by the constructor. Note that any value satisfying an Endpoint could be used here; a MAC address is also a gopacket endpoint.
So how fast is it? Port filtering on a Ryzen 5800X yields around 350,000 packets per second per worker when configured with 4 workers. A single worker processes packets at 500,000 pps, while 2 workers each manage around 450,000. Speed gains are not linear due to how goroutine scheduling works inside the Go runtime. Regardless, 4 workers yield about a 3x speedup, processing almost 1.5 million packets per second, assuming the system has enough idle CPU cores. And do note that mileage may vary depending on filter complexity.
Decapsulation reduces the performance per core by 80-150 thousand packets per second. This is to be expected, as GopherCAP now has to iterate over packet layers and construct new packets from the old payload.
At this rate we were able to filter a 2.5 terabyte dataset in a little over 3 hours. While tcpdump might be faster thread for thread, we gain a lot of flexibility in being able to modify individual packets.
Use case: Building a Dataset for SMB Lateral Detection Research
SMB, or Server Message Block, is a file sharing and communication protocol widely used in Windows networks. Most know it as the underlying protocol behind file servers, printer connections, and the like. However, it also enables Remote Procedure Calls (RPC) to remotely administer and communicate with endpoints. In effect, it provides Remote Code Execution (RCE) by design, a capability behind some of the most serious vulnerability exploitation categories out there.
These capabilities make SMB a great protocol not only for domain administration, but also for post-compromise lateral movement. Normally the first endpoint is compromised via some kind of malicious payload that establishes an outbound connection back to the command and control (C2) server. The subsequent goal is to compromise more workstations to solidify that foothold. That way, if any compromised asset is taken offline, the bad guys can still pop new shells via other endpoints. And the ultimate goal is to compromise the domain controller, which would give the bad guys full administrative access to all domain assets.
As MITRE has documented, SMB is a protocol of choice for many threat actors. The most common technique is to use Windows shares to pass malicious files between computers. Cobalt Strike, for example, makes it easy to generate new malicious payloads that can be dropped into shared folders on compromised workstations. If another workstation executes such a payload, it instantly establishes a C2 connection back to the control server. Windows domains always have these shares, even when no file server appears to be in use. That is because Windows domains use hidden administrative SMB shares to distribute software deployments, policies, and group policy objects to workstations. These administrative shares are common targets for threat actors. However, it also means malicious payloads are visible on the network.
But this is not where the SMB fun ends. Threat actors and penetration testers use SMB to enumerate domain users, gather operating system and version information, discover password policies, and more. It is a great information gathering protocol. Sometimes they can even brute force passwords for valid users, which in turn makes it easy to connect to remote endpoints via psexec and issue commands. Not to mention that the SMB protocol itself has been subject to many high-profile vulnerabilities. In 2017, the EternalBlue exploit targeted a vulnerability in SMB version 1, allowing threat actors to easily compromise entire domains, with the WannaCry ransomware being the most high-profile example of EternalBlue in use.
At Stamus Networks, our threat intelligence updates have historically solidified perimeter coverage for known indicators of compromise via signature-based threat detection. Our forthcoming upgrade will introduce several new signature-less detection methods against C2 channels. We are now turning our attention to the inside of the network, that is, lateral movement. And SMB is one of the key protocols we have been exploring. This work includes:
- Developing lateral SMB detection rules to highlight irregular or abnormal connections
- Improving SMB parsing capabilities in Suricata
- Exploring anomaly detection techniques on EVE protocol logs
To do any of these things, we need to start from the data. In particular, the EVE output from Suricata directly affects the post-processing detection and anomaly detection methods we can apply. For example, during data exploration we discovered that Suricata lacked support for many SMB status code values. Any update to Suricata would require re-parsing the PCAPs into a new EVE dataset.
GopherCAP filtering allowed us to cut several multi-terabyte datasets into manageable SMB datasets that can be parsed by Suricata in seconds. That is invaluable for research where data ingestion is usually the most time consuming task.
This post introduced the filtering subcommand in GopherCAP. Our initial motivation for the feature was PCAP splitting and decapsulation after a cyber exercise. But we also found it to be a handy tool for our internal R&D. In truth, most users would not need it. General-purpose tools such as tcpdump and tshark have been around for a long time and satisfy most needs. However, custom tooling built on the gopacket library allows us to tackle niche use cases and large datasets where general-purpose tools fall short.
Developing GopherCAP has also been a fun learning experience. In particular, developing the packet decapsulation feature has greatly improved my knowledge of how packet processing works. Knowledge that will no doubt serve me well down the line.
Try GopherCAP Yourself
Are you dealing with large post-mortem PCAP datasets and need this kind of filtering? Then try it out. GopherCAP is available now on the official Stamus Networks GitHub page (https://github.com/StamusNetworks/gophercap).
GopherCAP can be installed from GitHub releases or easily built from source. Does this seem like exactly what you need, but the current features don't fully solve your problem? Please open an issue describing your use case and providing a sample PCAP to test against. The solution to your filtering problem could be an if statement or a function call away.