Tutorial 1 - creating an IP fragment reassembler

IP fragment reassembly is a somewhat more advanced topic. This tutorial demonstrates a more typical use of the Packet Decoding Framework, which is part of the jNetPcap API.

To understand some of the networking concepts discussed in this tutorial, you should first read up on IP fragmentation and reassembly. Wikipedia has a nice short write-up on IP fragmentation and reassembly (link: Wikipedia). Only a cursory understanding is needed to follow the tutorial.

Then we will plan out our strategy for our little application and implement it.

Here are some key concepts about the API that are covered in this tutorial:

The complete application source code can be downloaded here: IpReassemblyExample.java

Step 1 - design our application

Our application is going to be very simple. It reads a pcap capture file that contains some fragmented Ip4 packets, reassembles those fragments, and creates a new packet made up of just the reassembled data. We're going to drop the datalink headers and simply insert a new Ip4 header in front of all those fragments, so that our packet's DLT will be Ip4 (the DLT identifies the first header in the packet).

Here is what our new packet will look like in memory:

+------------+--------+--------+
| Ip4 header | frag 1 | frag 2 |
+------------+--------+--------+

We need to handle the incoming stream of packets, so the first thing we need to set up is a packet handler that will receive packets from libpcap. We're not going to be concerned with multi-threading issues in this tutorial, so to receive packets our main application class will simply implement the PcapPacketHandler interface. Once we have a packet, we need to check whether it is an Ip4 packet and whether it is fragmented.

All Ip4 packets, fragmented or not, go into a reassembly buffer. We use one buffer per IP datagram, whether or not that datagram was fragmented.

A cleared MORE_FRAGMENTS flag in the Ip4 header gives us a clue that we have seen the last fragment of a datagram, but we can't rely on that flag alone. Fragments can arrive out of sequence or even be dropped along the way and never arrive. So we are also going to keep track of how many bytes we have reassembled. When that total matches the length of the entire unfragmented datagram, we know we have received all the fragments and we are done.

For those cases where a fragment is dropped and never arrives, we are also going to implement a simple timeout mechanism that expires each reassembly buffer after a certain amount of time.

Here is some pseudo code that our application is going to implement:

loop {
  Receive packet from libpcap;
  if packet is Ip4 packet then
    get or create reassembly buffer and store in a map;

    calculate offset into the buffer and add fragment
  
    if the packet is complete then
      remove buffer from map;
      dispatch buffer to user's handler;
    endif

  endif

  timeout buffer entries;
}

User handler {
  receive reassembly buffers;
  create a new IP only packet;
  scan the packet;
  call packet.toString() to get pretty output;
}

Reassembly buffer

This is a very important piece of our application, so we need to plan it out in detail. We're calling this buffer IpReassemblyBuffer and it extends JBuffer.

We are going to allocate a large JBuffer which will hold our ip header and all the fragments combined. Like so:

+------------+--------+--------+
| Ip4 header | frag 1 | frag 2 |
+------------+--------+--------+

The buffer is also going to keep track of timeouts. We're going to set a timestamp at which the buffer officially becomes timed out, and implement a simple isTimedout():boolean method to check for that condition. The method simply compares the timeout timestamp with the current time and returns true if the timestamp has passed.

The buffer needs to keep track of the number of bytes already assembled and the total length of the IP datagram. When the two are equal, the buffer is complete and we can dispatch it to the user. We're also going to implement a boolean method, isComplete(), that checks for this condition.
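The bookkeeping described above can be sketched without any jNetPcap types. This is only an illustration under our own assumptions: the class name ReassemblyProgress and its methods are hypothetical, and lengths here count payload bytes only (the header is left out of the totals).

```java
/** Hypothetical stand-in for the bookkeeping part of a reassembly buffer. */
final class ReassemblyProgress {
  private int bytesCopied = 0;   // payload bytes received so far
  private int totalLength = -1;  // payload length of the whole datagram, unknown until the last fragment

  /** Records a fragment that still has MORE_FRAGMENTS set. */
  void addFragment(int payloadLength) {
    bytesCopied += payloadLength;
  }

  /** Records the fragment without MORE_FRAGMENTS; its end marks the end of the datagram. */
  void addLastFragment(int byteOffset, int payloadLength) {
    bytesCopied += payloadLength;
    totalLength = byteOffset + payloadLength;
  }

  /** Complete only once the total is known and every byte has been counted. */
  boolean isComplete() {
    return totalLength >= 0 && bytesCopied == totalLength;
  }

  public static void main(String[] args) {
    ReassemblyProgress progress = new ReassemblyProgress();
    progress.addLastFragment(1480, 520);           // the last fragment arrives first
    System.out.println(progress.isComplete());     // false: 1480 bytes still missing
    progress.addFragment(1480);                    // the first fragment arrives late
    System.out.println(progress.isComplete());     // true
  }
}
```

Note that the buffer stays incomplete after the last fragment if earlier fragments are still outstanding, which is the out-of-sequence case.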

To keep track of all the buffers, we're going to use a JRE Map keyed by a 32-bit int hash that we generate from fields of the fragment's IP header: Ip4.id(), Ip4.source(), Ip4.destination() and the protocol field (read with type() in the line below), like so:
int hash = (id() << 16) ^ source() ^ destination() ^ type();
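As a rough sketch of how such a hash can drive the buffer Map, the example below applies the same XOR scheme to plain int values and does a get-or-create lookup. The class BufferLookup, its method names and the StringBuilder placeholder value are our own assumptions for illustration; the tutorial's map holds reassembly buffers.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical get-or-create lookup keyed by the XOR hash. */
final class BufferLookup {
  // Placeholder value type; the real application would store reassembly buffers.
  private final Map<Integer, StringBuilder> buffers = new HashMap<Integer, StringBuilder>();

  /** Same XOR scheme as the line above, applied to plain int values. */
  static int hash(int id, int source, int destination, int protocol) {
    return (id << 16) ^ source ^ destination ^ protocol;
  }

  /** Returns the buffer for this datagram, creating it on the first fragment. */
  StringBuilder getOrCreate(int id, int source, int destination, int protocol) {
    int key = hash(id, source, destination, protocol);
    StringBuilder buffer = buffers.get(key);
    if (buffer == null) {
      buffer = new StringBuilder();
      buffers.put(key, buffer);
    }
    return buffer;
  }

  public static void main(String[] args) {
    BufferLookup lookup = new BufferLookup();
    int src = 0x0A000001; // 10.0.0.1
    int dst = 0x0A000002; // 10.0.0.2
    StringBuilder first = lookup.getOrCreate(1, src, dst, 17);
    StringBuilder again = lookup.getOrCreate(1, src, dst, 17);
    System.out.println(first == again); // true: both fragments share one buffer
  }
}
```

An XOR of fields is compact, but two different datagrams can in principle produce the same value; a stricter design would compare the individual fields as well.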

We're also going to use a PriorityQueue that orders buffers by their timeout timestamp. The buffers at the top of the queue are either already timed out or closer to timing out than any other buffer on the queue. This lets us check the queue efficiently: we examine buffers from the top until we reach one that is not timed out, at which point we can stop.
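Here is a small, self-contained sketch of that ordering using only JRE classes. The names TimedBuffer and TimeoutQueueDemo are hypothetical, and the current time is passed in as a parameter (rather than read from the clock) purely to keep the example deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical buffer stand-in that only knows its name and timeout timestamp. */
final class TimedBuffer implements Comparable<TimedBuffer> {
  final String name;
  final long timeoutAt; // absolute time in milliseconds

  TimedBuffer(String name, long timeoutAt) {
    this.name = name;
    this.timeoutAt = timeoutAt;
  }

  boolean isTimedout(long now) {
    return now > timeoutAt;
  }

  /** Earliest timeout sorts first, so it ends up at the head of the queue. */
  public int compareTo(TimedBuffer other) {
    return Long.compare(timeoutAt, other.timeoutAt);
  }
}

final class TimeoutQueueDemo {
  /** Removes expired buffers from the head and stops at the first active one. */
  static List<String> expire(PriorityQueue<TimedBuffer> queue, long now) {
    List<String> expired = new ArrayList<String>();
    while (!queue.isEmpty() && queue.peek().isTimedout(now)) {
      expired.add(queue.poll().name);
    }
    return expired;
  }

  public static void main(String[] args) {
    PriorityQueue<TimedBuffer> queue = new PriorityQueue<TimedBuffer>();
    queue.add(new TimedBuffer("c", 300));
    queue.add(new TimedBuffer("a", 100));
    queue.add(new TimedBuffer("b", 200));
    System.out.println(expire(queue, 250)); // [a, b]
    System.out.println(queue.size());       // 1
  }
}
```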

The first fragment that we see is the one that creates the buffer for that IP datagram. When the buffer is constructed, we use the IP header of that fragment as a template for the IP header we need to insert in front of all the fragments in the buffer. We also need to reset a few fields in the header to match the new packet that we are creating out of the fragments: recalculate the header checksum or reset it to 0, clear the MORE_FRAGMENTS flag, drop any IP options by resetting the hlen field to 5, and set the total length field to the new length of our IP datagram.
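To make the field resets concrete, here is a sketch that performs them on a raw 20-byte IPv4 header with java.nio.ByteBuffer instead of the jNetPcap Ip4 class. The class Ip4HeaderTemplate and its methods are hypothetical. The sketch also zeroes the fragment offset, which is our own assumption (the template fragment may not have been the first one), and it recalculates the checksum with the standard Internet checksum rather than leaving it at 0.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper that turns a fragment's header into the reassembled datagram's header. */
final class Ip4HeaderTemplate {
  static final int HEADER_LENGTH = 20; // hlen = 5 words of 4 bytes, no options

  static byte[] fromFragment(byte[] fragmentHeader, int payloadLength) {
    ByteBuffer h = ByteBuffer.allocate(HEADER_LENGTH);     // big-endian, i.e. network byte order
    h.put(fragmentHeader, 0, HEADER_LENGTH);               // copy the fixed part, dropping options
    h.put(0, (byte) ((h.get(0) & 0xF0) | 5));              // keep version, set hlen to 5
    h.putShort(2, (short) (HEADER_LENGTH + payloadLength)); // new total length
    h.putShort(6, (short) (h.getShort(6) & 0x4000));       // keep DF only: clears MF and the offset
    h.putShort(10, (short) 0);                             // checksum field is zero while summing
    h.putShort(10, (short) checksum(h.array()));
    return h.array();
  }

  /** Internet checksum: one's complement of the one's complement sum of 16-bit words. */
  static int checksum(byte[] header) {
    int sum = 0;
    for (int i = 0; i < header.length; i += 2) {
      sum += ((header[i] & 0xFF) << 8) | (header[i + 1] & 0xFF);
    }
    while ((sum >>> 16) != 0) {
      sum = (sum & 0xFFFF) + (sum >>> 16); // fold carries back into the low 16 bits
    }
    return ~sum & 0xFFFF;
  }

  public static void main(String[] args) {
    byte[] fragment = {0x45, 0x00, 0x05, (byte) 0xDC, 0x12, 0x34, 0x20, (byte) 0xB9, 0x40, 0x11,
        0x00, 0x00, (byte) 0xC0, (byte) 0xA8, 0x00, 0x01, (byte) 0xC0, (byte) 0xA8, 0x00, (byte) 0xC7};
    byte[] header = fromFragment(fragment, 2000);
    // A header with a correct checksum sums to 0xFFFF, so checksumming it again gives 0.
    System.out.println(checksum(header) == 0);
  }
}
```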

The buffer can never be complete until we receive the last fragment. That fragment is crucial since it tells us the length of the original IP datagram. If all the fragments arrive in sequence, then the arrival of the last fragment also means that reassembly is complete and we can dispatch to the user. Fragments can arrive out of sequence, however, so we may still receive other fragments after the last one. Another important step is to change the size of the buffer to match that of the entire datagram. The buffer's physical size is 8K and our datagrams are probably going to be smaller than that, so there will be some unused space at the end, but the buffer will be strictly bounded to the datagram data.

So in summary: we have a buffer Map and a timeout Queue. The Map keeps track of reassembly buffers for us based on a special hashcode, while the timeout queue uses the priority queue mechanism to sort our buffers and keep the ones that have timed out at the top.

The user handler

The user handler is going to receive IP reassembly buffers. These buffers may or may not be complete, but they will always have at least an IP header and one fragment.

We will check whether the buffer is complete and report an error message if it is not. Otherwise we will just create a packet out of it.

There is no need to copy the data out of the buffer; it already contains everything we need. It is freshly allocated, so it is ours to do with as we please. It has an Ip4 header at the start, followed by all of the reassembled fragments already copied into it.

We are simply going to peer a JMemoryPacket with our buffer. Then we are going to run a scan on the packet to decode it, telling the scanner that the first header is Ip4.

And that's it.

Step 2 - setup the main class

We are going to set up the main class as the packet handler as well. All it needs to do is implement the PcapPacketHandler interface. We are also going to use a static main method that will set up the libpcap portion and register our application as a listener for libpcap packets.

public class IpReassemblyExample implements PcapPacketHandler {

  public static void main(String[] args) {
    StringBuilder errbuf = new StringBuilder();
    Pcap pcap = Pcap.openOffline("tests/test-ipreassembly2.pcap", errbuf);
    if (pcap == null) {
      System.err.println(errbuf.toString());
      return;
    }

    pcap.loop(
      6, // Collect 6 packets
      new IpReassemblyExample(
        5 * 1000, // 5 second timeout for reassembly buffer
        new IpReassemblyBufferHandler() {/*Omitted for now */}),
      "");
  }
}

Here we are using the static main method as the starting point for our application. We open a capture file and enter a dispatch loop. We're only collecting 6 packets and we are registering our application as the PcapPacketHandler. Our constructor takes an IpReassemblyBufferHandler that will be notified with reassembled buffers. We create an anonymous class for that since we only do very minor work in its handler callback method.

Step 3 - setup the packet handler

This is the meat of our application and the main loop for us. All the packets come here. We pick out the ip4 packets and we process each one of them.

public class IpReassemblyExample implements PcapPacketHandler {

  private Ip4 ip = new Ip4(); // Ip4 header

  public void nextPacket(PcapPacket packet, Object user) {

    if (packet.hasHeader(ip)) {
      final int flags = ip.flags();

      /*
       * Check if we have an IP fragment
       */
      if ((flags & Ip4.FLAG_MORE_FRAGEMNTS) != 0) {
        bufferFragment(packet, ip);

        /*
         * record the last fragment
         */
      } else {
        bufferLastFragment(packet, ip);
      }

      /*
       * Our crude timeout mechanism, should be implemented as a separate thread
       */
      timeoutBuffers();
    }
  }
}

We process each Ip4 packet a little differently depending on whether the Ip4.FLAG_MORE_FRAGEMNTS flag is set. If it is not set, the packet is the last fragment; otherwise we have received a fragment from earlier in the datagram. A packet that is not fragmented at all contains a single fragment, which is always the last one, so we treat it as a last fragment.
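For readers who want to see the cases spelled out, the following sketch classifies a packet from the raw 16-bit flags and fragment offset field of the IPv4 header. It works on plain ints and does not use the jNetPcap constants; the class FragmentKind and the labels it returns are hypothetical. The handler above only needs the MORE_FRAGMENTS test, so the two cases without that flag both end up in bufferLastFragment().

```java
/** Hypothetical classifier working on the raw flags/offset field of an IPv4 header. */
final class FragmentKind {
  static final int MORE_FRAGMENTS = 0x2000; // MF bit within the 16-bit field
  static final int OFFSET_MASK = 0x1FFF;    // 13-bit fragment offset, in units of 8 bytes

  static String classify(int flagsAndOffset) {
    boolean more = (flagsAndOffset & MORE_FRAGMENTS) != 0;
    int offset = flagsAndOffset & OFFSET_MASK;
    if (more) {
      return offset == 0 ? "first fragment" : "middle fragment";
    }
    // No more fragments follow: either the tail of a datagram or a whole datagram.
    return offset == 0 ? "unfragmented" : "last fragment";
  }

  public static void main(String[] args) {
    System.out.println(classify(0x2000)); // first fragment
    System.out.println(classify(0x20B9)); // middle fragment
    System.out.println(classify(0x00B9)); // last fragment
    System.out.println(classify(0x4000)); // unfragmented (only DF is set)
  }
}
```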

We use two methods, bufferFragment() and bufferLastFragment(), to record the fragments in the reassembly buffer. bufferLastFragment() is a little bit special in that it records the length of the entire IP datagram we are reassembling, and if all the fragments arrived in sequence it also means we're done with this buffer.

At the end of this loop, we check the timeout queue to see if there is anything ready to be timed out. A better approach would be to put this check in a separate thread, but for our purposes a simple check every time a packet arrives is sufficient.

Step 4 - reassemble buffers

public static class IpReassemblyBuffer
	    extends JBuffer implements Comparable<IpReassemblyBuffer> {

Our reassembly buffer simply extends JBuffer, so we can copy fragment data directly into it. As a matter of fact, we are setting up the buffer to be a complete IP-only packet with all the fragments in it.

+------------+--------+--------+
| Ip4 header | frag 1 | frag 2 |
+------------+--------+--------+

We just have to make sure that as fragments arrive we copy their data into the buffer at the right offset. Luckily, the Ip4.offset() field gives us the offset of every fragment we have to process, so we know exactly where the fragment needs to be copied to: it is always ip.offset * 8 + 20. We multiply by 8 because the offset is expressed in multiples of 8 bytes per the Ip4 specification. We add 20 because that is exactly how many bytes the Ip4 header at the front of the buffer takes up.
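The arithmetic is easy to check with a plain byte array standing in for the JBuffer. The class FragmentCopy and its methods below are hypothetical names used only for this sketch.

```java
/** Hypothetical illustration of where a fragment's payload lands in the buffer. */
final class FragmentCopy {
  static final int HEADER_LENGTH = 20; // the Ip4 header at the front of the buffer

  /** Converts the header's offset field (8-byte units) into a byte position in the buffer. */
  static int bufferOffset(int offsetField) {
    return offsetField * 8 + HEADER_LENGTH;
  }

  static void copyFragment(byte[] buffer, int offsetField, byte[] payload) {
    System.arraycopy(payload, 0, buffer, bufferOffset(offsetField), payload.length);
  }

  public static void main(String[] args) {
    byte[] buffer = new byte[8 * 1024];
    byte[] secondFragment = {1, 2, 3, 4};
    // An offset field of 185 means the payload starts 1480 bytes into the datagram's data.
    copyFragment(buffer, 185, secondFragment);
    System.out.println(bufferOffset(185)); // 1500
    System.out.println(buffer[1500]);      // 1
  }
}
```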

Our buffer will also keep track of how many bytes have already been copied into it. When the number of bytes copied equals the length of the original IP datagram, we know we are done. To keep things simple, we assume that the arriving fragments are never duplicates and never overlap. Handling those cases would require additional logic in the buffer to keep track of each individual fragment.
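The tutorial's buffer does not implement that additional logic. As one possible direction, and purely as an assumption on our part, the sketch below marks each received 8-byte block in a java.util.BitSet so that duplicate or overlapping fragments are not counted twice. The class FragmentTracker is hypothetical.

```java
import java.util.BitSet;

/** Hypothetical tracker that tolerates duplicate and overlapping fragments. */
final class FragmentTracker {
  private final BitSet blocks = new BitSet(); // one bit per 8-byte block of payload
  private int totalBlocks = -1;               // unknown until the last fragment arrives

  void add(int offsetField, int payloadLength, boolean lastFragment) {
    int blockCount = (payloadLength + 7) / 8; // only the last fragment may end mid-block
    blocks.set(offsetField, offsetField + blockCount);
    if (lastFragment) {
      totalBlocks = offsetField + blockCount;
    }
  }

  /** Complete when every block below the end of the datagram has been seen. */
  boolean isComplete() {
    return totalBlocks >= 0 && blocks.nextClearBit(0) >= totalBlocks;
  }

  public static void main(String[] args) {
    FragmentTracker tracker = new FragmentTracker();
    tracker.add(185, 520, true);              // last fragment first
    tracker.add(185, 520, true);              // duplicate, changes nothing
    System.out.println(tracker.isComplete()); // false
    tracker.add(0, 1480, false);              // first fragment fills the gap
    System.out.println(tracker.isComplete()); // true
  }
}
```

Setting a bit that is already set is harmless, which is what makes duplicates a non-issue in this scheme.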

We are going to use the first fragment that triggered the creation of this reassembly buffer as a template to write our ip header up front. We do need to reset certain values to make the header applicable to the new reassembled datagram.

All the fragment data will be written into the buffer starting at offset 20, past our ip header.

// Buffers ordered by timeout timestamp; the head is the next one to expire
private final Queue<IpReassemblyBuffer> timeoutQueue = new PriorityQueue<IpReassemblyBuffer>();

private void timeoutBuffers() {
  // Dispatch expired buffers from the head; stop at the first one still active
  while (!timeoutQueue.isEmpty() && timeoutQueue.peek().isTimedout()) {
    dispatch(timeoutQueue.poll());
  }
}

The buffer also maintains a timeout timestamp: the time at which the buffer will have timed out. If that timestamp is still in the future, the buffer is still active; if it is in the past, the buffer has timed out and can be cleared from our buffer map and timeout queue. Notice that the buffer implements the JRE Comparable interface. This is because we are using the JRE PriorityQueue class to manage and sort our timeout queue. We prioritise the buffers on the queue according to their timeout timestamp: oldest on top, youngest below. The priority queue sorts that for us automatically. We check starting at the top of the queue, expiring buffers until we reach one that has not expired yet. We then break out of the timeout loop, since the priority queue ordering guarantees there can be no more expired buffers below it.