So you want to build a Linux router

This post is part of a series on building a router in Linux.


Why build your own router?

When my ISP issued me my router, it presented me with an interesting feature: The ability to manage my router from anywhere in the world with their provided app!

Which immediately triggered alarm bells.

To enable that, I figured one of two things is probably happening:

  1. The router has a publicly accessible control port through which the app speaks (bad)
  2. The router polls a Command and Control server for commands to process (also bad)

Using these techniques does not in itself render a platform insecure, but they are often exploited by attackers to gain control over devices. I am also entirely unconvinced that I need a “smart” router of any kind. I’d prefer my hardware not to introduce unintended back doors into my network, thank you very much.

“But Jim”, you might be thinking, “we are talking about trusted and proven brands here. Surely you’re being over the top?”

Probably, but probably not. Plus, what’s the point in being a power user, if you can’t exert said power over your own network?

Furthermore, I am very privacy-aware, and the idea of a device feeding data back to a mothership was simply a no-go for me.

I unplugged the device and threw it in the bin with no intention of looking back.

Let’s begin

What is a router?

At a high level, a router is simply a network device that connects devices on a local network to the internet. Simple and straightforward, until you peer a little closer. In order to pull this off, there are a few boxes we need to tick.

Assigning network addresses: DHCP

In order for a local network to exist, there must be an addressing scheme. We won’t delve too deep into related protocols such as ARP or BOOTP; for now we just need to tell computer A that its address is x.x.x.x and computer B that its address is x.x.x.y.

And for that, we will need a DHCP server.
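As a rough illustration of what “telling computer A its address” amounts to, here is a minimal Python sketch of a lease pool. It is not a DHCP server (there is no DISCOVER/OFFER/REQUEST/ACK exchange, and leases never expire). The 192.168.1.0/24 subnet, the MAC addresses and the LeasePool name are all made up for the example.

```python
import ipaddress


class LeasePool:
    """Toy address pool: hands each MAC address a stable IPv4 address."""

    def __init__(self, network: str, gateway: str):
        net = ipaddress.ip_network(network)
        self.gateway = ipaddress.ip_address(gateway)
        # hosts() already skips the network and broadcast addresses.
        self._free = [ip for ip in net.hosts() if ip != self.gateway]
        self._leases = {}  # MAC address -> leased IPv4 address

    def lease(self, mac: str) -> ipaddress.IPv4Address:
        mac = mac.lower()
        if mac not in self._leases:
            if not self._free:
                raise RuntimeError("address pool exhausted")
            self._leases[mac] = self._free.pop(0)
        return self._leases[mac]


# Hypothetical subnet and clients, purely for illustration.
pool = LeasePool("192.168.1.0/24", gateway="192.168.1.1")
computer_a = pool.lease("aa:bb:cc:00:00:01")
computer_b = pool.lease("aa:bb:cc:00:00:02")
print(computer_a, computer_b)
```

Asking again with the same MAC address returns the same lease, which is the behaviour most clients expect from a home network.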

Resolving hostnames: DNS

With step 1 complete, we can now start our foray into the internet. As most of you know, domain names such as news.ycombinator.com need to be resolved into IP addresses that routers understand. This is achieved via DNS queries.

An important caveat here is that DNS servers are presented to network clients in the DHCP options field when they request a local address. This means our router does not need to be a DNS server; it only needs to know of a few, such as 1.1.1.1 (Cloudflare), 8.8.8.8 (Google’s primary DNS) or 9.9.9.9 (Quad9). Your ISP probably has its own DNS servers too; more on this later.
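To make the “DHCP options field” a little more concrete, here is a small Python sketch of how a list of DNS servers is laid out on the wire. In DHCP, the Domain Name Server option is option code 6 (RFC 2132): one byte for the code, one byte for the payload length, then four bytes per server. The encode_dns_option helper is a name invented for this example, not part of any DHCP library.

```python
import ipaddress

DHCP_OPTION_DNS = 6  # "Domain Name Server" option code, RFC 2132


def encode_dns_option(servers):
    """Encode DNS servers as a DHCP option: code, length, then 4 bytes each."""
    payload = b"".join(ipaddress.IPv4Address(s).packed for s in servers)
    return bytes([DHCP_OPTION_DNS, len(payload)]) + payload


option = encode_dns_option(["1.1.1.1", "8.8.8.8", "9.9.9.9"])
print(option.hex())  # 060c010101010808080809090909
```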

The internet connection

Now that we know who we are on the network and who to speak to in order to resolve domains, we need a way of physically getting there. Many ISPs (DSL providers in particular) make use of PPPoE. PPPoE is a protocol that encapsulates PPP frames inside Ethernet frames (as neither is directly compatible with the other). This gives ISPs a fairly standard mechanism for connecting one remote network to another, as PPPoE provides most of the benefits of PPP (such as authentication and per-subscriber sessions) over ordinary Ethernet without much fuss.

There are plenty of PPPoE clients for Linux, so we don’t have to sweat too much over this either.
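For the curious, the nesting looks roughly like this: an Ethernet header, then a six-byte PPPoE header, then a PPP protocol number, then the IP packet itself. The Python sketch below builds a session-stage frame following the layout in RFC 2516. It skips the discovery stage and PPP negotiation entirely, and the MAC addresses, session id and dummy IP header are placeholders.

```python
import struct

ETHERTYPE_PPPOE_SESSION = 0x8864  # RFC 2516 session-stage EtherType
PPP_PROTO_IPV4 = 0x0021           # PPP protocol number for IPv4


def pppoe_session_frame(dst_mac, src_mac, session_id, ip_packet):
    """Wrap an IP packet in PPP, then PPPoE, then an Ethernet header."""
    ppp = struct.pack("!H", PPP_PROTO_IPV4) + ip_packet
    # Version 1 and type 1 share one byte (0x11); code 0x00 is session data.
    pppoe = struct.pack("!BBHH", 0x11, 0x00, session_id, len(ppp))
    ethernet = dst_mac + src_mac + struct.pack("!H", ETHERTYPE_PPPOE_SESSION)
    return ethernet + pppoe + ppp


frame = pppoe_session_frame(
    dst_mac=bytes.fromhex("020000000001"),  # placeholder MAC addresses
    src_mac=bytes.fromhex("020000000002"),
    session_id=0x1234,                      # placeholder session id
    ip_packet=b"\x45" + bytes(19),          # dummy 20-byte IPv4 header
)
print(len(frame), frame[12:22].hex())
```

In practice the kernel and a PPPoE client do all of this for us; the point is only to show which protocol sits inside which.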

Firewall: Filtering and translating packets

Firewalls are an interesting bit of network technology. The typical definition of firewall for me has always been any piece of software (or hardware) that can provide Packet Filtering and Network Address Translation.

Packet Filtering

This is the behaviour most people associate with a firewall. It inspects packets (incoming and/or outgoing) and decides whether or not to allow them to proceed.
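A minimal sketch of that decision in Python might look like the following: walk an ordered list of rules, take the first match, and fall back to a default policy. The Rule fields, the decide function and the example rules are all invented for illustration; real filters match on far more (protocol, interface, connection state and so on).

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    action: str           # "accept" or "drop"
    src: str              # source network in CIDR notation
    dport: Optional[int]  # destination port, or None for any port


def decide(rules, src_ip, dport, default="drop"):
    """Return the action of the first matching rule, else the default."""
    for rule in rules:
        in_net = ipaddress.ip_address(src_ip) in ipaddress.ip_network(rule.src)
        if in_net and rule.dport in (None, dport):
            return rule.action
    return default


rules = [
    Rule("accept", "192.168.1.0/24", None),  # anything from the local network
    Rule("accept", "0.0.0.0/0", 443),        # HTTPS from anywhere
]
print(decide(rules, "192.168.1.50", 22))  # accept
print(decide(rules, "203.0.113.9", 22))   # drop
```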

Network Address Translation: NAT

NAT is the evil hack to circumvent IPv4 address exhaustion. IPv4 addresses are simply unsigned 32-bit integers (uint32). A uint32 can hold 2^32 distinct values (4,294,967,296), with a maximum value of 4,294,967,295. That might seem impressive, but the estimated number of internet-connected devices is around the 35 billion mark, roughly 8 times the range of IPv4.
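The arithmetic is easy to check with Python’s standard ipaddress module. The 35 billion figure below is simply the estimate quoted above, not a measured value.

```python
import ipaddress

# An IPv4 address is just a 32-bit unsigned integer in disguise.
as_int = int(ipaddress.IPv4Address("8.8.8.8"))
print(as_int)  # 134744072

address_space = 2 ** 32                       # number of distinct addresses
highest = ipaddress.IPv4Address(2 ** 32 - 1)  # largest representable address
print(address_space, highest)  # 4294967296 255.255.255.255

estimated_devices = 35_000_000_000  # estimate quoted in the text
ratio = round(estimated_devices / address_space, 1)
print(ratio)  # 8.1
```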

The purpose of NAT is to lift and shift packets from one set of IPv4 addresses onto another, so that many devices on a private network can share a single “public” IPv4 address. Private addresses only need to be unique within their own network, meaning we no longer have to worry about global uniqueness for every device! Hurray workarounds!

In practice, it goes something like this:

Your router knows how to speak to your local network, but has no idea how to speak to 8.8.8.8. Instead, it will pawn that request off to your ISP, whose IP range most certainly does not match yours. It does this pretty elegantly: by intercepting the IP packet and rewriting its source address (and usually source port) on the way out, so that the reply comes back to your router, which then rewrites the destination to return the reply to you. Your ISP may or may not perform another NAT of its own along the way. This is a complex step involving a fair amount of state management and table altering, but happily there are mechanisms in the wild to support this without us having to implement it ourselves. We will delve into this later when needed.
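To give a feel for the state involved, here is a toy source-NAT table in Python. It is a deliberately simplified sketch: a real implementation also tracks the protocol and remote endpoint, expires idle entries and recomputes checksums. The SourceNat class, the 203.0.113.7 public address and the starting port of 40000 are assumptions made for the example.

```python
class SourceNat:
    """Toy source NAT: maps private (ip, port) pairs onto public ports."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._next_port = first_port
        self._out = {}   # (private_ip, private_port) -> public_port
        self._back = {}  # public_port -> (private_ip, private_port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the source of a packet leaving the local network."""
        key = (src_ip, src_port)
        if key not in self._out:
            self._out[key] = self._next_port
            self._back[self._next_port] = key
            self._next_port += 1
        return (self.public_ip, self._out[key], dst_ip, dst_port)

    def inbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the destination of a reply back to the private host."""
        private_ip, private_port = self._back[dst_port]
        return (src_ip, src_port, private_ip, private_port)


nat = SourceNat("203.0.113.7")  # hypothetical public address
out = nat.outbound("192.168.1.2", 51000, "8.8.8.8", 53)
reply = nat.inbound("8.8.8.8", 53, "203.0.113.7", 40000)
print(out)
print(reply)
```

The outbound packet leaves with the router’s public address as its source, and the reply is matched against the table to find its way back to 192.168.1.2.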

Now that we have an understanding of the challenges ahead, we can begin preparing our device. Stay tuned for the next post.