
How to configure routed IPv6 in Docker

This article has been published on the APNIC blog as well.

I felt the need to write this short blog post, as I still receive DMs and emails from people who were not able to get native routed IPv6 working on Docker. In this post, I’ll show you how to do it in a simplified manner, while obscuring some details of network architecture for the purpose of solely focusing on Docker.

I have been using routed IPv6 on Docker for years before the ‘routed mode’ configuration was officially added in 2024. However, even with the updates added in Docker v27, I still see users online struggling to get IPv6 working properly.

A lot of the confusion seems to stem from misinformation across the web and a general lack of network engineering knowledge among the public. The concept of routing is alien enough even within the network engineering world. This leads to confusion around terms like ‘NAT’, ‘Bridge’, ‘Host Driver’, and ‘NDP Proxy’ in the context of Docker.

I had planned to write a more detailed article on how to do IPv6 with Docker in collaboration with Docker Inc. and publish it here. But life got in the way. I still plan to make it happen one day, so keep an eye on this GitHub issue for updates.

What does ‘routed’ really mean?

In this context, it means the IPv6 prefix is reachable via native Layer 3 packet forwarding, without hacks like NAT66 and/or NDP proxying. In the Docker context specifically, it means the Docker ‘network’ isn’t bridged with the host’s underlying network infrastructure, and containers aren’t relying on the host’s IPv6 address for connectivity.

In other words, routing means no bridging, no NAT-ting, and no sharing of the host’s public IPv6 address on the WAN interface. It’s native Layer 3 packet forwarding all the way from your network infrastructure to the host’s Docker network segment.

Assumptions

  1. We will use Docker Compose.
  2. You know how routing works.
  3. You know how to route an IPv6 prefix (preferably a /64) from the underlay network to the Docker host.
  4. The Docker host can be a Pi, a laptop running Debian, a server box running a hypervisor or a regular Debian OS, or a Virtual Machine (VM) on a hypervisor or even KubeVirt.
  5. The Docker host’s iptables/nftables is clean; there’s no user-defined configuration that could break Docker and/or IPv6 networking.
  6. Let’s assume our Docker public IPv6 Prefix is 2001:db8::/64, and it is routed to the Docker host from the underlying network.

Routing Protocol Recommendation

I recommend BGP as the preferred routing protocol, as it’s the standard approach in data centre Clos fabrics. Alternatively, IS-IS is a viable option.

As a best practice, remember to route the IPv6 prefix to a blackhole with a high administrative distance (or metric) on the Docker host. This acts as a safety net if Docker fails in production. Any traffic that was previously live at scale would now just get routed to the blackhole, minimizing CPU load that would otherwise be spent generating ICMPv6 ‘Destination Unreachable’ packets.
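As a sketch of that safety net (using the example prefix from this post and iproute2; the metric value is an arbitrary assumption, it just needs to be worse than Docker’s connected route for the bridge):

```
# Blackhole route for the routed Docker prefix, with a deliberately poor metric.
# While Docker is healthy, the bridge's connected route wins. If the bridge
# disappears, traffic falls into the blackhole and is dropped silently,
# instead of burning CPU on ICMPv6 'Destination Unreachable' generation.
ip -6 route add blackhole 2001:db8::/64 metric 1024
```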

Docker Compose configuration

You can create a custom Docker bridge and assign the routed IPv6 prefix to it. From there, the rest is simple: containers will get an IPv6 address from the configured prefix, or you can assign a static IPv6 address to a container if you prefer.

networks:
  ipv6_native:
    driver: bridge
    driver_opts:
      com.docker.network.bridge.gateway_mode_ipv6: "routed"
    enable_ipv6: true
    ipam:
      driver: default
      config:
        - subnet: 2001:db8::/64
          gateway: 2001:db8::1
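If you want the static-address option mentioned above, a service can be pinned to an address from the same prefix. A minimal sketch (the service name and image are hypothetical; the address just needs to sit inside the configured /64):

```yaml
services:
  web:
    image: my-app:latest  # hypothetical image
    networks:
      ipv6_native:
        # Static address from the routed 2001:db8::/64 prefix
        ipv6_address: 2001:db8::80
```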

Additional tips on firewall rules

I recommend network engineering-centric firewall rules, whereby the rules are constructed to permit solicited peer-to-peer (P2P) traffic in both directions, permit certain ports directly, permit ICMPv4/v6, permit UDP traceroute, and so on. To achieve this, you can disable Docker’s manipulation of iptables and ip6tables (or, eventually, nftables).

Disable Docker’s iptables/ip6tables manipulation by adding these lines to your /etc/docker/daemon.json:

{
  "iptables": false,
  "ip6tables": false
}

But remember: you now need to build your own rules to protect the host and containers, along with NAT rules for IPv4 to work. Below is a quick example of NAT-ting IPv4 using nftables. The ‘persistent’ flag simply helps STUN work better from the client side, thanks to a persistent port mapping between the RFC 1918 address and the public IP; that is a complex topic of its own and outside the scope of this post. Better yet, move everything to IPv6 and forget about these NAT-related hacks and complexities.

table inet docker {
    chain postrouting {
        type nat hook postrouting priority 100; policy accept;
        ip saddr 172.16.0.0/16 oifname "eth0" snat to 192.0.2.1 persistent;
    }
}
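As a rough, illustrative sketch of the kind of filter rules you would then own yourself (the interface assumptions, the traceroute port range, and the example service port are mine, not a drop-in ruleset):

```
table inet filter {
    chain forward {
        type filter hook forward priority 0; policy drop;

        # Solicited return traffic for flows the containers initiated
        ct state established,related accept

        # ICMPv4/ICMPv6: PMTUD, NDP, ping, traceroute replies, etc.
        meta l4proto { icmp, icmpv6 } accept

        # Classic UDP traceroute probes towards the containers
        udp dport 33434-33523 accept

        # Example: permit TCP 443 to the routed prefix (hypothetical service)
        ip6 daddr 2001:db8::/64 tcp dport 443 accept
    }
}
```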

If you want to continue using Docker’s built-in iptables manipulation, then remember: even with routed IPv6, you still need to publish the ports you want to be globally accessible.
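For example, a sketch of publishing a port in Compose on the bridge defined earlier (the service name, image, and port are hypothetical):

```yaml
services:
  app:
    image: my-app:latest  # hypothetical image
    networks:
      - ipv6_native
    ports:
      # Published so Docker's own firewall rules permit inbound traffic
      - "3000/tcp"
```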

Conclusion

As we’ve seen in this post, routed IPv6 with Docker doesn’t need to be complex.

Published in Linux, Networking

4 Comments

  1. Abel Martín

    I’ve been waiting for years for a write-up like yours. Hopefully, the Docker community starts embracing best practices when it comes to IPv6 networking. Your help pushing for a better understanding of IPv6 in Docker is much appreciated.

    Keep up the good work!

  2. I am able to get ICMPv6 packets in with this setup and have confirmed it (I exec’d into the container and then tcpdump’d).

    However, any port-based protocols like TCP and UDP seem to fail when sent from that same remote host (several hops away); only pinging works, it seems.

    Any ideas? I know the container is bound correctly and I can reach (from the container machine) the container’s port 3000. But trying to do so remotely doesn’t work.

    I have tried both with ip6tables set to `true` and `false`.

    My network config is here: https://github.com/deavmi/docker/blob/master/nodes/services/networks.yml#L26
    And the service, `gitea_old`, here: https://github.com/deavmi/docker/blob/master/nodes/services/git.yml#L23

    • Got it working.
      Firstly, I still needed to publish the port for Docker firewalling reasons, so I had to add:

      ```yaml
      ports:
        # Port I want accessible
        - 3000/tcp
      ```

      Secondly, having a mix of networks in `networks:` didn’t help. I removed those; only then did it seem to work.

      I haven’t removed ip6tables for this either; I kept it there, as it can still be useful for cases where I don’t use this `routed` mode.

      • A few things:
        1. Ideally, the industry should’ve learned from the network design process and moved container firewalling into the container namespace itself, instead of complicating the host when there are hundreds or thousands of containers at scale. Since a namespace is, by itself, a unique network stack, it would’ve been trivial for this to happen, but it didn’t. I’ve seen similar takes from the OpenBSD-type communities on how jails/container security should’ve been done in Linux.

        2. Ideally, as things stand today, disable Docker’s iptables manipulation and manage the firewall rules yourself using nftables. Since you’re a software engineer, this is trivial: just automate it with some scripting and a CI/CD pipeline for config management.

        3. I’ve been putting off nftables for too long for personal use. I plan, perhaps later this year, to create a “Daryll Swer nftables template” that matches (and improves upon) the legacy iptables rules I’ve used in production networks. It would be usable by Docker users as well, so keep an eye on my blog.

        4. Yes, don’t mix multiple networks (subnets): use a single custom bridge with a single subnet (a /64 is sufficient for scale), and route that subnet over eBGP (since this is host networking, and eBGP is the recommended industry standard for host routing).

Daryll Swer's network engineering blog.