Requirement
Overview
The RARE project's objective is to provide a routing platform offering solutions that address multiple use cases in the R&E landscape. In the picture below, the different use cases are shown in purple:
As you may notice, each use case runs on different hardware that can potentially have a different dataplane. As we were starting from a clean slate without much choice, especially regarding P4 programmability, the first dataplane (or P4 target) we considered was BMv2. BMv2 is an excellent way to learn P4, and it is also the first target we use to program and validate new features. After 6 months of practising our "P4-fu" we had:
- a P4lang repository for Ubuntu Bionic and Focal
- a Debian 10 repository
- our first RARE/FreeRouter prototype powered by a P4 BMv2 dataplane!
Our initial work, considering FreeRouter's Java nature, was to write a Java P4Runtime gRPC client able to program the entries in the tables exposed by BMv2 via the P4Info file. However, this would have intimately tied FreeRouter code to P4Runtime gRPC code. Even if this solution seemed the most natural one, going in that direction assumed that dataplanes other than BMv2 would be compliant with P4Runtime. It turns out that this is not the case. We then opted for a simple message API via a bi-directional raw UNIX socket. We will see what this means later in this blog.
Motivated by the successful experience with BMv2, we then decided to move forward and started to study TOFINO as a target. We were eager to apply our P4 code to multi-terabit traffic. After a few P4 program compilations, my personal first impression was... mind blowing! INTEL/BAREFOOT TOFINO effectively opened the door to multi-terabit packet processing. Just having the possibility to process traffic at these rates at our fingertips was exciting!
As a side note, the journey was not without suffering and pain... We had to port our BMv2 code, and porting to TOFINO was not "une lettre à la poste" (a piece of cake)... It is not that TOFINO programming is gratuitously painful. It is just that it is p4c-tofino's job to make sure that our packets are processed at silicon lightning speed. Imagine you are asked to convey parcels by driving from Paris to Amsterdam with a car that has an infinitely sized trunk, an infinite gas tank and no particular speed constraint along the road. And then you are asked to do the same trip, but with an actual real car that has a trunk with a fixed size and a 50 litre gas tank, and of course you have to follow the speed signs along the road.
In the first case, you would load as many parcels as you like, you would not even bother looking at your gas tank level, and maybe you would set the speed to 200 km/h. The second case forces you to think carefully about how many parcels you can put in your trunk, to check whether one full tank is sufficient for the trip and, of course, to follow the speed signs.
If you allow me this comparison, this is where BMv2 and TOFINO programming differ.
But this pain was not in vain, it was for the greater good... You can't imagine the joy when you see the TOFINO compiler displaying the word DONE! For the veterans who can remember, it is the same feeling as when you manage to compile your first program in the ADA language. The compiler is so strict that compiling an ADA program is in itself a feat. No wonder this language is used in space rockets (Ariane).
Back to our dataplane interface story: even though TOFINO and BMv2 share some roots, while BMv2 had P4Runtime as a northbound interface, INTEL/BAREFOOT pushed its own gRPC interface counterpart into the TOFINO platform with P4_16: BfRuntime.
Our bet paid off, as the FreeRouter message API was unchanged and, without much effort, we could add a new dataplane "wingman" to the FreeRouter control plane.
To recap:
- For BMv2: Our interface yields P4Runtime RPC calls. This program is called: forwarder.py
- For TOFINO: Our interface yields BfRuntime RPC calls. This program is called, without too much originality: bf_forwarder.py
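To make the idea concrete, here is a minimal Python sketch of how one message API can sit in front of interchangeable dataplane backends. The message names (`route4_add`, `route4_del`), their fields, and the `Dataplane` / `RecordingDataplane` classes are hypothetical placeholders chosen for illustration; they are not the actual messages or code of forwarder.py or bf_forwarder.py.

```python
from abc import ABC, abstractmethod


class Dataplane(ABC):
    """Hypothetical backend interface: one subclass per target."""

    @abstractmethod
    def add_route(self, prefix: str, port: int) -> None: ...

    @abstractmethod
    def del_route(self, prefix: str) -> None: ...


class RecordingDataplane(Dataplane):
    """Stand-in backend that only records the calls it receives.

    A real backend would issue P4Runtime (BMv2) or BfRuntime (TOFINO)
    RPC calls at this point.
    """

    def __init__(self, name: str) -> None:
        self.name = name
        self.calls = []

    def add_route(self, prefix: str, port: int) -> None:
        self.calls.append(("add", prefix, port))

    def del_route(self, prefix: str) -> None:
        self.calls.append(("del", prefix))


def dispatch(line: str, dataplane: Dataplane) -> bool:
    """Translate one text message into a backend call.

    Returns False when the message is not recognised.
    The message names below are assumptions, not the real API.
    """
    fields = line.split()
    if len(fields) == 3 and fields[0] == "route4_add":
        dataplane.add_route(fields[1], int(fields[2]))
        return True
    if len(fields) == 2 and fields[0] == "route4_del":
        dataplane.del_route(fields[1])
        return True
    return False


if __name__ == "__main__":
    # The same messages drive two different backends unchanged.
    for backend in (RecordingDataplane("bmv2"), RecordingDataplane("tofino")):
        dispatch("route4_add 10.0.0.0/8 3", backend)
        dispatch("route4_del 10.0.0.0/8", backend)
        print(backend.name, backend.calls)
```

The point of the design is that the control plane only ever produces messages; adding a new target means writing a new backend, not touching the message producer.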
At that point we were starting to have a decent LSR/LER router for CORE and Aggregation use cases.
But we still had nothing at the EDGE/AGGREGATION layer in terms of a solution proposal. Deploying P4 hardware might be far too expensive in specific contexts such as small R&E institutions like primary schools or small R&E labs. For that purpose, we started to study new targets such as VMWARE XDP and a very promising project: T4P4S ELTE. While we could not use XDP without a lot of P4 code rewriting and compromise, T4P4S ELTE was, from our perspective, very promising. But due to a compilation issue, we could not move forward.
FPGA was also a solution that we considered, but we had no access to any P4-compliant FPGA hardware.
As a result, we were a little bit bitter and started to read about the DPDK library. And we started to play with DPDK examples... These examples were tremendously useful, as they sparked some DPDK development within the RARE team. Step by step, Csaba, the FreeRouter lead developer, came up with this GENIUS idea: why don't we just emulate the RARE P4 dataplane program? We can still revert to using T4P4S ELTE when it is ready.
P4emu/P4dpdk was then born !
To conclude this short story, RARE/FreeRouter now has 3 completely different dataplanes (in order of appearance):
- BMv2
- TOFINO
- DPDK
Unique RARE/FreeRouter feature
However, please note that the FreeRouter message API is common to the three dataplanes listed above. You will see further on how this structure makes the solution open, modular and interchangeable.
Article objective
In this article, let's present the RARE/FreeRouter platform structure and focus on the interface(s) between the FreeRouter control plane and the various dataplanes.
Diagram
[ #001 ] - Modular design
Discussion
Conclusion
In this 1st article you:
- had a 10,000-foot view of the RARE/FreeRouter modular design
- learned that this design allows rapid dataplane addition without altering the FreeRouter code base in any way
- learned that, should you want to re-use the BMv2/TOFINO/P4DPDK dataplanes, this has never been implemented but is possible!
Message API documentation
For the time being, this message API is not yet publicly documented. However, it is available, buried inside the forwarder.py and bf_forwarder.py source code. This is work in progress, but if you feel an urgent need to use it, feel free to read the code.
PS: We will publish this document ASAP, but time plays against us...
Requirement
Overview
This is the first of a series of articles on the APS Networks® BF2556X-1T P4 switch. The key highlights of this box are:
- It is a P4 TOFINO NPU based switch
- This TOFINO version has 2 cores (i.e. 2 pipes) and can handle up to 2 Tbps
- It offers multiple connection types and rates:
- 48x25G SFP28 and 8x100G QSFP28
- SFP28 ports [1 - 16] can be configured as 1G/10G/25G
- SFP28 ports [17 - 48] can be configured as 10G/25G
- QSFP28 ports [49 - 56]: each QSFP28 port can be configured in 1x100G, 2x50G, 4x25G, 1x40G or 4x10G mode.
- SyncE and 1588 support
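The port layout above can be captured in a small lookup table, which is handy when you want to check a planned port configuration before touching the box. This is only an illustrative Python sketch: the `PORT_GROUPS` structure and the helper names are ours and are not part of any APS Networks or TOFINO SDE tooling.

```python
# Port ranges and modes as listed above (port numbers are 1-based).
# The data structure and function names are illustrative assumptions.
PORT_GROUPS = [
    (range(1, 17), "SFP28", {"1G", "10G", "25G"}),
    (range(17, 49), "SFP28", {"10G", "25G"}),
    (range(49, 57), "QSFP28", {"1x100G", "2x50G", "4x25G", "1x40G", "4x10G"}),
]


def port_info(port: int):
    """Return (form_factor, allowed_modes) for a front-panel port."""
    for ports, form_factor, modes in PORT_GROUPS:
        if port in ports:
            return form_factor, modes
    raise ValueError(f"port {port} does not exist on this box")


def supports(port: int, mode: str) -> bool:
    """Tell whether a front-panel port can be configured in the given mode."""
    return mode in port_info(port)[1]


if __name__ == "__main__":
    for port, mode in ((1, "1G"), (17, "1G"), (49, "4x25G")):
        print(f"port {port} mode {mode}: {supports(port, mode)}")
```

Note that only the first 16 SFP28 ports accept 1G, which matters if you plan to connect existing 1GE equipment.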
Article objective
In this article, we will just give a basic introduction to the BF2556X-1T.
[ #001 ] - BF2556X-1T in a nutshell
Discussion
Conclusion
In this 1st article you:
- had a brief description of the APS Networks® BF2556X-1T hardware platform
- learned that the hardware provides P4 connectivity at 1GE (16x1GE ports are available)
- learned that, in addition to 1GE, it also provides 10/25/40/50/100G connectivity
RARE hardware platform: [ BF2556X-1T #001 ] - key take-away
- From the RARE/FreeRouter point of view, the BF2556X-1T is a very good candidate for a PE (Provider Edge) router.
The 8x100G ports make it a strong candidate in a collapsed core architecture (P functions merged with PE functions). The box can also be used as a BGP route reflector, as it boasts 32 GB of RAM (~10 full BGP feeds), but you would then not make use of its port density. It can be used to implement a BRAS/BNG use case, and would also be a good candidate as a ToR in a data center environment, with BGP/MPLS capability and the possibility to provide 1GE connections to existing servers purchased beforehand.
- SyncE and 1588 support is a key feature if your application needs the precision provided by PTP
As we explore the box, we will explain in further articles how to benefit from these features.
- The RARE/freeRouter design can coexist with virtualisation technology on the BF2556X-1T
We have just started our experience with this box. Further on, you will find a series of articles dedicated to the BF2556X-1T covering:
- How to proceed with the initial OS installation
- APS Networks® BF2556X-1T software installation (TOFINO SDE and Gearbox)
- Port operations on TOFINO ports: SFP28 ports 17-48 and QSFP28 ports 49-56
- Port operations on GearBox ports SFP28 port 1-16 (1G/10G/25G)
- How to benefit from SyncE 1588 support
- RARE/freeRouter effective installation
The installation should be compliant with ISP/telecom standards: it should survive power outages, be easy to upgrade, and start automatically at boot time without any human intervention.
Requirement
Overview
Several choices were possible; we finally ended up following the KISS principle. The operating system requirements are:
- requirement #0: LTS operating system
- requirement #1: Benefit from LTS security patches
- requirement #2: Must be able to run DPDK
- requirement #3: (personal requirement) Must be familiar to me
- requirement #4: Able to run Java software as freeRouter is written in Java
- requirement #5: Small operating system software footprint
- requirement #6: Support for IPv4/IPv6
The hardest path would be:
- Create a custom linux distribution using the Yocto project:
The objective is to have tight control of the software installed on the appliance. This should yield the smallest footprint we can hope for. For those familiar with OpenWRT, we can reach a tiny image size: my OpenWRT image is 5 MB.
- Use of NixOS or Nix package manager
This provides an incredible feature: commit/rollback functionality at the package management level!
Note
The features above are still under study within the RARE group. We will introduce these technologies once we feel more confident about how to integrate them into a streamlined deployment process.
Article objective
In this article we will go through the major steps in deploying Debian 10 stable (aka Buster) to prepare for the freeRouter installation.
Diagrams
[ #002 ] - Cookbook
Discussion
Conclusion
In this article, we got our hands dirty and manually installed freeRouter with a DPDK dataplane from a clean slate environment. This was done on purpose, as I would like you to understand the whole installation process in detail. There is an automated installation alternative that will also install freeRouter. However, it installs freeRouter with the software backend. If your hardware (CPU + NIC) is compatible, you can simply replace the software backend with the DPDK backend. At this point we have a genuine vanilla installation of freeRouter with a DPDK dataplane, on an appliance that can survive a harsh physical environment and power cuts. We now just have to create the 2 freeRouter configuration files:
ls -l rtr-*
-rw-r--r-- 1 root root 646 Jul 31 17:03 rtr-hw.txt
-rw-r--r-- 1 root root 9027 Aug 25 10:02 rtr-sw.txt
RARE validated design: [ SOHO #002 ] - key take-away
- freeRouter installation is not complex. It just boils down to installing a basic supported Linux OS, Java, some 3rd party software, and the freeRouter jar and binaries themselves
- In the binary list there is a special one called p4dpdk, which corresponds to the freeRouter DPDK dataplane that emulates the RARE P4 program written for BMv2 (it does not emulate BMv2 itself!)
- Though this installation is manual for pedagogical purposes, it can be fully automated: just fire up a VM with a bunch of interfaces and test it!
- The proposed installation is highly resilient and will ease upgrades of the appliance (we will see in a subsequent article what this means)
In the next article, we will configure the freeRouter appliance, start the router, and provide configuration in order to have effective basic ping reachability to the FTTH BROADBAND internal IP.
Requirement
Overview
Back in 2004, I deployed an 8 Mbps ATM circuit that connected an airline company hub site. Traffic has grown amazingly since then! In 2020, what does SOHO (Small Office, Home Office) mean? In our use case we will consider a SOHO connected via a 1GE link. This is for example:
- Primary schools, Secondary schools
- Small R&E institution spoke sites
- Home office (especially considering the COVID context)
- Small company spoke agencies
Article objective
In this article we will describe how to build a carrier-grade SOHO router (aka CPE) on an actual real platform. As an example, let me share my personal story and introduce the SOHO hardware that I am using at home. It is compliant with the requirements implied by the use cases listed above:
Requirements
- requirement #0: n×1GE capable, ISP uplink is 1GE
- requirement #1: completely silent, so the box can be placed in a crowded room
- requirement #2: low power consumption, as it is meant to run 24x7 (I'm paying the bill!)
- requirement #3: runs 64-bit Linux
- requirement #4: native support of DPDK
Diagrams
[ #001 ] - Cookbook
Discussion
Conclusion
In this 1st article you:
- had a brief description of a hardware platform suitable for SOHO
- had a description of the SOHO use case in 2020
- got a rationale for why this platform was chosen
- had a brief description of the selected operating system
- got a rationale for why this OS was chosen
RARE validated design: [ SOHO #001 ] - key take-away
- RARE/FreeRouter is a strong candidate for SOHO, as a solution with multiple dataplane support.
If you are a company, you can run RARE/freeRouter on a versatile P4 switch such as the APS Networks® BF2556X-1T or WEDGE; as a SOHO with a small budget you can run it with a DPDK dataplane; and for older hardware you still have the possibility to run it with a pure software dataplane.
- RARE/freeRouter is the first element at the very edge of the seamless MPLS architecture
End-to-end MPLS is now possible for the service provider at an affordable price.
- RARE/freeRouter design can coexist with virtualisation technology
CPU extensions such as VT-x/AMD-V, VT-d/AMD-Vi and VT-c allow RARE/freeRouter to coexist with a small number of storage and compute nodes (such as micro-K8/docker).
In the next article we will start our journey in creating a carrier grade CPE using the platform above.
After having followed P4Lang P4 for dummies [ #002 ] article, you should have now a working P4 development environment.
Requirement
Overview
Let's start writing, compiling and running our first P4 program.
Article objective
This 3rd article proposes to write your first P4 program, based on the my_program.p4 specification from P4Lang P4 for dummies [ #001 ].
Diagram: my_program.p4
[ #003 ] - Cookbook
Verification
Conclusion
In this article you:
- wrote your first P4 program
- used p4c to compile it
- learned how to instantiate a virtual ethernet pair in order to bind it to simple_switch
- launched simple_switch and loaded your program on it
- set up a test environment using scapy
- and verified your program using a combination of scapy and tcpdump
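To give an idea of what such a test packet contains, here is a small standard-library Python sketch that builds an Ethernet frame and parses its header back. It is a stand-in for what scapy does for you, not the scapy code of the cookbook; the MAC addresses, EtherType and payload are arbitrary example values. Actually sending the frame on a veth interface is left out, as it requires root privileges.

```python
import struct


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into its 6-byte binary form."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_frame(dst: str, src: str, ethertype: int, payload: bytes) -> bytes:
    """Build an Ethernet II frame: dst MAC, src MAC, EtherType, payload."""
    header = mac_to_bytes(dst) + mac_to_bytes(src) + struct.pack("!H", ethertype)
    return header + payload


def parse_header(frame: bytes):
    """Return (dst, src, ethertype) decoded from the first 14 bytes."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return dst.hex(":"), src.hex(":"), ethertype


if __name__ == "__main__":
    # Example values only; any addresses would do for a loopback test.
    frame = build_frame("00:00:00:00:00:02", "00:00:00:00:00:01", 0x0800, b"hello")
    print(len(frame), parse_header(frame))
```

When you watch the egress side of the veth pair with tcpdump, these are the header fields you compare against what you sent.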
P4Lang P4 for dummies [ #003 ] - key take-away
- my_program.p4 is written following V1Model definition that defines:
- a parsing stage
- a checksum verification stage
- an ingress packet processing control stage
- an egress packet processing control stage
- a checksum computation stage
- a deparser stage
V1Switch(
    prs_main(),
    ctl_verify_checksum(),
    ctl_ingress(),
    ctl_egress(),
    ctl_compute_checksum(),
    ctl_deprs()
) main;
It is described by the diagram below:
In a subsequent article we will dissect my_program.p4, but as you could observe, P4 programming is quite intuitive as it is all about switching a packet based on intrinsic ingress packet header and metadata (like packet ingress port) value.
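The stage ordering listed above can be mimicked in a few lines of Python. This is a toy model for intuition only, not how BMv2 works internally: the stage names are taken from the V1Switch instantiation, while the `Packet` class, the `PORT_TABLE` entries and the no-op stages are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Packet:
    ingress_port: int
    headers: dict = field(default_factory=dict)
    egress_port: Optional[int] = None
    dropped: bool = False


# Hypothetical forwarding entries: ingress port -> egress port.
PORT_TABLE = {1: 2, 2: 1}


def prs_main(pkt: Packet) -> Packet:
    pkt.headers.setdefault("ethernet", {})  # placeholder for header parsing
    return pkt


def ctl_verify_checksum(pkt: Packet) -> Packet:
    return pkt  # nothing to verify in this toy model


def ctl_ingress(pkt: Packet) -> Packet:
    # Match on metadata (the ingress port), then act: forward or drop.
    out = PORT_TABLE.get(pkt.ingress_port)
    if out is None:
        pkt.dropped = True
    else:
        pkt.egress_port = out
    return pkt


def ctl_egress(pkt: Packet) -> Packet:
    return pkt


def ctl_compute_checksum(pkt: Packet) -> Packet:
    return pkt


def ctl_deprs(pkt: Packet) -> Packet:
    return pkt  # placeholder for header re-serialisation


PIPELINE = (prs_main, ctl_verify_checksum, ctl_ingress,
            ctl_egress, ctl_compute_checksum, ctl_deprs)


def process(pkt: Packet) -> Packet:
    """Run the packet through every stage in order, stopping on drop."""
    for stage in PIPELINE:
        pkt = stage(pkt)
        if pkt.dropped:
            break
    return pkt


if __name__ == "__main__":
    print(process(Packet(ingress_port=1)))
    print(process(Packet(ingress_port=9)))
```

The ingress stage is where the interesting decision happens: a table lookup keyed on packet header or metadata values, followed by an action.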
Requirement
Overview
In order to be able to start P4 programming, we will concretely start setting up a P4 development environment using Open Source P4Lang P4 community software.
Article objective
This article exposes how to install:
- P4Lang PI
- P4Lang BMv2
- P4Lang p4c
Operating system supported
- Debian 10 (stable aka buster)
- Ubuntu 18.04 (Bionic beaver)
- Ubuntu 20.04 (Focal fossa)
Note
You can of course use the distribution of your choice, as long as the operating system you are using has all the third-party dependencies required by P4Lang software, mainly:
- protobuf
- grpc
- thrift
- nanomsg
- nnpy
You can find the full list here in Launchpad.
Diagram:
[ #002 ] - Cookbook
Verification
Conclusion
In this article you learned how to set up a P4 development environment on:
- Debian 10
- Ubuntu 18.04
- Ubuntu 20.04
And tested the installation by compiling RARE P4 code.
P4Lang P4 for dummies [ #002 ] - key take-away
- P4Lang P4 development environment creation is easy
- it uses P4Lang packages on Debian and Ubuntu
- These packages are maintained by the RARE project and are built nightly from the official P4Lang GitHub repositories
In the next article we will:
- write our first P4 program: my_program.p4 as it is specified in P4Lang P4 for dummies [ #001 ]
- compile my_program.p4
- launch P4Lang virtual switch called simple_switch and load my_program.p4 on it
- perform basic verification
Requirement
Overview
P4 is a language for programming the data plane of network devices. From p4.org web site:
«P4 is a domain-specific programming language for specifying the behaviour of the dataplanes of network-forwarding elements. »
Article objective
This 1st article exposes:
- A brief introduction to the P4 language
- A basic P4 development workflow
- Some basic specificities of the P4 language
Note
This article is primarily a pure introduction to P4Lang P4. It is in no way an extensive description of the programming language, nor a P4 compilation guide.
Diagram: P4 development workflow
[ #001 ] - Cookbook: P4 development workflow
Router for Academia Research & Education (RARE) & P4
Conclusion
In this article you:
- had a brief introduction to the P4Lang P4 language
- were presented with a 10,000-foot view of the P4 development workflow
- were shown a list of P4 targets and the use cases enabled by these targets
P4Lang P4 for dummies [ #001 ] - key take-away
THE exciting INNOVATION provided by P4 boils down to this community language, which unlocks and opens the door to a system's dataplane for you. Until now, dataplane programming was reserved to commercial vendors. Some of these dataplanes, like the well-known CEF (Cisco Express Forwarding), are specific to Cisco equipment. Juniper has its own dataplane (not sure about the name), implemented by the Forwarding Plane component (see for example the vMX architecture).
P4 language inherent characteristics:
- Behavioral programming language
- Language with constraints
- Limited number of variable types
- With fixed size
- P4 is not a general purpose language: you cannot write arbitrary software with it, as you can with C, C++ or Java
It is therefore a simple language, one that network managers can tame more easily than pure software developers. Indeed, writing a P4 program is all about defining the behavior of a network packet processing algorithm based on intrinsic variables encoded in the packet header.
Requirement
Overview
BGP is THE protocol of the Internet: it is used to exchange routing information between BGP systems across Internet domains. It comes in two flavours:
External BGP (eBGP): Network Layer Reachability Information (NLRI) is exchanged between network domains called Autonomous Systems, which are usually administratively independent. This is BGP inter-domain routing. As an example, let us assume a BGP speaker from AS2200 (RENATER) advertising NLRI to AS20965 (GÉANT R&E). From that point, AS20965 knows how to reach any network advertised by AS2200 based on the NLRI.
Internal BGP (iBGP): NLRI is propagated between BGP speakers inside the same domain. This is BGP intra-domain routing. As an example, assume an AS2200 border router in Paris is connected to the GÉANT network and gets NLRI from AS20965. It will then propagate this information internally, advertising GÉANT NLRI via iBGP sessions to the other BGP speakers inside the AS2200 network domain.
iBGP requires a full mesh between all BGP speakers inside a domain because of its loop-avoidance rule: a route learned from one iBGP peer is not re-advertised to another iBGP peer. A domain with n speakers therefore needs n*(n-1)/2 sessions. BGP route reflection is a proposal that removes the full mesh requirement. A BGP edge router now has only 1 BGP session, toward the RR, thus reducing the workload on network equipment.
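To put numbers on this, the short Python sketch below compares the session count of a full mesh with that of a route reflector design. The function names are ours, and the second formula assumes that every client peers with every RR and that the RRs peer with each other; other RR topologies would give different counts.

```python
def full_mesh_sessions(n: int) -> int:
    """Number of iBGP sessions in a full mesh of n speakers: n*(n-1)/2."""
    return n * (n - 1) // 2


def rr_sessions(clients: int, reflectors: int = 1) -> int:
    """Sessions when each client peers with each RR and RRs peer together.

    This peering layout is an assumption made for the example.
    """
    return clients * reflectors + full_mesh_sessions(reflectors)


if __name__ == "__main__":
    # Example sizes: 8 client routers plus 2 route reflectors.
    print("full mesh of 10 speakers:", full_mesh_sessions(10))
    print("8 clients with 2 RRs:", rr_sessions(8, 2))
```

The full mesh grows quadratically with the number of speakers, while the RR design grows linearly with the number of clients.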
Article objective
In this article we will describe how to build a carrier-grade route reflector cluster composed of RR1 and RR2, in order to reach the 99.999% availability expected from a telecom Internet service provider.
Let's consider the network architecture of the fictitious service provider below: route reflectors RR1 and RR2 are dual-homed to core P routers.
Diagram
[ #001 ] - Cookbook
Verification
Conclusion
In this article you:
- had a brief introduction to the BGP protocol and the rationale for BGP route reflectors
- learned the design considerations related to BGP RR setup
- got a typical BGP configuration example with a long list of AFI/SAFI enabled
- This configuration is not exhaustive as for example BGP add-path is available but not configured
- verified BGP RR operation
RARE validated design: [ BGP RR #001 ] - key take-away
- The BGP Route Reflector use case does not require a commercial vendor router; it can be handled perfectly by a software solution running on a server with enough RAM.
The example above is an example of a high-availability Route Reflector that is able to handle BGP signalling for a carrier service provider, for all address families.
- Redundant BGP route reflection is ensured by deploying 2 RRs (at minimum) belonging to the same BGP RR cluster
In addition to having several RRs for the whole domain, it is also common to see hierarchical RR designs. Some service providers deploy dedicated RRs for specific address families (L3VPN unicast for example).
- RRs in the same cluster run a basic iBGP session
These RRs also share the same cluster ID, so that a route already reflected within the cluster is recognised and discarded instead of looping between the RRs.
- RRs should not be in the traffic datapath
This is the reason why we set a high cost (4444 and 6666 for IPv4 and IPv6 respectively) in both directions on the RR interconnection ports.
- RR design for a multi-service backbone
In the example, the RR clients are running only IPv4/IPv6, but the RR design above can empower a service provider backbone with additional services running on top of MPLS: L3VPN, 6VPE, VPLS, EVPN, etc.
- In the next article we will dissect the rr1 configuration
This will demonstrate some nice features offered by freeRouter, such as BGP templates and next-hop tracking, among other features not mentioned here (like BGP add-path).
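The effect of the high cost on the RR interconnection ports can be checked with a small shortest-path computation. The Python sketch below uses a made-up topology (three P routers and one RR) and a plain Dijkstra search; only the 4444 value comes from the design above, while the core link cost of 10 and the router names are assumptions.

```python
import heapq


def link(graph: dict, a: str, b: str, metric: int) -> None:
    """Add a bidirectional link with the same cost in both directions."""
    graph.setdefault(a, {})[b] = metric
    graph.setdefault(b, {})[a] = metric


def shortest_path(graph: dict, src: str, dst: str):
    """Dijkstra search; returns (cost, path) or (inf, []) if unreachable."""
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for neigh, metric in graph.get(node, {}).items():
            if neigh not in seen:
                heapq.heappush(queue, (cost + metric, neigh, path + [neigh]))
    return float("inf"), []


RR_METRIC = 4444  # IPv4 cost set on the RR interconnection ports

# Hypothetical topology: a triangle of P routers, rr1 dual-homed to p1 and p2.
graph = {}
link(graph, "p1", "p2", 10)
link(graph, "p1", "p3", 10)
link(graph, "p3", "p2", 10)
link(graph, "rr1", "p1", RR_METRIC)
link(graph, "rr1", "p2", RR_METRIC)

if __name__ == "__main__":
    print(shortest_path(graph, "p1", "p2"))
```

Even if the direct p1-p2 link fails, the detour via p3 is still far cheaper than the 8888 it would cost to transit rr1, so the RR only carries signalling.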
RR design test
You can test this design above in order to check RR and backbone router signalling.
- Set up the freeRouter environment as described above
- Get RARE code
git clone https://github.com/frederic-loui/RARE.git
cd RARE/00-unit-labs/0101-rare-validated-design-bgp/
make
c1: telnet localhost 10001
c2: telnet localhost 10002
c3: telnet localhost 10003
c4: telnet localhost 10004
c5: telnet localhost 10005
c6: telnet localhost 10006
c7: telnet localhost 10007
c8: telnet localhost 10008
rr1: telnet localhost 10010
rr2: telnet localhost 10011
cd RARE/00-unit-labs/0101-rare-validated-design-bgp/
make clean
In article #005 you learned how RARE/freeRouter is controlling a P4Emu/pcap dataplane. We also demonstrated that this setup could be integrated into real networks.
Requirement
Overview
Though P4Emu/pcap can be used for SOHO and can handle n×1GE of traffic, this comes at the cost of a high CPU load and thus higher power consumption.
"Why write yet another software dataplane as freeRouter has already a working native software dataplane ?"
The partial answer to the question raised in the previous article was:
"decoupling control plane from the dataplane"
We learned that P4Emu:
- is able to understand the VERY same control messages from freeRouter as a P4 dataplane does
- is able to switch packets, emulating router.p4, using libpcap as its packet forwarding backend.
However, even though libpcap is a performant packet processing library, the kernel is still heavily solicited, and the higher the traffic rate, the higher the CPU workload.
Article objective
In this article we will use the freeRouter setup deployed in #005 and replace the P4Emu/pcap dataplane with a P4Emu/dpdk dataplane.
Source Wikipedia: https://en.wikipedia.org/wiki/Data_Plane_Development_Kit
The Data Plane Development Kit (DPDK) is an Open source software project managed by the Linux Foundation. It provides a set of data plane libraries and network interface controller polling-mode drivers for offloading TCP packet processing from the operating system kernel to processes running in user space. This offloading achieves higher computing efficiency and higher packet throughput than is possible using the interrupt-driven processing provided in the kernel.
It is important to note that, despite what its name suggests, P4Emu/dpdk is not emulating V1Model. P4Emu emulates the router.p4 packet processing logic and uses a packet forwarding library to transmit packets from a specific ingress port to the right egress port, as defined by freeRouter control plane messages. In this precise case, however, packet processing is offloaded from the kernel to user space. As a consequence, with a DPDK-compatible NIC and driver, it can reach tremendous traffic rates. DPDK is not available on all hardware; please refer to the DPDK HCL.
Diagram
[ #006 ] - Cookbook
Verification
Conclusion
In this article you:
- had a demonstration of how to integrate freeRouter into a local area network (Similar to article #002)
- However instead of using P4Emu/pcap we used a P4Emu/dpdk dataplane
- communication between freeRouter control plane and P4Emu/dpdk is ensured by pcapInt via veth pair [ veth0a - veth0b ]
- In this example the freeRouter with P4Emu/dpdk has only 1 dataplane interface that is bound to enp0s3 VM interface exposed to the local network as a bridged interface
[ #006 ] RARE/FreeRouter-101 - key take-away
- FreeRouter uses UNIX sockets to forward packets dedicated to control plane and dataplane communication.
This essential paradigm is used to ensure communication between freeRouter and the P4Emu/dpdk dataplane. It is ensured by the pcapInt binary from freeRouter net-tools, which binds a freeRouter socket (veth0a@localhost:22001) to a virtual network interface (veth0b@localhost:22002) connected to CPU_PORT 1.
- freeRouter is the control plane for the P4Emu/dpdk dataplane
freeRouter does all the control plane route computation and sends write/modify/remove messages; P4 entries are created/modified/removed accordingly in the P4Emu/dpdk tables. Although the name is P4Emu, it does not emulate BMv2 V1Model.p4, but rather router.p4.
- dpdk port_id allocation
dpdk port_id allocation follows the pci_id port naming order, starting from id 0. p4dpdk.bin is invoked with the CPU_PORT parameter set to: (number_of_dpdk_port - 1) + 1
- In this setup the combination of freeRouter/P4Emu/dpdk delivers a solution for small campus networks with 10GE links (100GE links to be validated)
DPDK removes the kernel intervention for each packet processed. In this configuration packet processing is offloaded to user space, reducing kernel intervention to ~0%. Congratulations, you have a hardware NIC-assisted forwarding system!
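The port_id rule above is easy to get wrong by one, so here it is spelled out as a tiny Python sketch. The function names are ours, and the reading of the formula (dpdk ports numbered from 0, CPU_PORT being the next free id) is our interpretation of the rule as stated in this article, not something taken from the p4dpdk.bin source.

```python
def dpdk_port_ids(number_of_dpdk_port: int) -> list:
    """dpdk port ids are allocated from 0 to number_of_dpdk_port - 1."""
    if number_of_dpdk_port < 1:
        raise ValueError("at least one dpdk port is required")
    return list(range(number_of_dpdk_port))


def cpu_port(number_of_dpdk_port: int) -> int:
    """CPU_PORT is the id right after the last dpdk port id."""
    last_dpdk_port_id = dpdk_port_ids(number_of_dpdk_port)[-1]
    return last_dpdk_port_id + 1


if __name__ == "__main__":
    for n in (1, 4):
        print(f"{n} dpdk port(s): ids {dpdk_port_ids(n)}, CPU_PORT {cpu_port(n)}")
```

With a single dataplane interface this yields CPU_PORT 1, which matches the value used in this setup.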
In a subsequent article we will see how this setup behaves with a DELL 640 server powered by 2 x Intel(R) Xeon(R) Gold 6138 CPUs and equipped with a Mellanox ConnectX-5 EX Dual Port 100GbE QSFP28 PCIe Adapter Low Profile card. We will also see how to connect this server to a P4 switch, the BF2556X-1T. So stay tuned!
In articles #003 and #004 you learned how RARE/freeRouter controls a P4 dataplane (BMv2 or the TOFINO virtual model). We also demonstrated that this setup could be integrated into real networks. However, these P4 dataplanes are not suitable for real day-to-day operation, as they have inherent software limitations. While the freeRouter native software dataplane has the advantage of offering the entire feature set and is sufficient to handle a home network traffic load, we investigated a way to improve dataplane performance. In that context we studied:
- VMWare P4 XDP project
- ELTE T4P4S project
Requirement
Overview
However, the XDP model was not complete enough to compile router.p4, and we could not generate the corresponding kernel bypass code with ELTE T4P4S based on BMv2 V1Model.p4 (a GitHub issue is still pending). In that context, Csaba, the freeRouter lead developer, decided to develop P4Emu, a software dataplane with the particularity to:
- understand freeRouter control plane messages meant to be addressed to a P4 dataplane
- thus keep the control plane decoupled from the dataplane, as was the case with BMv2 and BF_SWITCHD
One might ask: why write yet another software dataplane when freeRouter already has a working native software dataplane? This is a very good and valid question. The answer boils down to:
"decoupling control plane from the dataplane"
We will see in a subsequent article how P4Emu unlocks new valid use cases.
Article objective
In this article we will use the freeRouter setup deployed in #004 and replace bf_switchd, which provides freeRouter with an INTEL/BAREFOOT TOFINO dataplane, with P4Emu/pcap.
It is important to note that, despite its name, P4Emu/pcap is not emulating V1Model. P4Emu emulates the router.p4 packet processing logic and uses a packet forwarding library to transmit packets from a specific ingress port to the right egress port, as defined by freeRouter control plane messages.
Diagram
[ #005 ] - Cookbook
Verification
Conclusion
In this article you:
- had a demonstration of how to integrate freeRouter into a local area network (Similar to article #002)
- However instead of using bmv2 or TOFINO we used a P4Emu/pcap dataplane
- communication between freeRouter control plane and P4Emu/pcap is ensured by pcapInt via veth pair [ veth250 - veth251 ]
- In this example the freeRouter with P4Emu/pcap has only 1 dataplane interface that is bound to enp0s3 VM interface exposed to the local network as a bridged interface
[ #005 ] RARE/FreeRouter-101 - key take-away
- FreeRouter uses UNIX sockets to forward packets dedicated to control plane and dataplane communication.
This essential paradigm is used to ensure communication between freeRouter and the P4Emu/pcap dataplane. It is ensured by the pcapInt binary from freeRouter net-tools, which binds a freeRouter socket (veth251@localhost:22710) to a virtual network interface (veth250@localhost:22709) connected to CPU_PORT 0.
- freeRouter is the control plane for the P4Emu/pcap dataplane
freeRouter does all the control plane route computation and sends write/modify/remove messages; P4 entries are created/modified/removed accordingly in the P4Emu/pcap tables. Although the name is P4Emu, it does not emulate BMv2 V1Model.p4, but rather router.p4.
- In this setup the combination of freeRouter/pcap delivers a solution for SOHO networks with 1GE links
However, a 1GE traffic rate requires 50% of one CPU thread. Nevertheless, the traffic rate achieved with P4Emu/pcap is higher than with the freeRouter native software packet forwarding.
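The socket pairing mentioned in the first take-away can be pictured with two local sockets exchanging an opaque frame. The Python sketch below is only an illustration of that idea: it assumes UDP datagrams over the loopback address (the article only gives host:port pairs such as localhost:22709 and localhost:22710), it uses ephemeral ports by default to avoid clashing with a running router, and the function names are ours.

```python
import socket


def make_endpoint(port: int = 0) -> socket.socket:
    """Open a UDP socket on loopback; port 0 lets the OS pick a free port.

    In the setup above the two sides would use ports 22709 and 22710.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", port))
    sock.settimeout(2.0)
    return sock


def exchange(frame: bytes):
    """Send one frame from endpoint A to endpoint B.

    Returns (received_bytes, sender_matches), where sender_matches tells
    whether the datagram really came from endpoint A.
    """
    side_a = make_endpoint()
    side_b = make_endpoint()
    try:
        side_a.sendto(frame, side_b.getsockname())
        data, sender = side_b.recvfrom(65535)
        return data, sender == side_a.getsockname()
    finally:
        side_a.close()
        side_b.close()


if __name__ == "__main__":
    # 14 zero bytes stand in for an Ethernet header, followed by a payload.
    print(exchange(b"\x00" * 14 + b"payload"))
```

Seen this way, pcapInt is simply the piece that sits on one of these sockets and copies frames to and from a real or virtual network interface.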
In a subsequent article we will see how to reduce the CPU cost implied by P4Emu/pcap.
In the previous article #003 "Are you P4 compliant ?" we exposed a setup where RARE/freeRouter was controlling BMv2 P4 dataplane called simple_switch_grpc. In this article we replace the open source BMv2 target by a commercial virtual target provided by INTEL/BAREFOOT. As a side note, we will show that this setup can be integrated with real networks. (with inherent software limitations)
Requirement
|
Overview
I'm repeating the core message from #003: for those who are not familiar with data plane programming and especially with P4, "P4 is a domain-specific programming language for specifying the behaviour of the dataplanes of network-forwarding elements." (from p4.org). In short, it helps you write a "program specifying how a switch processes packets".
Article objective
In this article we'll use the freeRouter setup deployed in #003 and replace bmv2/simple_switch_grpc, which provided freeRouter with a P4Lang dataplane, by INTEL BAREFOOT/bf_switchd. More precisely, the effective dataplane is provided by the INTEL/BAREFOOT virtual bf_switchd model running the RARE P4 program called bf_router.p4.
Diagram
[ #004 ] - Cookbook
Verification
Conclusion
In this article you:
- had a demonstration of how to integrate freeRouter into a local area network (Similar to article #002)
- However, instead of using BMv2, we used an INTEL/BAREFOOT P4 dataplane called: TOFINO (bf_switchd)
- The TOFINO bf_switchd target is running RARE bf_router.p4
- communication between the freeRouter control plane and TOFINO is ensured by pcapInt via a veth pair [ veth250 - veth251 ]
- This communication is possible via RARE bf_forwarder.py, based on the GRPC P4Lang BfRuntime python binding
- In this example the TOFINO bf_switchd P4 virtual switch model has only 1 dataplane interface, bound to the enp0s3 VM interface, which is exposed to the local network as a bridged interface
[ #004 ] RARE/FreeRouter-101 - key take-away
- FreeRouter uses UNIX sockets to forward the packets dedicated to control plane + dataplane communication.
This essential paradigm ensures communication between freeRouter and the TOFINO bf_switchd P4 dataplane. The pcapInt binary from freeRouter net-tools binds the freeRouter socket (veth251@localhost:22710) to a virtual network interface (veth250@localhost:22709) connected to CPU_PORT 64.
- freeRouter control plane and dataplane communication is enabled by RARE bf_forwarder.py
bf_forwarder.py is a simple python script based on the GRPC client BfRuntime python library.
- freeRouter is the control plane for the TOFINO bf_switchd P4 dataplane
freeRouter does all the control plane route computation and sends write/modify/remove messages via BfRuntime so that P4 entries are created/modified/removed accordingly in the P4 tables
- TOFINO bf_switchd virtual model target
While the TOFINO bf_switchd virtual model target is a very good choice for validating packet processing algorithms on the TOFINO platform, the virtual model is not a target for production use. We will see in the next articles how we can reach the TREMENDOUS traffic throughput required by Internet Service Providers' use cases. Indeed, while the model lets us validate algorithm accuracy, the traffic transfers achieved have a very low throughput. (I could barely get the setup described above to work)
- TOFINO bf_switchd hardware target
In a subsequent article we will demonstrate how, with the RARE/freeRouter/TOFINO TNA architecture, we can create a service provider/carrier grade router that is technically able to switch 3.3 Tbps of traffic (line rate) using an EdgeCore WEDGE100BF32X hardware switch.
The most powerful programmable switching ASIC of the TOFINO family can switch 6.5 Tbps of traffic; our WEDGE100BF32X switches are powered by this ASIC's little brother, which handles 3.3 Tbps of traffic at line rate.
"Are you P4 compliant ?". In France in the 1990s, before military service was officially abolished, this was a purely French inside joke. At that time, being "classé P4" meant that you were declared mentally unfit to join the French army, even if you wanted to. Therefore, at the age of 18, some daring people faked mental illness in order to avoid the "Service militaire" (1 year duration). Of course, here P4 is about the data plane programming language from the P4Lang project.
Requirement
|
Overview
For those who are not familiar with data plane programming and especially with P4, "P4 is a domain-specific programming language for specifying the behaviour of the dataplanes of network-forwarding elements." (from p4.org). In short, it helps you write a "program specifying how a switch processes packets".
Article objective
In this article we'll use the freeRouter setup deployed in #002 and replace pcapInt, which provided the freeRouter native software dataplane, with a P4Lang dataplane. More precisely, the effective dataplane is provided by the P4lang virtual simple_switch_grpc running the RARE P4 program called router.p4.
Diagram
[ #003 ] - Cookbook
Verification
Conclusion
In this article you:
- had a demonstration of how to integrate freeRouter into a local area network (Similar to article #002)
- However, instead of using pcapInt, you are now using a software P4 dataplane from the P4lang project: BMv2
- The BMv2 simple_switch_grpc target is used and runs RARE router.p4
- communication between the freeRouter control plane and BMv2 is ensured by pcapInt via a veth pair [ veth250 - veth251 ]
- This communication is possible via RARE forwarder.py, based on the GRPC P4Lang P4Runtime python binding
- In this example the BMv2 P4 switch has only 1 dataplane interface, bound to the enp0s9 VM interface, which is exposed to the local network as a bridged interface
[ #003 ] RARE/FreeRouter-101 - key take-away
- FreeRouter uses UNIX sockets to forward the packets dedicated to control plane + dataplane communication.
This essential paradigm ensures communication between freeRouter and the BMv2 P4 dataplane. The pcapInt binary from freeRouter net-tools binds the freeRouter socket (veth251@localhost:22710) to a virtual network interface (veth250@localhost:22709) connected to CPU_PORT 64.
- freeRouter control plane and dataplane communication is enabled by RARE forwarder.py
forwarder.py is a simple python script based on the GRPC P4Runtime python library.
- freeRouter is the control plane for the BMv2 P4 dataplane
freeRouter does all the control plane route computation and sends write/modify/remove messages via P4Runtime so that P4 entries are created/modified/removed accordingly in the P4 tables
- BMv2 target
While the BMv2 target is a very good choice for packet processing algorithm validation, it is not an ideal target for production use. We will see in the next articles how we can reach the higher throughput required by the use cases defined by network operators.
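To make the "write/modify/remove message" idea more concrete, here is a minimal sketch of a forwarder loop that turns text messages into table entries. The message format ("<table>_<operation> <key> <parameters>"), the table name and the function name are purely hypothetical illustrations; they are not the actual freeRouter message API, and a plain dictionary stands in for the P4Runtime calls that forwarder.py would make.

```python
# Hypothetical message format: "<table>_<operation> <key> [parameters...]".
# A dict of dicts stands in for the dataplane tables; a real forwarder would
# issue P4Runtime write requests instead of updating a dictionary.


def apply_message(tables, message):
    """Apply one control plane message to the (simulated) dataplane tables."""
    command, key, *params = message.split()
    table_name, operation = command.rsplit("_", 1)
    table = tables.setdefault(table_name, {})
    if operation in ("add", "mod"):
        table[key] = params          # create or overwrite the entry
    elif operation == "del":
        table.pop(key, None)         # removing a missing entry is a no-op
    else:
        raise ValueError(f"unknown operation: {operation}")
    return tables


tables = {}
for line in [
    "route4_add 10.0.0.0/8 1",
    "route4_add 192.168.0.0/24 2",
    "route4_mod 10.0.0.0/8 3",
    "route4_del 192.168.0.0/24",
]:
    apply_message(tables, line)

print(tables)
```

The point of such a thin message layer is that the control plane does not need to know which runtime API the dataplane speaks: only the forwarder does.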
In article #001 of the 101 series we learnt how to spawn 2 router instances on the same VM, but this use case is only useful for learning/pedagogic purposes. freeRouter can be considered a networking Swiss Army Knife in real networks. We will further demonstrate freeRouter's capability to take control of a full VM and then communicate directly with the external real world via the VM network device interface, i.e. outside the VM scope.
Requirement
|
Overview
Working with freeRouter inside a VM is interesting, but interacting with the outside world is way more exciting !
Article objective
In this article we'll explain how to integrate freeRouter into an existing local area network (my home network) and how to obtain addresses from IPv4 DHCP and IPv6 SLAAC. Though this simple example is consumer/end user oriented, freeRouter can be incorporated into an Internet Service Provider environment. You can easily imagine how to build a highly scalable and versatile BGP route reflector, a sophisticated route server, a ROA/RPKI validator or even a BGP BMP server ... (and the list of features is huge). For example, in one of my projects I have been using freeRouter since 2015 as a BGP route reflector inside a k8s cluster running the calico network plugin.
Diagram
[ #002 ] - Cookbook
Verification
Conclusion
In this article you:
- had a demonstration of how to integrate freeRouter into a local area network
- learned how to configure an interface in order to act as an IPv4 DHCP client
- learned how to configure an interface using IPv6 SLAAC
[ #002 ] RARE/FreeRouter-101 - key take-away
- FreeRouter uses UNIX sockets to forward packets.
You can use the pcapInt binary from freeRouter net-tools, which binds the freeRouter socket (localhost:26011) to a physical network interface (localhost:26021@enp0s9)
- freeRouter is a Swiss Army Knife
It supports a huge list of features with IPv4/IPv6 parity. In this example we demonstrated how an interface can inherit IPv4/IPv6 addresses from an IPv4 DHCP server or IPv6 SLAAC
- freeRouter can interact with the real network (in various flavors; we will develop this in further articles)
It can be used as a BGP route reflector in an Internet Service Provider environment, as a ROA/RPKI validator, BMP server, BGP looking glass, route server etc.
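For readers who wonder what an IPv6 SLAAC address looks like, here is a small sketch of one classic way to derive it: the advertised /64 prefix is combined with a modified EUI-64 interface identifier built from the MAC address. This is only an illustration of the general mechanism; the article does not state which interface identifier scheme freeRouter uses, and the function name, prefix and MAC address below are made up for the example.

```python
import ipaddress


def slaac_eui64_address(prefix, mac):
    """Combine a /64 prefix with a modified EUI-64 interface identifier."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("EUI-64 based SLAAC expects a /64 prefix")
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit of the first octet
    # Insert ff:fe between the two halves of the MAC address.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    # Indexing the network adds the interface identifier to the prefix.
    return network[int.from_bytes(interface_id, "big")]


# Documentation prefix and an arbitrary example MAC address.
print(slaac_eui64_address("2001:db8::/64", "00:11:22:33:44:55"))
```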
The main objective of the [RARE / FreeRouter 101] series is to help you get started with FreeRouter from scratch, without any prior knowledge.
Requirement
|
Overview
freeRouter is free, open source router control plane software. For the nostalgic and for networkers from the prehistoric era (like me), freeRouter is able to handle HDLC, X25, frame-relay and ATM encapsulation besides Ethernet. Since it handles packets itself at the socket layer, it is independent of the underlying Operating System capabilities. We will see in the next articles how freeRouter subtly leverages this inherent independence to connect different data-planes such as OpenFlow, P4 and other possible data-planes that may appear in the near future.
The command line tries to mimic the industry standards with one exception:
- no global routing table: every routed interface must be in a virtual routing table
- positive side effect: there are no vrf-awareness questions
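The "no global routing table" rule can be pictured with a small data structure sketch: every interface belongs to exactly one VRF, and every lookup happens in the table of that VRF. The class and method names below are our own illustration and are not taken from the freeRouter code; the addresses come from documentation ranges.

```python
import ipaddress


class Router:
    """Toy model: there is no global table, only per-VRF tables."""

    def __init__(self):
        self.vrfs = {}        # VRF name -> list of (network, next hop)
        self.interfaces = {}  # interface name -> VRF name

    def add_vrf(self, name):
        self.vrfs.setdefault(name, [])

    def bind_interface(self, interface, vrf):
        if vrf not in self.vrfs:
            raise KeyError(f"unknown VRF: {vrf}")
        self.interfaces[interface] = vrf

    def add_route(self, vrf, prefix, next_hop):
        self.vrfs[vrf].append((ipaddress.ip_network(prefix), next_hop))

    def lookup(self, interface, destination):
        """Longest-prefix match in the VRF of the ingress interface."""
        routes = self.vrfs[self.interfaces[interface]]
        address = ipaddress.ip_address(destination)
        matches = [(net, hop) for net, hop in routes if address in net]
        if not matches:
            return None
        return max(matches, key=lambda match: match[0].prefixlen)[1]


router = Router()
router.add_vrf("v1")
router.add_vrf("v2")
router.bind_interface("eth1", "v1")
router.bind_interface("eth2", "v2")
# The same prefix can live in both VRFs without any conflict.
router.add_route("v1", "10.0.0.0/24", "192.0.2.1")
router.add_route("v2", "10.0.0.0/24", "198.51.100.1")

print(router.lookup("eth1", "10.0.0.5"))
print(router.lookup("eth2", "10.0.0.5"))
```

Because the ingress interface always determines the table, there is never a question of whether a feature is "VRF aware": there is simply no other place to look.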
Article objective
This article simply deploys 2 instances of freeRouter on the same freshly installed Linux box. We are deliberately using the "raw" official freeRouter repository (freerouter.nop.hu) in order to get familiar with the manual deployment process. Even if the deployment process is straightforward, it is not self-explanatory for people unfamiliar with Java/Linux.
In order to simplify the deployment we have automated freeRouter daily builds on:
- launchpad packages for ubuntu 18.04 and 20.04
- debian 10 (aka buster) package on OpenSuse Build System
- container on Docker Hub
But let's get our "hands dirty" and follow the simple manual installation.
Diagram
[ #001 ] - Cookbook
Verification
Conclusion
In this article you:
- had a brief introduction to freeRouter, the networking Swiss Army Knife
- learned how to deploy 2 instances of freeRouter and interconnect them via 2 UNIX sockets on a VM guest running on VirtualBox
- this setup is ideal for network simulations encompassing hundreds of nodes, self-contained in the same VM without interaction with the external world (protocol experimentation, convergence tests etc.)
[ #001 ] RARE/FreeRouter-101 - key take-away
- FreeRouter uses UNIX sockets to forward packets.
This is a key feature that will be leveraged to connect the freeRouter control plane to any type of data-plane
- In FreeRouter everything is in a VRF (so there is no global VRF)
This design choice has very positive consequences, such as: no VRF awareness questions, and the ability to have multiple BGP processes for the same freeRouter instance (each bound to a different VRF)
- freeRouter is dual stack
The whole feature set is IPv4 and IPv6 compliant. So there is no compromise !
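To illustrate the "interconnect via 2 sockets" idea, here is a minimal sketch in which two endpoints on localhost exchange an Ethernet frame as a datagram. It assumes that the sockets behave like UDP sockets on localhost, as the host:port notation used in this series suggests; the helper name and the frame content are ours, and ephemeral ports are used instead of fixed ones so that the sketch runs on any machine.

```python
import socket


def make_linked_pair():
    """Create two UDP sockets on localhost that point at each other."""
    left = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    right = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    left.bind(("127.0.0.1", 0))    # port 0: let the OS pick a free port
    right.bind(("127.0.0.1", 0))
    left.connect(right.getsockname())
    right.connect(left.getsockname())
    left.settimeout(2.0)
    right.settimeout(2.0)
    return left, right


# A made-up Ethernet frame: destination MAC, source MAC, EtherType, payload.
frame = (
    bytes.fromhex("ffffffffffff")
    + bytes.fromhex("000011110001")
    + b"\x08\x00"
    + b"hello from router 1"
)

router1_side, router2_side = make_linked_pair()
try:
    router1_side.send(frame)             # "router 1" puts the frame on the wire
    received = router2_side.recv(2048)   # "router 2" picks it up unchanged
finally:
    router1_side.close()
    router2_side.close()

print(received == frame)
```

Whatever sits at the other end of the socket (another router instance, pcapInt, or a dataplane) only needs to send and receive frames in the same way, which is what makes the control plane independent of the dataplane.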
Hi Csaba, Thanks for being with us today