It seems we have the occasional flaky test, and also problems with pip
install not always doing the right thing.
To help drive those out, I was going to go ahead and set up Travis (I found
there's a .travis.yml in the repo already, but I'm not sure how well it's
working). This also seems like a good opportunity for me to fix up
requirements.txt. Any objection to me going ahead and sorting that out?
While paying down tech debt I have been working on test quality (Gauge is
still missing a lot of testing, so I'll work on that also).
I added some checks to make sure that, after every test, FAUCET's exception
log is empty and there are no OFPErrors.
It looks like group table support causes some OFPErrors (though they
apparently don't affect functionality). Nonetheless we should never cause
such switch errors. I'll look into why.
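
A minimal sketch of what such a post-test check could look like, not the
actual FAUCET test code. The log file names and the assumption that OpenFlow
errors appear in the controller log as lines containing "OFPErrorMsg" are
hypothetical and would need adjusting to the real test harness.

```python
import os
import re
import tempfile
import unittest

# Assumption: OpenFlow errors show up in the controller log as lines
# containing "OFPErrorMsg". Adjust the pattern to the real log format.
OFP_ERROR_RE = re.compile(r"OFPErrorMsg")


def log_problems(exception_log, controller_log):
    """Return a list of human-readable problems found in the two logs."""
    problems = []
    # Any content at all in the exception log counts as a failure.
    if os.path.exists(exception_log) and os.path.getsize(exception_log) > 0:
        problems.append("exception log %s is not empty" % exception_log)
    # Each controller log line that looks like an OpenFlow error is reported.
    if os.path.exists(controller_log):
        with open(controller_log) as log:
            for line in log:
                if OFP_ERROR_RE.search(line):
                    problems.append("OpenFlow error: %s" % line.strip())
    return problems


class LogCheckedTestCase(unittest.TestCase):
    """Base class that fails any test that leaves errors in the logs."""

    def setUp(self):
        # Hypothetical per-test log locations.
        self.tmpdir = tempfile.mkdtemp()
        self.exception_log = os.path.join(self.tmpdir, "exception.log")
        self.controller_log = os.path.join(self.tmpdir, "controller.log")

    def tearDown(self):
        self.assertEqual(
            [], log_problems(self.exception_log, self.controller_log))
```

Doing the check in tearDown means every test inheriting from the base class
gets it without any per-test changes.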
I will also add a feature to the tests to automatically tcpdump the OF
channel. Apart from making busted tests easier to diagnose, I want to add
automated reporting on OF channel traffic (e.g. worst case peak, etc). This
is good development instrumentation, but I think it is also good
instrumentation for research-minded people looking at OF performance etc.
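
As a rough sketch of the capture and reporting idea: the helper below only
builds a tcpdump command line (the interface name is a placeholder, and port
6653 is an assumption about where the controller listens), and the report
function computes a worst case peak over one-second buckets from
(timestamp, size) pairs. How those pairs are extracted from the capture is
left open here.

```python
from collections import Counter


def tcpdump_command(interface, pcap_path, of_port=6653):
    """Build a tcpdump command that writes OF channel traffic to a file.

    Assumption: the OF channel is TCP on of_port (6653 is only a guess
    at the deployment's setting).
    """
    return ["tcpdump", "-i", interface, "-w", pcap_path,
            "tcp", "port", str(of_port)]


def peak_per_second(packets):
    """Return (peak packets/sec, peak bytes/sec) over 1-second buckets.

    packets is an iterable of (timestamp_seconds, size_bytes) pairs.
    The two peaks may come from different seconds.
    """
    pkt_counts = Counter()
    byte_counts = Counter()
    for timestamp, size in packets:
        bucket = int(timestamp)  # whole second the packet falls in
        pkt_counts[bucket] += 1
        byte_counts[bucket] += size
    if not pkt_counts:
        return (0, 0)
    return (max(pkt_counts.values()), max(byte_counts.values()))
```

Fixed one-second buckets are the simplest choice; a sliding window would
catch bursts that straddle a bucket boundary, at the cost of a little more
bookkeeping.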
I just submitted some big PRs to pay down a bunch of technical debt in IP
routing (including simplifying group table support and fixing the minor bug
of always attempting to delete all group tables when group tables are not
I'm now depending on both IPv4 and IPv6 routing at home (so far static, not
BGP yet), so am motivated not to break anything!
You probably want to rebase at this point if you're working with routing,
though there will be quite a lot of further changes for routing between
VLANs and then multiple FIBs.
My friend and colleague Marc designed and built the OpenFlow-based IX,
TouSIX, in Toulouse, France. You can read a bit more about TouSIX here (
shows the topology as well). Marc is negotiating deployment at an IX in
Japan (where he now lives). There is an opportunity to use FAUCET instead.
TouSIX uses a forwarding algorithm called Umbrella. The point of Umbrella
is to turn broadcast into unicast and not require a controller to respond (in
other words, relatively static flows are pushed that cause traffic such as
ARP requests to be directly forwarded to the port where the target is known
to be). This way you can't have a broadcast storm because there are no
Marc has sent us a couple of dumps of the flows on each of the 3 switches
in the exchange. Both flows and groups are used. Taking a look at the
flows, I believe all of them can be accomplished as FAUCET ACLs.
Considering the groups - Marc uses the fast failover feature of groups for
redundancy. We could implement the same, or alternatively, we could have a
colocated controller with each switch that runs FAUCET, and FAUCET can do
active link checking/listen to port status messages and decide to switch
I am disposed towards the latter approach because, while it does require a
controller, that controller can be rebooted etc. without disruption to the
network, and we can have some more local intelligence to diagnose soft
failures (e.g. low-level packet loss).
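
To make the controller-driven alternative concrete, here is a small sketch of
the decision logic only, under the assumption that the controller keeps an
ordered list of preferred ports and is told about port status changes. The
class and method names are hypothetical, not FAUCET APIs. Picking the first
live port mirrors what a fast failover group does with its buckets.

```python
def select_active_port(preferred_ports, port_is_up):
    """Return the first port in preference order that is up, else None."""
    for port in preferred_ports:
        if port_is_up.get(port, False):
            return port
    return None


class FailoverState:
    """Tracks port liveness and which port traffic should currently use."""

    def __init__(self, preferred_ports):
        self.preferred_ports = list(preferred_ports)
        # Assumption: all ports are considered up until told otherwise.
        self.port_is_up = {port: True for port in self.preferred_ports}
        self.active = select_active_port(
            self.preferred_ports, self.port_is_up)

    def port_status(self, port, is_up):
        """Record a port status change.

        Returns True if the active port changed, in which case the
        caller would need to push updated flows to the switch.
        """
        self.port_is_up[port] = is_up
        new_active = select_active_port(
            self.preferred_ports, self.port_is_up)
        changed = new_active != self.active
        self.active = new_active
        return changed
```

The same port_status entry point could also be fed by active link checks or
packet loss measurements, which is where the extra local intelligence would
plug in.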
Would very much appreciate your thoughts!
In the NZNOG deployment we used various expeditious methods to reduce
gateway resolver activity, and we did not optimize regular learning
Regular learning activity is now much reduced, and I'd like to take a
different approach to gateway resolver activity - making it progressive,
with an absolute maximum number of resolver attempts/sec so as not to blow
out Ryu's queues. The basic idea is to maintain a set of gateways that need
resolving and spool them out at a controlled rate.
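
The basic idea could be sketched roughly as follows. This is an illustration
under stated assumptions, not the planned implementation: the class name is
hypothetical, and it assumes something calls next_batch() once per second and
sends a resolution request for each gateway returned.

```python
from collections import deque


class GatewayResolveQueue:
    """Holds gateways needing resolution and releases them at a capped rate."""

    def __init__(self, max_per_sec):
        self.max_per_sec = max_per_sec
        self.pending = deque()   # resolution order
        self.queued = set()      # fast membership check, avoids duplicates

    def add(self, gateway):
        """Queue a gateway for resolution unless it is already queued."""
        if gateway not in self.queued:
            self.queued.add(gateway)
            self.pending.append(gateway)

    def resolved(self, gateway):
        """Stop retrying a gateway once it has been resolved."""
        if gateway in self.queued:
            self.queued.discard(gateway)
            self.pending.remove(gateway)

    def next_batch(self):
        """Return at most max_per_sec gateways to try now.

        Each returned gateway is rotated to the back of the queue, so
        unresolved gateways are retried only after all others had a turn.
        """
        batch = []
        for _ in range(min(self.max_per_sec, len(self.pending))):
            gateway = self.pending.popleft()
            batch.append(gateway)
            self.pending.append(gateway)
        return batch
```

Because the batch size is capped regardless of how many gateways are queued,
the load on the controller stays bounded; a large number of unresolved
gateways just takes longer to cycle through.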
I plan to have this in this week. From there, I think it would be good to
concentrate on Brad's WAND deployment to ensure we are handling a realistic
We also need to do some significant OO'ing of our code. We rapidly
accumulated a fair bit of functionality and I did do some incremental work
to move code out of valve.py, but aside from routing I did not attempt
anything intrusive - we prioritized getting things working first. For
ongoing code health, though, we now need to do something about that as
complexity is too high