I think there are a number of questions here.

It would be good to separate them, then implement and test them independently.

For example, probing link status and reacting to down links that still show link would be a valuable step forward by itself.

We can then address your other questions separately.


On Mon, Jul 2, 2018 at 9:58 AM Truong Huu Trung <trungdtbk@gmail.com> wrote:
I got you. 
Btw, what I propose is a generalization of the case where the stack has only two switches.
In that topology, it will work with two controllers, but not with more than two.


On Mon, Jul 2, 2018 at 8:27 AM Josh Bailey <joshb@google.com> wrote:
We have deployed stacking at WAND and it already responds to link failures by reprogramming flows so the network still works (where link status can be used), and it does support multiple controllers. There is also automated testing in place to demonstrate this.

Responding to probe status would be a positive change, but it must be accompanied by tests per our usual convention.


On Mon, Jul 2, 2018, 05:09 Truong Huu Trung <trungdtbk@gmail.com> wrote:
Currently, broadcast (and unicast) rules are built statically based on the configured stack graph.
I'm implementing some incremental changes to make them resilient and keep them updated based on the actual network state.
I just opened PR #2125, which updates the graph based on link probes.

I suggest the following changes to the stack broadcast algorithm: 
1. A DP floods to local ports (non-stack ports) only if it can't reach the root DP.
2. Stack should use only links that have been probed to be working (not the configured links).
3. Each time the stack graph changes, a DP rebuilds flood rules (and unicast rules if any).
I've already implemented this (it does not change the code much). There is no automated test yet.
If you're ok, I'll proceed to the tests and make a PR for it.
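
To make points 2 and 3 concrete, here is a minimal standalone sketch. It is not the code from the PR and does not use any Faucet API; the class and method names (ProbedStackGraph, probe_result, can_reach_root) are made up for illustration. It keeps only configured links whose probes currently succeed, reports when the working graph changes (the moment flood and unicast rules would be rebuilt), and exposes the root-reachability check that point 1 relies on. What a DP actually floods in each case is deliberately left out.

```python
from collections import deque


class ProbedStackGraph:
    """Hypothetical stack graph limited to configured links whose probes succeed."""

    def __init__(self, root, configured_links):
        self.root = root
        self.configured = {frozenset(link) for link in configured_links}
        self.working = set()  # links currently confirmed by probes

    def probe_result(self, dp_a, dp_b, ok):
        """Record one probe result; return True if the working graph changed."""
        link = frozenset((dp_a, dp_b))
        if link not in self.configured:
            return False  # ignore links that are not part of the configured stack
        changed = (link in self.working) != ok
        if ok:
            self.working.add(link)
        else:
            self.working.discard(link)
        return changed

    def reachable_from_root(self):
        """Breadth-first search from the root over working links only."""
        neighbours = {}
        for link in self.working:
            dp_a, dp_b = tuple(link)
            neighbours.setdefault(dp_a, set()).add(dp_b)
            neighbours.setdefault(dp_b, set()).add(dp_a)
        seen = {self.root}
        queue = deque([self.root])
        while queue:
            dp = queue.popleft()
            for peer in neighbours.get(dp, ()):
                if peer not in seen:
                    seen.add(peer)
                    queue.append(peer)
        return seen

    def can_reach_root(self, dp):
        return dp in self.reachable_from_root()


# Usage: count how often rules would need rebuilding for a series of probes.
graph = ProbedStackGraph("sw1", [("sw1", "sw2"), ("sw2", "sw3")])
rebuilds = 0
probes = [("sw1", "sw2", True), ("sw2", "sw3", True),
          ("sw2", "sw3", True), ("sw1", "sw2", False)]
for dp_a, dp_b, ok in probes:
    if graph.probe_result(dp_a, dp_b, ok):
        rebuilds += 1  # flood (and unicast) rules would be rebuilt here
```

A repeated probe result that does not change the graph triggers no rebuild, which is one way to limit the extra rules sent while many stack links come up at startup.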

These changes make the stack react to stack events. There are limitations though:
- A lot more rules are sent to switches during startup (many stack links come up during this time).
- It works only if all DPs are controlled by the same Faucet, because there is no mechanism to propagate events between Faucet instances yet.
- It does not deal with root failure.

Trung

_______________________________________________
Faucet-dev mailing list
http://faucet.nz/
Faucet-dev@list.waikato.ac.nz
https://list.waikato.ac.nz/mailman/listinfo/faucet-dev