netplan bridge STP bug

Bug #2007304 reported by Satish Patel
This bug affects 1 person
Affects   Status    Importance   Assigned to   Milestone
netplan   Triaged   Medium       Unassigned

Bug Description

# cat /etc/lsb-release | grep LTS
DISTRIB_DESCRIPTION="Ubuntu 20.04.4 LTS"

netplan version: 0.104-0ubuntu2~20.04.2

I configured a netplan bridge and did not set any STP-related option. The documentation says STP is enabled by default, and the source code also indicates that it defaults to enabled [1].

My config snippet:

bridges:
    br-mgmt:
      dhcp4: no
      dhcp6: no
      interfaces: [ eno49.51 ]
      addresses: [ 10.74.1.12/23 ]
      gateway4: 10.74.0.1
      nameservers:
        addresses: [ 10.30.0.8, 10.30.0.10 ]
        search: [ foo.com, bar.com ]

When I check the STP status with the brctl show command, it reports that STP is off. That means netplan is not enabling STP on bridges, which looks like a bug.

# brctl show br-mgmt
bridge name     bridge id               STP enabled     interfaces
br-mgmt         8000.38eaa7327d40       no              eno49.51
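
The same state can also be read directly from the kernel's sysfs bridge attributes, without brctl. The following is a minimal Python sketch, assuming the standard /sys/class/net/<bridge>/bridge/stp_state file; the helper name bridge_stp_state and its sysfs_root parameter are illustrative and not part of netplan or any tool mentioned here.

```python
from pathlib import Path

# Values the kernel reports in stp_state.
STP_STATES = {0: "disabled", 1: "kernel STP", 2: "userspace STP"}


def bridge_stp_state(bridge: str, sysfs_root: str = "/sys/class/net") -> str:
    """Return a readable STP state for `bridge`, read from sysfs.

    `sysfs_root` is a hypothetical parameter added so the helper can be
    pointed at a test directory instead of the live system.
    """
    raw = Path(sysfs_root, bridge, "bridge", "stp_state").read_text().strip()
    return STP_STATES.get(int(raw), f"unknown ({raw})")
```

Calling bridge_stp_state("br-mgmt") on the affected host reads the value that brctl summarizes in its "STP enabled" column.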

[1] https://github.com/canonical/netplan/blob/0.105/tests/generator/base.py#L115

Official doc: https://netplan.io/reference

Danilo Egea Gondolfo (danilogondolfo) wrote :

Hi, thanks for your bug report.

I can confirm the issue still exists on netplan.io 0.106.

The problem happens when the bridge "parameters" section is omitted. When this field is not defined, the handler responsible for initializing the related data structures in the netplan parser is never called and the backend configuration is never generated.

To fix this, we need to initialize the bridge parameters [0] regardless of whether the "parameters" field exists in the YAML file.

Here is a simpler reproducer:

$ cat etc/netplan/90-configs.yaml
network:
  bridges:
    br-mgmt: {}

$ netplan generate --root-dir /tmp/fakeroot/

$ cat run/systemd/network/10-netplan-br-mgmt.netdev
[NetDev]
Name=br-mgmt
Kind=bridge

Defining an empty "parameters" section is enough to make the parser emit the configuration correctly.

$ cat etc/netplan/90-configs.yaml
network:
  bridges:
    br-mgmt:
      parameters: {}

$ netplan generate --root-dir /tmp/fakeroot/

$ cat run/systemd/network/10-netplan-br-mgmt.netdev
[NetDev]
Name=br-mgmt
Kind=bridge

[Bridge]
STP=true

[0] - https://github.com/canonical/netplan/blob/main/src/parse.c#L1891
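
The behavior described in this comment can be summarized with a toy model. This is a simplified Python sketch, not netplan's actual parser or generator code: the function name render_netdev and the dict-based input are assumptions made for illustration, and the STP=false branch is assumed rather than taken from the outputs shown above.

```python
def render_netdev(name: str, bridge: dict) -> str:
    """Toy model of the .netdev output for a bridge definition."""
    lines = ["[NetDev]", f"Name={name}", "Kind=bridge"]
    # The STP default only takes effect inside the "parameters" handler,
    # so nothing is emitted when the key is missing from the YAML.
    if "parameters" in bridge:
        stp = bridge["parameters"].get("stp", True)
        # Assumption: an explicit "stp: false" is written out as STP=false.
        lines += ["", "[Bridge]", f"STP={'true' if stp else 'false'}"]
    return "\n".join(lines) + "\n"


# Mirrors the two reproducers above: "br-mgmt: {}" and "parameters: {}".
print(render_netdev("br-mgmt", {}))
print(render_netdev("br-mgmt", {"parameters": {}}))
```

The proposed fix corresponds to applying the default outside the `if "parameters" in bridge` check, so that both inputs produce the same output.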

Lukas Märdian (slyon)
Changed in netplan:
status: New → Triaged
importance: Undecided → Medium

Danilo Egea Gondolfo (danilogondolfo) wrote :

After some consideration, I believe that changing the code to honor the documentation will potentially break network setups based on systemd-networkd out there.

On the other hand, leaving this option to the defaults of the underlying network configuration system will lead to inconsistencies in the final network configuration when switching between renderers. The reason is that NetworkManager enables STP by default [0] [1], while networkd leaves it at the kernel's default [2] (which apparently is "disabled").

In my system, if I create two bridges, one with NetworkManager and the other with networkd, I get one with STP=yes and the other with STP=no, respectively.

So the options are:
1) We fix the code to honor the documentation and always enable STP. This will cause a sudden change on all bridges created by netplan and managed by networkd that don't have the "parameters" section defined in the YAML file (and therefore don't have STP=true in their .netdev file).

2) We change the code to follow what appears to be the kernel's default and always disable STP unless the user explicitly enables it. This would change the behavior for NetworkManager when the Netplan YAML doesn't have the bridge "parameters" section. This seems "less bad" than option 1, since servers (at least Ubuntu Server) default to networkd as their network configuration service.

3) We leave it to the backend default and change the docs to say something like "this option defaults to whatever is the backend's default".

For the sake of consistency, we should always either enable or disable STP and not leave it to the system's default value. In that case, we need to choose which backend's current behavior we will change.

[0] - https://github.com/NetworkManager/NetworkManager/blob/1.42.2/src/core/devices/nm-device-bridge.c#L314
[1] - https://github.com/NetworkManager/NetworkManager/blob/1.42.2/src/libnm-base/nm-base.h#L325
[2] - https://www.freedesktop.org/software/systemd/man/systemd.netdev.html#STP=
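
To compare the three options side by side, here is a small Python sketch of the resulting STP state for a bridge whose YAML has no "parameters" section. It only restates the reasoning in this comment; the names BACKEND_DEFAULT and effective_stp are hypothetical, and the backend defaults are the ones observed above (networkd following the kernel, NetworkManager enabling STP).

```python
from typing import Optional

# Backend defaults as described in this comment (networkd follows the kernel).
BACKEND_DEFAULT = {"networkd": False, "NetworkManager": True}


def effective_stp(option: int, backend: str,
                  explicit: Optional[bool] = None) -> bool:
    """STP state under each proposed option; hypothetical helper."""
    if explicit is not None:
        return explicit                  # an explicit user setting always wins
    if option == 1:
        return True                      # honor the docs: always enable
    if option == 2:
        return False                     # follow the kernel: always disable
    if option == 3:
        return BACKEND_DEFAULT[backend]  # leave it to the backend
    raise ValueError(f"unknown option: {option}")


for option in (1, 2, 3):
    row = {b: effective_stp(option, b) for b in BACKEND_DEFAULT}
    print(f"option {option}: {row}")
```

Options 1 and 2 give the same result on both backends, while option 3 keeps the renderer-dependent result that this comment describes.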

Lukas Märdian (slyon) wrote :

Thank you a lot for your detailed investigation!

What's the worst case that could happen if we go for option (1)? Will enabling STP break existing bridges that don't use STP, or will it just be ignored in the setup?

IMO we should probably go for option 2, as this should break the fewest setups while still leaving us with consistent behavior across different backends. In addition, we should update the documentation to mention that STP is only activated when bridge parameters are given.

Satish Patel (satish-txt) wrote :

100% agreed that keeping it disabled has more advantages than turning it on for everyone. Very few people run STP on a host machine unless they have a very special case, and running STP on a host can cause major issues if the switch is not configured accordingly. I would vote to disable it for everyone and let end users decide whether to turn it on.
