fence_aws behaves differently in Focal and Bionic (LP: #1894323)

Bug #1900374 reported by Rafael David Tinoco
This bug affects 1 person
Affects: pacemaker (Ubuntu)
Status: New
Importance: Medium
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

After the SRU of LP: #1894323, a bug in pacemaker on Bionic appears to have been exposed:

When declaring the fence_aws primitive, you can either declare a single resource and describe the nodes in pcmk_host_map, or declare one fence resource per node that does the same thing using the "plug/port" resource argument. In Focal both methods work; in Bionic the second method does not.

It is not a big deal as there are some fence agents designed to work with "pcmk_host_map" only, and some others are designed to work with "plug/port" argument.

Test case (BIONIC):

Using the fence-agents version from LP: #1894323 you first configure the fence_aws primitive as:

```
node 1: bionic01
node 2: bionic02
node 3: bionic03

primitive fence-bionic stonith:fence_aws \
 params \
    access_key="xxxx" \
    secret_key="yyyy" \
    region="us-east-1" \
    pcmk_host_map="bionic01:i-068e134;bionic02:i-0136edd;bionic03:i-0de279ab"
```
and

```
property cib-bootstrap-options: \
    have-watchdog=false \
    dc-version=1.1.18-2b07d5c5a9 \
    cluster-infrastructure=corosync \
    stonith-enabled=on \
    stonith-action=reboot \
    no-quorum-policy=stop \
    cluster-name=bionic
```
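For reference, the pcmk_host_map value used in the primitive above is a semicolon-separated list of node:port pairs, where the port is the EC2 instance ID passed to the agent. The sketch below only illustrates that lookup; `host_map_lookup` is a hypothetical helper written for this note, not part of pacemaker or fence-agents, and it assumes `;` is the only separator in use.

```shell
#!/bin/sh
# Hypothetical helper (not part of pacemaker or fence-agents): print the port
# (here, the EC2 instance ID) that a pcmk_host_map string assigns to a node.
host_map_lookup() {
    map=$1
    node=$2
    # Entries are separated by ';' and each entry has the form "node:port".
    printf '%s\n' "$map" | tr ';' '\n' | while IFS=: read -r name port; do
        if [ "$name" = "$node" ]; then
            printf '%s\n' "$port"
        fi
    done
}

# Same (shortened) map as in the primitive above.
MAP="bionic01:i-068e134;bionic02:i-0136edd;bionic03:i-0de279ab"
host_map_lookup "$MAP" bionic02
```

With the map from the test case, looking up bionic02 prints i-0136edd, which is the instance the agent is expected to act on when that node must be fenced.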

You can cause an issue in the interconnect and observe the fence_aws agent working properly. Then stop the resource, remove it, and configure the fencing agent as one fence resource per node:

```
primitive fence-bionic01 stonith:fence_aws \
 params \
    access_key="xxxx" \
    secret_key="yyyy" \
    region="us-east-1" \
    pcmk_host_map="bionic01:i-068e134;bionic02:i-0136edd;bionic03:i-0de279ab" \
    plug="bionic01:i-068e134de1beddc7f"

primitive fence-bionic02 stonith:fence_aws \
 params \
    access_key="xxxx" \
    secret_key="yyyy" \
    region="us-east-1" \
    pcmk_host_map="bionic01:i-068e134;bionic02:i-0136edd;bionic03:i-0de279ab" \
    plug="bionic02:i-0136eddd045ceb7e2"

primitive fence-bionic03 stonith:fence_aws \
 params \
    access_key="xxxx" \
    secret_key="yyyy" \
    region="us-east-1" \
    pcmk_host_map="bionic01:i-068e134;bionic02:i-0136edd;bionic03:i-0de279ab" \
    plug="bionic03:i-0de279ab4e6d642c8"

location l-fence-bionic01 fence-bionic01 -inf: bionic01
location l-fence-bionic02 fence-bionic02 -inf: bionic02
location l-fence-bionic03 fence-bionic03 -inf: bionic03
```

This last example, using multiple fence resources, works in Focal but does not work in Bionic (after making sure both had the exact same fence_aws script version in bug LP: #1894323).
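One possible way to compare the two releases before bisecting is to ask pacemaker which fence devices it considers able to fence each node. The script below is a sketch under assumptions: the `run` and `check_fencing` wrappers and the DRY_RUN variable are made up for this note, and by default the script only prints the commands. `stonith_admin --list` and `--list-registered` are existing options, but their output on Bionic and Focal was not captured in this report.

```shell
#!/bin/sh
# Sketch only: prints the commands by default. Set DRY_RUN=0 on a cluster
# node to actually execute them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: query fencing topology for the given node names.
check_fencing() {
    for node in "$@"; do
        # List the fence devices pacemaker considers able to fence this node.
        run stonith_admin --list "$node"
    done
    # List every fence device currently registered with the fencer.
    run stonith_admin --list-registered
}

check_fencing bionic01 bionic02 bionic03
```

If the per-node setup is interpreted as intended, each node would be expected to list only its own fence-bionicNN device; a different answer on Bionic than on Focal would narrow down where the behavior diverges.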

I think a bisection of pacemaker between Bionic and Focal (yes, it's bad because it's a major version change) might be needed here to understand why using "plug" does not work in Bionic.

Looks like Red Hat also faces the same issue at:

https://access.redhat.com/solutions/4642491

Note: I have exhaustively tested the "pcmk_host_map"-only primitive and it worked fine (fencing the correct nodes every time I fenced the cluster). Using the "plug" argument in Bionic is not advised, as fence_aws then fences nodes other than the one given in the plug argument.

Changed in pacemaker (Ubuntu):
status: New → Confirmed
importance: Undecided → Medium
Lucas Kanashiro (lucaskanashiro) wrote :

The testing and bisecting (suggested by Rafael) still need to be done. This is in our backlog.

Athos Ribeiro (athos-ribeiro) wrote :

I wonder if this still affects any of our supported releases. If it does not, we should close this as wontfix, given that Bionic has reached EOSS.

Next steps: re-triage this one to verify if it still affects any supported series.

Changed in pacemaker (Ubuntu):
status: Confirmed → New