ACTION failed when can't get the cluster lock

Bug #1681620 reported by Haiwei Xu
This bug affects 5 people
Affects: senlin
Status: Fix Committed
Importance: Critical
Assigned to: XueFeng Liu
Milestone: (none listed)

Bug Description

When trying to attach two policies to one cluster, one ATTACH_ACTION got the lock, but the other action failed to grab the lock.

The error messages are shown below:

2017-04-11 02:49:49.060 ERROR senlin.engine.senlin_lock [req-3e56f0b0-44fc-45a2-ab1d-8577f25cf77a None None] Cluster is already locked by action [u'b59c4189-2a39-4e6f-87a5-e29a262019a6'], action 9b93417d-b621-4e4d-bc10-7f6895a98af5 failed grabbing the lock
2017-04-11 02:49:49.178 WARNING senlin.engine.event [req-3e56f0b0-44fc-45a2-ab1d-8577f25cf77a None None] test-senlin-cluster_scaling_cluster-worltj4mwv2l [2e904b70] CLUSTER_ATTACH_POLICY - error: Failed in locking cluster.
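
The contention in the log above can be illustrated with a minimal sketch. This is not Senlin's actual implementation: `ClusterLockTable`, `acquire`, and `release` are hypothetical names, and the model assumes an exclusive, non-blocking lock keyed by cluster ID, which is what the error message suggests.

```python
import threading


class ClusterLockTable:
    """Hypothetical in-memory stand-in for a per-cluster exclusive lock table."""

    def __init__(self):
        self._owners = {}  # cluster_id -> action_id currently holding the lock
        self._guard = threading.Lock()

    def acquire(self, cluster_id, action_id):
        """Try once to take the lock; return the ID of the action that holds it."""
        with self._guard:
            # setdefault stores action_id only if the cluster is not locked yet.
            return self._owners.setdefault(cluster_id, action_id)

    def release(self, cluster_id, action_id):
        """Release the lock only if action_id is the current owner."""
        with self._guard:
            if self._owners.get(cluster_id) == action_id:
                del self._owners[cluster_id]
                return True
            return False


locks = ClusterLockTable()
first = locks.acquire("cluster-1", "action-a")   # action-a takes the lock
second = locks.acquire("cluster-1", "action-b")  # action-b finds action-a as owner
released = locks.release("cluster-1", "action-a")
retry = locks.acquire("cluster-1", "action-b")   # works once the lock is free
```

In this model the second attach action does not fail for good; it only needs to retry after the first one releases the lock. An action that gives up after a single failed attempt would produce the "Failed in locking cluster" error seen above.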

Haiwei Xu (xu-haiwei)
Changed in senlin:
importance: Undecided → Critical
Revision history for this message
Thiago Martins (martinx) wrote :

Guys,

 I'm trying to delete a cluster and it is not working!

 To delete it, Senlin tells me to detach the policies first, but that doesn't work either.

 Command:

$ openstack cluster policy detach --policy del-pol-1 my-cluster-1
Request accepted by action: 86cc4e6d-19c8-423c-b3f6-002503060b5d

 senlin-engine.log:

http://paste.openstack.org/show/614507

 I'm stuck! I can't delete the cluster and can't detach the policies...

 Any tips?

Thanks!
Thiago

Qiming Teng (tengqim) wrote :

@Thiago, which version are you using?

Thiago Martins (martinx) wrote :

@Qiming,

 I'm using Ubuntu 16.04 with packages from Ubuntu Cloud Archive:

 senlin-common 3.0.0-0ubuntu1~cloud0

Cheers!
Thiago

XueFeng Liu (jonnary-liu) wrote :
Changed in senlin:
assignee: nobody → XueFeng Liu (jonnary-liu)
status: New → Fix Committed
status: Fix Committed → Confirmed
status: Confirmed → Fix Committed
XueFeng Liu (jonnary-liu) wrote :

This bug is the same as the one below:
https://bugs.launchpad.net/senlin/+bug/1682002

Thiago Martins (martinx) wrote :

I applied the above patch, but it still can't lock the cluster, so the policies are not being detached.

The error (CLUSTER_DETACH_POLICY - error: Failed in locking cluster) is now repeating in a loop:

http://paste.openstack.org/show/614793/

Thiago Martins (martinx) wrote :

I cleaned up the Senlin SQL database with something like:

-
mysql> delete from cluster_lock where semaphore=3;
-

And it worked! Policies detached, Cluster deleted.

XueFeng Liu (jonnary-liu) wrote :

Yes, Thiago. I think you ran into two bugs:
1. The first is a scheduling problem: https://bugs.launchpad.net/senlin/+bug/1682002
2. The second is a cluster/node lock problem, which is caused by restarting the senlin engine.
The second problem needs to be solved by a patch.
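
To illustrate the second problem, here is a minimal sketch of how locks left behind by a restarted engine could be purged. It is an assumption-laden model, not Senlin's schema or code: the `action` and `cluster_lock` tables, their columns, the engine IDs, and `purge_locks_of_engine` are all hypothetical, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3

# Hypothetical schema: each action records which engine owns it,
# and each lock row records which action holds the cluster lock.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE action (id TEXT PRIMARY KEY, owner TEXT);
    CREATE TABLE cluster_lock (cluster_id TEXT PRIMARY KEY, action_id TEXT);
    INSERT INTO action VALUES ('a1', 'engine-old'), ('a2', 'engine-new');
    INSERT INTO cluster_lock VALUES ('c1', 'a1'), ('c2', 'a2');
""")


def purge_locks_of_engine(conn, engine_id):
    """Delete cluster locks held by actions that belonged to engine_id."""
    cur = conn.execute(
        "DELETE FROM cluster_lock WHERE action_id IN "
        "(SELECT id FROM action WHERE owner = ?)",
        (engine_id,),
    )
    conn.commit()
    return cur.rowcount  # number of stale lock rows removed


# 'engine-old' was restarted, so its actions can never release their locks.
removed = purge_locks_of_engine(conn, "engine-old")
remaining = conn.execute(
    "SELECT cluster_id, action_id FROM cluster_lock"
).fetchall()
```

Scoping the delete to the dead engine's actions is the point of the sketch: it leaves locks held by live actions alone, which a blanket delete on the lock table would not.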

manthang (manvanthang) wrote :

I'm using Senlin version Pike/4.0.0 and still get the same issue when trying to follow this guide:
https://docs.openstack.org/senlin/ocata/scenarios/autoscaling_heat.html

Here is the full log from senlin-engine:
http://paste.ubuntu.com/25900670/

Is there any workaround or hotfix for this bug yet?

Thanks!

Qiming Teng (tengqim) wrote :

@manthang, I suspect there are bugs in attaching an LB policy to your cluster.
Can you check whether things work without the LB policy?

manthang (manvanthang) wrote :

@tengqim: I removed the "bindings" property from the lb_policy definition in the Heat template, and the Senlin cluster was created successfully. So it seems likely that the issue is with LB policy attachment.

Here is my template:
http://paste.ubuntu.com/26002640/

And senlin-engine.log:
http://paste.ubuntu.com/26002644/

Thanks,

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to senlin (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/540625

OpenStack Infra (hudson-openstack) wrote : Related fix merged to senlin (master)

Reviewed: https://review.openstack.org/532641
Committed: https://git.openstack.org/cgit/openstack/senlin/commit/?id=957d77a0a9a5e6904011ecc38eac99192b66f31d
Submitter: Zuul
Branch: master

commit 957d77a0a9a5e6904011ecc38eac99192b66f31d
Author: tengqm <email address hidden>
Date: Sat Feb 3 20:48:24 2018 -0500

    Update sdk connection, tests and isoformat

    We can simplify connection creation now. The Connection constructor
    understands not to load yaml or env vars if it doesn't receive a cloud
    argument - so this can all be done in one step.

    Update the mocks in the tests to work for the new calling pattern.

    Finally - isoformat wasn't processing UTC+00:00. There are some
    dark issues with how isoformats get passed around OpenStack, but this
    fixes the unittest matching to account for UTC+00:00.
    openstacksdk has been working on merging the code from shade and
    os-client-config. Part of this will result in the removal of the
    Profile object in favor of the CloudRegion object from the new
    openstack.config.

    This updates the Connection construction code to code that should work
    with both old and new versions of openstacksdk.

    Temporarily mask some jobs to get this in. Will need to revise the
    tempest plugin to fix gate jobs later.

    Related-Bug: #1681620
    Co-Authored-By: tengqm <email address hidden>
    Change-Id: I10c74ca48b1ddb848a5a68cc4360431a21e0a2cc

OpenStack Infra (hudson-openstack) wrote : Change abandoned on senlin (master)

Change abandoned by Qiming Teng (<email address hidden>) on branch: master
Review: https://review.openstack.org/540625
Reason: All covered by #532641
