Ceph not working after killing primary controller

Bug #1268579 reported by Denis Ipatov
This bug affects 5 people
Affects: Fuel for OpenStack
Status: Fix Released
Importance: High
Assigned to: Ryan Moe
Milestone: 4.1

Bug Description

I have an HA configuration with 3 controllers + 1 compute and am running some tests. For example, when I shut down the first controller and try to create a new volume with Cinder, it stays in status "creating".
I only use Ceph for Cinder; for Glance I use Swift. Uploading an image to Glance works fine.

[root@node-15 ~]# ceph health

^CError connecting to cluster: Error

[root@node-15 ~]# ceph mon stat
^CError connecting to cluster: Error

I changed the following lines in the /etc/ceph/ceph.conf file on each controller node:

from:
[global]
filestore_xattr_use_omap = true
mon_host = 192.168.0.3
fsid = b963be07-edcb-4f65-a661-d005844f9332
mon_initial_members = node-9
auth_supported = cephx
osd_journal_size = 2048
osd_pool_default_size = 2
osd_pool_default_min_size = 1
osd_pool_default_pg_num = 100
public_network = 192.168.0.0/24
osd_pool_default_pgp_num = 100
osd_mkfs_type = xfs
cluster_network = 192.168.1.0/24

to:
[global]
filestore_xattr_use_omap = true
mon_host = 192.168.0.3 192.168.0.4 192.168.0.5 # add the IPs of all your controller nodes here
fsid = b963be07-edcb-4f65-a661-d005844f9332
mon_initial_members = node-9 node-10 node-11 # add the hostnames of all your controller nodes here
auth_supported = cephx
osd_journal_size = 2048
osd_pool_default_size = 2
osd_pool_default_min_size = 1
osd_pool_default_pg_num = 100
public_network = 192.168.0.0/24
osd_pool_default_pgp_num = 100
osd_mkfs_type = xfs
cluster_network = 192.168.1.0/24

and Ceph started to work.
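The edit above is mechanical, so it can be scripted. A minimal sketch of the idea using Python's configparser — the node names and IPs are the example values from this report, and real ceph.conf files may contain syntax configparser does not fully preserve, so treat this as an illustration rather than a drop-in tool:

```python
# Sketch: rewrite mon_host and mon_initial_members in a ceph.conf [global]
# section so they list every controller, not just the primary one.
import configparser
import io

def expand_mon_list(conf_text, mons):
    """mons: mapping of hostname -> monitor IP for all controller nodes."""
    cfg = configparser.ConfigParser()
    cfg.read_string(conf_text)
    # Space-separated lists, matching the format shown in the workaround above.
    cfg["global"]["mon_host"] = " ".join(mons.values())
    cfg["global"]["mon_initial_members"] = " ".join(mons)
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()

mons = {"node-9": "192.168.0.3", "node-10": "192.168.0.4", "node-11": "192.168.0.5"}
old_conf = "[global]\nmon_host = 192.168.0.3\nmon_initial_members = node-9\n"
print(expand_mon_list(old_conf, mons))
```

After rewriting the file on each controller, the monitors still need to pick up the change (see the restart/notify discussion in the comments below).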

Denis Ipatov (dipatov)
tags: added: fuel-4.0
Changed in fuel:
status: New → Triaged
Evgeniy L (rustyrobot)
Changed in fuel:
importance: Undecided → High
milestone: none → 4.1
tags: added: ceph library
Changed in fuel:
assignee: nobody → Andrew Woodward (xarses)
Revision history for this message
mauro (maurof) wrote :

I'd like to work around this bug (and the one described in mine: https://bugs.launchpad.net/bugs/1267937).
Is it therefore safe to apply the described workaround in 4.0?

mon_host = 192.168.0.3 192.168.0.4 192.168.0.5
mon_initial_members = node-9 node-10 node-11

Is it necessary to restart any Ceph process after modifying ceph.conf?

thanks

Revision history for this message
Dmitry Borodaenko (angdraug) wrote :

According to Ceph documentation:
http://ceph.com/docs/master/dev/mon-bootstrap/#cluster-expansion

you should be able to notify a running ceph-mon process about the change from the command line:
ceph-mon -i <myid> --mon-initial-members foo,bar,baz
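Note that this form of the command takes a comma-separated member list, unlike the space-separated mon_initial_members line in ceph.conf. A small sketch assembling the argv for that invocation (monitor id and hostnames are placeholders):

```python
# Sketch: build the ceph-mon argv that notifies a running monitor of the
# full initial-members list, per the command quoted above.
def mon_update_cmd(mon_id, members):
    # ceph-mon expects the members as one comma-separated argument.
    return ["ceph-mon", "-i", mon_id, "--mon-initial-members", ",".join(members)]

cmd = mon_update_cmd("node-9", ["node-9", "node-10", "node-11"])
print(" ".join(cmd))
```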

tags: added: customer-found
Revision history for this message
Dmitry Borodaenko (angdraug) wrote :

ceph.conf on the primary controller can be updated with a list of all other ceph-mon nodes based on /etc/astute.yaml after the Ceph cluster is up, but before deployment of other controllers and other nodes begins. It is not necessary to wait for other ceph-mon nodes to come up before they can be added to ceph.conf.

This makes a Puppet-only solution possible, without changes to the orchestration engine: add a check after 'ceph-deploy new' that waits for cluster initialization to complete, and have that check notify a resource that updates ceph.conf with the full list of ceph-mon nodes.
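The waiting step described here can be sketched as a simple retry loop. The probe callable below stands in for whatever "is the cluster initialized?" test the real fix uses (e.g. querying monitor quorum); it is an assumption for illustration, not the actual Puppet implementation:

```python
# Sketch: poll a cluster-readiness probe until it succeeds or a timeout
# elapses. `probe` is a stand-in for a real check such as parsing the
# output of a Ceph quorum query.
import time

def wait_for_cluster(probe, timeout=300, interval=5):
    """Call `probe` (returns True once the cluster answers) every
    `interval` seconds; return True on success, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval)
    return False

# Example with a fake probe that succeeds on the third attempt:
attempts = iter([False, False, True])
assert wait_for_cluster(lambda: next(attempts), timeout=10, interval=0)
```

Only after this returns successfully would the ceph.conf-updating resource be notified.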

Changed in fuel:
assignee: Andrew Woodward (xarses) → Dmitry Borodaenko (dborodaenko)
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/73106

Mike Scherbakov (mihgen)
tags: added: release-notes
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/73106
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=f769f20a287a2f6a438f12015b295ee8f94d2f35
Submitter: Jenkins
Branch: master

commit f769f20a287a2f6a438f12015b295ee8f94d2f35
Author: Dmitry Borodaenko <email address hidden>
Date: Wed Feb 12 15:24:24 2014 -0800

    Add full list of mon nodes to ceph.conf for HA

    Change-Id: Ib426524c5483fa10207351401880281e27734a34
    Closes-bug: #1268579

Changed in fuel:
status: In Progress → Fix Committed
Changed in fuel:
status: Fix Committed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/75044

Changed in fuel:
assignee: Dmitry Borodaenko (dborodaenko) → Vladimir Kuklin (vkuklin)
Changed in fuel:
assignee: Vladimir Kuklin (vkuklin) → Ryan Moe (rmoe)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/75044
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=61ef37b48f3ef1346800317f4af440f767d32882
Submitter: Jenkins
Branch: master

commit 61ef37b48f3ef1346800317f4af440f767d32882
Author: Vladimir Kuklin <email address hidden>
Date: Thu Feb 20 18:00:30 2014 +0400

    Fix ruby function args parsing

    Change-Id: I3149ce42e2bdfacf30e4feb4300e69b3f33882f6
    Closes-bug: #1268579

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
Nastya Urlapova (aurlapova) wrote :

{
build_id: "2014-02-26_00-30-27",
mirantis: "yes",
build_number: "208",
nailgun_sha: "ea08cef3e06a72f47cfaa8cd8fe6d034e2cf722e",
ostf_sha: "8e6681b6d06c7cb20a84c1cc740d5f2492fb9d85",
fuelmain_sha: "7939e28a5b3ab65361991e2bc22a792c7561cf87",
astute_sha: "10cccc87f2ee35510e43c8fa19d2bf916ca1fced",
release: "4.1",
fuellib_sha: "0a2e5bdc01c1e3bb285acb7b39125101e950ac72"
}

Changed in fuel:
status: Fix Committed → Fix Released
Andrew Woodward (xarses)
tags: added: ha