fuel-ccp

Keystone connection issue during big heat stack creation

Bug #1680430 reported by Sergey Galkin on 2017-04-06

This bug affects 1 person

Affects     Status   Importance   Assigned to   Milestone
fuel-ccp    New      Undecided    Unassigned

Bug Description

Steps to reproduce
1. Deploy ccp with the configs from https://review.openstack.org/#/c/451419/ on 152 nodes
2. Run the shaker (http://pyshaker.readthedocs.io/en/latest/) scenario openstack/full_l2

Creation of the heat stack always fails with the error: Unable to establish connection to http://keystone.ccp.svc.cluster.local:35357/v3/auth/tokens: ('Connection aborted.', BadStatusLine("''",))
For example:
2017-04-06 11:30:46.379 33032 ERROR shaker.engine.server Exception: Failed to deploy Heat stack 76fa9649-fecb-434c-b0e4-c3380666f318. Expected status COMPLETE, but got FAILED. Reason: Resource CREATE failed: ConnectFailure: resources.shaker_uskadi_slave_25: Unable to establish connection to http://keystone.ccp.svc.cluster.local:35357/v3/auth/tokens: ('Connection aborted.', BadStatusLine("''",))

Increasing the number of keystone replicas to 10 in topology.yaml helps a little: the heat stack then deploys successfully from time to time.

Logs from all keystone pods do not show any errors.

Sergey Galkin (sgalkin) on 2017-04-06

summary:

- Keystone connection issue during big heat stake creation
+ Keystone connection issue during big heat stack creation

Yuriy Taraday (yorik-sar) wrote on 2017-04-11:

I've investigated this issue and here's what I've found.

Judging from the issue in requests [0], the most likely reason for this error is the server dropping the connection before answering the request. In this scenario [1], httplib in the Python 2.7 stdlib raises BadStatusLine, which gives the libraries higher in the stack (urllib3 or requests) no way to handle it. This was fixed in the stdlib in Python 3.5 [2] by raising a different exception in this case, but it still doesn't seem to be handled higher in the stack.
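
To illustrate the failure mode described above, here is a minimal sketch in Python 3 (not the 2.7 environment from this report). It assumes a throwaway local socket server that reads one request and closes the connection without replying; the function names and the request path are made up for the illustration. A GET is used instead of the real token POST only to keep the sketch deterministic. On Python 3.5+ the client sees http.client.RemoteDisconnected, which is a subclass of both BadStatusLine and ConnectionResetError.

```python
import http.client
import socket
import threading


def serve_and_drop(listener):
    """Accept one connection, read the request, then close without replying."""
    conn, _ = listener.accept()
    conn.recv(65536)
    conn.close()


def request_against_dropping_server():
    """Return the exception raised when the server drops the connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_and_drop, args=(listener,))
    worker.start()

    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        client.request("GET", "/v3")  # hypothetical path, for illustration only
        client.getresponse()
    except http.client.BadStatusLine as exc:
        # RemoteDisconnected lands here because it subclasses BadStatusLine.
        return exc
    finally:
        client.close()
        worker.join()
        listener.close()
    return None


if __name__ == "__main__":
    print(repr(request_against_dropping_server()))
```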

I took a look at the traffic between heat-engine and keystone and found these 3 problems:
1. heat-engine makes a lot of token creation requests during stack creation (about 30-60 requests per second);
2. Keystone (or rather Apache in front of it) eventually drops the kept-alive connection from heat-engine;
3. There's no way for heat-engine to retry on such a failure (in Python 2.7).

Re 1.: Heat doesn't seem to properly cache the token that it uses (or reuse the keystoneclient session). I didn't find a relevant issue in Heat upstream.
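
As a rough sketch of what caching a token could look like (this is not Heat's or keystoneclient's actual code), the class below reuses a token until shortly before it expires. The CachedToken name, the fetch_token callable and its (token, lifetime in seconds) return value are all assumptions made for the illustration.

```python
import time


class CachedToken:
    """Reuse a token until it is close to expiry (illustrative sketch only)."""

    def __init__(self, fetch_token, refresh_margin=30.0, clock=time.monotonic):
        # fetch_token is a hypothetical callable returning (token, ttl_seconds).
        self._fetch = fetch_token
        self._margin = refresh_margin
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        # Fetch a new token only if none is cached or the cached one is
        # within refresh_margin seconds of expiring.
        if self._token is None or now >= self._expires_at - self._margin:
            token, ttl = self._fetch()
            self._token = token
            self._expires_at = now + ttl
        return self._token
```

With something like this in front of the token request, many resource creations within one stack would share a single token instead of each issuing a new request to keystone.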

Re 2.: This seems like legitimate behavior, and I don't see a way to work around it. My guess would be to make keep-alive connections persist longer, but the default limit is already really high.

Re 3.: As I understand it, Heat has claimed to support Python 3.x for some time, but we would still need to adjust the clients to handle this situation.
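
One possible shape for such a client-side adjustment is sketched below in Python 3. It is only an illustration: call_with_retry and its parameters are hypothetical names, not an existing API of Heat or its clients. It retries a callable when the connection is dropped before a response arrives, which is reasonable for a token request but not automatically safe for every request type.

```python
import http.client
import time

# Errors that indicate the server dropped the connection before replying.
# On Python 3.5+, RemoteDisconnected is covered by both entries.
RETRYABLE = (http.client.BadStatusLine, ConnectionError)


def call_with_retry(func, attempts=3, delay=0.1):
    """Call func(), retrying up to `attempts` times on dropped connections."""
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except RETRYABLE as exc:
            last_error = exc
            # Pause before the next attempt, but not after the final one.
            if attempt + 1 < attempts:
                time.sleep(delay)
    raise last_error
```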

[0] https://github.com/kennethreitz/requests/issues/2364
[1] https://bugs.python.org/issue8450
[2] https://bugs.python.org/issue3566
