ElasticSearch error handling not correct

Bug #1524998 reported by Travis Tripp
Affects: OpenStack Searchlight
Status: New
Importance: High
Assigned to: Unassigned
Milestone: none

Bug Description

Failures to index a document in Elasticsearch are not currently handled properly by the notification handlers. [1] [2]

If indexing succeeds, the handler returns oslo_messaging.NotificationResult.HANDLED.

If indexing fails, the handler simply falls through and returns nothing (an implicit None).

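However, according to the oslo.messaging notification handling docs [3], a None return counts as an acknowledgement, so a message whose indexing failed is acknowledged and dropped rather than retried: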

"An endpoint method can explicitly return oslo_messaging.NotificationResult.HANDLED to acknowledge a message or oslo_messaging.NotificationResult.REQUEUE to requeue the message.

The message is acknowledged only if all endpoints either return oslo_messaging.NotificationResult.HANDLED or None."
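To make the consequence concrete, here is a minimal sketch that models the quoted rule. It is not oslo.messaging's dispatcher code: is_acknowledged and failing_handler are hypothetical names, and the stand-in NotificationResult class is only used when the library is not installed.

```python
try:
    from oslo_messaging import NotificationResult
except ImportError:
    class NotificationResult(object):
        """Stand-in mirroring the two documented result values."""
        HANDLED = 'handled'
        REQUEUE = 'requeue'


def is_acknowledged(endpoint_results):
    """Model of the documented rule (assumption: a simplification).

    The message is acknowledged only if every endpoint returned
    HANDLED or None.
    """
    return all(result in (NotificationResult.HANDLED, None)
               for result in endpoint_results)


def failing_handler(payload):
    """Mimics the failure path described in this bug."""
    try:
        raise IOError("Elasticsearch unreachable")
    except IOError:
        pass  # error swallowed; the function falls through and returns None


# The failed message is still acknowledged, so it is never redelivered.
print(is_acknowledged([failing_handler({})]))
```

Under this model, the only way for a handler to keep a message alive is to return NotificationResult.REQUEUE explicitly.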

In addition, we do not currently specify whether to allow requeue when creating the listener:

https://github.com/openstack/searchlight/blob/099df8875ef344f4b909e6673b3201c0c7efbc96/searchlight/listener.py#L108
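As a rough sketch of what opting in could look like: oslo_messaging.get_notification_listener accepts an allow_requeue keyword, which defaults to False. The build_listener wrapper below is a hypothetical helper, not Searchlight code, and the transport, targets and endpoints arguments are assumed to be built the same way listener.py already builds them.

```python
def build_listener(transport, targets, endpoints):
    """Create a notification listener that supports REQUEUE results.

    Hypothetical helper for illustration only.
    """
    # Imported inside the function so this sketch can be loaded on a
    # machine that does not have oslo.messaging installed.
    import oslo_messaging

    return oslo_messaging.get_notification_listener(
        transport,
        targets,
        endpoints,
        allow_requeue=True,  # the library default is False
    )
```

Whether requeue is actually usable also depends on the messaging driver in use, so this would need checking against the deployed transport.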

We must address this, taking the following concerns into account:

* Plugins probably need to raise a "Fatal Exception" when a document failure will not benefit from a requeue. For example, a data mapping failure will fail the same way on every retry, whereas a network failure may succeed.

* Completely failed requests should potentially go to an error queue with a time to live, so that they can be investigated further.

* We should consider a blueprint for publishing the status of a particular resource and its current coherency (e.g. /plugins includes an error count or something to that effect).

* Finally, as the pipeline of publishers is worked through as part of the notification forwarding spec, we should consider how this should be done across publishers.
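One possible shape for the first two concerns is sketched below. Everything here is an assumption for discussion: FatalIndexingError, TransientIndexingError and handle_notification are hypothetical names, a plain list stands in for a real error queue, and time-to-live handling is left out.

```python
try:
    from oslo_messaging import NotificationResult
except ImportError:
    class NotificationResult(object):
        """Stand-in mirroring the two documented result values."""
        HANDLED = 'handled'
        REQUEUE = 'requeue'


class FatalIndexingError(Exception):
    """Hypothetical: retrying cannot help (e.g. a data mapping failure)."""


class TransientIndexingError(Exception):
    """Hypothetical: retrying may help (e.g. a network failure)."""


def handle_notification(index_document, payload, error_queue):
    """Index one payload and map the outcome to a NotificationResult.

    index_document is any callable that indexes the payload and raises
    one of the two exceptions above on failure.
    """
    try:
        index_document(payload)
    except FatalIndexingError as exc:
        # Park the payload for investigation, then acknowledge the
        # message so it is not redelivered forever.
        error_queue.append((payload, str(exc)))
        return NotificationResult.HANDLED
    except TransientIndexingError:
        # Ask the messaging layer to deliver the message again.
        return NotificationResult.REQUEUE
    return NotificationResult.HANDLED
```

Returning REQUEUE only has an effect when the listener was created with requeue support enabled, and an unbounded requeue loop is its own risk, so a retry limit would likely be needed as well.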

References:

[1] https://github.com/openstack/searchlight/blob/master/searchlight/elasticsearch/plugins/base.py#L469-L477

[2] https://github.com/openstack/searchlight/blob/master/searchlight/elasticsearch/plugins/designate/notification_handlers.py#L90-L92

[3] http://docs.openstack.org/developer/oslo.messaging/notification_listener.html

[4] https://blueprints.launchpad.net/searchlight/+spec/document-es-failure-handling

Changed in searchlight:
importance: Undecided → High
description: updated
Changed in searchlight:
milestone: none → mitaka-rc1
Changed in searchlight:
milestone: mitaka-rc1 → newton-1
Changed in searchlight:
milestone: newton-1 → none