Comment 0 for bug 1524998

Travis Tripp (travis-tripp) wrote:

Failures to index into Elasticsearch are not handled correctly right now. [1] [2]

If indexing succeeds, the handler returns oslo_messaging.NotificationResult.HANDLED.

If it fails, the handler just returns without a value (an implicit None).

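However, according to the oslo.messaging notification handling docs [3], a return value of None also acknowledges the message, so a failed notification is acknowledged and never retried: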

"An endpoint method can explicitly return oslo_messaging.NotificationResult.HANDLED to acknowledge a message or oslo_messaging.NotificationResult.REQUEUE to requeue the message.

The message is acknowledged only if all endpoints either return oslo_messaging.NotificationResult.HANDLED or None."

In addition, we currently do not specify whether to allow requeue when creating the listener:

https://github.com/openstack/searchlight/blob/099df8875ef344f4b909e6673b3201c0c7efbc96/searchlight/listener.py#L108
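
For illustration, a minimal sketch of opting in to requeue when the listener is created. get_notification_listener and its allow_requeue keyword (default False) are part of oslo.messaging; the build_listener wrapper and its arguments are hypothetical and are not taken from searchlight/listener.py. Requeue presumably also depends on the transport driver supporting it.

```python
def build_listener(transport, targets, endpoints):
    """Return a notification listener whose endpoints may return REQUEUE.

    Hypothetical helper: the caller supplies the transport, targets and
    endpoints however the service already builds them.
    """
    # Imported inside the function so this sketch can be loaded even
    # where oslo.messaging is not installed.
    import oslo_messaging

    return oslo_messaging.get_notification_listener(
        transport,
        targets,
        endpoints,
        allow_requeue=True,  # defaults to False when omitted
    )
```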

We must address this, taking the following concerns into account:

* Plugins probably need to raise a "fatal" exception when a failed document will not benefit from a requeue. For example, a data mapping failure will fail again no matter how often it is retried, whereas a network failure may succeed on retry.

* Requests that fail completely should potentially go to an error queue with a time to live, so they can be investigated further.

* We should consider a blueprint (BP) for publishing the status of a particular resource and its current coherency (e.g. /plugins includes an error count or something to that effect).

* Finally, as the pipeline of publishers is worked through as part of the notification forwarding spec, we should consider how this should be done across publishers.
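
As a rough sketch of the first concern, a handler could map the outcome of indexing onto an explicit NotificationResult instead of falling through to None. FatalIndexError, handle_notification and the index_document callable are hypothetical names used only for illustration, and treating every other exception as transient is an assumption, not a decided policy.

```python
try:
    from oslo_messaging import NotificationResult
except ImportError:
    # Stand-in so the sketch runs without oslo.messaging installed.
    class NotificationResult(object):
        HANDLED = 'handled'
        REQUEUE = 'requeue'


class FatalIndexError(Exception):
    """Hypothetical: a failure a retry cannot fix (e.g. a mapping error)."""


def handle_notification(index_document, payload):
    """Index payload and translate the outcome into a NotificationResult."""
    try:
        index_document(payload)
    except FatalIndexError:
        # Retrying cannot help, so acknowledge rather than requeue.
        # A real handler would log this or forward it to an error queue.
        return NotificationResult.HANDLED
    except Exception:
        # Assumed transient (e.g. a network failure): ask for redelivery.
        return NotificationResult.REQUEUE
    return NotificationResult.HANDLED
```

Returning REQUEUE only has an effect if the listener was created with requeue allowed.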

References:

[1] https://github.com/openstack/searchlight/blob/master/searchlight/elasticsearch/plugins/base.py#L469-L477

[2] https://github.com/openstack/searchlight/blob/master/searchlight/elasticsearch/plugins/designate/notification_handlers.py#L90-L92

[3] http://docs.openstack.org/developer/oslo.messaging/notification_listener.html