Fluentd doesn't generate _id for documents sent to elasticsearch
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| kolla-ansible | Confirmed | Medium | Unassigned | |
| Victoria | Confirmed | Medium | Unassigned | |
Bug Description
When fluentd pushes documents into Elasticsearch, it does not generate an _id for each one, leaving that to Elasticsearch instead.

When a request to ES times out (either on the fluentd side or in some proxy), fluentd retries the request, but ES cannot tell it is processing the same documents (as the _id differs) and will try to save them all again, potentially leading to another timeout.

If that happens, it is very easy for kolla's fluentd to DDoS the entire cluster by filling up all available space.

It is possible to generate the _id before sending documents to Elasticsearch, which lets the cluster recognize that some documents were sent twice and update them instead of creating duplicates - see https:/
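The deduplication idea above can be sketched as follows: derive a deterministic _id from the record contents, so a retried bulk request indexes to the same document instead of creating a duplicate. This is a minimal illustration in Python, not the actual fluentd plugin code; the field names are hypothetical.

```python
import hashlib
import json


def generate_id(record: dict) -> str:
    """Derive a deterministic _id from the record contents so that a
    retried request updates the same document rather than duplicating it."""
    # Serialize with sorted keys so logically identical records always
    # produce the same byte string, and therefore the same hash.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


# Two copies of the same log event map to the same _id, so Elasticsearch
# treats a retry as an update of the existing document.
event = {"@timestamp": "2020-09-22T10:00:00Z", "message": "login failed"}
assert generate_id(event) == generate_id(dict(event))
```

In a fluentd deployment this would be done in the pipeline itself (the fluent-plugin-elasticsearch output supports supplying the document id via its `id_key` option), rather than in application code.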
Changed in kolla-ansible:
importance: Undecided → Medium

Changed in kolla-ansible:
milestone: 11.0.0 → none

Changed in kolla-ansible:
assignee: Krzysztof Klimonda (kklimonda) → nobody
status: In Progress → Confirmed
Fix proposed to branch: master
Review: https://review.opendev.org/753291