Timeout is not used on retry

Bug #1789384 reported by romain courtin
This bug affects 2 people
Affects    Status         Importance   Assigned to     Milestone
Mistral    Fix Released   High         Oleg Ovcharuk

Bug Description

I have a workflow with a single mistral http task with a task timeout. When my task doesn't complete before the timeout the first time, the task is re-executed but the task timeout is not used in the retry.

Here is my workflow definition:

---
version: "2.0"

workflow_test:
  type: direct
  tasks:
    put_product_category:
      action: std.mistral_http
      timeout: 10
      input:
        url: http://myserver:81/products/categories/ids
        method: PUT
        headers:
          Content-Type: application/json
        body: "716"
      retry:
        delay: 4
        count: 2

And here are the execution logs from mistral-engine.log:

2018-04-03 09:12:17.745 233440 INFO workflow_trace [req-6bb8e518-350f-4e4e-87cd-d8c032add3db - - - - -] Workflow 'workflow_test' [IDLE -> RUNNING, msg=None] (execution_id=ec2c3a95-2516-4f63-a30e-cd08d8e960d7)
2018-04-03 09:12:17.753 233440 INFO workflow_trace [req-6bb8e518-350f-4e4e-87cd-d8c032add3db - - - - -] Timeout check scheduled [task=412b1f83-4f15-486e-9c1b-4d0848a44759, timeout(s)=10]. (execution_id=ec2c3a95-2516-4f63-a30e-cd08d8e960d7 task_id=412b1f83-4f15-486e-9c1b-4d0848a44759)
2018-04-03 09:12:54.529 233440 INFO workflow_trace [req-6bb8e518-350f-4e4e-87cd-d8c032add3db - - - - -] Task 'put_product_category' (412b1f83-4f15-486e-9c1b-4d0848a44759) [RUNNING -> ERROR, msg=Task timed out [timeout(s)=10].] (execution_id=ec2c3a95-2516-4f63-a30e-cd08d8e960d7)
2018-04-03 09:12:54.535 233440 INFO workflow_trace [req-6bb8e518-350f-4e4e-87cd-d8c032add3db - - - - -] Task 'put_product_category' [ERROR -> DELAYED, delay = 4 sec] (execution_id=ec2c3a95-2516-4f63-a30e-cd08d8e960d7 task_id=412b1f83-4f15-486e-9c1b-4d0848a44759)
2018-04-03 09:13:25.068 233440 INFO workflow_trace [req-6bb8e518-350f-4e4e-87cd-d8c032add3db - - - - -] Task 'put_product_category' (412b1f83-4f15-486e-9c1b-4d0848a44759) [DELAYED -> RUNNING, msg=None] (execution_id=ec2c3a95-2516-4f63-a30e-cd08d8e960d7)

I expected the second execution to use the same task timeout as the first attempt.

Vitalii Solodilov (mcdoker18) wrote :

Hi. This is the expected behavior.
A task must fail after a timeout and must not be retried.
There is a related bug: https://bugs.launchpad.net/mistral/+bug/1767352 . I'll try to finish it soon.

romain courtin (couc) wrote :

Thank you for your quick response.

Is there a way to reschedule a task that reaches the task timeout?

romain courtin (couc)
summary: - Timeout is not use on retry
+ Timeout is not used on retry
Vitalii Solodilov (mcdoker18) wrote :

Yep, you need to rerun a failed task. For example:
mistral task-rerun *task-id*

If you need an action timeout, feel free to create a blueprint. I can implement it during the Stein release.

romain courtin (couc) wrote :

Ok, I think we will handle the task failure with the on-error task attribute.
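
A minimal sketch of that on-error approach, based on the workflow above. The task name handle_put_failure and its message are hypothetical, chosen only for illustration, and the retry block is left out to keep the example short. This assumes a timed-out task ends up in the ERROR state, as the log above shows, so that its on-error clause is evaluated:

---
version: "2.0"

workflow_test:
  type: direct
  tasks:
    put_product_category:
      action: std.mistral_http
      timeout: 10
      input:
        url: http://myserver:81/products/categories/ids
        method: PUT
        headers:
          Content-Type: application/json
        body: "716"
      # Run the handler task when this task fails or times out.
      on-error:
        - handle_put_failure

    # Hypothetical handler task; replace std.echo with real recovery logic.
    handle_put_failure:
      action: std.echo
      input:
        output: "put_product_category failed or timed out"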

I think we can close this bug report.

Thanks again for your answers.

Changed in mistral:
status: New → Invalid
Changed in mistral:
assignee: nobody → Vitalii Solodilov (mcdoker18)
Changed in mistral:
status: Invalid → In Progress
Dougal Matthews (d0ugal)
Changed in mistral:
importance: Undecided → Medium
importance: Medium → High
milestone: none → stein-1
romain courtin (couc) wrote :

Hello,

Do you have a new release date for the fix for this bug?

Dougal Matthews (d0ugal)
Changed in mistral:
milestone: stein-1 → stein-2
Changed in mistral:
milestone: stein-2 → stein-3
ali aguerouaz (aaguero) wrote :

Hello,

Do you have a new release date for the fix for this bug?

Changed in mistral:
assignee: Vitalii Solodilov (mcdoker18) → Oleg Ovcharuk (vgvoleg)
milestone: stein-3 → train-1
OpenStack Infra (hudson-openstack) wrote : Fix merged to mistral (master)

Reviewed: https://review.opendev.org/601501
Committed: https://git.openstack.org/cgit/openstack/mistral/commit/?id=09cd21d56106f417305b10c2f8f64568e916e0bd
Submitter: Zuul
Branch: master

commit 09cd21d56106f417305b10c2f8f64568e916e0bd
Author: Vitalii Solodilov <email address hidden>
Date: Tue Sep 11 10:33:59 2018 +0400

    Docs improvements: task timeout, global context, Docker and jinja

    A not obvious point for users is the task does not retry after a
    timeout is triggered. Added clarification in the retry section.

    The documentation contains enough example with Jinja usage. Improved
    only the create_vm workflow definition.

    Added global publishing to the doc. It is brash copy-paste from
    https://specs.openstack.org/openstack/mistral-specs/specs/pike/approved/advanced_publishing.html without mention of atomic publish.

    Move Docker guides to the installation section.

    Change-Id: I149b2e1dff7f86bd356f4dd2f758659469e6a4a8
    Closes-Bug: #1789384
    Closes-Bug: #1690156
    Closes-Bug: #1779244
    Signed-off-by: Vitalii Solodilov <email address hidden>

Changed in mistral:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/mistral 9.0.0.0b1

This issue was fixed in the openstack/mistral 9.0.0.0b1 development milestone.
