task is incorrectly triggered after dynamic task mapping with BranchPythonOperator #40620

Open
1 of 2 tasks
raycarter opened this issue Jul 5, 2024 · 4 comments

Apache Airflow version

2.9.2

If "Other Airflow 2 version" selected, which one?

No response

What happened?

We need to combine BranchPythonOperator with dynamic task mapping in one DAG to meet our requirements. The BranchPythonOperator controls whether the dynamic mapping runs, and a task (branch_2_task) that follows the dynamic task mapping must be executed whether or not the mapping actually runs (the mapping is skipped when it receives an empty list). As shown in diagram 1, this requires setting the trigger_rule of branch_2_task to none_failed.

diagram 1:
image

If the trigger_rule is set to all_success instead, branch_2_task is also skipped whenever the dynamic mapping is skipped because of an empty list (diagram 2).

diagram 2:
image

However, when the BranchPythonOperator selects branch_1, branch_2_task on the other branch is executed as well (diagram 3).

diagram 3:
image

The bug may be that Airflow does not check whether a task is on the selected branch and only evaluates its trigger_rule.

Could you fix this?

What you think should happen instead?

The task branch_2_task uses trigger_rule none_failed so that it runs even when the dynamic task mapping before it is skipped. If the other branch is selected, branch_2_task must be skipped, because its whole branch should be skipped.

How to reproduce

  • Airflow 2.9.2
  • the demo DAG is here: https://gist.github.com/raycarter/75e896d600adec0563545fc58e3795d2
    • use parameters condition1 = 2, mapping_count = 0 to reproduce diagram 1
    • use parameters condition1 = 1, mapping_count = 0 to reproduce diagram 3
    • change trigger_rule of branch_2_task to all_success, use parameters condition1 = 2, mapping_count = 0 to reproduce diagram 2

Operating System

Ubuntu

Versions of Apache Airflow Providers

No response

Deployment

Other

Deployment details

No response

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

raycarter added the area:core, kind:bug, and needs-triage labels Jul 5, 2024
@romsharon98
Collaborator

In the third picture, the trigger rule of branch_2_task is set to none_failed. Since none of the upstream tasks failed (they are either skipped or successful), it should run.
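To make the reasoning above concrete, here is a minimal pure-Python sketch of how the two trigger rules discussed in this issue treat upstream states. It is a simplified model written for illustration, not Airflow's actual scheduler code; the names `State` and `would_run` are hypothetical.

```python
from enum import Enum


class State(str, Enum):
    """Simplified subset of task states (illustrative, not Airflow's enum)."""
    SUCCESS = "success"
    SKIPPED = "skipped"
    FAILED = "failed"
    UPSTREAM_FAILED = "upstream_failed"


def would_run(trigger_rule, upstream_states):
    """Return True if a task with this trigger rule would run.

    Only the two rules from this issue are modeled. The check looks
    at upstream states only, not at which branch was selected.
    """
    states = list(upstream_states)
    if trigger_rule == "all_success":
        # Every upstream task must have succeeded.
        return all(s == State.SUCCESS for s in states)
    if trigger_rule == "none_failed":
        # Upstream tasks may be successful or skipped, but not failed.
        failed = (State.FAILED, State.UPSTREAM_FAILED)
        return not any(s in failed for s in states)
    raise ValueError(f"rule not modeled in this sketch: {trigger_rule}")


# Diagram 3: the mapped task upstream of branch_2_task is skipped.
print(would_run("none_failed", [State.SKIPPED]))
# Diagram 2: same upstream state, but with all_success.
print(would_run("all_success", [State.SKIPPED]))
```

In this model a skipped upstream task looks the same whether it was skipped because of an empty mapping list or because the branch operator chose the other branch, which is why none_failed lets branch_2_task run in both cases.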

romsharon98 removed the needs-triage label Jul 5, 2024
@raycarter
Author

raycarter commented Jul 5, 2024

> In the third picture, the trigger rule of branch_2_task is set to none_failed. Since none of the upstream tasks failed (they are either skipped or successful), it should run.

The actual problem is that in the third picture branch_2_task should not run, because the branch operator decided to trigger branch_1 (the other branch).

@romsharon98
Collaborator

I see.
I still don't know whether it's a bug, for the reason I mentioned, so we will wait for a third opinion.
A quick workaround I can think of: in branch_2_task you can retrieve the chosen task_id from the branch_operator and use it to decide whether to run the logic.
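A rough sketch of that workaround, under a few assumptions: the branch task is named `branch_operator`, the first task of the second branch is named `branch_2`, and the branch callable's return value is available through XCom (the default when XCom pushing is not disabled). The helper name `on_selected_branch` is hypothetical, and the task ids should be adjusted to match the real DAG.

```python
def on_selected_branch(ti, expected="branch_2", branch_task_id="branch_operator"):
    """Return True if the branch task chose `expected`.

    `ti` is the task instance from the task context. The task ids
    used as defaults here are assumptions for this sketch.
    """
    # xcom_pull with only task_ids reads the branch callable's return value.
    chosen = ti.xcom_pull(task_ids=branch_task_id)
    if chosen is None:
        return False
    # A branch callable may return one task_id or a list of task_ids.
    if isinstance(chosen, str):
        chosen = [chosen]
    return expected in chosen


# Inside the callable of branch_2_task, the check could be wired like this:
#
#     from airflow.exceptions import AirflowSkipException
#
#     def branch_2_task_callable(ti, **kwargs):
#         if not on_selected_branch(ti):
#             raise AirflowSkipException("branch_2 was not selected")
#         ...  # real logic
```

Raising AirflowSkipException marks branch_2_task as skipped rather than successful, which keeps the graph view consistent with the branch that was actually chosen.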

@raycarter
Author

sure! thank you for the quick first response!
