Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failing KPO task wich task decorator and imported typing elements #40585

Closed
2 tasks done
BenediktRamsauer opened this issue Jul 3, 2024 · 3 comments · Fixed by #40642
Closed
2 tasks done

Failing KPO task wich task decorator and imported typing elements #40585

BenediktRamsauer opened this issue Jul 3, 2024 · 3 comments · Fixed by #40642
Labels
area:providers kind:bug This is a clearly a bug provider:cncf-kubernetes Kubernetes provider related issues

Comments

@BenediktRamsauer
Copy link
Contributor

BenediktRamsauer commented Jul 3, 2024

Apache Airflow Provider(s)

cncf-kubernetes

Versions of Apache Airflow Providers

8.3.2

Apache Airflow version

2.9.2

Operating System

rhel

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

What happened

Having specific typing elements that need to be imported generate an error when using @task.kubernetes task decorator.
See below for the specific of the errors.

What you think should happen instead

A solution could be to adjust the get_python_source function here : https://github.com/apache/airflow/blob/main/airflow/decorators/base.py#L301.

We could add a step to remove the typing from the code (for the parameters part but also, if wished, in the code).
I have a solve ready using ast and astunparse.

How to reproduce

from typing import Any
from airflow.decorators import dag, task
 
class MyObject:
    var: str = 'foo'
 
POD_CONFIG = dict(
        image='apache/airflow:slim-latest-python3.11',
        image_pull_policy="Always",
        namespace=None,
        get_logs=True,
        log_events_on_failure=True,
        do_xcom_push=True,
    )
 
@dag(
    schedule=None,
)
def kpo_bug_test():
 
    @task.kubernetes(**POD_CONFIG)
    def my_task0(var='foo'):
        pass
 
    @task.kubernetes(**POD_CONFIG)
    def my_task1(var: Any ='foo'):
        print(var)
 
    @task.kubernetes(**POD_CONFIG)
    def my_task2(var: MyObject=MyObject()) :
        pass
 
    @task.kubernetes(**POD_CONFIG)
    def my_task3():
        var: Any = 'foo'
        print(var)
 
    my_task0()
    my_task1()
    my_task2()
    my_task3()
 
kpo_bug_test()

I obtain the following results:
Bild

  • my_task0 is supposed to run as proof that everything is fine
  • my_task1 is supposed to fail because the from typing import Any is not added in the /tmp/scripts.py file. The interesting log portion:

[2024-07-03, 11:48:09 UTC] {pod_manager.py:472} INFO - [base] + python /tmp/script.py /tmp/script.in /airflow/xcom/return.json
[2024-07-03, 11:48:09 UTC] {pod_manager.py:472} INFO - [base] Traceback (most recent call last):
[2024-07-03, 11:48:09 UTC] {pod_manager.py:472} INFO - [base] File "/tmp/script.py", line 19, in
[2024-07-03, 11:48:09 UTC] {pod_manager.py:472} INFO - [base] def my_task1(var: Any ='foo'):
[2024-07-03, 11:48:09 UTC] {pod_manager.py:472} INFO - [base] ^^^
[2024-07-03, 11:48:09 UTC] {pod_manager.py:490} INFO - [base] NameError: name 'Any' is not defined. Did you mean: 'any'?

  • my_task2 is also supposed to fail, because the definition of the MyObject is not ported to /tmp/scripts.py and may never be since the created pod migt not contain the class definition.

[2024-07-03, 11:48:20 UTC] {pod_manager.py:472} INFO - [base] + python /tmp/script.py /tmp/script.in /airflow/xcom/return.json
[2024-07-03, 11:48:20 UTC] {pod_manager.py:472} INFO - [base] Traceback (most recent call last):
[2024-07-03, 11:48:20 UTC] {pod_manager.py:472} INFO - [base] File "/tmp/script.py", line 19, in
[2024-07-03, 11:48:20 UTC] {pod_manager.py:472} INFO - [base] def my_task2(var: MyObject=MyObject()) :
[2024-07-03, 11:48:20 UTC] {pod_manager.py:472} INFO - [base] ^^^^^^^^
[2024-07-03, 11:48:20 UTC] {pod_manager.py:490} INFO - [base] NameError: name 'MyObject' is not defined. Did you mean: 'object'?

  • For my_task3, I was expecting a failure since from typing import Any is not in the /tmp/script.py file (as for my_task2). But it appears that is does not raise errors. Thus we do not need to solve this none existing problem.

Anything else

This issue should also appear for @task.docker. But I have not tested it.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@BenediktRamsauer BenediktRamsauer added area:providers kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Jul 3, 2024
Copy link

boring-cyborg bot commented Jul 3, 2024

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

@dosubot dosubot bot added the provider:cncf-kubernetes Kubernetes provider related issues label Jul 3, 2024
@potiuk
Copy link
Member

potiuk commented Jul 3, 2024

I think simpler solution will be to add from __future__ import annotations in the script that is generated by KPO. This will automatically convert all typing annotations into string annotations that are not evaluated during parsing https://peps.python.org/pep-0563/ -> this way there is no need to do any AST analysis

@potiuk potiuk removed the needs-triage label for new issues that we didn't triage yet label Jul 3, 2024
@BenediktRamsauer
Copy link
Contributor Author

Nice!

Thank you for the feedback. I'll do that! Way easier indeed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers kind:bug This is a clearly a bug provider:cncf-kubernetes Kubernetes provider related issues
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants