
Update await_pod_completion and affected unit tests #40568

Closed (wants to merge 4 commits)

Conversation

@ferruzzi (Contributor) commented Jul 3, 2024

Follow-up to #40360

Only modifying one of the methods at the moment. If we like what I did here, I'll do the same for the others. Please triple check my unit test changes to make sure I didn't inadvertently break the underlying mocks.

For the tenacity part of the change, note that I did away with the idea of using exponential backoff from the previous attempt (for now??) and just re-implemented the existing sleep(2) as a tenacity retry to align it with the other methods in this module which were already converted to tenacity. We may go back to trying different retry periods later, but for now that part of the change is just for consistency.
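For context, the shape of that change is roughly the following (a minimal sketch, not the actual diff; the PodNotCompletedYetError marker exception and the standalone function signature are illustrative assumptions):

import tenacity


class PodNotCompletedYetError(Exception):
    """Hypothetical marker exception raised while the pod is still running, so tenacity keeps retrying."""


@tenacity.retry(
    retry=tenacity.retry_if_exception_type(PodNotCompletedYetError),
    wait=tenacity.wait_fixed(2),  # mirrors the old sleep(2) between polls
)
def await_pod_completion(pod_manager, pod):
    """Poll the pod and raise (to trigger a retry) until it reaches a terminal phase."""
    remote_pod = pod_manager.read_pod(pod)
    if remote_pod.status.phase in ("Pending", "Running"):
        raise PodNotCompletedYetError()
    return remote_pod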

@dstandish and @potiuk : You both made good suggestions in the previous attempt at this, let me know what you think.

@potiuk (Member) commented Jul 3, 2024

I like it.

@dstandish (Contributor) left a comment

Personally I am not sure we need to go changing every while loop to tenacity.

But I don't really mind it either.

Re the main part of the change, changing to read_namespaced_pod_status, it's cool that it was not much trouble.

@@ -65,6 +65,10 @@ class PodLaunchFailedException(AirflowException):
"""When pod launching fails in KubernetesPodOperator."""
@ferruzzi (Contributor, Author) commented on this diff:

[merge blocker] If we like how I implemented this, I'll go ahead and apply the same changes to the other waiters. Please do not merge this.

@ferruzzi (Contributor, Author) commented Jul 3, 2024

Personally I am not sure we need to go changing every while loop to tenacity.

I don't disagree. In this case, most of the file was already converted to tenacity; it looked to me like whoever did it either only did the easy ones or never got around to finishing. I also figured that if we do decide to implement the tiered backoff idea from the other PR, it would be easier to make that change. Also, I just irrationally dislike seeing hardcoded while True: sleep() loops for some reason. :P

@dstandish (Contributor)

I don't disagree. In this case, most of the file was already converted to tenacity; it looked to me like whoever did it either only did the easy ones or never got around to finishing. I also figured that if we do decide to implement the tiered backoff idea from the other PR, it would be easier to make that change. Also, I just irrationally dislike seeing hardcoded while True: sleep() loops for some reason. :P

OK, I think I understand my problem here better.

First, I'm with you on the tiered backoff thing and such -- I see the value of tenacity for that kind of thing.

And with a strong caveat of "you do you" :) .....

Re just never getting around to finishing conversion to tenacity... I think it's a little different. The existing uses are a little different in that they are retrying after something unexpected happens. E.g. a connection error or something on a specific network call. But the use here is more "waiting for something expected to happen". When you see that decorator there, it basically signals, "this method retries for these failures". But here it doesn't mean that. It looks like it is retrying after error, but really it's sort of just a mechanism to wait indefinitely.

And if you wanted to add retry-on-unexpected behavior, you'd need to stack another decorator. This, of course, is no big deal, but it sort of illustrates the two different types of usage.

In any case, I think I have a suggestion that may be good enough for everyone.

What I would suggest is, just move the tenacity part interior to the function.

Something like this

        for attempt in tenacity.Retrying(
            retry=tenacity.retry_if_exception_type(ResourceNotReadyException),
            wait=tenacity.wait_fixed(2),
        ):
            with attempt:
                remote_pod = self.read_pod(pod)
                if <pod not ready>:  # placeholder condition, e.g. still in the PENDING phase
                    self.log.info("Pod %s has phase %s", pod.metadata.name, remote_pod.status.phase)
                    raise ResourceNotReadyException

Then we don't need to decorate the whole function with tenacity, so it's clearer that it's a single synchronous wait process that is not actually retried.

Since we're on the topic, I just want to point out that there is some ugliness in our current usage of tenacity, and I think we may have overdone it a bit. For example, I see we have it added on get_container_names. But from the looks of it, the source of retries in there is read_pod, and that itself already has tenacity on it! So instead of retrying 3 times (as the params indicate) when read_pod is having trouble, it will retry 9 times :)
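
To make the multiplication concrete, here is a tiny self-contained illustration (the function names are stand-ins, not the provider's actual methods):

import tenacity

calls = 0


@tenacity.retry(stop=tenacity.stop_after_attempt(3), reraise=True)
def read_pod():
    global calls
    calls += 1
    raise ConnectionError("API unreachable")  # always fails, so every attempt is exhausted


@tenacity.retry(stop=tenacity.stop_after_attempt(3), reraise=True)
def get_container_names():
    return read_pod()


try:
    get_container_names()
except ConnectionError:
    print(calls)  # prints 9: three outer attempts, each making three inner attempts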

Again, you do you - and happy holiday, even though you are Canadian.

@ferruzzi (Contributor, Author) commented Jul 3, 2024

Alright, we may need to have this conversation at a later date, but I'm going to drop that out of scope for this PR. For now I'll leave the loops how they are and just try scoping the get_{foo} calls down to get_{foo}_status calls. These should be two different PRs anyway, I suppose. I just fell into the "while you're in there..." trap.

@ferruzzi force-pushed the ferruzzi/k8s/fix-get-state branch from cee2ab3 to 2b40646 on July 3, 2024 at 23:27

@hussein-awala (Member) left a comment

I don't understand the goal of this PR in its current state (replacing self.read_pod, which supports caching, with a simple K8S client call).

I'm requesting changes to avoid merging it accidentally.

@ferruzzi (Contributor, Author) commented Jul 4, 2024

It was discussed pretty heavily in the previous PR (linked at the top of the description) but basically a user complained about CPU spikes while waiting for the pod and container to start. I thought it was the retry loop, but Jarek suggested that it was more likely the parsing of the api call, and suggested using get_state instead of read_pod when all we actually want is the state anyway.

@dstandish (Contributor) commented Jul 4, 2024

I don't understand the goal of this PR in its current state (replacing self.read_pod, which supports caching, with a simple K8S client call).

I'm requesting changes to avoid merging it accidentally.

@hussein-awala look here #40360 (comment)

That is the impetus for this PR.

"why load the whole pod when you just need the status" is, I think, the idea.

But I'm curious -- can you tell us more about the caching you're referring to?

@hussein-awala (Member) commented Jul 4, 2024

"why load the whole pod when you just need the status" is, i think, the idea.

Unfortunately, in K8S, the pod status route returns the full pod manifest, so the result of read_namespaced_pod_status is the same as read_namespaced_pod with the same performance/cost:

from kubernetes import client as k8s, config

config.load_kube_config(context="minikube")
client = k8s.ApiClient()
api = k8s.CoreV1Api(client)

namespace = "some namespace"
pod_name = "some pod"

# Both calls return a fully populated V1Pod object.
status = api.read_namespaced_pod_status(name=pod_name, namespace=namespace)
pod = api.read_namespaced_pod(name=pod_name, namespace=namespace)
print(pod == status)

The output of this script is:

True

But I'm curious -- can you tell us more about the caching you're referring to?

We have a caching mechanism in read_pod method:

    def read_pod(self):
        _now = utcnow()
        if (
            self.read_pod_cache is None
            or self.last_read_pod_at + timedelta(seconds=self.read_pod_cache_timeout) < _now
        ):
            self.read_pod_cache = self.pod_manager.read_pod(self.pod)
            self.last_read_pod_at = _now
        return self.read_pod_cache

I didn't dive into the code to check whether we use/need it, but between two methods that return the same result, where one of them has a caching mechanism, I prefer the one with caching.

@potiuk (Member) commented Jul 5, 2024

Unfortunately, in K8S, the pod status route returns the full pod manifest, so the result of read_namespaced_pod_status is the same as read_namespaced_pod with the same performance/cost:

Well, as I wrote, I am not an expert in those APIs :). But that's interesting. The API URLs for the two calls are clearly different ("/status" is appended for the status one), and both indeed return a V1Pod object. But I thought there must be a reason why there are two different calls - that's my intuition (yes, not based on documentation, because the documentation is pretty enigmatic):

  • read the specified Pod

vs.

  • read status of the specified Pod

And while in simple cases it might be the same, I was wondering whether there are cases where the Pod returned by the first method is much bigger and contains more information than the Pod returned by the status call.

I could not find any docs or discussions about it, though :)

@ferruzzi (Contributor, Author) commented Jul 5, 2024

Unfortunately, in K8S, the pod status route returns the full pod manifest, so the result of read_namespaced_pod_status is the same as read_namespaced_pod with the same performance/cost:

I actually didn't notice that. I saw the return type was the same, but like Jarek mentioned, I didn't see much detail on it, so I assumed the /status endpoint just didn't "fill in the whole form" - because why would they have two endpoints that return the same thing?

Alright, given this new knowledge...... I'm not sure where else to go with this. Do any of you have a suggestion to reduce CPU usage while stuck in the await_pod_start and await_container_start loops?

@dstandish (Contributor)

Alright... I might have something for you @ferruzzi

We can skip the V1Pod deserialization process (response -> json -> V1Pod) by adding the arg _preload_content=False

Then, we can just convert to json and get the pod status with data["status"]["phase"]

In my testing this results in a 40x speedup, so presumably it is much less CPU intensive.

k8s_api.txt

You can test it yourself by running it locally with the arg pod vs raw. You'll need to update the hardcoded pod name.

But yeah, my results were consistently 20 seconds vs 0.5 seconds (over 10,000 operations).
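
For reference, the raw-response approach would look roughly like this (a sketch assuming the standard kubernetes Python client; the context, pod, and namespace names are placeholders):

import json

from kubernetes import client as k8s, config

config.load_kube_config(context="minikube")
api = k8s.CoreV1Api()

# _preload_content=False returns the raw HTTP response instead of a deserialized V1Pod,
# skipping the expensive response -> dict -> V1Pod conversion.
response = api.read_namespaced_pod(
    name="some-pod", namespace="some-namespace", _preload_content=False
)
phase = json.loads(response.data)["status"]["phase"]
print(phase)  # e.g. "Running"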

@dstandish (Contributor) commented Jul 5, 2024

But yeah, @ferruzzi, re your comments on the first PR, I think it would be a good idea to try and repro the user's situation. You would ask what their setup is and how many tasks are concurrently waiting for pods to start -- I presume this is Celery? This would help you understand whether the issue is real, e.g. whether there isn't another noisy neighbor using lots of CPU. And then, having repro'd it, you could use the same setup to evaluate whether your enhancement helps, and if so, how much and under what scenarios -- whether it is this optimization or some other one, such as longer waits.

@dstandish (Contributor) commented Jul 5, 2024

@hussein-awala

I didn't dive into the code to check whether we use/need it, but between two methods that return the same result, where one of them has a caching mechanism, I prefer the one with caching.

The context here is looking for the pod status to change, so a cache would be unhelpful here. I am actually quite surprised to see that code. I don't think it makes a lot of sense. Pretty much whenever we call read_pod, we're doing it for a reason.

It looks like it was related to the pod logs consumer thing. We should not have such a cache on the method generally. Most calls should not use it.

@potiuk (Member) commented Jul 6, 2024

Then, we can just convert to json and get the pod status with data["status"]["phase"]
In my testing this results in a 40x speedup, so presumably it is much less CPU intensive.

Nice! Good one @dstandish !

The context here is looking for the pod status to change, so a cache would be unhelpful here. I am actually quite surprised to see that code. I don't think it makes a lot of sense. Pretty much whenever we call read_pod, we're doing it for a reason.

I quite agree here.

@hussein-awala (Member)

We can skip the V1Pod deserialization process (response -> json -> V1Pod) by adding the arg _preload_content=False
Then, we can just convert to json and get the pod status with data["status"]["phase"]

Yes, that's a good idea.

@ferruzzi (Contributor, Author) commented Jul 8, 2024

we can just convert to json and get the pod status with data["status"]["phase"]

Nice! Good one @dstandish !

Yes, that's a good idea.

Cool, looks like we have a way forward then. I'll look into implementing it. Do we want to reuse this PR or shall I open a new one when I have that done?

@dstandish (Contributor)

Your call

@ferruzzi closed this on Jul 10, 2024
Labels: area:providers, provider:cncf-kubernetes

5 participants