Update await_pod_completion and affected unit tests #40568
Conversation
I like it.
Personally I am not sure we need to go changing every while loop to tenacity. But I don't really mind it either.

Re the main part of the change, changing to `read_namespaced_pod_status`, it's cool that it was not much trouble.
```python
@@ -65,6 +65,10 @@ class PodLaunchFailedException(AirflowException):
    """When pod launching fails in KubernetesPodOperator."""
```
[merge blocker] If we like how I implemented this, I'll go ahead and apply the same changes to the other waiters. Please do not merge this.
I don't disagree. In this case, most of the file was already converted to tenacity; it looked to me like whoever did it either did only the easy ones or never got around to finishing. I also figured if we do decide to implement the tiered backoff idea from the other PR then it's easier to make that change. Also also, I just irrationally dislike seeing hardcoded sleeps.
Ok I think I understand my problem here better. First, I'm with you on the tiered backoff thing and such -- I see the value of tenacity for that kind of thing. And with a strong caveat of "you do you" :) .....

Re just never getting around to finishing the conversion to tenacity... I think it's a little different. The existing uses are a little different in that they are retrying after something unexpected happens, e.g. a connection error or something on a specific network call. But the use here is more "waiting for something expected to happen". When you see that decorator there, it basically signals, "this method retries for these failures". But here it doesn't mean that. It looks like it is retrying after error, but really it's sort of just a mechanism to wait indefinitely. And if you wanted to add retry-on-unexpected type of behavior, you'd need to stack another decorator. This of course is no big deal, but just sort of illustrates the two diff types of usages.

In any case, I think I have a suggestion that may be good enough for everyone. What I would suggest is, just move the tenacity part interior to the function. Something like this:

```python
for attempt in tenacity.Retrying(
    retry=tenacity.retry_if_exception_type(ResourceNotReadyException),
    wait=tenacity.wait_fixed(2),
):
    with attempt:
        remote_pod = self.read_pod(pod)
        if not pod_is_done(remote_pod):  # i.e. <pod not ready>; placeholder for the real check
            self.log.info("Pod %s has phase %s", pod.metadata.name, remote_pod.status.phase)
            raise ResourceNotReadyException()
```

Then we don't need to decorate the whole function with tenacity, so it's clearer that it's a single synchronous wait process that is not actually retried.

Since we're on the topic, just want to point out that there is some ugliness there currently in our usage of tenacity and I think we may have overdone it a bit. For example I see we have it added on … Again, you do you and happy holiday, even though you are Canadian.
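To make the "wait mechanism, not error retry" distinction concrete, here is a dependency-free toy of the inline pattern suggested above. All names (`retry_until`, `ResourceNotReady`, the fake `read_phase`) are invented for illustration and are not Airflow's or tenacity's API; the exception is the signal to sleep and poll again, exactly as in the `Retrying` loop:

```python
import time


class ResourceNotReady(Exception):
    """Toy stand-in for ResourceNotReadyException in the suggestion above."""


def retry_until(poll, wait_seconds=2.0, max_attempts=None):
    """Call `poll` until it stops raising ResourceNotReady.

    Mirrors tenacity.Retrying(retry=retry_if_exception_type(...), wait=wait_fixed(2)):
    the exception means "not ready yet, wait and try again", not a real failure.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return poll()
        except ResourceNotReady:
            if max_attempts is not None and attempt >= max_attempts:
                raise
            time.sleep(wait_seconds)


# Usage: a fake pod that reaches a terminal phase on the third read.
reads = iter(["Pending", "Pending", "Succeeded"])


def read_phase():
    phase = next(reads)
    if phase not in ("Succeeded", "Failed"):
        raise ResourceNotReady(phase)
    return phase


print(retry_until(read_phase, wait_seconds=0.01))  # → Succeeded
```
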
Alright, we may need to have this conversation at a later date, but I'm going to drop that out of scope for this PR. For now I'll leave the loops how they are and just try scoping the `get_{foo}` calls down to `get_{foo}_status` calls. These should be two different PRs anyway, I suppose. I just fell into the "while you're in there..." trap.
Force-pushed from cee2ab3 to 2b40646
I don't understand the goal of this PR in its current state (replacing `self.read_pod`, which supports caching, with a simple K8S client call). I'm requesting changes to avoid merging it accidentally.
It was discussed pretty heavily in the previous PR (linked at the top of the description), but basically a user complained about CPU spikes while waiting for the pod and container to start. I thought it was the retry loop, but Jarek suggested that it was more likely the parsing of the API call, and suggested using `get_state` instead of `read_pod` when all we actually want is the state anyway.
@hussein-awala look here #40360 (comment); that is the impetus for this PR. "Why load the whole pod when you just need the status" is, I think, the idea. But I'm curious -- can you tell us more about what you mean by the caching?
Unfortunately, in K8S, the pod status route returns the full pod manifest, so the result of `read_namespaced_pod_status` is the same as that of `read_namespaced_pod`:

```python
from kubernetes import client as k8s, config

config.load_kube_config(context="minikube")
client = k8s.ApiClient()
api = k8s.CoreV1Api(client)

namespace = "some namespace"
pod_name = "some pod"

status = api.read_namespaced_pod_status(name=pod_name, namespace=namespace)
pod = api.read_namespaced_pod(name=pod_name, namespace=namespace)
print(pod == status)
```

The output of this script is: `True`
We have a caching mechanism in `read_pod`:

```python
def read_pod(self):
    _now = utcnow()
    if (
        self.read_pod_cache is None
        or self.last_read_pod_at + timedelta(seconds=self.read_pod_cache_timeout) < _now
    ):
        self.read_pod_cache = self.pod_manager.read_pod(self.pod)
        self.last_read_pod_at = _now
    return self.read_pod_cache
```

I didn't dive into the code to check if we use/need it, but between two methods that return the same result, where one of them has a cache mechanism, I prefer the one with caching.
Well. As I wrote, I am not the expert in those APIs :). But that's interesting. The API URLs for those are clearly different ("/status" is added in the status one) and the two indeed return a V1Pod object. But I thought there must be a reason why there are two different calls - my intuition (yes, not based on documentation - because the documentation is pretty enigmatic:

> read the specified Pod

vs.

> read status of the specified Pod

And while in simple cases it might be the same, I was wondering if there are no cases where the Pod returned by the first method is way bigger and contains more information than the Pod status returned by the second. I could not find any docs about it or discussions though :)
I actually didn't notice that. I saw the return type was the same but, like Jarek mentioned, I didn't see much detail on it, so I assumed the status call would return less. Alright, given this new knowledge...... I'm not sure where else to go with this. Do any of you have a suggestion to reduce CPU usage while stuck in the `await_pod_completion` loop?
Alright... I might have something for you @ferruzzi. We can skip the V1Pod deserialization process (response -> json -> V1Pod) by adding the arg `_preload_content=False`. Then we can just convert the raw response to JSON and get the pod status with a plain dict lookup. In my testing this results in a 40x speedup, so presumably it's much less CPU intensive. You can test it yourself by running locally with and without the arg. But yeah, my results were consistently 20 seconds vs 0.5 seconds (with 10,000 operations).
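A hedged sketch of the parsing half of this idea: once the raw response bytes are in hand (e.g. via the kubernetes client's `_preload_content=False` mode), plain `json` plus a dict lookup replaces building the full V1Pod object graph, which is where the CPU time reportedly goes. The sample body and field path below are illustrative; real pod manifests are far larger:

```python
import json

# Hypothetical raw HTTP response body, standing in for what the API server
# returns when V1Pod deserialization is skipped.
raw_body = b'{"metadata": {"name": "demo-pod"}, "status": {"phase": "Running"}}'

# Parse just the JSON and pluck the phase; no V1Pod object is constructed.
pod_dict = json.loads(raw_body)
phase = pod_dict["status"]["phase"]
print(phase)  # → Running
```
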
But yeah, @ferruzzi, re your comments on the first PR, I think it would be a good idea to try and repro the user's situation. You would ask what their setup is, how many tasks are concurrently waiting for pods to start -- I presume this is celery? This would help you understand whether the issue is real, e.g. whether there isn't another noisy neighbor that is using lots of CPU. But then, in any case, having repro'd it, you could use the same setup to evaluate whether your enhancement helps, and if so how much and under what scenarios -- whether it is this optimization or some other one such as longer waits.
The context here is looking for pod status to change, so a cache would be unhelpful here. I am actually quite surprised to see that code. I don't think it makes a lot of sense. Pretty much whenever we call `read_pod` we're doing it for a reason. It looks like it was related to the pod logs consumer thing. We should not have such a cache on the method generally; most calls should not use it.
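A tiny toy model of why a timed cache fights a status poller (this is illustrative code, not the Airflow implementation; the fake clock and all names are invented). The real phase flips immediately, but a poller going through the cache keeps seeing the stale value until the timeout expires:

```python
from datetime import datetime, timedelta


class CachedReader:
    """Returns a cached value until cache_timeout_s elapses, like the read_pod cache above."""

    def __init__(self, fetch, cache_timeout_s, clock):
        self.fetch = fetch
        self.timeout = timedelta(seconds=cache_timeout_s)
        self.clock = clock
        self.cached = None
        self.cached_at = None

    def read(self):
        now = self.clock()
        if self.cached is None or self.cached_at + self.timeout < now:
            self.cached = self.fetch()
            self.cached_at = now
        return self.cached


phases = iter(["Pending", "Succeeded"])
now_holder = [datetime(2024, 1, 1)]  # mutable fake clock
reader = CachedReader(lambda: next(phases), cache_timeout_s=2, clock=lambda: now_holder[0])

print(reader.read())  # → Pending   (fetched and cached)
print(reader.read())  # → Pending   (stale: the real phase is already Succeeded)
now_holder[0] += timedelta(seconds=3)
print(reader.read())  # → Succeeded (cache expired, refetched)
```
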
Nice! Good one @dstandish!

I quite agree here.

Yes, that's a good idea.

Cool, looks like we have a way forward then. I'll look into implementing it. Do we want to reuse this PR or shall I open a new one when I have that done?

Your call
Follow-up to #40360
Only modifying one of the methods at the moment. If we like what I did here, I'll do the same for the others. Please triple check my unit test changes to make sure I didn't inadvertently break the underlying mocks.
For the tenacity part of the change, note that I did away with the idea of using exponential backoff from the previous attempt (for now??) and just re-implemented the existing `sleep(2)` as a tenacity retry to align it with the other methods in this module that were already converted to tenacity. We may go back to trying different retry periods later, but for now that part of the change is just for consistency.
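For context on the fixed-wait vs exponential-backoff trade-off mentioned above, a small stdlib-only sketch comparing the two schedules (the numbers are illustrative only, not what the PR uses):

```python
def fixed_waits(n, wait=2.0):
    # wait_fixed(2)-style schedule: the same pause before every retry.
    return [wait] * n


def exponential_waits(n, start=1.0, cap=30.0):
    # wait_exponential-style schedule: 1, 2, 4, ... doubling, capped at `cap`.
    return [min(start * (2 ** i), cap) for i in range(n)]


print(fixed_waits(6))        # → [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
print(exponential_waits(6))  # → [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
# Total time spent sleeping over six retries:
print(sum(fixed_waits(6)), sum(exponential_waits(6)))  # → 12.0 61.0
```

The fixed schedule polls frequently forever; the exponential one trades slower detection of readiness for far fewer API calls on long waits, which is the tiered-backoff idea referenced in the discussion.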
@dstandish and @potiuk : You both made good suggestions in the previous attempt at this, let me know what you think.