[Serve] Make max_num_models_per_replica in @serve.multiplexed reconfigurable #46422

Open
LahiLuk opened this issue Jul 3, 2024 · 0 comments
Labels
enhancement Request for new feature and/or capability serve Ray Serve Related Issue triage Needs triage (eg: priority, bug/not-bug, and owning component)

LahiLuk commented Jul 3, 2024

Description

Currently, the max_num_models_per_replica parameter of multiplexed deployments can't be reconfigured through the Serve config file.
A similar feature already exists for @serve.batch (see https://docs.ray.io/en/latest/serve/advanced-guides/dyn-req-batch.html#enable-batching-for-your-deployment); it was implemented following issue #36844.

Use case

This feature would let users change the parameter through the deployment's reconfigure method, enabling in-place updates of multiplexed deployments.
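To make the requested behavior concrete, here is a minimal pure-Python sketch of a per-replica model cache whose capacity can be changed in place. It is not Ray Serve code: `ReconfigurableModelCache`, its `reconfigure` signature, and the `user_config` key are hypothetical names chosen for illustration, and the least-recently-used eviction policy is an assumption about how shrinking the limit could behave.

```python
from collections import OrderedDict
from typing import Any, Callable, Dict, List


class ReconfigurableModelCache:
    """Hypothetical stand-in for a replica's multiplexed model cache."""

    def __init__(self, loader: Callable[[str], Any],
                 max_num_models_per_replica: int = 3) -> None:
        self._loader = loader
        self._max = max_num_models_per_replica
        # Insertion order doubles as recency order: oldest entry first.
        self._models: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, model_id: str) -> Any:
        """Return a cached model, loading it on a miss."""
        if model_id in self._models:
            self._models.move_to_end(model_id)  # mark as most recently used
            return self._models[model_id]
        self._models[model_id] = self._loader(model_id)
        self._evict_to_limit()
        return self._models[model_id]

    def reconfigure(self, user_config: Dict[str, Any]) -> None:
        """Apply a new limit in place (assumed user_config key)."""
        new_max = user_config.get("max_num_models_per_replica", self._max)
        if new_max < 1:
            raise ValueError("max_num_models_per_replica must be >= 1")
        self._max = new_max
        self._evict_to_limit()  # shrinking evicts the least recently used

    def loaded_model_ids(self) -> List[str]:
        """Model IDs from least to most recently used."""
        return list(self._models)

    def _evict_to_limit(self) -> None:
        while len(self._models) > self._max:
            self._models.popitem(last=False)  # drop the oldest entry
```

With a sketch like this, lowering the limit from 3 to 2 would evict only the least recently used model and leave the others loaded, which is the kind of in-place update the use case describes.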

@LahiLuk LahiLuk added enhancement Request for new feature and/or capability triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Jul 3, 2024
@anyscalesam anyscalesam added the serve Ray Serve Related Issue label Jul 8, 2024