[Serve] Make `max_num_models_per_replica` in `@serve.multiplexed` reconfigurable #46422
Labels
enhancement
Request for new feature and/or capability
serve
Ray Serve Related Issue
triage
Needs triage (eg: priority, bug/not-bug, and owning component)
Description
Currently, the `max_num_models_per_replica` parameter for multiplexed deployments can't be reconfigured using the Serve config file. A similar feature exists for `@serve.batch` (https://docs.ray.io/en/latest/serve/advanced-guides/dyn-req-batch.html#enable-batching-for-your-deployment), and was implemented following issue #36844.
Use case
This feature would let users modify the parameter through the `reconfigure` method, enabling in-place updates of multiplexed deployments.
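To make the request concrete, here is a minimal sketch of the behaviour being asked for, written without any Ray dependency. The class `ReconfigurableModelCache`, its `reconfigure` method, and the `max_num_models_per_replica` key inside the config dict are hypothetical names chosen for illustration. This is not Ray Serve's implementation or API. It only assumes least-recently-used eviction when the cap is exceeded.

```python
from collections import OrderedDict
from typing import Any, Callable, Dict, List


class ReconfigurableModelCache:
    """Hypothetical stand-in for a replica's multiplexed-model cache (not Ray code)."""

    def __init__(self, loader: Callable[[str], Any], max_num_models: int) -> None:
        if max_num_models < 1:
            raise ValueError("max_num_models must be at least 1")
        self._loader = loader
        self._max_num_models = max_num_models
        # Insertion order doubles as recency order: oldest entry first.
        self._models: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, model_id: str) -> Any:
        """Return a cached model, loading it (and evicting if needed) on a miss."""
        if model_id in self._models:
            self._models.move_to_end(model_id)  # mark as most recently used
            return self._models[model_id]
        model = self._loader(model_id)
        self._models[model_id] = model
        self._evict_to_capacity()
        return model

    def reconfigure(self, user_config: Dict[str, Any]) -> None:
        """Apply a new cap in place, evicting least recently used models if it shrank."""
        new_max = int(user_config["max_num_models_per_replica"])
        if new_max < 1:
            raise ValueError("max_num_models_per_replica must be at least 1")
        self._max_num_models = new_max
        self._evict_to_capacity()

    def loaded_model_ids(self) -> List[str]:
        """Model IDs currently cached, least recently used first."""
        return list(self._models)

    def _evict_to_capacity(self) -> None:
        while len(self._models) > self._max_num_models:
            self._models.popitem(last=False)  # drop the least recently used model
```

With something like this, lowering the cap from 3 to 2 would evict only the least recently used model and keep the rest loaded, without restarting the replica. A real implementation would also need to handle asynchronous loading and model teardown, which this sketch leaves out.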