I would like to propose the addition of a new learning rate scheduler that combines MultiStepLR with a warmup phase. Currently, the Transformers library does not include a scheduler that uses both MultiStepLR and warmup. This feature can be beneficial for training models where the learning rate needs to be adjusted at specific epochs with an initial warmup phase to stabilise training.
Motivation
In many training scenarios, it is beneficial to start with a warmup phase where the learning rate gradually increases, followed by a phase where the learning rate decreases at specific milestones (steps).
Contribution
I propose adding a new scheduler, get_multistep_schedule_with_warmup, which combines the functionality of MultiStepLR and Warmup. This scheduler will increase the learning rate linearly during the warmup phase and then follow the MultiStepLR schedule. I am more than happy to create a pull request (PR) implementing this feature. Please let me know if this sounds like a valuable addition, and I will proceed with the implementation.
Hi! In general we prefer if you can provide tangible results of improvement via either your own work or a paper referencing it. Can you link any please? thanks!
@muellerzr Thanks for the prompt reply! Sure I present it in detail below.
Popularity and Practical Use
The MultiStepLR scheduler is widely used and recognized for its effectiveness in practice, as evidenced by its popularity among PyTorch users. According to Defazio et al. (2023), it is one of the top three most popular schedulers. This piece-wise approach to decreasing the learning rate when progress plateaus has proven to be effective. Many studies incorporate this scheduler as a default choice for learning rate adjustment (Sohn et al., 2016; Wang et al., 2017; Gong et al., 2021).
| PyTorch Scheduler | GitHub Files (K) |
| --- | --- |
| ReduceLROnPlateau | 105.0 |
| StepLR | 101.0 |
| MultiStepLR | 37.9 |
| CosineAnnealingLR | 37.1 |
| ExponentialLR | 16.0 |
| OneCycleLR | 14.9 |
| CosineAnnealingWarmRestarts | 10.9 |
| CyclicLR | 9.1 |
| LinearLR | 5.9 |
| ConstantLR | 3.6 |
| MultiplicativeLR | 2.6 |
| PolynomialLR | 1.3 |
References
Defazio, Aaron, et al. "When, Why and How Much? Adaptive Learning Rate Scheduling by Refinement." arXiv preprint arXiv:2310.07831 (2023).
Gong, Yuan, Yu-An Chung, and James Glass. "AST: Audio Spectrogram Transformer." Proc. Interspeech (2021).
Wang, Jian, et al. "Deep Metric Learning with Angular Loss." Proceedings of the IEEE International Conference on Computer Vision (2017).
Sohn, Kihyuk. "Improved Deep Metric Learning with Multi-class N-pair Loss Objective." Advances in Neural Information Processing Systems 29 (2016).