Bug description
The smoothing option passed when creating progress bars in TQDMProgressBar has no effect in the default implementation, because
_update_n only calls bar.refresh() and never the progress bar's update method. As a result, only the global average rate is shown, since the update method of the tqdm class is what computes the moving average.
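To illustrate why this matters: tqdm's update() feeds each increment into an exponential moving average weighted by smoothing, while refresh() only redraws the bar. Below is a minimal sketch of that moving average; the EMA class is my own simplified paraphrase of tqdm's internal helper, not tqdm's actual code:

```python
class EMA:
    """Simplified sketch of tqdm's exponential moving average:
    smoothing close to 1 weights the most recent sample heavily,
    smoothing close to 0 approaches a plain running average."""

    def __init__(self, smoothing=0.3):
        self.alpha = smoothing
        self.last = 0.0
        self.calls = 0

    def __call__(self, x=None):
        beta = 1.0 - self.alpha
        if x is not None:
            self.last = self.alpha * x + beta * self.last
            self.calls += 1
        # bias correction so early samples are not dragged toward 0
        return self.last / (1 - beta ** self.calls) if self.calls else self.last


ema = EMA(smoothing=1.0)
for dt in (1.0, 1.0, 10.0):  # per-step durations; the last step is slow
    ema(dt)
print(ema())  # 10.0 -- with smoothing=1 only the current speed counts
```

Since _update_n only refreshes the bar, such samples are never recorded, so the configured smoothing can never influence the displayed rate.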
Either the progress bar's update method should be used, or it should be noted in the documentation that smoothing has no effect, if that is the intended behavior (exposing an option that has no effect is a bit misleading).
What version are you seeing the problem on?
master
How to reproduce the bug
import sys
import time
from typing import Any

import lightning.pytorch as pl
import torch
from lightning.pytorch.callbacks import TQDMProgressBar
from lightning.pytorch.callbacks.progress.tqdm_progress import Tqdm
from lightning.pytorch.utilities.types import STEP_OUTPUT
from torch import nn
from torch.utils.data import DataLoader, Dataset, Sampler
from typing_extensions import override

from src.main.ml.data.data_augmentation.helpers.random_numbers import create_rng_from_string


class LitProgressBar(TQDMProgressBar):
    """Different smoothing factor than the default lightning TQDMProgressBar:
    smoothing=1 (current speed) is used instead of smoothing=0 (average).

    See also: https://tqdm.github.io/docs/tqdm/
    """

    def init_train_tqdm(self) -> Tqdm:
        """Override this to customize the tqdm bar for training."""
        return Tqdm(
            desc=self.train_description,
            position=(2 * self.process_position),
            disable=self.is_disabled,
            leave=True,
            dynamic_ncols=True,
            file=sys.stdout,
            smoothing=1.0,
            bar_format=self.BAR_FORMAT,
        )

    # default method
    # @override
    # def on_train_batch_end(
    #     self, trainer: "pl.Trainer", pl_module: "pl.LightningModule", outputs: STEP_OUTPUT, batch: Any, batch_idx: int
    # ) -> None:
    #     n = batch_idx + 1
    #     if self._should_update(n, self.train_progress_bar.total):
    #         _update_n(self.train_progress_bar, n)
    #         self.train_progress_bar.set_postfix(self.get_metrics(trainer, pl_module))

    # my own method that uses smoothing by using the update method of the progress bar
    @override
    def on_train_batch_end(
        self, trainer: "pl.Trainer", pl_module: "pl.LightningModule", outputs: STEP_OUTPUT, batch: Any,
        batch_idx: int
    ) -> None:
        n = batch_idx + 1
        if self._should_update(n, self.train_progress_bar.total):
            self.train_progress_bar.update(self.refresh_rate)
            self.train_progress_bar.set_postfix(self.get_metrics(trainer, pl_module))


class TestModule(nn.Module):
    def __init__(self, in_dim=512, out_dim=16):
        super().__init__()
        self.in_dim = in_dim
        self.out_dim = out_dim
        self.simple_layer = nn.Linear(self.in_dim, self.out_dim, bias=True)

    def forward(self, input):
        return self.simple_layer(input)


class TestBatchSampler(Sampler):
    def __init__(self, step=0):
        super().__init__()
        self.step = step

    def __len__(self) -> int:
        return 1e100  # return len(self.train_allfiles)

    def __iter__(self):  # -> Iterator[int]
        return self

    def __next__(self):  # -> Iterator[int]
        return_value = self.step
        self.step += 1
        return [return_value]


class TestDataset(Dataset):
    def __init__(self, in_dim):
        super().__init__()
        self.in_dim = in_dim
        self.total_len = 512

    def __len__(self):
        return 1

    def __getitem__(self, idx):
        rng = create_rng_from_string(
            str(idx) + "_" + "random_choice_sampler")
        return torch.tensor(rng.random(self.in_dim), dtype=torch.float32)


class TestDataModule(pl.LightningDataModule):
    def __init__(self, start_step=0):
        super().__init__()
        self.in_dim = 512
        self.val_batch_size = 1
        self.start_step = start_step

    def train_dataloader(self):
        train_ds = TestDataset(self.in_dim)
        train_dl = DataLoader(train_ds, batch_sampler=TestBatchSampler(step=self.start_step), num_workers=4,
                              shuffle=False)
        return train_dl


class TestLitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.test_module_obj = TestModule(in_dim=512, out_dim=16)
        self.automatic_optimization = False

    def training_step(self, batch, batch_idx):
        if batch_idx == 0:
            time.sleep(5)
        time.sleep(0.5)
        optimizer = self.optimizers()
        output = self.test_module_obj(batch)
        loss = output.sum()
        self.manual_backward(loss)
        optimizer.step()

    def configure_optimizers(self):
        optimizer = torch.optim.AdamW(
            self.test_module_obj.parameters()
        )
        return optimizer


if __name__ == '__main__':
    test_data_loader = TestDataModule()
    test_lit_model = TestLitModel()
    bar = LitProgressBar(refresh_rate=5)
    trainer = pl.Trainer(
        log_every_n_steps=1,
        callbacks=[bar],
        max_epochs=-1,
        max_steps=400000,
    )
    trainer.fit(test_lit_model,
                datamodule=test_data_loader)
> as _update_n only calls bar.refresh() and not the update method of the progress bar
This change is needed to give us exact control in all use cases and enable exact updates. The progress bar is deeply tied to the loops, so we need that precise control, which means we can't remove the _update_n change. But you can definitely override everything in TQDMProgressBar you wish and make it your own if that smoothing option is important.
> or it should be added to the documentation if smoothing having no effect is the desired behavior (overriding a default that has no effect is a bit misleading)
I'm fine with adding a note to the TQDMProgressBar.init_*_tqdm methods documentation 👍
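A note along these lines could work; the wording below is only a suggestion, not the final docs text:

```python
def init_train_tqdm(self):
    """Override this to customize the tqdm bar for training.

    Note:
        tqdm's ``smoothing`` option has no effect under the default
        implementation: the bar is advanced by setting ``n`` directly and
        calling ``refresh()`` rather than ``tqdm.update()``, so only the
        global average rate is displayed.
    """
    ...
```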
Error messages and logs

# Error messages and logs here please
#- Lightning Component (e.g. Trainer, LightningModule, LightningApp, LightningWork, LightningFlow):
#- PyTorch Lightning Version (e.g., 1.5.0):
#- Lightning App Version (e.g., 0.5.2):
#- PyTorch Version (e.g., 2.0):
#- Python version (e.g., 3.9):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning (conda, pip, source):
#- Running environment of LightningApp (e.g. local, cloud):