Lightning-AI/pytorch-lightning Discussions
⚡ Lightning Trainer API: Trainer, LightningModule, LightningDataModule Discussions
Questions about the LightningModule, Trainer, or anything Lightning-related!
⚡ resume_from_checkpoint with different hyperparameters
Labels: checkpointing [Related to checkpointing] · sirtris asked Aug 10, 2022 · Answered
⚡ Do not apply fp16 to certain key returned by Dataset
Labels: data handling [Generic data-related topic], precision: amp [Automatic Mixed Precision] · MattYoon asked Jul 26, 2022 · Answered
⚡ How to load checkpoint for a sharded model?
Labels: strategy: fairscale sharded (removed) [Sharded Data Parallel] · ver217 asked Aug 11, 2022 · Unanswered
⚡ Exploding Validation Loss
lsaeuro asked Aug 12, 2022 · Unanswered
⚡ TLDR on Making Repo Pip Installable
raunakdoesdev asked Aug 11, 2022 · Unanswered
⚡ How to register a (repeatedly) sampled random tensor?
Labels: data handling [Generic data-related topic], lightningmodule [pl.LightningModule], accelerator: cuda [Compute Unified Device Architecture GPU] · RylanSchaeffer asked Aug 10, 2022 · Answered
⚡ Why is the 'log' keyword in the return parameter of training_step of LightningModule not working?
Labels: logging [Related to the `LoggerConnector` and `log()`] · kochark1 asked May 20, 2022 · Answered
⚡ "resume from checkpoint" leads to CUDA out of memory
Defiler24 asked Jan 21, 2022 · Answered
⚡ Attribute Error when resuming training from checkpoint file
pamparana34 asked Aug 8, 2022 · Unanswered
⚡ What is updated *per-epoch* (and not *per-batch*)?
jpcbertoldo asked Aug 5, 2022 · Unanswered
⚡ Proper way of making predictions with multi-GPUs
marcmk6 asked Aug 5, 2022 · Unanswered
⚡ Change the scheduler interval in CLI
Labels: lightningcli [pl.cli.LightningCLI] · ForJadeForest asked Aug 2, 2022 · Answered
⚡ Why does pytorch lightning cause more GPU memory usage?
Labels: accelerator: cuda [Compute Unified Device Architecture GPU], performance, pl [Generic label for PyTorch Lightning package] · chuzheng88 asked Jul 14, 2022 · Unanswered
⚡ global_step while using two dataloaders
fugokidi asked Aug 3, 2022 · Unanswered
⚡ Weighting different losses based on random initialization of network
Labels: data handling [Generic data-related topic], lightningmodule [pl.LightningModule] · Michael-Geuenich asked Jul 23, 2022 · Answered
⚡ Why does DDP mode continue the program in multiple process for longer than intended?
Labels: strategy: ddp [DistributedDataParallel] · hfaghihi15 asked Jun 2, 2022 · Answered
⚡ Error: found at least two devices with DataParallel
Labels: strategy: dp (removed in pl) [DataParallel] · avivko asked May 30, 2022 · Unanswered
⚡ Access model weights during training
vokcow asked Aug 1, 2022 · Answered
⚡ limit_train_batches
Labels: data handling [Generic data-related topic], trainer: argument · sirtris asked Jul 28, 2022 · Answered
⚡ Mixed precision on Hugging Face models
marcmk6 asked Jul 29, 2022 · Unanswered
⚡ train_time_interval checkpoint and metric value
DA-L3 asked Jul 28, 2022 · Unanswered
⚡ Handling batch normalization with gradient accumulation
brunomaga asked Jul 27, 2022 · Unanswered