scvi.core.trainers.UnsupervisedTrainer

class scvi.core.trainers.UnsupervisedTrainer(model, adata, train_size=0.9, test_size=None, n_iter_kl_warmup=None, n_epochs_kl_warmup=400, normalize_loss=None, **kwargs)[source]

Class for unsupervised training of an autoencoder.

Parameters
model

A model instance from the VAE, VAEC, SCANVI, or AutoZIVAE class

adata : AnnData

A registered AnnData object

train_size : Union[int, float] (default: 0.9)

A float between 0 and 1 giving the proportion of the dataset to use for training. Default: 0.9.

test_size : Union[int, float, None] (default: None)

A float between 0 and 1 giving the proportion of the dataset to use for testing. Default: None, which uses all data not in the train set. If train_size and test_size do not sum to 1, the remaining samples form a validation_set.

**kwargs

Other keyword arguments from the general Trainer class.

Other Parameters
  • n_epochs_kl_warmup – Number of epochs for linear warmup of the KL(q(z|x)||p(z)) term. After n_epochs_kl_warmup, the training objective is the full ELBO. Warmup can help prevent inactivity of latent units and improve clustering of the latent space, since a long warmup makes the model behave more like a plain autoencoder. For large datasets, prefer n_iter_kl_warmup instead. If this parameter is not None, it overrides any choice of n_iter_kl_warmup.

  • n_iter_kl_warmup – Number of iterations for linear warmup of the KL term (useful for bigger datasets); int(128 * 5000 / 400) is a good default value.

  • normalize_loss – A boolean determining whether the loss is divided by the total number of samples used for training. In particular, when the global KL divergence is 0 and the division is performed, the loss for a minibatch is the average of the reconstruction losses and KL divergences over the minibatch. Default: None, which is equivalent to False when the model is an instance of AutoZIVAE and True otherwise.
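The warmup schedule described above can be sketched in plain Python. This is an illustrative reimplementation, not the scvi source; the function name and signature are hypothetical, but the logic mirrors the documented behavior: linear annealing of the KL weight, capped at 1, with n_epochs_kl_warmup taking precedence over n_iter_kl_warmup.

```python
def kl_weight(epoch, iteration, n_epochs_kl_warmup=400, n_iter_kl_warmup=None):
    """Linear annealing weight for the KL term, capped at 1.0.

    Precedence as documented above: if n_epochs_kl_warmup is not None it
    overrides n_iter_kl_warmup; with neither set, the weight is always 1.0
    (i.e. the objective is the full ELBO from the start).
    """
    if n_epochs_kl_warmup is not None:
        return min(1.0, epoch / n_epochs_kl_warmup)
    if n_iter_kl_warmup is not None:
        return min(1.0, iteration / n_iter_kl_warmup)
    return 1.0

# With the default n_epochs_kl_warmup=400, the KL term reaches full weight
# only at epoch 400:
kl_weight(200, 0)   # 0.5
kl_weight(400, 0)   # 1.0
```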

Examples

>>> gene_dataset = CortexDataset()
>>> vae = VAE(gene_dataset.nb_genes, n_batch=0,  # n_batch=0 disables batch correction
... n_labels=gene_dataset.n_labels)
>>> trainer = UnsupervisedTrainer(vae, gene_dataset, train_size=0.5)
>>> trainer.train(n_epochs=20, lr=1e-3)

Notes

Two parameters (n_epochs_kl_warmup and n_iter_kl_warmup) control KL annealing during training. If your application relies on the quality of the posterior (e.g. differential expression, batch effect removal), ensure that the total number of epochs (or iterations) exceeds the number of epochs (or iterations) used for KL warmup.
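To make this note concrete, a quick sanity check in plain Python (not part of the scvi API) using the defaults stated above: with the example's 20 training epochs and the default 400-epoch warmup, the KL term would never reach full weight, so the objective would never become the full ELBO.

```python
# Epoch budget vs. warmup length, using values from the docstring above.
n_epochs = 20               # as in the example's trainer.train(n_epochs=20, ...)
n_epochs_kl_warmup = 400    # default warmup length

# Warmup completes only if training runs past the warmup horizon.
warmup_completes = n_epochs > n_epochs_kl_warmup  # False here

# Suggested iteration-based default for big datasets, from the parameter docs:
n_iter_kl_warmup = int(128 * 5000 / 400)
```

If `warmup_completes` is False, either shorten the warmup or train longer before relying on the posterior.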

Attributes

default_metrics_to_monitor

kl_weight

scvi_data_loaders_loop

Methods

check_training_status()

Checks if loss is admissible.

compute_metrics()

create_scvi_dl([model, adata, shuffle, …])

data_loaders_loop()

Returns a zipped iterable corresponding to the loss signature.

loss(tensors[, feed_labels])

on_epoch_begin()

on_epoch_end()

on_iteration_begin()

on_iteration_end()

on_training_begin()

on_training_end()

on_training_loop(tensors_dict)

register_data_loader(name, value)

train([n_epochs, lr, eps, params])

train_test_validation([model, adata, …])

Creates data loaders train_set, test_set, validation_set.

training_extras_end()

Place to put extra models in eval mode, etc.

training_extras_init(**extras_kwargs)

Other necessary models to simultaneously train.