colossalai.nn.layer.vanilla
- class colossalai.nn.layer.vanilla.VanillaPatchEmbedding(img_size, patch_size, in_chans, embed_size, flatten=True, dtype=None, weight_initializer=<function kaiming_uniform_.<locals>.initializer>, bias_initializer=<function xavier_uniform_.<locals>.initializer>, position_embed_initializer=<function zeros_.<locals>.initializer>)[source]
2D Image to Patch Embedding
- Parameters
img_size (int) – image size.
patch_size (int) – patch size.
in_chans (int) – number of channels of input image.
embed_size (int) – size of embedding.
dtype (torch.dtype, optional) – The dtype of parameters, defaults to None.
flatten (bool, optional) – whether to flatten output tensor, defaults to True.
weight_initializer (typing.Callable, optional) – The initializer of weight, defaults to kaiming uniform initializer.
bias_initializer (typing.Callable, optional) – The initializer of bias, defaults to xavier uniform initializer.
position_embed_initializer (typing.Callable, optional) – The initializer of position embedding, defaults to zeros initializer.
For more details about initializers, please refer to init.
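To make the shape bookkeeping behind "2D Image to Patch Embedding" concrete, here is a minimal plain-Python sketch. It assumes square images and non-overlapping patches (stride equal to patch_size), and it ignores any class token or position embedding the layer may add. `patch_embedding_shapes` is a hypothetical helper written for illustration, not part of colossalai.

```python
def patch_embedding_shapes(img_size, patch_size, embed_size, batch=1, flatten=True):
    """Hypothetical helper: output shape of a non-overlapping patch projection."""
    if img_size % patch_size != 0:
        raise ValueError("img_size must be divisible by patch_size in this sketch")
    grid = img_size // patch_size  # number of patches along each side
    num_patches = grid * grid
    if flatten:
        # Sequence layout: (batch, number of patches, embedding size)
        return (batch, num_patches, embed_size)
    # Spatial layout kept: (batch, embedding size, grid rows, grid columns)
    return (batch, embed_size, grid, grid)


shape_flat = patch_embedding_shapes(224, 16, 768, batch=8)
shape_grid = patch_embedding_shapes(224, 16, 768, batch=8, flatten=False)
print(shape_flat)  # (8, 196, 768)
print(shape_grid)  # (8, 768, 14, 14)
```

With img_size=224 and patch_size=16 there are 14 patches per side, so the flattened output carries 196 patch embeddings per image.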
- class colossalai.nn.layer.vanilla.VanillaClassifier(in_features, num_classes, weight=None, bias=True, dtype=None, weight_initializer=<function kaiming_uniform_.<locals>.initializer>, bias_initializer=<function xavier_uniform_.<locals>.initializer>)[source]
Dense linear classifier.
- Parameters
in_features (int) – size of each input sample.
num_classes (int) – number of classes.
weight (torch.nn.Parameter, optional) – weight of the classifier, defaults to None.
bias (bool, optional) – whether to add a learnable bias, defaults to True.
dtype (torch.dtype, optional) – The dtype of parameters, defaults to None.
weight_initializer (typing.Callable, optional) – The initializer of weight, defaults to kaiming uniform initializer.
bias_initializer (typing.Callable, optional) – The initializer of bias, defaults to xavier uniform initializer.
For more details about initializers, please refer to init.
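A dense linear classifier maps each input sample to one score per class. The sketch below shows that arithmetic in plain Python. It assumes the weight is laid out as num_classes rows of in_features values (the torch.nn.Linear convention). `linear_classifier` is a hypothetical function for illustration, not the library's implementation.

```python
def linear_classifier(x, weight, bias=None):
    """x: in_features floats; weight: num_classes rows of in_features floats."""
    logits = []
    for k, row in enumerate(weight):
        if len(row) != len(x):
            raise ValueError("each weight row must have in_features entries")
        # Dot product of the sample with the k-th class weight vector
        score = sum(w * v for w, v in zip(row, x))
        if bias is not None:
            score += bias[k]
        logits.append(score)
    return logits


x = [1.0, 2.0, 3.0]                          # in_features = 3
weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]  # num_classes = 2
bias = [0.5, -1.0]
logits = linear_classifier(x, weight, bias)
print(logits)  # [1.5, 4.0]
```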
- class colossalai.nn.layer.vanilla.DropPath(drop_prob=None)[source]
Drops paths (stochastic depth) per sample when applied in the main path of residual blocks. Adapted from https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/layers/drop.py
- Parameters
drop_prob (float, optional) – probability of dropping path, defaults to None.
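The per-sample behaviour can be sketched in plain Python as follows. This is an illustration of the stochastic depth idea, not the library's code. It assumes that kept samples are rescaled by 1/(1 - drop_prob) so the expected value is unchanged, and that the layer is an identity outside training or when drop_prob is None or 0. `drop_path` and its `rng` argument are hypothetical names.

```python
import random


def drop_path(batch, drop_prob=0.0, training=True, rng=random):
    """batch: list of samples, each a list of floats from a residual branch."""
    if not drop_prob or not training:
        # Identity when disabled or in evaluation mode
        return [list(sample) for sample in batch]
    keep_prob = 1.0 - drop_prob
    out = []
    for sample in batch:
        # One Bernoulli draw per sample: the whole branch is kept or dropped
        keep = rng.random() < keep_prob
        scale = 1.0 / keep_prob if keep else 0.0
        out.append([v * scale for v in sample])
    return out


rng = random.Random(0)
batch = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
out = drop_path(batch, drop_prob=0.5, rng=rng)
for sample_in, sample_out in zip(batch, out):
    print(sample_in, "->", sample_out)
```

Each sample comes out either entirely zeroed or scaled by 2, which is what distinguishes drop path from element-wise dropout.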
- class colossalai.nn.layer.vanilla.WrappedDropout(p=0.5, inplace=False, mode=None)[source]
Same as torch.nn.Dropout, but wrapped in the context of the seed manager. During training, it randomly zeroes some elements of the input tensor with probability p, using samples from a Bernoulli distribution. Each channel is zeroed out independently on every forward call. The outputs are also scaled by a factor of 1/(1-p) during training, so during evaluation the module simply computes the identity function.
- Parameters
p (float, optional) – probability of an element to be zeroed, defaults to 0.5.
inplace (bool, optional) – whether to do dropout in-place, defaults to False.
mode (colossalai.context.ParallelMode) – The chosen parallel mode.
Note
The parallel mode passed as mode should be a member of ParallelMode. More details about ParallelMode can be found in parallel_mode.
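The idea of running dropout "in the context of a seed manager" can be sketched with Python's standard random module. This is a loose analogy under stated assumptions: each mode owns a saved RNG state, the state is swapped in for the duration of the dropout call, and the global state is restored afterwards. The names `make_states`, `seed_context`, `wrapped_dropout`, and the mode strings "data" and "tensor" are hypothetical stand-ins. They are not colossalai APIs or ParallelMode members.

```python
import random
from contextlib import contextmanager


def make_states(seeds):
    """Build one saved RNG state per mode name from a {mode: seed} mapping."""
    return {mode: random.Random(seed).getstate() for mode, seed in seeds.items()}


@contextmanager
def seed_context(states, mode):
    saved = random.getstate()
    random.setstate(states[mode])
    try:
        yield
    finally:
        states[mode] = random.getstate()  # advance this mode's own stream
        random.setstate(saved)            # leave the global stream untouched


def dropout(values, p=0.5, training=True):
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if not training or p == 0.0:
        return list(values)               # identity in evaluation mode
    if p == 1.0:
        return [0.0 for _ in values]
    scale = 1.0 / (1.0 - p)               # keeps the expected value unchanged
    return [0.0 if random.random() < p else v * scale for v in values]


def wrapped_dropout(values, p, states, mode=None, training=True):
    if mode is None:
        return dropout(values, p, training)
    with seed_context(states, mode):
        return dropout(values, p, training)


states = make_states({"data": 1, "tensor": 2})
x = [1.0] * 16
out = wrapped_dropout(x, 0.5, states, mode="tensor")
print(out)
```

Because every mode draws from its own stream, two setups seeded identically for a mode produce the same mask for that mode, regardless of what else has consumed the global generator.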
- class colossalai.nn.layer.vanilla.WrappedDropPath(p=0.0, mode=None)[source]
Drops paths (stochastic depth) per sample when applied in the main path of residual blocks. Here, it is wrapped in the context of the seed manager.
- Parameters
p (float, optional) – probability of dropping path, defaults to 0.0.
mode (colossalai.context.ParallelMode) – The chosen parallel mode.
Note
The parallel mode passed as mode should be a member of ParallelMode. More details about ParallelMode can be found in parallel_mode.