colossalai.nn.layer.colossalai_layer.linear

class colossalai.nn.layer.colossalai_layer.linear.Linear(in_features, out_features, bias=True, dtype=None, weight_initializer=<function kaiming_uniform_.<locals>.initializer>, bias_initializer=<function xavier_uniform_.<locals>.initializer>, **kwargs)

Linear layer of colossalai

Parameters

in_features (int) – size of each input sample
out_features (int) – size of each output sample
bias (bool, optional) – If set to False, the layer will not learn an additive bias, defaults to True
dtype (torch.dtype, optional) – The dtype of parameters, defaults to None
weight_initializer (Callable, optional) – The initializer of weight, defaults to kaiming uniform initializer
bias_initializer (Callable, optional) – The initializer of bias, defaults to xavier uniform initializer
kwargs – Additional keyword arguments used by particular parallelism modes
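
The parameters above can be illustrated with a small, dependency-free sketch. It is not the Colossal-AI implementation: TinyLinear, kaiming_uniform_bound and xavier_uniform_bound are hypothetical names, the (out_features, in_features) weight layout and the formulas follow the usual PyTorch conventions, and the gain and slope values are assumptions because the signature above does not show them. Parallelism-specific kwargs are left out.

```python
import math
import random


def kaiming_uniform_bound(fan_in, a=0.0):
    """Half-width of the Kaiming uniform range for leaky-ReLU slope `a` (assumed 0)."""
    gain = math.sqrt(2.0 / (1.0 + a * a))
    return gain * math.sqrt(3.0 / fan_in)


def xavier_uniform_bound(fan_in, fan_out, gain=1.0):
    """Half-width of the Xavier uniform range."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))


class TinyLinear:
    """Hypothetical stand-in computing y = x W^T + b on nested lists."""

    def __init__(self, in_features, out_features, bias=True, seed=0):
        rng = random.Random(seed)
        w_bound = kaiming_uniform_bound(in_features)
        # Weight has shape (out_features, in_features).
        self.weight = [
            [rng.uniform(-w_bound, w_bound) for _ in range(in_features)]
            for _ in range(out_features)
        ]
        if bias:
            b_bound = xavier_uniform_bound(in_features, out_features)
            self.bias = [rng.uniform(-b_bound, b_bound) for _ in range(out_features)]
        else:
            # bias=False: no additive bias is created.
            self.bias = None

    def __call__(self, batch):
        outputs = []
        for sample in batch:
            # One dot product per output feature.
            y = [sum(w * v for w, v in zip(row, sample)) for row in self.weight]
            if self.bias is not None:
                y = [yi + bi for yi, bi in zip(y, self.bias)]
            outputs.append(y)
        return outputs


layer = TinyLinear(in_features=4, out_features=3)
batch = [[1.0, 0.0, -1.0, 2.0], [0.5, 0.5, 0.5, 0.5]]
output = layer(batch)
print(len(output), len(output[0]))  # prints "2 3"
```

Each input sample of size in_features is mapped to an output sample of size out_features, and passing bias=False drops the additive term entirely.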

class colossalai.nn.layer.colossalai_layer.linear.Classifier(in_features, num_classes, weight=None, bias=True, dtype=None, weight_initializer=<function kaiming_uniform_.<locals>.initializer>, bias_initializer=<function xavier_uniform_.<locals>.initializer>, vocab_parallel_limit=2048)

Classifier layer of colossalai

Parameters

in_features (int) – size of each input sample
num_classes (int) – number of total classes for the dataset
bias (bool, optional) – If set to False, the layer will not learn an additive bias, defaults to True
dtype (torch.dtype, optional) – The dtype of parameters, defaults to None
weight_initializer (Callable, optional) – The initializer of weight, defaults to kaiming uniform initializer
bias_initializer (Callable, optional) – The initializer of bias, defaults to xavier uniform initializer
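
As a rough illustration of how in_features and num_classes relate, the following dependency-free sketch maps each input sample to one score per class. It is not the Colossal-AI implementation: TinyClassifier and predict are hypothetical names. The weight argument is read here as an optional existing weight matrix to reuse, which is an assumption because the parameter list above does not describe it. The random range and the zero bias are simplifications that stand in for the real initializers.

```python
import random


class TinyClassifier:
    """Hypothetical stand-in producing one logit per class for each sample."""

    def __init__(self, in_features, num_classes, weight=None, bias=True, seed=0):
        if weight is not None:
            # Assumed meaning of `weight`: reuse an existing matrix as-is.
            if len(weight) != num_classes or any(len(r) != in_features for r in weight):
                raise ValueError("weight must have shape (num_classes, in_features)")
            self.weight = weight
        else:
            rng = random.Random(seed)
            # Simplified initialization, not the library's initializer.
            self.weight = [
                [rng.uniform(-0.1, 0.1) for _ in range(in_features)]
                for _ in range(num_classes)
            ]
        # Zero bias is a simplification for this sketch.
        self.bias = [0.0] * num_classes if bias else None

    def __call__(self, batch):
        logits = []
        for sample in batch:
            scores = [sum(w * v for w, v in zip(row, sample)) for row in self.weight]
            if self.bias is not None:
                scores = [s + b for s, b in zip(scores, self.bias)]
            logits.append(scores)
        return logits


def predict(logits):
    """Return the index of the highest-scoring class for each sample."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


shared_weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # shape (3 classes, 2 features)
classifier = TinyClassifier(in_features=2, num_classes=3, weight=shared_weight, bias=False)
logits = classifier([[2.0, -1.0], [-1.0, 3.0]])
print(logits)           # prints [[2.0, -1.0, 1.0], [-1.0, 3.0, 2.0]]
print(predict(logits))  # prints [0, 1]
```

The output has num_classes entries per sample, and the predicted class is the index of the largest entry.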