colossalai.nn.layer.colossalai_layer.linear

class colossalai.nn.layer.colossalai_layer.linear.Linear(in_features, out_features, bias=True, dtype=None, weight_initializer=<function kaiming_uniform_.<locals>.initializer>, bias_initializer=<function xavier_uniform_.<locals>.initializer>, **kwargs)

Linear layer of Colossal-AI.

Parameters
  • in_features (int) – size of each input sample

  • out_features (int) – size of each output sample

  • bias (bool, optional) – If set to False, the layer will not learn an additive bias, defaults to True

  • dtype (torch.dtype, optional) – The dtype of parameters, defaults to None

  • weight_initializer (Callable, optional) – The initializer of the weight, defaults to the Kaiming uniform initializer

  • bias_initializer (Callable, optional) – The initializer of the bias, defaults to the Xavier uniform initializer

  • kwargs – Additional keyword arguments used by particular parallelism modes
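
To give a feel for the two default initializers named above, the following sketch computes the sampling bounds of the textbook Kaiming (He) uniform and Xavier (Glorot) uniform schemes for a given layer shape. It uses only the Python standard library and does not call Colossal-AI. The helper names `kaiming_uniform_bound` and `xavier_uniform_bound` are hypothetical, and the gain and slope arguments Colossal-AI actually passes to its initializers are not stated on this page, so the values here are assumptions.

```python
import math


def kaiming_uniform_bound(fan_in: int, a: float = 0.0) -> float:
    """Textbook Kaiming uniform bound; `a` is the leaky-ReLU negative slope."""
    gain = math.sqrt(2.0 / (1.0 + a ** 2))
    return gain * math.sqrt(3.0 / fan_in)


def xavier_uniform_bound(fan_in: int, fan_out: int, gain: float = 1.0) -> float:
    """Textbook Xavier uniform bound."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))


# Example layer shape (illustrative values, not taken from the library).
in_features, out_features = 512, 256

# Values would be drawn uniformly from [-bound, +bound].
weight_bound = kaiming_uniform_bound(fan_in=in_features)
bias_bound = xavier_uniform_bound(fan_in=in_features, fan_out=out_features)

print(f"weight sampled from U(-{weight_bound:.4f}, {weight_bound:.4f})")
print(f"bias sampled from U(-{bias_bound:.4f}, {bias_bound:.4f})")
```

Both bounds shrink as the layer gets wider, which keeps the scale of activations roughly stable across layer sizes.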

class colossalai.nn.layer.colossalai_layer.linear.Classifier(in_features, num_classes, weight=None, bias=True, dtype=None, weight_initializer=<function kaiming_uniform_.<locals>.initializer>, bias_initializer=<function xavier_uniform_.<locals>.initializer>, vocab_parallel_limit=2048)

Classifier layer of Colossal-AI.

Parameters
  • in_features (int) – size of each input sample

  • num_classes (int) – number of total classes for the dataset

  • bias (bool, optional) – If set to False, the layer will not learn an additive bias, defaults to True

  • dtype (torch.dtype, optional) – The dtype of parameters, defaults to None

  • weight_initializer (Callable, optional) – The initializer of the weight, defaults to the Kaiming uniform initializer

  • bias_initializer (Callable, optional) – The initializer of the bias, defaults to the Xavier uniform initializer
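
The shape contract implied by `in_features` and `num_classes` can be sketched on a single device with plain PyTorch. This is not the Colossal-AI `Classifier` itself. The weight layout of `(num_classes, in_features)` is an assumption borrowed from `torch.nn.Linear`, and the zero-initialized bias is chosen only for simplicity.

```python
import torch
import torch.nn.functional as F

# Illustrative sizes, not taken from the library.
in_features, num_classes, batch_size = 16, 10, 4

# Assumed layout, borrowed from torch.nn.Linear: one weight row per class.
weight = torch.empty(num_classes, in_features)
torch.nn.init.kaiming_uniform_(weight)

# Stands in for bias=True; pass None to F.linear to mimic bias=False.
bias = torch.zeros(num_classes)

x = torch.randn(batch_size, in_features)
logits = F.linear(x, weight, bias)   # shape: (batch_size, num_classes)
predicted = logits.argmax(dim=-1)    # one class index per input sample

print(logits.shape, predicted.shape)
```

Each input sample of size `in_features` maps to one score per class, and the highest score gives the predicted class index.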