colossalai.context.process_group_initializer

class colossalai.context.process_group_initializer.Initializer_Tensor(*args, **kwargs)

A ProcessGroupInitializer for tensor parallelism.

Parameters
  • args – Args used to initialize ProcessGroupInitializer

  • kwargs – Kwargs used to initialize ProcessGroupInitializer

init_dist_group()

Initialize tensor parallel groups and assign local_ranks and groups to each GPU.

Returns

Tensor parallelism’s information

Return type

Tuple(local_rank, group_world_size, process_group, ranks_in_group, mode)
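As a minimal sketch of what "assigning local ranks and groups" can look like, the following pure-Python snippet partitions global ranks into tensor parallel groups. It assumes a contiguous layout (ranks 0..k-1 form the first group, and so on); that layout and the helper names `tensor_parallel_groups` and `describe_rank` are illustrative assumptions, not part of the documented API, and no real process groups are created.

```python
from typing import List, Tuple


def tensor_parallel_groups(world_size: int, tensor_parallel_size: int) -> List[List[int]]:
    """Partition ranks 0..world_size-1 into contiguous groups (assumed layout)."""
    if world_size % tensor_parallel_size != 0:
        raise ValueError("world_size must be divisible by tensor_parallel_size")
    num_groups = world_size // tensor_parallel_size
    return [
        [g * tensor_parallel_size + j for j in range(tensor_parallel_size)]
        for g in range(num_groups)
    ]


def describe_rank(rank: int, world_size: int, tensor_parallel_size: int) -> Tuple[int, int, List[int]]:
    """Return (local_rank, group_world_size, ranks_in_group) for one global rank."""
    for ranks in tensor_parallel_groups(world_size, tensor_parallel_size):
        if rank in ranks:
            return ranks.index(rank), len(ranks), ranks
    raise ValueError("rank is outside the communication world")
```

For example, with 8 processes and a tensor parallel size of 4, global rank 5 would sit at local rank 1 of the group `[4, 5, 6, 7]` under this assumed layout.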

class colossalai.context.process_group_initializer.Initializer_Sequence(*args, **kwargs)

A ProcessGroupInitializer for sequence parallelism.

Parameters
  • args – Args used to initialize ProcessGroupInitializer

  • kwargs – Kwargs used to initialize ProcessGroupInitializer

init_dist_group()

Initialize sequence parallel process groups and assign local_ranks and groups to each GPU.

Sequence parallelism requires two process groups. The first is used in the model forward pass, where several processes exchange partial query, key, and value embeddings to compute self-attention values. The second is used for all-reduce to synchronize the model parameters.

Returns

Sequence parallelism’s information

Return type

list of Tuples (local_rank, group_world_size, process_group, ranks_in_group, mode)
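Some initializers in this module return a single tuple and others return a list of tuples, so a caller may want to treat both shapes uniformly. The sketch below is one hedged way to do that; `normalize` and `index_by_mode` are hypothetical helpers, and the mode values and `None` process groups used when calling them are placeholders rather than real library objects.

```python
from typing import Any, Dict, List, Tuple, Union

# (local_rank, group_world_size, process_group, ranks_in_group, mode)
GroupInfo = Tuple[int, int, Any, List[int], Any]


def normalize(result: Union[GroupInfo, List[GroupInfo]]) -> List[GroupInfo]:
    """Wrap a single tuple in a list so callers can always iterate."""
    return result if isinstance(result, list) else [result]


def index_by_mode(result: Union[GroupInfo, List[GroupInfo]]) -> Dict[Any, Dict[str, Any]]:
    """Build a lookup table keyed by mode from an init_dist_group()-style result."""
    registry: Dict[Any, Dict[str, Any]] = {}
    for local_rank, group_world_size, process_group, ranks_in_group, mode in normalize(result):
        registry[mode] = {
            "local_rank": local_rank,
            "group_world_size": group_world_size,
            "process_group": process_group,
            "ranks_in_group": ranks_in_group,
        }
    return registry
```

With a two-element list, as described for sequence parallelism, the resulting dictionary has one entry per group, each keyed by its mode.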

class colossalai.context.process_group_initializer.Initializer_Pipeline(*args, **kwargs)

A ProcessGroupInitializer for pipeline parallelism.

Parameters
  • args – Args used to initialize ProcessGroupInitializer

  • kwargs – Kwargs used to initialize ProcessGroupInitializer

init_dist_group()

Initialize pipeline parallel groups and assign local_ranks and groups to each GPU.

Returns

Pipeline parallelism’s information

Return type

list of Tuples (local_rank, group_world_size, process_group, ranks_in_group, mode)
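To make the grouping concrete, here is a small sketch that enumerates pipeline parallel groups as strided rank lists inside each data parallel replica. The strided layout and the helper name `pipeline_parallel_groups` are assumptions made for illustration; the page itself does not specify how ranks are laid out.

```python
from typing import List


def pipeline_parallel_groups(
    world_size: int, data_parallel_size: int, pipeline_parallel_size: int
) -> List[List[int]]:
    """Enumerate pipeline groups, one rank per stage (assumed strided layout)."""
    if world_size % data_parallel_size != 0:
        raise ValueError("world_size must be divisible by data_parallel_size")
    replica_size = world_size // data_parallel_size
    if replica_size % pipeline_parallel_size != 0:
        raise ValueError("replica size must be divisible by pipeline_parallel_size")
    stage_size = replica_size // pipeline_parallel_size  # ranks per pipeline stage
    groups = []
    for replica in range(data_parallel_size):
        first = replica * replica_size
        for offset in range(stage_size):
            # Pick the rank at the same offset in every stage of this replica.
            groups.append(list(range(first + offset, first + replica_size, stage_size)))
    return groups
```

Under these assumptions, 8 processes with 2 data parallel replicas and 2 pipeline stages give the groups `[0, 2]`, `[1, 3]`, `[4, 6]`, and `[5, 7]`.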

class colossalai.context.process_group_initializer.Initializer_Data(*args, **kwargs)

A ProcessGroupInitializer for data parallelism.

Parameters
  • args – Args used to initialize ProcessGroupInitializer

  • kwargs – Kwargs used to initialize ProcessGroupInitializer

init_dist_group()

Initialize data parallel groups and assign local_ranks and groups to each GPU.

Returns

Data parallelism’s information

Return type

Tuple(local_rank, group_world_size, process_group, ranks_in_group, mode)
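A data parallel group contains the ranks that hold the same slice of the model in different replicas. The sketch below assumes a strided layout in which replicas occupy contiguous rank blocks; both that layout and the helper name `data_parallel_groups` are illustrative assumptions.

```python
from typing import List


def data_parallel_groups(world_size: int, data_parallel_size: int) -> List[List[int]]:
    """Enumerate data parallel groups as strided rank lists (assumed layout)."""
    if world_size % data_parallel_size != 0:
        raise ValueError("world_size must be divisible by data_parallel_size")
    num_groups = world_size // data_parallel_size  # also the stride between members
    return [
        [g + j * num_groups for j in range(data_parallel_size)]
        for g in range(num_groups)
    ]
```

With 8 processes and a data parallel size of 2, rank 0 would be paired with rank 4, rank 1 with rank 5, and so on.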

class colossalai.context.process_group_initializer.Initializer_2p5D(rank, world_size, config, data_parallel_size, pipeline_parallel_size, tensor_parallel_size, depth)

Serve as the single entry point to Tesseract parallel initialization.

Parameters
  • rank (int) – The rank of the current process

  • world_size (int) – Size of the whole communication world

  • config (Config) – Running configuration

  • data_parallel_size (int) – Size of data parallel

  • pipeline_parallel_size (int) – Size of pipeline parallel

  • tensor_parallel_size (int) – Size of tensor parallel

  • depth (int) – The depth of 2p5d parallel

init_dist_group()

Initialize 2p5D tensor row, col, depth, and colXdepth parallel groups, and assign local_ranks and groups to each GPU.

Returns

Whole 2p5D tensor parallelism’s information

Return type

list of Tuples (local_rank, group_world_size, process_group, ranks_in_group, mode)
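In a 2.5D (Tesseract) arrangement, the processes of one tensor parallel group are commonly viewed as `depth` layers of a square q × q grid, so `tensor_parallel_size` should equal `depth * q * q`. The sketch below derives q and rejects sizes that do not fit; the helper name `tesseract_dim` and the exact validation are assumptions for illustration, not the library's code.

```python
import math


def tesseract_dim(tensor_parallel_size: int, depth: int) -> int:
    """Return q such that tensor_parallel_size == depth * q * q, or raise."""
    if depth <= 0 or tensor_parallel_size % depth != 0:
        raise ValueError("tensor_parallel_size must be a positive multiple of depth")
    q = math.isqrt(tensor_parallel_size // depth)
    if q * q * depth != tensor_parallel_size:
        raise ValueError("tensor_parallel_size / depth must be a perfect square")
    return q
```

For instance, a tensor parallel size of 8 with depth 2 corresponds to two layers of a 2 × 2 grid, while a size of 6 with depth 2 is rejected.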

class colossalai.context.process_group_initializer.Initializer_2D(*args, **kwargs)

Serve as the single entry point to 2D parallel initialization.

Parameters
  • args – Args used to initialize ProcessGroupInitializer

  • kwargs – Kwargs used to initialize ProcessGroupInitializer

init_dist_group()

Initialize 2D tensor row and col parallel groups, and assign local_ranks and groups to each GPU.

Returns

2D tensor parallelism’s information

Return type

list of Tuples (local_rank, group_world_size, process_group, ranks_in_group, mode)
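The row and col groups can be pictured by laying the ranks of one tensor parallel group out on a square grid. The sketch below assumes a row-major layout, where a "row" group holds the ranks sharing a row index and a "col" group holds the ranks sharing a column index. That naming convention, and the helper `grid_groups` with its `grid_dim` parameter, are assumptions for illustration.

```python
from typing import List, Tuple


def grid_groups(first_rank: int, grid_dim: int) -> Tuple[List[List[int]], List[List[int]]]:
    """Return (row_groups, col_groups) for a grid_dim x grid_dim block of ranks."""
    # Row-major grid of global ranks starting at first_rank.
    grid = [
        [first_rank + r * grid_dim + c for c in range(grid_dim)]
        for r in range(grid_dim)
    ]
    row_groups = [list(row) for row in grid]
    col_groups = [[grid[r][c] for r in range(grid_dim)] for c in range(grid_dim)]
    return row_groups, col_groups
```

For a 2 × 2 grid starting at rank 0, this yields row groups `[0, 1]` and `[2, 3]` and col groups `[0, 2]` and `[1, 3]`.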

class colossalai.context.process_group_initializer.Initializer_3D(*args)

Serve as the single entry point to 3D parallel initialization.

Parameters
  • args – Args used to initialize ProcessGroupInitializer

init_dist_group()

Initialize 3D tensor parallel groups, and assign local_ranks and groups to each GPU.

Returns

3D tensor parallelism’s information

Return type

list of Tuples (local_rank, group_world_size, process_group, ranks_in_group, mode)
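3D tensor parallelism is usually described on a cube of processes, which implies that the tensor parallel size is a perfect cube. A minimal sketch of that check follows; the helper name `cube_dim` is hypothetical, and the requirement is stated here as an assumption rather than taken from this page.

```python
def cube_dim(tensor_parallel_size: int) -> int:
    """Return d such that tensor_parallel_size == d ** 3, or raise."""
    if tensor_parallel_size <= 0:
        raise ValueError("tensor_parallel_size must be positive")
    # round() absorbs floating-point error in the cube root; the check below is exact.
    d = round(tensor_parallel_size ** (1.0 / 3.0))
    if d ** 3 != tensor_parallel_size:
        raise ValueError("tensor_parallel_size must be a perfect cube")
    return d
```

Sizes such as 8, 27, and 64 pass this check, while 12 does not.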

class colossalai.context.process_group_initializer.Initializer_1D(*args, **kwargs)

A ProcessGroupInitializer for 1d tensor parallelism.

init_dist_group()

Initialize 1D tensor parallel groups, and assign local_ranks and groups to each GPU.

Returns

(local_rank, group_world_size, process_group, ranks_in_group, mode)

Return type

Tuple

class colossalai.context.process_group_initializer.ProcessGroupInitializer(rank, world_size, config, data_parallel_size, pipeline_parallel_size, tensor_parallel_size)

An object, knowing the parallelism configuration, that initializes parallel groups.

Parameters
  • rank (int) – The rank of the current process

  • world_size (int) – Size of the whole communication world

  • config (Config) – Running configuration

  • data_parallel_size (int) – Size of data parallel

  • pipeline_parallel_size (int) – Size of pipeline parallel

  • tensor_parallel_size (int) – Size of tensor parallel
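The relationship between this base class and the `(*args, **kwargs)` initializers can be sketched with a toy hierarchy. Everything below is a stand-in: `SketchProcessGroupInitializer` and `SketchTensorInitializer` are hypothetical classes, the process group is replaced by `None`, the mode by a plain string, and the contiguous rank layout is an assumption.

```python
from abc import ABC, abstractmethod


class SketchProcessGroupInitializer(ABC):
    """Toy base class that only stores the parallelism configuration."""

    def __init__(self, rank, world_size, config, data_parallel_size,
                 pipeline_parallel_size, tensor_parallel_size):
        self.rank = rank
        self.world_size = world_size
        self.config = config
        self.data_parallel_size = data_parallel_size
        self.pipeline_parallel_size = pipeline_parallel_size
        self.tensor_parallel_size = tensor_parallel_size

    @abstractmethod
    def init_dist_group(self):
        """Return group information for the current rank."""


class SketchTensorInitializer(SketchProcessGroupInitializer):
    """Toy subclass that forwards all arguments to the base class."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def init_dist_group(self):
        size = self.tensor_parallel_size
        start = (self.rank // size) * size  # first rank of this rank's group
        ranks = list(range(start, start + size))
        # None stands in for a real process group; "tensor" for a real mode value.
        return ranks.index(self.rank), len(ranks), None, ranks, "tensor"
```

Calling `SketchTensorInitializer(5, 8, None, 2, 1, 4).init_dist_group()` returns a five-element tuple in the same order as the documented return type.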

class colossalai.context.process_group_initializer.Initializer_Model(*args, **kwargs)

A ProcessGroupInitializer for model parallelism (model parallel group contains pipeline and tensor parallel groups).

Parameters
  • args – Args used to initialize ProcessGroupInitializer

  • kwargs – Kwargs used to initialize ProcessGroupInitializer

init_dist_group()

Initialize model parallel groups and assign local_ranks and groups to each GPU.

Returns

(local_rank, group_world_size, process_group, ranks_in_group, mode)

Return type

Tuple
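Since a model parallel group contains both pipeline and tensor parallel groups, its size can be taken as the product of the two sizes. The sketch below enumerates such groups as contiguous rank blocks; the contiguous layout and the helper name `model_parallel_groups` are assumptions for illustration.

```python
from typing import List


def model_parallel_groups(
    world_size: int, pipeline_parallel_size: int, tensor_parallel_size: int
) -> List[List[int]]:
    """Enumerate model parallel groups as contiguous rank blocks (assumed layout)."""
    model_parallel_size = pipeline_parallel_size * tensor_parallel_size
    if world_size % model_parallel_size != 0:
        raise ValueError("world_size must be divisible by pipeline * tensor size")
    num_groups = world_size // model_parallel_size
    return [
        [g * model_parallel_size + j for j in range(model_parallel_size)]
        for g in range(num_groups)
    ]
```

With 8 processes, 2 pipeline stages, and a tensor parallel size of 2, each model parallel group spans 4 ranks, giving `[0, 1, 2, 3]` and `[4, 5, 6, 7]`.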