colossalai.communication.collective

colossalai.communication.collective.all_gather(tensor, dim, parallel_mode, async_op=False)[source]

Gathers all tensors from the parallel group and concatenates them along the specified dimension.

Note

The parallel_mode must be a member of ParallelMode. More details about ParallelMode can be found in parallel_mode.

Parameters
  • tensor (torch.Tensor) – Tensor to be gathered.

  • dim (int) – The dimension along which the gathered tensors are concatenated.

  • parallel_mode (colossalai.context.ParallelMode) – Parallel group mode used in this communication.

  • async_op (bool, optional) – Whether operations are asynchronous.

Returns

The result of all-gather only, if async_op is set to False. A tuple of the all-gather output and the async work handle, if async_op is set to True.

Return type

Union[tuple(torch.Tensor, work handle), torch.Tensor]
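
To illustrate the gather-and-concatenate semantics described above, here is a minimal pure-Python sketch. It does not call colossalai or torch: simulate_all_gather is a hypothetical helper that models each rank's tensor as a nested list, and it assumes 2-D inputs only.

```python
from typing import List

Matrix = List[List[int]]


def simulate_all_gather(shards: List[Matrix], dim: int) -> List[Matrix]:
    """Hypothetical helper (not part of colossalai).

    shards[r] is the 2-D tensor held by rank r. Returns the tensor each
    rank would hold after an all-gather that concatenates along `dim`.
    """
    if dim == 0:
        # Stack the rows of every rank's shard on top of each other.
        gathered = [row[:] for shard in shards for row in shard]
    elif dim == 1:
        # Join row i of every shard into one longer row.
        n_rows = len(shards[0])
        gathered = [
            [value for shard in shards for value in shard[i]]
            for i in range(n_rows)
        ]
    else:
        raise ValueError("this sketch only handles 2-D tensors (dim 0 or 1)")
    # Every rank ends up with its own copy of the same gathered tensor.
    return [[row[:] for row in gathered] for _ in shards]


shards = [[[1, 2]], [[3, 4]]]  # rank 0 holds [[1, 2]], rank 1 holds [[3, 4]]
all_gather_dim0 = simulate_all_gather(shards, dim=0)
all_gather_dim1 = simulate_all_gather(shards, dim=1)
```

With dim=0 every rank ends up with a 2x2 tensor. With dim=1 every rank ends up with a 1x4 tensor.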

colossalai.communication.collective.reduce_scatter(tensor, dim, parallel_mode, op=<ReduceOp.SUM: 0>, async_op=False)[source]

Reduces all tensors, then scatters the result along the specified dimension to all members of the parallel group.

Note

The parallel_mode must be a member of ParallelMode. More details about ParallelMode can be found in parallel_mode.

Parameters
  • tensor (torch.Tensor) – Tensor to be reduce_scattered.

  • dim (int) – The dimension along which the reduced tensor is split and scattered.

  • parallel_mode (colossalai.context.ParallelMode) – Parallel group mode used in this communication.

  • op (torch.distributed.ReduceOp, optional) – The type of reduce operation; must be one of [SUM, AVG, PRODUCT, MIN, MAX, BAND, BOR, BXOR]. For more details, refer to ReduceOp.

  • async_op (bool, optional) – Whether operations are asynchronous.

Returns

The result of reduce_scatter only, if async_op is set to False. A tuple of the reduce_scatter output and the async work handle, if async_op is set to True.

Return type

Union[tuple(torch.Tensor, work handle), torch.Tensor]
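
The reduce-then-scatter behaviour can be sketched in plain Python as follows. This is an illustration only, not colossalai code: simulate_reduce_scatter is a hypothetical helper, it assumes 1-D inputs (so the scatter dimension is 0), and it uses SUM as the reduce operation.

```python
from typing import List


def simulate_reduce_scatter(inputs: List[List[int]]) -> List[List[int]]:
    """Hypothetical helper (not part of colossalai).

    inputs[r] is the 1-D tensor held by rank r. The tensors are summed
    element-wise, the sum is split into one equal chunk per rank, and
    rank r keeps chunk r.
    """
    world_size = len(inputs)
    length = len(inputs[0])
    if length % world_size != 0:
        raise ValueError("tensor length must be divisible by the group size")
    # Element-wise SUM across all ranks.
    reduced = [sum(values) for values in zip(*inputs)]
    # Scatter: rank r receives the r-th contiguous chunk.
    chunk = length // world_size
    return [reduced[r * chunk:(r + 1) * chunk] for r in range(world_size)]


inputs = [[1, 2, 3, 4], [10, 20, 30, 40]]  # two ranks, four elements each
reduce_scatter_out = simulate_reduce_scatter(inputs)
```

Each rank's output is smaller than its input by a factor of the group size, the opposite of all_gather.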

colossalai.communication.collective.all_reduce(tensor, parallel_mode, op=<ReduceOp.SUM: 0>, async_op=False)[source]

Reduces the tensor data across the whole parallel group so that every process receives the final result.

Note

The parallel_mode must be a member of ParallelMode. More details about ParallelMode can be found in parallel_mode.

Parameters
  • tensor (torch.Tensor) – Tensor to be all-reduced.

  • parallel_mode (colossalai.context.ParallelMode) – Parallel group mode used in this communication.

  • op (torch.distributed.ReduceOp, optional) – The type of reduce operation; must be one of [SUM, AVG, PRODUCT, MIN, MAX, BAND, BOR, BXOR]. For more details, refer to ReduceOp.

  • async_op (bool, optional) – Whether operations are asynchronous.

Returns

The result of all-reduce only, if async_op is set to False. A tuple of the all-reduce output and the async work handle, if async_op is set to True.

Return type

Union[tuple(torch.Tensor, work handle), torch.Tensor]
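
As a rough illustration of how the op argument changes the result, the sketch below simulates an all-reduce over plain Python lists. It is not colossalai code: simulate_all_reduce and the REDUCE_OPS table are hypothetical, the op names are plain strings rather than torch.distributed.ReduceOp members, and only four of the listed operations are modelled.

```python
import operator
from functools import reduce as fold
from typing import Callable, Dict, List

# Hypothetical stand-ins for a few ReduceOp members.
REDUCE_OPS: Dict[str, Callable[[int, int], int]] = {
    "SUM": operator.add,
    "PRODUCT": operator.mul,
    "MIN": min,
    "MAX": max,
}


def simulate_all_reduce(inputs: List[List[int]], op: str = "SUM") -> List[List[int]]:
    """Hypothetical helper (not part of colossalai).

    inputs[r] is the 1-D tensor held by rank r. Returns what every rank
    holds after the element-wise reduction has been shared with all ranks.
    """
    combine = REDUCE_OPS[op]
    # Reduce position-by-position across ranks.
    reduced = [fold(combine, values) for values in zip(*inputs)]
    # Unlike reduce, every rank receives a copy of the result.
    return [list(reduced) for _ in inputs]


inputs = [[1, 5], [4, 2], [3, 3]]  # three ranks, two elements each
all_reduce_sum = simulate_all_reduce(inputs)
all_reduce_max = simulate_all_reduce(inputs, op="MAX")
```

The output shape matches the input shape on every rank, and all ranks hold identical values afterwards.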

colossalai.communication.collective.broadcast(tensor, src, parallel_mode, async_op=False)[source]

Broadcasts the tensor to the whole parallel group. The tensor must have the same number of elements in all processes participating in the collective.

Note

The parallel_mode must be a member of ParallelMode. More details about ParallelMode can be found in parallel_mode.

Parameters
  • tensor (torch.Tensor) – Tensor to be broadcast.

  • src (int) – Source rank.

  • parallel_mode (colossalai.context.ParallelMode) – Parallel group mode used in this communication.

  • async_op (bool, optional) – Whether operations are asynchronous.

Returns

The broadcast tensor only, if async_op is set to False. A tuple of the broadcast tensor and the async work handle, if async_op is set to True.

Return type

Union[tuple(torch.Tensor, work handle), torch.Tensor]
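
The role of src and the equal-size requirement can be sketched in plain Python. This is an illustration under stated assumptions rather than colossalai code: simulate_broadcast is a hypothetical helper and each rank's buffer is modelled as a flat list.

```python
from typing import List


def simulate_broadcast(buffers: List[List[int]], src: int) -> List[List[int]]:
    """Hypothetical helper (not part of colossalai).

    buffers[r] is the 1-D tensor held by rank r before the call. Returns
    the buffers after rank `src` has broadcast its contents to everyone.
    """
    if not 0 <= src < len(buffers):
        raise ValueError("src must be a valid rank in the group")
    size = len(buffers[src])
    # Every participant must provide a buffer with the same element count.
    if any(len(buf) != size for buf in buffers):
        raise ValueError("all ranks must hold tensors with the same number of elements")
    # Each rank's buffer is overwritten with a copy of the source data.
    return [list(buffers[src]) for _ in buffers]


buffers = [[0, 0, 0], [7, 8, 9], [0, 0, 0]]  # rank 1 owns the real data
broadcast_out = simulate_broadcast(buffers, src=1)
```

Receiving ranks still need a pre-allocated buffer of the right size, which is why the placeholder zeros appear in the example.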

colossalai.communication.collective.reduce(tensor, dst, parallel_mode, op=<ReduceOp.SUM: 0>, async_op=False)[source]

Reduces the tensor across the whole parallel group. Only the process with rank dst receives the final result.

Note

The parallel_mode must be a member of ParallelMode. More details about ParallelMode can be found in parallel_mode.

Parameters
  • tensor (torch.Tensor) – Tensor to be reduced.

  • dst (int) – Destination rank.

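  • parallel_mode (colossalai.context.ParallelMode) – Parallel group mode used in this communication.

  • op (torch.distributed.ReduceOp, optional) – The type of reduce operation; must be one of [SUM, AVG, PRODUCT, MIN, MAX, BAND, BOR, BXOR]. For more details, refer to ReduceOp.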

  • async_op (bool, optional) – Whether operations are asynchronous.

Returns

The result of reduce only, if async_op is set to False. A tuple of the reduce output and the async work handle, if async_op is set to True.

Return type

Union[tuple(torch.Tensor, work handle), torch.Tensor]
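
To contrast reduce with all_reduce, the sketch below simulates a SUM reduce in plain Python. It is not colossalai code: simulate_reduce is a hypothetical helper, and leaving the non-destination buffers untouched is a simplifying assumption of the sketch, since the description above only specifies what rank dst receives.

```python
from typing import List


def simulate_reduce(inputs: List[List[int]], dst: int) -> List[List[int]]:
    """Hypothetical helper (not part of colossalai).

    inputs[r] is the 1-D tensor held by rank r. Returns the buffers after
    a SUM reduce whose result is delivered only to rank `dst`.
    """
    if not 0 <= dst < len(inputs):
        raise ValueError("dst must be a valid rank in the group")
    # Element-wise SUM across all ranks.
    reduced = [sum(values) for values in zip(*inputs)]
    # Assumption of this sketch: other ranks keep their original data.
    return [
        list(reduced) if rank == dst else list(buf)
        for rank, buf in enumerate(inputs)
    ]


inputs = [[1, 2], [3, 4], [5, 6]]  # three ranks, two elements each
reduce_out = simulate_reduce(inputs, dst=0)
```

Only the entry for rank 0 holds the sum. Code that needs the result on every rank would use all_reduce instead.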