colossalai.communication.collective
- colossalai.communication.collective.all_gather(tensor, dim, parallel_mode, async_op=False)[source]
Gathers all tensors from the parallel group and concatenates them along a specific dimension.
Note
The parallel_mode should be a member of ParallelMode. More details about ParallelMode can be found in parallel_mode.
- Parameters
tensor (torch.Tensor) – Tensor to be gathered.
dim (int) – The dimension along which to concatenate.
parallel_mode (colossalai.context.ParallelMode) – Parallel group mode used in this communication.
async_op (bool, optional) – Whether the operation is asynchronous.
- Returns
The result of all-gather only, if async_op is set to False. A tuple of the all-gather output and an async work handle, if async_op is set to True.
- Return type
Union[tuple(torch.Tensor, work handle), torch.Tensor]
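The gather-and-concatenate behaviour described for all_gather can be illustrated with a small pure-Python simulation. This is only a sketch of the semantics: it does not call colossalai or torch, the helper name simulate_all_gather is hypothetical, and nested lists stand in for 2-D tensors held by each rank.

```python
def simulate_all_gather(shards, dim):
    """Return what every rank would hold after an all-gather.

    shards[r] is the 2-D chunk (a list of rows) held by rank r.
    Hypothetical helper for illustration; not part of colossalai.
    """
    if dim == 0:
        # Stack the row blocks of rank 0, rank 1, ... on top of each other.
        gathered = [list(row) for shard in shards for row in shard]
    elif dim == 1:
        # Join the i-th row of every shard side by side.
        gathered = [
            [value for shard in shards for value in shard[i]]
            for i in range(len(shards[0]))
        ]
    else:
        raise ValueError("this sketch only handles 2-D chunks (dim 0 or 1)")
    # Every rank ends up with its own copy of the same concatenated result.
    return [[list(row) for row in gathered] for _ in shards]


shards = [[[1, 2]], [[3, 4]]]  # rank 0 and rank 1 each hold a 1x2 chunk
rows = simulate_all_gather(shards, dim=0)  # each rank: [[1, 2], [3, 4]]
cols = simulate_all_gather(shards, dim=1)  # each rank: [[1, 2, 3, 4]]
```

Changing dim changes only the shape of the gathered result; in both cases every rank receives the same data.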
- colossalai.communication.collective.reduce_scatter(tensor, dim, parallel_mode, op=<ReduceOp.SUM: 0>, async_op=False)[source]
Reduces all tensors, then scatters the result along a specific dimension to all members of the parallel group.
Note
The parallel_mode should be a member of ParallelMode. More details about ParallelMode can be found in parallel_mode.
- Parameters
tensor (torch.Tensor) – Tensor to be reduce-scattered.
dim (int) – The dimension along which to scatter.
parallel_mode (colossalai.context.ParallelMode) – Parallel group mode used in this communication.
op (torch.distributed.ReduceOp, optional) – The type of reduce operation; should be one of [SUM, AVG, PRODUCT, MIN, MAX, BAND, BOR, BXOR]. For more details, refer to ReduceOp.
async_op (bool, optional) – Whether the operation is asynchronous.
- Returns
The result of reduce_scatter only, if async_op is set to False. A tuple of the reduce_scatter output and an async work handle, if async_op is set to True.
- Return type
Union[tuple(torch.Tensor, work handle), torch.Tensor]
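To make the reduce-then-scatter order concrete, here is a minimal pure-Python simulation on 1-D data. It is a sketch of the semantics only: simulate_reduce_scatter is a hypothetical helper, plain lists stand in for tensors, and an ordinary binary function stands in for ReduceOp.

```python
import operator
from functools import reduce as fold


def simulate_reduce_scatter(inputs, op=operator.add):
    """Return the chunk each rank would hold after a reduce-scatter.

    inputs[r] is the full-length 1-D buffer contributed by rank r.
    Hypothetical helper for illustration; not part of colossalai.
    """
    world_size = len(inputs)
    length = len(inputs[0])
    if length % world_size != 0:
        raise ValueError("buffer length must be divisible by the number of ranks")
    # Step 1: reduce element-wise across all ranks.
    reduced = [fold(op, column) for column in zip(*inputs)]
    # Step 2: scatter equal-sized slices, slice r going to rank r.
    chunk = length // world_size
    return [reduced[r * chunk:(r + 1) * chunk] for r in range(world_size)]


inputs = [[1, 2, 3, 4], [10, 20, 30, 40]]  # two ranks, four elements each
outputs = simulate_reduce_scatter(inputs)
# rank 0 holds [11, 22]; rank 1 holds [33, 44]
```

Each rank ends up with only its slice of the reduced result, so the output is smaller than the input by a factor equal to the group size.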
- colossalai.communication.collective.all_reduce(tensor, parallel_mode, op=<ReduceOp.SUM: 0>, async_op=False)[source]
Reduces the tensor data across the whole parallel group so that every process gets the final result.
Note
The parallel_mode should be a member of ParallelMode. More details about ParallelMode can be found in parallel_mode.
- Parameters
tensor (torch.Tensor) – Tensor to be all-reduced.
parallel_mode (colossalai.context.ParallelMode) – Parallel group mode used in this communication.
op (torch.distributed.ReduceOp, optional) – The type of reduce operation; should be one of [SUM, AVG, PRODUCT, MIN, MAX, BAND, BOR, BXOR]. For more details, refer to ReduceOp.
async_op (bool, optional) – Whether the operation is asynchronous.
- Returns
The result of all-reduce only, if async_op is set to False. A tuple of the all-reduce output and an async work handle, if async_op is set to True.
- Return type
Union[tuple(torch.Tensor, work handle), torch.Tensor]
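The "every process gets the final result" property of all_reduce can be sketched with plain Python lists. As before, this is an illustration of the semantics under simplifying assumptions, not the library's code: simulate_all_reduce is a hypothetical name, and binary Python functions stand in for ReduceOp values.

```python
import operator
from functools import reduce as fold


def simulate_all_reduce(inputs, op=operator.add):
    """Return what every rank would hold after an all-reduce.

    inputs[r] is the 1-D buffer contributed by rank r.
    Hypothetical helper for illustration; not part of colossalai.
    """
    # Combine the values at each position across all ranks.
    reduced = [fold(op, column) for column in zip(*inputs)]
    # Unlike reduce, every rank receives a copy of the full result.
    return [list(reduced) for _ in inputs]


inputs = [[1, 2], [3, 4], [5, 6]]  # three ranks, two elements each
summed = simulate_all_reduce(inputs)          # every rank: [9, 12]
maxed = simulate_all_reduce(inputs, op=max)   # every rank: [5, 6]
```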
- colossalai.communication.collective.broadcast(tensor, src, parallel_mode, async_op=False)[source]
Broadcasts the tensor to the whole parallel group. The tensor must have the same number of elements in all processes participating in the collective.
Note
The parallel_mode should be a member of ParallelMode. More details about ParallelMode can be found in parallel_mode.
- Parameters
tensor (torch.Tensor) – Tensor to be broadcast.
src (int) – Source rank.
parallel_mode (colossalai.context.ParallelMode) – Parallel group mode used in this communication.
async_op (bool, optional) – Whether the operation is asynchronous.
- Returns
The broadcast tensor only, if async_op is set to False. A tuple of the broadcast tensor and an async work handle, if async_op is set to True.
- Return type
Union[tuple(torch.Tensor, work handle), torch.Tensor]
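The role of src and the equal-size requirement can be illustrated with a short pure-Python simulation. This is a sketch only: simulate_broadcast is a hypothetical helper, lists stand in for tensors, and the size check is modelled as a ValueError rather than whatever the real backend does.

```python
def simulate_broadcast(buffers, src):
    """Return what every rank would hold after a broadcast from rank src.

    buffers[r] is the 1-D buffer owned by rank r before the call.
    Hypothetical helper for illustration; not part of colossalai.
    """
    if not 0 <= src < len(buffers):
        raise ValueError("src must be a valid rank")
    source = buffers[src]
    # All participating buffers must have the same number of elements.
    if any(len(buffer) != len(source) for buffer in buffers):
        raise ValueError("all ranks need buffers with the same number of elements")
    # Every rank's buffer is overwritten with the source rank's data.
    return [list(source) for _ in buffers]


buffers = [[0, 0], [7, 8], [0, 0]]  # only rank 1 holds meaningful data
result = simulate_broadcast(buffers, src=1)
# every rank now holds [7, 8]
```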
- colossalai.communication.collective.reduce(tensor, dst, parallel_mode, op=<ReduceOp.SUM: 0>, async_op=False)[source]
Reduces tensors across the whole parallel group. Only the process with rank dst receives the final result.
Note
The parallel_mode should be a member of ParallelMode. More details about ParallelMode can be found in parallel_mode.
- Parameters
tensor (torch.Tensor) – Tensor to be reduced.
dst (int) – Destination rank.
parallel_mode (colossalai.context.ParallelMode) – Parallel group mode used in this communication.
op (torch.distributed.ReduceOp, optional) – The type of reduce operation. Defaults to SUM.
async_op (bool, optional) – Whether the operation is asynchronous.
- Returns
The result of reduce only, if async_op is set to False. A tuple of the reduce output and an async work handle, if async_op is set to True.
- Return type
Union[tuple(torch.Tensor, work handle), torch.Tensor]
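The difference between reduce and all_reduce, namely that only rank dst receives the result, can be sketched in pure Python. This is an illustration under stated assumptions: simulate_reduce is a hypothetical helper, and the sketch leaves the buffers of non-destination ranks unchanged, which is a simplification and not a statement about what the real backend leaves in them.

```python
import operator
from functools import reduce as fold


def simulate_reduce(inputs, dst, op=operator.add):
    """Return each rank's buffer after a reduce to rank dst.

    inputs[r] is the 1-D buffer contributed by rank r.
    Hypothetical helper for illustration; not part of colossalai.
    """
    if not 0 <= dst < len(inputs):
        raise ValueError("dst must be a valid rank")
    # Combine the values at each position across all ranks.
    reduced = [fold(op, column) for column in zip(*inputs)]
    # Simplifying assumption: other ranks keep their original data.
    outputs = [list(values) for values in inputs]
    # Only the destination rank receives the reduced result.
    outputs[dst] = reduced
    return outputs


inputs = [[1, 2], [3, 4], [5, 6]]  # three ranks, two elements each
outputs = simulate_reduce(inputs, dst=0)
# rank 0 holds [9, 12]; ranks 1 and 2 still hold their own inputs
```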