colossalai.utils.moe
- colossalai.utils.moe.get_moe_epsize_param_dict(model)
Returns a parameter dictionary keyed by the expert parallel size (ep_size) of each parameter. Because parameters under data parallelism are replicated on every GPU, their ep_size is set to 1.
- Parameters
model (torch.nn.Module) – A PyTorch nn.Module from which the dictionary is built.
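
The grouping described above can be illustrated with a minimal, dependency-free sketch. This is not ColossalAI's implementation: `FakeParam`, its `ep_size` attribute, and `group_params_by_ep_size` are hypothetical names introduced here, and the assumption that each dictionary value is a list of parameters is an illustrative guess rather than something this page states.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class FakeParam:
    """Hypothetical stand-in for a model parameter (not a torch.nn.Parameter)."""
    name: str
    # None marks a replicated (data-parallel) parameter; an int marks an
    # expert parameter sharded across that many expert-parallel ranks.
    ep_size: Optional[int] = None


def group_params_by_ep_size(params: Iterable[FakeParam]) -> Dict[int, List[FakeParam]]:
    """Group parameters by expert parallel size; replicated ones go under key 1."""
    groups: Dict[int, List[FakeParam]] = defaultdict(list)
    for p in params:
        # Assumption: a parameter without an explicit ep_size is replicated,
        # so it is treated as having ep_size == 1.
        ep_size = 1 if p.ep_size is None else p.ep_size
        groups[ep_size].append(p)
    return dict(groups)


params = [
    FakeParam("embedding.weight"),
    FakeParam("experts.w1", ep_size=4),
    FakeParam("experts.w2", ep_size=4),
    FakeParam("norm.weight"),
]
grouped = group_params_by_ep_size(params)
for ep_size, members in sorted(grouped.items()):
    print(ep_size, [p.name for p in members])
```

With the real function, the call would presumably take the form `get_moe_epsize_param_dict(model)` on a `torch.nn.Module`. A grouping of this kind is typically useful when parameters with different ep_size values need to be handled separately, for example during gradient synchronization.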