colossalai.testing

colossalai.testing.parameterize(argument, values)

This decorator simulates the behavior of pytest.mark.parametrize. Because we want to reduce the number of distributed network initializations, this extra decorator is needed on the function launched by torch.multiprocessing.

If a function is wrapped with this decorator, non-parameterized arguments must be passed as keyword arguments; positional arguments are not allowed.

Usage:

# Example 1:
@parameterize('person', ['xavier', 'davis'])
def say_something(person, msg):
    print(f'{person}: {msg}')

say_something(msg='hello')

# This will generate output:
# > xavier: hello
# > davis: hello

# Example 2:
@parameterize('person', ['xavier', 'davis'])
@parameterize('msg', ['hello', 'bye', 'stop'])
def say_something(person, msg):
    print(f'{person}: {msg}')

say_something()

# This will generate output:
# > xavier: hello
# > xavier: bye
# > xavier: stop
# > davis: hello
# > davis: bye
# > davis: stop
Parameters
  • argument (str) – the name of the argument to parameterize

  • values (List[Any]) – a list of values to iterate for this argument
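
The iteration order in Example 2 follows from how stacked decorators nest. The block below is a minimal sketch of that idea, not the library's implementation: parameterize_sketch is a hypothetical stand-in written only with the standard library, and it assumes the wrapper simply loops over values and forwards everything else as keyword arguments.

```python
import functools


def parameterize_sketch(argument, values):
    """Hypothetical stand-in for colossalai.testing.parameterize (assumption)."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(**kwargs):
            # Only keyword arguments are accepted, mirroring the rule that
            # non-parameterized arguments must be passed by keyword.
            for value in values:
                func(**{argument: value}, **kwargs)

        return wrapper

    return decorator


calls = []


@parameterize_sketch('person', ['xavier', 'davis'])
@parameterize_sketch('msg', ['hello', 'bye'])
def say_something(person, msg):
    calls.append(f'{person}: {msg}')


say_something()
print(calls)
```

The outermost decorator drives the outer loop, so every msg value is visited for 'xavier' before 'davis' is tried. Because everything runs inside a single call, a process spawned by torch.multiprocessing can cover all combinations after initializing the distributed environment once.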

colossalai.testing.rerun_on_exception(exception_type=<class 'Exception'>, pattern=None, max_try=5)

A decorator that re-runs the wrapped function when an exception occurs.

Usage:

# rerun for all kinds of exceptions
@rerun_on_exception()
def test_method():
    print('hey')
    raise RuntimeError('Address already in use')

# rerun for RuntimeError only
@rerun_on_exception(exception_type=RuntimeError)
def test_method():
    print('hey')
    raise RuntimeError('Address already in use')

# rerun at most 10 times if a RuntimeError occurs
@rerun_on_exception(exception_type=RuntimeError, max_try=10)
def test_method():
    print('hey')
    raise RuntimeError('Address already in use')

# rerun indefinitely if a RuntimeError occurs
@rerun_on_exception(exception_type=RuntimeError, max_try=None)
def test_method():
    print('hey')
    raise RuntimeError('Address already in use')

# rerun only if the exception message matches the pattern;
# max_try is left at its default of 5
@rerun_on_exception(exception_type=RuntimeError, pattern="^Address.*$")
def test_method():
    print('hey')
    raise RuntimeError('Address already in use')
Parameters
  • exception_type (Exception, Optional) – The type of exception to detect for rerun

  • pattern (str, Optional) – A regular expression to match against the exception message. If pattern is not None, the function is rerun only when the pattern matches the message

  • max_try (int, Optional) – Maximum reruns for this function. The default value is 5. If max_try is None, it will rerun forever if the exception keeps occurring
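
To make the interaction of the three parameters concrete, here is a minimal sketch of a retry decorator with the same signature. It is not the library's implementation: rerun_on_exception_sketch is a hypothetical name, and the sketch assumes that max_try counts total attempts and that pattern is checked with re.search against str(exception).

```python
import functools
import re


def rerun_on_exception_sketch(exception_type=Exception, pattern=None, max_try=5):
    """Hypothetical stand-in for colossalai.testing.rerun_on_exception (assumption)."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            tries = 0
            while True:
                tries += 1
                try:
                    return func(*args, **kwargs)
                except exception_type as exc:
                    # With no pattern, every exception of the given type qualifies.
                    matches = pattern is None or re.search(pattern, str(exc)) is not None
                    # max_try=None means there is no upper bound on attempts.
                    out_of_tries = max_try is not None and tries >= max_try
                    if not matches or out_of_tries:
                        raise

        return wrapper

    return decorator


attempts = []


@rerun_on_exception_sketch(exception_type=RuntimeError, pattern="^Address.*$", max_try=5)
def flaky():
    # Fails on the first two attempts, then succeeds.
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError('Address already in use')
    return 'ok'


result = flaky()
print(result, len(attempts))
```

Exceptions of other types, or messages that do not match the pattern, propagate immediately in this sketch, which keeps genuine test failures from being hidden behind retries.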