API Documentation

nnabla_nas.module

BatchNormalization

class nnabla_nas.module.batchnorm.BatchNormalization(n_features, n_dims, axes=[1], decay_rate=0.9, eps=1e-05, output_stat=False, fix_parameters=False, param_init=None, name='')[source]

Bases: Module

Batch normalization layer.

Parameters:
  • n_features (int) – Number of dimensional features (for a 4-D NCHW input, the number of channels).

  • n_dims (int) – Number of dimensions.

  • axes (tuple of int) – Mean and variance for each element in axes are calculated using elements on the rest axes. For example, if an input is 4 dimensions, and axes is [1], batch mean is calculated as np.mean(inp.d, axis=(0, 2, 3), keepdims=True) (using numpy expression as an example).

  • decay_rate (float, optional) – Decay rate of running mean and variance. Defaults to 0.9.

  • eps (float, optional) – Tiny value to avoid zero division by std. Defaults to 1e-5.

  • output_stat (bool, optional) – Output batch mean and variance. Defaults to False.

  • fix_parameters (bool) – When set to True, the beta and gamma will not be updated.

  • param_init (dict) –

    Parameter initializers can be set with a dict. A key of the dict must be 'beta', 'gamma', 'mean' or 'var'. A value of the dict must be an Initializer or a numpy.ndarray. E.g.:

    {
        'beta': ConstantInitializer(0),
        'gamma': np.ones(gamma_shape) * 2
    }
    

  • name (string) – the name of this module

Returns:

N-D array.

Return type:

Variable

References

Ioffe and Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. https://arxiv.org/abs/1502.03167
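
Examples

A minimal usage sketch (the input shape and feature count here are illustrative assumptions):

>>> import nnabla as nn
>>> from nnabla_nas.module.batchnorm import BatchNormalization
>>> x = nn.Variable((8, 16, 32, 32))   # NCHW input
>>> bn = BatchNormalization(n_features=16, n_dims=4, axes=[1])
>>> y = bn(x)   # normalized over the batch and spatial axes (0, 2, 3)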

call(input)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.

Container

class nnabla_nas.module.container.ModuleList(modules=None)[source]

Bases: Module

Hold submodules in a list. This implementation mainly follows the PyTorch implementation.

Parameters:

modules (iterable, optional) – An iterable of modules to add.

append(module)[source]

Appends a given module to the end of the list.

Parameters:

module (Module) – A module to append.

extend(modules)[source]

Appends modules from a Python iterable to the end of the list.

Parameters:

modules (iterable) – An iterable of modules to append.

extra_format()[source]

Set the submodule representation format.

extra_repr()[source]

Set the extra representation for the module.

insert(index, module)[source]

Insert a given module before a given index in the list.

Parameters:
  • index (int) – The index before which to insert.

  • module (Module) – A module to insert.
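
Examples

A short sketch of the list operations above (the concrete submodules are illustrative):

>>> from nnabla_nas.module.container import ModuleList
>>> from nnabla_nas.module.identity import Identity
>>> from nnabla_nas.module.relu import ReLU
>>> layers = ModuleList([Identity()])
>>> layers.append(ReLU())                  # append a single module
>>> layers.extend([ReLU(), Identity()])    # append several modules
>>> layers.insert(0, ReLU())               # insert before index 0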

class nnabla_nas.module.container.ParameterList(parameters=None)[source]

Bases: Module

Hold parameters in a list.

Parameters:

parameters (iterable, optional) – An iterable of parameters to add.

append(parameter)[source]

Appends a given parameter to the end of the list.

Parameters:

parameter (Parameter) – A parameter to append.

extend(parameters)[source]

Extends an iterable of parameters to the end of the list.

Parameters:

parameters (iterable) – An iterable of Parameters.

extra_format()[source]

Set the submodule representation format.

extra_repr()[source]

Set the extra representation for the module.

insert(index, parameter)[source]

Insert a given parameter before a given index in the list.

Parameters:
  • index (int) – The index before which to insert.

  • parameter (Parameter) – A parameter to insert.

class nnabla_nas.module.container.Sequential(*args)[source]

Bases: ModuleList

A sequential container. Modules will be added to it in the order they are passed in the constructor. Alternatively, an ordered dict of modules can also be passed in.

call(input)[source]

Implement the call of module. Inputs should only be Variables.
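
Examples

A minimal sketch (layer sizes and the input shape are illustrative assumptions):

>>> import nnabla as nn
>>> from nnabla_nas.module.container import Sequential
>>> from nnabla_nas.module.convolution import Conv
>>> from nnabla_nas.module.relu import ReLU
>>> block = Sequential(
>>>     Conv(in_channels=3, out_channels=8, kernel=(3, 3), pad=(1, 1)),
>>>     ReLU())
>>> x = nn.Variable((4, 3, 32, 32))
>>> y = block(x)   # modules are applied in the order they were passed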

Convolution

class nnabla_nas.module.convolution.Conv(in_channels, out_channels, kernel, pad=None, stride=None, dilation=None, group=1, w_init=None, b_init=None, base_axis=1, fix_parameters=False, rng=None, with_bias=True, channel_last=False, name='')[source]

Bases: Module

N-D Convolution layer.

Parameters:
  • in_channels (int) – Number of convolution kernels (which is equal to the number of input channels).

  • out_channels (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.

  • kernel (tuple of int) – Convolution kernel size. For example, to apply convolution on an image with a 3 (height) by 5 (width) two-dimensional kernel, specify (3,5).

  • pad (tuple of int, optional) – Padding sizes for dimensions. Defaults to None.

  • stride (tuple of int, optional) – Stride sizes for dimensions. Defaults to None.

  • dilation (tuple of int, optional) – Dilation sizes for dimensions. Defaults to None.

  • group (int, optional) – Number of groups of channels. This makes connections across channels more sparse by grouping connections along map direction. Defaults to 1.

  • w_init (nnabla.initializer.BaseInitializer or numpy.ndarray, optional) – Initializer for weight. By default, it is initialized with nnabla.initializer.UniformInitializer within the range determined by nnabla.initializer.calc_uniform_lim_glorot.

  • b_init (nnabla.initializer.BaseInitializer or numpy.ndarray, optional) – Initializer for bias. By default, it is initialized with zeros if with_bias is True.

  • base_axis (int, optional) – Dimensions up to base_axis are treated as the sample dimensions. Defaults to 1.

  • fix_parameters (bool, optional) – When set to True, the weights and biases will not be updated. Defaults to False.

  • rng (numpy.random.RandomState, optional) – Random generator for Initializer. Defaults to None.

  • with_bias (bool, optional) – Specify whether to include the bias term. Defaults to True.

  • channel_last (bool, optional) – If True, the last dimension is considered as channel dimension, a.k.a NHWC order. Defaults to False.

  • name (string) – the name of this module
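
Examples

A minimal usage sketch (shapes are illustrative assumptions):

>>> import nnabla as nn
>>> from nnabla_nas.module.convolution import Conv
>>> x = nn.Variable((1, 3, 32, 32))
>>> conv = Conv(in_channels=3, out_channels=16, kernel=(3, 3), pad=(1, 1))
>>> y = conv(x)   # padding (1, 1) preserves the spatial size: (1, 16, 32, 32)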

call(input)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.

class nnabla_nas.module.convolution.DwConv(in_channels, kernel, pad=None, stride=None, dilation=None, multiplier=1, w_init=None, b_init=None, base_axis=1, fix_parameters=False, rng=None, with_bias=True, name='')[source]

Bases: Module

N-D Depthwise Convolution layer.

Parameters:
  • in_channels (int) – Number of convolution kernels (which is equal to the number of input channels).

  • kernel (tuple of int) – Convolution kernel size. For example, to apply convolution on an image with a 3 (height) by 5 (width) two-dimensional kernel, specify (3,5).

  • pad (tuple of int, optional) – Padding sizes for dimensions. Defaults to None.

  • stride (tuple of int, optional) – Stride sizes for dimensions. Defaults to None.

  • dilation (tuple of int, optional) – Dilation sizes for dimensions. Defaults to None.

  • multiplier (int, optional) – Number of output feature maps per input feature map. Defaults to 1.

  • w_init (nnabla.initializer.BaseInitializer or numpy.ndarray, optional) – Initializer for weight. By default, it is initialized with nnabla.initializer.UniformInitializer within the range determined by nnabla.initializer.calc_uniform_lim_glorot.

  • b_init (nnabla.initializer.BaseInitializer or numpy.ndarray, optional) – Initializer for bias. By default, it is initialized with zeros if with_bias is True.

  • base_axis (int, optional) – Dimensions up to base_axis are treated as the sample dimensions. Defaults to 1.

  • fix_parameters (bool, optional) – When set to True, the weights and biases will not be updated. Defaults to False.

  • rng (numpy.random.RandomState, optional) – Random generator for Initializer. Defaults to None.

  • with_bias (bool, optional) – Specify whether to include the bias term. Defaults to True.

  • name (string) – the name of this module

References

  1. Chollet, Francois. “Xception: Deep Learning with Depthwise Separable Convolutions.” https://arxiv.org/abs/1610.02357
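
Examples

A minimal usage sketch (shapes are illustrative assumptions):

>>> import nnabla as nn
>>> from nnabla_nas.module.convolution import DwConv
>>> x = nn.Variable((1, 8, 32, 32))
>>> dw = DwConv(in_channels=8, kernel=(3, 3), pad=(1, 1))
>>> y = dw(x)   # each input channel is convolved with its own kernel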

call(input)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.

Dropout

class nnabla_nas.module.dropout.Dropout(drop_prob=0.5, name='')[source]

Bases: Module

Dropout layer.

During training, randomly zeroes some of the elements of the input tensor with probability p using samples from a Bernoulli distribution. Each channel will be zeroed out independently on every forward call.

Parameters:
  • drop_prob (float, optional) – The probability of an element to be zeroed. Defaults to 0.5.

  • name (string) – the name of this module
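
Examples

A minimal usage sketch (the input shape is an illustrative assumption):

>>> import nnabla as nn
>>> from nnabla_nas.module.dropout import Dropout
>>> x = nn.Variable((16, 128))
>>> drop = Dropout(drop_prob=0.2)
>>> y = drop(x)   # during training, elements are zeroed with probability 0.2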

call(input)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.

Identity

class nnabla_nas.module.identity.Identity(name='')[source]

Bases: Module

Identity layer. A placeholder identity operator that is argument-insensitive.

call(input)[source]

Implement the call of module. Inputs should only be Variables.

Linear

class nnabla_nas.module.linear.Linear(in_features, out_features, base_axis=1, w_init=None, b_init=None, rng=None, bias=True, name='')[source]

Bases: Module

Linear layer. Applies a linear transformation to the incoming data: \(y = xA^T + b\)

Parameters:
  • in_features (int) – The size of each input sample.

  • out_features (int) – The size of each output sample.

  • base_axis (int, optional) – Dimensions up to base_axis are treated as the sample dimensions. Defaults to 1.

  • w_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for weight. By default, it is initialized with nnabla.initializer.UniformInitializer within the range determined by nnabla.initializer.calc_uniform_lim_glorot.

  • b_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for bias. By default, it is initialized with zeros if bias is True.

  • rng (numpy.random.RandomState) – Random generator for Initializer.

  • bias (bool, optional) – Specify whether to include the bias term. Defaults to True.

  • name (string) – the name of this module
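
Examples

A minimal usage sketch (feature sizes are illustrative assumptions):

>>> import nnabla as nn
>>> from nnabla_nas.module.linear import Linear
>>> x = nn.Variable((16, 64))
>>> fc = Linear(in_features=64, out_features=10)
>>> y = fc(x)   # y.shape == (16, 10)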

call(input)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.

Merging

class nnabla_nas.module.merging.Merging(mode, axis=1, name='')[source]

Bases: Module

Merging layer.

Merges a list of NNabla Variables.

Parameters:
  • mode (str) – The merging mode (‘concat’, ‘add’, ‘mul’), where concat indicates that the inputs will be concatenated, add means the element-wise addition, and mul means the element-wise multiplication.

  • axis (int, optional) – The axis for merging when ‘concat’ is used. Defaults to 1.

  • name (string) – the name of this module
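
Examples

A minimal usage sketch (both inputs must be mergeable; the shapes are illustrative):

>>> import nnabla as nn
>>> from nnabla_nas.module.merging import Merging
>>> a = nn.Variable((1, 8, 16, 16))
>>> b = nn.Variable((1, 8, 16, 16))
>>> merge = Merging(mode='concat', axis=1)
>>> y = merge(a, b)   # concatenates along the channel axis: (1, 16, 16, 16)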

call(*inputs)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.

MixedOp

class nnabla_nas.module.mixedop.MixedOp(operators, mode='full', alpha=None, rng=None, name='')[source]

Bases: Module

Mixed Operator layer.

Selects a single operator or a combination of different operators that are allowed in this module.

Parameters:
  • operators (List of Module) – A list of modules.

  • mode (str, optional) – The selection mode for this module (‘sample’, ‘full’, ‘max’, or ‘fair’). Defaults to ‘full’.

  • alpha (Parameter, optional) – The weights used to calculate the evaluation probabilities. Ignored in ‘fair’ mode. Defaults to None.

  • rng (numpy.random.RandomState) – Random generator for random choice.

  • name (string) – the name of this module
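
Examples

A minimal sketch (the candidate operators here are illustrative; all candidates must produce outputs of the same shape so they can be mixed):

>>> import nnabla as nn
>>> from nnabla_nas.module.mixedop import MixedOp
>>> from nnabla_nas.module.convolution import Conv
>>> from nnabla_nas.module.identity import Identity
>>> ops = [Conv(in_channels=8, out_channels=8, kernel=(3, 3), pad=(1, 1)),
>>>        Identity()]
>>> mixed = MixedOp(operators=ops, mode='sample')
>>> x = nn.Variable((1, 8, 32, 32))
>>> y = mixed(x)   # one operator is sampled when the graph is constructed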

property active_index
call(input)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.

class nnabla_nas.module.module.Module(name='')[source]

Bases: object

Module base for all nnabla neural network modules.

Your models should also subclass this class. Modules can also contain other Modules, allowing to nest them in a tree structure.

apply(memo=None, **kargs)[source]

Helper for setting property recursively, then returns self.

calc_latency_all_modules(path, graph, func_latency=None)[source]

Calculate the latency for each of the modules in a graph. The modules are extracted using the graph structure information. The latency is then calculated based on each individual module’s nnabla graph. It also saves the accumulated latency of all modules.

Parameters:
  • path

  • graph

  • func_latency – function used to calculate the latency of each of the modules. This function needs to work based on the graph.

call(*args, **kwargs)[source]

Implement the call of module. Inputs should only be Variables.

convert_npp_to_onnx(path, opset='opset_11')[source]

Finds all nnp files in the given path and its subfolders and converts them to ONNX. For this to run smoothly, nnabla_cli must be installed and added to your Python path.

Parameters:
  • path

  • opset

The actual bash shell command used is:

> find <DIR> -name '*.nnp' -exec echo echo {} \|
  awk -F \. '\{print "nnabla_cli convert -b 1 -d opset_11 "\$0" "\$1"\."\$2"\.onnx"\}' \; | sh | sh

which, for each file found with find, outputs the following:

> echo <FILE>.nnp | awk -F \. '{print "nnabla_cli convert -b 1 -d opset_11 "$0" "$1"."$2".onnx"}'

which, for each file, generates the final conversion command:

> nnabla_cli convert -b 1 -d opset_11 <FILE>.nnp <FILE>.nnp.onnx
extra_format()[source]

Set the submodule representation format.

extra_repr()[source]

Set the extra representation for the module.

get_latency(estimator, active_only=True)[source]

Calculates the latency. This function needs to work based on the graph.

Parameters:
  • estimator – a graph-based estimator

  • active_only – get the latency of active modules only

Returns:

latencies – a list with the latency of each module; accum_lat – the total sum of the latencies of all modules

get_latency_by_mod(estimator, active_only=True)[source]

Note: this function is deprecated; use get_latency(). Calculates the latency. This function needs to work based on the modules.

Parameters:
  • estimator – a module-based estimator

  • active_only – get the latency of active modules only

Returns:

latencies – a list with the latency of each module; accum_lat – the total sum of the latencies of all modules

get_modules(prefix='', memo=None)[source]

Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.

Parameters:
  • prefix (str, optional) – Additional prefix to name modules. Defaults to ‘’.

  • memo (dict, optional) – Memorize all parsed modules. Defaults to None.

Yields:

(str, Module) – a submodule.

get_parameters(grad_only=False)[source]

Return an OrderedDict containing all parameters in the module.

Parameters:

grad_only (bool, optional) – If set to True, only parameters with need_grad=True are returned. Defaults to False.

Returns:

A dictionary containing parameters of module.

Return type:

OrderedDict
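
Examples

A short sketch of inspecting a module's parameters and submodules (the Conv module here is an illustrative choice):

>>> from nnabla_nas.module.convolution import Conv
>>> conv = Conv(in_channels=3, out_channels=8, kernel=(3, 3))
>>> params = conv.get_parameters()                    # all parameters
>>> trainable = conv.get_parameters(grad_only=True)   # only need_grad=True
>>> for name, module in conv.get_modules():
>>>     print(name, type(module).__name__)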

property input_shapes

Return a list of the input shapes used during the call function.

property is_active

Whether the module was called.

load_parameters(path, raise_if_missing=False)[source]

Loads parameters from a file with the specified format.

Parameters:
  • path (str) – Relative path to the parameter file (based on the original working directory).

  • raise_if_missing (bool, optional) – Raise exception if some parameters are missing. Defaults to False.

property modules

Return an OrderedDict containing immediate modules.

property modules_to_profile

Returns a list with the modules that will be profiled when the Profiler/Estimator functions are called. All other modules in the network will not be profiled.

property name

The name of the module.

Returns:

the name of the module

Return type:

string

property need_grad

Whether the module needs gradient.

property parameters

Return an OrderedDict containing immediate parameters.

save_modules_nnp(path, active_only=False, calc_latency=False, func_latency=None)[source]

Saves all modules of the network as individual nnp files, using the folder structure given by the name convention. The modules are extracted going over the module list, not over the graph structure. The latency is then calculated based on each individual module’s nnabla graph (e.g. [LatencyGraphEstimator])

Parameters:
  • path

  • active_only – if True, only active modules are saved

  • calc_latency – flag to calculate the latency

  • func_latency – function used to calculate the latency of each of the extracted modules. This function needs to work based on the graph.

save_modules_nnp_by_mod(path, active_only=False, calc_latency=False, func_latency=None)[source]

Note: this function is deprecated; use save_modules_nnp(). Saves all modules of the network as individual nnp files, using the folder structure given by the name convention. The modules are extracted going over the module list, not over the graph structure. The latency is then calculated using the modules themselves (e.g. [LatencyEstimator])

Parameters:
  • path

  • active_only – if True, only active modules are saved

  • calc_latency – flag to calculate the latency

  • func_latency – function used to calculate the latency of each of the extracted modules. This function needs to work based on the modules.

save_net_nnp(path, inp, out, calc_latency=False, func_real_latency=None, func_accum_latency=None, save_params=None)[source]

Saves the whole net as one nnp file and calculates the whole net’s (real) latency (using e.g. NNabla’s [Profiler]) as well as the layer-based latency. The modules are discovered using the nnabla graph of the whole net; the latency is then calculated based on each individual module’s nnabla graph (e.g. [LatencyGraphEstimator])

Parameters:
  • path – absolute path

  • inp – input of the created network

  • out – output of the created network

  • calc_latency – flag to calculate the latency

  • func_real_latency – function used to calculate the actual latency

  • func_accum_latency – function used to calculate the accumulated latency, that is, dissecting the network layer by layer using the graph of the network, calculating the latency for each layer, and adding up all these results.

save_parameters(path, params=None, grad_only=False)[source]

Saves the parameters to a file.

Parameters:
  • path (str) – Absolute path to file.

  • params (OrderedDict, optional) – An OrderedDict containing parameters. If params is None, then the current parameters will be saved.

  • grad_only (bool, optional) – If need_grad=True is required for parameters which will be saved. Defaults to False.

set_parameters(params, raise_if_missing=False)[source]

Set parameters for the module.

Parameters:
  • params (OrderedDict) – The parameters which will be loaded.

  • raise_if_missing (bool, optional) – Raise exception if some parameters are missing. Defaults to False.

Raises:

ValueError – Parameters are not found.

property training

The training mode of module.

Lambda

class nnabla_nas.module.operation.Lambda(func, name='')[source]

Bases: Module

Lambda module.

This module wraps a NNabla operator.

Parameters:
  • func (nnabla.functions) – A NNabla function.

  • name (string) – the name of this module

call(*args, **kargs)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.

Parameter

class nnabla_nas.module.parameter.Parameter(shape, need_grad=True, initializer=None, scope='')[source]

Bases: Variable

Parameter is a Variable.

A kind of Variable that is to be considered a module parameter. Parameters are Variable subclasses that have a very special property when used with Modules: when they’re assigned as Module attributes, they are automatically added to the list of the module’s parameters.

Parameters:
  • shape (tuple of int) – The shape of Parameter.

  • need_grad (bool, optional) – If the parameter requires gradient. Defaults to True.

  • initializer (nnabla.initializer.BaseInitializer or numpy.ndarray) – An initialization function to be applied to the parameter. numpy.ndarray can also be given to initialize parameters from numpy array data. Defaults to None.
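
Examples

A minimal sketch (the shape and the numpy initializer are illustrative assumptions):

>>> import numpy as np
>>> from nnabla_nas.module.parameter import Parameter
>>> w = Parameter(shape=(8, 4), need_grad=True,
>>>               initializer=np.zeros((8, 4), dtype=np.float32))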

Pooling

class nnabla_nas.module.pooling.AvgPool(kernel, stride=None, ignore_border=True, pad=None, channel_last=False, name='')[source]

Bases: Module

Average pooling layer. It pools the averaged values inside the scanning kernel.

Parameters:
  • kernel (tuple of int) – Kernel sizes for each spatial axis.

  • stride (tuple of int, optional) – Subsampling factors for each spatial axis. Defaults to None.

  • ignore_border (bool) – If False, kernels covering borders are also considered for the output. Defaults to True.

  • pad (tuple of int, optional) – Border padding values for each spatial axis. Padding will be added both sides of the dimension. Defaults to (0,) * len(kernel).

  • channel_last (bool) – If True, the last dimension is considered as channel dimension, a.k.a NHWC order. Defaults to False.

  • name (string) – the name of this module

call(input)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.

class nnabla_nas.module.pooling.GlobalAvgPool(name='')[source]

Bases: Module

Global average pooling layer. It pools an averaged value from the whole image.

Parameters:

name (string) – the name of this module

call(input)[source]

Implement the call of module. Inputs should only be Variables.

class nnabla_nas.module.pooling.MaxPool(kernel, stride=None, pad=None, channel_last=False, name='')[source]

Bases: Module

Max pooling layer. It pools the maximum values inside the scanning kernel.

Parameters:
  • kernel (tuple of int) – Kernel sizes for each spatial axis.

  • stride (tuple of int, optional) – Subsampling factors for each spatial axis. Defaults to None.

  • pad (tuple of int, optional) – Border padding values for each spatial axis. Padding will be added both sides of the dimension. Defaults to (0,) * len(kernel).

  • channel_last (bool) – If True, the last dimension is considered as channel dimension, a.k.a NHWC order. Defaults to False.

  • name (string) – the name of this module

call(input)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.
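
Examples

A minimal usage sketch for the pooling modules (shapes are illustrative assumptions):

>>> import nnabla as nn
>>> from nnabla_nas.module.pooling import AvgPool, GlobalAvgPool, MaxPool
>>> x = nn.Variable((1, 8, 32, 32))
>>> y1 = MaxPool(kernel=(2, 2), stride=(2, 2))(x)   # halves the spatial size
>>> y2 = AvgPool(kernel=(2, 2), stride=(2, 2))(x)
>>> y3 = GlobalAvgPool()(x)                         # pools over the whole image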

Relu

class nnabla_nas.module.relu.LeakyReLU(alpha=0.1, inplace=False, name='')[source]

Bases: Module

LeakyReLU layer. Element-wise Leaky Rectified Linear Unit (ReLU) function.

Parameters:
  • alpha (float, optional) – The slope value multiplied to negative numbers. \(\alpha\) in the definition. Defaults to 0.1.

  • inplace (bool, optional) – can optionally do the operation in-place. Default: False.

  • name (string) – the name of this module

call(input)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.

class nnabla_nas.module.relu.ReLU(inplace=False, name='')[source]

Bases: Module

ReLU layer. Applies the rectified linear unit function element-wise.

Parameters:
  • inplace (bool, optional) – can optionally do the operation in-place. Default: False.

  • name (string) – the name of this module

call(input)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.

class nnabla_nas.module.relu.ReLU6(name='')[source]

Bases: Module

ReLU6 layer. Capping ReLU activation to 6 is often observed to learn sparse features earlier.

Parameters:

name (string) – the name of this module

call(input)[source]

Implement the call of module. Inputs should only be Variables.

Zero

class nnabla_nas.module.zero.Zero(stride: Tuple[int, int] = (1, 1), *, name: str = '')[source]

Bases: Module

Zero layer. A placeholder zero operator that is argument-insensitive.

Parameters:

stride (tuple of int, optional) – Stride sizes for dimensions. Defaults to (1, 1).

call(input)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.

nnabla_nas.module.static

class nnabla_nas.module.static.AvgPool(parents, name='', eval_prob=None, *args, **kwargs)[source]

Bases: AvgPool, Module

The AvgPool module performs average pooling on the output of its parent. It accepts only a single parent.

Parameters:
  • parents (list) – the parents of this module

  • name (string) – the name of this module

  • kernel (tuple of int) – Kernel sizes for each spatial axis.

  • stride (tuple of int, optional) – Subsampling factors for each spatial axis. Defaults to None.

  • pad (tuple of int, optional) – Border padding values for each spatial axis. Padding will be added both sides of the dimension. Defaults to (0,) * len(kernel).

  • channel_last (bool) – If True, the last dimension is considered as channel dimension, a.k.a NHWC order. Defaults to False.

class nnabla_nas.module.static.BatchNormalization(parents, name='', eval_prob=None, *args, **kwargs)[source]

Bases: BatchNormalization, Module

The BatchNormalization module is the static version of nnabla_nas.module.BatchNormalization. It accepts only a single parent.

Parameters:
  • parents (list) – the parents of this module

  • name (string) – the name of this module

  • n_features (int) – Number of dimensional features (for a 4-D NCHW input, the number of channels).

  • n_dims (int) – Number of dimensions.

  • axes (tuple of int) – Mean and variance for each element in axes are calculated using elements on the rest axes. For example, if an input is 4 dimensions, and axes is [1], batch mean is calculated as np.mean(inp.d, axis=(0, 2, 3), keepdims=True) (using numpy expression as an example).

  • decay_rate (float, optional) – Decay rate of running mean and variance. Defaults to 0.9.

  • eps (float, optional) – Tiny value to avoid zero division by std. Defaults to 1e-5.

  • output_stat (bool, optional) – Output batch mean and variance. Defaults to False.

  • fix_parameters (bool) – When set to True, the beta and gamma will not be updated.

  • param_init (dict) –

    Parameter initializers can be set with a dict. A key of the dict must be 'beta', 'gamma', 'mean' or 'var'. A value of the dict must be an Initializer or a numpy.ndarray. E.g.:

    {
        'beta': ConstantInitializer(0),
        'gamma': np.ones(gamma_shape) * 2
    }
    

Returns:

N-D array.

Return type:

Variable

References

Ioffe and Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. https://arxiv.org/abs/1502.03167

class nnabla_nas.module.static.Collapse(parents, name='')[source]

Bases: Module

The Collapse module removes the last two singleton dimensions of a 4D input. It accepts only a single parent.

Parameters:
  • parents (list) – the parents of this module

  • name (string) – the name of this module

call(*inputs)[source]

The input to output mapping of the module. Given some inputs, it constructs the computational graph of this module. This method must be implemented for custom modules.

Parameters:

*input – the output of the parents

Returns:

the output of the module

Return type:

nnabla variable

Examples

>>> out = my_module(inp_a, inp_b)
class nnabla_nas.module.static.Conv(parents, name='', eval_prob=None, *args, **kwargs)[source]

Bases: Conv, Module

The Conv module performs a convolution on the output of its parent. It accepts only a single parent.

Parameters:
  • parents (list) – the parents of this module

  • name (string) – the name of this module

  • in_channels (int) – Number of convolution kernels (which is equal to the number of input channels).

  • out_channels (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.

  • kernel (tuple of int) – Convolution kernel size. For example, to apply convolution on an image with a 3 (height) by 5 (width) two-dimensional kernel, specify (3,5).

  • pad (tuple of int, optional) – Padding sizes for dimensions. Defaults to None.

  • stride (tuple of int, optional) – Stride sizes for dimensions. Defaults to None.

  • dilation (tuple of int, optional) – Dilation sizes for dimensions. Defaults to None.

  • group (int, optional) – Number of groups of channels. This makes connections across channels more sparse by grouping connections along map direction. Defaults to 1.

  • w_init (nnabla.initializer.BaseInitializer or numpy.ndarray, optional) – Initializer for weight. By default, it is initialized with nnabla.initializer.UniformInitializer within the range determined by nnabla.initializer.calc_uniform_lim_glorot.

  • b_init (nnabla.initializer.BaseInitializer or numpy.ndarray, optional) – Initializer for bias. By default, it is initialized with zeros if with_bias is True.

  • base_axis (int, optional) – Dimensions up to base_axis are treated as the sample dimensions. Defaults to 1.

  • fix_parameters (bool, optional) – When set to True, the weights and biases will not be updated. Defaults to False.

  • rng (numpy.random.RandomState, optional) – Random generator for Initializer. Defaults to None.

  • with_bias (bool, optional) – Specify whether to include the bias term. Defaults to True.

  • channel_last (bool, optional) – If True, the last dimension is considered as channel dimension, a.k.a NHWC order. Defaults to False.

class nnabla_nas.module.static.Dropout(parents, name='', *args, **kwargs)[source]

Bases: Dropout, Module

The Dropout module is the static version of nnabla_nas.module.Dropout. It accepts only a single parent.

Parameters:
  • parents (list) – the parents of this module

  • name (string) – the name of this module

  • drop_prob (float, optional) – The probability of an element to be zeroed. Defaults to 0.5.

class nnabla_nas.module.static.DwConv(parents, name='', eval_prob=None, *args, **kwargs)[source]

Bases: DwConv, Module

The DwConv module performs a depthwise convolution on the output of its parent. It accepts only a single parent.

Parameters:
  • parents (list) – the parents of this module

  • name (string) – the name of this module

  • in_channels (int) – Number of convolution kernels (which is equal to the number of input channels).

  • kernel (tuple of int) – Convolution kernel size. For example, to apply convolution on an image with a 3 (height) by 5 (width) two-dimensional kernel, specify (3,5).

  • pad (tuple of int, optional) – Padding sizes for dimensions. Defaults to None.

  • stride (tuple of int, optional) – Stride sizes for dimensions. Defaults to None.

  • dilation (tuple of int, optional) – Dilation sizes for dimensions. Defaults to None.

  • multiplier (int, optional) – Number of output feature maps per input feature map. Defaults to 1.

  • w_init (nnabla.initializer.BaseInitializer or numpy.ndarray, optional) – Initializer for weight. By default, it is initialized with nnabla.initializer.UniformInitializer within the range determined by nnabla.initializer.calc_uniform_lim_glorot.

  • b_init (nnabla.initializer.BaseInitializer or numpy.ndarray, optional) – Initializer for bias. By default, it is initialized with zeros if with_bias is True.

  • base_axis (int, optional) – Dimensions up to base_axis are treated as the sample dimensions. Defaults to 1.

  • fix_parameters (bool, optional) – When set to True, the weights and biases will not be updated. Defaults to False.

  • rng (numpy.random.RandomState, optional) – Random generator for Initializer. Defaults to None.

  • with_bias (bool, optional) – Specify whether to include the bias term. Defaults to True.

References

  1. Chollet, Francois. “Xception: Deep Learning with Depthwise Separable Convolutions.” https://arxiv.org/abs/1610.02357

class nnabla_nas.module.static.GlobalAvgPool(parents, name='', eval_prob=None, *args, **kwargs)[source]

Bases: GlobalAvgPool, Module

The GlobalAvgPool module performs global average pooling on the output of its parent. It accepts only a single parent.

Parameters:
  • parents (list) – the parents of this module

  • name (string) – the name of this module

class nnabla_nas.module.static.Graph(parents=[], name='', eval_prob=None, *args, **kwargs)[source]

Bases: ModuleList, Module

The static version of nnabla_nas.module.ModuleList. A Graph which can contain many modules. A graph can also be used as a module within another graph. Any graph must define self._output, i.e. the StaticModule which acts as the output node of this graph.

get_gv_graph(active_only=True, color_map={<class 'nnabla_nas.module.static.static_module.Join'>: 'blue', <class 'nnabla_nas.module.static.static_module.Merging'>: 'green', <class 'nnabla_nas.module.static.static_module.Zero'>: 'red'})[source]

Construct a graphviz graph object that can be used to visualize the graph.

Parameters:
  • active_only (bool) – whether or not to add inactive modules, i.e., modules which are not part of the computational graph

  • color_map (dict) – the mapping from class to vertex color used to visualize the graph.

property output

The output module of this module. If the module is not a graph, it will return self.

Returns:

the output module

Return type:

Module

reset_value()[source]

Resets all self._value, self.need_grad flags and self.shapes

property shape

The output shape of the graph, i.e., the shape of its output module.

class nnabla_nas.module.static.Identity(parents, name='', eval_prob=None, *args, **kwargs)[source]

Bases: Identity, Module

The Identity module does not alter the input. It accepts only a single parent.

Parameters:
  • parents (list) – the parents of this module

  • name (string) – the name of this module

Examples

>>> import nnabla as nn
>>> from nnabla_nas.module import static as smo
>>>
>>> input = nn.Variable((10, 3, 32, 32))
>>>
>>> inp_module = smo.Input(value=input)
>>> identity = smo.Identity(parents=[inp_module])
class nnabla_nas.module.static.Input(value=None, name='', eval_prob=None, *args, **kwargs)[source]

Bases: Module

A static module that can serve as an input, i.e., it has no parents but is provided with a value which it can pass to its children.

Parameters:
  • parents (list) – the parents of this module

  • name (string) – the name of this module

  • value (nnabla variable) – the nnabla variable which serves as the input value

Examples

>>> import nnabla as nn
>>> from nnabla_nas.module import static as smo
>>> input = nn.Variable((10, 3, 32, 32))
>>> inp_module = smo.Input(value=input)
call(*inputs)[source]

The input module returns the plain input variable.

reset_value()[source]

The input module does not reset its value.

property value
class nnabla_nas.module.static.Join(parents, join_parameters, name='', mode='linear', *args, **kwargs)[source]

Bases: Module

The Join module is used to fuse the output of multiple parents. It can either superpose them linearly, sample one of the inputs, or select the most probable input. It accepts multiple parents. However, the output of all parents must have the same shape.

Parameters:
  • join_parameters (nnabla variable) – a vector containing unnormalized categorical probabilities. It must have the same number of elements as the module has parents. The selection probability of each parent is calculated, using the softmax function.

  • mode (string) – can be ‘linear’/’sample’/’max’. Determines how Join combines the output of the parents.

call(*input)[source]

Aggregates all input tensors into one single tensor (summing them up).

property mode
class nnabla_nas.module.static.LeakyReLU(parents, name='', eval_prob=None, *args, **kwargs)[source]

Bases: LeakyReLU, Module

The LeakyReLU module is the static version of nnabla_nas.module.LeakyReLU. It accepts only a single parent.

Parameters:
  • parents (list) – the parents of this module

  • name (string) – the name of this module

  • inplace (bool, optional) – can optionally do the operation in-place. Default: False.

class nnabla_nas.module.static.Linear(parents, name='', *args, **kwargs)[source]

Bases: Linear, Module

The Linear module performs an affine transformation on the output of its parent. It accepts only a single parent.

Parameters:
  • parents (list) – the parents of this module

  • name (string) – the name of this module

  • in_features (int) – The size of each input sample.

  • out_features (int) – The size of each output sample.

  • base_axis (int, optional) – Dimensions up to base_axis are treated as the sample dimensions. Defaults to 1.

  • w_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for weight. By default, it is initialized with nnabla.initializer.UniformInitializer within the range determined by nnabla.initializer.calc_uniform_lim_glorot.

  • b_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for bias. By default, it is initialized with zeros if bias is True.

  • rng (numpy.random.RandomState) – Random generator for Initializer.

  • bias (bool, optional) – Specify whether to include the bias term. Defaults to True.

class nnabla_nas.module.static.MaxPool(parents, name='', eval_prob=None, *args, **kwargs)[source]

Bases: MaxPool, Module

The MaxPool module performs max pooling on the output of its parent. It accepts only a single parent.

Parameters:
  • parents (list) – the parents of this module

  • name (string) – the name of this module

  • kernel (tuple of int) – Kernel sizes for each spatial axis.

  • stride (tuple of int, optional) – Subsampling factors for each spatial axis. Defaults to None.

  • pad (tuple of int, optional) – Border padding values for each spatial axis. Padding will be added both sides of the dimension. Defaults to (0,) * len(kernel).

  • channel_last (bool) – If True, the last dimension is considered as channel dimension, a.k.a NHWC order. Defaults to False.

class nnabla_nas.module.static.Merging(parents, mode, name='', eval_prob=None, axis=1)[source]

Bases: Merging, Module

The Merging module is the static version of nnabla_nas.module.Merging. It accepts multiple parents.

Parameters:
  • parents (list) – the parents of this module

  • name (string) – the name of this module

  • mode (str) – The merging mode (‘concat’, ‘add’).

  • axis (int, optional) – The axis for merging when ‘concat’ is used. Defaults to 1.

class nnabla_nas.module.static.Module(parents=[], name='', eval_prob=None, *args, **kwargs)[source]

Bases: Module

A static module is a module that encodes the graph structure, i.e., it has parents and children. Static modules can be used to define graphs that can run simple graph optimizations when constructing the nnabla graph.

Parameters:
  • parents (list) – a list of static modules that are parents to this module

  • name (string, optional) – the name of the module

  • eval_prob (nnabla variable, optional) – the evaluation probability of this module

Examples

>>> from nnabla_nas.module import static as smo
>>> from nnabla_nas import module as mo
>>> class MyModule(smo.Module):
>>>     def __init__(self, parents, name=''):
>>>         smo.Module.__init__(self, parents=parents, name=name)
>>>         self.linear = mo.Linear(in_features=5, out_features=3)
>>>
>>>     def call(self, *input):
>>>         return self.linear(*input)
>>>
>>> module_1 = smo.Module(name='module_1')
>>> module_2 = MyModule(parents=[module_1], name='module_2')
add_child(child)[source]

Adds a static_module as a child to self

Parameters:

child (static_module) – the module to add as a child

call(*inputs)[source]

The input to output mapping of the module. Given some inputs, it constructs the computational graph of this module. This method must be implemented for custom modules.

Parameters:

*input – the output of the parents

Returns:

the output of the module

Return type:

nnabla variable

Examples

>>> out = my_module(inp_a, inp_b)
property children

The child modules

Returns:

the children of the module

Return type:

list

property eval_prob

The evaluation probability of this module. It is 1.0 if not specified otherwise.

Returns:

the evaluation probability

Return type:

nnabla variable

property input_shapes

A list of input shapes of this module, i.e., the output shapes of all parent modules.

Returns:

a list of tuples storing the

output shape of all parent modules

Return type:

list

property name

The name of the module.

Returns:

the name of the module

Return type:

string

property output

The output module of this module. If the module is not a graph, it will return self.

Returns:

the output module

Return type:

Module

property parents

The parents of the module

Returns:

the parents of the module

Return type:

list

reset_value()[source]

Resets all self._value, self.need_grad flags and self.shapes

property shape

The output shape of the static_module.

Returns:

the shape of the output tensor

Return type:

tuple

class nnabla_nas.module.static.ReLU(parents, name='', eval_prob=None, *args, **kwargs)[source]

Bases: ReLU, Module

The ReLU module is the static version of nnabla_nas.module.ReLU. It accepts only a single parent.

Parameters:
  • parents (list) – the parents of this module

  • name (string) – the name of this module

  • inplace (bool, optional) – can optionally do the operation in-place. Default: False.

class nnabla_nas.module.static.ReLU6(parents, name='', eval_prob=None, *args, **kwargs)[source]

Bases: ReLU6, Module

The ReLU6 module is the static version of nnabla_nas.module.ReLU6. It accepts only a single parent.

Parameters:
  • parents (list) – the parents of this module

  • name (string) – the name of this module

  • inplace (bool, optional) – can optionally do the operation in-place. Default: False.

class nnabla_nas.module.static.Zero(parents, name='', eval_prob=None, *args, **kwargs)[source]

Bases: Zero, Module

The Zero module returns a tensor with zeros, which has the same shape as the output of its parent. It accepts only a single parent.

Parameters:
  • parents (list) – the parents of this module

  • name (string) – the name of this module

Examples

>>> my_module = Zero(parents=[...], name='my_module')
call(*inputs)[source]

The input to output mapping of the module. Given some inputs, it constructs the computational graph of this module. This method must be implemented for custom modules.

Parameters:

*input – the output of the parents

Returns:

the output of the module

Return type:

nnabla variable

Examples

>>> out = my_module(inp_a, inp_b)

nnabla_nas.runner

class nnabla_nas.runner.runner.Runner(model, optimizer, regularizer, dataloader, hparams, args)[source]

Bases: ABC

Runner is a basic class for training a model.

You can adapt this class for your own runner by reimplementing the abstract methods of this class.

Parameters:
  • model (nnabla_nas.contrib.model.Model) – The search model used to search the architecture.

  • optimizer (dict) – This stores optimizers for both train and valid graphs. It must only store instances of Optimizer.

  • regularizer (dict) – This stores regularizers such as the latency and memory estimators.

  • dataloader (dict) – This stores dataloaders for both train and valid graphs.

  • hparams (Configuration) – This stores all hyperparameters used during training.

  • args (Configuration) – This stores other variables used during training: event, communicator, output_path…

abstract callback_on_epoch_end()[source]

Calls this after one epoch.

abstract callback_on_finish()[source]

Calls this on finishing the run method.

abstract callback_on_start()[source]

Calls this on starting the run method.

property fast_mode
load_checkpoint()[source]
abstract run()[source]

Run the training process.

save_checkpoint(checkpoint_info={})[source]

Save the current states of the runner.

abstract train_on_batch(key='train')[source]

Runs the model update on a single batch of train data.

update_graph(key='train')[source]

Builds the graph and update the placeholder.

Parameters:

key (str, optional) – Type of graph. Defaults to ‘train’.

abstract valid_on_batch()[source]

Runs the model update on a single batch of valid data.

Searcher

class nnabla_nas.runner.searcher.search.Searcher(model, optimizer, regularizer, dataloader, hparams, args)[source]

Bases: Runner

Searching the best architecture.

callback_on_epoch_end()[source]

Calls this after one epoch.

callback_on_finish()[source]

Calls this on finishing the training.

callback_on_start()[source]

Calls this on starting the training.

run()[source]

Run the training process.

DartsSearcher

class nnabla_nas.runner.searcher.darts.DartsSearcher(model, optimizer, regularizer, dataloader, hparams, args)[source]

Bases: Searcher

An implementation of DARTS: Differentiable Architecture Search.

callback_on_start()[source]

Builds the graphs and assigns parameters to the optimizers.

train_on_batch(key='train')[source]

Updates the model parameters.

valid_on_batch()[source]

Updates the architecture parameters.

ProxylessNasSearcher

class nnabla_nas.runner.searcher.pnas.ProxylessNasSearcher(model, optimizer, regularizer, dataloader, hparams, args)[source]

Bases: Searcher

ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware.

callback_on_start()[source]

Gets the architecture parameters.

train_on_batch(key='train')[source]

Update the model parameters.

valid_on_batch()[source]

Update the arch parameters.

FairNasSearcher

class nnabla_nas.runner.searcher.fairnas.FairNasSearcher(model, optimizer, regularizer, dataloader, hparams, args)[source]

Bases: Searcher

An implementation of FairNAS.

callback_on_epoch_end()[source]

Calls this after one epoch.

callback_on_finish()[source]

Calls this on finishing the training.

callback_on_start()[source]

Calls this on starting the training.

run()[source]

Run the training process.

search_arch(sample_id=0)[source]

Validates an architecture from the search space.

train_on_batch()[source]

Update the model parameters.

valid_on_batch(key='valid')[source]

Validates an architecture from the search space.

OFASearcher

class nnabla_nas.runner.searcher.ofa.OFASearcher(model, optimizer, regularizer, dataloader, hparams, args)[source]

Bases: Searcher

An implementation of OFA.

callback_on_epoch_end(epoch=None, is_test=False, info=None)[source]

Calls this after one epoch.

callback_on_finish()[source]

Calls this on finishing the training.

callback_on_start()[source]

Calls this on starting the training.

get_net_parameters_with_keys(keys, mode='include', grad_only=False)[source]

Returns an OrderedDict containing model parameters.

Parameters:
  • keys (list of str) – Patterns of parameters to be considered for inclusion or exclusion. Note: Keys passed must be in regular expression format.

  • mode (str, optional) – Mode of getting network parameters with keys: selects parameters satisfying the keys if mode==‘include’; selects parameters not satisfying the keys if mode==‘exclude’. Choices: [‘include’, ‘exclude’]. Defaults to ‘include’.

  • grad_only (bool, optional) – If set to True, then only parameters with need_grad=True are returned. Defaults to False.

Returns:

A dictionary containing parameters.

Return type:

OrderedDict

reset_running_statistics(net=None, subset_size=2000, subset_batch_size=200, dataloader=None, dataloader_batch_size=None, inp_shape=None)[source]
run()[source]

Run the training process.

train_on_batch(epoch, n_iter, key='train')[source]

Update the model parameters.

update_graph(key='train')[source]

Builds the graph and update the placeholder.

Parameters:

key (str, optional) – Type of graph. Defaults to ‘train’.

valid_genotypes(mode='valid')[source]
valid_on_batch(is_test=False)[source]

Updates the architecture parameters.

Trainer

class nnabla_nas.runner.trainer.OFATrainer(model, optimizer, regularizer, dataloader, hparams, args)[source]

Bases: Runner

Trainer class for OFA

callback_on_epoch_end()[source]

Calculates the metric and saves the best parameters.

callback_on_finish()[source]

Calls this on finishing the run method.

callback_on_start()[source]

Builds the graphs and assigns parameters to the optimizers.

get_net_parameters_with_keys(keys, mode='include', grad_only=False)[source]
reset_running_statistics()[source]
run()[source]

Run the training process.

train_on_batch(key='train')[source]

Updates the model parameters.

update_graph(key)[source]

Builds the graph and update the placeholder.

Parameters:

key (str, optional) – Type of graph. Defaults to ‘train’.

valid_on_batch()[source]

Runs the model update on a single batch of valid data.

class nnabla_nas.runner.trainer.Trainer(model, optimizer, regularizer, dataloader, hparams, args)[source]

Bases: Runner

Trainer class is a basic class for training a network.

callback_on_epoch_end()[source]

Calculates the metric and saves the best parameters.

callback_on_finish()[source]

Calls this on finishing the run method.

callback_on_start()[source]

Builds the graphs and assigns parameters to the optimizers.

run()[source]

Run the training process.

train_on_batch(key='train')[source]

Updates the model parameters.

valid_on_batch()[source]

Runs the validation.

nnabla_nas.utils

Profiler

nnabla_nas.utils.data.transforms.CIFAR10_transform(key='train')[source]

Return a transform applied to data augmentation for CIFAR10.

class nnabla_nas.utils.data.transforms.CenterCrop[source]

Bases: object

class nnabla_nas.utils.data.transforms.Compose(transforms)[source]

Bases: object

Composes several transforms together.

Parameters:

transforms (list of Transform objects) – list of transforms to compose.

append(transform)[source]

Appends a transform to the end.

Parameters:

transform (Transform) – The transform to append.
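
Examples

A short sketch of composing transforms (the normalization statistics and scale are illustrative assumptions, not values prescribed by the library):

>>> from nnabla_nas.utils.data.transforms import (
>>>     Compose, Normalize, RandomHorizontalFlip)
>>> transform = Compose([
>>>     RandomHorizontalFlip(),
>>>     Normalize(mean=(0.49, 0.48, 0.44), std=(0.2, 0.2, 0.2), scale=1.0 / 255)])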

class nnabla_nas.utils.data.transforms.Cutout(length, prob=0.5, seed=-1)[source]

Bases: object

Cutout layer.

Cutout is a simple regularization technique for convolutional neural networks that involves removing contiguous sections of input images, effectively augmenting the dataset with partially occluded versions of existing samples.

Parameters:
  • length (int) – The length of the region which will be cut out.

  • prob (float, optional) – Probability of erasing. Defaults to 0.5.

References

[1] DeVries, Terrance, and Graham W. Taylor. “Improved regularization of convolutional neural networks with cutout.” arXiv preprint arXiv:1708.04552 (2017).

nnabla_nas.utils.data.transforms.ImageNet_transform(key='train')[source]

Return a transform applied to data augmentation for ImageNet.

class nnabla_nas.utils.data.transforms.Lambda(func)[source]

Bases: object

Apply a user-defined lambda as a transform.

Parameters:

func (function) – Lambda/function to be used for transform.

class nnabla_nas.utils.data.transforms.Normalize(mean, std, scale)[source]

Bases: object

Normalizes an input image with mean and standard deviation.

Given mean (M1,...,Mn) and std (S1,...,Sn) for n channels, this transform will normalize each channel of the input image, i.e. input[channel] = (input[channel] - mean[channel]) / std[channel]

Parameters:
  • mean (sequence) – Sequence of means for each channel.

  • std (sequence) – Sequence of standard deviations for each channel.

  • scale (float) – Scales the inputs by a scalar.

class nnabla_nas.utils.data.transforms.RandomCrop(shape, pad_width=None)[source]

Bases: object

RandomCrop randomly extracts a portion of an array.

Parameters:
  • shape (tuple of int) – The output image shape.

  • pad_width (tuple of int, optional) – Iterable of before and after pad values. Defaults to None. Pad the input N-D array x over the number of dimensions given by half the length of the pad_width iterable, where every two values in pad_width determine the before and after pad size of an axis. The pad_width iterable must hold an even number of positive values which may cover all or fewer dimensions of the input variable x.

class nnabla_nas.utils.data.transforms.RandomHorizontalFlip[source]

Bases: object

Horizontally flip the given image randomly with a probability of 0.5.

class nnabla_nas.utils.data.transforms.RandomResizedCrop(shape, scale=None, ratio=None, interpolation='linear')[source]

Bases: object

Crop a random portion of image and resize it.

Parameters:
  • shape (tuple of int) – The output image shape.

  • scale (tuple of float) – lower and upper scale ratio when randomly scaling the image.

  • ratio (float) – The aspect ratio range when randomly deforming the image. For example, to deform aspect ratio of image from 1:1.3 to 1.3:1, specify “1.3”. To not apply random deforming, specify “1.0”.

  • interpolation (str) – Interpolation mode chosen from (‘linear’|’nearest’). The default is ‘linear’.

class nnabla_nas.utils.data.transforms.RandomRotation[source]

Bases: object

class nnabla_nas.utils.data.transforms.RandomVerticalFlip[source]

Bases: object

Vertically flip the given PIL Image randomly with a probability of 0.5.

class nnabla_nas.utils.data.transforms.Resize(size, interpolation='linear')[source]

Bases: object

Resize an ND array with interpolation.

Parameters:
  • size (tuple of int) – The output sizes for axes. If this is given, the scale factors are determined by the output sizes and the input sizes.

  • interpolation (str) – Interpolation mode chosen from (‘linear’|’nearest’). The default is ‘linear’.

nnabla_nas.utils.data.transforms.none_transform(key='train')[source]

Return a null transform (passthrough; no transformation is done) applied to data augmentation.

nnabla_nas.utils.data.transforms.normalize_0mean_1std_8bitscaling_transform(key='train')[source]

Return a zero-mean, unit-std normalization and 8-bit scaling transform applied to data augmentation.

Estimator

class nnabla_nas.utils.estimator.estimator.Estimator[source]

Bases: object

Estimator base class.

get_estimation(module)[source]

Returns the estimation of the whole module.

property memo
predict(module)[source]

Predicts the estimation for a given module.

reset()[source]

Clear cache.
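
Examples

A hypothetical sketch of a concrete estimator, assuming predict() is the method a subclass overrides; the parameter-counting logic is purely illustrative and not part of the library:

>>> from nnabla_nas.utils.estimator.estimator import Estimator
>>> class ParamCountEstimator(Estimator):
>>>     def predict(self, module):
>>>         # sum the number of elements over the module's immediate parameters
>>>         return sum(p.size for p in module.parameters.values())
>>> estimator = ParamCountEstimator()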

SummaryWriter

class nnabla_nas.utils.tensorboard.writer.FileWriter(log_dir, max_queue=10, flush_secs=120, filename_suffix='')[source]

Bases: object

Write protocol buffers to event files.

Parameters:
  • log_dir (str) – Directory where event file will be written.

  • max_queue (int, optional) – Size of the queue for pending events and summaries before one of the ‘add’ calls forces a flush to disk. Defaults to 10.

  • flush_secs (int, optional) – How often, in seconds, to flush the pending events and summaries to disk. Defaults to every two minutes (120s).

  • filename_suffix (str, optional) – Suffix added to all event filenames in the log_dir directory.

add_event(event, step=None, walltime=None)[source]

Adds an event to the event file.

Parameters:
  • event – An Event protocol buffer.

  • step (int, optional) – Optional global step value for training process to record with the event.

  • walltime (float, optional) – Optional walltime to override the default (current) walltime (from time.time()) seconds after epoch.

add_graph(graph_profile, walltime=None)[source]

Adds a Graph and step stats protocol buffer to the event file.

Parameters:
  • graph_profile – A Graph and step stats protocol buffer.

  • walltime (float, optional) – Optional walltime to override the default (current) walltime (from time.time()) seconds after epoch.

add_summary(summary, global_step=None, walltime=None)[source]

Adds a Summary protocol buffer to the event file.

Parameters:
  • summary – A Summary protocol buffer.

  • global_step (int, optional) – Optional global step value for training process to record with the summary.

  • walltime (float, optional) – Optional walltime to override the default (current) walltime (from time.time()) seconds after epoch.

close()[source]

Flushes the event file to disk and close the file.

flush()[source]

Flushes the event file to disk.

get_logdir()[source]

Returns the directory where event file will be written.

reopen()[source]

Reopens the EventFileWriter.

class nnabla_nas.utils.tensorboard.writer.SummaryWriter(log_dir=None, comment='', purge_step=None, max_queue=10, flush_secs=120, filename_suffix='')[source]

Bases: object

Creates a SummaryWriter that will write out events and summaries to the event file.

Parameters:
  • log_dir (string) – Save directory location. Default is runs/CURRENT_DATETIME_HOSTNAME, which changes after each run. Use hierarchical folder structure to compare between runs easily. e.g. pass in ‘runs/exp1’, ‘runs/exp2’, etc. for each new experiment to compare across them.

  • comment (string) – Comment log_dir suffix appended to the default log_dir. If log_dir is assigned, this argument has no effect.

  • purge_step (int) – Note that crashed and resumed experiments should have the same log_dir.

  • max_queue (int) – Size of the queue for pending events and summaries before one of the ‘add’ calls forces a flush to disk. Default is ten items.

  • flush_secs (int) – How often, in seconds, to flush the pending events and summaries to disk. Default is every two minutes.

  • filename_suffix (string) – Suffix added to all event filenames in the log_dir directory. More details on filename construction in tensorboard.summary.writer.event_file_writer.EventFileWriter.

add_graph(model, *args, **kargs)[source]
add_image(tag, img, global_step=None, walltime=None)[source]

Add an image.

add_scalar(tag, scalar_value, global_step=None, walltime=None)[source]

Add a scalar value.

close()[source]
flush()[source]

Flushes the event file to disk. Call this method to make sure that all pending events have been written to disk.
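
A short usage sketch built only on the documented calls; the log directory and loss values are illustrative:

    from nnabla_nas.utils.tensorboard.writer import SummaryWriter

    writer = SummaryWriter(log_dir='runs/exp1')
    for step in range(100):
        loss = 1.0 / (step + 1)  # placeholder training loss
        writer.add_scalar('train/loss', loss, global_step=step)
    writer.flush()  # push pending events to disk
    writer.close()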

nnabla_nas.contrib

DARTS

class nnabla_nas.contrib.classification.darts.SearchNet(in_channels, init_channels, num_cells, num_classes, num_choices=4, multiplier=4, mode='full', shared=False, stem_multiplier=3)[source]

Bases: ClassificationModel

DARTS: Differentiable Architecture Search.

This is the search space for DARTS.

Parameters:
  • in_channels (int) – The number of input channels.

  • init_channels (int) – The initial number of channels on each cell.

  • num_cells (int) – The number of cells.

  • num_classes (int) – The number of classes.

  • num_choices (int, optional) – The number of choice blocks on each cell. Defaults to 4.

  • multiplier (int, optional) – The multiplier. Defaults to 4.

  • mode (str, optional) – The sampling strategy (‘full’, ‘max’, ‘sample’). Defaults to ‘full’.

  • shared (bool, optional) – If parameters are shared between cells. Defaults to False.

  • stem_multiplier (int, optional) – The multiplier used for stem convolution. Defaults to 3.

call(input)[source]

Implement the call of module. Inputs should only be Variables.

get_arch_parameters(grad_only=False)[source]

Returns an OrderedDict containing architecture parameters.

Parameters:

grad_only (bool, optional) – If set to True, then only parameters with need_grad=True are returned. Defaults to False.

Returns:

A dictionary containing parameters.

Return type:

OrderedDict

get_net_parameters(grad_only=False)[source]

Returns an OrderedDict containing model parameters.

Parameters:

grad_only (bool, optional) – If set to True, then only parameters with need_grad=True are returned. Defaults to False.

Returns:

A dictionary containing parameters.

Return type:

OrderedDict

loss(outputs, targets, loss_weights=None)[source]

Return loss computed from a list of outputs and list of targets.

Parameters:
  • outputs (list of nn.Variable) – A list of output variables computed from the model.

  • targets (list of nn.Variable) – A list of target variables loaded from the data.

  • loss_weights (list of float, optional) – A list specifying scalar coefficients to weight the loss contributions of different model outputs. It is expected to have a 1:1 mapping to model outputs. Defaults to None.

Returns:

A scalar NNabla Variable representing the loss.

Return type:

nn.Variable

save_net_nnp(path, inp, out, calc_latency=False, func_real_latency=None, func_accum_latency=None, save_params=None)[source]

Saves the whole net as one nnp file. Optionally calculates the whole net’s (real) latency (using, e.g., NNabla’s [Profiler]) as well as the layer-based latency: the modules are discovered using the nnabla graph of the whole net, and the latency is then calculated from each individual module’s nnabla graph (e.g. with [LatencyGraphEstimator]).

Parameters:
  • path – absolute path

  • inp – input of the created network

  • out – output of the created network

  • calc_latency – whether to calculate the latency

  • func_real_latency – function used to calculate the actual latency

  • func_accum_latency – function used to calculate the accumulated latency, that is, dissecting the network layer by layer using its graph, calculating the latency of each layer, and summing up the results.

save_parameters(path=None, params=None, grad_only=False)[source]

Saves the parameters to a file.

Parameters:
  • path (str) – Absolute path to file.

  • params (OrderedDict, optional) – An OrderedDict containing parameters. If params is None, then the current parameters will be saved.

  • grad_only (bool, optional) – If set to True, only parameters with need_grad=True will be saved. Defaults to False.

summary()[source]

Summary of the model.

visualize(path)[source]

Save visualized graph to a file.

Parameters:

path (str) – Path to directory to save.
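
A sketch of building the search space and splitting its parameters into architecture and weight groups; the argument values are illustrative, and calling the module on a Variable is assumed to dispatch to call():

    import nnabla as nn
    from nnabla_nas.contrib.classification.darts import SearchNet

    net = SearchNet(in_channels=3, init_channels=16, num_cells=8,
                    num_classes=10, mode='full')
    x = nn.Variable((8, 3, 32, 32))  # (batch, channels, height, width)
    out = net(x)                     # assumed to invoke call(input)

    # Disjoint parameter groups, e.g. for bi-level optimization.
    arch_params = net.get_arch_parameters(grad_only=True)
    net_params = net.get_net_parameters(grad_only=True)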

class nnabla_nas.contrib.classification.darts.TrainNet(in_channels, init_channels, num_cells, num_classes, genotype, num_choices=4, multiplier=4, stem_multiplier=3, drop_path=0, auxiliary=False)[source]

Bases: ClassificationModel

TrainNet used for DARTS.

call(input)[source]

Implement the call of module. Inputs should only be Variables.

loss(outputs, targets, loss_weights=None)[source]

Return loss computed from a list of outputs and list of targets.

Parameters:
  • outputs (list of nn.Variable) – A list of output variables computed from the model.

  • targets (list of nn.Variable) – A list of target variables loaded from the data.

  • loss_weights (list of float, optional) – A list specifying scalar coefficients to weight the loss contributions of different model outputs. It is expected to have a 1:1 mapping to model outputs. Defaults to None.

Returns:

A scalar NNabla Variable representing the loss.

Return type:

nn.Variable

MobileNet V2

class nnabla_nas.contrib.classification.mobilenet.network.SearchNet(name='', num_classes=1000, width_mult=1, settings=None, drop_rate=0, candidates=None, mode='sample', skip_connect=True)[source]

Bases: ClassificationModel

MobileNet V2 search space.

This implementation is based on the PyTorch implementation.

Parameters:
  • num_classes (int) – Number of classes

  • width_mult (float, optional) – Width multiplier - adjusts number of channels in each layer by this amount

  • settings (list, optional) – Network structure. Defaults to None.

  • drop_rate (float, optional) – Drop rate used in Dropout. Defaults to 0.

  • candidates (list of str, optional) – A list of candidates. Defaults to None.

  • skip_connect (bool, optional) – Whether the skip connect is used. Defaults to True.

References

Sandler, M., Howard, A., Zhu, M., Zhmoginov, A. and Chen, L.C., 2018. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4510-4520).

call(input)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.

get_arch_parameters(grad_only=False)[source]

Returns an OrderedDict containing architecture parameters.

Parameters:

grad_only (bool, optional) – If set to True, then only parameters with need_grad=True are returned. Defaults to False.

Returns:

A dictionary containing parameters.

Return type:

OrderedDict

get_net_modules(active_only=False)[source]
get_net_parameters(grad_only=False)[source]

Returns an OrderedDict containing model parameters.

Parameters:

grad_only (bool, optional) – If set to True, then only parameters with need_grad=True are returned. Defaults to False.

Returns:

A dictionary containing parameters.

Return type:

OrderedDict

loss(outputs, targets, loss_weights=None)[source]

Return loss computed from a list of outputs and list of targets.

Parameters:
  • outputs (list of nn.Variable) – A list of output variables computed from the model.

  • targets (list of nn.Variable) – A list of target variables loaded from the data.

  • loss_weights (list of float, optional) – A list specifying scalar coefficients to weight the loss contributions of different model outputs. It is expected to have a 1:1 mapping to model outputs. Defaults to None.

Returns:

A scalar NNabla Variable representing the loss.

Return type:

nn.Variable

property modules_to_profile

Returns a list of the modules that will be profiled when the Profiler functions are called. All other modules in the network will not be profiled.

summary()[source]

Returns a string summarizing the model.

visualize(path)[source]

Save visualized graph to a file.

Parameters:

path (str) – Path to directory to save.

class nnabla_nas.contrib.classification.mobilenet.network.TrainNet(num_classes=1000, width_mult=1, settings=None, drop_rate=0, candidates=None, mode='sample', skip_connect=True, genotype=None)[source]

Bases: SearchNet

MobileNet V2 Train Net.

Parameters:
  • num_classes (int) – Number of classes

  • width_mult (float, optional) – Width multiplier - adjusts number of channels in each layer by this amount

  • settings (list, optional) – Network structure. Defaults to None.

  • round_nearest (int, optional) – Round the number of channels in each layer to be a multiple of this number. Set to 1 to turn off rounding.

  • n_max (int, optional) – The number of blocks. Defaults to 4.

  • block – Module specifying inverted residual building block for mobilenet. Defaults to None.

  • mode (str, optional) – The sampling strategy (‘full’, ‘max’, ‘sample’). Defaults to ‘sample’.

  • skip_connect (bool, optional) – Whether the skip connect is used. Defaults to True.

  • genotype (str, optional) – The path to architecture file. Defaults to None.

References

[1] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A. and Chen, L.C., 2018. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4510-4520).

nnabla_nas.contrib.classification.mobilenet.network.label_smoothing_loss(pred, label, label_smoothing=0.1)[source]
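
The function body is not documented here; as a rough sketch, label smoothing mixes the cross entropy against the hard label with a uniform distribution over classes. A possible NNabla formulation, stated as an assumption rather than the library’s exact implementation:

    import nnabla.functions as F

    def label_smoothing_loss_sketch(pred, label, label_smoothing=0.1):
        # Standard cross entropy against the hard label; shapes (B, C), (B, 1).
        ce = F.softmax_cross_entropy(pred, label)
        # Uniform term: mean negative log-probability over all classes.
        uniform = -F.mean(F.log_softmax(pred), axis=1, keepdims=True)
        return F.mean((1 - label_smoothing) * ce + label_smoothing * uniform)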

Random Wired

class nnabla_nas.contrib.classification.random_wired.random_wired.AvgPool2x2(parents, channels, name='')[source]

Bases: RandomModule

An average pooling module that accepts multiple parents. This pooling module is a random module, meaning that it automatically adjusts the dimensions of all input tensors and aggregates the result before applying the pooling.

Parameters:
  • parents (list) – the parent modules to this module

  • name (string, optional) – the name of the module

  • channels (int) – ignored

class nnabla_nas.contrib.classification.random_wired.random_wired.Conv(parents, channels, kernel, pad, name='')[source]

Bases: RandomModule

A convolution that accepts multiple parents. This convolution is a random module, meaning that it automatically adjusts the dimensions of all input tensors and aggregates the result before applying the convolution.

Parameters:
  • parents (list) – the parent modules to this module

  • name (string, optional) – the name of the module

  • channels (int) – the number of output channels of this module

  • kernel (tuple) – the kernel shape

  • pad (tuple) – the padding scheme used

class nnabla_nas.contrib.classification.random_wired.random_wired.Conv3x3(parents, channels, name='')[source]

Bases: Conv

A convolution of shape 3x3 that accepts multiple parents. This convolution is a random module, meaning that it automatically adjusts the dimensions of all input tensors and aggregates the result before applying the convolution.

Parameters:
  • parents (list) – the parent modules to this module

  • name (string, optional) – the name of the module

  • channels (int) – the number of output channels of this module

class nnabla_nas.contrib.classification.random_wired.random_wired.Conv5x5(parents, channels, name='')[source]

Bases: Conv

A convolution of shape 5x5 that accepts multiple parents. This convolution is a random module, meaning that it automatically adjusts the dimensions of all input tensors and aggregates the result before applying the convolution.

Parameters:
  • parents (list) – the parent modules to this module

  • name (string, optional) – the name of the module

  • channels (int) – the number of output channels of this module

class nnabla_nas.contrib.classification.random_wired.random_wired.MaxPool2x2(parents, channels, name='')[source]

Bases: RandomModule

A max pooling module that accepts multiple parents. This pooling module is a random module, meaning that it automatically adjusts the dimensions of all input tensors and aggregates the result before applying the pooling.

Parameters:
  • parents (list) – the parent modules to this module

  • name (string, optional) – the name of the module

  • channels (int) – ignored

class nnabla_nas.contrib.classification.random_wired.random_wired.RandomModule(parents, channels, name='')[source]

Bases: Graph

A module that automatically aggregates all the output tensors generated by its parents. To this end, it automatically adjusts the channel count and the feature map dimensions of each input through 1x1 convolution and pooling, and sums up the results. Please refer to [Xie et al.].

Parameters:
  • parents (list) – the parent modules to this module

  • name (string, optional) – the name of the module

  • channels (int) – the number of output channels of this module

References

  • Xie, Saining, et al. “Exploring randomly wired neural networks for image recognition.” Proceedings of the IEEE International Conference on Computer Vision. 2019.

class nnabla_nas.contrib.classification.random_wired.random_wired.SepConv(parents, channels, kernel, pad, name='')[source]

Bases: RandomModule

A separable convolution that accepts multiple parents. This convolution is a random module, meaning that it automatically adjusts the dimensions of all input tensors and aggregates the result before applying the convolution.

Parameters:
  • parents (list) – the parent modules to this module

  • name (string, optional) – the name of the module

  • channels (int) – the number of output channels of this module

  • kernel (tuple) – the kernel shape

  • pad (tuple) – the padding scheme used

class nnabla_nas.contrib.classification.random_wired.random_wired.SepConv3x3(parents, channels, name='')[source]

Bases: SepConv

A separable convolution of shape 3x3 that accepts multiple parents. This convolution is a random module, meaning that it automatically adjusts the dimensions of all input tensors and aggregates the result before applying the convolution.

Parameters:
  • parents (list) – the parent modules to this module

  • name (string, optional) – the name of the module

  • channels (int) – the number of output channels of this module

class nnabla_nas.contrib.classification.random_wired.random_wired.SepConv5x5(parents, channels, name='')[source]

Bases: SepConv

A separable convolution of shape 5x5 that accepts multiple parents. This convolution is a random module, meaning that it automatically adjusts the dimensions of all input tensors and aggregates the result before applying the convolution.

Parameters:
  • parents (list) – the parent modules to this module

  • name (string, optional) – the name of the module

  • channels (int) – the number of output channels of this module

class nnabla_nas.contrib.classification.random_wired.random_wired.TrainNet(n_vertices=20, input_shape=(3, 32, 32), n_classes=10, candidates=[<class 'nnabla_nas.contrib.classification.random_wired.random_wired.RandomModule'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.SepConv3x3'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.SepConv5x5'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.RandomModule'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.SepConv3x3'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.SepConv5x5'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.RandomModule'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.SepConv3x3'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.SepConv5x5'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.MaxPool2x2'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.AvgPool2x2'>], min_channels=128, max_channels=1024, k=4, p=0.75, name='')[source]

Bases: ClassificationModel, Graph

A randomly wired DNN that uses the Watts-Strogatz process to generate random DNN architectures. Please refer to [Xie et al.].

Parameters:
  • n_vertices (int) – the number of random modules within this network

  • input_shape (tuple) – the shape of the input of this network

  • n_classes (int) – the number of output classes of this network

  • candidates (list) – a list of random_modules which are randomly instantiated as vertices

  • min_channels (int) – the minimum channel count of a vertex

  • max_channels (int) – the maximum channel count of a vertex

  • k (int) – the connectivity parameter of the Watts-Strogatz process

  • p (float) – the re-wiring probability parameter of the Watts-Strogatz process

  • name (string) – the name of the network

References

  • Xie, Saining, et al. “Exploring randomly wired neural networks for image recognition.” Proceedings of the IEEE International Conference on Computer Vision. 2019.

get_arch_modules()[source]
get_arch_parameters(grad_only=False)[source]

Returns an OrderedDict containing all architecture parameters of the model.

Parameters:

grad_only (bool, optional) – If set to True, then only parameters with need_grad=True will be retrieved. Defaults to False.

Raises:

NotImplementedError

get_net_modules(active_only=False)[source]
get_net_parameters(grad_only=False)[source]

Returns an OrderedDict containing all network parameters of the model.

Parameters:

grad_only (bool, optional) – If set to True, then only parameters with need_grad=True will be retrieved. Defaults to False.

Raises:

NotImplementedError

property input_shapes

Return a list of input shapes used during call function.

property modules_to_profile

Returns a list of the modules that will be profiled when the Profiler functions are called. All other modules in the network will not be profiled.

save_graph(path)[source]

Save the whole network/graph to a PDF file.

Parameters:

path – the save path

summary()[source]

Summary of the model.
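
A short sketch that samples a randomly wired network and inspects it; the parameter values are illustrative:

    from nnabla_nas.contrib.classification.random_wired.random_wired import TrainNet

    net = TrainNet(n_vertices=10, input_shape=(3, 32, 32), n_classes=10,
                   k=4, p=0.75, name='rw_net')
    net.save_graph('./graphs')  # writes the sampled wiring as a PDF
    print(net.summary())        # assumed to return a printable summary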

Zoph

class nnabla_nas.contrib.classification.zoph.zoph.AveragePool3x3(parents, channels, name='', eval_prob=None)[source]

Bases: Graph

A static average pooling of size 3x3 followed by batch normalization and ReLU

Parameters:
  • parents (list) – a list of static modules that are parents to this module

  • channels (int) – the number of features

  • name (string, optional) – the name of the module

class nnabla_nas.contrib.classification.zoph.zoph.DilSepConv3x3(parents, channels, name='', eval_prob=None)[source]

Bases: SepConvBN

A static dilated separable convolution of shape 3x3 that applies batchnorm and relu at the end.

Parameters:
  • parents (list) – a list of static modules that are parents to this module

  • channels (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.

  • name (string, optional) – the name of the module

class nnabla_nas.contrib.classification.zoph.zoph.DilSepConv5x5(parents, channels, name='', eval_prob=None)[source]

Bases: SepConvBN

A static dilated separable convolution of shape 5x5 that applies batchnorm and relu at the end.

Parameters:
  • parents (list) – a list of static modules that are parents to this module

  • channels (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.

  • name (string, optional) – the name of the module

class nnabla_nas.contrib.classification.zoph.zoph.MaxPool3x3(parents, channels, name='', eval_prob=None)[source]

Bases: Graph

A static max pooling of size 3x3 followed by batch normalization and ReLU

Parameters:
  • parents (list) – a list of static modules that are parents to this module

  • channels (int) – the number of features

  • name (string, optional) – the name of the module

class nnabla_nas.contrib.classification.zoph.zoph.SearchNet(name='', input_shape=(3, 32, 32), n_classes=10, stem_channels=128, cells=[<class 'nnabla_nas.contrib.classification.zoph.zoph.ZophCell'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.ZophCell'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.ZophCell'>], cell_depth=[7, 7, 7], cell_channels=[128, 256, 512], reducing=[False, True, True], join_parameters=[[None, None, None, None, None, None, None], [None, None, None, None, None, None, None], [None, None, None, None, None, None, None]], candidates=[<class 'nnabla_nas.contrib.classification.zoph.zoph.SepConv3x3'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.SepConv5x5'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.DilSepConv3x3'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.DilSepConv5x5'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.MaxPool3x3'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.AveragePool3x3'>, <class 'nnabla_nas.module.static.static_module.Identity'>, <class 'nnabla_nas.module.static.static_module.Zero'>], mode='sample')[source]

Bases: ClassificationModel, Graph

A search space as defined in [Bender et al.].

Parameters:
  • name (string, optional) – the name of the module

  • input_shape (tuple) – the shape of the network input

  • n_classes (int) – the number of output classes

  • stem_channels (int) – the number of channels for the stem convolutions

  • cells (list) – the type of the cells used within this search space

  • cell_depth (list) – the number of modules within each cell

  • reducing (list) – specifies for each cell if it reduces the feature map dimensions through pooling

  • join_parameters (list) – the join_parameters used in each cell and block.

  • candidates (list, optional) – the candidate modules instantiated within this block (e.g. ZOPH_CANDIDATES)

  • mode (string) – the mode which the join modules within this network use

References

Bender, Gabriel. “Understanding and simplifying one-shot architecture search.” (2019).

get_arch_modules()[source]
get_arch_parameters(grad_only=False)[source]

Returns an OrderedDict containing all architecture parameters of the model.

Parameters:

grad_only (bool, optional) – If set to True, then only parameters with need_grad=True will be retrieved. Defaults to False.

Raises:

NotImplementedError

get_net_modules(active_only=False)[source]
get_net_parameters(grad_only=False)[source]

Returns an OrderedDict containing all network parameters of the model.

Parameters:

grad_only (bool, optional) – If set to True, then only parameters with need_grad=True will be retrieved. Defaults to False.

Raises:

NotImplementedError

property input_shapes

Return a list of input shapes used during call function.

property modules_to_profile

Returns a list of the modules that will be profiled when the Profiler functions are called. All other modules in the network will not be profiled.

save_graph(path)[source]

Save the whole network/graph to a PDF file.

Parameters:

path – the save path

summary()[source]

Summary of the model.
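
A sketch of instantiating the search space with its default cells and candidates; the values are illustrative, and calling the module is assumed to dispatch to call():

    import nnabla as nn
    from nnabla_nas.contrib.classification.zoph.zoph import SearchNet

    net = SearchNet(name='zoph', input_shape=(3, 32, 32), n_classes=10,
                    mode='sample')
    x = nn.Variable((4, 3, 32, 32))
    out = net(x)
    arch_params = net.get_arch_parameters(grad_only=True)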

class nnabla_nas.contrib.classification.zoph.zoph.SepConv(parents, in_channels, out_channels, kernel, pad, dilation, with_bias, name='', eval_prob=None)[source]

Bases: Graph

A static separable convolution (DepthWise conv + PointWise conv)

Parameters:
  • parents (list) – a list of static modules that are parents to this module

  • in_channels (int) – Number of input channels.

  • out_channels (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.

  • kernel (tuple of int) – Convolution kernel size. For example, to apply convolution on an image with a 3 (height) by 5 (width) two-dimensional kernel, specify (3,5).

  • pad (tuple of int, optional) – Padding sizes for dimensions. Defaults to None.

  • dilation (tuple of int, optional) – Dilation sizes for dimensions. Defaults to None.

  • with_bias (bool, optional) – Specify whether to include the bias term. Defaults to True.

  • name (string, optional) – the name of the module

class nnabla_nas.contrib.classification.zoph.zoph.SepConv3x3(parents, channels, name='', eval_prob=None)[source]

Bases: SepConvBN

A static separable convolution of shape 3x3 that applies batchnorm and relu at the end.

Parameters:
  • parents (list) – a list of static modules that are parents to this module

  • channels (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.

  • name (string, optional) – the name of the module

class nnabla_nas.contrib.classification.zoph.zoph.SepConv5x5(parents, channels, name='', eval_prob=None)[source]

Bases: SepConvBN

A static separable convolution of shape 5x5 that applies batchnorm and relu at the end.

Parameters:
  • parents (list) – a list of static modules that are parents to this module

  • channels (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.

  • name (string, optional) – the name of the module

class nnabla_nas.contrib.classification.zoph.zoph.SepConvBN(parents, out_channels, kernel, dilation, name='', eval_prob=None)[source]

Bases: Graph

Two static separable convolutions followed by batchnorm and relu at the end.

Parameters:
  • parents (list) – a list of static modules that are parents to this module

  • out_channels (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.

  • kernel (tuple of int) – Convolution kernel size. For example, to apply convolution on an image with a 3 (height) by 5 (width) two-dimensional kernel, specify (3,5).

  • dilation (tuple of int, optional) – Dilation sizes for dimensions. Defaults to None.

  • name (string, optional) – the name of the module

class nnabla_nas.contrib.classification.zoph.zoph.TrainNet(name, input_shape=(3, 32, 32), n_classes=10, stem_channels=128, cells=[<class 'nnabla_nas.contrib.classification.zoph.zoph.ZophCell'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.ZophCell'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.ZophCell'>], cell_depth=[7, 7, 7], cell_channels=[128, 256, 512], reducing=[False, True, True], join_parameters=[[None, None, None, None, None, None, None], [None, None, None, None, None, None, None], [None, None, None, None, None, None, None]], candidates=[<class 'nnabla_nas.contrib.classification.zoph.zoph.SepConv3x3'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.SepConv5x5'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.DilSepConv3x3'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.DilSepConv5x5'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.MaxPool3x3'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.AveragePool3x3'>, <class 'nnabla_nas.module.static.static_module.Identity'>, <class 'nnabla_nas.module.static.static_module.Zero'>], param_path=None, *args, **kwargs)[source]

Bases: SearchNet

A search space as defined in [Bender et al.]. It is the same as SearchNet, except that mode is fixed to ‘max’.

Parameters:
  • name (string, optional) – the name of the module

  • input_shape (tuple) – the shape of the network input

  • n_classes (int) – the number of output classes

  • stem_channels (int) – the number of channels for the stem convolutions

  • cells (list) – the type of the cells used within this search space

  • cell_depth (list) – the number of modules within each cell

  • reducing (list) – specifies for each cell if it reduces the feature map dimensions through pooling

  • join_parameters (list) – the join_parameters used in each cell and block

  • candidates (list, optional) – the candidate modules instantiated within this block (e.g. ZOPH_CANDIDATES)

  • mode (string) – the mode which the join modules within this network use

References

Bender, Gabriel. “Understanding and simplifying one-shot architecture search.” (2019).

class nnabla_nas.contrib.classification.zoph.zoph.ZophBlock(parents, candidates, channels, name='', join_parameters=None)[source]

Bases: Graph

A zoph block as defined in [Bender et al.].

Parameters:
  • parents (list) – a list of static modules that are parents to this module

  • name (string, optional) – the name of the module

  • candidates (list) – the candidate modules instantiated within this block (e.g. ZOPH_CANDIDATES)

  • channels (int) – the number of output channels of this block

  • join_parameters (nnabla variable, optional) – the architecture parameters used to join the outputs of the candidate modules. join_parameters must have the same number of elements as there are candidates.

References

Bender, Gabriel. “Understanding and simplifying one-shot architecture search.” (2019).

class nnabla_nas.contrib.classification.zoph.zoph.ZophCell(parents, candidates, channels, name='', n_modules=3, reducing=False, join_parameters=[None, None, None])[source]

Bases: Graph

A zoph cell that consists of multiple zoph blocks, as defined in [Bender et al.].

Parameters:
  • parents (list) – a list of static modules that are parents to this module

  • name (string, optional) – the name of the module

  • candidates (list) – the candidate modules instantiated within this block (e.g. ZOPH_CANDIDATES)

  • channels (int) – the number of output channels of this block

  • join_parameters (list of nnabla variable, optional) – a list of the architecture parameters used to join the outputs of the candidate modules. Each element of join_parameters must have the same number of elements as there are candidates. The length of this list must be n_modules.

References

Bender, Gabriel. “Understanding and simplifying one-shot architecture search.” (2019).

FairNas

class nnabla_nas.contrib.classification.fairnas.SearchNet(num_classes=1000, width_mult=1, settings=None, drop_rate=0, candidates=None, skip_connect=True, weights=None, seed=123)[source]

Bases: ClassificationModel

MobileNet V2 search space.

This implementation is based on the PyTorch implementation.

Parameters:
  • num_classes (int) – Number of classes

  • width_mult (float, optional) – Width multiplier - adjusts number of channels in each layer by this amount

  • settings (list, optional) – Network structure. Defaults to None.

  • drop_rate (float, optional) – Drop rate used in Dropout. Defaults to 0.

  • candidates (list of str, optional) – A list of candidates. Defaults to None.

  • skip_connect (bool, optional) – Whether the skip connect is used. Defaults to True.

  • weights (str, optional) – The path to weight file. Defaults to None.

  • seed (int, optional) – The seed for the random generator.

References

Sandler, M., Howard, A., Zhu, M., Zhmoginov, A. and Chen, L.C., 2018. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4510-4520).

call(input)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.

get_arch()[source]
get_net_parameters(grad_only=False)[source]

Returns an OrderedDict containing model parameters.

Parameters:

grad_only (bool, optional) – If set to True, then only parameters with need_grad=True are returned. Defaults to False.

Returns:

A dictionary containing parameters.

Return type:

OrderedDict

loss(outputs, targets, loss_weights=None)[source]

Return loss computed from a list of outputs and list of targets.

Parameters:
  • outputs (list of nn.Variable) – A list of output variables computed from the model.

  • targets (list of nn.Variable) – A list of target variables loaded from the data.

  • loss_weights (list of float, optional) – A list specifying scalar coefficients to weight the loss contributions of different model outputs. It is expected to have a 1:1 mapping to model outputs. Defaults to None.

Returns:

A scalar NNabla Variable representing the loss.

Return type:

nn.Variable

visualize(path)[source]

Save visualized graph to a file.

Parameters:

path (str) – Path to directory to save.

class nnabla_nas.contrib.classification.fairnas.TrainNet(num_classes=1000, width_mult=1, settings=None, drop_rate=0, candidates=None, skip_connect=True, genotype=None, weights=None)[source]

Bases: SearchNet

MobileNet V2 Train Net.

Parameters:
  • num_classes (int) – Number of classes

  • width_mult (float, optional) – Width multiplier - adjusts number of channels in each layer by this amount

  • settings (list, optional) – Network structure. Defaults to None.

  • round_nearest (int, optional) – Round the number of channels in each layer to be a multiple of this number. Set to 1 to turn off rounding.

  • n_max (int, optional) – The number of blocks. Defaults to 4.

  • block – Module specifying inverted residual building block for mobilenet. Defaults to None.

  • skip_connect (bool, optional) – Whether the skip connect is used. Defaults to True.

  • genotype (str, optional) – The path to architecture file. Defaults to None.

References

Sandler, M., Howard, A., Zhu, M., Zhmoginov, A. and Chen, L.C., 2018. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4510-4520).

OFAMobileNetV3

class nnabla_nas.contrib.classification.ofa.networks.ofa_mbv3.OFAMbv3Net(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, base_stage_width=None, width_mult=1.0, op_candidates='MB6 3x3', depth_candidates=4, compound=False, fixed_kernel=False, weight_init='he_fout', weights=None)[source]

Bases: ClassificationModel

MobileNet V3 Search Net.

Parameters:
  • num_classes (int, optional) – Number of classes. Defaults to 1000.

  • bn_param (tuple, optional) – BatchNormalization decay rate and eps. Defaults to (0.9, 1e-5).

  • drop_rate (float, optional) – Drop rate used in Dropout. Defaults to 0.1.

  • base_stage_width (list of int, optional) – A list of base stage channel size. Defaults to None.

  • width_mult (float, optional) – Multiplier value to base stage channel size. Defaults to 1.0.

  • op_candidates (str or list of str, optional) – Operator choices. Defaults to “MB6 3x3”.

  • depth_candidates (int or list of int, optional) – Depth choices. Defaults to 4.

  • compound (bool, optional) – Use CompOFA or not. Defaults to False.

  • fixed_kernel (bool, optional) – Fix kernel or not. Defaults to False.

  • weight_init (str, optional) – Weight initializer. Defaults to ‘he_fout’.

  • weights (str, optional) – The relative path to weight file. Defaults to None.

References

[1] Cai, Han, et al. “Once-for-all: Train one network and specialize it for efficient deployment.” arXiv preprint arXiv:1908.09791 (2019).

CHANNEL_DIVISIBLE = 8
call(x)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.

get_net_parameters(grad_only=False)[source]

Returns an OrderedDict containing model parameters.

Parameters:

grad_only (bool, optional) – If set to True, then only parameters with need_grad=True are returned. Defaults to False.

Returns:

A dictionary containing parameters.

Return type:

OrderedDict

property grouped_block_index
kd_loss(outputs, logits, targets, loss_weights=None)[source]
loss(outputs, targets, loss_weights=None)[source]

Return loss computed from a list of outputs and list of targets.

Parameters:
  • outputs (list of nn.Variable) – A list of output variables computed from the model.

  • targets (list of nn.Variable) – A list of target variables loaded from the data.

  • loss_weights (list of float, optional) – A list specifying scalar coefficients to weight the loss contributions of different model outputs. It is expected to have a 1:1 mapping to model outputs. Defaults to None.

Returns:

A scalar NNabla Variable representing the loss.

Return type:

nn.Variable

sample_active_subnet()[source]
sample_compound_subnet()[source]
set_active_subnet(ks=None, e=None, d=None, **kwargs)[source]
set_bn_param(decay_rate, eps, **kwargs)[source]

Sets decay_rate and eps to batchnormalization layers.

Parameters:
  • decay_rate (float) – Decay rate of running mean and variance.

  • eps (float) – Tiny value to avoid zero division by std.

set_parameters(params, raise_if_missing=False)[source]

Set parameters for the module.

Parameters:
  • params (OrderedDict) – The parameters which will be loaded.

  • raise_if_missing (bool, optional) – Raise exception if some parameters are missing. Defaults to False.

Raises:

ValueError – Parameters are not found.

set_valid_arch(genotype)[source]
class nnabla_nas.contrib.classification.ofa.networks.ofa_mbv3.SearchNet(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, base_stage_width=None, width_mult=1.0, op_candidates='MB6 3x3', depth_candidates=4, compound=False, fixed_kernel=False, weight_init='he_fout', weights=None)[source]

Bases: OFAMbv3Net

re_organize_middle_weights(expand_ratio_stage=0)[source]
class nnabla_nas.contrib.classification.ofa.networks.ofa_mbv3.TrainNet(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, base_stage_width=None, width_mult=1, op_candidates=None, depth_candidates=None, genotype=None, weights=None)[source]

Bases: OFAMbv3Net

MobileNet V3 Train Net.

Parameters:
  • num_classes (int, optional) – Number of classes. Defaults to 1000.

  • bn_param (tuple, optional) – BatchNormalization decay rate and eps. Defaults to (0.9, 1e-5).

  • drop_rate (float, optional) – Drop rate used in Dropout. Defaults to 0.1.

  • base_stage_width (list of int, optional) – A list of base stage channel size. Defaults to None.

  • width_mult (float, optional) – Multiplier value to base stage channel size. Defaults to 1.0.

  • op_candidates (str or list of str, optional) – Operator choices. Defaults to None.

  • depth_candidates (int or list of int, optional) – Depth choices. Defaults to None.

  • genotype (list of int, optional) – A list of operators. Defaults to None.

  • weights (str, optional) – Relative path to the weights file. Defaults to None.

call(x)[source]

Implement the call of module. Inputs should only be Variables.

nnabla_nas.contrib.classification.ofa.networks.ofa_mbv3.candidates2subnetlist(candidates)[source]
nnabla_nas.contrib.classification.ofa.networks.ofa_mbv3.genotype2subnetlist(op_candidates, genotype)[source]
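
A sketch of the subnet-selection workflow exposed above; the extra candidate strings and the lengths of the ks/e/d lists are assumptions that must match the actual network configuration:

    from nnabla_nas.contrib.classification.ofa.networks.ofa_mbv3 import SearchNet

    net = SearchNet(num_classes=1000,
                    op_candidates=['MB6 3x3', 'MB6 5x5', 'MB6 7x7'],
                    depth_candidates=[2, 3, 4])
    # Randomly draw one subnet, e.g. for a one-shot training step.
    net.sample_active_subnet()
    # Or pin an explicit choice of kernel sizes, expand ratios and depths
    # (list lengths here are illustrative).
    net.set_active_subnet(ks=[3] * 20, e=[6] * 20, d=[4] * 5)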

OFAXception

class nnabla_nas.contrib.classification.ofa.networks.ofa_xception.OFAXceptionNet(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, base_stage_width=[32, 64, 128, 256, 728, 1024, 1536, 2048], op_candidates='XP1 7x7 3', width_mult=1.0, weights=None)[source]

Bases: ClassificationModel

Xception41 Base Class

This is the Base Class used for both TrainNet and SearchNet. This implementation is based on the PyTorch implementation given in References.

Parameters:
  • num_classes (int) – Number of classes

  • bn_param (tuple, optional) – BatchNormalization decay rate and eps.

  • drop_rate (float, optional) – Drop rate used in Dropout in classifier. Defaults to 0.1.

  • base_stage_width (list of int, optional) – A list of base stage channel size. Defaults to None.

  • op_candidates (str or list of str, optional) – Operator choices. Defaults to “XP1 7x7 3” (the largest block in the search space).

  • width_mult (float, optional) – Multiplier value to base stage channel size. Defaults to 1.0.

  • weights (str, optional) – The path to weight file. Defaults to None.

References

[1] Cai, Han, et al. “Once-for-all: Train one network and specialize it for efficient deployment.” arXiv preprint arXiv:1908.09791 (2019).

[2] GitHub implementation of Xception41.

https://github.com/Cadene/pretrained-models.pytorch/blob/master/pretrainedmodels/models/xception.py

CHANNEL_DIVISIBLE = 8
NUM_MIDDLE_BLOCKS = 8
call(x)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.

get_arch_parameters(grad_only=False)[source]

Returns an OrderedDict containing architecture parameters.

Parameters:

grad_only (bool, optional) – If set to True, then only parameters with need_grad=True are returned. Defaults to False.

Returns:

A dictionary containing parameters.

Return type:

OrderedDict

get_bn_param()[source]

Return dict of batchnormalization params.

Returns:

A dictionary containing decay_rate and eps of batchnormalization

Return type:

dict

get_net_parameters(grad_only=False)[source]

Returns an OrderedDict containing model parameters.

Parameters:

grad_only (bool, optional) – If set to True, then only parameters with need_grad=True are returned. Defaults to False.

Returns:

A dictionary containing parameters.

Return type:

OrderedDict

kd_loss(outputs, logits, targets, loss_weights=None)[source]
loss(outputs, targets, loss_weights=None)[source]

Return loss computed from a list of outputs and list of targets.

Parameters:
  • outputs (list of nn.Variable) – A list of output variables computed from the model.

  • targets (list of nn.Variable) – A list of target variables loaded from the data.

  • loss_weights (list of float, optional) – A list specifying scalar coefficients to weight the loss contributions of different model outputs. It is expected to have a 1:1 mapping to model outputs. Defaults to None.

Returns:

A scalar NNabla Variable representing the loss.

Return type:

nn.Variable

sample_active_subnet()[source]
set_active_subnet(ks, e, d, **kwargs)[source]
set_bn_param(decay_rate, eps, **kwargs)[source]

Sets decay_rate and eps to batchnormalization layers.

Parameters:
  • decay_rate (float) – Decay rate of running mean and variance.

  • eps (float) – Tiny value to avoid zero division by std.

set_parameters(params, raise_if_missing=False)[source]

Set parameters for the module.

Parameters:
  • params (OrderedDict) – The parameters which will be loaded.

  • raise_if_missing (bool, optional) – Raise exception if some parameters are missing. Defaults to False.

Raises:

ValueError – Parameters are not found.

set_valid_arch(genotype)[source]
class nnabla_nas.contrib.classification.ofa.networks.ofa_xception.ProcessGenotype[source]

Bases: object

This class defines the search space and contains functions to process the genotypes and op_candidates to get the subnet architecture or the search space.

Operator candidates: “XP{E} {K}x{K} {D}”, E=expand_ratio, K=kernel_size, D=depth_of_block

Note: If the depth of a block is 1, expand_ratio will be ignored, since we just need in_channels and out_channels for a block with a single layer. So the blocks [“XP0.6 KxK 1”, “XP0.8 KxK 1”, “XP1 KxK 1”] are equivalent in this architecture design.

CANDIDATES = {'XP0.6 3x3 1': {'depth': 1, 'expand_ratio': 0.6, 'ks': 3}, 'XP0.6 3x3 2': {'depth': 2, 'expand_ratio': 0.6, 'ks': 3}, 'XP0.6 3x3 3': {'depth': 3, 'expand_ratio': 0.6, 'ks': 3}, 'XP0.6 5x5 1': {'depth': 1, 'expand_ratio': 0.6, 'ks': 5}, 'XP0.6 5x5 2': {'depth': 2, 'expand_ratio': 0.6, 'ks': 5}, 'XP0.6 5x5 3': {'depth': 3, 'expand_ratio': 0.6, 'ks': 5}, 'XP0.6 7x7 1': {'depth': 1, 'expand_ratio': 0.6, 'ks': 7}, 'XP0.6 7x7 2': {'depth': 2, 'expand_ratio': 0.6, 'ks': 7}, 'XP0.6 7x7 3': {'depth': 3, 'expand_ratio': 0.6, 'ks': 7}, 'XP0.8 3x3 1': {'depth': 1, 'expand_ratio': 0.8, 'ks': 3}, 'XP0.8 3x3 2': {'depth': 2, 'expand_ratio': 0.8, 'ks': 3}, 'XP0.8 3x3 3': {'depth': 3, 'expand_ratio': 0.8, 'ks': 3}, 'XP0.8 5x5 1': {'depth': 1, 'expand_ratio': 0.8, 'ks': 5}, 'XP0.8 5x5 2': {'depth': 2, 'expand_ratio': 0.8, 'ks': 5}, 'XP0.8 5x5 3': {'depth': 3, 'expand_ratio': 0.8, 'ks': 5}, 'XP0.8 7x7 1': {'depth': 1, 'expand_ratio': 0.8, 'ks': 7}, 'XP0.8 7x7 2': {'depth': 2, 'expand_ratio': 0.8, 'ks': 7}, 'XP0.8 7x7 3': {'depth': 3, 'expand_ratio': 0.8, 'ks': 7}, 'XP1 3x3 1': {'depth': 1, 'expand_ratio': 1, 'ks': 3}, 'XP1 3x3 2': {'depth': 2, 'expand_ratio': 1, 'ks': 3}, 'XP1 3x3 3': {'depth': 3, 'expand_ratio': 1, 'ks': 3}, 'XP1 5x5 1': {'depth': 1, 'expand_ratio': 1, 'ks': 5}, 'XP1 5x5 2': {'depth': 2, 'expand_ratio': 1, 'ks': 5}, 'XP1 5x5 3': {'depth': 3, 'expand_ratio': 1, 'ks': 5}, 'XP1 7x7 1': {'depth': 1, 'expand_ratio': 1, 'ks': 7}, 'XP1 7x7 2': {'depth': 2, 'expand_ratio': 1, 'ks': 7}, 'XP1 7x7 3': {'depth': 3, 'expand_ratio': 1, 'ks': 7}}
classmethod get_search_space(candidates)[source]
classmethod get_subnet_arch(op_candidates, genotype)[source]
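
A sketch of decoding genotypes with the classmethods above; the candidate strings are keys of CANDIDATES, while the genotype indices (one per middle block) are illustrative assumptions:

    from nnabla_nas.contrib.classification.ofa.networks.ofa_xception import (
        ProcessGenotype,
    )

    ops = ['XP1 3x3 1', 'XP0.8 5x5 2', 'XP0.6 7x7 3']
    search_space = ProcessGenotype.get_search_space(ops)
    # One operator index per middle block (NUM_MIDDLE_BLOCKS = 8).
    subnet_arch = ProcessGenotype.get_subnet_arch(ops, [0, 1, 2, 0, 1, 2, 0, 1])
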
class nnabla_nas.contrib.classification.ofa.networks.ofa_xception.SearchNet(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, base_stage_width=[32, 64, 128, 256, 728, 1024, 1536, 2048], width_mult=1.0, op_candidates='XP1 7x7 3', weights=None)[source]

Bases: OFAXceptionNet

Xception41 Search Net.

This defines the search space of OFA-Xception Model.

Parameters:
  • num_classes (int) – Number of classes

  • bn_param (tuple, optional) – BatchNormalization decay rate and eps.

  • drop_rate (float, optional) – Drop rate used in Dropout of classifier. Defaults to 0.1.

  • base_stage_width (list of int, optional) – A list of base stage channel size. Defaults to [32, 64, 128, 256, 728, 1024, 1536, 2048].

  • width_mult (float, optional) – Multiplier value to base stage channel size. Defaults to 1.0.

  • op_candidates (str or list of str, optional) – Operator choices. Defaults to “XP1 7x7 3” (the largest block in the search space)

  • weights (str, optional) – The path to weight file. Defaults to None.

re_organize_middle_weights(expand_ratio_stage=0)[source]
class nnabla_nas.contrib.classification.ofa.networks.ofa_xception.TrainNet(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, base_stage_width=[32, 64, 128, 256, 728, 1024, 1536, 2048], width_mult=1, op_candidates=None, genotype=None, weights=None)[source]

Bases: OFAXceptionNet

Xception41 Train Net.

This builds and initialises the OFA-Xception subnet architecture, which is specified by a genotype list together with the corresponding op_candidates list used to decode the genotypes.

Parameters:
  • num_classes (int) – Number of classes

  • bn_param (tuple, optional) – BatchNormalization decay rate and eps.

  • drop_rate (float, optional) – Drop rate used in Dropout of classifier. Defaults to 0.1.

  • base_stage_width (list of int, optional) – A list of base stage channel size. Defaults to [32, 64, 128, 256, 728, 1024, 1536, 2048].

  • width_mult (float, optional) – Multiplier value to base stage channel size. Defaults to 1.0.

  • op_candidates (str or list of str, optional) – Operator choices. Defaults to None. [Necessary Argument]

  • genotype (list of int, optional) – A list of operators. Defaults to None.

  • weights (str, optional) – The path to weight file. Defaults to None.

call(x)[source]

Implement the call of module. Inputs should only be Variables.

OFAResnet50

class nnabla_nas.contrib.classification.ofa.networks.ofa_resnet50.OFAResNet50(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, depth_list=2, expand_ratio_list=0.25, width_mult_list=1.0, weight_init='he_fout', weights=None)[source]

Bases: ClassificationModel

OFAResNet50 Base Class.

This is the Base Class used for both TrainNet and SearchNet. This implementation is based on the PyTorch implementation given in References.

Parameters:
  • num_classes (int) – Number of classes

  • bn_param (tuple, optional) – BatchNormalization decay rate and eps.

  • drop_rate (float, optional) – Drop rate used in Dropout in classifier. Defaults to 0.1.

  • depth_list (int or list of int, optional) – Candidates of depth for each layer. Defaults to 2.

  • expand_ratio_list (float or list of float, optional) – Candidates of expand ratio for middle bottleneck layers. Defaults to 0.25.

  • width_mult_list (float or list of float, optional) – Candidates of width multiplication ratio for input/output feature size of bottleneck layers. Defaults to 1.0.

  • weight_init (str, optional) – Weight initialization method. Defaults to ‘he_fout’.

  • weights (str, optional) – Path to weight file. Defaults to None.

References

[1] Cai, Han, et al. “Once-for-all: Train one network and specialize it for efficient deployment.” arXiv preprint arXiv:1908.09791 (2019).

[2] GitHub implementation of Once-for-All.

https://github.com/mit-han-lab/once-for-all

BASE_DEPTH_LIST = [2, 2, 4, 2]
STAGE_WIDTH_LIST = [256, 512, 1024, 2048]
call(input)[source]

Implement the call of module. Inputs should only be Variables.

extra_repr()[source]

Set the extra representation for the module.

get_net_parameters(grad_only=False)[source]

Returns an OrderedDict containing model parameters.

Parameters:

grad_only (bool, optional) – If set to True, then only parameters with need_grad=True are returned. Defaults to False.

Returns:

A dictionary containing parameters.

Return type:

OrderedDict

get_random_active_subnet()[source]
property grouped_block_index
kd_loss(outputs, logits, targets, loss_weights=None)[source]
loss(outputs, targets, loss_weights=None)[source]

Return loss computed from a list of outputs and list of targets.

Parameters:
  • outputs (list of nn.Variable) – A list of output variables computed from the model.

  • targets (list of nn.Variable) – A list of target variables loaded from the data.

  • loss_weights (list of float, optional) – A list specifying scalar coefficients to weight the loss contributions of different model outputs. It is expected to have a 1:1 mapping to model outputs. Defaults to None.

Returns:

A scalar NNabla Variable representing the loss.

Return type:

nn.Variable

sample_active_subnet()[source]
set_active_subnet(d=None, e=None, w=None, **kwargs)[source]
set_bn_param(decay_rate, eps, **kwargs)[source]

Sets decay_rate and eps to batchnormalization layers.

Parameters:
  • decay_rate (float) – Decay rate of running mean and variance.

  • eps (float) – Tiny value to avoid zero division by std.

set_parameters(params, raise_if_missing=False)[source]

Set parameters for the module.

Parameters:
  • params (OrderedDict) – The parameters which will be loaded.

  • raise_if_missing (bool, optional) – Raise exception if some parameters are missing. Defaults to False.

Raises:

ValueError – Parameters are not found.

set_valid_arch(genotype)[source]
class nnabla_nas.contrib.classification.ofa.networks.ofa_resnet50.SearchNet(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, depth_list=2, expand_ratio_list=0.25, width_mult_list=1.0, weight_init='he_fout', weights=None)[source]

Bases: OFAResNet50

OFAResNet50 Search Net.

This defines the search space of OFA-ResNet50 model.

Parameters:
  • num_classes (int) – Number of classes

  • bn_param (tuple, optional) – BatchNormalization decay rate and eps.

  • drop_rate (float, optional) – Drop rate used in Dropout in classifier. Defaults to 0.1.

  • depth_list (int or list of int, optional) – Candidates of depth for each layer. Defaults to 2.

  • expand_ratio_list (float or list of float, optional) – Candidates of expand ratio for middle bottleneck layers. Defaults to 0.25.

  • width_mult_list (float or list of float, optional) – Candidates of width multiplication ratio for input/output feature size of bottleneck layers. Defaults to 1.0.

  • weight_init (str, optional) – Weight initialization method. Defaults to ‘he_fout’.

  • weights (str, optional) – Path to weight file. Defaults to None.

re_organize_middle_weights(expand_ratio_stage=0)[source]
class nnabla_nas.contrib.classification.ofa.networks.ofa_resnet50.TrainNet(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, depth_list=None, expand_ratio_list=None, width_mult_list=None, genotype=None, weights=None)[source]

Bases: SearchNet

OFAResNet50 Train Net.

This builds and initialises the OFA-ResNet50 subnet architecture, which is specified by a genotype list together with the corresponding depth, expand ratio, and width mult candidate lists used to decode the genotypes.

Parameters:
  • num_classes (int) – Number of classes

  • bn_param (tuple, optional) – BatchNormalization decay rate and eps.

  • drop_rate (float, optional) – Drop rate used in Dropout in classifier. Defaults to 0.1.

  • depth_list (int or list of int, optional) – Candidates of depth for each layer. Defaults to None.

  • expand_ratio_list (float or list of float, optional) – Candidates of expand ratio for middle bottleneck layers. Defaults to None.

  • width_mult_list (float or list of float, optional) – Candidates of width multiplication ratio for input/output feature size of bottleneck layers. Defaults to None.

  • genotype (list, optional) – A list of operators. Defaults to None.

  • weights (str, optional) – Path to weight file. Defaults to None.

call(x)[source]

Implement the call of module. Inputs should only be Variables.
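
Finally, a sketch of the OFA-ResNet50 search space with explicit candidate lists; the concrete values are illustrative assumptions:

    from nnabla_nas.contrib.classification.ofa.networks.ofa_resnet50 import SearchNet

    net = SearchNet(num_classes=1000,
                    depth_list=[0, 1, 2],
                    expand_ratio_list=[0.2, 0.25, 0.35],
                    width_mult_list=[0.65, 0.8, 1.0])
    net.sample_active_subnet()  # draw a random depth/expand/width configuration
    net.re_organize_middle_weights(expand_ratio_stage=0)  # see method above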