API Documentation¶
nnabla_nas.module¶
BatchNormalization¶
- class nnabla_nas.module.batchnorm.BatchNormalization(n_features, n_dims, axes=[1], decay_rate=0.9, eps=1e-05, output_stat=False, fix_parameters=False, param_init=None, name='')[source]¶
Bases:
Module
Batch normalization layer.
- Parameters:
n_features (int) – Number of dimensional features.
n_dims (int) – Number of dimensions.
axes (tuple of int) – Mean and variance for each element in axes are calculated using elements on the rest axes. For example, if an input is 4 dimensions, and axes is [1], batch mean is calculated as np.mean(inp.d, axis=(0, 2, 3), keepdims=True) (using numpy expression as an example).
decay_rate (float, optional) – Decay rate of running mean and variance. Defaults to 0.9.
eps (float, optional) – Tiny value to avoid zero division by std. Defaults to 1e-5.
output_stat (bool, optional) – Output batch mean and variance. Defaults to False.
fix_parameters (bool) – When set to True, the beta and gamma will not be updated.
param_init (dict) – Parameter initializers can be set with a dict. A key of the dict must be 'beta', 'gamma', 'mean' or 'var'. A value of the dict must be an Initializer or a numpy.ndarray. E.g.: {'beta': ConstantInitializer(0), 'gamma': np.ones(gamma_shape) * 2}
name (string) – the name of this module
- Returns:
N-D array.
- Return type:
Variable
References
- Ioffe and Szegedy, Batch Normalization: Accelerating Deep
Network Training by Reducing Internal Covariate Shift. https://arxiv.org/abs/1502.03167
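For a 4-D input with axes=[1], the statistics described above can be reproduced with plain NumPy. This is an illustrative sketch of the math only, not the layer's implementation (beta and gamma, which default to 0 and 1, are omitted):

```python
import numpy as np

# Toy input: batch of 10 samples, 3 channels, 4x4 spatial dims.
x = np.random.randn(10, 3, 4, 4)
eps = 1e-5

# With axes=[1], mean and variance are computed over all remaining axes.
mean = np.mean(x, axis=(0, 2, 3), keepdims=True)   # shape (1, 3, 1, 1)
var = np.var(x, axis=(0, 2, 3), keepdims=True)

# Normalize; eps guards against division by a zero std.
y = (x - mean) / np.sqrt(var + eps)
```

The normalized output then has approximately zero mean and unit variance per channel.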
Container¶
- class nnabla_nas.module.container.ModuleList(modules=None)[source]¶
Bases:
Module
Hold submodules in a list. This implementation mainly follows the PyTorch implementation.
- Parameters:
modules (iterable, optional) – An iterable of modules to add.
- append(module)[source]¶
Appends a given module to the end of the list.
- Parameters:
module (Module) – A module to append.
- class nnabla_nas.module.container.ParameterList(parameters=None)[source]¶
Bases:
Module
Hold parameters in a list.
- Parameters:
parameters (iterable, optional) – An iterable of parameters to add.
- append(parameter)[source]¶
Appends a given parameter to the end of the list.
- Parameters:
parameter (Parameter) – A parameter to append.
- class nnabla_nas.module.container.Sequential(*args)[source]¶
Bases:
ModuleList
A sequential container. Modules will be added to it in the order they are passed in the constructor. Alternatively, an ordered dict of modules can also be passed in.
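The chaining behavior of a sequential container can be sketched in plain Python. This is a minimal illustration of the call semantics only, not the actual Sequential implementation (which is itself a Module and tracks submodules and parameters):

```python
# Minimal sketch: each module's output feeds the next module's input,
# in constructor order.
class SequentialSketch:
    def __init__(self, *modules):
        self.modules = list(modules)

    def __call__(self, x):
        for module in self.modules:
            x = module(x)
        return x

# Plain callables stand in for modules here:
net = SequentialSketch(lambda x: x + 1, lambda x: x * 2)
print(net(3))  # (3 + 1) * 2 = 8
```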
Convolution¶
- class nnabla_nas.module.convolution.Conv(in_channels, out_channels, kernel, pad=None, stride=None, dilation=None, group=1, w_init=None, b_init=None, base_axis=1, fix_parameters=False, rng=None, with_bias=True, channel_last=False, name='')[source]¶
Bases:
Module
N-D Convolution layer.
- Parameters:
in_channels (int) – Number of convolution kernels (which is equal to the number of input channels).
out_channels (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.
kernel (tuple of int) – Convolution kernel size. For example, to apply convolution on an image with a 3 (height) by 5 (width) two-dimensional kernel, specify (3, 5).
pad (tuple of int, optional) – Padding sizes for dimensions. Defaults to None.
stride (tuple of int, optional) – Stride sizes for dimensions. Defaults to None.
dilation (tuple of int, optional) – Dilation sizes for dimensions. Defaults to None.
group (int, optional) – Number of groups of channels. This makes connections across channels more sparse by grouping connections along map direction. Defaults to 1.
w_init (nnabla.initializer.BaseInitializer or numpy.ndarray, optional) – Initializer for weight. By default, it is initialized with nnabla.initializer.UniformInitializer within the range determined by nnabla.initializer.calc_uniform_lim_glorot.
b_init (nnabla.initializer.BaseInitializer or numpy.ndarray, optional) – Initializer for bias. By default, it is initialized with zeros if with_bias is True.
base_axis (int, optional) – Dimensions up to base_axis are treated as the sample dimensions. Defaults to 1.
fix_parameters (bool, optional) – When set to True, the weights and biases will not be updated. Defaults to False.
rng (numpy.random.RandomState, optional) – Random generator for Initializer. Defaults to None.
with_bias (bool, optional) – Specify whether to include the bias term. Defaults to True.
channel_last (bool, optional) – If True, the last dimension is considered as channel dimension, a.k.a NHWC order. Defaults to False.
name (string) – the name of this module
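The spatial output size produced by the kernel, pad, stride, and dilation parameters above follows the standard convolution arithmetic; a quick sanity check in Python (an illustration of the formula, not nnabla_nas API):

```python
def conv_out_size(in_size, kernel, pad=0, stride=1, dilation=1):
    """Output size along one spatial axis for a standard convolution."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + 2 * pad - effective_kernel) // stride + 1

# A 3x3 kernel with padding 1 and stride 1 preserves the spatial size:
print(conv_out_size(32, kernel=3, pad=1))            # 32
# Stride 2 halves it:
print(conv_out_size(32, kernel=3, pad=1, stride=2))  # 16
```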
- class nnabla_nas.module.convolution.DwConv(in_channels, kernel, pad=None, stride=None, dilation=None, multiplier=1, w_init=None, b_init=None, base_axis=1, fix_parameters=False, rng=None, with_bias=True, name='')[source]¶
Bases:
Module
N-D Depthwise Convolution layer.
- Parameters:
in_channels (int) – Number of convolution kernels (which is equal to the number of input channels).
kernel (tuple of int) – Convolution kernel size. For example, to apply convolution on an image with a 3 (height) by 5 (width) two-dimensional kernel, specify (3, 5).
pad (tuple of int, optional) – Padding sizes for dimensions. Defaults to None.
stride (tuple of int, optional) – Stride sizes for dimensions. Defaults to None.
dilation (tuple of int, optional) – Dilation sizes for dimensions. Defaults to None.
multiplier (int, optional) – Number of output feature maps per input feature map. Defaults to 1.
w_init (nnabla.initializer.BaseInitializer or numpy.ndarray, optional) – Initializer for weight. By default, it is initialized with nnabla.initializer.UniformInitializer within the range determined by nnabla.initializer.calc_uniform_lim_glorot.
b_init (nnabla.initializer.BaseInitializer or numpy.ndarray, optional) – Initializer for bias. By default, it is initialized with zeros if with_bias is True.
base_axis (int, optional) – Dimensions up to base_axis are treated as the sample dimensions. Defaults to 1.
fix_parameters (bool, optional) – When set to True, the weights and biases will not be updated. Defaults to False.
rng (numpy.random.RandomState, optional) – Random generator for Initializer. Defaults to None.
with_bias (bool, optional) – Specify whether to include the bias term. Defaults to True.
name (string) – the name of this module
References
- Chollet, Francois. “Xception: Deep Learning with Depthwise Separable Convolutions.” https://arxiv.org/abs/1610.02357
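A depthwise convolution applies one kernel per input channel (repeated multiplier times), which is what makes it cheaper than a full convolution. A sketch of the weight-count arithmetic (an illustration, not nnabla_nas API; biases are ignored):

```python
from math import prod

def dw_conv_weights(in_channels, kernel, multiplier=1):
    # One kernel per input channel, repeated `multiplier` times.
    return in_channels * multiplier * prod(kernel)

def full_conv_weights(in_channels, out_channels, kernel):
    # Every output channel connects to every input channel.
    return in_channels * out_channels * prod(kernel)

# Depthwise 3x3 over 32 channels vs. a full 32->32 3x3 convolution:
print(dw_conv_weights(32, (3, 3)))        # 288
print(full_conv_weights(32, 32, (3, 3)))  # 9216
```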
Dropout¶
- class nnabla_nas.module.dropout.Dropout(drop_prob=0.5, name='')[source]¶
Bases:
Module
Dropout layer.
During training, randomly zeroes some of the elements of the input tensor with probability drop_prob using samples from a Bernoulli distribution. Each channel will be zeroed out independently on every forward call.
- Parameters:
drop_prob (float, optional) – The probability of an element to be zeroed. Defaults to 0.5.
name (string) – the name of this module
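The Bernoulli zeroing can be illustrated with NumPy. This sketch shows only the masking; it does not claim anything about how the layer rescales surviving activations:

```python
import numpy as np

rng = np.random.RandomState(0)
drop_prob = 0.5

x = np.ones((2, 4))
# Keep each element independently with probability 1 - drop_prob.
mask = rng.binomial(1, 1.0 - drop_prob, size=x.shape)
y = x * mask

# Dropped positions are exactly zero; kept positions are unchanged.
print(y)
```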
Identity¶
Linear¶
- class nnabla_nas.module.linear.Linear(in_features, out_features, base_axis=1, w_init=None, b_init=None, rng=None, bias=True, name='')[source]¶
Bases:
Module
Linear layer. Applies a linear transformation to the incoming data: \(y = xA^T + b\)
- Parameters:
in_features (int) – The size of each input sample.
out_features (int) – The size of each output sample.
base_axis (int, optional) – Dimensions up to base_axis are treated as the sample dimensions. Defaults to 1.
w_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for weight. By default, it is initialized with nnabla.initializer.UniformInitializer within the range determined by nnabla.initializer.calc_uniform_lim_glorot.
b_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for bias. By default, it is initialized with zeros if with_bias is True.
rng (numpy.random.RandomState) – Random generator for Initializer.
with_bias (bool) – Specify whether to include the bias term.
name (string) – the name of this module
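The transformation \(y = xA^T + b\) in NumPy terms (an illustration of the math, not the module API):

```python
import numpy as np

rng = np.random.RandomState(0)
in_features, out_features, batch = 5, 3, 10

x = rng.randn(batch, in_features)
A = rng.randn(out_features, in_features)  # weight: one row per output unit
b = np.zeros(out_features)                # bias (zero-initialized by default)

y = x @ A.T + b
print(y.shape)  # (10, 3)
```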
Merging¶
- class nnabla_nas.module.merging.Merging(mode, axis=1, name='')[source]¶
Bases:
Module
Merging layer.
Merges a list of NNabla Variables.
- Parameters:
mode (str) – The merging mode (‘concat’, ‘add’, ‘mul’), where concat indicates that the inputs will be concatenated, add means the element-wise addition, and mul means the element-wise multiplication.
axis (int, optional) – The axis for merging when ‘concat’ is used. Defaults to 1.
name (string) – the name of this module
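The three merging modes in NumPy terms (an illustration; the module itself operates on nnabla Variables):

```python
import numpy as np

a = np.ones((2, 3, 4, 4))
b = np.full((2, 3, 4, 4), 2.0)

concat = np.concatenate([a, b], axis=1)  # 'concat': stack along axis 1
added = a + b                            # 'add': element-wise addition
multiplied = a * b                       # 'mul': element-wise multiplication

print(concat.shape)  # (2, 6, 4, 4)
```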
MixedOp¶
- class nnabla_nas.module.mixedop.MixedOp(operators, mode='full', alpha=None, rng=None, name='')[source]¶
Bases:
Module
Mixed Operator layer.
Selects a single operator or a combination of different operators that are allowed in this module.
- Parameters:
operators (List of Module) – A list of modules.
mode (str, optional) – The selection mode for this module. Possible modes are ‘sample’, ‘full’, ‘max’, or ‘fair’. Defaults to ‘full’.
alpha (Parameter, optional) – The weights used to calculate the evaluation probabilities. Ignored in ‘fair’ mode. Defaults to None.
rng (numpy.random.RandomState) – Random generator for random choice.
name (string) – the name of this module
- property active_index¶
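The selection probabilities implied by alpha can be sketched with a softmax over the operator weights. This is an illustration of the probability calculation only; the actual sampling and mixing logic lives in the module:

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

alpha = np.array([0.5, 1.5, 0.0])  # unnormalized weights, one per operator
probs = softmax(alpha)

# In a max-style selection, the operator with the largest weight wins:
print(int(np.argmax(probs)))  # 1
```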
- class nnabla_nas.module.module.Module(name='')[source]¶
Bases:
object
Module base for all nnabla neural network modules.
Your models should also subclass this class. Modules can also contain other Modules, allowing them to be nested in a tree structure.
- calc_latency_all_modules(path, graph, func_latency=None)[source]¶
Calculate the latency for each of the modules in a graph. The modules are extracted using the graph structure information. The latency is then calculated based on each individual module’s nnabla graph. It also saves the accumulated latency of all modules.
- Parameters:
path –
graph –
func_latency – function used to calculate the latency of each of the modules. This function needs to work based on the graph.
- convert_npp_to_onnx(path, opset='opset_11')[source]¶
Finds all nnp files in the given path and its subfolders and converts them to ONNX. For this to run smoothly, nnabla_cli must be installed and added to your Python path.
- Parameters:
path –
opset –
The actual bash shell command used is:
> find <DIR> -name '*.nnp' -exec echo echo {} \| awk -F \. '\{print "nnabla_cli convert -b 1 -d opset_11 "\$0" "\$1"\."\$2"\.onnx"\}' \; | sh | sh
which, for each file found with find, outputs the following:
> echo <FILE>.nnp | awk -F \. '{print "nnabla_cli convert -b 1 -d opset_11 "$0" "$1"."$2".onnx"}' # noqa: E501,W605
which, for each file, generates the final conversion command:
> nnabla_cli convert -b 1 -d opset_11 <FILE>.nnp <FILE>.nnp.onnx
- get_latency(estimator, active_only=True)[source]¶
Function used to calculate latency. This function needs to work based on the graph.
- Parameters:
estimator – a graph-based estimator
active_only – get latency of active modules only
- Returns:
latencies – a list with the latency of each module
accum_lat – the total sum of the latencies of all modules
- get_latency_by_mod(estimator, active_only=True)[source]¶
Note: This function is deprecated; use get_latency() instead. Function used to calculate latency. This function needs to work based on the modules.
- Parameters:
estimator – a module-based estimator
active_only – get latency of active modules only
- Returns:
latencies – a list with the latency of each module
accum_lat – the total sum of the latencies of all modules
- get_modules(prefix='', memo=None)[source]¶
Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
- Parameters:
prefix (str, optional) – Additional prefix to name modules. Defaults to ‘’.
memo (dict, optional) – Memorize all parsed modules. Defaults to None.
- Yields:
(str, Module) – a submodule.
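The recursive name/module traversal that get_modules performs can be sketched in plain Python. This is a simplified illustration of the yielding pattern, not the actual implementation:

```python
class Node:
    """Stand-in for a module that owns named submodules."""
    def __init__(self, **children):
        self.children = children

    def get_modules(self, prefix='', memo=None):
        # memo prevents revisiting shared submodules.
        if memo is None:
            memo = set()
        if id(self) in memo:
            return
        memo.add(id(self))
        yield prefix, self
        for name, child in self.children.items():
            yield from child.get_modules(prefix + '/' + name, memo)

net = Node(features=Node(conv=Node()), classifier=Node())
print([name for name, _ in net.get_modules()])
# ['', '/features', '/features/conv', '/classifier']
```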
- get_parameters(grad_only=False)[source]¶
Return an OrderedDict containing all parameters in the module.
- Parameters:
grad_only (bool, optional) – If True, only parameters with need_grad=True are returned. Defaults to False.
- Returns:
A dictionary containing parameters of module.
- Return type:
OrderedDict
- property input_shapes¶
Return a list of input shapes used during call function.
- property is_active¶
Whether the module was called.
- load_parameters(path, raise_if_missing=False)[source]¶
Loads parameters from a file with the specified format.
- Parameters:
path (str) – Relative path to the parameter file (based on the original working directory).
raise_if_missing (bool, optional) – Raise exception if some parameters are missing. Defaults to False.
- property modules¶
Return an OrderedDict containing immediate modules.
- property modules_to_profile¶
Returns a list with the modules that will be profiled when the Profiler/Estimator functions are called. All other modules in the network will not be profiled.
- property name¶
The name of the module.
- Returns:
the name of the module
- Return type:
string
- property need_grad¶
Whether the module needs gradient.
- property parameters¶
Return an OrderedDict containing immediate parameters.
- save_modules_nnp(path, active_only=False, calc_latency=False, func_latency=None)[source]¶
Saves all modules of the network as individual nnp files, using the folder structure given by the naming convention. The modules are extracted by going over the module list, not over the graph structure. The latency is then calculated based on each individual module’s nnabla graph (e.g. [LatencyGraphEstimator]).
- Parameters:
path –
active_only – if True, only active modules are saved
calc_latency – flag for calc latency
func_latency – function used to calculate the latency of each of the extracted modules. This function needs to work based on the graph.
- save_modules_nnp_by_mod(path, active_only=False, calc_latency=False, func_latency=None)[source]¶
Note: This function is deprecated; use save_modules_nnp() instead. Saves all modules of the network as individual nnp files, using the folder structure given by the naming convention. The modules are extracted by going over the module list, not over the graph structure. The latency is then calculated using the modules themselves (e.g. [LatencyEstimator]).
- Parameters:
path –
active_only – if True, only active modules are saved
calc_latency – flag for calc latency
func_latency – function used to calculate the latency of each of the extracted modules. This function needs to work based on the modules.
- save_net_nnp(path, inp, out, calc_latency=False, func_real_latency=None, func_accum_latency=None, save_params=None)[source]¶
Saves the whole net as one nnp file. Calculates the whole net’s (real) latency (using e.g. nnabla’s [Profiler]) and also the layer-based latency. The modules are discovered using the nnabla graph of the whole net; the latency is then calculated based on each individual module’s nnabla graph (e.g. [LatencyGraphEstimator]).
- Parameters:
path – absolute path
inp – input of the created network
out – output of the created network
calc_latency – flag for calc latency
func_real_latency – function to use to calc actual latency
func_accum_latency – function to use to calc accumulated latency, that is, dissecting the network layer by layer using the graph of the network, calculating the latency for each layer and adding up all these results.
- save_parameters(path, params=None, grad_only=False)[source]¶
Saves the parameters to a file.
- Parameters:
path (str) – Absolute path to file.
params (OrderedDict, optional) – An OrderedDict containing parameters. If params is None, then the current parameters will be saved.
grad_only (bool, optional) – If need_grad=True is required for parameters which will be saved. Defaults to False.
- set_parameters(params, raise_if_missing=False)[source]¶
Set parameters for the module.
- Parameters:
params (OrderedDict) – The parameters which will be loaded.
raise_if_missing (bool, optional) – Raise exception if some parameters are missing. Defaults to False.
- Raises:
ValueError – Parameters are not found.
- property training¶
The training mode of module.
Lambda¶
Parameter¶
- class nnabla_nas.module.parameter.Parameter(shape, need_grad=True, initializer=None, scope='')[source]¶
Bases:
Variable
Parameter is a Variable.
A kind of Variable that is to be considered a module parameter. Parameters are Variable subclasses that have a very special property when used with Modules: when they are assigned as Module attributes, they are automatically added to the list of its parameters.
- Parameters:
shape (tuple of int) – The shape of Parameter.
need_grad (bool, optional) – If the parameter requires gradient. Defaults to True.
initializer (nnabla.initializer.BaseInitializer or numpy.ndarray) – An initialization function to be applied to the parameter. numpy.ndarray can also be given to initialize parameters from numpy array data. Defaults to None.
Pooling¶
- class nnabla_nas.module.pooling.AvgPool(kernel, stride=None, ignore_border=True, pad=None, channel_last=False, name='')[source]¶
Bases:
Module
Average pooling layer. It pools the averaged values inside the scanning kernel.
- Parameters:
kernel (tuple of int) – Kernel sizes for each spatial axis.
stride (tuple of int, optional) – Subsampling factors for each spatial axis. Defaults to None.
ignore_border (bool) – If False, kernels covering borders are also considered for the output. Defaults to True.
pad (tuple of int, optional) – Border padding values for each spatial axis. Padding will be added to both sides of the dimension. Defaults to (0,) * len(kernel).
channel_last (bool) – If True, the last dimension is considered as channel dimension, a.k.a NHWC order. Defaults to False.
name (string) – the name of this module
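A 2x2 average pooling with stride 2, written out in NumPy (an illustration of the operation on a tiny input, not the module API):

```python
import numpy as np

x = np.arange(16, dtype=float).reshape(4, 4)

# 2x2 average pooling, stride 2: average each non-overlapping 2x2 block.
pooled = x.reshape(2, 2, 2, 2).mean(axis=(1, 3))

print(pooled)
# [[ 2.5  4.5]
#  [10.5 12.5]]
```

Replacing `.mean(...)` with `.max(...)` gives the corresponding max pooling.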
- class nnabla_nas.module.pooling.GlobalAvgPool(name='')[source]¶
Bases:
Module
Global average pooling layer. It pools an averaged value from the whole image.
- Parameters:
name (string) – the name of this module
- class nnabla_nas.module.pooling.MaxPool(kernel, stride=None, pad=None, channel_last=False, name='')[source]¶
Bases:
Module
Max pooling layer. It pools the maximum values inside the scanning kernel.
- Parameters:
kernel (tuple of int) – Kernel sizes for each spatial axis.
stride (tuple of int, optional) – Subsampling factors for each spatial axis. Defaults to None.
pad (tuple of int, optional) – Border padding values for each spatial axis. Padding will be added to both sides of the dimension. Defaults to (0,) * len(kernel).
channel_last (bool) – If True, the last dimension is considered as channel dimension, a.k.a NHWC order. Defaults to False.
name (string) – the name of this module
Relu¶
- class nnabla_nas.module.relu.LeakyReLU(alpha=0.1, inplace=False, name='')[source]¶
Bases:
Module
LeakyReLU layer. Element-wise Leaky Rectified Linear Unit (ReLU) function.
- Parameters:
alpha (float, optional) – The slope value multiplied to negative numbers. \(\alpha\) in the definition. Defaults to 0.1.
inplace (bool, optional) – can optionally do the operation in-place. Default: False.
name (string) – the name of this module
- class nnabla_nas.module.relu.ReLU(inplace=False, name='')[source]¶
Bases:
Module
ReLU layer. Applies the rectified linear unit function element-wise.
- Parameters:
inplace (bool, optional) – can optionally do the operation in-place. Default: False.
name (string) – the name of this module
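Both activations in NumPy terms (illustrative sketches of the element-wise functions, not the module API):

```python
import numpy as np

def relu(x):
    # Negative inputs are zeroed; non-negative inputs pass through.
    return np.maximum(x, 0)

def leaky_relu(x, alpha=0.1):
    # Negative inputs are scaled by alpha instead of being zeroed.
    return np.where(x >= 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.0])
print(relu(x))        # negatives become 0
print(leaky_relu(x))  # negatives scaled by 0.1
```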
Zero¶
nnabla_nas.module.static¶
- class nnabla_nas.module.static.AvgPool(parents, name='', eval_prob=None, *args, **kwargs)[source]¶
-
The AvgPool module performs avg pooling on the output of its parent. It accepts only a single parent.
- Parameters:
parents (list) – the parents of this module
name (string) – the name of this module
kernel (tuple of int) – Kernel sizes for each spatial axis.
stride (tuple of int, optional) – Subsampling factors for each spatial axis. Defaults to None.
pad (tuple of int, optional) – Border padding values for each spatial axis. Padding will be added to both sides of the dimension. Defaults to (0,) * len(kernel).
channel_last (bool) – If True, the last dimension is considered as channel dimension, a.k.a NHWC order. Defaults to False.
- class nnabla_nas.module.static.BatchNormalization(parents, name='', eval_prob=None, *args, **kwargs)[source]¶
Bases:
BatchNormalization
,Module
The BatchNormalization module is the static version of nnabla_nas.modules.BatchNormalization. It accepts only a single parent.
- Parameters:
parents (list) – the parents of this module
name (string) – the name of this module
n_features (int) – Number of dimensional features.
n_dims (int) – Number of dimensions.
axes (tuple of int) – Mean and variance for each element in axes are calculated using elements on the rest axes. For example, if an input is 4 dimensions, and axes is [1], batch mean is calculated as np.mean(inp.d, axis=(0, 2, 3), keepdims=True) (using numpy expression as an example).
decay_rate (float, optional) – Decay rate of running mean and variance. Defaults to 0.9.
eps (float, optional) – Tiny value to avoid zero division by std. Defaults to 1e-5.
output_stat (bool, optional) – Output batch mean and variance. Defaults to False.
fix_parameters (bool) – When set to True, the beta and gamma will not be updated.
param_init (dict) – Parameter initializers can be set with a dict. A key of the dict must be 'beta', 'gamma', 'mean' or 'var'. A value of the dict must be an Initializer or a numpy.ndarray. E.g.: {'beta': ConstantInitializer(0), 'gamma': np.ones(gamma_shape) * 2}
- Returns:
N-D array.
- Return type:
Variable
References
- Ioffe and Szegedy, Batch Normalization: Accelerating Deep
Network Training by Reducing Internal Covariate Shift. https://arxiv.org/abs/1502.03167
- class nnabla_nas.module.static.Collapse(parents, name='')[source]¶
Bases:
Module
The Collapse module removes the last two singleton dimensions of a 4D input. It accepts only a single parent.
- Parameters:
parents (list) – the parents of this module
name (string) – the name of this module
- call(*inputs)[source]¶
The input to output mapping of the module. Given some inputs, it constructs the computational graph of this module. This method must be implemented for custom modules.
- Parameters:
*input – the output of the parents
- Returns:
the output of the module
- Return type:
nnabla variable
Examples
>>> out = my_module(inp_a, inp_b)
- class nnabla_nas.module.static.Conv(parents, name='', eval_prob=None, *args, **kwargs)[source]¶
-
The Conv module performs a convolution on the output of its parent. It accepts only a single parent.
- Parameters:
parents (list) – the parents of this module
name (string) – the name of this module
in_channels (int) – Number of convolution kernels (which is equal to the number of input channels).
out_channels (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.
kernel (tuple of int) – Convolution kernel size. For example, to apply convolution on an image with a 3 (height) by 5 (width) two-dimensional kernel, specify (3, 5).
pad (tuple of int, optional) – Padding sizes for dimensions. Defaults to None.
stride (tuple of int, optional) – Stride sizes for dimensions. Defaults to None.
dilation (tuple of int, optional) – Dilation sizes for dimensions. Defaults to None.
group (int, optional) – Number of groups of channels. This makes connections across channels more sparse by grouping connections along map direction. Defaults to 1.
w_init (nnabla.initializer.BaseInitializer or numpy.ndarray, optional) – Initializer for weight. By default, it is initialized with nnabla.initializer.UniformInitializer within the range determined by nnabla.initializer.calc_uniform_lim_glorot.
b_init (nnabla.initializer.BaseInitializer or numpy.ndarray, optional) – Initializer for bias. By default, it is initialized with zeros if with_bias is True.
base_axis (int, optional) – Dimensions up to base_axis are treated as the sample dimensions. Defaults to 1.
fix_parameters (bool, optional) – When set to True, the weights and biases will not be updated. Defaults to False.
rng (numpy.random.RandomState, optional) – Random generator for Initializer. Defaults to None.
with_bias (bool, optional) – Specify whether to include the bias term. Defaults to True.
channel_last (bool, optional) – If True, the last dimension is considered as channel dimension, a.k.a NHWC order. Defaults to False.
- class nnabla_nas.module.static.Dropout(parents, name='', *args, **kwargs)[source]¶
-
The Dropout module is the static version of nnabla_nas.modules.Dropout. It accepts only a single parent.
- Parameters:
parents (list) – the parents of this module
name (string) – the name of this module
drop_prob (float, optional) – The probability of an element to be zeroed. Defaults to 0.5.
- class nnabla_nas.module.static.DwConv(parents, name='', eval_prob=None, *args, **kwargs)[source]¶
-
The DwConv module performs a depthwise convolution on the output of its parent. It accepts only a single parent.
- Parameters:
parents (list) – the parents of this module
name (string) – the name of this module
in_channels (int) – Number of convolution kernels (which is equal to the number of input channels).
kernel (tuple of int) – Convolution kernel size. For example, to apply convolution on an image with a 3 (height) by 5 (width) two-dimensional kernel, specify (3, 5).
pad (tuple of int, optional) – Padding sizes for dimensions. Defaults to None.
stride (tuple of int, optional) – Stride sizes for dimensions. Defaults to None.
dilation (tuple of int, optional) – Dilation sizes for dimensions. Defaults to None.
multiplier (int, optional) – Number of output feature maps per input feature map. Defaults to 1.
w_init (nnabla.initializer.BaseInitializer or numpy.ndarray, optional) – Initializer for weight. By default, it is initialized with nnabla.initializer.UniformInitializer within the range determined by nnabla.initializer.calc_uniform_lim_glorot.
b_init (nnabla.initializer.BaseInitializer or numpy.ndarray, optional) – Initializer for bias. By default, it is initialized with zeros if with_bias is True.
base_axis (int, optional) – Dimensions up to base_axis are treated as the sample dimensions. Defaults to 1.
fix_parameters (bool, optional) – When set to True, the weights and biases will not be updated. Defaults to False.
rng (numpy.random.RandomState, optional) – Random generator for Initializer. Defaults to None.
with_bias (bool, optional) – Specify whether to include the bias term. Defaults to True.
References
- Chollet, Francois. “Xception: Deep Learning with Depthwise Separable Convolutions.” https://arxiv.org/abs/1610.02357
- class nnabla_nas.module.static.GlobalAvgPool(parents, name='', eval_prob=None, *args, **kwargs)[source]¶
Bases:
GlobalAvgPool
,Module
The GlobalAvgPool module performs global avg pooling on the output of its parent. It accepts only a single parent.
- Parameters:
parents (list) – the parents of this module
name (string) – the name of this module
- class nnabla_nas.module.static.Graph(parents=[], name='', eval_prob=None, *args, **kwargs)[source]¶
Bases:
ModuleList
,Module
The static version of nnabla_nas.module.ModuleList. A Graph which can contain many modules. A graph can also be used as a module within another graph. Any graph must define self._output, i.e. the StaticModule which acts as the output node of this graph.
- get_gv_graph(active_only=True, color_map={<class 'nnabla_nas.module.static.static_module.Join'>: 'blue', <class 'nnabla_nas.module.static.static_module.Merging'>: 'green', <class 'nnabla_nas.module.static.static_module.Zero'>: 'red'})[source]¶
Construct a graphviz graph object that can be used to visualize the graph.
- Parameters:
active_only (bool) – whether or not to add inactive modules, i.e., modules which are not part of the computational graph
color_map (dict) – the mapping of class instance to vertex color used to visualize the graph.
- property output¶
The output module of this module. If the module is not a graph, it will return self.
- Returns:
the output module
- Return type:
- property shape¶
The shape of the graph, which is determined by its output module.
- class nnabla_nas.module.static.Identity(parents, name='', eval_prob=None, *args, **kwargs)[source]¶
-
The Identity module does not alter the input. It accepts only a single parent.
- Parameters:
parents (list) – the parents of this module
name (string) – the name of this module
Examples
>>> import nnabla as nn
>>> from nnabla_nas.module import static as smo
>>> input = nn.Variable((10, 3, 32, 32))
>>> inp_module = smo.Input(value=input)
>>> identity = smo.Identity(parents=[inp_module])
- class nnabla_nas.module.static.Input(value=None, name='', eval_prob=None, *args, **kwargs)[source]¶
Bases:
Module
A static module that can serve as an input, i.e., it has no parents but is provided with a value which it can pass to its children.
- Parameters:
parents (list) – the parents of this module
name (string) – the name of this module
value (nnabla variable) – the nnabla variable which serves as the input value
Examples
>>> import nnabla as nn
>>> from nnabla_nas.module import static as smo
>>> input = nn.Variable((10, 3, 32, 32))
>>> inp_module = smo.Input(value=input)
- property value¶
- class nnabla_nas.module.static.Join(parents, join_parameters, name='', mode='linear', *args, **kwargs)[source]¶
Bases:
Module
The Join module is used to fuse the output of multiple parents. It can either superpose them linearly, sample one of the inputs, or select the most probable input. It accepts multiple parents. However, the output of all parents must have the same shape.
- Parameters:
join_parameters (nnabla variable) – a vector containing unnormalized categorical probabilities. It must have the same number of elements as the module has parents. The selection probability of each parent is calculated, using the softmax function.
mode (string) – can be ‘linear’/’sample’/’max’. Determines how Join combines the output of the parents.
- property mode¶
- class nnabla_nas.module.static.LeakyReLU(parents, name='', eval_prob=None, *args, **kwargs)[source]¶
-
The LeakyReLU module is the static version of nnabla_nas.modules.LeakyReLU. It accepts only a single parent.
- Parameters:
parents (list) – the parents of this module
name (string) – the name of this module
inplace (bool, optional) – can optionally do the operation in-place. Default: False.
- class nnabla_nas.module.static.Linear(parents, name='', *args, **kwargs)[source]¶
-
The Linear module performs an affine transformation on the output of its parent. It accepts only a single parent.
- Parameters:
parents (list) – the parents of this module
name (string) – the name of this module
in_features (int) – The size of each input sample.
out_features (int) – The size of each output sample.
base_axis (int, optional) – Dimensions up to base_axis are treated as the sample dimensions. Defaults to 1.
w_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for weight. By default, it is initialized with nnabla.initializer.UniformInitializer within the range determined by nnabla.initializer.calc_uniform_lim_glorot.
b_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for bias. By default, it is initialized with zeros if with_bias is True.
rng (numpy.random.RandomState) – Random generator for Initializer.
with_bias (bool) – Specify whether to include the bias term.
- class nnabla_nas.module.static.MaxPool(parents, name='', eval_prob=None, *args, **kwargs)[source]¶
-
The MaxPool module performs max pooling on the output of its parent. It accepts only a single parent.
- Parameters:
parents (list) – the parents of this module
name (string) – the name of this module
kernel (
tuple
ofint
) – Kernel sizes for each spatial axis.stride (
tuple
ofint
, optional) – Subsampling factors for each spatial axis. Defaults to None.pad (
tuple
ofint
, optional) – Border padding values for each spatial axis. Padding will be added both sides of the dimension. Defaults to(0,) * len(kernel)
.channel_last (bool) – If True, the last dimension is considered as channel dimension, a.k.a NHWC order. Defaults to
False
.
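As an illustration of what max pooling computes, here is a minimal pure-Python sketch for a 2-D input with matching kernel and stride and no padding (the actual module delegates to NNabla's pooling function; the helper name `max_pool2d` is an assumption for this sketch):

```python
# 2-D max pooling: take the maximum over each non-overlapping
# (kh, kw) window of the input grid.
def max_pool2d(x, kh, kw):
    H, W = len(x), len(x[0])
    return [[max(x[i + di][j + dj] for di in range(kh) for dj in range(kw))
             for j in range(0, W - kw + 1, kw)]
            for i in range(0, H - kh + 1, kh)]

inp = [[1, 2, 0, 1],
       [3, 4, 1, 0],
       [0, 1, 5, 6],
       [2, 1, 7, 8]]
out = max_pool2d(inp, 2, 2)   # [[4, 1], [2, 8]]
```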
- class nnabla_nas.module.static.Merging(parents, mode, name='', eval_prob=None, axis=1)[source]¶
-
The Merging module is the static version of nnabla_nas.module.Merging. It accepts multiple parents and merges their outputs.
- Parameters:
parents (list) – the parents of this module
name (string) – the name of this module
mode (str) – The merging mode (‘concat’, ‘add’).
axis (int, optional) – The axis for merging when ‘concat’ is used. Defaults to 1.
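The two merging modes can be sketched on plain Python lists (an illustration of the semantics only, not nnabla Variables; the function name `merge` is an assumption):

```python
# 'concat' joins the parent outputs along an axis; 'add' sums them
# elementwise (shapes must match for 'add').
def merge(parent_outputs, mode='concat'):
    if mode == 'concat':
        return [v for out in parent_outputs for v in out]
    if mode == 'add':
        return [sum(vals) for vals in zip(*parent_outputs)]
    raise ValueError("mode must be 'concat' or 'add'")

a, b = [1, 2], [3, 4]
merge([a, b], mode='concat')  # [1, 2, 3, 4]
merge([a, b], mode='add')     # [4, 6]
```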
- class nnabla_nas.module.static.Module(parents=[], name='', eval_prob=None, *args, **kwargs)[source]¶
Bases:
Module
A static module is a module that encodes the graph structure, i.e., it has parents and children. Static modules can be used to define graphs on which simple graph optimizations can be run when constructing the nnabla graph.
- Parameters:
parents (list) – a list of static modules that are parents to this module
name (string, optional) – the name of the module
eval_prob (nnabla variable, optional) – the evaluation probability of this module
Examples
>>> from nnabla_nas import module as mo
>>> from nnabla_nas.module import static as smo
>>> class MyModule(smo.Module):
>>>     def __init__(self, parents):
>>>         smo.Module.__init__(self, parents=parents)
>>>         self.linear = mo.Linear(in_features=5, out_features=3)
>>>
>>>     def call(self, *input):
>>>         return self.linear(*input)
>>>
>>> module_1 = smo.Module(name='module_1')
>>> module_2 = MyModule(parents=[module_1])
- add_child(child)[source]¶
Adds a static_module as a child to self
- Parameters:
child (static_module) – the module to add as a child
- call(*inputs)[source]¶
The input to output mapping of the module. Given some inputs, it constructs the computational graph of this module. This method must be implemented for custom modules.
- Parameters:
*input – the output of the parents
- Returns:
the output of the module
- Return type:
nnabla variable
Examples
>>> out = my_module(inp_a, inp_b)
- property children¶
The child modules
- Returns:
the children of the module
- Return type:
list
- property eval_prob¶
The evaluation probability of this module. It is 1.0 if not specified otherwise.
- Returns:
the evaluation probability
- Return type:
nnabla variable
- property input_shapes¶
A list of input shapes of this module, i.e., the output shapes of all parent modules.
- Returns:
a list of tuples storing the output shapes of all parent modules
- Return type:
list
- property name¶
The name of the module.
- Returns:
the name of the module
- Return type:
string
- property output¶
The output module of this module. If the module is not a graph, it will return self.
- Returns:
the output module
- Return type:
- property parents¶
The parents of the module
- Returns:
the parents of the module
- Return type:
list
- property shape¶
The output shape of the static_module.
- Returns:
the shape of the output tensor
- Return type:
tuple
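The bookkeeping described by the properties above (parents, children, input shapes) can be sketched in pure Python. This is an illustration of the graph structure only; the class name `StaticNode` and the identity shape mapping are assumptions, not the actual nnabla_nas internals:

```python
# Each node stores its parents, registers itself as their child, and
# its input_shapes are simply the parents' output shapes.
class StaticNode:
    def __init__(self, parents=None, name=''):
        self.name = name
        self.parents = parents or []
        self.children = []
        for p in self.parents:
            p.children.append(self)

    @property
    def input_shapes(self):
        return [p.shape for p in self.parents]

    @property
    def shape(self):  # identity mapping for this sketch
        return self.input_shapes[0] if self.parents else (1, 3, 32, 32)

root = StaticNode(name='input')
relu = StaticNode(parents=[root], name='relu')
relu.input_shapes       # [(1, 3, 32, 32)]
root.children[0].name   # 'relu'
```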
- class nnabla_nas.module.static.ReLU(parents, name='', eval_prob=None, *args, **kwargs)[source]¶
-
The ReLU module is the static version of nnabla_nas.module.ReLU. It accepts only a single parent.
- Parameters:
parents (list) – the parents of this module
name (string) – the name of this module
inplace (bool, optional) – can optionally do the operation in-place. Default: False.
- class nnabla_nas.module.static.ReLU6(parents, name='', eval_prob=None, *args, **kwargs)[source]¶
-
The ReLU6 module is the static version of nnabla_nas.module.ReLU6. It accepts only a single parent.
- Parameters:
parents (list) – the parents of this module
name (string) – the name of this module
inplace (bool, optional) – can optionally do the operation in-place. Default: False.
- class nnabla_nas.module.static.Zero(parents, name='', eval_prob=None, *args, **kwargs)[source]¶
-
The Zero module returns a tensor of zeros with the same shape as the output of its parent. It accepts only a single parent.
- Parameters:
parents (list) – the parents of this module
name (string) – the name of this module
Examples
>>> my_module = Zero(parents=[...], name='my_module')
- call(*inputs)[source]¶
The input to output mapping of the module. Given some inputs, it constructs the computational graph of this module. This method must be implemented for custom modules.
- Parameters:
*input – the output of the parents
- Returns:
the output of the module
- Return type:
nnabla variable
Examples
>>> out = my_module(inp_a, inp_b)
nnabla_nas.runner¶
- class nnabla_nas.runner.runner.Runner(model, optimizer, regularizer, dataloader, hparams, args)[source]¶
Bases:
ABC
Runner is a basic class for training a model.
You can adapt this class for your own runner by reimplementing the abstract methods of this class.
- Parameters:
model (nnabla_nas.contrib.model.Model) – The search model used to search the architecture.
optimizer (dict) – This stores optimizers for both train and valid graphs. Must only store instances of Optimizer.
regularizer (dict) – This stores regularizers such as the latency and memory estimators.
dataloader (dict) – This stores dataloaders for both train and valid graphs.
hparams (Configuration) – This stores all hyperparameters used during training.
args (Configuration) – This stores other variables used during training: event, communicator, output_path…
- property fast_mode¶
- abstract train_on_batch(key='train')[source]¶
Runs the model update on a single batch of train data.
Searcher¶
DartsSearcher¶
ProxylessNasSearcher¶
FairNasSearcher¶
OFASearcher¶
- class nnabla_nas.runner.searcher.ofa.OFASearcher(model, optimizer, regularizer, dataloader, hparams, args)[source]¶
Bases:
Searcher
An implementation of OFA.
- get_net_parameters_with_keys(keys, mode='include', grad_only=False)[source]¶
Returns an OrderedDict containing model parameters.
- Parameters:
keys (list of str) – Patterns of parameters to be considered for inclusion or exclusion. Note: Keys passed must be in regular expression format.
mode (str, optional) – How the keys are used to select parameters: parameters matching the keys are selected if mode='include'; parameters not matching the keys are selected if mode='exclude'. Choices: ['include', 'exclude']. Defaults to 'include'.
grad_only (bool, optional) – If set to True, only parameters with need_grad=True are returned. Defaults to False.
- Returns:
A dictionary containing parameters.
- Return type:
OrderedDict
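The include/exclude selection described above can be sketched with Python's `re` module (the helper name `filter_params` and the example parameter names are made up for illustration; this is not the nnabla_nas implementation):

```python
import re
from collections import OrderedDict

# Keep parameters whose names match any key pattern ('include'),
# or whose names match none of them ('exclude').
def filter_params(params, keys, mode='include'):
    match = lambda name: any(re.search(k, name) for k in keys)
    keep = match if mode == 'include' else (lambda name: not match(name))
    return OrderedDict((n, p) for n, p in params.items() if keep(n))

params = OrderedDict([('conv/W', 1), ('conv/b', 2), ('bn/gamma', 3)])
filter_params(params, [r'conv/'], mode='include')  # conv/W, conv/b
filter_params(params, [r'conv/'], mode='exclude')  # bn/gamma
```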
- reset_running_statistics(net=None, subset_size=2000, subset_batch_size=200, dataloader=None, dataloader_batch_size=None, inp_shape=None)[source]¶
Trainer¶
- class nnabla_nas.runner.trainer.OFATrainer(model, optimizer, regularizer, dataloader, hparams, args)[source]¶
Bases:
Runner
Trainer class for OFA
nnabla_nas.utils¶
Profiler¶
- nnabla_nas.utils.data.transforms.CIFAR10_transform(key='train')[source]¶
Return a transform applied to data augmentation for CIFAR10.
- class nnabla_nas.utils.data.transforms.Compose(transforms)[source]¶
Bases:
object
Composes several transforms together.
- Parameters:
transforms (list of
Transform
objects) – list of transforms to compose.
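A minimal sketch of what Compose does, using plain callables as stand-ins for Transform objects (illustrative only):

```python
# Apply each transform in order, feeding the output of one
# into the next.
class Compose:
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, x):
        for t in self.transforms:
            x = t(x)
        return x

pipeline = Compose([lambda x: x * 2, lambda x: x + 1])
pipeline(3)  # 7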
- class nnabla_nas.utils.data.transforms.Cutout(length, prob=0.5, seed=-1)[source]¶
Bases:
object
Cutout layer.
Cutout is a simple regularization technique for convolutional neural networks that involves removing contiguous sections of input images, effectively augmenting the dataset with partially occluded versions of existing samples.
- Parameters:
length (int) – The length of the region which will be cut out.
prob (float, optional) – Probability of erasing. Defaults to 0.5.
References
- [1] DeVries, Terrance, and Graham W. Taylor. “Improved regularization of convolutional neural networks with cutout.” arXiv preprint arXiv:1708.04552 (2017).
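The Cutout operation can be sketched on a 2-D list of pixel values (a simplified illustration assuming length fits inside the image; function name and arguments are assumptions, not the nnabla_nas implementation):

```python
import random

# With probability `prob`, zero out a length x length square
# at a random location of the 2-D image.
def cutout(img, length, prob=0.5, rng=None):
    rng = rng or random.Random()
    if rng.random() >= prob:
        return img
    H, W = len(img), len(img[0])
    top = rng.randrange(H - length + 1)
    left = rng.randrange(W - length + 1)
    for i in range(top, top + length):
        for j in range(left, left + length):
            img[i][j] = 0
    return img

img = [[1] * 4 for _ in range(4)]
out = cutout(img, 2, prob=1.0, rng=random.Random(1))
# exactly one 2x2 square (4 pixels) is zeroed
```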
- nnabla_nas.utils.data.transforms.ImageNet_transform(key='train')[source]¶
Return a transform applied to data augmentation for ImageNet.
- class nnabla_nas.utils.data.transforms.Lambda(func)[source]¶
Bases:
object
Apply a user-defined lambda as a transform.
- Parameters:
func (function) – Lambda/function to be used for transform.
- class nnabla_nas.utils.data.transforms.Normalize(mean, std, scale)[source]¶
Bases:
object
Normalizes an input image with mean and standard deviation.
Given mean (M1,...,Mn) and std (S1,...,Sn) for n channels, this transform normalizes each channel of the input image, i.e., input[channel] = (input[channel] - mean[channel]) / std[channel].
- Parameters:
mean (sequence) – Sequence of means for each channel.
std (sequence) – Sequence of standard deviations for each channel.
scale (float) – Scales the inputs by a scalar.
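The per-channel formula above can be sketched in pure Python, assuming `img` is a list of channels, each a flat list of pixel values, and that `scale` multiplies the input first (e.g. 1/255 to map uint8 pixels to [0, 1]); this is an illustration, not the library code:

```python
# input[channel] = (input[channel] * scale - mean[channel]) / std[channel]
def normalize(img, mean, std, scale=1.0):
    return [[(p * scale - m) / s for p in channel]
            for channel, m, s in zip(img, mean, std)]

normalize([[2.0, 4.0]], mean=[1.0], std=[2.0])  # [[0.5, 1.5]]
```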
- class nnabla_nas.utils.data.transforms.RandomCrop(shape, pad_width=None)[source]¶
Bases:
object
RandomCrop randomly extracts a portion of an array.
- Parameters:
shape (tuple of int) – The shape of the cropped output.
pad_width (tuple of int, optional) – Iterable of before and after pad values. Defaults to None. Pad the input N-D array x over the number of dimensions given by half the length of the pad_width iterable, where every two values in pad_width determine the before and after pad size of an axis. The pad_width iterable must hold an even number of positive values which may cover all or fewer dimensions of the input variable x.
- class nnabla_nas.utils.data.transforms.RandomHorizontalFlip[source]¶
Bases:
object
Horizontally flip the given image randomly with probability 0.5.
- class nnabla_nas.utils.data.transforms.RandomResizedCrop(shape, scale=None, ratio=None, interpolation='linear')[source]¶
Bases:
object
Crop a random portion of image and resize it.
- Parameters:
shape (tuple of int) – The output image shape.
scale (tuple of float) – lower and upper scale ratio when randomly scaling the image.
ratio (float) – The aspect ratio range when randomly deforming the image. For example, to deform aspect ratio of image from 1:1.3 to 1.3:1, specify “1.3”. To not apply random deforming, specify “1.0”.
interpolation (str) – Interpolation mode chosen from (‘linear’|’nearest’). The default is ‘linear’.
- class nnabla_nas.utils.data.transforms.RandomVerticalFlip[source]¶
Bases:
object
Vertically flip the given PIL Image randomly with probability 0.5.
- class nnabla_nas.utils.data.transforms.Resize(size, interpolation='linear')[source]¶
Bases:
object
Resize an ND array with interpolation.
- Parameters:
size (tuple of int) – The output sizes for axes. If this is given, the scale factors are determined by the output sizes and the input sizes.
interpolation (str) – Interpolation mode chosen from (‘linear’|’nearest’). The default is ‘linear’.
Estimator¶
SummaryWriter¶
- class nnabla_nas.utils.tensorboard.writer.FileWriter(log_dir, max_queue=10, flush_secs=120, filename_suffix='')[source]¶
Bases:
object
Write protocol buffers to event files.
- Parameters:
log_dir (str) – Directory where event file will be written.
max_queue (int, optional) – Size of the queue for pending events and summaries before one of the ‘add’ calls forces a flush to disk. Defaults to 10.
flush_secs (int, optional) – How often, in seconds, to flush the pending events and summaries to disk. Defaults to every two minutes (120s).
filename_suffix (str, optional) – Suffix added to all event filenames in the log_dir directory.
- add_event(event, step=None, walltime=None)[source]¶
Adds an event to the event file.
- Parameters:
event – An Event protocol buffer.
step (int, optional) – Optional global step value for training process to record with the event.
walltime (float, optional) – Optional walltime to override the default (current) walltime (from time.time()), in seconds after epoch.
- add_graph(graph_profile, walltime=None)[source]¶
Adds a Graph and step stats protocol buffer to the event file.
- Parameters:
graph_profile – A Graph and step stats protocol buffer.
walltime (float, optional) – Optional walltime to override the default (current) walltime (from time.time()) seconds after epoch.
- add_summary(summary, global_step=None, walltime=None)[source]¶
Adds a Summary protocol buffer to the event file.
- Parameters:
summary – A Summary protocol buffer.
global_step (int, optional) – Optional global step value for training process to record with the summary.
walltime (float, optional) – Optional walltime to override the default (current) walltime (from time.time()) seconds after epoch.
- class nnabla_nas.utils.tensorboard.writer.SummaryWriter(log_dir=None, comment='', purge_step=None, max_queue=10, flush_secs=120, filename_suffix='')[source]¶
Bases:
object
Creates a SummaryWriter that will write out events and summaries to the event file.
- Parameters:
log_dir (string) – Save directory location. Default is runs/CURRENT_DATETIME_HOSTNAME, which changes after each run. Use hierarchical folder structure to compare between runs easily. e.g. pass in ‘runs/exp1’, ‘runs/exp2’, etc. for each new experiment to compare across them.
comment (string) – Comment log_dir suffix appended to the default log_dir. If log_dir is assigned, this argument has no effect.
purge_step (int) – Note that crashed and resumed experiments should have the same log_dir.
max_queue (int) – Size of the queue for pending events and summaries before one of the ‘add’ calls forces a flush to disk. Default is ten items.
flush_secs (int) – How often, in seconds, to flush the pending events and summaries to disk. Default is every two minutes.
filename_suffix (string) – Suffix added to all event filenames in the log_dir directory. More details on filename construction in tensorboard.summary.writer.event_file_writer.EventFileWriter.
nnabla_nas.contrib¶
DARTS¶
- class nnabla_nas.contrib.classification.darts.SearchNet(in_channels, init_channels, num_cells, num_classes, num_choices=4, multiplier=4, mode='full', shared=False, stem_multiplier=3)[source]¶
Bases:
ClassificationModel
DARTS: Differentiable Architecture Search.
This is the search space for DARTS.
- Parameters:
in_channels (int) – The number of input channels.
init_channels (int) – The initial number of channels on each cell.
num_cells (int) – The number of cells.
num_classes (int) – The number of classes.
num_choices (int, optional) – The number of choice blocks on each cell. Defaults to 4.
multiplier (int, optional) – The multiplier. Defaults to 4.
mode (str, optional) – The sampling strategy (‘full’, ‘max’, ‘sample’). Defaults to ‘full’.
shared (bool, optional) – If parameters are shared between cells. Defaults to False.
stem_multiplier (int, optional) – The multiplier used for stem convolution. Defaults to 3.
- get_arch_parameters(grad_only=False)[source]¶
Returns an OrderedDict containing architecture parameters.
- Parameters:
grad_only (bool, optional) – If set to True, only parameters with need_grad=True are returned. Defaults to False.
- Returns:
A dictionary containing parameters.
- Return type:
OrderedDict
- get_net_parameters(grad_only=False)[source]¶
Returns an OrderedDict containing model parameters.
- Parameters:
grad_only (bool, optional) – If set to True, only parameters with need_grad=True are returned. Defaults to False.
- Returns:
A dictionary containing parameters.
- Return type:
OrderedDict
- loss(outputs, targets, loss_weights=None)[source]¶
Return loss computed from a list of outputs and list of targets.
- Parameters:
outputs (list of nn.Variable) – A list of output variables computed from the model.
targets (list of nn.Variable) – A list of target variables loaded from the data.
loss_weights (list of float, optional) – A list specifying scalar coefficients to weight the loss contributions of different model outputs. It is expected to have a 1:1 mapping to model outputs. Defaults to None.
- Returns:
A scalar NNabla Variable representing the loss.
- Return type:
nn.Variable
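How `loss_weights` combines the per-output losses can be sketched with plain floats (illustrative only; real losses are nn.Variables, and the helper name `total_loss` is an assumption):

```python
# A 1:1 weighted sum of per-output losses; each weight defaults
# to 1.0 when loss_weights is None.
def total_loss(per_output_losses, loss_weights=None):
    if loss_weights is None:
        loss_weights = [1.0] * len(per_output_losses)
    assert len(loss_weights) == len(per_output_losses)
    return sum(w * l for w, l in zip(loss_weights, per_output_losses))

total_loss([0.7, 0.3], loss_weights=[1.0, 0.4])  # 0.82
total_loss([1.0, 2.0])                           # 3.0
```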
- save_net_nnp(path, inp, out, calc_latency=False, func_real_latency=None, func_accum_latency=None, save_params=None)[source]¶
Saves the whole net as one nnp file. Optionally calculates the whole net’s (real) latency (using e.g. NNabla’s [Profiler]) and also the layer-based latency. The modules are discovered using the nnabla graph of the whole net; the latency is then calculated based on each individual module’s nnabla graph (e.g. [LatencyGraphEstimator]).
- Parameters:
path – absolute path
inp – input of the created network
out – output of the created network
calc_latency – flag to calculate the latency
func_real_latency – function used to calculate the actual latency
func_accum_latency – function used to calculate the accumulated latency, i.e., dissect the network layer by layer using its graph, calculate the latency of each layer, and sum up the results.
- save_parameters(path=None, params=None, grad_only=False)[source]¶
Saves the parameters to a file.
- Parameters:
path (str) – Absolute path to file.
params (OrderedDict, optional) – An OrderedDict containing parameters. If params is None, then the current parameters will be saved.
grad_only (bool, optional) – If set to True, only parameters with need_grad=True are saved. Defaults to False.
- class nnabla_nas.contrib.classification.darts.TrainNet(in_channels, init_channels, num_cells, num_classes, genotype, num_choices=4, multiplier=4, stem_multiplier=3, drop_path=0, auxiliary=False)[source]¶
Bases:
ClassificationModel
TrainNet used for DARTS.
- loss(outputs, targets, loss_weights=None)[source]¶
Return loss computed from a list of outputs and list of targets.
- Parameters:
outputs (list of nn.Variable) – A list of output variables computed from the model.
targets (list of nn.Variable) – A list of target variables loaded from the data.
loss_weights (list of float, optional) – A list specifying scalar coefficients to weight the loss contributions of different model outputs. It is expected to have a 1:1 mapping to model outputs. Defaults to None.
- Returns:
A scalar NNabla Variable representing the loss.
- Return type:
nn.Variable
MobileNet V2¶
- class nnabla_nas.contrib.classification.mobilenet.network.SearchNet(name='', num_classes=1000, width_mult=1, settings=None, drop_rate=0, candidates=None, mode='sample', skip_connect=True)[source]¶
Bases:
ClassificationModel
MobileNet V2 search space.
This implementation is based on the PyTorch implementation.
- Parameters:
num_classes (int) – Number of classes
width_mult (float, optional) – Width multiplier - adjusts number of channels in each layer by this amount
settings (list, optional) – Network structure. Defaults to None.
drop_rate (float, optional) – Drop rate used in Dropout. Defaults to 0.
candidates (list of str, optional) – A list of candidates. Defaults to None.
skip_connect (bool, optional) – Whether the skip connect is used. Defaults to True.
References
- Sandler, M., Howard, A., Zhu, M., Zhmoginov, A. and Chen, L.C., 2018. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4510-4520).
- get_arch_parameters(grad_only=False)[source]¶
Returns an OrderedDict containing architecture parameters.
- Parameters:
grad_only (bool, optional) – If set to True, only parameters with need_grad=True are returned. Defaults to False.
- Returns:
A dictionary containing parameters.
- Return type:
OrderedDict
- get_net_parameters(grad_only=False)[source]¶
Returns an OrderedDict containing model parameters.
- Parameters:
grad_only (bool, optional) – If set to True, only parameters with need_grad=True are returned. Defaults to False.
- Returns:
A dictionary containing parameters.
- Return type:
OrderedDict
- loss(outputs, targets, loss_weights=None)[source]¶
Return loss computed from a list of outputs and list of targets.
- Parameters:
outputs (list of nn.Variable) – A list of output variables computed from the model.
targets (list of nn.Variable) – A list of target variables loaded from the data.
loss_weights (list of float, optional) – A list specifying scalar coefficients to weight the loss contributions of different model outputs. It is expected to have a 1:1 mapping to model outputs. Defaults to None.
- Returns:
A scalar NNabla Variable representing the loss.
- Return type:
nn.Variable
- property modules_to_profile¶
Returns a list with the modules that will be profiled when the Profiler functions are called. All other modules in the network will not be profiled.
- class nnabla_nas.contrib.classification.mobilenet.network.TrainNet(num_classes=1000, width_mult=1, settings=None, drop_rate=0, candidates=None, mode='sample', skip_connect=True, genotype=None)[source]¶
Bases:
SearchNet
MobileNet V2 Train Net.
- Parameters:
num_classes (int) – Number of classes
width_mult (float, optional) – Width multiplier - adjusts number of channels in each layer by this amount
settings (list, optional) – Network structure. Defaults to None.
round_nearest (int, optional) – Round the number of channels in each layer to be a multiple of this number. Set to 1 to turn off rounding.
n_max (int, optional) – The number of blocks. Defaults to 4.
block – Module specifying inverted residual building block for mobilenet. Defaults to None.
mode (str, optional) – The sampling strategy (‘full’, ‘max’, ‘sample’). Defaults to ‘sample’.
skip_connect (bool, optional) – Whether the skip connect is used. Defaults to True.
genotype (str, optional) – The path to architecture file. Defaults to None.
References
- [1] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A. and Chen, L.C., 2018. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4510-4520).
Random Wired¶
- class nnabla_nas.contrib.classification.random_wired.random_wired.AvgPool2x2(parents, channels, name='')[source]¶
Bases:
RandomModule
An average pooling module that accepts multiple parents. This pooling module is a random module, meaning that it automatically adjusts the dimensions of all input tensors and aggregates the result before applying the pooling.
- Parameters:
parents (list) – the parent modules to this module
name (string, optional) – the name of the module
channels (int) – ignored
- class nnabla_nas.contrib.classification.random_wired.random_wired.Conv(parents, channels, kernel, pad, name='')[source]¶
Bases:
RandomModule
A convolution that accepts multiple parents. This convolution is a random module, meaning that it automatically adjusts the dimensions of all input tensors and aggregates the result before applying the convolution.
- Parameters:
parents (list) – the parent modules to this module
name (string, optional) – the name of the module
channels (int) – the number of output channels of this module
kernel (tuple) – the kernel shape
pad (tuple) – the padding scheme used
- class nnabla_nas.contrib.classification.random_wired.random_wired.Conv3x3(parents, channels, name='')[source]¶
Bases:
Conv
A convolution of shape 3x3 that accepts multiple parents. This convolution is a random module, meaning that it automatically adjusts the dimensions of all input tensors and aggregates the result before applying the convolution.
- Parameters:
parents (list) – the parent modules to this module
name (string, optional) – the name of the module
channels (int) – the number of output channels of this module
- class nnabla_nas.contrib.classification.random_wired.random_wired.Conv5x5(parents, channels, name='')[source]¶
Bases:
Conv
A convolution of shape 5x5 that accepts multiple parents. This convolution is a random module, meaning that it automatically adjusts the dimensions of all input tensors and aggregates the result before applying the convolution.
- Parameters:
parents (list) – the parent modules to this module
name (string, optional) – the name of the module
channels (int) – the number of output channels of this module
- class nnabla_nas.contrib.classification.random_wired.random_wired.MaxPool2x2(parents, channels, name='')[source]¶
Bases:
RandomModule
A max pooling module that accepts multiple parents. This pooling module is a random module, meaning that it automatically adjusts the dimensions of all input tensors and aggregates the result before applying the pooling.
- Parameters:
parents (list) – the parent modules to this module
name (string, optional) – the name of the module
channels (int) – ignored
- class nnabla_nas.contrib.classification.random_wired.random_wired.RandomModule(parents, channels, name='')[source]¶
Bases:
Graph
A module that automatically aggregates all the output tensors generated by its parents. To this end, it automatically adjusts the channel count and the feature map dimensions of each input through 1x1 convolution and pooling; the results are summed up. Please refer to [Xie et al.].
- Parameters:
parents (list) – the parent modules to this module
name (string, optional) – the name of the module
channels (int) – the number of output channels of this module
References
Xie, Saining, et al. “Exploring randomly wired neural networks for image recognition.” Proceedings of the IEEE International Conference on Computer Vision. 2019.
- class nnabla_nas.contrib.classification.random_wired.random_wired.SepConv(parents, channels, kernel, pad, name='')[source]¶
Bases:
RandomModule
A separable convolution that accepts multiple parents. This convolution is a random module, meaning that it automatically adjusts the dimensions of all input tensors and aggregates the result before applying the convolution.
- Parameters:
parents (list) – the parent modules to this module
name (string, optional) – the name of the module
channels (int) – the number of output channels of this module
kernel (tuple) – the kernel shape
pad (tuple) – the padding scheme used
- class nnabla_nas.contrib.classification.random_wired.random_wired.SepConv3x3(parents, channels, name='')[source]¶
Bases:
SepConv
A separable convolution of shape 3x3 that accepts multiple parents. This convolution is a random module, meaning that it automatically adjusts the dimensions of all input tensors and aggregates the result before applying the convolution.
- Parameters:
parents (list) – the parent modules to this module
name (string, optional) – the name of the module
channels (int) – the number of output channels of this module
- class nnabla_nas.contrib.classification.random_wired.random_wired.SepConv5x5(parents, channels, name='')[source]¶
Bases:
SepConv
A separable convolution of shape 5x5 that accepts multiple parents. This convolution is a random module, meaning that it automatically adjusts the dimensions of all input tensors and aggregates the result before applying the convolution.
- Parameters:
parents (list) – the parent modules to this module
name (string, optional) – the name of the module
channels (int) – the number of output channels of this module
- class nnabla_nas.contrib.classification.random_wired.random_wired.TrainNet(n_vertices=20, input_shape=(3, 32, 32), n_classes=10, candidates=[<class 'nnabla_nas.contrib.classification.random_wired.random_wired.RandomModule'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.SepConv3x3'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.SepConv5x5'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.RandomModule'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.SepConv3x3'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.SepConv5x5'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.RandomModule'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.SepConv3x3'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.SepConv5x5'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.MaxPool2x2'>, <class 'nnabla_nas.contrib.classification.random_wired.random_wired.AvgPool2x2'>], min_channels=128, max_channels=1024, k=4, p=0.75, name='')[source]¶
Bases:
ClassificationModel
,Graph
A randomly wired DNN that uses the Watts-Strogatz process to generate random DNN architectures. Please refer to [Xie et al.].
- Parameters:
n_vertices (int) – the number of random modules within this network
input_shape (tuple) – the shape of the input of this network
n_classes (int) – the number of output classes of this network
candidates (list) – a list of random_modules which are randomly instantiated as vertices
min_channels (int) – the minimum channel count of a vertex
max_channels (int) – the maximum channel count of a vertex
k (int) – the connectivity parameter of the Watts-Strogatz process
p (float) – the re-wiring probability parameter of the Watts-Strogatz process
name (string) – the name of the network
References
- Xie, Saining, et al. “Exploring randomly wired neural networks for image recognition.” Proceedings of the IEEE International Conference on Computer Vision. 2019.
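The Watts-Strogatz process mentioned above can be sketched with the standard library (a simplified, stdlib-only illustration: the actual network additionally orders the edges to obtain a DAG of random modules, and the helper name `watts_strogatz` is an assumption):

```python
import random

# Start from a ring lattice where each vertex connects to its k
# nearest neighbours, then rewire each edge's endpoint with
# probability p.
def watts_strogatz(n, k, p, rng=None):
    rng = rng or random.Random()
    edges = {(i, (i + j) % n) for i in range(n) for j in range(1, k // 2 + 1)}
    rewired = set()
    for (u, v) in edges:
        if rng.random() < p:
            v = rng.choice([w for w in range(n) if w != u])
        rewired.add((u, v))
    return rewired

# e.g. n_vertices=20, k=4, p=0.75 as in the TrainNet defaults above
g = watts_strogatz(20, 4, 0.75, rng=random.Random(0))
```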
- get_arch_parameters(grad_only=False)[source]¶
Returns an OrderedDict containing all architecture parameters of the model.
- Parameters:
grad_only (bool, optional) – If set to True, only parameters with need_grad=True will be retrieved. Defaults to False.
- Raises:
NotImplementedError –
- get_net_parameters(grad_only=False)[source]¶
Returns an OrderedDict containing all network parameters of the model.
- Parameters:
grad_only (bool, optional) – If set to True, only parameters with need_grad=True will be retrieved. Defaults to False.
- Raises:
NotImplementedError –
- property input_shapes¶
Return a list of input shapes used during call function.
- property modules_to_profile¶
Returns a list with the modules that will be profiled when the Profiler functions are called. All other modules in the network will not be profiled.
Zoph¶
- class nnabla_nas.contrib.classification.zoph.zoph.AveragePool3x3(parents, channels, name='', eval_prob=None)[source]¶
Bases:
Graph
A static average pooling of size 3x3 followed by batch normalization and ReLU
- Parameters:
parents (list) – a list of static modules that are parents to this module
channels (int) – the number of features
name (string, optional) – the name of the module
- class nnabla_nas.contrib.classification.zoph.zoph.DilSepConv3x3(parents, channels, name='', eval_prob=None)[source]¶
Bases:
SepConvBN
A static dilated separable convolution of shape 3x3 that applies batch normalization and ReLU at the end.
- Parameters:
parents (list) – a list of static modules that are parents to this module
channels (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.
name (string, optional) – the name of the module
- class nnabla_nas.contrib.classification.zoph.zoph.DilSepConv5x5(parents, channels, name='', eval_prob=None)[source]¶
Bases:
SepConvBN
A static dilated separable convolution of shape 5x5 that applies batch normalization and ReLU at the end.
- Parameters:
parents (list) – a list of static modules that are parents to this module
channels (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.
name (string, optional) – the name of the module
- class nnabla_nas.contrib.classification.zoph.zoph.MaxPool3x3(parents, channels, name='', eval_prob=None)[source]¶
Bases:
Graph
A static max pooling of size 3x3 followed by batch normalization and ReLU
- Parameters:
parents (list) – a list of static modules that are parents to this module
channels (int) – the number of features
name (string, optional) – the name of the module
- class nnabla_nas.contrib.classification.zoph.zoph.SearchNet(name='', input_shape=(3, 32, 32), n_classes=10, stem_channels=128, cells=[<class 'nnabla_nas.contrib.classification.zoph.zoph.ZophCell'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.ZophCell'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.ZophCell'>], cell_depth=[7, 7, 7], cell_channels=[128, 256, 512], reducing=[False, True, True], join_parameters=[[None, None, None, None, None, None, None], [None, None, None, None, None, None, None], [None, None, None, None, None, None, None]], candidates=[<class 'nnabla_nas.contrib.classification.zoph.zoph.SepConv3x3'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.SepConv5x5'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.DilSepConv3x3'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.DilSepConv5x5'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.MaxPool3x3'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.AveragePool3x3'>, <class 'nnabla_nas.module.static.static_module.Identity'>, <class 'nnabla_nas.module.static.static_module.Zero'>], mode='sample')[source]¶
Bases:
ClassificationModel
,Graph
A search space as defined in [Bender et al.]
- Parameters:
name (string, optional) – the name of the module
input_shape (tuple) – the shape of the network input
n_classes (int) – the number of output classes
stem_channels (int) – the number of channels for the stem convolutions
cells (list) – the type of the cells used within this search space
cell_depth (list) – the number of modules within each cell
reducing (list) – specifies for each cell if it reduces the feature map dimensions through pooling
join_parameters (list) – the join_parameters used in each cell and block.
candidates (list, optional) – the candidate modules instantiated within this block (e.g. ZOPH_CANDIDATES)
mode (string) – the mode used by the join modules within this network
References
- Bender, Gabriel. “Understanding and simplifying one-shot
architecture search.” (2019).
- get_arch_parameters(grad_only=False)[source]¶
- Returns an OrderedDict containing all architecture parameters of the model.
- Parameters:
grad_only (bool, optional) – If set to True, only parameters with need_grad=True will be retrieved. Defaults to False.
- Raises:
NotImplementedError
- get_net_parameters(grad_only=False)[source]¶
Returns an OrderedDict containing all network parameters of the model.
- Parameters:
grad_only (bool, optional) – If set to True, only parameters with need_grad=True will be retrieved. Defaults to False.
- Raises:
NotImplementedError
- property input_shapes¶
Return a list of input shapes used during call function.
- property modules_to_profile¶
Returns a list of the modules that will be profiled when the Profiler functions are called. All other modules in the network are not profiled.
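The per-cell arguments in the SearchNet signature above are parallel lists: each entry of cells pairs with an entry of cell_depth, cell_channels, reducing, and join_parameters, and each inner join_parameters list has one slot per module in the cell. A minimal sketch of that invariant, using only the default values shown in the signature (plain Python; no nnabla import required):

```python
# Default per-cell arguments from the SearchNet signature above.
cell_depth = [7, 7, 7]
cell_channels = [128, 256, 512]
reducing = [False, True, True]
join_parameters = [[None] * 7, [None] * 7, [None] * 7]

def check_cell_args(cell_depth, cell_channels, reducing, join_parameters):
    # All per-cell lists must have one entry per cell ...
    n_cells = len(cell_depth)
    assert len(cell_channels) == len(reducing) == len(join_parameters) == n_cells
    # ... and each cell provides one join-parameter slot per module.
    for depth, joins in zip(cell_depth, join_parameters):
        assert len(joins) == depth
    return n_cells

print(check_cell_args(cell_depth, cell_channels, reducing, join_parameters))  # 3
```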
- class nnabla_nas.contrib.classification.zoph.zoph.SepConv(parents, in_channels, out_channels, kernel, pad, dilation, with_bias, name='', eval_prob=None)[source]¶
Bases:
Graph
A static separable convolution (DepthWise conv + PointWise conv)
- Parameters:
parents (list) – a list of static modules that are parents to this module
in_channels (int) – Number of convolution kernels (which is equal to the number of input channels).
out_channels (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.
kernel (tuple of int) – Convolution kernel size. For example, to apply convolution on an image with a 3 (height) by 5 (width) two-dimensional kernel, specify (3,5).
pad (tuple of int, optional) – Padding sizes for dimensions. Defaults to None.
dilation (tuple of int, optional) – Dilation sizes for dimensions. Defaults to None.
with_bias (bool, optional) – Specify whether to include the bias term. Defaults to True.
name (string, optional) – the name of the module
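A depthwise convolution followed by a pointwise (1x1) convolution needs far fewer parameters than a standard convolution of the same shape. The sketch below illustrates this with generic parameter counts (bias terms ignored); it is not specific to this class:

```python
def conv_params(in_channels, out_channels, k):
    # Standard 2D convolution: one k x k filter per (input, output) channel pair.
    return in_channels * out_channels * k * k

def sep_conv_params(in_channels, out_channels, k):
    depthwise = in_channels * k * k          # one k x k filter per input channel
    pointwise = in_channels * out_channels   # 1x1 convolution mixes channels
    return depthwise + pointwise

# e.g. 128 -> 128 channels with a 3x3 kernel:
print(conv_params(128, 128, 3))      # 147456
print(sep_conv_params(128, 128, 3))  # 17536
```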
- class nnabla_nas.contrib.classification.zoph.zoph.SepConv3x3(parents, channels, name='', eval_prob=None)[source]¶
Bases:
SepConvBN
A static separable convolution of shape 3x3 that applies batchnorm and relu at the end.
- Parameters:
parents (list) – a list of static modules that are parents to this module
channels (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.
name (string, optional) – the name of the module
- class nnabla_nas.contrib.classification.zoph.zoph.SepConv5x5(parents, channels, name='', eval_prob=None)[source]¶
Bases:
SepConvBN
A static separable convolution of shape 5x5 that applies batchnorm and relu at the end.
- Parameters:
parents (list) – a list of static modules that are parents to this module
channels (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.
name (string, optional) – the name of the module
- class nnabla_nas.contrib.classification.zoph.zoph.SepConvBN(parents, out_channels, kernel, dilation, name='', eval_prob=None)[source]¶
Bases:
Graph
Two static separable convolutions followed by batchnorm and relu at the end.
- Parameters:
parents (list) – a list of static modules that are parents to this module
out_channels (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.
kernel (tuple of int) – Convolution kernel size. For example, to apply convolution on an image with a 3 (height) by 5 (width) two-dimensional kernel, specify (3,5).
dilation (tuple of int, optional) – Dilation sizes for dimensions. Defaults to None.
name (string, optional) – the name of the module
- class nnabla_nas.contrib.classification.zoph.zoph.TrainNet(name, input_shape=(3, 32, 32), n_classes=10, stem_channels=128, cells=[<class 'nnabla_nas.contrib.classification.zoph.zoph.ZophCell'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.ZophCell'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.ZophCell'>], cell_depth=[7, 7, 7], cell_channels=[128, 256, 512], reducing=[False, True, True], join_parameters=[[None, None, None, None, None, None, None], [None, None, None, None, None, None, None], [None, None, None, None, None, None, None]], candidates=[<class 'nnabla_nas.contrib.classification.zoph.zoph.SepConv3x3'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.SepConv5x5'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.DilSepConv3x3'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.DilSepConv5x5'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.MaxPool3x3'>, <class 'nnabla_nas.contrib.classification.zoph.zoph.AveragePool3x3'>, <class 'nnabla_nas.module.static.static_module.Identity'>, <class 'nnabla_nas.module.static.static_module.Zero'>], param_path=None, *args, **kwargs)[source]¶
Bases:
SearchNet
A search space as defined in [Bender et al.]. It is the same as SearchNet, except that mode is fixed to ‘max’.
- Parameters:
name (string, optional) – the name of the module
input_shape (tuple) – the shape of the network input
n_classes (int) – the number of output classes
stem_channels (int) – the number of channels for the stem convolutions
cells (list) – the type of the cells used within this search space
cell_depth (list) – the number of modules within each cell
reducing (list) – specifies for each cell if it reduces the feature map dimensions through pooling
join_parameters (list) – the join_parameters used in each cell and block
candidates (list, optional) – the candidate modules instantiated within this block (e.g. ZOPH_CANDIDATES)
mode (string) – the mode used by the join modules within this network
References
- Bender, Gabriel. “Understanding and simplifying one-shot
architecture search.” (2019).
- class nnabla_nas.contrib.classification.zoph.zoph.ZophBlock(parents, candidates, channels, name='', join_parameters=None)[source]¶
Bases:
Graph
A zoph block as defined in [Bender et al.]
- Parameters:
parents (list) – a list of static modules that are parents to this module
name (string, optional) – the name of the module
candidates (list) – the candidate modules instantiated within this block (e.g. ZOPH_CANDIDATES)
channels (int) – the number of output channels of this block
join_parameters (nnabla variable, optional) – the architecture parameters used to join the outputs of the candidate modules. join_parameters must have as many elements as there are candidates.
References
- Bender, Gabriel. “Understanding and simplifying one-shot
architecture search.” (2019).
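Within a block, the candidate outputs are combined through the join_parameters. In a typical one-shot setup this is a softmax-weighted sum of the candidate outputs; the sketch below illustrates that weighting scheme with plain Python scalars. The actual join module operates on nnabla variables and its exact behaviour depends on the mode argument, so treat this as an illustration, not the implementation:

```python
import math

def softmax_join(candidate_outputs, join_parameters):
    # Turn the architecture parameters into convex weights ...
    exps = [math.exp(a) for a in join_parameters]
    total = sum(exps)
    weights = [e / total for e in exps]
    # ... and mix the candidate outputs elementwise.
    return [sum(w * y for w, y in zip(weights, ys))
            for ys in zip(*candidate_outputs)]

# Two candidates, equal architecture parameters -> plain average.
print(softmax_join([[2.0, 4.0], [4.0, 8.0]], [0.0, 0.0]))  # [3.0, 6.0]
```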
- class nnabla_nas.contrib.classification.zoph.zoph.ZophCell(parents, candidates, channels, name='', n_modules=3, reducing=False, join_parameters=[None, None, None])[source]¶
Bases:
Graph
A zoph cell that consists of multiple zoph blocks, as defined in [Bender et al.]
- Parameters:
parents (list) – a list of static modules that are parents to this module
name (string, optional) – the name of the module
candidates (list) – the candidate modules instantiated within this block (e.g. ZOPH_CANDIDATES)
channels (int) – the number of output channels of this block
join_parameters (list of nnabla variable, optional) – a list of the architecture parameters used to join the outputs of the candidate modules. Each element in join_parameters must have as many elements as there are candidates. The length of this list must be n_modules.
References
- Bender, Gabriel. “Understanding and simplifying one-shot
architecture search.” (2019).
FairNas¶
- class nnabla_nas.contrib.classification.fairnas.SearchNet(num_classes=1000, width_mult=1, settings=None, drop_rate=0, candidates=None, skip_connect=True, weights=None, seed=123)[source]¶
Bases:
ClassificationModel
MobileNet V2 search space.
This implementation is based on the PyTorch implementation.
- Parameters:
num_classes (int) – Number of classes
width_mult (float, optional) – Width multiplier - adjusts number of channels in each layer by this amount
settings (list, optional) – Network structure. Defaults to None.
drop_rate (float, optional) – Drop rate used in Dropout. Defaults to 0.
candidates (list of str, optional) – A list of candidates. Defaults to None.
skip_connect (bool, optional) – Whether the skip connect is used. Defaults to True.
weights (str, optional) – The path to the weights file. Defaults to None.
seed (int, optional) – The seed for the random generator.
References
- Sandler, M., Howard, A., Zhu, M., Zhmoginov, A. and Chen, L.C., 2018.
Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4510-4520).
- get_net_parameters(grad_only=False)[source]¶
Returns an OrderedDict containing the network parameters.
- Parameters:
grad_only (bool, optional) – If set to True, only parameters with need_grad=True are returned. Defaults to False.
- Returns:
A dictionary containing parameters.
- Return type:
OrderedDict
- loss(outputs, targets, loss_weights=None)[source]¶
Return loss computed from a list of outputs and list of targets.
- Parameters:
outputs (list of nn.Variable) – A list of output variables computed from the model.
targets (list of nn.Variable) – A list of target variables loaded from the data.
loss_weights (list of float, optional) – A list specifying scalar coefficients to weight the loss contributions of different model outputs. It is expected to have a 1:1 mapping to model outputs. Defaults to None.
- Returns:
A scalar NNabla Variable representing the loss.
- Return type:
nn.Variable
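The loss_weights convention above (a 1:1 mapping of scalar coefficients to model outputs) can be sketched with plain Python scalars standing in for the per-output loss values; the real method operates on nn.Variable:

```python
def combine_losses(per_output_losses, loss_weights=None):
    # loss_weights maps 1:1 onto the model outputs; None means uniform weights.
    if loss_weights is None:
        loss_weights = [1.0] * len(per_output_losses)
    assert len(loss_weights) == len(per_output_losses)
    return sum(w * l for w, l in zip(loss_weights, per_output_losses))

print(combine_losses([0.5, 1.5]))              # 2.0
print(combine_losses([0.5, 1.5], [1.0, 0.0]))  # 0.5
```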
- class nnabla_nas.contrib.classification.fairnas.TrainNet(num_classes=1000, width_mult=1, settings=None, drop_rate=0, candidates=None, skip_connect=True, genotype=None, weights=None)[source]¶
Bases:
SearchNet
MobileNet V2 Train Net.
- Parameters:
num_classes (int) – Number of classes
width_mult (float, optional) – Width multiplier - adjusts number of channels in each layer by this amount
settings (list, optional) – Network structure. Defaults to None.
round_nearest (int, optional) – Round the number of channels in each layer to be a multiple of this number. Set to 1 to turn off rounding.
n_max (int, optional) – The number of blocks. Defaults to 4.
block – Module specifying inverted residual building block for mobilenet. Defaults to None.
skip_connect (bool, optional) – Whether the skip connect is used. Defaults to True.
genotype (str, optional) – The path to architecture file. Defaults to None.
References
- Sandler, M., Howard, A., Zhu, M., Zhmoginov, A. and Chen, L.C., 2018.
Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4510-4520).
OFAMobileNetV3¶
- class nnabla_nas.contrib.classification.ofa.networks.ofa_mbv3.OFAMbv3Net(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, base_stage_width=None, width_mult=1.0, op_candidates='MB6 3x3', depth_candidates=4, compound=False, fixed_kernel=False, weight_init='he_fout', weights=None)[source]¶
Bases:
ClassificationModel
MobileNet V3 Search Net.
- Parameters:
num_classes (int, optional) – Number of classes. Defaults to 1000.
bn_param (tuple, optional) – BatchNormalization decay rate and eps. Defaults to (0.9, 1e-5).
drop_rate (float, optional) – Drop rate used in Dropout. Defaults to 0.1.
base_stage_width (list of int, optional) – A list of base stage channel size. Defaults to None.
width_mult (float, optional) – Multiplier value to base stage channel size. Defaults to 1.0.
op_candidates (str or list of str, optional) – Operator choices. Defaults to “MB6 3x3”.
depth_candidates (int or list of int, optional) – Depth choices. Defaults to 4.
compound (bool, optional) – Use CompOFA or not. Defaults to False.
fixed_kernel (bool, optional) – Fix kernel or not. Defaults to False.
weight_init (str, optional) – Weight initializer. Defaults to ‘he_fout’.
weights (str, optional) – The relative path to weight file. Defaults to None.
References
- [1] Cai, Han, et al. “Once-for-all: Train one network and specialize it for
efficient deployment.” arXiv preprint arXiv:1908.09791 (2019).
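The default op_candidates string “MB6 3x3” encodes an inverted-residual MobileNet block with expand ratio 6 and a 3x3 kernel. A small parser sketch for this naming pattern (a hypothetical helper for illustration, not part of the API):

```python
def parse_mb_candidate(s):
    # "MB{E} {K}x{K}" -> (expand_ratio, kernel_size); hypothetical helper.
    block, kxk = s.split()
    assert block.startswith('MB')
    return int(block[2:]), int(kxk.split('x')[0])

print(parse_mb_candidate('MB6 3x3'))  # (6, 3)
```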
- CHANNEL_DIVISIBLE = 8¶
- get_net_parameters(grad_only=False)[source]¶
Returns an OrderedDict containing the network parameters.
- Parameters:
grad_only (bool, optional) – If set to True, only parameters with need_grad=True are returned. Defaults to False.
- Returns:
A dictionary containing parameters.
- Return type:
OrderedDict
- property grouped_block_index¶
- loss(outputs, targets, loss_weights=None)[source]¶
Return loss computed from a list of outputs and list of targets.
- Parameters:
outputs (list of nn.Variable) – A list of output variables computed from the model.
targets (list of nn.Variable) – A list of target variables loaded from the data.
loss_weights (list of float, optional) – A list specifying scalar coefficients to weight the loss contributions of different model outputs. It is expected to have a 1:1 mapping to model outputs. Defaults to None.
- Returns:
A scalar NNabla Variable representing the loss.
- Return type:
nn.Variable
- set_bn_param(decay_rate, eps, **kwargs)[source]¶
Sets decay_rate and eps to batchnormalization layers.
- Parameters:
decay_rate (float) – Decay rate of running mean and variance.
eps (float) – Tiny value to avoid zero division by std.
- set_parameters(params, raise_if_missing=False)[source]¶
Set parameters for the module.
- Parameters:
params (OrderedDict) – The parameters which will be loaded.
raise_if_missing (bool, optional) – Raise exception if some parameters are missing. Defaults to False.
- Raises:
ValueError – Parameters are not found.
- class nnabla_nas.contrib.classification.ofa.networks.ofa_mbv3.SearchNet(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, base_stage_width=None, width_mult=1.0, op_candidates='MB6 3x3', depth_candidates=4, compound=False, fixed_kernel=False, weight_init='he_fout', weights=None)[source]¶
Bases:
OFAMbv3Net
- class nnabla_nas.contrib.classification.ofa.networks.ofa_mbv3.TrainNet(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, base_stage_width=None, width_mult=1, op_candidates=None, depth_candidates=None, genotype=None, weights=None)[source]¶
Bases:
OFAMbv3Net
MobileNet V3 Train Net.
- Parameters:
num_classes (int, optional) – Number of classes. Defaults to 1000.
bn_param (tuple, optional) – BatchNormalization decay rate and eps. Defaults to (0.9, 1e-5).
drop_rate (float, optional) – Drop rate used in Dropout. Defaults to 0.1.
base_stage_width (list of int, optional) – A list of base stage channel size. Defaults to None.
width_mult (float, optional) – Multiplier value to base stage channel size. Defaults to 1.0.
op_candidates (str or list of str, optional) – Operator choices. Defaults to None.
depth_candidates (int or list of int, optional) – Depth choices. Defaults to None.
genotype (list of int, optional) – A list to operators. Defaults to None.
weights (str, optional) – Relative path to the weights file. Defaults to None.
OFAXception¶
- class nnabla_nas.contrib.classification.ofa.networks.ofa_xception.OFAXceptionNet(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, base_stage_width=[32, 64, 128, 256, 728, 1024, 1536, 2048], op_candidates='XP1 7x7 3', width_mult=1.0, weights=None)[source]¶
Bases:
ClassificationModel
Xception41 Base Class
This is the Base Class used for both TrainNet and SearchNet. This implementation is based on the PyTorch implementation given in References.
- Parameters:
num_classes (int) – Number of classes
bn_param (tuple, optional) – BatchNormalization decay rate and eps.
drop_rate (float, optional) – Drop rate used in Dropout in classifier. Defaults to 0.1.
base_stage_width (list of int, optional) – A list of base stage channel size. Defaults to None.
op_candidates (str or list of str, optional) – Operator choices. Defaults to “XP1 7x7 3” (the largest block in the search space).
width_mult (float, optional) – Multiplier value to base stage channel size. Defaults to 1.0.
weights (str, optional) – The path to the weights file. Defaults to None.
References
- [1] Cai, Han, et al. “Once-for-all: Train one network and specialize it for
efficient deployment.” arXiv preprint arXiv:1908.09791 (2019).
- [2] GitHub implementation of Xception41.
https://github.com/Cadene/pretrained-models.pytorch/blob/master/pretrainedmodels/models/xception.py
- CHANNEL_DIVISIBLE = 8¶
- NUM_MIDDLE_BLOCKS = 8¶
- get_arch_parameters(grad_only=False)[source]¶
Returns an OrderedDict containing architecture parameters.
- Parameters:
grad_only (bool, optional) – If set to True, only parameters with need_grad=True are returned. Defaults to False.
- Returns:
A dictionary containing parameters.
- Return type:
OrderedDict
- get_bn_param()[source]¶
Return dict of batchnormalization params.
- Returns:
A dictionary containing decay_rate and eps of batchnormalization
- Return type:
dict
- get_net_parameters(grad_only=False)[source]¶
Returns an OrderedDict containing the network parameters.
- Parameters:
grad_only (bool, optional) – If set to True, only parameters with need_grad=True are returned. Defaults to False.
- Returns:
A dictionary containing parameters.
- Return type:
OrderedDict
- loss(outputs, targets, loss_weights=None)[source]¶
Return loss computed from a list of outputs and list of targets.
- Parameters:
outputs (list of nn.Variable) – A list of output variables computed from the model.
targets (list of nn.Variable) – A list of target variables loaded from the data.
loss_weights (list of float, optional) – A list specifying scalar coefficients to weight the loss contributions of different model outputs. It is expected to have a 1:1 mapping to model outputs. Defaults to None.
- Returns:
A scalar NNabla Variable representing the loss.
- Return type:
nn.Variable
- set_bn_param(decay_rate, eps, **kwargs)[source]¶
Sets decay_rate and eps to batchnormalization layers.
- Parameters:
decay_rate (float) – Decay rate of running mean and variance.
eps (float) – Tiny value to avoid zero division by std.
- set_parameters(params, raise_if_missing=False)[source]¶
Set parameters for the module.
- Parameters:
params (OrderedDict) – The parameters which will be loaded.
raise_if_missing (bool, optional) – Raise exception if some parameters are missing. Defaults to False.
- Raises:
ValueError – Parameters are not found.
- class nnabla_nas.contrib.classification.ofa.networks.ofa_xception.ProcessGenotype[source]¶
Bases:
object
This class defines the search space and contains functions to process the genotypes and op_candidates to get the subnet architecture or the search space.
Operator candidates: “XP{E} {K}x{K} {D}”, E=expand_ratio, K=kernel_size, D=depth_of_block
Note: If the depth of a block is 1, expand_ratio is ignored, since only in_channels and out_channels are needed for a block with a single layer. The blocks “XP0.6 KxK 1”, “XP0.8 KxK 1”, and “XP1 KxK 1” are therefore equivalent in this architecture design.
- CANDIDATES = {'XP0.6 3x3 1': {'depth': 1, 'expand_ratio': 0.6, 'ks': 3}, 'XP0.6 3x3 2': {'depth': 2, 'expand_ratio': 0.6, 'ks': 3}, 'XP0.6 3x3 3': {'depth': 3, 'expand_ratio': 0.6, 'ks': 3}, 'XP0.6 5x5 1': {'depth': 1, 'expand_ratio': 0.6, 'ks': 5}, 'XP0.6 5x5 2': {'depth': 2, 'expand_ratio': 0.6, 'ks': 5}, 'XP0.6 5x5 3': {'depth': 3, 'expand_ratio': 0.6, 'ks': 5}, 'XP0.6 7x7 1': {'depth': 1, 'expand_ratio': 0.6, 'ks': 7}, 'XP0.6 7x7 2': {'depth': 2, 'expand_ratio': 0.6, 'ks': 7}, 'XP0.6 7x7 3': {'depth': 3, 'expand_ratio': 0.6, 'ks': 7}, 'XP0.8 3x3 1': {'depth': 1, 'expand_ratio': 0.8, 'ks': 3}, 'XP0.8 3x3 2': {'depth': 2, 'expand_ratio': 0.8, 'ks': 3}, 'XP0.8 3x3 3': {'depth': 3, 'expand_ratio': 0.8, 'ks': 3}, 'XP0.8 5x5 1': {'depth': 1, 'expand_ratio': 0.8, 'ks': 5}, 'XP0.8 5x5 2': {'depth': 2, 'expand_ratio': 0.8, 'ks': 5}, 'XP0.8 5x5 3': {'depth': 3, 'expand_ratio': 0.8, 'ks': 5}, 'XP0.8 7x7 1': {'depth': 1, 'expand_ratio': 0.8, 'ks': 7}, 'XP0.8 7x7 2': {'depth': 2, 'expand_ratio': 0.8, 'ks': 7}, 'XP0.8 7x7 3': {'depth': 3, 'expand_ratio': 0.8, 'ks': 7}, 'XP1 3x3 1': {'depth': 1, 'expand_ratio': 1, 'ks': 3}, 'XP1 3x3 2': {'depth': 2, 'expand_ratio': 1, 'ks': 3}, 'XP1 3x3 3': {'depth': 3, 'expand_ratio': 1, 'ks': 3}, 'XP1 5x5 1': {'depth': 1, 'expand_ratio': 1, 'ks': 5}, 'XP1 5x5 2': {'depth': 2, 'expand_ratio': 1, 'ks': 5}, 'XP1 5x5 3': {'depth': 3, 'expand_ratio': 1, 'ks': 5}, 'XP1 7x7 1': {'depth': 1, 'expand_ratio': 1, 'ks': 7}, 'XP1 7x7 2': {'depth': 2, 'expand_ratio': 1, 'ks': 7}, 'XP1 7x7 3': {'depth': 3, 'expand_ratio': 1, 'ks': 7}}¶
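The CANDIDATES table above is mechanical: every key follows the “XP{E} {K}x{K} {D}” pattern documented for this class. A parser sketch that reproduces one entry (a hypothetical helper for illustration, not part of the API):

```python
def parse_xp_candidate(s):
    # "XP{E} {K}x{K} {D}" -> {'depth': D, 'expand_ratio': E, 'ks': K}
    xp, kxk, depth = s.split()
    assert xp.startswith('XP')
    return {
        'depth': int(depth),
        'expand_ratio': float(xp[2:]),
        'ks': int(kxk.split('x')[0]),
    }

print(parse_xp_candidate('XP0.6 5x5 2'))
# {'depth': 2, 'expand_ratio': 0.6, 'ks': 5}
```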
- class nnabla_nas.contrib.classification.ofa.networks.ofa_xception.SearchNet(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, base_stage_width=[32, 64, 128, 256, 728, 1024, 1536, 2048], width_mult=1.0, op_candidates='XP1 7x7 3', weights=None)[source]¶
Bases:
OFAXceptionNet
Xception41 Search Net.
This defines the search space of OFA-Xception Model.
- Parameters:
num_classes (int) – Number of classes
bn_param (tuple, optional) – BatchNormalization decay rate and eps.
drop_rate (float, optional) – Drop rate used in Dropout of classifier. Defaults to 0.1.
base_stage_width (list of int, optional) – A list of base stage channel size. Defaults to [32, 64, 128, 256, 728, 1024, 1536, 2048].
width_mult (float, optional) – Multiplier value to base stage channel size. Defaults to 1.0.
op_candidates (str or list of str, optional) – Operator choices. Defaults to “XP1 7x7 3” (the largest block in the search space)
weights (str, optional) – The path to weight file. Defaults to None.
- class nnabla_nas.contrib.classification.ofa.networks.ofa_xception.TrainNet(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, base_stage_width=[32, 64, 128, 256, 728, 1024, 1536, 2048], width_mult=1, op_candidates=None, genotype=None, weights=None)[source]¶
Bases:
OFAXceptionNet
Xception41 Train Net.
This builds and initialises the OFA-Xception subnet architecture, which is passed in as a genotype list together with the corresponding op_candidates list used to decode the genotypes.
- Parameters:
num_classes (int) – Number of classes
bn_param (tuple, optional) – BatchNormalization decay rate and eps.
drop_rate (float, optional) – Drop rate used in Dropout of classifier. Defaults to 0.1.
base_stage_width (list of int, optional) – A list of base stage channel size. Defaults to [32, 64, 128, 256, 728, 1024, 1536, 2048].
width_mult (float, optional) – Multiplier value to base stage channel size. Defaults to 1.0.
op_candidates (str or list of str) – Operator choices. A required argument; defaults to None.
genotype (list of int, optional) – A list to operators. Defaults to None.
weights (str, optional) – The path to weight file. Defaults to None.
OFAResnet50¶
- class nnabla_nas.contrib.classification.ofa.networks.ofa_resnet50.OFAResNet50(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, depth_list=2, expand_ratio_list=0.25, width_mult_list=1.0, weight_init='he_fout', weights=None)[source]¶
Bases:
ClassificationModel
OFAResNet50 Base Class.
This is the Base Class used for both TrainNet and SearchNet. This implementation is based on the PyTorch implementation given in References.
- Parameters:
num_classes (int) – Number of classes
bn_param (tuple, optional) – BatchNormalization decay rate and eps.
drop_rate (float, optional) – Drop rate used in Dropout in classifier. Defaults to 0.1.
depth_list (int or list of int, optional) – Candidates of depth for each layer. Defaults to 2.
expand_ratio_list (float or list of float, optional) – Candidates of expand ratio for middle bottleneck layers. Defaults to 0.25.
width_mult_list (float or list of float, optional) – Candidates of width multiplication ratio for input/output feature size of bottleneck layers. Defaults to 1.0.
weight_init (str, optional) – Weight initialization method. Defaults to ‘he_fout’.
weights (str, optional) – Path to the weights file. Defaults to None.
References
- [1] Cai, Han, et al. “Once-for-all: Train one network and specialize it for
efficient deployment.” arXiv preprint arXiv:1908.09791 (2019).
- [2] GitHub implementation of Once-for-All.
- BASE_DEPTH_LIST = [2, 2, 4, 2]¶
- STAGE_WIDTH_LIST = [256, 512, 1024, 2048]¶
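CHANNEL_DIVISIBLE = 8 and the width_mult_list interact through channel rounding: scaled stage widths are rounded to a multiple of 8. The helper below follows the rounding rule commonly used in OFA/MobileNet-style implementations; the exact function used inside this class is an assumption, so treat it as a sketch:

```python
def make_divisible(v, divisor=8):
    # Round v to the nearest multiple of divisor, but never drop more
    # than 10% below the original value (common OFA/MobileNet rule).
    new_v = max(divisor, int(v + divisor / 2) // divisor * divisor)
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v

# Scaling STAGE_WIDTH_LIST by a hypothetical width multiplier of 0.65:
print([make_divisible(w * 0.65) for w in [256, 512, 1024, 2048]])
# [168, 336, 664, 1328]
```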
- get_net_parameters(grad_only=False)[source]¶
Returns an OrderedDict containing the network parameters.
- Parameters:
grad_only (bool, optional) – If set to True, only parameters with need_grad=True are returned. Defaults to False.
- Returns:
A dictionary containing parameters.
- Return type:
OrderedDict
- property grouped_block_index¶
- loss(outputs, targets, loss_weights=None)[source]¶
Return loss computed from a list of outputs and list of targets.
- Parameters:
outputs (list of nn.Variable) – A list of output variables computed from the model.
targets (list of nn.Variable) – A list of target variables loaded from the data.
loss_weights (list of float, optional) – A list specifying scalar coefficients to weight the loss contributions of different model outputs. It is expected to have a 1:1 mapping to model outputs. Defaults to None.
- Returns:
A scalar NNabla Variable representing the loss.
- Return type:
nn.Variable
- set_bn_param(decay_rate, eps, **kwargs)[source]¶
Sets decay_rate and eps to batchnormalization layers.
- Parameters:
decay_rate (float) – Decay rate of running mean and variance.
eps (float) – Tiny value to avoid zero division by std.
- set_parameters(params, raise_if_missing=False)[source]¶
Set parameters for the module.
- Parameters:
params (OrderedDict) – The parameters which will be loaded.
raise_if_missing (bool, optional) – Raise exception if some parameters are missing. Defaults to False.
- Raises:
ValueError – Parameters are not found.
- class nnabla_nas.contrib.classification.ofa.networks.ofa_resnet50.SearchNet(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, depth_list=2, expand_ratio_list=0.25, width_mult_list=1.0, weight_init='he_fout', weights=None)[source]¶
Bases:
OFAResNet50
OFAResNet50 Search Net.
This defines the search space of OFA-ResNet50 model.
- Parameters:
num_classes (int) – Number of classes
bn_param (tuple, optional) – BatchNormalization decay rate and eps.
drop_rate (float, optional) – Drop rate used in Dropout in classifier. Defaults to 0.1.
depth_list (int or list of int, optional) – Candidates of depth for each layer. Defaults to 2.
expand_ratio_list (float or list of float, optional) – Candidates of expand ratio for middle bottleneck layers. Defaults to 0.25.
width_mult_list (float or list of float, optional) – Candidates of width multiplication ratio for input/output feature size of bottleneck layers. Defaults to 1.0.
weight_init (str, optional) – Weight initialization method. Defaults to ‘he_fout’.
weights (str, optional) – Path to the weights file. Defaults to None.
- class nnabla_nas.contrib.classification.ofa.networks.ofa_resnet50.TrainNet(num_classes=1000, bn_param=(0.9, 1e-05), drop_rate=0.1, depth_list=None, expand_ratio_list=None, width_mult_list=None, genotype=None, weights=None)[source]¶
Bases:
SearchNet
OFAResNet50 Train Net.
This builds and initialises the OFA-ResNet50 subnet architecture, which is passed in as a genotype list together with the corresponding depth, expand ratio, and width mult candidate lists used to decode the genotypes.
- Parameters:
num_classes (int) – Number of classes
bn_param (tuple, optional) – BatchNormalization decay rate and eps.
drop_rate (float, optional) – Drop rate used in Dropout in classifier. Defaults to 0.1.
depth_list (int or list of int, optional) – Candidates of depth for each layer. Defaults to None.
expand_ratio_list (float or list of float, optional) – Candidates of expand ratio for middle bottleneck layers. Defaults to None.
width_mult_list (float or list of float, optional) – Candidates of width multiplication ratio for input/output feature size of bottleneck layers. Defaults to None.
genotype (list, optional) – A list to operators. Defaults to None.
weights (str, optional) – Path to the weights file. Defaults to None.