Introduction

The success of Deep Neural Networks (DNNs) is largely due to their ability to automate the feature engineering process. This success has been demonstrated in many tasks, including image recognition, speech recognition, and machine translation. The choice of network architecture is a particularly important step when designing DNN-based machine learning algorithms. A network architecture describes which layers are used in a DNN, how each layer is parametrized, and how the layers are connected. Commonly known classes of network architectures are, for example, feed-forward DNNs, recursive DNNs, ResNets, Inception networks, or MobileNets.

By tailoring the DNN architecture specifically to a given task, we can further increase the performance of deep learning models [Elsken2018]. However, most neural architectures are designed manually. This is time-consuming, expensive, and does not scale with an increasing number of new domains and learning tasks. A promising direction in automating machine learning is automating architecture engineering, so-called neural network architecture search (NAS). Neural network architecture search is closely related to hyperparameter optimization and is a subfield of automated machine learning (AutoML). NNablaNAS is a framework for architecture search in the computer vision domain. Its main aim is to provide a modular, easy-to-use, and extensible toolbox for deep learning practitioners. This section gives an overview of neural architecture search.

NAS Algorithms

NAS is a combinatorial and therefore computationally hard optimization problem. A variety of different NAS algorithms have been proposed in the past. To name a few, there are:

  • Reinforcement learning-based NAS algorithms. The seminal paper on NAS proposed such a reinforcement learning approach. Reinforcement learning-based algorithms use an actor that generates neural architectures. To this end, the actor follows a policy, which is optimized such that the validation accuracy of the generated neural architectures is maximized.

  • Stochastic NAS algorithms, which randomly generate neural architectures and keep the best architecture found so far.

  • Evolutionary Algorithm (EA) based NAS algorithms. An EA uses mechanisms inspired by biological evolution, such as reproduction, mutation, recombination, and selection. Candidate solutions to the optimization problem play the role of individuals in a population, and a fitness function determines the quality of the solutions. The population then evolves through the repeated application of reproduction, mutation, recombination, and selection.

  • Bayesian Optimization (BO) based algorithms. BO is a powerful method for optimizing non-differentiable black-box functions that are expensive to evaluate. In the case of NAS, this function is the validation accuracy of a network architecture. It is expensive to evaluate because computing it requires training a DNN until convergence.

  • Differentiable NAS algorithms like DARTS, which relax the optimization problem such that it becomes differentiable and can be solved with gradient-based optimization algorithms (the relaxation is sketched after this list).

  • Proxyless NAS (PNAS), which uses a ping-pong optimization scheme that alternates between gradient-descent-based model parameter updates that minimize the training error and REINFORCE-based architecture parameter updates that maximize the expected score of the sampled architectures (a toy sketch of this alternating scheme follows the list), i.e.,

    \[\begin{split}\max_{\alpha} &\quad \mathbb{E}_{z \sim p_{\alpha}(z)} \big[\text{score}(z, \Phi^{*})\big] \\ \text{s.t.} & \quad \Phi^{*} = \underset{\Phi}{\arg \min} \quad \text{loss}(z, \Phi)\end{split}\]
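
A minimal, self-contained toy illustration of this ping-pong scheme is given below (plain NumPy; the task, the candidate operations, and all names are invented for illustration and are not part of NNablaNAS). The architecture parameters alpha define a categorical distribution over three candidate feature transforms, the model parameters phi are one linear weight per candidate, and each iteration alternates a gradient step on the training loss with a REINFORCE step on the validation score:

    # Toy example (not NNablaNAS code): alternating "ping-pong" updates of
    # model parameters (training loss) and architecture parameters
    # (REINFORCE estimate of the validation score).
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy task: fit y = x^2. The candidate "operations" are feature maps.
    x_train, x_val = rng.uniform(-1, 1, 256), rng.uniform(-1, 1, 64)
    y_train, y_val = x_train ** 2 + 0.01 * rng.normal(size=256), x_val ** 2
    candidates = [lambda x: x, lambda x: x ** 2, lambda x: np.abs(x)]

    alpha = np.zeros(len(candidates))  # architecture parameters (logits)
    phi = np.zeros(len(candidates))    # one model weight per candidate op

    def softmax(a):
        e = np.exp(a - a.max())
        return e / e.sum()

    for step in range(2000):
        p = softmax(alpha)
        z = rng.choice(len(candidates), p=p)      # sample an architecture
        feat = candidates[z](x_train)

        # "Ping": gradient step on the training loss w.r.t. phi[z].
        err = phi[z] * feat - y_train
        phi[z] -= 0.05 * np.mean(2.0 * err * feat)

        # "Pong": REINFORCE step on the validation score w.r.t. alpha.
        val_feat = candidates[z](x_val)
        score = -np.mean((phi[z] * val_feat - y_val) ** 2)  # higher is better
        grad_log_p = -p
        grad_log_p[z] += 1.0
        alpha += 0.1 * score * grad_log_p

    # With enough steps, alpha typically concentrates on the squaring feature.
    print("selected candidate:", int(np.argmax(alpha)))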

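The core idea behind differentiable approaches such as DARTS is a continuous relaxation of the architecture choice [liu2018]: instead of committing to a single operation o from a candidate set O, every edge of the network computes a softmax-weighted mixture of all candidate operations,

    \[\bar{o}(x) = \sum_{o \in \mathcal{O}} \frac{\exp(\alpha_{o})}{\sum_{o' \in \mathcal{O}} \exp(\alpha_{o'})} \, o(x),\]

where each candidate operation o has its own continuous architecture parameter \(\alpha_{o}\). The mixed output is differentiable with respect to the architecture parameters, so they can be optimized with the same gradient-based methods as the model weights.
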
NNablaNAS implements the DARTS and PNAS algorithms. Both report good performance on multiple datasets. For a detailed description of the algorithms, we refer to the section NAS Algorithms or to the original papers [liu2018] and [Cai2018].
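
To make the relaxation above concrete, the following sketch builds a softmax-weighted mixed operation with plain NNabla functions. It only illustrates the mechanism; it does not use the nnabla_nas module API, and the candidate operation set and the parameter name 'alpha' are chosen arbitrarily:

    import numpy as np
    import nnabla as nn
    import nnabla.functions as F
    import nnabla.parametric_functions as PF

    x = nn.Variable((8, 16, 32, 32))  # (batch, channels, height, width)

    # Candidate operations for one "edge" of the search space.
    candidates = [
        PF.convolution(x, 16, (3, 3), pad=(1, 1), name='conv3x3'),
        PF.convolution(x, 16, (5, 5), pad=(2, 2), name='conv5x5'),
        F.max_pooling(x, (3, 3), stride=(1, 1), pad=(1, 1)),
    ]

    # One architecture parameter (logit) per candidate operation.
    alpha = nn.parameter.get_parameter_or_create(
        'alpha', (len(candidates),),
        initializer=np.zeros(len(candidates), dtype=np.float32))
    weights = F.softmax(alpha)  # softmax over the candidate operations

    # Relaxed "mixed" operation: a weighted sum of all candidate outputs.
    stacked = F.stack(*candidates, axis=0)  # (n_ops, batch, c, h, w)
    w = F.reshape(weights, (len(candidates), 1, 1, 1, 1))
    mixed = F.sum(F.broadcast(w, stacked.shape) * stacked, axis=0)

    # Any loss built on `mixed` propagates gradients to both the operation
    # weights and the architecture parameters `alpha`.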

Code structure

The core source code is located in the nnabla_nas folder. See below for a high-level overview of the repository.

  • contrib: Search spaces and neural architectures are defined in this folder.

  • dataset: Dataset-related code is implemented in this folder. NNablaNAS uses a dataloader to feed data into the model.

  • module: The most basic modules used to define search spaces and to construct neural networks.

  • optimizer: Simple optimizers to update the model parameters of the neural networks as well as the architecture parameters (a sketch of this two-group update pattern follows this list).

  • runner: Search and retraining algorithms are defined in this folder. Any new architecture search algorithm should follow the same API.

  • utils: Utility functions related to logging, visualization, and profiling.
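
As an example of how the optimizer bullet above plays out in practice, architecture search keeps two disjoint parameter groups, the network weights and the architecture parameters, and updates them with separate optimizers on different batches. Below is a minimal sketch of this pattern with plain NNabla solvers; it is not the nnabla_nas optimizer API, and the toy graph and the parameter name 'arch_alpha' are invented for illustration:

    import numpy as np
    import nnabla as nn
    import nnabla.functions as F
    import nnabla.parametric_functions as PF
    import nnabla.solvers as S

    nn.clear_parameters()
    x, t = nn.Variable((8, 3, 32, 32)), nn.Variable((8, 1))

    # A toy graph with ordinary model weights and one architecture
    # parameter called 'arch_alpha' (a soft gate on the feature maps).
    h = F.relu(PF.convolution(x, 8, (3, 3), pad=(1, 1), name='conv'))
    arch_alpha = nn.parameter.get_parameter_or_create(
        'arch_alpha', (1,), initializer=np.zeros(1, dtype=np.float32))
    h = h * F.broadcast(F.reshape(F.sigmoid(arch_alpha), (1, 1, 1, 1)), h.shape)
    loss = F.mean(F.squared_error(PF.affine(h, 1, name='fc'), t))

    # Split the parameters into two groups and give each its own solver.
    params = nn.get_parameters()
    arch_params = {k: v for k, v in params.items() if 'arch_alpha' in k}
    model_params = {k: v for k, v in params.items() if 'arch_alpha' not in k}
    model_solver = S.Momentum(lr=0.025, momentum=0.9)
    arch_solver = S.Adam(alpha=3e-4)
    model_solver.set_parameters(model_params)
    arch_solver.set_parameters(arch_params)

    def step(solver):
        # Feed a random toy batch, compute gradients, update one group.
        x.d = np.random.randn(8, 3, 32, 32)
        t.d = np.random.randn(8, 1)
        model_solver.zero_grad()
        arch_solver.zero_grad()
        loss.forward()
        loss.backward()
        solver.update()

    step(model_solver)  # update the network weights on one batch
    step(arch_solver)   # update the architecture parameters on another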

[Image: high_level_API.png]

Fig. 2. A high-level API of the NNablaNAS framework.