Diversis has developed the “Deep Learning Optimization” software framework, which allows users to achieve the highest performance for a given data set without constraining the design of the artificial neural network model. Because the framework requires no changes to the user's preferred artificial neural network model, it improves performance at no additional cost.
Deep learning is a subfield of machine learning that uses large multi-layer artificial neural networks (referred to as networks henceforth) as the main feature extractor and inference engine. What differentiates deep learning from earlier applications of multi-layer networks is the exceptionally large number of layers in the applied network architectures.
Deep learning based solutions consist of three main development phases: model design or selection, model training, and inference. In the model design or selection phase, an artificial neural network architecture is either designed from scratch or selected from a set of proven architectures that have been applied to a similar problem domain and are known to perform well. Once a model architecture is decided, problem-specific data is used to train this model to perform well on the problem at hand. Finally, in the inference phase, the trained network is applied to the problem at hand.
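The three phases above can be sketched in code. This is an illustrative toy example only, with hypothetical helper names (`design_model`, `train`, `infer`) chosen for this sketch; it fits a one-parameter linear model rather than a deep network, but the design/train/infer structure is the same.

```python
def design_model():
    """Phase 1: design or select an architecture (here, a 1-parameter linear model)."""
    return {"w": 0.0}

def train(model, data, epochs=100, lr=0.1):
    """Phase 2: fit the model to problem-specific data with gradient descent."""
    for _ in range(epochs):
        for x, y in data:
            pred = model["w"] * x
            model["w"] -= lr * 2 * (pred - y) * x  # gradient of squared error
    return model

def infer(model, x):
    """Phase 3: apply the trained model to a new input."""
    return model["w"] * x

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # toy data following y = 2x
model = train(design_model(), data)
print(infer(model, 4.0))  # close to 8.0, since the learned weight approaches 2
```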
The training phase at its core consists of two stages: loss function design or selection, and network parameter optimization. Loss function design or selection mathematically defines what the chosen network is expected to do well. Once a loss function is decided, the network parameter optimization stage uses an optimization technique (such as gradient descent with backpropagation) to obtain a set of network parameters that minimize the chosen loss function on the available problem-specific data.
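A minimal sketch of these two stages, assuming a toy linear model and synthetic data (not the framework's actual code): a mean-squared-error loss is chosen, then plain gradient descent optimizes the network parameters to minimize it.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))           # problem-specific data
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w                         # targets

# Stage 1: loss selection -- mean squared error defines "doing well".
def loss(w):
    return float(np.mean((X @ w - y) ** 2))

# Stage 2: parameter optimization -- gradient descent on the chosen loss.
w = np.zeros(3)
lr = 0.1
for _ in range(200):
    grad = (2 / len(X)) * X.T @ (X @ w - y)  # gradient of the MSE loss
    w -= lr * grad

print(loss(w))  # loss is driven close to zero; w approaches true_w
```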
Putting it in its simplest form, we have invented a new way of training deep networks that improves inference accuracy on test data. This new training approach is applicable to various types of deep neural network architectures. The proposed approach is NOT:
- A new loss function such as Hinge loss
- A new optimization technique such as Adam optimizer
- A new data augmentation technique such as affine image warps, adding noise or GAN based data creation
- A network structure modification such as the residual blocks used in ResNet or random dropout for regularization.
- A way of network coefficient modification such as increasing/decreasing coefficient bit-depths or zeroing out selected network parameters.
To use the Deep Optimizer Framework, users do not need to change anything in their network design. They design their own network architecture without any structural or optimization limitations, and simply activate the Deep Optimizer Framework to obtain better test accuracies for their deep learning applications. Any regularizer and any loss function can be used. In fact, the Deep Optimizer Framework is invisible to the user; it only changes the training mechanism to improve test accuracy.
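The drop-in nature described above can be illustrated with a hypothetical wrapper. The framework's real API is not published in this document, so every name here (`deep_optimizer`, `user_training_step`) is a placeholder invented for this sketch; the point is only that the user's training step is wrapped, not modified.

```python
def user_training_step(weights, grads, lr=0.01):
    """The user's own update rule; left entirely unchanged by the framework."""
    return [w - lr * g for w, g in zip(weights, grads)]

def deep_optimizer(train_step):
    """Stand-in for 'activating' the framework: it wraps the user's training
    step without touching the model, loss, or regularizers. The real framework
    would alter the training mechanism inside this wrapper; here it is a
    pass-through placeholder."""
    def wrapped(weights, grads, **kwargs):
        return train_step(weights, grads, **kwargs)
    return wrapped

step = deep_optimizer(user_training_step)
print(step([1.0, 2.0], [0.5, -0.5]))  # same result as the unwrapped step
```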
We have tested the proposed Deep Optimizer Framework with six different deep neural network architectures, including Convolutional Neural Networks (CNNs) and Fully Connected Networks (FCNs). The details of the network architectures can be seen at the following link.
The results obtained for these network structures are as follows.
Test accuracy results (%):

| Net Type | Classical | DeepOptimizer | Difference (%) |
|----------|-----------|---------------|----------------|
| 1        | 71.8850   | 74.9800       | 3.0950         |
| 2        | 81.5805   | 84.7957       | 3.2152         |
| 3        | 75.0401   | 75.7612       | 0.7211         |
| 4        | 10.0160   | 83.2933       | 73.2773        |
| 5        | 55.3986   | 56.1799       | 0.7813         |
| 6        | 51.8530   | 53.6158       | 1.7628         |
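As a quick arithmetic check, the Difference column can be reproduced from the two accuracy columns (values copied directly from the table above):

```python
# Classical vs. DeepOptimizer test accuracies (%), in net-type order 1-6.
classical = [71.8850, 81.5805, 75.0401, 10.0160, 55.3986, 51.8530]
deep_opt  = [74.9800, 84.7957, 75.7612, 83.2933, 56.1799, 53.6158]

diffs = [round(d - c, 4) for d, c in zip(deep_opt, classical)]
print(diffs)  # matches the Difference column of the table
```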
The testing code is available on GitHub.