An experimental study on hyper parameters for training deep convolutional networks
Citation
Temiz, H. (2020, October). An Experimental Study on Hyper Parameters for Training Deep Convolutional Networks. In 2020 4th International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT) (pp. 1-8). IEEE.
Abstract
When training deep networks, it is crucial to
obtain a network that offers optimum performance by trying
different values of many hyper parameters and combinations of
these values. Theoretically, the optimal set of values, which
ensures the maximum performance of the network, can be found
by trying numerous different values for these parameters. However,
it is not feasible to try all combinations of values. A priori
information about how strongly each hyper parameter and its
values contribute to the performance of the network narrows the
search space and lets researchers obtain the network with
optimum performance easily and quickly. This study investigates
such a priori information to guide the search for the most
important hyper parameters, and their ideal values, that ensure
optimum performance of a typical convolutional neural network
in single image super resolution.
For this purpose, the importance levels of the five most commonly
used hyper parameters in training, and their optimum values,
were investigated. Each hyper parameter was given two values
that are widely used or known from previous works in the
literature to give good results, so that 32 different training
procedures (2^5 combinations) were performed in total. The results showed that the
learning rate has the largest effect on the performance of the
network, followed by normalization and then the size of the input
image given to the model during training. The batch number and
step count values were found not to change the performance of
the network significantly.
The results obtained from this study could help researchers
choose training parameters and their values so as to obtain
optimum network performance efficiently and rapidly.
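
The experimental design described in the abstract, two values for each of five hyper parameters, amounts to a full factorial grid of 2^5 = 32 training runs. The following is a minimal sketch of how such a grid could be enumerated. The dictionary keys follow the hyper parameters named in the abstract, but the candidate values and the helper name full_factorial are hypothetical placeholders, since the abstract does not list the values that were actually tested.

```python
from itertools import product

# Hypothetical candidate values: the abstract names the five hyper parameters
# but does not state which two values were tried for each of them.
GRID = {
    "learning_rate": [1e-3, 1e-4],
    "normalization": [True, False],
    "input_size": [32, 64],
    "batch_number": [16, 64],
    "step_count": [100, 200],
}


def full_factorial(grid):
    """Return every combination of the candidate values as a list of dicts."""
    names = list(grid)
    return [
        dict(zip(names, combo))
        for combo in product(*(grid[name] for name in names))
    ]


configs = full_factorial(GRID)
print(len(configs))  # 2 ** 5 = 32 training runs
```

Each entry of configs is one training configuration. Adding a third value to any hyper parameter would raise the count to 48, which illustrates why prior knowledge of the most influential parameters helps narrow the search.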