An experimental study on hyper parameters for training deep convolutional networks
Citation: Temiz, H. (2020, October). An Experimental Study on Hyper Parameters for Training Deep Convolutional Networks. In 2020 4th International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT) (pp. 1-8). IEEE.
When training deep networks, obtaining a network with optimum performance requires trying different values of many hyperparameters and combinations of those values. In theory, the set of values that maximizes network performance can be found by testing numerous values for each parameter. However, it is not feasible to try every combination. A priori information about how much each hyperparameter and its values contribute to network performance narrows the search space and lets researchers reach a well-performing network more quickly. This study investigates such a priori information to guide the search for the most important hyperparameters and their ideal values for a typical convolutional neural network in single image super-resolution. To this end, the importance levels of the 5 most commonly used training hyperparameters and their optimum values were examined. Each hyperparameter was given two different values that are widely used or known from previous work in the literature to give good results, for a total of 32 training procedures. The results showed that the learning rate has the largest effect on network performance, followed by normalization and then the size of the input image given to the model during training. The batch number and step count values were found not to change network performance significantly. These results could help researchers choose training parameters and their values in order to obtain optimum network performance efficiently and rapidly.
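The count of 32 training procedures follows from trying two values for each of five hyperparameters (2^5 = 32). The sketch below enumerates such a full two-level grid and shows one simple way to rank hyperparameters, a main-effect estimate (difference in mean score between the two levels). The values in `GRID`, the key names, and the `main_effect` helper are hypothetical placeholders: the abstract does not state the actual values tried or how importance was measured.

```python
from itertools import product

# Hypothetical placeholder levels; the abstract does not list the real values.
GRID = {
    "learning_rate": [1e-3, 1e-4],
    "normalization": ["none", "minmax"],
    "input_size": [32, 64],
    "batch_number": [16, 64],
    "step_count": [100, 200],
}


def enumerate_runs(grid):
    """Return one config dict per combination of levels (full factorial)."""
    names = list(grid)
    return [dict(zip(names, values))
            for values in product(*(grid[n] for n in names))]


def main_effect(runs, scores, name, grid):
    """Mean score at the second level minus mean score at the first level.

    This is an assumed analysis for illustration, not the paper's method.
    """
    first, second = grid[name]
    at_first = [s for r, s in zip(runs, scores) if r[name] == first]
    at_second = [s for r, s in zip(runs, scores) if r[name] == second]
    return sum(at_second) / len(at_second) - sum(at_first) / len(at_first)


runs = enumerate_runs(GRID)
print(len(runs))  # 32 configurations, one per training procedure
```

Given one measured score per run (for example a validation metric), calling `main_effect(runs, scores, name, GRID)` for each name and sorting by absolute value gives a rough importance ordering. Each level appears in 16 of the 32 runs, so the two means are computed over equally sized groups.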