7th informal meeting of Big Data @ LIP
Friday 6 July 2018, 11:30
Friday 6 July 2018
11:30 - 12:00
Introduction and general discussion
Nuno Castro (LIP, DF/ECUM), Guilherme Milhano (LIP)
Room: Zoom video-conference only
Please bring up any relevant points you might have
12:00 - 12:40
Journal Club: tuning of hyperparameters in neural networks
Giles Strong (LIP)
Room: Zoom video-conference only
Discussion of "A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay" by Leslie N. Smith, https://arxiv.org/abs/1803.09820

Abstract: Although deep learning has produced dazzling successes for applications of image, speech, and video processing in the past few years, most trainings are with suboptimal hyper-parameters, requiring unnecessarily long training times. Setting the hyper-parameters remains a black art that requires years of experience to acquire. This report proposes several efficient ways to set the hyper-parameters that significantly reduce training time and improves performance. Specifically, this report shows how to examine the training validation/test loss function for subtle clues of underfitting and overfitting and suggests guidelines for moving toward the optimal balance point. Then it discusses how to increase/decrease the learning rate/momentum to speed up training. Our experiments show that it is crucial to balance every manner of regularization for each dataset and architecture. Weight decay is used as a sample regularizer to show how its optimal value is tightly coupled with the learning rates and momentums. Files to help replicate the results reported here are available.
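For orientation before the discussion, the sketch below illustrates a learning-rate range test of the kind discussed in the paper: train briefly while the learning rate grows exponentially, record the loss, and pick a rate somewhat below the point where the loss starts to diverge. This is a minimal illustration assuming PyTorch; the toy model, dataset, and all numerical values are placeholders, not the paper's setup.

# Minimal learning-rate range test sketch (illustrative values only).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy regression data and a small fully connected network.
X = torch.randn(1024, 10)
y = X.sum(dim=1, keepdim=True) + 0.1 * torch.randn(1024, 1)
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.MSELoss()

min_lr, max_lr, num_steps, batch_size = 1e-5, 1.0, 100, 64
optimizer = torch.optim.SGD(model.parameters(), lr=min_lr, momentum=0.9)
# Multiplicative factor that sweeps the learning rate from min_lr to max_lr.
gamma = (max_lr / min_lr) ** (1.0 / (num_steps - 1))

lrs, losses = [], []
for step in range(num_steps):
    idx = torch.randint(0, X.size(0), (batch_size,))
    optimizer.zero_grad()
    loss = loss_fn(model(X[idx]), y[idx])
    loss.backward()
    optimizer.step()

    lrs.append(optimizer.param_groups[0]["lr"])
    losses.append(loss.item())
    # Increase the learning rate for the next mini-batch.
    for group in optimizer.param_groups:
        group["lr"] *= gamma

# Simple heuristic: look at where the loss is lowest before it blows up
# and choose a learning rate roughly an order of magnitude below that.
best = min(range(len(losses)), key=lambda i: losses[i])
print(f"Loss minimum at lr={lrs[best]:.2e}; a reasonable choice is ~{lrs[best] / 10:.2e}")

In practice one would plot loss against learning rate and inspect the curve rather than rely on a single heuristic, and the same kind of sweep can be repeated for momentum, batch size, and weight decay, as the paper discusses.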