7th informal meeting of Big Data @ LIP

Zoom video-conference only


The meeting will start at 11h30 PT (GMT+0). We will use Zoom for the videoconference:


Zoom works on most common operating systems, including mobile devices, and should be fairly straightforward to use.

Just connect to the above link (or enter the meeting code, 257850266, directly in the app) and follow the instructions. Further information on Zoom can be found at: https://support.zoom.us/hc/en-us


Supported by project BigDataHEP, PTDC/FIS-PAR/29147/2017, POCI/01-0145-FEDER-029147 (FCT, Portugal 2020, Compete 2020, Lisboa 2020, Norte 2020, EU, FEDER)

    • 11:30 - 12:00
      Introduction and general discussion (30m)
      Please bring up any relevant points you might have
      Speakers: Guilherme Milhano (LIP), Nuno Castro (LIP, DF/ECUM)
    • 12:00 - 12:40
      Journal Club: tuning of hyperparameters in neural networks (40m)

      Discussion of https://arxiv.org/abs/1803.09820

      A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay

      by Leslie N. Smith

      Although deep learning has produced dazzling successes for applications of image, speech, and video processing in the past few years, most trainings are with suboptimal hyper-parameters, requiring unnecessarily long training times. Setting the hyper-parameters remains a black art that requires years of experience to acquire. This report proposes several efficient ways to set the hyper-parameters that significantly reduce training time and improves performance. Specifically, this report shows how to examine the training validation/test loss function for subtle clues of underfitting and overfitting and suggests guidelines for moving toward the optimal balance point. Then it discusses how to increase/decrease the learning rate/momentum to speed up training. Our experiments show that it is crucial to balance every manner of regularization for each dataset and architecture. Weight decay is used as a sample regularizer to show how its optimal value is tightly coupled with the learning rates and momentums. Files to help replicate the results reported here are available.

      Speaker: Giles Strong (LIP)

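      Docker instructions:

      # Download the demo image
      docker pull gilesstrong/smith_hyperparams1_demo

      # Start the container in the background, exposing the notebook server on port 8888
      docker run -d -p 8888:8888 --name=smith gilesstrong/smith_hyperparams1_demo

      # List the running notebook servers inside the container to get the access URL
      docker exec smith jupyter notebook list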