Deep Learning (DL) has become very popular in recent years. Although the basic techniques have been known for decades, the exponential growth of available computational power and training data has enabled major breakthroughs in applying nature-inspired computing with deep artificial neural networks (DNNs) in many domains where this was not previously possible. This, in turn, has renewed interest in DNNs in both academia and industry. A strong mathematical background may be necessary to truly understand how neural networks are trained behind the scenes, but with just basic machine learning (ML) knowledge and some programming skills one can already start learning about DL and applying DNNs to real-world problems.

In this seminar, students learn about DL "by doing": they train a DNN on a real-world dataset from the domains of the automotive industry and Industry 4.0. Since not everyone can afford the hardware needed to train DL models, participants will be given access to some of our computers at DFKI.
- Offered by: Chair of Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster
- Lecturers: Michael Feld, Matthias Klusch, Guillermo Reyes
- Time: Thursdays, 12-14h (c.t.)
- Kick-off: Thursday, 18th October 2018
- Location: Building E1 3, Seminar Room 016
- Credit Points: 7 ECTS. This seminar will not be offered as a proseminar.
- The final presentations will take place on 07.02.19 from 12 to 3 pm. The final report is due one week later, on 14.02.19.
- Published group assignments.
- Introduction slides can be found here
- Good English skills (the literature will be in English)
- Good programming skills (the practical part will involve programming)
- Basic ML knowledge (to understand the papers and common concepts)
This seminar covers theoretical and practical aspects of DL. Attendees will form teams of 2-3 members and complete the following tasks as a team:
- Read and present an assigned scientific paper (45 min + 15 min discussion)
- Implement and train a model to solve a certain task
- Present your work to the rest of the class (20 min + 10 min discussion)
- Write a final report on the work done
- 25% - Presentation of the assigned scientific paper
- 15% - Active participation in the discussion of presented topics
- 20% - Final presentation
- 40% - Final report
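To see how the percentages above combine, here is a small sketch of a weighted score. The component names and the 0-100 scale are assumptions made for the example; this page does not state which scale is used for the individual parts.

```python
# Weights copied from the grading list above; the dictionary keys are
# illustrative names, not official terms.
WEIGHTS = {
    "paper_presentation": 0.25,
    "participation": 0.15,
    "final_presentation": 0.20,
    "final_report": 0.40,
}


def weighted_score(scores):
    """Combine per-component scores (all on the same scale) into one score."""
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())


# Hypothetical scores on an assumed 0-100 scale.
example = {
    "paper_presentation": 80,
    "participation": 100,
    "final_presentation": 70,
    "final_report": 90,
}
overall = weighted_score(example)
```

With these hypothetical scores the result is 85 (up to floating-point rounding): 20 + 15 + 14 + 36.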
Registration for the seminar will take place during the first week. Please register by sending an email to Michael Feld with the subject "[UPLINX] Seminar Registration WS18/19" and the following information:
- Full Name
- Matriculation Number
- Team Members (if you already have them)
- Topics of Interest (in order of interest)
This registration is NOT final. The final teams will be announced on the website before the second session. Priority will be given to registrations from already formed teams. Attendance at the kick-off meeting is mandatory for a valid registration.
1) Image Classification
Image Classification is the computer vision task of matching a label to an input image from a set of possible labels.
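As a rough illustration of what "matching a label" means in practice, the sketch below turns one score per label into probabilities with a softmax and picks the most likely label. The label names and score values are made up for the example; in a real model the scores would come from the final layer of a trained network.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, labels):
    """Return the label with the highest probability and that probability."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]


# Hypothetical labels and network outputs for a single image.
labels = ["car", "pedestrian", "traffic sign"]
label, prob = classify([0.5, 2.0, -1.0], labels)
```

Here `label` is "pedestrian", because that entry has the largest score.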
2) Object Detection
In addition to assigning a label to objects in an image, Object Detection deals with the localization of these objects within the image by "drawing" a bounding box around them.
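The following sketch shows one possible way to represent a detection and "draw" its box on a tiny character canvas. The `Detection` class, the `draw_box` helper, and the inclusive `(x_min, y_min, x_max, y_max)` pixel convention are assumptions chosen for illustration; datasets and libraries use various conventions, and the box is assumed to lie fully inside the image.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected object: a class label plus an axis-aligned box."""
    label: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int


def draw_box(height, width, det):
    """Return the image as text rows, with '#' marking the box outline."""
    canvas = [["." for _ in range(width)] for _ in range(height)]
    for y in range(det.y_min, det.y_max + 1):
        for x in range(det.x_min, det.x_max + 1):
            on_border = y in (det.y_min, det.y_max) or x in (det.x_min, det.x_max)
            if on_border:
                canvas[y][x] = "#"
    return ["".join(row) for row in canvas]


# Hypothetical detection on a 5x6 image.
rows = draw_box(5, 6, Detection("car", x_min=1, y_min=1, x_max=4, y_max=3))
```

A detector typically outputs several such detections per image, each with its own label and box.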
3) Semantic Segmentation
Like Object Detection, Semantic Segmentation tries to classify and localize objects in input images, but instead of using bounding boxes, the localization is done pixel-wise by producing a mask of the same size as the original image, where each pixel represents the class of the object that contains it.
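A minimal sketch of the mask described above: given one score map per class, each pixel of the mask receives the index of the class with the highest score at that position. The `class_scores[c][y][x]` layout and the tiny 2x2 example are assumptions for illustration; real networks produce such score maps as tensors.

```python
def segmentation_mask(class_scores):
    """Build a mask with the image's height and width.

    class_scores[c][y][x] is the score of class c at pixel (y, x);
    mask[y][x] is the index of the best-scoring class at that pixel.
    """
    num_classes = len(class_scores)
    height = len(class_scores[0])
    width = len(class_scores[0][0])
    return [
        [
            max(range(num_classes), key=lambda c: class_scores[c][y][x])
            for x in range(width)
        ]
        for y in range(height)
    ]


# Hypothetical scores for 2 classes (0 = background, 1 = road) on a 2x2 image.
scores = [
    [[0.9, 0.2], [0.1, 0.4]],  # class 0
    [[0.1, 0.8], [0.9, 0.6]],  # class 1
]
mask = segmentation_mask(scores)
```

The resulting `mask` is `[[0, 1], [1, 1]]`, the same size as the input image.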
4) Gesture Recognition
Similarly to Image Classification, Gesture Recognition is the task of detecting when a gesture is performed and classifying what kind of gesture it is. The main difference here is that, while Image Classification deals with static images, Gesture Recognition takes a sequence of frames of arbitrary length as input.
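To illustrate the arbitrary-length input, the toy loop below consumes frames one at a time and keeps a fixed-size state, which is the basic idea behind recurrent models. The update rule and its constants are invented for the example and are not trained weights or the architecture of any assigned paper; each frame is assumed to be a flat list of feature values.

```python
import math


def summarize_sequence(frames, state_size=4):
    """Fold a sequence of any length into a fixed-size state vector."""
    state = [0.0] * state_size
    for frame in frames:
        frame_mean = sum(frame) / len(frame)
        # Toy recurrent update: mix the previous state with the new frame.
        state = [
            math.tanh(0.5 * s + 0.1 * (i + 1) * frame_mean)
            for i, s in enumerate(state)
        ]
    return state


# Two hypothetical gesture clips of different lengths.
short_clip = [[0.1, 0.2, 0.3]] * 3
long_clip = [[0.1, 0.2, 0.3]] * 10
short_state = summarize_sequence(short_clip)
long_state = summarize_sequence(long_clip)
```

Both clips yield a state of the same size, so a classifier placed on top of the state does not need to know how many frames were seen.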
In case you don't have access to your paper, please contact your supervisor.
|29.11.2018||Mehboob, Alam, Kadir||Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in neural information processing systems (NIPS). pp. 1097-1105. [pdf]||Guillermo Reyes|
|06.12.2018||Schlinkmann, Papakerashvili, Iqbal||Long, J., Shelhamer, E., & Darrell, T. (2015). Fully Convolutional Networks for Semantic Segmentation. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 3431-3440. [pdf]||Matthias Klusch|
|-||-||Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation. Medical Image Computing and Computer-Assisted Intervention (MICCAI). pp. 234-241 [pdf]||-|
|13.12.2018||Prange, Nimer, Goodarzi||Dai, J., Li, Y., He, K., & Sun, J. (2016). R-FCN: Object detection via region-based fully convolutional networks. In Advances in neural information processing systems (NIPS). pp. 379-387 [pdf]||Matthias Klusch|
|-||-||Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You Only Look Once: Unified, Real-Time Object Detection. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 779-788 [pdf]||-|
|20.12.2018||Paulus, Paulus, Gomaa||Tsironi, E., Barros, P.V., & Wermter, S. (2016). Gesture Recognition with a Convolutional Long Short-Term Memory Recurrent Neural Network. Proceedings of the Twenty-Fourth European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), pp. 213-218 [pdf]||Guillermo Reyes|
Each group must implement and train a DNN on an assigned dataset. In addition, all teams should experiment further, e.g. optimize hyperparameters and architecture, study the effect of noise in the data and of data augmentation techniques, compare transfer learning and incrementally trained models with models trained once from scratch, etc.
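As a starting point for the noise and data augmentation experiments, the sketch below shows two simple transformations on a grayscale image stored as nested lists with pixel values in [0, 1]. The function names, the value range, and the choice of a horizontal flip and Gaussian noise are assumptions for illustration; which transformations make sense depends on the dataset and task.

```python
import random


def horizontal_flip(image):
    """Mirror the image left to right (a common augmentation)."""
    return [list(reversed(row)) for row in image]


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise and clip the result back into [0, 1]."""
    return [
        [min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
        for row in image
    ]


# Hypothetical 2x3 grayscale image.
image = [[0.0, 0.5, 1.0], [0.2, 0.4, 0.6]]
flipped = horizontal_flip(image)
# A seeded generator keeps the experiment reproducible.
noisy = add_gaussian_noise(image, sigma=0.1, rng=random.Random(0))
```

Training the same model with and without such transformations, or with increasing `sigma`, is one simple way to measure their effect.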
- Berkeley DeepDrive
- Kaggle State Farm Distracted Driver Detection
- LISA Traffic Signs
- LISA Hand Gestures
Ian Goodfellow, Yoshua Bengio & Aaron Courville (2016). Deep Learning. MIT Press, http://www.deeplearningbook.