The aim of this practical seminar is to carry out a complete project pipeline, from the problem statement to finding solutions, using machine learning methods on our Deep Learning cluster. The topics are proposed by different groups of the Mathematics and Computer Science departments and cover, for example, computational linguistics, bioinformatics, computer vision, computer graphics, language processing, and optimization. Each topic will be supervised by the group that proposed it. Up to three students will work on one topic.

The project seminar consists of three parts. In the first part, the student group working on a topic will get acquainted with the topic and the input data, and will research potential solutions for the given problem. The second part is the implementation and testing of solutions for the problem. In the third part, at the end of the course, the groups will present their topics and solutions in a seminar. Successful completion of the seminar is rewarded with 7 CP for Bioinformatics master students. Data Science & Artificial Intelligence bachelor students receive 9 CP.

News: Final topics have been assigned.


Organizers:

Prof. Dr. Andreas Keller, Georges P. Schmartz

Please see below for the individual project supervisors/tutors.


Dates:

Registration *HERE from April 12 to April 19, 2021
Kick-off meeting: April 20, 2021, 09:00-10:00, Online conference via Microsoft Teams with invitation to all registered participants
Deadline to register in HISPOS OR de-register from seminar *HERE until May 17, 2021
Lecture dates:
  General introduction: April 21, 2021, 9:00-10:00, Online conference via Microsoft Teams with invitation to all registered participants
  Technical introduction: May 5, 2021, 9:00-10:00, Online conference via Microsoft Teams with invitation to confirmed participants
Intermediate Progress Meeting: May 26, 2021, 10:00, Online conference via Microsoft Teams with invitation to confirmed participants
Deadline for handing in results of implementation: July 12, 2021
Presentations: July 14, 2021, 10:00, Online conference via Microsoft Teams with invitation to confirmed participants
Final report deadline: July 19, 2021

* If you want to deregister from the seminar, please send the tutor an email, regardless of whether you (de)registered in HISPOS or not.

Requirements for participation:

  • At least one passed course from: Machine Learning, Elements of Statistical Learning, Elements of Machine Learning, or Neural Networks: Implementation and Applications

Certificate requirements:

  • Successful presentation:
    • Talk: 30 minutes
    • Questions from the tutors/audience after the presentation
  • Taking minutes during the practical part to make clear which student worked on which part of the project.
  • Handing in a final report after the presentation along with the minutes of the practical part.
  • DSAI students only:
    • Successful completion of an additional project-specific task

Final grade:

  • Based on the given presentation (see “Certificate requirements”)
  • May be influenced by the submitted report and handling of the practical part

Topics

Each entry lists supervisor, topic title, students (if listed), and description.

Supervisor: Dr. Anne Hecksteden
Topic title: The Athlete Performance Passport
Students: Urs, Nora, Frederik
Description: AI-led targeting of anti-doping tests: Detecting doping rule violations is essential for clean sports. With limited testing resources, targeting anti-doping tests is important. As the main aim of doping is an increase in performance, an initial suspicion can be gained (in agreement with common sense) from striking performance developments. Altered competition schemes (few but top performances) or background data may also be informative. The main aim of this project is to identify conspicuous patterns in publicly available athlete performance data. A dataset from swimming (allowing for supervised learning approaches) is available.

Supervisor: Dr. Cosima Zemlin
Topic title: Lung metastasis or not?
Description: This question is essential for patients with an initial diagnosis of mammary carcinoma, since lung metastases strongly influence survival. Current guidelines require a thoracic computed tomography (CT) scan during staging of a newly diagnosed mammary carcinoma. However, a thoracic CT consists of numerous images, and it is sometimes difficult to differentiate between a metastasis and non-specific lesions that can arise from previous pneumonia, tuberculosis, etc. In this project we want to use annotated thoracic CTs from patients of the department for gynecology at the UKS to identify characteristics of lung metastases in thoracic CTs.

Supervisor: Jenifer Barrirero
Topic title: Three-dimensional segmentation of near-atomically resolved tomographies
Description: Atom probe tomography is a characterization method in materials science used to get three-dimensional reconstructions with near-atomic resolution. Datasets can be described as three-dimensional clouds of millions of points corresponding to atoms in a material. Analyzing the segregation of atoms inside these datasets makes it possible to understand and optimize mechanical properties. Automatic segmentation to distinguish dissolved and segregated atoms, based on iso-concentration surfaces, reaches its limit when only a few atoms of a species are present. In those cases, a manual segmentation is possible, but it is a tedious and time-consuming task. The goal of this project is the development of an ML/data mining tool for this segmentation task. As ground truth, manual segmentations are available.

Supervisor: Dr. Markus Langer
Topic title: Fake it till you make it
Students: Jonathan, Dominik, Lennard, Florian
Description: A colleague of mine has forwarded me data from 100 mock job interviews, where participants responded to 8 interview questions and where performance on those questions was rated by trained raters. Additionally, participants were instructed to roll a die and lie on the respective interview question if they rolled a specific number. The data I have readily available are transcripts of those interviews as well as the ratings and the die rolls (so we know for which interview questions people actually lied). I can also manage to get the interview recordings to make this a bit more interesting. The question would be whether and to what extent it is possible to predict whether people lied during the interview questions and also to predict job interview performance ratings.

Supervisor: Dennis Fink
Topic title: Using AI to identify elementary particles seen in the world's largest cloud chamber
Students: Nicola, Lukas, Daniel
Description: The cloud chamber in the Luxembourg Science Center’s exhibition room allows visitors to visualize elementary particles (alpha, beta, protons, etc.), which helps them to understand natural radioactivity and background radiation. It is the largest such apparatus in the world and generates a large amount of exciting data we intend to analyze using AI.
The device works in the following way. First the machine evaporates isopropanol, so that the chamber is filled with isopropanol vapor. At the bottom of the chamber, the alcohol vapor is then cooled down to -30°C and so should condense into a liquid. However, it can only do this if it has something it can grab onto. Because the chamber is kept free of dust particles, only elementary particles flying through have enough energy to make the isopropanol condense. This process finally produces white streaks of different thicknesses, lengths, and forms, depending on the underlying particles.
The purpose of the AI would be to first detect those white streaks, identify the underlying particles as well as the well-known subatomic processes producing them, and analyze them statistically. The cloud chamber has a built-in camera, which should allow us to capture raw images in good quality and train the artificial intelligence. In a second stage, deeper studies could target recording and analyzing rare events like spallations, delta rays, cosmic showers, kaons, or muon decays.
No specific particle physics background is required for this project: LSC will provide the necessary knowledge.

Supervisor: Martin Müller
Topic title: Segmentation of microscopic images
Description: When analyzing the microstructure of a material, determining the grain size and phase fraction of microstructure constituents is a key aspect of quality control as well as materials development. It requires the segmentation of the microstructure image taken in a microscope, and simple segmentation techniques like thresholding are still state of the art. However, due to insufficient contrast of the microstructure after specimen preparation, these approaches quickly reach their limit, and significant manual improvements can become necessary. The emergence of machine learning offers a promising alternative for better segmentation workflows. The goal of this project is to build a deep learning segmentation model for steel microstructures.

Supervisor: Prof. Dr. Michael Zemlin
Topic title: Bronchopulmonary Dysplasia
Description: Prediction of lung disease (bronchopulmonary dysplasia, BPD) in very preterm neonates. Preterm neonates with a birth weight below 1,500 g are at high risk of developing BPD, which is classified as “mild”, “moderate”, or “severe” once the baby reaches the expected date of birth. An early identification of those infants that later develop severe BPD would allow an early preventive personalized therapy. In this project we want to use the annotated chest x-rays of the extremely preterm neonates born in our clinic, taken on their day of birth, to find predictors for the later development of BPD.

Supervisor: Martin Müller
Topic title: Prediction of chemical composition using microscopic image data
Students: Julius, Dennis, Phillip
Description: When analyzing the microstructure of a material, determining the chemical composition of microstructure constituents is a crucial step for solving many research questions. Usually, this is done using spectroscopic data, e.g., data from X-ray analysis, which is a reliable but time-consuming technique. However, when suitable imaging techniques are used, information about chemical composition is also included in the images. The goal of this project is to train a machine learning model, using images and spectroscopic data as the ground truth, that can predict the chemical composition based on images alone.

Supervisor: Dr. Casper Markus
Topic title: Deep-learning-based Diagnosis of Atrophic Gastritis by Endoscopy
Students: Siwen, Mohammadmahdi, Pegah
Description: Atrophic Gastritis is characterized by long-term inflammation of the gastric mucosa, caused e.g. by bacterial infection of a patient. If left untreated, Atrophic Gastritis can potentially lead to gastric cancer. In order to prevent a serious disease progression, reliable diagnosis by endoscopic imaging is needed. However, interpretation of this image data remains a challenge, even for medical experts. Here, we hope to find a machine-learning-based solution.

Supervisor: Martin Müller
Topic title: Classification of Microstructures
Students: Mathias, Hanna, Polina, Mohammad
Description: The field of materials science is fast moving, with major innovations happening at the smallest scales. This is because the frequency and distribution of different microstructures dictate a wide range of material properties and may be used as a basis for evaluating competing manufacturing processes. In order to improve the scalability and reliability of future annotations of these microstructures, we seek a machine-learning-based solution to classify a set of images into seven different microstructure classes.