The aim of this practical seminar is to carry out a complete project pipeline, from the problem statement to finding solutions, using machine learning methods on our Deep Learning cluster. The topics are proposed by different groups of the Mathematics and Computer Science departments and cover, for example, computational linguistics, bioinformatics, computer vision, computer graphics, language processing, and optimization. Each topic is supervised by the group that proposed it. Up to three students work on one topic.
The project seminar consists of three parts. In the first part, the student group working on a topic gets acquainted with the topic and the input data, and researches potential solutions for the given problem. The second part is the implementation and testing of solutions for the problem. At the end of the course, the groups present their topics and solutions in a seminar. Successful completion of the seminar is rewarded with 7 CP for Bioinformatics master students. Data Science & Artificial Intelligence bachelor students receive 9 CP.
News: Final topics have been assigned.
Organizers:
Prof. Dr. Andreas Keller, Georges P. Schmartz
Please see below for the individual project supervisors/tutors.
Dates:
Registration * | HERE from April 12 to April 19, 2021 |
Kick-off meeting | April 20, 2021, 09:00-10:00, Online conference via Microsoft Teams with invitation to all registered participants |
Deadline to register in HISPOS OR de-register from seminar * | HERE until May 17, 2021 |
Lecture dates | General introduction: April 21, 2021, 09:00-10:00, Online conference via Microsoft Teams with invitation to all registered participants; Technical introduction: May 5, 2021, 09:00-10:00, Online conference via Microsoft Teams with invitation to confirmed participants |
Intermediate Progress Meeting | May 26, 2021, 10:00, Online conference via Microsoft Teams with invitation to confirmed participants |
Deadline for handing in results of implementation | July 12, 2021 |
Presentations | July 14, 2021, 10:00, Online conference via Microsoft Teams with invitation to confirmed participants |
Final report deadline | July 19, 2021 |
* If you want to deregister from the seminar, please send the tutor an email, irrespective of whether you (de)registered in HISPOS or not.
Requirements for participation:
- at least one passed course from: Machine Learning, Elements of [Statistical; Machine] Learning, or Neural Networks: Implementation and Applications
Certificate requirements:
- Successful presentation:
  - Talk: 30 minutes
  - Questions from the tutors/audience after the presentation
- Taking minutes during the practical part to make clear which student worked on which part of the project.
- Handing in a final report after the presentation along with the minutes of the practical part.
- DSAI students only:
  - Successful completion of an additional project-specific task
Final grade:
- Based on the given presentation (see “Certificate requirements”)
- May be influenced by the submitted report and handling of the practical part
Topics
Supervisor | Topic title | Students | Description |
Dr. Anne Hecksteden | The Athlete Performance Passport | Urs, Nora, Frederik | AI-led targeting of anti-doping tests: Detecting doping rule violations is essential for clean sports. With limited testing resources, targeting anti-doping tests is important. As the main aim of doping is an increase in performance, an initial suspicion can be gained (in agreement with common sense) from striking performance developments. Altered competition schemes (few but top performances) or background data may also be informative. The main aim of this project is to identify conspicuous patterns in publicly available athlete performance data. A dataset from swimming (allowing for supervised learning approaches) is available. |
Dr. Cosima Zemlin | Lung metastasis or not? | | This question is essential for patients with an initial diagnosis of mammary carcinoma, since lung metastases have a great influence on survival. Current guidelines require a thoracic computed tomography (CT) scan during staging of a newly diagnosed mammary carcinoma. However, a thoracic CT consists of numerous images, and it is sometimes difficult to differentiate between a metastasis and non-specific lesions that can arise from previous pneumonia, tuberculosis, etc. In this project we want to use annotated thoracic CTs from patients of the department for gynecology at the UKS to identify characteristics of lung metastases. |
Jenifer Barrirero | Three-dimensional segmentation of near-atomically resolved tomographies | | Atom probe tomography is a characterization method in materials science used to obtain three-dimensional reconstructions with near-atomic resolution. Datasets can be described as three-dimensional clouds of millions of points corresponding to atoms in a material. Analyzing the segregation of atoms inside these datasets makes it possible to understand and optimize mechanical properties. Automatic segmentation to distinguish dissolved and segregated atoms, based on iso-concentration surfaces, reaches its limit when only a few atoms of a species are present. In those cases, manual segmentation is possible, but it is a tedious and time-consuming task. The goal of this project is the development of an ML/data mining tool for this segmentation task. Manual segmentations are available as ground truth. |
Dr. Markus Langer | Fake it till you make it | Jonathan, Dominik, Lennard, Florian | A colleague of mine has forwarded me data from 100 mock job interviews, in which participants responded to 8 interview questions and their performance on those questions was rated by trained raters. Additionally, participants were instructed to roll a die and lie on the respective interview question if they rolled a specific number. The data I have readily available are transcripts of those interviews as well as the ratings and the die rolls (so we know for which interview questions people actually lied). I can also manage to get the interview recordings to make this a bit more interesting. The question is whether, and to what extent, it is possible to predict whether people lied on an interview question and to predict job interview performance ratings. |
Dennis Fink | Using AI to identify elementary particles seen in the world's largest cloud chamber | Nicola, Lukas, Daniel | The cloud chamber in the Luxembourg Science Center's exhibition room allows visitors to visualize elementary particles (alpha, beta, protons, etc.), which helps them to understand natural radioactivity and background radiation. It is the largest such apparatus in the world and generates a large amount of exciting data that we intend to analyze using AI. The device works as follows. First, the machine evaporates isopropanol so that the chamber is filled with isopropanol vapor. At the bottom of the chamber, the vapor is then cooled down to -30°C and so should condense into a liquid. However, it can only do this if it has something to grab onto. Because the chamber is kept free of dust particles, only elementary particles flying through have enough energy to condense the isopropanol. This process produces white streaks of different thicknesses, lengths, and forms, depending on the underlying particles. The purpose of the AI would be to first detect those white streaks, then identify the underlying particles and the well-known subatomic processes producing them, and finally analyze them statistically. The cloud chamber has a built-in camera, which should allow us to capture raw images of good quality and train the artificial intelligence. In a second stage, deeper studies could target recording and analyzing rare events such as spallations, delta rays, cosmic showers, kaons, or muon decays. No specific particle physics background is required for this project: LSC will provide the necessary knowledge. |
Martin Müller | Segmentation of microscopic images | | When analyzing the microstructure of a material, determining the grain size and phase fraction of microstructure constituents is a key aspect of quality control as well as materials development. It requires segmenting the microstructure image taken in a microscope, and simple segmentation techniques like thresholding are still state of the art. However, due to insufficient contrasting of the microstructure during specimen preparation, these approaches quickly reach their limit, and significant manual improvements can become necessary. The emergence of machine learning offers a promising alternative for better segmentation workflows. The goal of this project is to build a deep learning segmentation model for steel microstructures. |
Prof. Dr. Michael Zemlin | Bronchopulmonary Dysplasia | | Prediction of lung disease (bronchopulmonary dysplasia, BPD) in very preterm neonates. Preterm neonates with a birth weight below 1,500 g are at high risk of developing BPD, which is classified as "mild", "moderate", or "severe" once the baby reaches the expected date of birth. Early identification of those infants who later develop severe BPD would allow an early, preventive, personalized therapy. In this project we want to use the annotated chest x-rays of the extremely preterm neonates born in our clinic, taken on their day of birth, to find predictors for the later development of BPD. |
Martin Müller | Prediction of chemical composition using microscopic image data | Julius, Dennis, Phillip | When analyzing the microstructure of a material, determining the chemical composition of microstructure constituents is a crucial step for solving many research questions. Usually, this is done using spectroscopic data, e.g., data from X-ray analysis, which is a reliable but time-consuming technique. However, when suitable imaging techniques are used, information about chemical composition is also included in these images. The goal of this project is to train a machine learning model, using images and spectroscopic data as the ground truth, that can predict the chemical composition based on images alone. |
Dr. Casper Markus | Deep-learning based Diagnosis of Atrophic Gastritis by Endoscopy | Siwen, Mohammadmahdi, Pegah | Atrophic gastritis is characterized by long-term inflammation of the gastric mucosa, caused e.g. by a bacterial infection. If left untreated, atrophic gastritis can potentially lead to gastric cancer. In order to prevent serious disease progression, reliable diagnosis by endoscopic imaging is needed. However, interpretation of this image data remains a challenge, even for medical experts. Here, we hope to find a machine-learning-based solution. |
Martin Müller | Classification of Microstructures | Mathias, Hanna, Polina, Mohammad | The field of materials science is fast moving, with major innovations happening at the smallest size scales. This is because the frequency and distribution of different microstructures dictate a wide range of material properties and may be used as a basis for evaluating competing manufacturing processes. In order to improve the scalability and reliability of future annotations of these microstructures, we seek a machine-learning-based solution to classify a set of images into seven different microstructure classes. |