EML Munich

Thesis Topics

If you are a student at the Technical University of Munich in an MSc degree program at the department of Computer Science, and are interested in working with us (e.g. for a master's thesis, research internship, research project, or as a HiWi student), please send an enquiry to leander.girrbach@helmholtz-munich.de. Please include a current CV with a description of previous research/work experience, a transcript of all previous courses and grades, and a brief statement of what research you want to do and why you want to join our group. A list of open topics is below. Note, however, that this list is not exhaustive, and we might have additional topics available.

Adaptation of Text-to-Image Models

Text-to-Image (T2I) generative models have excelled in crafting visually appealing images from text prompts. After training, T2I models can be adapted, for example, to introduce precise control over the generated content or to improve the prompt-following capabilities of the model. This can be realized through either fine-tuning or test-time adaptation methods applied to the T2I model. We aim to explore how to improve upon existing adaptation methods.

Analysis of Large Multimodal Models

The emergence of Large Multimodal Models (LMMs) has revolutionized many multimodal tasks that were previously considered impractical. However, it is important to be aware of their shortcomings as well as their capabilities. One such limitation is that, like any other machine learning system, LMMs may be prone to exhibiting human-like biases. We aim to explore how to identify and evaluate these biases and to develop methods to mitigate them.

Continual Learning with Pretrained Models

Continual Learning (CL) was originally used to train models from scratch on a stream of tasks. However, with the rise of ubiquitous pretrained foundation models, the focus in CL has shifted to continually adapting these pretrained models for downstream tasks. Parameter-Efficient Fine-Tuning (PEFT) has proven highly successful in this regard. Nonetheless, several open challenges remain. Our aim is to explore how PEFT can be used in a continual or lifelong learning setting to adapt a pretrained model to various downstream tasks.

Distillation of Vision Language Models

Vision Language Models, with Large Language Models as their backbone, have showcased impressive skills in tasks related to visual understanding and reasoning. Yet their widespread application faces obstacles due to high computational demands during both training and inference. A common technique for reducing these demands is knowledge distillation. We aim to explore how to distill the knowledge from larger models into smaller models.
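As a rough illustration of what knowledge distillation can look like, the sketch below computes one common formulation of a distillation loss: the KL divergence between temperature-softened teacher and student output distributions. The function names, the temperature value, and the choice of this particular loss are assumptions made for illustration; they are not a description of the methods the project will use.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [v / total for v in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions (illustrative only).

    The temperature**2 factor is a commonly used rescaling that keeps
    gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # teacher (target) distribution
    q = softmax(student_logits, temperature)  # student distribution
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Hypothetical logits for a three-class problem.
teacher = [4.0, 1.0, 0.5]
student = [2.0, 1.5, 0.5]
loss = distillation_loss(student, teacher)
```

The loss is zero when the student reproduces the teacher's logits and grows as the two distributions diverge.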

Mixture of Parameter-Efficient Experts

Parameter-Efficient Fine-Tuning (PEFT) techniques can be used to achieve model specialization by introducing small, task-specific modules that are fine-tuned while keeping the base model frozen. This approach creates specialized versions for different applications without the need to retrain the entire model. In contrast, Mixture of Experts (MoE) models implicitly enable specialization of separate model components, called experts, through a dynamic routing mechanism. This mechanism directs different inputs to specific experts, allowing the model to develop specialized subnetworks for various types of data or subtasks. Our aim is to integrate these two techniques by combining the dynamic routing mechanism of MoE with the parameter efficiency and modularity of PEFT methods.
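To make the combination concrete, here is a minimal sketch of a single linear layer with a frozen base weight, several low-rank (LoRA-style) expert updates, and a top-k softmax router that selects which experts are applied to an input. Everything here is an assumption for illustration: the class names, the use of low-rank adapters as the PEFT module, the top-k gating rule, and the initialization are hypothetical choices, not the design the project will adopt.

```python
import math
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def softmax(z):
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]

class LoRAExpert:
    """Low-rank update delta(x) = B (A x); A is rank x d_in, B is d_out x rank."""
    def __init__(self, d_in, d_out, rank, rng):
        self.A = [[rng.gauss(0, 0.1) for _ in range(d_in)] for _ in range(rank)]
        # Zero-initialized B means the expert contributes nothing before training.
        self.B = [[0.0] * rank for _ in range(d_out)]

    def __call__(self, x):
        return matvec(self.B, matvec(self.A, x))

class MixtureOfLoRA:
    """Frozen base weight W plus a routed mixture of low-rank experts."""
    def __init__(self, W, num_experts, rank, top_k=2, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W  # frozen base weight, never updated
        self.experts = [LoRAExpert(d_in, d_out, rank, rng) for _ in range(num_experts)]
        self.router = [[rng.gauss(0, 0.1) for _ in range(d_in)] for _ in range(num_experts)]
        self.top_k = top_k

    def gate(self, x):
        """Return (expert index, weight) pairs for the top-k experts."""
        logits = matvec(self.router, x)
        top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[: self.top_k]
        weights = softmax([logits[i] for i in top])
        return list(zip(top, weights))

    def forward(self, x):
        y = matvec(self.W, x)
        for idx, w in self.gate(x):
            delta = self.experts[idx](x)
            y = [yi + w * di for yi, di in zip(y, delta)]
        return y

W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
model = MixtureOfLoRA(W, num_experts=4, rank=2, top_k=2)
x = [1.0, 2.0, 3.0]
routing = model.gate(x)
output = model.forward(x)
```

In this sketch only the expert matrices and the router would be trained; the base weight stays fixed, which is what keeps the approach parameter-efficient.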

Training with Synthetic Data

Curating large datasets is expensive. Recently, however, Text-to-Image (T2I) generative models have excelled in crafting visually appealing images from text prompts. This opens the door to the mass generation of synthetic data for downstream training. We aim to explore how best to use T2I models to generate images suitable for training downstream models.

Measuring the Role of Compositionality in Sequences

This project focuses on studying methods to measure the level of compositionality in sequential data. Many sequential datasets contain repeated patterns organized in hierarchical, compositional structures. Extracting these patterns as tokens is vital for machine learning models to effectively process and compute representations from sequences. The research will involve conducting a literature review on this topic, testing various measures of compositionality, and exploring different tokenization methods to improve machine learning models. The aim is to enhance our understanding of compositional structures in sequential data and their impact on model performance.
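One familiar way of turning repeated patterns into tokens is a byte-pair-encoding-style merge, in which the most frequent adjacent pair of symbols is replaced by a single new token. The sketch below shows one such merge step on a toy sequence. It is only an example of a tokenization method, chosen here as an assumption; the function names are hypothetical, and the project is not limited to this approach.

```python
from collections import Counter

def most_frequent_pair(seq):
    """Return the most frequent adjacent pair in seq, or None if seq is too short."""
    counts = Counter(zip(seq, seq[1:]))
    if not counts:
        return None
    return counts.most_common(1)[0][0]

def merge_pair(seq, pair):
    """Replace every non-overlapping occurrence of pair with one merged token."""
    merged, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            merged.append(seq[i] + seq[i + 1])
            i += 2
        else:
            merged.append(seq[i])
            i += 1
    return merged

# Toy sequence with an obvious repeated pattern.
sequence = list("abababc")
pair = most_frequent_pair(sequence)
tokens = merge_pair(sequence, pair)
```

Repeating the merge step builds progressively longer tokens out of shorter ones, which gives a simple hierarchical view of the patterns in a sequence.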