Recent events

Mechanistic interpretability of neural networks

Jonas Fischer MPI-INF - D2
06 Nov 2024, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
Modern machine learning (ML) has largely been driven by neural networks, delivering outstanding results in not only in traditional ML applications, but also permeating into classical sciences solving open problems in physics, chemistry, and biology. This success came at a cost, as typical neural network architectures are inherently complex and their reasoning processes opaque. In domains were insights weigh more than prediction, such as modeling of systems biology, or in high-stakes decision making, such as healthcare and finance, ...
Modern machine learning (ML) has largely been driven by neural networks, delivering outstanding results in not only in traditional ML applications, but also permeating into classical sciences solving open problems in physics, chemistry, and biology. This success came at a cost, as typical neural network architectures are inherently complex and their reasoning processes opaque. In domains were insights weigh more than prediction, such as modeling of systems biology, or in high-stakes decision making, such as healthcare and finance, the reasoning process of a neural network however needs to be transparent. Recently, the mechanistic interpretation of this reasoning process has gained significant attention, which is concerned with understanding the *internal reasoning process* of a network, including what information particular neurons respond to and how these specific neurons are organized into larger circuits. In this talk, I will (1) gently introduce the topic of mechanistic interpretability in machine learning, (2) show how to discover mechanistic circuits within a neural network, (3) discuss the relevance of mechanistic interpretability in real-world applications, and (4) discuss what is still missing in the field.
Read more

Semantics of declassification in higher-order languages

Jan Menz Max Planck Institute for Software Systems
29 Oct 2024, 5:00 pm - 6:00 pm
Saarbrücken building E1 5, room 029
SWS Student Defense Talks - Thesis Proposal
Declassification allows a programmer to purposfully reveal secret information. In my thesis, I will argue that treating higher-order programs explicitly and compositionally is important when dealing with declassification, and using purpose-built logical-relations semantics, we can reason about declassification in new ways while fully supporting higher-order programs. To support this claim, I will study two different styles of declassification and their semantics. In the first part of my thesis, I will show how to reason about where declassification, ...
Declassification allows a programmer to purposfully reveal secret information. In my thesis, I will argue that treating higher-order programs explicitly and compositionally is important when dealing with declassification, and using purpose-built logical-relations semantics, we can reason about declassification in new ways while fully supporting higher-order programs. To support this claim, I will study two different styles of declassification and their semantics. In the first part of my thesis, I will show how to reason about where declassification, an intensional declassification policy, in the higher-order setting by adapting logical relations techniques developed for noninterference. This approach leads to the first compositional semantics for an intensional declassification policy in the higher-order setting. In the second part of my thesis, I turn towards a style of declassification that has already been more extensively studied in the higher-order setting using logical relation semantics: what declassification. I will present a new approach to what declassification in higher-order languages. Specifically, I will present a novel dependency analysis using a dependent type system and show that what-declassification guarantees can be obtained from this dependency analysis using logical relations semantics.
Read more

Strategic and counterfactual reasoning in AI-assisted decision making

Stratis Tsirtsis Max Planck Institute for Software Systems
10 Oct 2024, 5:00 pm - 6:00 pm
Virtual talk
SWS Student Defense Talks - Thesis Proposal
From finance and healthcare to criminal justice and transportation, various domains that require critical decisions, traditionally made by humans, have increasingly incorporated artificial intelligence (AI) systems into the decision making process. Despite unprecedented advances in AI capabilities, in most real-world contexts, AI systems do not make highly consequential decisions autonomously; instead, they operate in conjunction with a human decision maker. The promise of human-AI collaboration lies in the complementary strengths of each to lead to better decisions overall. ...
From finance and healthcare to criminal justice and transportation, various domains that require critical decisions, traditionally made by humans, have increasingly incorporated artificial intelligence (AI) systems into the decision making process. Despite unprecedented advances in AI capabilities, in most real-world contexts, AI systems do not make highly consequential decisions autonomously; instead, they operate in conjunction with a human decision maker. The promise of human-AI collaboration lies in the complementary strengths of each to lead to better decisions overall. In this thesis, I present contributions on multiple dimensions of AI-assisted decision making that range from technical methodologies to empirical findings. I start by focusing on ways that AI systems can support strategic reasoning—the ability of humans to make decisions based on expectations about others’ behavior—and I introduce algorithmic methods that support a human decision maker in allocating resources to a population of other (strategic) humans. Then, I switch focus to counterfactual reasoning—the ability to reflect on how different decisions could have led events of the past to turn out differently. In that context, I propose algorithmic methods to help a human decision maker improve their future decisions by highlighting specific past decisions that could have led to better outcomes and may serve as a good learning signal. I conclude the thesis by proposing a computational model based on counterfactual simulations to predict the extent to which people hold a human and an AI agent responsible for an outcome in cases where the two agents are working together towards a common goal. The performance of this model is evaluated through an experiment with human participants conducted online.
Read more

Computing Graph Isomorphisms and Similarities

Daniel Neuen MPI-INF - D1
02 Oct 2024, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
Deciding if two graphs are isomorphic, i.e., whether they are structurally identical, is a fundamental computational problem that has many important applications. Despite extensive efforts in the past decades, GI has remained one of only few natural problems contained in NP that are neither known to be NP-complete nor known to be contained in P. In 20216, Babai proved in a breakthrough result that GI can be solved in quasipolynomial time. In the first part of my talk, ...
Deciding if two graphs are isomorphic, i.e., whether they are structurally identical, is a fundamental computational problem that has many important applications. Despite extensive efforts in the past decades, GI has remained one of only few natural problems contained in NP that are neither known to be NP-complete nor known to be contained in P. In 20216, Babai proved in a breakthrough result that GI can be solved in quasipolynomial time. In the first part of my talk, I will give an overview on recent developments on testing isomorphism. A closely related problem is that of determining the similarity between two graphs. Arguably, this problem is even more relevant for many applications such as machine learning tasks on graph-structured data. Natural approaches such as counting edge modifications required to turn one graph into the other are computationally intractable even on very restricted input graphs. As a result, various other methods have been proposed to measure similarity between graphs such as graph motif counts, i.e., measuring similarity based on counting certain patterns appearing in the graphs. In the second part of the talk, we will consider different ways of measuring similarity and discuss relations to standard heuristics for isomorphism testing that have been discovered in recent years.
Read more

Scaling Up Relaxed Memory Verification with Separation Logics

Hoang-Hai Dang Max Planck Institute for Software Systems
10 Sep 2024, 3:00 pm - 4:00 pm
Saarbrücken building E1 5, room 005
SWS Student Defense Talks - Thesis Defense
Reasoning about concurrency in a realistic, non-toy language like C/C++ or Rust, which encompasses many interweaving complex features, is very hard. Yet, realistic concurrency involves relaxed memory models, which are significantly harder to reason about than the simple, traditional concurrency model that is sequential consistency. To scale up verifications to realistic concurrency, we need a few ingredients: (1) strong but abstract reasoning principles so that we can avoid the too tedious details of the underlying concurrency model; ...
Reasoning about concurrency in a realistic, non-toy language like C/C++ or Rust, which encompasses many interweaving complex features, is very hard. Yet, realistic concurrency involves relaxed memory models, which are significantly harder to reason about than the simple, traditional concurrency model that is sequential consistency. To scale up verifications to realistic concurrency, we need a few ingredients: (1) strong but abstract reasoning principles so that we can avoid the too tedious details of the underlying concurrency model; (2) modular reasoning so that we can compose smaller verification results into larger ones; (3) reasoning extensibility so that we can derive new reasoning principles for both complex language features and algorithms without rebuilding our logic from scratch; and (4) machine-checked proofs so that we do not miss potential unsoundness in our verifications. With these ingredients in hand, a logic designer can flexibly accommodate the intricacy of relaxed memory features and the ingenuity of programmers who exploit those features. In this dissertation, I present how to develop strong, abstract, modular, extensible, and machine-checked separation logics for realistic relaxed memory concurrency in the Iris framework, using multiple layers of abstractions. I report two main applications of such logics: (i) the verification of the Rust type system with a relaxed memory model, where relaxed memory effects are encapsulated behind the safe interface of libraries and thus are not visible to clients, and (ii) the compositional specification and verification of relaxed memory libraries, in which relaxed memory effects are exposed to clients.
Read more

Reaching New Shores with Modern Separation Logic

Simon Spies Max Planck Institute for Software Systems
09 Sep 2024, 3:00 pm - 4:00 pm
Saarbrücken building E1 5, room 029
SWS Student Defense Talks - Thesis Proposal
The question of how to scalably verify large, stateful programs is one of the oldest—and still unsolved—questions of computer science. Over the last two decades, considerable progress has been made toward this goal with the advent of separation logic, a verification technique designed for modularly reasoning about stateful programs. While the original separation logic was only geared at verifying imperative, pointer-manipulating programs, modern versions of separation logic have become an essential tool in the toolbox of the working semanticist for modelling programming languages and verifying programs. ...
The question of how to scalably verify large, stateful programs is one of the oldest—and still unsolved—questions of computer science. Over the last two decades, considerable progress has been made toward this goal with the advent of separation logic, a verification technique designed for modularly reasoning about stateful programs. While the original separation logic was only geared at verifying imperative, pointer-manipulating programs, modern versions of separation logic have become an essential tool in the toolbox of the working semanticist for modelling programming languages and verifying programs.

With my thesis, I will present a line of work that focuses on the weak spots of modern separation logic. Concretely, in the context of the separation logic framework Iris, I will target two broader areas, *step-indexing* and *automation*. Step-indexing is a powerful technique for recursive definitions that is crucial for handling many of the more advanced features of modern languages. But if one does not closely follow the path laid out by its inventors, perfectly natural proof strategies turn into dead ends. Automation, on the other hand, is important for reducing the overhead of verification, which is all too often preventing verification from scaling to larger code bases.

Regarding step-indexing, the thesis will present two projects, Transfinite Iris and Later Credits. Transfinite Iris shows how to generalize step-indexing—which traditionally applies only to safety properties—to proving liveness properties. Later Credits leverage separation logic to develop an amortized form of step-indexing that enables more flexible proof patterns. Regarding automation, the thesis will present Quiver and, potentially, Daenerys. Quiver introduces a new form of guided specification inference to reduce the specification overhead of separation logic verification. Daenerys brings heap-dependent assertions— logic-level assertions containing program-level expressions—to separation logic. Unlike traditional separation-logic assertions about programs, these heap-dependent assertions validate first-order reasoning principles thus laying the groundwork for SMT-solver based automation.
Read more

Towards Better Foundations for Foundational Models: A Cognitivist Approach to Studying Large Language Models (LLMs)

Krishna Gummadi Max Planck Institute for Software Systems
04 Sep 2024, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
I will begin by demoing and releasing an LLM-based assistant that allows scientists to convert their papers (with a simple drag and drop) into short podcasts for communicating their research to a general audience. While we built the tool, we can’t explain its unreasonable effectiveness, i.e., we don’t really understand why it works or when it might fail. So in the rest of the talk, I will present our investigations into some basic curiosity-driven questions about LLMs; ...
I will begin by demoing and releasing an LLM-based assistant that allows scientists to convert their papers (with a simple drag and drop) into short podcasts for communicating their research to a general audience. While we built the tool, we can’t explain its unreasonable effectiveness, i.e., we don’t really understand why it works or when it might fail. So in the rest of the talk, I will present our investigations into some basic curiosity-driven questions about LLMs; specifically, how do LLMs receive, process, organize, store, and retrieve information. 

Our analysis is centered around engaging LLMs in two specific types of cognitive tasks: first, syntactically-rich (semantically-poor) tasks such as recognizing formal grammars, and next, semantically-rich (syntactically-poor) tasks such as answering factual knowledge questions about real-world entities. Using carefully designed experimental frameworks, we attempt to answer the following foundational questions: (a) how can we estimate what latent skills and knowledge a (pre-trained) LLM possesses? (b) (how) can we distinguish whether some LLM has learnt some training data by rote vs. by understanding? (c) what is the minimum amount of training data/costs needed for a (pre-trained) LLM to acquire a new skill or knowledge?  (d) when solving a task, is training over task examples better, worse, or similar to providing them as in-context demonstrations?

I will present some initial empirical results from experimenting with a number of large open-source language models and argue that our findings they have important implications for the privacy of training data (including potential for memorization), the reliability of generated outputs (including potential for hallucinations), and the robustness of LLM-based applications (including our podcast assistant for science communication). 
Read more

Foundational Verification of Low-Level Code: RefinedC and Beyond

Deepak Garg Max Planck Institute for Software Systems
03 Jul 2024, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
Low-level code written in languages like C forms the bedrock of most computer systems today. Establishing the correctness of this code is important for the safety and security of our systems. The gold standard for code correctness is foundational verification -- the construction of mathematical proofs of program correctness verified in a proof assistant like Coq. However, foundational verification of low-level code is a tedious task that has, so far, largely relied on adapting code to suit the constraints of the verification framework, ...
Low-level code written in languages like C forms the bedrock of most computer systems today. Establishing the correctness of this code is important for the safety and security of our systems. The gold standard for code correctness is foundational verification -- the construction of mathematical proofs of program correctness verified in a proof assistant like Coq. However, foundational verification of low-level code is a tedious task that has, so far, largely relied on adapting code to suit the constraints of the verification framework, e.g., the programming idioms that are supported.

In this talk, I will present RefinedC, a new Coq-based C verification framework that aims to address the above limitation. By design, RefinedC is foundational (all proofs are mechanically verified within Coq), extensible (it can be extended to support new program properties, programming idioms and even new languages) and automation-friendly. Time permitting, I will also discuss projects that extend RefinedC's capabilities to reason about code written in other low-level languages like assembly, and code written in more than one language.

(This talk is based on joint work with several other research groups, notably the group of Derek Dreyer at MPI-SWS.)
Read more

Understanding LLMs Under the Hood

Ellie Pavlick Brown University
25 Jun 2024, 12:00 pm - 2:00 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
Large language models (LLMs) currently dominate AI. They exhibit impressive, often human-like, behavior across a wide range of linguistic and non-linguistic tasks. However, LLMs are the result of a combination of clever engineering and financial investment, rather than a result of rigorous scientific or theoretical study. Thus, very little is known about how or why LLMs exhibit the behaviors that they do. As a result, LLMs often seem mysterious and unpredictable. In this talk, I will discuss recent work which seeks to characterize the mechanisms that LLMs employ in terms of higher-level data structures and algorithms. ...
Large language models (LLMs) currently dominate AI. They exhibit impressive, often human-like, behavior across a wide range of linguistic and non-linguistic tasks. However, LLMs are the result of a combination of clever engineering and financial investment, rather than a result of rigorous scientific or theoretical study. Thus, very little is known about how or why LLMs exhibit the behaviors that they do. As a result, LLMs often seem mysterious and unpredictable. In this talk, I will discuss recent work which seeks to characterize the mechanisms that LLMs employ in terms of higher-level data structures and algorithms. Using results on a series of simple tasks, I will argue that LLMs are not as inscrutable as they seem, but rather make use of simple and often modular computational components to solve complex problems.
Read more

Formal Controller Synthesis for Dynamical Systems: Decidability and Scalability

Mahmoud Salamati Max Planck Institute for Software Systems
20 Jun 2024, 2:00 pm - 3:00 pm
Kaiserslautern building G26, room 111
SWS Student Defense Talks - Thesis Defense
In cyber-physical systems research, a significant challenge is synthesizing reliable controllers that meet general temporal specifications. These controllers must offer formal guarantees against various sources of disturbances. Efficient formal controller design opens up numerous exciting applications across different domains, including the autonomous vehicle industry, energy systems, and healthcare. In this talk, I will discuss formal controller synthesis for three different classes of dynamical systems. First, I will address our contributions to improving the scalability of abstraction-based controller design methods for dynamical systems with continuous state spaces and bounded uncertainty. ...
In cyber-physical systems research, a significant challenge is synthesizing reliable controllers that meet general temporal specifications. These controllers must offer formal guarantees against various sources of disturbances. Efficient formal controller design opens up numerous exciting applications across different domains, including the autonomous vehicle industry, energy systems, and healthcare. In this talk, I will discuss formal controller synthesis for three different classes of dynamical systems. First, I will address our contributions to improving the scalability of abstraction-based controller design methods for dynamical systems with continuous state spaces and bounded uncertainty. These contributions include: (1) a data-driven scheme for relaxing the requirement of having analytical model, (2) a neural abstraction method to lower memory requirements of both synthesis and deployment, and (3) a scalable method for solving reach-avoid problems for multi-agent systems. Second, I will briefly describe our contributions to time-bounded model checking for continuous-time Markov decision processes, specifically: (1) a conditional decidability result, and (2) a systematic method for improving the scalability of numerical approximation methods. Finally, I will briefly discuss our results related to the decidability of reachability-related problems for linear dynamical systems, which include: (1) a hardness result for point-to-point reachability of linear dynamical systems with hyper-rectangular control sets, and (2) decidability of pseudo-reachability for hyperplane target sets.
Read more

Hunting new antibiotics with computer science

Alexey Gurevich Fachrichtung Informatik - Saarbrücken
05 Jun 2024, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
WHO describes antimicrobial resistance (AMR) as one of the top global public health and development threats, while the World Bank estimates its additional healthcare burden as US$ 1 trillion by 2050. One strategy to combat AMR is significantly speeding up and reducing the cost of discovering new antibiotics, particularly those existing in nature but evading all current attempts to find them. Surprisingly, computer scientists play an increasingly important role in this endeavor. Modern biotechnology allows us to amass vast amounts of data on natural antibiotics and their tiny producers. ...
WHO describes antimicrobial resistance (AMR) as one of the top global public health and development threats, while the World Bank estimates its additional healthcare burden as US$ 1 trillion by 2050. One strategy to combat AMR is significantly speeding up and reducing the cost of discovering new antibiotics, particularly those existing in nature but evading all current attempts to find them. Surprisingly, computer scientists play an increasingly important role in this endeavor. Modern biotechnology allows us to amass vast amounts of data on natural antibiotics and their tiny producers. However, specialized software and algorithms are the key to unlocking this wealth of information and turning it into medically important discoveries.

In my talk, I will demonstrate how computer science methods, from classical graph and string algorithms to deep learning, help transform antibiotic discovery into a high-throughput technology and realize the promise of already collected and rapidly growing biological datasets. The particular focus will be on analyzing sequencing (strings over the DNA alphabet of {A, C, G, T} letters) and mass spectrometry data (two-dimensional arrays of floating point values). In both cases, we deal with noisy experimental data, often rely on heuristics to make the processing time reasonable, and finally, get biologically relevant findings suitable for verification by our wet-lab collaborators.
Read more