Upcoming events

Automated and Foundational Verification of Low-Level Programs

Michael Sammler MPI-SWS
04 Dec 2023, 5:00 pm - 6:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 113
SWS Student Defense Talks - Thesis Defense
Formal verification is a promising technique to ensure the reliability of low-level programs like operating systems and hypervisors, since it can show the absence of whole classes of bugs and prevent critical vulnerabilities. To realize the full potential of formal verification for real-world low-level programs, however one has to overcome several challenges, including: (1) dealing with the complexities of realistic models of real-world programming languages; (2) ensuring the trustworthiness of the verification, ideally by providing foundational proofs (i.e., ...
Formal verification is a promising technique to ensure the reliability of low-level programs like operating systems and hypervisors, since it can show the absence of whole classes of bugs and prevent critical vulnerabilities. To realize the full potential of formal verification for real-world low-level programs, however one has to overcome several challenges, including: (1) dealing with the complexities of realistic models of real-world programming languages; (2) ensuring the trustworthiness of the verification, ideally by providing foundational proofs (i.e., proofs that can be checked by a general-purpose proof assistant); and (3) minimizing the manual effort required for verification by providing a high degree of automation.

This dissertation presents multiple projects that advance formal verification along these three axes: RefinedC provides the first approach for verifying C code that combines foundational proofs with a high degree of automation via a novel refinement and ownership type system. Islaris shows how to scale verification of assembly code to realistic models of modern instruction set architectures-in particular, Armv8-A and RISC-V. DimSum develops a decentralized approach for reasoning about programs that consist of components written in multiple different languages (e.g., assembly and C), as is common for low-level programs. RefinedC and Islaris rest on Lithium, a novel proof engine for separation logic that combines automation with foundational proofs.
Read more

Improving Decision Making with Machine Learning, Provably

Manuel Gomez Rodriguez Max Planck Institute for Software Systems
06 Dec 2023, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
Decision support systems for classification tasks are predominantly designed to predict the value of the ground truth labels.  However, these systems also need to make human experts understand when and how to use these predictions to update their own  predictions. Unfortunately, this has been proven challenging. In this talk, I will introduce an alternative type of decision support systems  that circumvent this challenge by design. Rather than providing a single label prediction, these systems provide a set of label prediction  values, ...
Decision support systems for classification tasks are predominantly designed to predict the value of the ground truth labels.  However, these systems also need to make human experts understand when and how to use these predictions to update their own  predictions. Unfortunately, this has been proven challenging. In this talk, I will introduce an alternative type of decision support systems  that circumvent this challenge by design. Rather than providing a single label prediction, these systems provide a set of label prediction  values, namely a prediction set, and forcefully ask experts to predict a label value from the prediction set. Moreover, I will discuss how to use conformal prediction, online learning and counterfactual inference to efficiently construct prediction sets that optimize experts’ performance,  provably. Further, I will present the results of a large-scale human subject study, which show that, for decision support systems based on  prediction sets, limiting experts’ level of agency leads to greater performance than allowing experts to always exercise their own agency.
Read more

I never willingly consented to this! Investigating PII leakage through SSO logins

Ha Dao MPI-INF - INET
10 Jan 2024, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
Social login platforms have become increasingly popular for websites, enabling users to sign in with their social networking identities. This convenient process, also known as Single Sign-On (SSO), uses Personally Identifiable Information (PII) provided by identity providers, such as Google or Facebook, to log in to multiple websites. However, there is a significant concern that SSO can potentially facilitate the leakage of PII to third-party tracking without the user's awareness. In this talk, I will give an overview of our recent work towards PII leakage to third-party tracking through SSO logins, ...
Social login platforms have become increasingly popular for websites, enabling users to sign in with their social networking identities. This convenient process, also known as Single Sign-On (SSO), uses Personally Identifiable Information (PII) provided by identity providers, such as Google or Facebook, to log in to multiple websites. However, there is a significant concern that SSO can potentially facilitate the leakage of PII to third-party tracking without the user's awareness. In this talk, I will give an overview of our recent work towards PII leakage to third-party tracking through SSO logins, representing the first in-depth analysis of PII leakage in SSO authentication flows. I will conclude the talk by discussing PII leakage across a spectrum of scenarios and identifying a number of possible research directions to improve transparency on the World Wide Web.
Read more

New Tools for Text Indexing and Beyond: Substring Complexity and String Synchronizing Sets

Tomasz Kociumaka MPI-INF - D1
07 Feb 2024, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
In recent decades, the volume of sequential data processed in genomics and other disciplines has exploded. This trend escalates the challenge of designing text indexes – concise data structures that allow answering various queries efficiently. For example, pattern-matching queries ask if the text (a long string modeling the dataset) contains an occurrence of a given pattern (a short string provided at query time). The suffix array is a classic index for pattern matching and many other queries, ...
In recent decades, the volume of sequential data processed in genomics and other disciplines has exploded. This trend escalates the challenge of designing text indexes – concise data structures that allow answering various queries efficiently. For example, pattern-matching queries ask if the text (a long string modeling the dataset) contains an occurrence of a given pattern (a short string provided at query time). The suffix array is a classic index for pattern matching and many other queries, but modern applications demand space-efficient alternatives like the FM index. Still, even these data structures struggle with massive highly repetitive datasets, such as human genome collections. This scenario requires indexes whose size is comparable to the best text compression methods.

The talk will introduce two tools originating from my text indexing research. Substring complexity is a well-behaved measure of text repetitiveness that aids in comparing the performance of compression algorithms. String synchronizing sets enable consistent selection of small subsets of text substrings. Recently, these tools became the core of the asymptotically smallest compressed text index with suffix-array functionality. Furthermore, they find applications beyond text indexing: in combinatorics on words, quantum string algorithms, and more.
Read more