Events 2024

New Tools for Text Indexing and Beyond: Substring Complexity and String Synchronizing Sets

Tomasz Kociumaka MPI-INF - D1
07 Feb 2024, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
In recent decades, the volume of sequential data processed in genomics and other disciplines has exploded. This trend escalates the challenge of designing text indexes – concise data structures that allow answering various queries efficiently. For example, pattern-matching queries ask if the text (a long string modeling the dataset) contains an occurrence of a given pattern (a short string provided at query time). The suffix array is a classic index for pattern matching and many other queries, ...
In recent decades, the volume of sequential data processed in genomics and other disciplines has exploded. This trend escalates the challenge of designing text indexes – concise data structures that allow answering various queries efficiently. For example, pattern-matching queries ask if the text (a long string modeling the dataset) contains an occurrence of a given pattern (a short string provided at query time). The suffix array is a classic index for pattern matching and many other queries, but modern applications demand space-efficient alternatives like the FM index. Still, even these data structures struggle with massive highly repetitive datasets, such as human genome collections. This scenario requires indexes whose size is comparable to the best text compression methods.

The talk will introduce two tools originating from my text indexing research. Substring complexity is a well-behaved measure of text repetitiveness that aids in comparing the performance of compression algorithms. String synchronizing sets enable consistent selection of small subsets of text substrings. Recently, these tools became the core of the asymptotically smallest compressed text index with suffix-array functionality. Furthermore, they find applications beyond text indexing: in combinatorics on words, quantum string algorithms, and more.
Read more

I never willingly consented to this! Investigating PII leakage through SSO logins

Ha Dao MPI-INF - INET
10 Jan 2024, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
Social login platforms have become increasingly popular for websites, enabling users to sign in with their social networking identities. This convenient process, also known as Single Sign-On (SSO), uses Personally Identifiable Information (PII) provided by identity providers, such as Google or Facebook, to log in to multiple websites. However, there is a significant concern that SSO can potentially facilitate the leakage of PII to third-party tracking without the user's awareness. In this talk, I will give an overview of our recent work towards PII leakage to third-party tracking through SSO logins, ...
Social login platforms have become increasingly popular for websites, enabling users to sign in with their social networking identities. This convenient process, also known as Single Sign-On (SSO), uses Personally Identifiable Information (PII) provided by identity providers, such as Google or Facebook, to log in to multiple websites. However, there is a significant concern that SSO can potentially facilitate the leakage of PII to third-party tracking without the user's awareness. In this talk, I will give an overview of our recent work towards PII leakage to third-party tracking through SSO logins, representing the first in-depth analysis of PII leakage in SSO authentication flows. I will conclude the talk by discussing PII leakage across a spectrum of scenarios and identifying a number of possible research directions to improve transparency on the World Wide Web.
Read more