Blog

Understanding how AI models use training data: a dive into influence functions and datamodels and how they relate

January 2025

This post is about data attribution. I explain continuous influence functions and their application to deep learning, then datamodeling, and finally some new math that links the two.

read

Somewhat more efficient component attribution by extracted linear models

December 2024

While messing around with component attribution (Shah et al., 2024), I stumbled upon a super simple and weird way to decrease the sample complexity required to get good attributions. Read on to find out what it is and how it works.

read

Model extraction, LLM abuse, steganography, and covert learning

October 2023

part 1: introduction to model extraction and observational defenses

In the first installment, we talk about model extraction attacks and explore Observational Model Extraction Defenses (OMEDs).

read part 1

part 2: understanding covert learning and its implications

The second part introduces Covert Learning, a novel approach that challenges the robustness of OMEDs.

read part 2

part 3: steganography, LLM abuse, and future defenses

In the concluding part, we explore the intersection of steganography and model extraction attacks and discuss the potential for abusing Large Language Models like ChatGPT.

read part 3