We propose CAMIA (Context-Aware Membership Inference Attack), a simple and effective method to infer whether a large language model was pre-trained on a given text.
Large language models (LLMs) are known to memorize parts of their training data, raising concerns about privacy, copyright, and evaluation contamination. Membership Inference Attacks (MIAs) are designed to assess such memorization by testing whether a given data point was part of a model's training set. However, MIAs applied to pre-trained LLMs have been largely ineffective. A key reason is that most algorithms were originally developed for classification models and fail to capture the generative nature of LLMs. While classification models output a single prediction, LLMs generate text token by token, with each prediction shaped by the prefix that comes before it. Consequently, MIAs that simply aggregate outputs over an entire sequence miss the crucial token-level dynamics that underlie memorization in LLMs.
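To make the contrast concrete, below is a minimal sketch of such a sequence-aggregated baseline: a simple loss-thresholding attack that scores a candidate text by its average token loss over the whole sequence. It assumes a Hugging Face causal LM; the model name and threshold are illustrative choices, not taken from the paper.

# Minimal sketch of a conventional sequence-level loss MIA: one aggregate
# score per sequence, thresholded. Model name and threshold are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "EleutherAI/pythia-2.8b"  # assumption: any causal LM works here
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

def sequence_loss(text: str) -> float:
    """Average cross-entropy over all tokens -- a single number per sequence."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)  # HF shifts labels internally
    return out.loss.item()

def is_member(text: str, threshold: float = 2.5) -> bool:
    # Lower loss = more likely seen during training (threshold is illustrative).
    return sequence_loss(text) < threshold

Because this score collapses the whole sequence into one average, it discards exactly the per-token trajectory that the next paragraphs argue is where memorization shows up.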
Our key insight is that memorization in LLMs is context-dependent.
We introduce CAMIA, a context-aware MIA that analyzes how quickly a model moves from uncertainty to confident predictions as it generates text.
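The exact statistics CAMIA computes are defined in the paper; the sketch below only illustrates the underlying idea, reusing model and tokenizer from the previous snippet. The confidence_gain feature (early-half loss minus late-half loss) is our own hypothetical simplification of "how quickly the model becomes confident", not the paper's scoring function.

# Hedged illustration of the token-level idea: compute per-token losses and
# summarize how fast the model moves from uncertain to confident predictions.
import torch
import torch.nn.functional as F

def per_token_losses(model, tokenizer, text: str) -> torch.Tensor:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Next-token prediction: logits at position t score the token at t+1.
    shift_logits = logits[0, :-1, :]
    shift_labels = ids[0, 1:]
    return F.cross_entropy(shift_logits, shift_labels, reduction="none")

def confidence_gain(losses: torch.Tensor) -> float:
    """Early-half loss minus late-half loss (illustrative feature only).

    A large positive value means the model became confident quickly once it
    had some prefix context -- per the paper's insight, a memorization signal
    that a sequence-averaged score cannot see.
    """
    half = len(losses) // 2
    return (losses[:half].mean() - losses[half:].mean()).item()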
On the MIMIR benchmark, across Pythia and GPT-Neo models and six domains, CAMIA consistently outperforms existing attacks. For example, when applied to Pythia-2.8B on the ArXiv dataset, it raises the true positive rate from 20.11% to 32.00% at a fixed 1% false positive rate.
@inproceedings{chang2024context,
  title={Context-aware membership inference attacks against pre-trained large language models},
  author={Chang, Hongyan and Shamsabadi, Ali Shahin and Katevas, Kleomenis and Haddadi, Hamed and Shokri, Reza},
  booktitle={Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year={2025}
}