Research

Generalization and mechanistic interpretability

We pursue a simple goal: to understand not just whether models work, but why they work, when they fail, and what they are truly relying on under the hood. Our research focuses on robust generalization under distribution shift, the emergence of spurious correlations and shortcut strategies, and the internal mechanisms that drive these behaviors. We develop methods that go beyond merely cataloging failures after the fact by revealing hidden biases in learned representations, tracing shortcut learning through embeddings and weight space, and testing whether models can transfer abstract knowledge beyond the settings in which it was first acquired.

Elena Burceanu, Antonio Barbalau, Cristian Păduraru

Pathways of Visual Information Flow in Vision-Language Models (Under review)
idSCD: Identifying Training Datasets through Semantic Correlation Descriptors (Under review)
How the Optimizer Shapes Learned Solutions in Equivariant Neural Networks (ICML 2026 Workshop on Weight-Space Symmetries)

All papers in this direction: Generalization & Interpretability

Deepfake detection

We focus on advancing deepfake detection across video and audio modalities. Our research is guided by three goals: (i) Generalization: develop methods that transfer across diverse datasets and forgery techniques. (ii) Transparency: understand how detection models make decisions and ensure datasets are reliable (free of spurious shortcuts). (iii) Deployability: build systems that adapt and remain robust on unconstrained “in-the-wild” content.

Elisabeta Oneață, Dan Oneață, Ștefan Smeu, Dragoș-Alexandru Boldișor

All papers in this direction: Deepfake Detection

Natural language processing

We focus on large language models, as well as reliability, reasoning, and scientific machine learning. We apply this work in areas such as code and low-level languages, multilingual systems, and structured domains like molecules.

Florin Brad, Andrei Manolache, Ioana Pintilie, Marius Drăgoi, Alexandra Dragomir

Protein Fold Classification at Scale: Benchmarking and Pretraining (ICML 2026)
JumpLoRA: Sparse Adapters for Continual Learning in Large Language Models (Under review)
CLewR: Curriculum Learning with Restarts for Machine Translation Preference Learning (ACL 2026)

All papers in this direction: Natural Language Processing

Reinforcement learning

Reinforcement learning is a natural setting for training agents that interact with the real world. We develop agents that learn continuously and efficiently in complex, non-stationary environments.

Florin Gogianu

Revisiting Adam for Streaming Reinforcement Learning (Reinforcement Learning Conference (RLC) 2026)

All papers in this direction: Reinforcement Learning

All publications

Every academic paper below shows its research direction tag; use it to open the full list for that area.

Academic Generalization & Interpretability

Pathways of Visual Information Flow in Vision-Language Models

Israfel Salazar, Stella Frank, Dan Oneata, Desmond Elliott, Constanza Fierro

Under review · Jul 2026

Links: arXiv

Abstract: We study how visual information is routed in vision-language models (VLMs). Using causal patching on controlled synthetic and …

Academic Deepfake Detection

Anchoring the Unknown: Open-Set Model Attribution via Proxy-Anchor Learning

Cristian-Teodor Neamtu, Serban Mihalache, Stefan Smeu, Dan Oneata, Horia Cucu, Dragos Burileanu

Accepted at EUSIPCO 2026 · Jun 2026

Links: arXiv

Abstract: The proliferation of text-to-speech (TTS) systems capable of generating realistic synthetic speech poses growing challenges for …

Academic Generalization & Interpretability

idSCD: Identifying Training Datasets through Semantic Correlation Descriptors

Andrada Gobeaja, Ionut Hodoroaga, Elena Burceanu, Marius Leordeanu

Under review · May 2026

Links: arXiv

Abstract: Can a dataset be recognized from the spurious correlations it induces during training? We argue that datasets leave …

Academic Generalization & Interpretability

How the Optimizer Shapes Learned Solutions in Equivariant Neural Networks

Teodor-Mihai Stupariu, Andrei Manolache

Accepted at ICML 2026 Workshop on Weight-Space Symmetries · May 2026

Links: arXiv

Abstract: Equivariant neural networks encode geometric symmetries by construction, yet they are often difficult to optimize and can …

Academic Natural Language Processing

Protein Fold Classification at Scale: Benchmarking and Pretraining

Dexiong Chen, Andrei Manolache, Mathias Niepert, Karsten Borgwardt

Accepted at ICML 2026 (Oral) · May 2026

Links: arXiv · GitHub

Abstract: Classifying protein topology is essential for deciphering biological function, but progress is held back by the lack of …

Academic Reinforcement Learning

Revisiting Adam for Streaming Reinforcement Learning

Florin Gogianu, Adrian Catalin Lutu, Razvan Pascanu

Accepted at Reinforcement Learning Conference (RLC) 2026 · May 2026

Links: arXiv

Abstract: Learning from a sequence of interactions, as soon as observations are perceived and acted upon, without explicitly storing them, …

Academic Generalization & Interpretability

Fine-Tuning Regimes Define Distinct Continual Learning Problems

Paul-Tiberiu Iordache, Elena Burceanu

Under review · Apr 2026

Links: arXiv

Abstract: Continual learning (CL) studies how models acquire tasks sequentially while retaining previously learned knowledge. Despite …

Academic Generalization & Interpretability

Temporal Taskification in Streaming Continual Learning: A Source of Evaluation Instability

Nicolae Filat, Ahmed Hussain, Konstantinos Kalogiannis, Elena Burceanu

Under review · Apr 2026

Links: arXiv

Abstract: Streaming Continual Learning (CL) typically converts a continuous stream into a sequence of discrete tasks through temporal …

Academic Natural Language Processing

JumpLoRA: Sparse Adapters for Continual Learning in Large Language Models

Alexandra Dragomir, Ioana Pintilie, Antonio Barbalau, Marius Dragoi, Florin Brad, Cristian Paduraru, Alexandru Tifrea, Elena Burceanu, Radu Ionescu

Under review · Apr 2026

Links: arXiv

Abstract: Adapter-based methods have become a cost-effective approach to continual learning (CL) for Large Language Models (LLMs), by …

Academic Generalization & Interpretability

Bridging Explainability and Embeddings: BEE Aware of Spuriousness

Cristian Daniel Paduraru, Antonio Barbalau, Radu Filipescu, Andrei Liviu Nicolicioiu, Elena Burceanu

Accepted at ICLR 2026 (poster) · Apr 2026

Links: OpenReview · GitHub

Abstract: Current methods for detecting spurious correlations rely on data splits or error patterns, leaving many harmful …

Academic Deepfake Detection

Investigating self-supervised representations for audio-visual deepfake detection

Dragos-Alexandru Boldisor, Stefan Smeu, Dan Oneata, Elisabeta Oneata

Accepted at CVPR 2026 · Feb 2026

Links: arXiv · GitHub

Abstract: Self-supervised representations excel at many vision and speech tasks, but their potential for audio-visual deepfake …

Academic Natural Language Processing

CLewR: Curriculum Learning with Restarts for Machine Translation Preference Learning

Alexandra Dragomir, Florin Brad, Radu Tudor Ionescu

Accepted at ACL 2026 · Jan 2026

Links: arXiv · GitHub

Abstract: Large language models (LLMs) have demonstrated competitive performance in zero-shot multilingual machine translation …

Academic Natural Language Processing

C-ing Clearly: Enhanced Binary Code Explanations using C code

Teodor Poncu, Ioana Pintilie, Marius Dragoi, Dragos Tantaru, Florin Brad

Under review · Dec 2025

Links: arXiv

Abstract: Large Language Models (LLMs) typically excel at coding tasks involving high-level programming languages, as opposed to …

Academic Natural Language Processing

Beyond Pass@k: Breadth-Depth Metrics for Reasoning Boundaries

Marius Dragoi, Ioana Pintilie, Florin Gogianu, Florin Brad

Accepted at NeurIPS 2025 Workshop FoRLM (Poster) · Dec 2025

Links: arXiv

Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a powerful paradigm to improve Large Language Models on …

Academic Natural Language Processing

ChronoGraph: A Real-World Graph-Based Multivariate Time Series Dataset

Adrian Catalin Lutu, Ioana Pintilie, Elena Burceanu, Andrei Manolache

Accepted at NeurIPS 2025 Workshop BERT2S (Oral) · Dec 2025

Links: arXiv · GitHub

Abstract: We present ChronoGraph, a graph-structured multivariate time series forecasting dataset built from real-world production …

Academic Generalization & Interpretability

Not All Splits Are Equal: Rethinking Attribute Generalization Across Unrelated Categories

Liviu Nicolae Fircă, Antonio Bărbălau, Dan Oneata, Elena Burceanu

Accepted at NeurIPS 2025 Workshop CauScien (Poster) · Dec 2025

Links: arXiv · GitHub

Abstract: Can models generalize attribute knowledge across semantically and perceptually dissimilar categories? While prior work …

Academic Natural Language Processing

Rethinking Sparse Autoencoders: Select-and-Project for Fairness and Control from Encoder Features Alone

Antonio Bărbălau, Cristian Daniel Păduraru, Teodor Poncu, Alexandru Tifrea, Elena Burceanu

Accepted at NeurIPS 2025 Workshops: Mechanistic Interpretability; Reliable ML from Unreliable Data (Poster) · Dec 2025

Links: arXiv

Abstract: Sparse Autoencoders (SAEs) have proven valuable due to their ability to provide interpretable and steerable representations. …

Academic Generalization & Interpretability

Learning (Approximately) Equivariant Networks via Constrained Optimization

Andrei Manolache, Luiz F.O. Chamon, Mathias Niepert

Accepted at NeurIPS 2025 (Oral, top 0.4%) · Dec 2025

Links: arXiv · GitHub

Abstract: Equivariant neural networks are designed to respect symmetries through their architecture, boosting generalization and …

Academic Natural Language Processing

Learning the Neighborhood: Contrast-Free Multimodal Self-Supervised Molecular Graph Pretraining

Boshra Ariguib, Mathias Niepert, Andrei Manolache

Under review · Sep 2025

Links: arXiv · GitHub

Abstract: High-quality molecular representations are essential for property prediction and molecular design, yet large labeled …

Academic Deepfake Detection

Circumventing shortcuts in audio-visual deepfake detection datasets with unsupervised learning

Stefan Smeu, Dragos-Alexandru Boldisor, Dan Oneata, Elisabeta Oneata

Accepted at CVPR 2025 (Highlight, top 3%) · Jun 2025

Links: arXiv · GitHub

Abstract: Good datasets are essential for developing and benchmarking any machine learning system. Their importance is even more …

Academic Generalization & Interpretability

Do We Always Need the Simplicity Bias? Looking for Optimal Inductive Biases in the Wild

Damien Teney, Lianze Jiang, Florin Gogianu, Ehsan Abbasnejad

Accepted at CVPR 2025 (Oral, top 0.8%) · Jun 2025

Links: arXiv · CVF Open Access

Abstract: Common choices of architecture give neural networks a preference for fitting data with simple functions. This …

Academic Deepfake Detection

DeCLIP: Decoding CLIP representations for deepfake localization

Stefan Smeu, Elisabeta Oneata, Dan Oneata

Accepted at WACV 2025 (Oral) · Feb 2025

Links: arXiv · Proceedings · GitHub

Abstract: Generative models can create entirely new images, but they can also partially modify real images in ways that …

Academic Generalization & Interpretability

Robust Novelty Detection through Style-Conscious Feature Ranking

Stefan Smeu, Elena Burceanu, Emanuela Haller, Andrei Liviu Nicolicioiu

Accepted at WACV 2025 (Poster) · Feb 2025

Links: arXiv · Proceedings · GitHub

Abstract: Novelty detection seeks to identify samples deviating from a known distribution, yet data shifts in a …

Academic Generalization & Interpretability

ConceptDrift: Uncovering Biases through the Lens of Foundational Models

Cristian Daniel Paduraru, Antonio Barbalau, Radu Filipescu, Andrei Liviu Nicolicioiu, Elena Burceanu

Accepted at NeurIPS 2024 Workshop Interpretable AI: Past, Present and Future · Dec 2024

Links: arXiv

Abstract: Datasets and pre-trained models come with intrinsic biases. Most methods rely on spotting them by analysing misclassified …

Academic Natural Language Processing

MolMix: A Simple Yet Effective Baseline for Multimodal Molecular Representation Learning

A. Manolache, D. Tantaru, M. Niepert

Accepted at NeurIPS 2024 Workshop on Machine Learning for Structural Biology · Dec 2024

Links: arXiv · GitHub

Abstract: In this work, we propose a simple transformer-based baseline for multimodal molecular representation learning, …

Academic Generalization & Interpretability

WASP: A Weight-Space Approach to Detecting Learned Spuriousness

Cristian Daniel Paduraru, Antonio Barbalau, Radu Filipescu, Andrei Liviu Nicolicioiu, Elena Burceanu

Accepted at NeurIPS 2024 Workshop Interpretable AI: Past, Present and Future · Dec 2024

Links: arXiv · GitHub

Abstract: It is of crucial importance to train machine learning models such that they clearly understand what defines each class in …

Academic Generalization & Interpretability

Probabilistic Graph Rewiring via Virtual Nodes

C. Qian, A. Manolache, C. Morris, M. Niepert

Accepted at NeurIPS 2024 (Poster) · Dec 2024

Links: arXiv · GitHub

Abstract: Message-passing graph neural networks (MPNNs) have emerged as a powerful paradigm for graph-based machine learning. …

Academic Deepfake Detection

Towards generalisable and calibrated audio deepfake detection with self-supervised representations

Octavian Pascu, Adriana Stan, Dan Oneata, Elisabeta Oneata, Horia Cucu

Accepted at Interspeech 2024 · Sep 2024

Links: Proceedings PDF

Abstract: Generalisation, the ability of a model to perform well on unseen data, is crucial for building reliable deepfake …

Academic Deepfake Detection

Weakly-supervised deepfake localization in diffusion-generated images

Dragos-Constantin Tantaru, Elisabeta Oneata, Dan Oneata

Accepted at Winter Conference on Applications of Computer Vision (WACV) 2024 · Jan 2024

Links: arXiv · Proceedings · GitHub

Abstract: The remarkable generative capabilities of denoising diffusion models have raised new concerns regarding the …
