<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>News on Bitdefender AI Research</title><link>https://bit-ml.github.io/news/</link><description>Recent content in News on Bitdefender AI Research</description><generator>Hugo -- 0.146.0</generator><language>en-us</language><lastBuildDate>Fri, 17 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://bit-ml.github.io/news/index.xml" rel="self" type="application/rss+xml"/><item><title>JumpLoRA: Sparse Adapters for Continual Learning in Large Language Models</title><link>https://bit-ml.github.io/news/2026-04-17-jumplora/</link><pubDate>Fri, 17 Apr 2026 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2026-04-17-jumplora/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2604.16171">arXiv&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>Adapter-based methods have become a cost-effective approach to continual learning (CL) for Large Language Models (LLMs), by sequentially learning a low-rank update matrix for each task. To mitigate catastrophic forgetting, state-of-the-art approaches impose constraints on new adapters with respect to the previous ones, by targeting either subspace or coordinate-wise interference. In this paper, we propose JumpLoRA, a novel framework to adaptively induce sparsity in the Low-Rank Adaptation (LoRA) blocks through the use of JumpReLU gating. The method achieves dynamic parameter isolation, which helps prevent task interference. We demonstrate that our method is highly modular and compatible with LoRA-based CL approaches. Specifically, it significantly boosts the performance of IncLoRA and outperforms the leading state-of-the-art CL method, ELLA.&lt;/p></description></item><item><title>Bridging Explainability and Embeddings: BEE Aware of Spuriousness</title><link>https://bit-ml.github.io/news/2026-04-07-bee-aware-of-spuriousness/</link><pubDate>Tue, 07 Apr 2026 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2026-04-07-bee-aware-of-spuriousness/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://openreview.net/forum?id=9jYpHmI8ot">OpenReview&lt;/a> &lt;a href="https://github.com/bit-ml/bee">GitHub&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>Current methods for detecting spurious correlations rely on data splits or error patterns, leaving many harmful shortcuts invisible when counterexamples are absent. We introduce BEE (Bridging Explainability and Embeddings), a framework that shifts the focus from model predictions to the weight space and embedding geometry underlying decisions. By analyzing how fine-tuning perturbs pretrained representations, BEE uncovers spurious correlations that remain hidden from conventional evaluation pipelines. We use linear probing as a transparent diagnostic lens, revealing spurious features that not only persist after full fine-tuning but also transfer across diverse state-of-the-art models. Our experiments cover numerous datasets and domains: vision (Waterbirds, CelebA, ImageNet-1k), language (CivilComments, MIMIC-CXR medical notes), and multiple embedding families (CLIP, CLIP-DataComp.XL, mGTE, BLIP2, SigLIP2). BEE consistently exposes spurious correlations: from concepts that slash the ImageNet accuracy by up to 95%, to clinical shortcuts in MIMIC-CXR notes that induce dangerous false negatives. Together, these results position BEE as a general and principled tool for diagnosing spurious correlations in weight space, enabling principled dataset auditing and more trustworthy foundation models.&lt;/p></description></item><item><title>Deep Learning @ FMI 2026 edition</title><link>https://bit-ml.github.io/news/2026-02-23-deepfmi/</link><pubDate>Mon, 23 Feb 2026 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2026-02-23-deepfmi/</guid><description>&lt;p>The &lt;strong>Deep Learning&lt;/strong> course at the Faculty of Mathematics and Computer Science, University of Bucharest is running again in 2026. It is aimed at third-year students. 
The 2026 edition starts in the second semester of the 2025-2026 academic year on &lt;strong>23 February 2026&lt;/strong>, and covers basic deep learning methods, with labs and a final project.&lt;/p></description></item><item><title>Investigating self-supervised representations for audio-visual deepfake detection</title><link>https://bit-ml.github.io/news/2026-06-06-investigating-self-supervised-av-deepfake/</link><pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2026-06-06-investigating-self-supervised-av-deepfake/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2511.17181">arXiv&lt;/a> &lt;a href="https://bit-ml.github.io/ssr-dfd/">GitHub&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>Self-supervised representations excel at many vision and speech tasks, but their potential for audio-visual deepfake detection remains underexplored. Unlike prior work that uses these features in isolation or buried within complex architectures, we systematically evaluate them across modalities (audio, video, multimodal) and domains (lip movements, generic visual content). We assess three key dimensions: detection effectiveness, interpretability of encoded information, and cross-modal complementarity. We find that most self-supervised features capture deepfake-relevant information, and that this information is complementary. Moreover, models primarily attend to semantically meaningful regions rather than spurious artifacts. Yet none generalize reliably across datasets. This generalization failure likely stems from dataset characteristics, not from the features themselves latching onto superficial patterns. These results expose both the promise and fundamental challenges of self-supervised representations for deepfake detection: while they learn meaningful patterns, achieving robust cross-domain performance remains elusive.&lt;/p></description></item><item><title>CLewR: Curriculum Learning with Restarts for Machine Translation Preference Learning</title><link>https://bit-ml.github.io/news/2026-01-09-clewr-mt/</link><pubDate>Fri, 09 Jan 2026 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2026-01-09-clewr-mt/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2601.05858">arXiv&lt;/a> &lt;a href="https://github.com/alexandra-dragomir/CLewR">GitHub&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>Large language models (LLMs) have demonstrated competitive performance in zero-shot multilingual machine translation (MT). Some follow-up works further improved MT performance via preference optimization, but they leave a key aspect largely underexplored: the order in which data samples are given during training. We address this topic by integrating curriculum learning into various state-of-the-art preference optimization algorithms to boost MT performance. We introduce a novel curriculum learning strategy with restarts (CLewR), which reiterates the easy-to-hard curriculum multiple times during training to effectively mitigate the catastrophic forgetting of easy examples. We demonstrate consistent gains across several model families (Gemma2, Qwen2.5, Llama3.1) and preference optimization techniques.&lt;/p></description></item><item><title>Best NeurIPS reviewer recognition</title><link>https://bit-ml.github.io/news/2025-12-18-neurips-outstanding-reviewer/</link><pubDate>Thu, 18 Dec 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2025-12-18-neurips-outstanding-reviewer/</guid><description>&lt;p>A member of our team was recognized among the &lt;strong>best reviewers&lt;/strong> at &lt;strong>NeurIPS&lt;/strong>, reflecting sustained, high-quality peer review for the machine learning community.&lt;/p></description></item><item><title>C-ing Clearly: Enhanced Binary Code Explanations using C code</title><link>https://bit-ml.github.io/news/2025-12-16-c-ing-clearly/</link><pubDate>Tue, 16 Dec 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2025-12-16-c-ing-clearly/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2512.14500">arXiv&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>Large Language Models (LLMs) typically excel at coding tasks involving high-level programming languages, as opposed to lower-level programming languages, such as assembly. We propose a synthetic data generation method named C-ing Clearly, which leverages the corresponding C code to enhance an LLM&amp;rsquo;s understanding of assembly. By fine-tuning on data generated through our method, we demonstrate improved LLM performance for binary code summarization and vulnerability detection. Our approach demonstrates consistent gains across different LLM families and model sizes.&lt;/p></description></item><item><title>Beyond Pass@k: Breadth-Depth Metrics for Reasoning Boundaries</title><link>https://bit-ml.github.io/news/2025-12-06-beyond-pass-k/</link><pubDate>Sat, 06 Dec 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2025-12-06-beyond-pass-k/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2510.08325">arXiv&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a powerful paradigm to improve Large Language Models on reasoning tasks such as coding, math or logic. To assess the reasoning boundary (the fraction of problems a model can solve), researchers often report Pass@k at large sampling budgets. Recent results reveal a crossover phenomenon: while RLVR models outperform the base model at small k values, the base model usually outperforms them when sampling a very large number of completions. This has been interpreted as evidence that base models have a larger reasoning boundary. We argue that on tasks with discrete answer spaces, such as math with numeric outputs, Pass@k at large k reflects the increasing chance of success as the number of trials grows, rather than genuine reasoning, and can therefore be misleading. We propose Cover@tau, which measures the fraction of problems for which at least a tau proportion of completions are correct. Unlike Pass@k, Cover@tau captures reasoning under an explicit reliability threshold: models that rely on random guessing degrade rapidly as tau increases. We evaluate several RLVR models using Cover@tau-based metrics and illustrate how the relative rankings of popular algorithms change compared to Pass@1, offering a different perspective on reasoning boundaries.&lt;/p></description></item><item><title>ChronoGraph: A Real-World Graph-Based Multivariate Time Series Dataset</title><link>https://bit-ml.github.io/news/2025-12-06-chronograph/</link><pubDate>Sat, 06 Dec 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2025-12-06-chronograph/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2509.04449">arXiv&lt;/a> &lt;a href="https://github.com/bit-ml/ChronoGraph">GitHub&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>We present ChronoGraph, a graph-structured multivariate time series forecasting dataset built from real-world production microservices. Each node is a service that emits a multivariate stream of system-level performance metrics, capturing CPU, memory, and network usage patterns, while directed edges encode dependencies between services. The primary task is forecasting future values of these signals at the service level. In addition, ChronoGraph provides expert-annotated incident windows as anomaly labels, enabling evaluation of anomaly detection methods and assessment of forecast robustness during operational disruptions. Compared to existing benchmarks from industrial control systems or traffic and air-quality domains, ChronoGraph uniquely combines (i) multivariate time series, (ii) an explicit, machine-readable dependency graph, and (iii) anomaly labels aligned with real incidents. We report baseline results spanning forecasting models, pretrained time-series foundation models, and standard anomaly detectors. ChronoGraph offers a realistic benchmark for studying structure-aware forecasting and incident-aware evaluation in microservice systems.&lt;/p></description></item><item><title>Not All Splits Are Equal: Rethinking Attribute Generalization Across Unrelated Categories</title><link>https://bit-ml.github.io/news/2025-12-06-not-all-splits-equal/</link><pubDate>Sat, 06 Dec 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2025-12-06-not-all-splits-equal/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2509.06998">arXiv&lt;/a> &lt;a href="https://github.com/bit-ml/rethinking-attribute-generalization">GitHub&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>Can models generalize attribute knowledge across semantically and perceptually dissimilar categories? While prior work has addressed attribute prediction within narrow taxonomic or visually similar domains, it remains unclear whether current models can abstract attributes and apply them to conceptually distant categories. This work presents the first explicit evaluation of the robustness of the attribute prediction task under such conditions, testing whether models can correctly infer shared attributes between unrelated object types: e.g., identifying that the attribute &amp;ldquo;has four legs&amp;rdquo; is common to both &amp;ldquo;dogs&amp;rdquo; and &amp;ldquo;chairs&amp;rdquo;. To enable this evaluation, we introduce train-test split strategies that progressively reduce correlation between training and test sets, based on: LLM-driven semantic grouping, embedding similarity thresholding, embedding-based clustering, and supercategory-based partitioning using ground-truth labels. Results show a sharp drop in performance as the correlation between training and test categories decreases, indicating strong sensitivity to split design. Among the evaluated methods, clustering yields the most effective trade-off, reducing hidden correlations while preserving learnability. These findings offer new insights into the limitations of current representations and inform future benchmark construction for attribute reasoning.&lt;/p></description></item><item><title>Rethinking Sparse Autoencoders: Select-and-Project for Fairness and Control from Encoder Features Alone</title><link>https://bit-ml.github.io/news/2025-12-06-rethinking-sparse-autoencoders/</link><pubDate>Sat, 06 Dec 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2025-12-06-rethinking-sparse-autoencoders/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2509.10809">arXiv&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>Sparse Autoencoders (SAEs) have proven valuable due to their ability to provide interpretable and steerable representations. Current debiasing methods based on SAEs manipulate these sparse activations presuming that feature representations are housed within decoder weights. We challenge this fundamental assumption and introduce an encoder-focused alternative for representation debiasing, contributing three key findings: (i) we highlight an unconventional SAE feature selection strategy, (ii) we propose a novel SAE debiasing methodology that orthogonalizes input embeddings against encoder weights, and (iii) we establish a performance-preserving mechanism during debiasing through encoder weight interpolation. Our Selection and Projection framework, termed S&amp;amp;P TopK, surpasses conventional SAE usage in fairness metrics by a factor of up to 3.2 and advances state-of-the-art test-time VLM debiasing results by a factor of up to 1.8 while maintaining downstream performance.&lt;/p></description></item><item><title>Learning (Approximately) Equivariant Networks via Constrained Optimization</title><link>https://bit-ml.github.io/news/2025-12-02-learning_approximately_equivariant/</link><pubDate>Tue, 02 Dec 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2025-12-02-learning_approximately_equivariant/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2505.13631">arXiv&lt;/a> · &lt;a href="https://github.com/andreimano/ACE">GitHub&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>Equivariant neural networks are designed to respect symmetries through their architecture, boosting generalization and sample efficiency when those symmetries are present in the data distribution. Real-world data, however, often departs from perfect symmetry because of noise, structural variation, measurement bias, or other symmetry-breaking effects. Strictly equivariant models may struggle to fit the data, while unconstrained models lack a principled way to leverage partial symmetries. Even when the data is fully symmetric, enforcing equivariance can hurt training by limiting the model to a restricted region of the parameter space. Guided by homotopy principles, where an optimization problem is solved by gradually transforming a simpler problem into a complex one, we introduce Adaptive Constrained Equivariance (ACE), a constrained optimization approach that starts with a flexible, non-equivariant model and gradually reduces its deviation from equivariance. This gradual tightening smooths training early on and settles the model at a data-driven equilibrium, balancing between equivariance and non-equivariance. Across multiple architectures and tasks, our method consistently improves performance metrics, sample efficiency, and robustness to input perturbations compared with strictly equivariant models and heuristic equivariance relaxations.&lt;/p></description></item><item><title>Systems and Methods of Detecting Bias in AI Training Data</title><link>https://bit-ml.github.io/news/2025-11-04-btd-2414-c1-detecting-bias-ai-training-data/</link><pubDate>Tue, 04 Nov 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2025-11-04-btd-2414-c1-detecting-bias-ai-training-data/</guid><description>&lt;p>&lt;strong>Docket:&lt;/strong> BTD-2414-C1 · &lt;strong>US application:&lt;/strong> 19/378,395 · &lt;strong>PTO filed:&lt;/strong> Nov 4, 2025 · &lt;strong>Status:&lt;/strong> Filed · &lt;strong>Technologies:&lt;/strong> Machine learning&lt;/p>
&lt;p>Continuation patent application on detecting bias in AI training data.&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://patentcenter.uspto.gov/applications/19378395">Application folder (USPTO Patent Center)&lt;/a>&lt;/li>
&lt;/ul></description></item><item><title>Systems and Methods of Identifying Spurious Correlations in AI Training Data</title><link>https://bit-ml.github.io/news/2025-11-04-btd-2414-spurious-correlations-ai-data/</link><pubDate>Tue, 04 Nov 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2025-11-04-btd-2414-spurious-correlations-ai-data/</guid><description>&lt;p>&lt;strong>Docket:&lt;/strong> BTD-2414 · &lt;strong>US application:&lt;/strong> 19/378,391 · &lt;strong>PTO filed:&lt;/strong> Nov 4, 2025 · &lt;strong>Status:&lt;/strong> Filed · &lt;strong>Technologies:&lt;/strong> Machine learning&lt;/p>
&lt;p>Patent application on identifying spurious correlations in AI training data.&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://patentcenter.uspto.gov/applications/19378391">Application folder (USPTO Patent Center)&lt;/a>&lt;/li>
&lt;/ul></description></item><item><title>Systems and Methods of Training Deepfake Detectors</title><link>https://bit-ml.github.io/news/2025-09-30-btd-2511-training-deepfake-detectors/</link><pubDate>Tue, 30 Sep 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2025-09-30-btd-2511-training-deepfake-detectors/</guid><description>&lt;p>&lt;strong>Docket:&lt;/strong> BTD-2511 · &lt;strong>PTO filed:&lt;/strong> Sep 30, 2025 · &lt;strong>Status:&lt;/strong> Filed · &lt;strong>Technologies:&lt;/strong> Machine learning&lt;/p>
&lt;p>US patent application on training deepfake detectors.&lt;/p></description></item><item><title>Learning the Neighborhood: Contrast-Free Multimodal Self-Supervised Molecular Graph Pretraining</title><link>https://bit-ml.github.io/news/2025-09-26-learning-neighborhood-molecular/</link><pubDate>Fri, 26 Sep 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2025-09-26-learning-neighborhood-molecular/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2509.22468">arXiv&lt;/a> &lt;a href="https://github.com/ariguiba/C-FREE">GitHub&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>High-quality molecular representations are essential for property prediction and molecular design, yet large labeled datasets remain scarce. While self-supervised pretraining on molecular graphs has shown promise, many existing approaches depend either on hand-crafted augmentations or on complex generative objectives, and often rely solely on 2D topology, leaving valuable 3D structural information underutilized. To address this gap, we introduce C-FREE (Contrast-Free Representation learning on Ego-nets), a simple framework that integrates 2D graphs with ensembles of 3D conformers. C-FREE learns molecular representations by predicting subgraph embeddings from their complementary neighborhoods in the latent space, using fixed-radius ego-nets as modeling units across different conformers. This design allows us to integrate both geometric and topological information within a hybrid Graph Neural Network (GNN)-Transformer backbone, without negatives, positional encodings, or expensive pre-processing. Pretraining on the GEOM dataset, which provides rich 3D conformational diversity, C-FREE achieves state-of-the-art results on MoleculeNet, surpassing contrastive, generative, and other multimodal self-supervised methods.
Fine-tuning across datasets with diverse sizes and molecule types further demonstrates that pretraining transfers effectively to new chemical domains, highlighting the importance of 3D-informed molecular representations.&lt;/p></description></item><item><title>NeurIPS Associate Program Chair</title><link>https://bit-ml.github.io/news/2025-09-05-neurips-associate-pc/</link><pubDate>Fri, 05 Sep 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2025-09-05-neurips-associate-pc/</guid><description>&lt;p>A team member served as &lt;strong>NeurIPS Associate Program Chair&lt;/strong> (Associate PC), helping shape the conference’s review process and paper selection.&lt;/p></description></item><item><title>CVPR Daily interview</title><link>https://bit-ml.github.io/news/2025-06-20-cvpr-daily-interview/</link><pubDate>Fri, 20 Jun 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2025-06-20-cvpr-daily-interview/</guid><description>&lt;p>Our group was featured in a &lt;strong>CVPR Daily&lt;/strong> interview during &lt;strong>CVPR 2025&lt;/strong>. The interview can be found &lt;a href="https://www.rsipvision.com/CVPR2025-Saturday/14/">here&lt;/a>&lt;/p></description></item><item><title>Circumventing shortcuts in audio-visual deepfake detection datasets with unsupervised learning</title><link>https://bit-ml.github.io/news/2025-06-15-deepfake_cvpr_2025/</link><pubDate>Sun, 15 Jun 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2025-06-15-deepfake_cvpr_2025/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2412.00175">arXiv&lt;/a> · &lt;a href="https://github.com/bit-ml/AVH-Align/tree/main">GitHub&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>Good datasets are essential for developing and benchmarking any machine learning system. Their importance is even greater for safety-critical applications such as deepfake detection, the focus of this paper. Here we reveal that two of the most widely used audio-video deepfake datasets suffer from a previously unidentified spurious feature: the leading silence. Fake videos start with a very brief moment of silence and, on the basis of this feature alone, we can separate the real and fake samples almost perfectly. As such, previous audio-only and audio-video models exploit the presence of silence in the fake videos and consequently perform worse when the leading silence is removed. To avoid latching onto such an unwanted artifact, and possibly other unrevealed ones, we propose a shift from supervised to unsupervised learning by training models exclusively on real data. We show that by aligning self-supervised audio-video representations we remove the risk of relying on dataset-specific biases and improve robustness in deepfake detection.&lt;/p></description></item><item><title>Do We Always Need the Simplicity Bias? Looking for Optimal Inductive Biases in the Wild</title><link>https://bit-ml.github.io/news/2025-06-15-simplicity-bias-cvpr/</link><pubDate>Sun, 15 Jun 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2025-06-15-simplicity-bias-cvpr/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2503.10065">arXiv&lt;/a> · &lt;a href="https://openaccess.thecvf.com/content/CVPR2025/html/Teney_Do_We_Always_Need_the_Simplicity_Bias_Looking_for_Optimal_CVPR_2025_paper.html">CVF Open Access&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>Common choices of architecture give neural networks a preference for fitting data with simple functions. This simplicity bias is known to be key to their success. This paper explores the limits of this assumption. Building on recent work that showed that activation functions are the origin of the simplicity bias (Teney, 2024), we introduce a method to meta-learn activation functions to modulate this bias.&lt;/p>
&lt;p>&lt;strong>Findings.&lt;/strong> We discover multiple tasks where the assumption of simplicity is inadequate, and standard ReLU architectures are therefore suboptimal. In these cases, we find activation functions that perform better by inducing a prior of higher complexity. Interestingly, these cases correspond to domains where neural networks have historically struggled: tabular data, regression tasks, cases of shortcut learning, and algorithmic grokking tasks. In comparison, the simplicity bias proves adequate on image tasks, where learned activations are nearly identical to ReLUs and GeLUs.&lt;/p></description></item><item><title>DeCLIP: Decoding CLIP representations for deepfake localization</title><link>https://bit-ml.github.io/news/2025-02-28-declip-wacv/</link><pubDate>Fri, 28 Feb 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2025-02-28-declip-wacv/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2409.08849">arXiv&lt;/a> &lt;a href="https://openaccess.thecvf.com/content/WACV2025/papers/Smeu_DeCLIP_Decoding_CLIP_Representations_for_Deepfake_Localization_WACV_2025_paper.pdf">Proceedings&lt;/a> &lt;a href="https://github.com/bit-ml/DeCLIP">GitHub&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>Generative models can create entirely new images, but they can also partially modify real images in ways that are undetectable to the human eye. In this paper, we address the challenge of automatically detecting such local manipulations. One of the most pressing problems in deepfake detection remains the ability of models to generalize to different classes of generators. In the case of fully manipulated images, representations extracted from large self-supervised models (such as CLIP) provide a promising direction towards more robust detectors. Here, we introduce DeCLIP, a first attempt to leverage such large pretrained features for detecting local manipulations. We show that, when combined with a reasonably large convolutional decoder, pretrained self-supervised representations are able to perform localization and improve generalization capabilities over existing methods. Unlike previous work, our approach is able to perform localization on the challenging case of latent diffusion models, where the entire image is affected by the fingerprint of the generator. Moreover, we observe that this type of data, which combines local semantic information with a global fingerprint, provides more stable generalization than other categories of generative methods.&lt;/p></description></item><item><title>Robust Novelty Detection through Style-Conscious Feature Ranking</title><link>https://bit-ml.github.io/news/2025-02-28-robust-novelty-detection-wacv/</link><pubDate>Fri, 28 Feb 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2025-02-28-robust-novelty-detection-wacv/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2310.03738">arXiv&lt;/a> &lt;a href="https://openaccess.thecvf.com/content/WACV2025/papers/Smeu_Robust_Novelty_Detection_through_Style-Conscious_Feature_Ranking_WACV_2025_paper.pdf">Proceedings&lt;/a> &lt;a href="https://github.com/bit-ml/Stylist">GitHub&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>Novelty detection seeks to identify samples deviating from a known distribution, yet data shifts in a multitude of ways, and only a few consist of relevant changes. Aligned with out-of-distribution generalization literature, we advocate for a formal distinction between task-relevant semantic or content changes and irrelevant style changes. This distinction forms the basis for robust novelty detection, emphasizing the identification of semantic changes resilient to style distributional shifts. To this end, we introduce Stylist, a method that utilizes pretrained large-scale model representations to selectively discard environment-biased features. By computing per-feature scores based on feature distribution distances between environments, Stylist effectively eliminates features responsible for spurious correlations, enhancing novelty detection performance. Evaluations on adapted domain generalization datasets and a synthetic dataset demonstrate Stylist&amp;rsquo;s efficacy in improving novelty detection across diverse datasets with stylistic and content shifts.&lt;/p></description></item><item><title>ConceptDrift: Uncovering Biases through the Lens of Foundational Models</title><link>https://bit-ml.github.io/news/2024-12-15-conceptdrift/</link><pubDate>Sun, 15 Dec 2024 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2024-12-15-conceptdrift/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2410.18970v1">arXiv&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>Datasets and pre-trained models come with intrinsic biases. Most methods rely on spotting them by analysing misclassified samples, in a semi-automated human-computer validation. In contrast, we propose ConceptDrift, a method which analyzes the weights of a linear probe, learned on top of a foundational model. We capitalize on the weight update trajectory, which starts from the embedding of the textual representation of the class, and proceeds to drift towards embeddings that disclose hidden biases. Different from prior work, with this approach we can pinpoint unwanted correlations from a dataset, providing more than just possible explanations for the wrong predictions. We empirically prove the efficacy of our method by significantly improving zero-shot performance with bias-augmented prompting. Our method is not bounded to a single modality, and we experiment in this work with both image (Waterbirds, CelebA, Nico++) and text datasets (CivilComments).&lt;/p></description></item><item><title>MolMix: A Simple Yet Effective Baseline for Multimodal Molecular Representation Learning</title><link>https://bit-ml.github.io/news/2024-12-15-molmix/</link><pubDate>Sun, 15 Dec 2024 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2024-12-15-molmix/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2410.07981">arXiv&lt;/a> &lt;a href="https://github.com/andreimano/MolMix">GitHub&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>In this work, we propose a simple transformer-based baseline for multimodal molecular representation learning, integrating three distinct modalities: SMILES strings, 2D graph representations, and 3D conformers of molecules. A key aspect of our approach is the aggregation of 3D conformers, allowing the model to account for the fact that molecules can adopt multiple conformations, an important factor for accurate molecular representation. The tokens for each modality are extracted using modality-specific encoders: a transformer for SMILES strings, a message-passing neural network for 2D graphs, and an equivariant neural network for 3D conformers. The flexibility and modularity of this framework enable easy adaptation and replacement of these encoders, making the model highly versatile for different molecular tasks. The extracted tokens are then combined into a unified multimodal sequence, which is processed by a downstream transformer for prediction tasks. To efficiently scale our model for large multimodal datasets, we utilize Flash Attention 2 and bfloat16 precision. Despite its simplicity, our approach achieves state-of-the-art results across multiple datasets, demonstrating its effectiveness as a strong baseline for multimodal molecular representation learning.&lt;/p></description></item><item><title>WASP: A Weight-Space Approach to Detecting Learned Spuriousness</title><link>https://bit-ml.github.io/news/2024-12-15-wasp/</link><pubDate>Sun, 15 Dec 2024 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2024-12-15-wasp/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2410.18970v3">arXiv&lt;/a> &lt;a href="https://github.com/bit-ml/WASP">GitHub&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>It is of crucial importance to train machine learning models such that they clearly understand what defines each class in a given task. Though a body of work is dedicated to identifying the spurious correlations featured by a dataset that may impact a model&amp;rsquo;s understanding of the classes, all current approaches rely solely on data or error analysis. That is, they cannot point out spurious correlations learned by the model that are not already pointed out by counterexamples featured in the validation or training sets. We propose a method that transcends this limitation, switching the focus from analyzing a model&amp;rsquo;s predictions to analyzing the model&amp;rsquo;s weights, the mechanism behind its decisions, which proves to be more insightful. Our proposed Weight-space Approach to detecting Spuriousness (WASP) relies on analyzing the weights of foundation models as they drift towards capturing various (spurious) correlations while being fine-tuned on a given dataset. We demonstrate that, unlike previous works, our method (i) can expose spurious correlations featured by a dataset even when they are not exposed by training or validation counterexamples, (ii) works for multiple modalities such as image and text, and (iii) can uncover previously untapped spurious correlations learned by ImageNet-1k classifiers.&lt;/p></description></item><item><title>Probabilistic Graph Rewiring via Virtual Nodes</title><link>https://bit-ml.github.io/news/2024-12-10-probabilistic-graph-rewiring/</link><pubDate>Tue, 10 Dec 2024 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2024-12-10-probabilistic-graph-rewiring/</guid><description>&lt;p>&lt;strong>Links:&lt;/strong> &lt;a href="https://arxiv.org/abs/2405.17311">arXiv&lt;/a> &lt;a href="https://github.com/chendiqian/IPR-MPNN">GitHub&lt;/a>&lt;/p>
&lt;h3 id="abstract">Abstract&lt;/h3>
&lt;p>Message-passing graph neural networks (MPNNs) have emerged as a powerful paradigm for graph-based machine learning. Despite their effectiveness, MPNNs face challenges such as under-reaching and over-squashing, where limited receptive fields and structural bottlenecks hinder information flow in the graph. While graph transformers hold promise in addressing these issues, their scalability is limited by quadratic complexity in the number of nodes, rendering them impractical for larger graphs. Here, we propose implicitly rewired message-passing neural networks (IPR-MPNNs), a novel approach that integrates implicit probabilistic graph rewiring into MPNNs. By introducing a small number of virtual nodes, i.e., adding additional nodes to a given graph and connecting them to existing nodes, in a differentiable, end-to-end manner, IPR-MPNNs enable long-distance message propagation while circumventing quadratic complexity. Theoretically, we demonstrate that IPR-MPNNs surpass the expressiveness of traditional MPNNs. Empirically, we validate our approach by showcasing its ability to mitigate under-reaching and over-squashing effects, achieving state-of-the-art performance across multiple graph datasets. Notably, IPR-MPNNs outperform graph transformers while being significantly more computationally efficient.&lt;/p></description></item><item><title>Systems and Methods of Detecting Chatbots</title><link>https://bit-ml.github.io/news/2023-05-19-btd-2227-detecting-chatbots/</link><pubDate>Fri, 19 May 2023 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2023-05-19-btd-2227-detecting-chatbots/</guid><description>&lt;p>&lt;strong>Docket:&lt;/strong> BTD-2227 · &lt;strong>US publication:&lt;/strong> 2024/0386212 · &lt;strong>PTO filed:&lt;/strong> May 19, 2023 · &lt;strong>Status:&lt;/strong> Published · &lt;strong>Technologies:&lt;/strong> Machine learning&lt;/p>
&lt;p>Patent publication on detecting chatbots.&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://patents.google.com/patent/US20240386212/en">View on Google Patents&lt;/a>&lt;/li>
&lt;/ul></description></item><item><title>Computer Security Systems and Methods Using Self-Supervised Consensus-Building Machine Learning</title><link>https://bit-ml.github.io/news/2022-03-26-btd-2103-consensus-ml-security/</link><pubDate>Sat, 26 Mar 2022 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/news/2022-03-26-btd-2103-consensus-ml-security/</guid><description>&lt;p>&lt;strong>Docket:&lt;/strong> BTD-2103 · &lt;strong>US application:&lt;/strong> 17/656,644 · &lt;strong>PTO filed:&lt;/strong> Mar 26, 2022 · &lt;strong>Status:&lt;/strong> Published · &lt;strong>Technologies:&lt;/strong> Machine learning&lt;/p>
&lt;p>Patent application on computer security using self-supervised, consensus-building machine learning.&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://patentcenter.uspto.gov/applications/17656644">Application folder (USPTO Patent Center)&lt;/a>&lt;/li>
&lt;/ul></description></item></channel></rss>