<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Blog on Bitdefender AI Research</title><link>https://bit-ml.github.io/blog/</link><description>Recent content in Blog on Bitdefender AI Research</description><generator>Hugo -- 0.146.0</generator><language>en-us</language><lastBuildDate>Wed, 25 Feb 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://bit-ml.github.io/blog/index.xml" rel="self" type="application/rss+xml"/><item><title>BEE Aware of Spuriousness: Mechanistic Interpretability for Fine Tuning Foundation Models</title><link>https://bit-ml.github.io/blog/bee-aware-of-spuriousness/</link><pubDate>Wed, 25 Feb 2026 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/blog/bee-aware-of-spuriousness/</guid><description>In our ICLR 2026 paper “Bridging Explainability and Embeddings: BEE Aware of Spuriousness”, we introduce BEE, a diagnostic tool that surfaces spurious correlations by analyzing weight space drift and embedding geometry rather than relying only on held out validation data.</description></item><item><title>Towards Fused Kernels for Gated MLP</title><link>https://bit-ml.github.io/blog/fused-swiglu-kernel/</link><pubDate>Wed, 05 Feb 2025 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/blog/fused-swiglu-kernel/</guid><description>&lt;div>&lt;div class="Post__PageWithCoverImg-oyq0rs-0 cSOdgJ">&lt;div class="Post-oyq0rs-1 fGODKX">&lt;section class="Post__PostContent-oyq0rs-2 jSdCWo">
&lt;p>The decoder block of a Transformer is the basic unit of all modern LLMs. Most of its compute is spent on self-attention and the MLP, with self-attention in particular being problematic on long sequences due to its quadratic compute and memory requirements. It is therefore not surprising that there has been a lot of progress towards improving the performance of self-attention, such as FlashAttention [&lt;a href="#fa">1&lt;/a>], or algorithms and models that approximate full attention, like Window Attention [&lt;a href="#wa">2&lt;/a>] or State-Space Models [&lt;a href="#mam">3&lt;/a>, &lt;a href="#lru">4&lt;/a>, &lt;a href="#mam2">5&lt;/a>]. While efficient kernels for MLPs do exist, from what we could find they seem to be either tailored to very specific setups or to address only some of the issues of MLPs, such as fusing the gating operation.&lt;/p></description></item><item><title>Large Language Models for Malware Analysis</title><link>https://bit-ml.github.io/blog/large-language-models-for-malware-analysis/</link><pubDate>Fri, 12 Jan 2024 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/blog/large-language-models-for-malware-analysis/</guid><description>&lt;div>&lt;div class="Post__PageWithCoverImg-oyq0rs-0 cSOdgJ">&lt;div class="Post-oyq0rs-1 fGODKX">&lt;section class="Post__PostContent-oyq0rs-2 jSdCWo">
&lt;p>Large Language Models (LLMs) took the world by storm in 2023, revolutionizing the way people search for and generate text content. LLMs for code have also made inroads into helping people understand code or write it from requests in natural language. For instance, translating requests into &lt;a href="https://yale-lily.github.io/spider">SQL queries&lt;/a> has advanced rapidly since the advent of GPT-4.&lt;/p>
&lt;p>Most popular code LLMs focus on generating or understanding high-level programming languages such as Python, C++, and Java &lt;a href="https://arxiv.org/abs/2308.12950">[1]&lt;/a>, &lt;a href="https://arxiv.org/abs/2305.06161">[2]&lt;/a>, &lt;a href="https://arxiv.org/abs/2306.08568">[3]&lt;/a>, &lt;a href="https://arxiv.org/abs/2308.07124">[4]&lt;/a>. However, code LLMs tailored to malware analysis should also be adapted to better understand assembly code. Recent works explore this avenue &lt;a href="https://arxiv.org/abs/2312.09601">[5]&lt;/a>, &lt;a href="https://arxiv.org/abs/2310.16853">[6]&lt;/a>, &lt;a href="https://arxiv.org/abs/2311.13721">[7]&lt;/a>.&lt;/p></description></item><item><title>The BGV fully homomorphic encryption scheme</title><link>https://bit-ml.github.io/blog/bgv-fully-homomorphic-encryption-scheme-in-python/</link><pubDate>Thu, 29 Jun 2023 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/blog/bgv-fully-homomorphic-encryption-scheme-in-python/</guid><description>&lt;div>&lt;div class="Post__PageWithCoverImg-oyq0rs-0 cSOdgJ">&lt;div class="Post-oyq0rs-1 fGODKX">&lt;section class="Post__PostContent-oyq0rs-2 jSdCWo">
&lt;p>This is a sister blogpost to the &lt;a href="https://bit-ml.github.io/blog/post/homomorphic-encryption-toy-implementation-in-python/">previous one about a similar scheme (BFV)&lt;/a>, and it is part of a series covering fully homomorphic encryption techniques and applications.&lt;/p>
&lt;h2>Introduction&lt;/h2>
&lt;p>In this blogpost we will focus on encryption, decryption, relinearization, and the noise analysis of the &lt;a href="https://eprint.iacr.org/2011/277.pdf">Brakerski, Gentry, Vaikuntanathan (BGV)&lt;/a> fully homomorphic encryption scheme. A Python implementation of all of the above is also provided.&lt;/p>
&lt;p class="hint warn">DISCLAIMER: This toy implementation is not meant to be secure or optimized
for efficiency. It's for educational purposes only, and for us to better understand the inner workings.&lt;/p></description></item><item><title>🕹️ Pretrained Atari Agents</title><link>https://bit-ml.github.io/blog/pretrained-atari-agents/</link><pubDate>Thu, 05 May 2022 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/blog/pretrained-atari-agents/</guid><description>&lt;div>&lt;div class="Post__PageWithCoverImg-oyq0rs-0 cSOdgJ">&lt;div class="Post-oyq0rs-1 fGODKX">&lt;section class="Post__PostContent-oyq0rs-2 jSdCWo">
&lt;p>Releasing trained models in computer vision and natural language processing has been a major source of progress for research in these fields and a significant catalyst for the adoption of deep learning models in the industry. By comparison, RL agents pretrained on otherwise resource- and time-intensive benchmarks such as the Arcade Learning Environment are rather hard to come by.&lt;/p>
&lt;p>Today, our research group within Bitdefender is making available over 2️⃣5️⃣,0️⃣0️⃣0️⃣ agents trained on 60 games in the Arcade Learning Environment. We hope the diversity and the quality of these trained models will help spur new research in multi-task and imitation learning and contribute to the state of reproducibility in deep reinforcement learning.&lt;/p></description></item><item><title>Private Set Intersection from Homomorphic Encryption: A Python Implementation</title><link>https://bit-ml.github.io/blog/private-set-intersection-an-implementation-in-python/</link><pubDate>Fri, 21 May 2021 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/blog/private-set-intersection-an-implementation-in-python/</guid><description>&lt;div>&lt;div class="Post__PageWithCoverImg-oyq0rs-0 cSOdgJ">&lt;div class="Post-oyq0rs-1 fGODKX">&lt;section class="Post__PostContent-oyq0rs-2 jSdCWo">
&lt;p>Check out our &lt;strong>Private Set Intersection (PSI)&lt;/strong> implementation in Python &lt;a href="https://github.com/bit-ml/Private-Set-Intersection">here&lt;/a>!&lt;/p>
&lt;p>In this blog post, we will first motivate our interest in &lt;strong>PSI&lt;/strong> by providing a list of applications: password checkup, private contact discovery for WhatsApp or Signal, privately measuring ad efficiency, or DNA pattern matching. Secondly, we will show how to build a &lt;strong>PSI&lt;/strong> protocol using a homomorphic encryption (&lt;strong>HE&lt;/strong>) scheme. Thirdly, we will describe our Python implementation of a specific &lt;strong>PSI&lt;/strong> protocol.&lt;/p></description></item><item><title>Homomorphic Encryption: a Toy Implementation in Python</title><link>https://bit-ml.github.io/blog/homomorphic-encryption-toy-implementation-in-python/</link><pubDate>Mon, 16 Nov 2020 00:00:00 +0000</pubDate><guid>https://bit-ml.github.io/blog/homomorphic-encryption-toy-implementation-in-python/</guid><description>&lt;div>&lt;div class="Post__PageWithCoverImg-oyq0rs-0 cSOdgJ">&lt;div class="Post-oyq0rs-1 fGODKX">&lt;section class="Post__PostContent-oyq0rs-2 jSdCWo">
&lt;p>&lt;strong>Motivation:&lt;/strong>
We made this blog post as self-contained as possible, even though it was
initially conceived as a follow-up to &lt;a href="https://blog.openmined.org/build-an-homomorphic-encryption-scheme-from-scratch-with-python/#buildanhomomorphicencryptionscheme">this tutorial given by
OpenMined&lt;/a>.
The starting point of our Python implementation is &lt;a href="https://gist.github.com/youben11/f00bc95c5dde5e11218f14f7110ad289">this GitHub
gist&lt;/a>,
which follows the homomorphic encryption scheme from
&lt;a href="https://eprint.iacr.org/2012/144.pdf">[FV12]&lt;/a>. The motivation behind &lt;a href="https://github.com/bit-ml/he-scheme">our
implementation&lt;/a> was to understand
in detail the two techniques of
&lt;a href="https://eprint.iacr.org/2012/144.pdf">[FV12]&lt;/a> used for ciphertext
multiplication, namely &lt;em>relinearization&lt;/em> and &lt;em>modulus-switching&lt;/em>. This
essential operation of ciphertext multiplication was missing from the previous
implementation. We decided to share this understanding through a blog
post as well, since it may be of interest to anyone using the [FV12] scheme through the
&lt;a href="https://github.com/OpenMined/TenSEAL">TenSEAL&lt;/a> or
&lt;a href="https://github.com/Microsoft/SEAL">SEAL&lt;/a> libraries.&lt;/p></description></item></channel></rss>