Engaging with the broader Machine Learning Community.
We tackle fundamental unsupervised learning approaches that we consider to be the key to true unconstrained learning, which best simulates how humans discover the surrounding world. We aim to combine the unsupervised visual perception with supervised cognitive video representation. We want to build systems that understand our world by only watching videos. And we go further, teaching the system to describe, in natural language, the discovered elements.
Elena Burceanu, Iulia Duță, Ema Haller, Andrei Nicolicioiu, coordinated by Marius Leordeanu
Unsupervised learning of objects from video sequences
We address an essential problem in computer vision, that of unsupervised foreground object segmentation in video, where a main object of interest in a video sequence should be automatically separated from its background. Video object segmentation and object discovery are strongly related tasks, but we tackle the problem from a fully unsupervised perspective, building object representations from raw video sequences. An efficient solution to this task would enable large-scale video interpretation at a high semantic level in the absence of the costly manual labeling.
We are focused on generating foreground object soft masks based on automatic selection and learning from highly probable positive features. We show that such features can be selected efficiently by taking into consideration the spatio-temporal appearance and motion consistency of the object in the video sequence. We also emphasize the role of the contrasting properties between the foreground object and its background. Our work is also focused on theoretically proving the properties of our unsupervised learning method, which under some mild constraints is guaranteed to learn the correct classifier even in the unsupervised case.
- E. Haller, M. Leordeanu, Unsupervised Object Segmentation in Video by Efficient Selection of Highly Probable Positive Features, In The IEEE International Conference on Computer Vision (ICCV), 2017
- E. Burceanu, M. Leordeanu, A 3D Convolutional Approach to Spectral Object Segmentation in Space and Time, In The International Joint Conference on Artificial Intelligence (IJCAI), 2020
- E. Haller, M. Leordeanu, Spacetime Graph Optimization for Video Object Segmentation, arXiv, 2020
- E. Haller, A.M. Florea, M. Leordeanu, Iterative Knowledge Exchange Between Deep Learning and Space-Time Spectral Clustering for Unsupervised Segmentation in Videos, journal version under review, 2020
- E. Burceanu, M. Leordeanu, Learning a Fast 3D Convolutional Approach to Spectral Object Segmentation in Space and Time, journal version under review, 2021
Graph methods for video processing
Given a video, we want to understand the scene well and identify the key events in order to extract useful information. Our goal is to capture the complex interactions between multiple entities in a scene. We improve on classical convolutional models by proposing a graph model that has a strong, explicit bias towards modeling relationships while also being able to model long-range interactions. We also design a novel method, unsupervised at the object level, that discovers salient regions, and we show quantitatively that these regions correlate with object locations and help relational processing.
- A. Nicolicioiu, I. Duță, M. Leordeanu, Recurrent Space-time Graph Neural Networks, in the Conference on Neural Information Processing Systems (NeurIPS), 2019
- I. Duță, A. Nicolicioiu, M. Leordeanu, Dynamic Regions Graph Neural Networks for Spatio-Temporal Reasoning, Object Representations for Learning and Reasoning Workshop at NeurIPS 2020, 2020
Existing methods in multi-task graphs rely on expensive manual supervision. In contrast, our proposed solution, with consensus shift learning, relies only on pseudo-labels provided by expert models. In our graph, every node represents a task, and every edge learns to transform one input node into another. Once initialized, the graph learns by itself on virtually any novel target domain. An adaptive selection mechanism finds consensus among multiple paths reaching a given node and establishes the pseudo-ground truth at that node. Such pseudo-labels, given by ensemble pathways in the graph, are used during the next learning iteration when single edges distill this distributed knowledge.
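The consensus step can be illustrated with a minimal sketch. It assumes each path reaching a node yields a flat list of per-pixel values, and it uses a median-plus-tolerance rule as the selection mechanism; the function name `consensus_pseudo_label`, the `tolerance` parameter and the rule itself are hypothetical illustrations, not the method from the work described above.

```python
from statistics import median


def consensus_pseudo_label(path_predictions, tolerance=0.1):
    """Combine predictions from several graph paths that reach the same node.

    path_predictions: list of equally sized lists of floats, one per path.
    Returns one consensus value per position (the pseudo-label).
    The median/tolerance rule is an assumed stand-in for adaptive selection.
    """
    if not path_predictions:
        raise ValueError("need at least one path prediction")
    size = len(path_predictions[0])
    if any(len(p) != size for p in path_predictions):
        raise ValueError("all paths must predict the same number of values")

    consensus = []
    for i in range(size):
        values = [p[i] for p in path_predictions]
        center = median(values)
        # Keep only the paths that agree with the median at this position.
        agreeing = [v for v in values if abs(v - center) <= tolerance]
        # With an even number of paths the median may match no path exactly,
        # so fall back to the median itself when nothing is within tolerance.
        consensus.append(sum(agreeing) / len(agreeing) if agreeing else center)
    return consensus


# Two paths agree, one is an outlier; the outlier is ignored at each position.
labels = consensus_pseudo_label(
    [[0.50, 1.00], [0.52, 1.02], [0.90, 0.20]], tolerance=0.1
)
```

In this toy setting the returned values would serve as the targets that single edges are trained on in the next iteration.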
Video captioning is the task of describing a video in natural language. It lies at the intersection of computer vision, natural language processing and machine learning, requiring both high-level visual comprehension and the ability to produce meaningful sentences. Our goal is to detect objects and events in a video and to understand the interactions between them in the spatial and temporal dimensions.
We investigated multiple methods to analyze a video and extract information from it. To overcome limitations of the task and the available data, we designed multiple models to explore different video encoding strategies and intermediate video-language representations, and to investigate the gains brought by additional tasks and features. We propose a method for video captioning that selects from the results of multiple encoder-decoder models. Our selection method, based on consensus among multiple sentences, is more likely to produce results with the same meaning as the video. We designed a method that surpassed the state-of-the-art results on the challenging MSR-VTT dataset.
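As a rough illustration of selection by consensus, the sketch below picks, from several candidate captions, the one that is on average most similar to the others. The word-overlap (Jaccard) similarity and the names `jaccard` and `select_by_consensus` are assumptions made for this example; the actual similarity measure used in the work above may differ.

```python
def jaccard(a, b):
    """Word-level Jaccard similarity between two sentences (assumed metric)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def select_by_consensus(candidates):
    """Return the candidate caption that agrees most with all the others."""
    if not candidates:
        raise ValueError("need at least one candidate caption")
    if len(candidates) == 1:
        return candidates[0]

    def mean_similarity(i):
        # Average similarity of candidate i to every other candidate.
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(jaccard(candidates[i], c) for c in others) / len(others)

    best_index = max(range(len(candidates)), key=mean_similarity)
    return candidates[best_index]


# Captions as they might come from three different encoder-decoder models.
captions = [
    "a man is playing a guitar",
    "a man plays a guitar",
    "a dog runs in a field",
]
chosen = select_by_consensus(captions)
```

A caption that several models roughly agree on is preferred over an isolated one, which is the intuition behind consensus-based selection.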
Unsupervised Object Tracking
Object tracking is one of the first and most fundamental problems that has been addressed in computer vision. While it has attracted the interest of many researchers over several decades of computer vision, it is far from being solved.
The task is hard for many reasons. Difficulties could come from severe changes in object appearance, presence of background clutter and occlusions that might take place in the video.
The only ground-truth knowledge given to the tracker is the bounding box of the object in the first frame. Thus, without knowing in advance the properties of the object being tracked, the tracking algorithm must learn them on the fly. It must adapt correctly and make sure it does not jump toward other objects in the background. That is why the possibility of drifting to the background poses one of the main challenges in tracking.
- E. Burceanu, M. Leordeanu, Learning a Robust Society of Tracking Parts using Co-occurrence Constraints, in The European Conference on Computer Vision (ECCV), at Visual Object Tracking workshop, 2018
- E. Burceanu, M. Leordeanu, Learning a Robust Society of Tracking Parts, 2017
- E. Burceanu, SFTrack++: A Fast Learnable Spectral Segmentation Approach for Space-Time Consistent Tracking, NeurIPS - Pre-register workshop, 2020
Natural Language Processing
A large amount of today's data is stored in databases. Building AI tools that facilitate the access to knowledge requires processing of natural language and structured data. We focus on neural approaches for natural language interfaces to databases, in particular structure-aware and semi-supervised methods.
Florin Brad in collaboration with Traian Rebedea, Ionel Hosu, Radu Iacob
Natural Language Interface to Databases
Natural Language Interface to Databases (NLIDB) bridges the gap between technical and non-technical users by allowing the latter to query large amounts of structured data through the use of instructions written in natural language.
Despite long-standing research efforts, progress has been slow and widespread adoption has failed to pick up. Data-driven approaches have been hindered by the lack of large parallel corpora to train the models on, but recent datasets alleviate this problem. We seek to improve existing SEQ2SEQ models by leveraging syntax to guide the generation process and by using semi-supervised techniques to overcome the low parallel data regime.
- F. Brad, R. Iacob, I. Hosu, T. Rebedea, Dataset for a Neural Natural Language Interface for Databases (NNLIDB), in The Proceedings of the Eighth International Joint Conference on Natural Language Processing (IJCNLP, Volume 1: Long Papers), 2017
- I. Hosu, R. Iacob, F. Brad, S. Ruseti, T. Rebedea, Natural Language Interface for Databases Using a Dual-Encoder Model, in The Proceedings of the 27th International Conference on Computational Linguistics (COLING), 2018
- F. Brad, R. Iacob, I. Hosu, S. Ruseti, T. Rebedea, A Syntax-Guided Neural Model for Natural Language Interfaces to Databases, in The Proceedings of the IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), 2018
- R. Iacob, F. Brad, E.-S. Apostol, C.-O. Truica, T. Rebedea, Neural Approaches for Natural Language Interfaces to Databases: A Survey, in The Proceedings of the 29th International Conference on Computational Linguistics (COLING), 2020
Anomaly detection in Text
Recent deep methods for detecting anomalies in images learn better features of normality in an end-to-end self-supervised setting. These methods train a model to discriminate between different transformations applied to visual data and then use the output to compute an anomaly score. We use this approach for Anomaly Detection in text, by introducing a novel pretext task on text sequences. We learn our model end-to-end, enforcing two independent and complementary self-supervision signals, one at the token-level and one at the sequence-level.
Visual Question Answering
The VQA task refers to answering questions about an image. Existing methods fuse image and text representations and are able to exploit superficial correlations and produce the correct answer, but often for the wrong reason. The recently introduced Visual Commonsense Reasoning dataset facilitates cognition-level understanding on top of recognition, by requiring not only the correct answer, but also a justification for it. We improve existing methods by leveraging structured representations of images.
Within the field of artificial intelligence, reinforcement learning seems a natural setting for training agents that interact with the world we are living in. We engage in furthering the field by developing agents able to learn continuously in different environments. We are also investigating models sitting at the intersection of generative models and reinforcement learning.
Florin Gogianu, collaborating with Tudor Berariu
Learning Representations for Deep Reinforcement Learning
We are interested in furthering the research on learning good representations for Deep Reinforcement Learning towards the goals of more stable optimisation, improved generalization and better sample complexity.
Recent advances in machine learning are still limited to stationary tasks, but general purpose intelligence would require agents able to acquire knowledge in a continual manner dealing with interleaving tasks. Lifelong learning scenarios deal exactly with this problem: training agents on new tasks while preserving performance on old ones.
We are currently exploring memory based, optimization related and architectural methods to train neural models in lifelong learning scenarios.
Malmo AI Challenge
The Microsoft Malmo AI Challenge proposed a time-extended stag hunt scenario built on top of the well-known Minecraft platform. In order to maximize its payoff in such a game, an agent needs to predict the other player's level of commitment to a collaborative strategy, decide on a specific plan and navigate through the environment to execute it.
Multi-agent setups pose additional optimization problems stemming from the non-stationarity of the training objective. Also, situated environments ask agents to learn dynamic strategies capable of dealing with sudden changes in the course of action (such as another player abandoning the collaborative strategy).
We approached the contest by training agents through deep reinforcement learning techniques using recurrent neural networks for policies and for value estimation. We also added auxiliary loss functions (such as next reward, or next frame prediction) in order to complement the sparse reward signal.
Our submission ranked second for the AI Summer School placement prize and third for the Microsoft Azure for Research Grant prize.
Lattice-based cryptography is a great promise for post-quantum cryptography. We build advanced primitives whose security relies on the hardness of lattice problems.
Miruna Rosca, Radu Țițiu, Mădălina Bolboceanu
Hardness of lattice problems
Lattice-based cryptography is a great promise for post-quantum cryptography. It aims at grounding the security of cryptographic primitives in the conjectured hardness of well-identified and well-studied algorithmic problems involving Euclidean lattices. In order to build post-quantum cryptographic primitives based on lattices, we actually make use of some intermediate, more versatile problems, Learning With Errors (LWE) and Short Integer Solution (SIS), which are provably at least as hard as classical lattice problems.
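To make the LWE problem concrete, here is a toy sketch of how LWE samples are formed: each sample is a random vector a together with b = <a, s> + e mod q, where s is the secret and e is a small error. The parameters, the bounded uniform noise and the helper names `lwe_samples` and `centered` are simplifications chosen for illustration; real schemes use much larger dimensions and carefully chosen error distributions, and this toy instance offers no security.

```python
import random


def lwe_samples(secret, q, num_samples, noise_bound, rng):
    """Generate toy LWE samples (a, b) with b = <a, secret> + e mod q.

    The error e is drawn uniformly from [-noise_bound, noise_bound],
    a simplification of the distributions used in practice.
    """
    n = len(secret)
    samples = []
    for _ in range(num_samples):
        a = [rng.randrange(q) for _ in range(n)]
        e = rng.randint(-noise_bound, noise_bound)
        b = (sum(ai * si for ai, si in zip(a, secret)) + e) % q
        samples.append((a, b))
    return samples


def centered(x, q):
    """Map x mod q to its representative closest to zero."""
    x %= q
    return x - q if x > q // 2 else x


rng = random.Random(0)   # fixed seed so the toy example is reproducible
secret = [3, 1, 4, 1]    # illustrative secret vector
q = 97                   # illustrative small modulus
samples = lwe_samples(secret, q, num_samples=20, noise_bound=2, rng=rng)

# Knowing the secret, the error can be recovered and is always small.
errors = [
    centered(b - sum(ai * si for ai, si in zip(a, secret)), q)
    for a, b in samples
]
```

Someone who knows the secret sees only small errors, while the search version of LWE asks to recover the secret from the samples alone.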
To obtain more efficient primitives, different structured variants of LWE and SIS have been introduced. We are interested in studying the hardness of all these problems and giving reductions between them.
- M. Bolboceanu, Z. Brakerski, R. Perlman, D. Sharma, Order-LWE and the Hardness of Ring-LWE with Entropic Secrets, in The Annual International Conference on the Theory and Application of Cryptology and Information Security (Asiacrypt), 2019
- M. Rosca, D. Stehlé and A. Wallet, On the Ring-LWE and Polynomial-LWE Problems, in the proceedings of International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT), pages 146-173, Springer, 2018
- M. Bolboceanu, Relating Different Polynomial-LWE problems, in 11th International Conference on Security for Information Technology and Communications (SECITC), 2018
- M. Rosca, A. Sakzad, D. Stehlé and R. Steinfeld, Middle-Product Learning with Errors, In the proceedings of International Cryptology Conference (CRYPTO), pages 283-297, Springer, 2017
Cryptographic primitives from (algebraic variants of) LWE
We build advanced cryptographic primitives whose security relies on the hardness of LWE and its algebraic variants. We are mainly interested in Homomorphic Encryption and Functional Encryption, which make it possible to run learning algorithms and perform statistical tests on sensitive encrypted data, such as the data held by a hospital or a bank.
- S. Bai, D. Das, R. Hiromasa, M. Rosca, A. Sakzad, D. Stehlé, R. Steinfeld, Z. Zhang, MPSign: A Signature from Small-Secret Middle-Product Learning with Errors, accepted at International Conference on Practice and Theory of Public Key Cryptography (PKC), 2020
- S. Agrawal, B. Libert, M. Maitra, R. Țițiu, Adaptive Simulation Security for Inner Product Functional Encryption, accepted at International Conference on Practice and Theory of Public Key Cryptography (PKC), 2020
- B. Libert, R. Țițiu, Multi-Client Functional Encryption for Linear Functions in the Standard Model from LWE, accepted at ASIACRYPT, 2019
- B. Libert, K. Nguyen, A. Passelègue, R. Țițiu, Simulation-Sound Proofs for LWE and Applications to KDM-CCA2 Security, accepted at ASIACRYPT, 2020
- R. Țițiu, B. Libert, D. Stehlé, Adaptively Secure Distributed PRFs from LWE, in The Theory of Cryptography Conference (TCC), invited to The Journal of Cryptology, 2018
My interest is in understanding videos in an unsupervised manner. I am currently working on object tracking for my PhD at the University of Bucharest and the Institute of Mathematics of the Romanian Academy. I have a strong background in Mathematics and Physics, and I completed my BSc in Computer Science and my MSc in Distributed Systems at University Politehnica of Bucharest.
I am a second year PhD student, co-supervised by Marius Leordeanu (Institute of Mathematics of the Romanian Academy) and Adina Magda Florea (University Politehnica of Bucharest). I have a BSc in Computer Science and a MSc in Artificial Intelligence, both from University Politehnica of Bucharest. My main focus is the problem of unsupervised learning and I am currently working on the task of unsupervised learning of objects from video sequences.
I have a BSc in Computer Science and a MSc in Artificial Intelligence at the University of Bucharest. My research focuses on Graph Neural Networks and their application in Computer Vision. I am particularly interested in developing techniques to create a more structured representation of the video scene. My current goals are to discover object-centric representations in video useful for relational modeling and to analyse the optimisation process of different graph neural networks.
I am interested in deep learning methods with a focus on relational representations and graph neural networks. I studied at University Politehnica of Bucharest, where I obtained a Bachelor's degree and a Master's degree in Artificial Intelligence. I researched topics in multilabel classification, video captioning, occluded regions segmentation, few-shot learning and relational processing of visual data.
I am interested in neural generative models for natural language processing, especially for code generation. In particular, I am interested in leveraging discrete structure (syntax trees) to improve the expressiveness of the latent space and to guide the generation process.
I got my bachelor's degree in Mathematics and Computer Sciences from the University of Bucharest and I'm currently enrolled in its Artificial Intelligence Master's degree program. I'm interested in unsupervised and self-supervised representation learning, especially using deep neural language models. At the moment I'm tackling the problem of out-of-distribution sample detection in natural language and more general unstructured data.
Currently pursuing a PhD in Reinforcement Learning under the supervision of Prof. Lucian Bușoniu, following an MSc in Artificial Intelligence from University Politehnica of Bucharest and a BSc in Philosophy. I have a broad interest in Reinforcement Learning topics and I am currently focusing on questions regarding sample efficiency in the context of model-free value-based methods with neural network estimators.
I am interested in post-quantum cryptography, with a focus on lattice-based solutions. I have a strong background in mathematics and I obtained my PhD in cryptography from École Normale Supérieure de Lyon, advised by Damien Stehlé. During my PhD, I worked on algebraic variants of Learning With Errors. More info about me on the website link below.
I obtained both my undergraduate and Master degrees in Mathematics from the University of Bucharest. My goal is to use my strong mathematical skills and experience in mathematical contests and olympiads to solve cryptographic challenges. I am interested in applications of lattices in cryptography, including lattice based homomorphic encryption schemes.
I am interested in lattice-based cryptography, which offers promising cryptographic schemes in the event that quantum computers are built. Moreover, lattice-based cryptography enables the construction of some advanced cryptographic primitives that allow computation on encrypted data, including Functional Encryption or Homomorphic Encryption. I obtained my PhD in Cryptography from École Normale Supérieure de Lyon, under the supervision of Benoît Libert. In my undergraduate and Master's programs I studied mathematics at the University of Bucharest.
I am an Associate Professor (Senior Lecturer) at the University Politehnica of Bucharest and senior researcher at the Institute of Mathematics of the Romanian Academy. I am interested in the nature of intelligence, life and consciousness. In particular, my research focuses on computer vision, machine learning and robotics. At the university I teach the graduate level computer vision and robotics classes. I have received a Ph.D. in Robotics from Carnegie Mellon University in 2009 and Bachelor degrees in Mathematics and Computer Science from the City University of New York, 2003.