JDSE

About

Thank you to all the speakers and attendees of this fifth edition of the JDSE. With 96 registrations and an average of 60 participants per presentation, it was a success, despite the particular context of this online version. Find and share all the talks on the Labex DigiCosme YouTube channel. See you at the next edition!

The fifth edition of the Paris-Saclay / IPP Junior Conference on Data Science and Engineering (JDSE) is aimed at first-year Ph.D. students, M2 students and third-year students at engineering schools of Paris-Saclay and Institut Polytechnique de Paris. It will offer 6 students the opportunity to present scientific work developed during their internships or the first year of their Ph.D. thesis, and to sharpen their critical thinking at a professional conference hosting prestigious invited speakers, academics and industry scientists. This edition should have been held in September but, due to the health situation, it is to be held online in early February.

The conference aims to bring together a broad audience and is an excellent way to discover research activities in Data Science and Engineering.

Speakers

Arnaud Doucet
Professor of Statistics, University of Oxford, United Kingdom & Research Scientist, DeepMind

Auto-Encoding Differentiable Particle Filtering

Abstract:

Particle Filters are a powerful and popular class of methods used to perform state and parameter inference in non-linear non-Gaussian state-space models. Combined with variational inference ideas, these techniques provide state-of-the-art variational auto-encoders for time series. However, the resampling steps used by particle filters yield a non-differentiable estimate of the likelihood function and high-variance gradient estimates of the Evidence Lower Bound. By leveraging Optimal Transport ideas, we introduce the first principled class of differentiable particle filters, providing a differentiable likelihood function estimate which can be used for end-to-end parameter learning. We establish a few convergence results and demonstrate the performance of differentiable particle filters on various applications. Joint work with Adrien Corenflos (Aalto University), James Thornton (Oxford University) & George Deligiannidis (Oxford University).

Bio:

Arnaud Doucet obtained his PhD from the University of Paris XI (Orsay) in 1997. He previously held academic positions at Cambridge University, Melbourne University, The Institute of Statistical Mathematics in Tokyo and the University of British Columbia where he was a Canada Research Chair in Stochastic Computation. Professor Arnaud Doucet’s research concerns numerical methods for the analysis of complex data sets. In particular, he has contributed to the development and study of sequential Monte Carlo and Markov chain Monte Carlo methods.

Patrick Pérez
Scientific Director of valeo.ai, Paris, France

Challenges of machine learning for autonomous driving

Abstract:

Assisted and autonomous vehicles are safety-critical systems that have to cope in real time with complex, hard-to-predict, dynamic environments. Training (and testing) the underlying models requires massive amounts of fully annotated driving data, which is not sustainable. Focusing on perception, several projects at valeo.ai toward training better models with less supervision will be presented.

Bio:

Patrick Pérez is Scientific Director of valeo.ai, an AI research lab focused on Valeo automotive applications, self-driving cars in particular. Before joining Valeo, he was a researcher at Technicolor (2009-2018), Inria (1993-2000, 2004-2009) and Microsoft Research Cambridge (2000-2004). His research interests include multimodal scene understanding and computational imaging.

Barbara Hammer
Professor for Machine Learning at the CITEC Cluster at Bielefeld University, Germany

Learning with reject option, drift, and interpretability

Abstract:

Neural networks have revolutionised domains such as computer vision or language processing, and learning technology is included in everyday consumer products. Yet, practical problems often render learning surprisingly difficult, in particular if some of the typical prerequisites of machine learning are violated. As an example, only little data might be available in the context of tasks such as model personalization. Learning might take place in non-stationary environments such that models face the stability-plasticity dilemma. In such cases, users might be tempted to use models for settings they are not intended for. Within the talk, I will address two challenges that occur in such settings: how to learn reliably given only a few examples, and how to learn incrementally in non-stationary environments where drift might occur. More specifically, I will address distance-based and prototype-based models for learning from few data and learning with drift, and I will argue for a vital property of such models, namely components of their inherent interpretability. Example applications will come from the domain of driver assistance and biomechanics.

Bio:

Barbara Hammer is a full Professor for Machine Learning at the CITEC Cluster at Bielefeld University, Germany. She received her Ph.D. in Computer Science in 1999 and her venia legendi (permission to teach) in 2003, both from the University of Osnabrueck, Germany, where she was head of an independent research group on the topic 'Learning with Neural Methods on Structured Data'. In 2004, she accepted an offer for a professorship at Clausthal University of Technology, Germany, before moving to Bielefeld in 2010. Barbara's research interests cover theory and algorithms in machine learning and neural networks and their application to technical systems and the life sciences, including explainability, learning with drift, nonlinear dimensionality reduction, recursive models, and learning with non-standard data. Barbara has been chairing the IEEE CIS Technical Committee on Data Mining and Big Data Analytics, the IEEE CIS Technical Committee on Neural Networks, and the IEEE CIS Distinguished Lecturer Committee. She has been elected as a member of the IEEE CIS Administrative Committee and the INNS Board. She is an associate editor of the IEEE Computational Intelligence Magazine, the IEEE TNNLS, and IEEE TPAMI. Currently, a large part of her work focuses on explainable machine learning for spatial-temporal data in her role as a PI of the ERC Synergy Grant Water-Futures.

Program

Monday 1 February 2021

9:30 - 10:00: Welcome message and opening talk - Bruno Defude

10:00 - 11:00: Arnaud Doucet (University of Oxford & DeepMind)

Auto-Encoding Differentiable Particle Filtering

11:00 - 12:00: Monday Session

11:00 - 11:20: Mandy Ibene - A multi-objective algorithm for interactive prediction of RNA complexes
11:20 - 11:40: Martin Cepeda - COVID Risk Mitigation
11:40 - 12:00: Manon Mottier - Deinterleaving and Clustering unknown RADAR pulses

14:00 - 15:00: Patrick Pérez (valeo.ai)

Challenges of machine learning for autonomous driving

Tuesday 2 February 2021

10:00 - 11:00: Barbara Hammer (Bielefeld University)

Learning with reject option, drift, and interpretability

11:00 - 12:00: Tuesday Session

11:00 - 11:20: Benoît Malezieux - Comparing Analysis and Synthesis in Deep Prior Learning for Inverse Problems Resolution
11:20 - 11:40: Anfu Tang - Global alignment for relation extraction in Microbiology
11:40 - 12:00: Alexis Pister - PK-Clustering: Integrating Prior Knowledge in Mixed-Initiative Social Network Clustering

Student posters

Juliana Damurie - Neural Networks for Fourier Ptychography Microscopy and Application to Malaria
Xuanlong Yu - Uncertainty of Optical Flow: Regression Calibration Insights

Call for submissions

We invite Master M2 and Ph.D. students from the Université Paris-Saclay and Institut Polytechnique de Paris to submit an extended abstract of up to 3 pages describing new or preliminary results of their scientific work. The 6 selected authors will be given the chance to present their papers in a 20-minute talk.

Master students are especially encouraged to submit posters even if they do not have substantial results at the time of submission.

Submissions must be in PDF and must adopt the style of the Springer Publications format for Lecture Notes in Computer Science (LNCS).

A sample submission can be found here. Your paper must be in English, be between 1 and 3 pages long, contain no more than one figure or table, and include the following sections: abstract, keywords, motivation, references (up to 10).
You can use either this LaTeX template or this Word template.

Submission

All papers must be submitted by email to Matthieu Labeau and Yohan Petetin.

List of topics

Topics of interest include, but are not limited to:

Data mining
Databases
Big Data analytics
Machine learning
Statistics
Semantic web
Scientific workflows
Distributed data and computing
Applications of data science (biomedical and biological data, physics, chemistry, smart cities, images, documents, audio, video, online advertising, ...)

The extended abstracts will be reviewed by the scientific program committee, including one junior PC member (PhD student or postdoc in data science). All the presentations must be in English. Electronic versions of the extended abstracts will be accessible on the conference web site. The book of abstracts will not be published and the extended abstracts will not constitute a formal publication.

Note: Only Master M2 and Ph.D. students from Université Paris-Saclay and Institut Polytechnique de Paris are invited to contribute.
