Journal articles

Privacy-preserving mimic models for clinical named entity recognition in French

Nesrine Bannour (1), Perceval Wajsbürt (2), Bastien Rance (3, 4), Xavier Tannier (2), Aurélie Névéol (1)
1 ILES - Information, Langue Ecrite et Signée
LISN - Laboratoire Interdisciplinaire des Sciences du Numérique, STL - Sciences et Technologies des Langues
4 HeKA - Health data- and model- driven Knowledge Acquisition
Inria de Paris, CRC (UMR_S_1138 / U1138) - Centre de Recherche des Cordeliers
Abstract : A vast amount of crucial patient information resides solely in unstructured clinical narrative notes. There has been growing interest in the clinical Named Entity Recognition (NER) task using deep learning models. Such approaches require sufficient annotated data. However, few annotated corpora are publicly available in the medical field because of the sensitive nature of clinical text. In this paper, we tackle this problem by building privacy-preserving, shareable models for French clinical Named Entity Recognition. We use the mimic learning approach to transfer knowledge from a teacher model trained on a private corpus to a student model. This student model can be shared publicly without giving any access to the original sensitive data. We evaluated three privacy-preserving models using three medical corpora and compared their performance to that of baseline models, such as dictionary-based models. A student model trained on silver annotations produced by the teacher model achieved an overall macro F-measure of 70.6%, compared to 85.7% for the original private teacher model. Our results show that these privacy-preserving mimic learning models offer a good compromise between performance and data privacy preservation.
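The mimic learning setup described in the abstract can be illustrated with a minimal sketch. It is a toy, not the authors' implementation. The taggers here are simple token-lookup models rather than deep learning models, the corpora and the labels (`SYMPTOM`, `DRUG`) are invented for illustration, and the macro F-measure is computed at token level as a simplifying assumption. All function names (`train_tagger`, `predict`, `macro_f1`) are hypothetical.

```python
from collections import Counter, defaultdict

def train_tagger(sentences):
    """Learn the most frequent label for each lower-cased token (toy model)."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for token, label in sentence:
            counts[token.lower()][label] += 1
    return {tok: c.most_common(1)[0][0] for tok, c in counts.items()}

def predict(model, tokens):
    """Tag tokens; tokens unknown to the model get the outside label 'O'."""
    return [model.get(tok.lower(), "O") for tok in tokens]

def macro_f1(gold, pred, labels):
    """Unweighted mean of the per-label F1 scores (token level)."""
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical private corpus with gold annotations; it is never shared.
private_corpus = [
    [("fever", "SYMPTOM"), ("treated", "O"), ("with", "O"), ("paracetamol", "DRUG")],
    [("cough", "SYMPTOM"), ("and", "O"), ("fever", "SYMPTOM")],
    [("aspirin", "DRUG"), ("given", "O")],
]
# Hypothetical public text without annotations.
public_text = [
    ["patient", "has", "fever", "and", "cough"],
    ["paracetamol", "given"],
]

# 1. Train the teacher on the private corpus.
teacher = train_tagger(private_corpus)
# 2. Let the teacher produce silver annotations for the public text.
silver_corpus = [list(zip(toks, predict(teacher, toks))) for toks in public_text]
# 3. Train the student on the silver annotations only.
student = train_tagger(silver_corpus)

# Compare both models on a small hypothetical test set.
test_tokens = ["fever", "cough", "aspirin", "paracetamol"]
test_gold = ["SYMPTOM", "SYMPTOM", "DRUG", "DRUG"]
labels = ["SYMPTOM", "DRUG"]
teacher_f1 = macro_f1(test_gold, predict(teacher, test_tokens), labels)
student_f1 = macro_f1(test_gold, predict(student, test_tokens), labels)
print(f"teacher macro F1: {teacher_f1:.3f}, student macro F1: {student_f1:.3f}")
```

In this toy run the student only knows the vocabulary that occurs in the public text, so it misses `aspirin` and scores below the teacher. This mirrors, in a very simplified way, the trade-off between performance and privacy that the abstract reports.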
Document type : Journal articles
Contributor : Xavier Tannier
Submitted on : Sunday, May 8, 2022 - 10:24:14 PM
Last modification on : Wednesday, June 8, 2022 - 12:50:08 PM


Files produced by the author(s)



Nesrine Bannour, Perceval Wajsbürt, Bastien Rance, Xavier Tannier, Aurélie Névéol. Privacy-preserving mimic models for clinical named entity recognition in French. Journal of Biomedical Informatics, Elsevier, 2022, 130, pp.104073. ⟨10.1016/j.jbi.2022.104073⟩. ⟨hal-03655039⟩


