The Publication of the Villani Report on Artificial Intelligence: Towards Open Health Data?

Health data: a priority sector for Artificial Intelligence (AI)

Deputy Cédric Villani (of the ‘La République en Marche’ party) has just published a report on Artificial Intelligence (AI). It states that, in public health, the prospects offered by AI are ‘promising and should improve the quality of care to the benefit of the patient and reduce the costs incurred’. In the chapter on health, the deputy proposes to ‘create a platform that would aggregate relevant data for research and innovation in public health (including medico-administrative data, clinical data, and hospital-related data). As a trusted third-party, the State would be in charge of organizing access to this platform according to a set of criteria […]’, thereby making the access protocols of the Système National des Données de Santé (SNDS) [National System of Health Data] more flexible.

This report raises many questions, which are being addressed at the Estates General on Bioethics through public debates and an online public consultation. The main concern raised by our fellow citizens is whether data will truly remain confidential, and how the data might be used if patients turned out to be identifiable, especially by correlation and inference (from date of birth, place of birth, or healthcare setting).

Regarding the risk of re-identifying a patient by combining multiple data items, the legislator plans for the Commission Nationale de l’Informatique et des Libertés (CNIL), the French data protection authority, to carry out retrospective checks on data and on access to data, as part of the wider process of organizing access to health data. However, these checks will be difficult to implement due to a lack of means, according to Isabelle Falque-Pierrotin, president of the CNIL, as quoted by Anne Lécu, co-director of the biomedical ethics department at the Collège des Bernardins, in an article published in The Conversation[1] on 5th April.

How should we approach the risk of a loss of confidentiality? While it is not ethically acceptable to let actuarial criteria prevail over collective solidarity, we can also ask whether it is ethically acceptable to refuse to share individual health data when their analysis could prevent avoidable birth defects or save lives.

The fact remains that ‘between the protection of private life and health security, the question at hand is what each individual really wants’, states Anne Lécu, who also wonders if ‘at a societal scale, our ability to do could perhaps exceed our ability to think about what we are doing’.

As far as the Remera registry is concerned, the objective remains the same: to protect the confidentiality of data, which is to say to protect the most vulnerable. This objective requires that we take part in the national debate on the ethical uses of AI: as these new technologies are, by nature, rapidly evolving, it falls to us, as data holders, to think carefully about what we are doing.

Our data is precious, and it will be useful for running AI algorithms. Collecting data also has a cost. The good news is that producing health data has created jobs. Big online platforms have understood the importance of such data; it is therefore urgent to remind ourselves that health data must be valued, protected, and shared, and that its collection must be funded[2].

Finally, the demanding technological, legal, and financial constraints that come with open data make us, and our funding bodies, responsible: it is now essential to start thinking seriously about professionalizing data collectors and data managers.


[1] https://theconversation.com/debat-les-donnees-de-notre-sante-doivent-rester-confidentielles-92950

[2] The Institut National de la Santé et de la Recherche Médicale (Inserm) [French National Institute of Health and Medical Research] and the Conseil Régional Auvergne-Rhône-Alpes [Regional Council] are no longer funding the registry.