2.1 Background

Over the past two decades, advances in vehicle fleet data collection methodologies for research purposes have emerged primarily from Naturalistic Driving Studies (NDS) and Field Operational Tests (FOT). These studies were driven by two main factors: the need to better understand the causal factors behind incidents and accidents, and the progressive innovation in driver assistance systems, which leveraged cost-effective sensor, communication, and data server technologies.

It became essential to define and document best practices for carrying out these extensive trials, which led to the development of the FESTA Handbook in 2008. The handbook has since been updated several times, the latest being version 8 (https://www.connectedautomateddriving.eu/wp-content/uploads/2021/09/FESTA-Handbook-Version-8.pdf). It covers the full process of running field operational tests: from formulating research questions and preparing the test, to analysing the collected data to answer those research questions. Although extensive, FESTA cannot cover data management and data sharing in great detail. Therefore, in 2016, the first version of this Data Sharing Framework was released; it was later updated in the CARTRE project (applying the General Data Protection Regulation (GDPR)) and the ARCADE project (adding automated driving topics) (https://www.connectedautomateddriving.eu/wp-content/uploads/2021/09/Data-Sharing-Framework-v1.1-final.pdf).

By the mid-2010s, the focus of research had turned towards automated driving technology, fuelled by advances in machine learning and neural networks. The latter part of the decade saw considerable hype around a fast introduction of automated vehicles and services on public roads. The challenges, however, were many and, even though there has been significant progress, much remains to be done. The shift also brought new dimensions to data sharing. The domain now grapples with large datasets that include both developmental sensor data and driving behaviour data used to assess traffic changes. New regulations have also come into play, such as test permits mandating minimal data collection, alongside voluntary agreements between industry participants and public authorities, e.g., to share data for testing and validation purposes so that new and adapted vehicle functionalities can be introduced on the market faster (https://www.car-2-car.org/fileadmin/documents/General_Documents/C-ROADS_C2C_CC_C-ITS_and_CCAM_Data_paper_V1.0.pdf).

Data is vital for the development of automated vehicles. Initially, NDS and FOT data were used for building driver models and establishing a baseline for driver behaviour across various scenarios. Soon after, data suited for training machine learning (ML) models began to be shared, primarily for research purposes, leading to an increase in datasets that address different aspects of driving. Currently, projects are collecting data from vehicles operating on test tracks or in confined areas, or from tests on public roads limited by the conditions specified in the Operational Design Domain (ODD) of the system. The aim now is to extend the operational conditions for automated vehicles and initiate large-scale demonstrations or “living labs”. The data from these projects will be invaluable in validating systems and assessing their impact on traffic safety, efficiency, the environment, and society at large.

In parallel, Cooperative Intelligent Transport Systems (C-ITS) were developed and deployed on the market, with a major step being VW's inclusion of C2C and C2X capabilities in the ID family (https://www.nxp.com/company/blog/nxp-volkswagen-and-partners-continue-to-accelerate-the-v2x-rollout:BL-THE-V2X-ROLLOUT). These “DAY1” features are based on a set of open and documented standards, and all participating stakeholders have full access to the testing and validation of these specifications and of the C-ITS components serving them. The EU, together with the strong European automotive industry, road operators and member states, launched the CCAM partnership (https://www.ccam.eu/) in 2020 to address the needs, requirements, and strategies of automated driving. Starting in 2022, numerous projects under this programme (part of Horizon Europe), as well as many national projects, have been advancing research in automated driving, equipping vehicles and road infrastructure elements with functions that enable fully fledged testing on public roads.

Numerous challenges are being addressed, and data are central to overcoming them. The voluntary agreements between industry and public authorities mentioned above are widely supported and can cover the data elements and types needed to address some of the challenges of testing and validation. However, no single entity can tackle these challenges by itself: none possesses the comprehensive data required to develop and validate the functions, systems, and vehicles. This framework introduces best practices for data exchange to facilitate and increase research data sharing.