5.6 Data extraction

Data extraction is the process by which analysis results are extracted from a research infrastructure for (public) use in journals, project reports, or presentations. Depending on the publication venue, the underlying data may also have to be published alongside the paper so that the results can be reproduced. This is challenging with respect to both GDPR and IP-related matters, and every actor should be aware of it beforehand.

Data extraction must be carried out in line with any existing agreements (typically agreements with the Data owner(s) or Study participant(s)).

Published information shall be de-identified, and checked and cleared for IP-related matters, before publication. The Data Provider is responsible for defining the data extraction process, and the Data Consumer must be aware of the principles, rules, and routines specific to the dataset.

Both Data Supervisors should be aware of every data extraction request and the resulting decision, unless their responsibility has been delegated. A certificate of the decision could be attached to the extracted information. The actors are recommended to archive approved data extraction requests. A data extraction request could cover the following elements: 1) intended use of the extracted data; 2) list of data types; 3) description of the data; 4) total size of the data; and 5) list of files or folders to extract.
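
As an illustration only, the five request elements listed above could be captured in a small structured record so that requests are easy to check for completeness and to archive. The sketch below is not a prescribed format: the class name `DataExtractionRequest`, the field names, the use of bytes as the size unit, and the `missing_elements` helper are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataExtractionRequest:
    """Hypothetical record holding the five suggested request elements."""
    intended_use: str        # 1) intended use of the extracted data
    data_types: List[str]    # 2) list of data types
    description: str         # 3) description of the data
    total_size_bytes: int    # 4) total size of the data (bytes is an assumed unit)
    paths: List[str] = field(default_factory=list)  # 5) files or folders to extract

    def missing_elements(self) -> List[str]:
        """Return the names of elements that are empty or not filled in."""
        missing = []
        if not self.intended_use.strip():
            missing.append("intended_use")
        if not self.data_types:
            missing.append("data_types")
        if not self.description.strip():
            missing.append("description")
        if self.total_size_bytes <= 0:
            missing.append("total_size_bytes")
        if not self.paths:
            missing.append("paths")
        return missing


# Example values are invented for illustration.
request = DataExtractionRequest(
    intended_use="Figures for a journal article",
    data_types=["aggregated statistics"],
    description="Summary tables derived from the analysis",
    total_size_bytes=2_500_000,
    paths=["results/summary_tables/"],
)
print(request.missing_elements())  # prints [] because every element is filled in
```

A record of this kind could be stored together with the decision, which would support the recommendation to archive approved requests. Whether a request is approved remains a decision for the responsible actors; the check above only flags elements that were left empty.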