2.4 FAIR

A prerequisite for data reuse, both within a project or organization and externally, is that the data can be found, read, and interpreted. A structured approach to data creation can facilitate reuse and save time later. One relevant set of principles, generally favoured by the international research community and promoted by the European Commission, is the FAIR principles. FAIR stands for Findable, Accessible, Interoperable, and Reusable. The aim of these principles is to make data and metadata (data about data) machine-actionable as well as human-readable. The principles are set out in Wilkinson et al. (https://doi.org/10.1038/sdata.2016.18).

These principles apply to three categories: data (or other digital objects), metadata, and infrastructure (such as a data repository or data space). They have implications for how (meta)data is created and described, as well as for how storage, search, and access infrastructures are set up.

Findability

Digital resources (data and metadata) should be easy to find for both humans and computers. Machine-readable metadata are essential to make data searchable and findable, and they allow for easy transfer of metadata between services. Persistent identifiers for data and metadata ensure that data can be cited and shared effectively and unambiguously.

Data creators can increase findability by identifying a data repository or data space early on and finding out about its data and metadata requirements, making sure that it also provides persistent identifiers and a catalogue service (internal or external to the repository or data space).
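As a minimal sketch of what machine-readable metadata with a persistent identifier might look like, the following Python example builds a small record and serialises it to JSON. The field names and helper functions are illustrative assumptions, and the DOI is a made-up placeholder; a real repository or data space will prescribe its own metadata schema.

```python
import json


def build_metadata_record(identifier, title, creators, keywords):
    """Return a minimal, machine-readable metadata record as a dict.

    The field names are hypothetical; in practice, follow the schema
    required by the chosen repository or data space.
    """
    return {
        "identifier": {"type": "DOI", "value": identifier},
        "title": title,
        "creators": list(creators),
        "keywords": list(keywords),
    }


def to_json(record):
    """Serialise the record so that it can be exchanged between services."""
    return json.dumps(record, indent=2, sort_keys=True)


def doi_url(record):
    """Build the resolvable https://doi.org/ link for the record's DOI."""
    return "https://doi.org/" + record["identifier"]["value"]


record = build_metadata_record(
    identifier="10.1234/example-dataset",  # placeholder, not a registered DOI
    title="Example measurement dataset",
    creators=["Example Project Team"],
    keywords=["example", "measurements"],
)

print(to_json(record))
print(doi_url(record))
```

Keeping the identifier in a dedicated, typed field is one way to let a catalogue service index and link the record without parsing free text.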

Accessibility

Access and authorisation need to be clearly defined for both users and machines. Infrastructures need to select open, free, and universally implementable standardized communication protocols that also allow for authentication and authorization procedures (if applicable). Metadata should remain accessible even if the data is no longer available, usually via a tombstone page (https://support.datacite.org/docs/tombstone-pages).

Data creators can facilitate accessibility by clarifying and describing the legal conditions for making data accessible, setting embargo periods (if necessary) after which data can be made available, and making sure that the selected repository/data space guarantees data longevity and availability.
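The interplay between embargo periods, withdrawn data, and always-available metadata can be sketched as follows. This is a simplified illustration under assumed names (`DatasetRecord`, `access_view`, and the three status strings are hypothetical), not a description of how any particular repository implements access control.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DatasetRecord:
    identifier: str
    title: str
    embargo_end: Optional[date] = None  # last day of the embargo, if any
    withdrawn: bool = False             # True once the data itself is removed


def access_view(record: DatasetRecord, today: date) -> dict:
    """Return what a user or machine would see for this record on a given day.

    The metadata is always returned, even when the data is embargoed or
    withdrawn; this mimics the role of a tombstone page.
    """
    metadata = {"identifier": record.identifier, "title": record.title}
    if record.withdrawn:
        status = "withdrawn"
    elif record.embargo_end is not None and today <= record.embargo_end:
        status = "embargoed"
    else:
        status = "available"
    return {"metadata": metadata, "data_status": status}


example = DatasetRecord(
    identifier="10.1234/example-dataset",  # placeholder, not a registered DOI
    title="Example measurement dataset",
    embargo_end=date(2025, 6, 30),
)
print(access_view(example, date(2025, 6, 30)))
print(access_view(example, date(2025, 7, 1)))
```

Separating the metadata from the data status in the response is the design choice that keeps the record discoverable and citable regardless of whether the data can currently be retrieved.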

Interoperability

Data often need to be integrated with other data or within larger workflows. Making use of open, formal, standardized language, vocabularies, formats, etc., when creating data and metadata facilitates this interoperability.

To increase interoperability, data creators can use commonly used data formats, ideally openly described, and openly available, commonly used vocabularies for data.
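As a rough sketch of this advice, the Python example below writes data to CSV (an openly described format) and only accepts variable names that appear in a controlled vocabulary. The vocabulary, its example.org URIs, and the column layout are hypothetical placeholders; in practice, a vocabulary commonly used in the relevant domain would take their place.

```python
import csv
import io

# Hypothetical controlled vocabulary mapping terms to URIs.
# example.org is a placeholder domain, not a real vocabulary service.
VOCABULARY = {
    "temperature": "https://example.org/vocab/temperature",
    "humidity": "https://example.org/vocab/humidity",
}


def unknown_terms(rows):
    """Return the sorted variable names that are not in the vocabulary."""
    return sorted({row["variable"] for row in rows if row["variable"] not in VOCABULARY})


def to_csv(rows):
    """Serialise rows to CSV, adding the vocabulary URI for each variable.

    Raises ValueError if any variable name is not in the vocabulary.
    """
    unknown = unknown_terms(rows)
    if unknown:
        raise ValueError("terms not in vocabulary: " + ", ".join(unknown))
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["variable", "variable_uri", "value", "unit"],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(
            {
                "variable": row["variable"],
                "variable_uri": VOCABULARY[row["variable"]],
                "value": row["value"],
                "unit": row["unit"],
            }
        )
    return buffer.getvalue()


rows = [
    {"variable": "temperature", "value": 21.5, "unit": "degC"},
    {"variable": "humidity", "value": 40, "unit": "%"},
]
print(to_csv(rows))
```

Rejecting unknown terms at write time, rather than cleaning them up later, keeps the exported file consistent with the vocabulary that other systems are expected to understand.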

Reusability

Rich metadata descriptions are key to reusability. This means good documentation and making data accessible via an infrastructure that provides (and requires) rich metadata. Data creators can enhance reusability by keeping well-documented data provenance, as well as selecting and using appropriate and openly described metadata standards. Data creators should also set a licence for the data when sharing it.
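One simple way to act on these points is to check, before sharing, that a metadata record carries the information a later user would need. The sketch below is only an illustration: the list of required fields and the function name are assumptions rather than part of the FAIR principles, and the metadata standard name is a placeholder. The licence is given as an SPDX-style identifier.

```python
# Fields assumed here to matter for reuse; adapt to the chosen metadata standard.
REUSE_FIELDS = ("description", "licence", "provenance", "metadata_standard")


def missing_reuse_fields(metadata):
    """Return the reuse-related fields that are absent or empty."""
    return [field for field in REUSE_FIELDS if not metadata.get(field)]


metadata = {
    "description": "Hourly measurements collected during an example project.",
    "licence": "CC-BY-4.0",
    "provenance": [
        "Raw values exported from the measuring instrument.",
        "Outliers removed using a documented script.",
    ],
    "metadata_standard": "example-standard",  # placeholder name
}

print(missing_reuse_fields(metadata))

incomplete = dict(metadata, licence="")
print(missing_reuse_fields(incomplete))
```

A check like this could run whenever a dataset is prepared for deposit, so that gaps in documentation are found while the people who know the answers are still available.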

The FAIR principles do not prescribe any specific technical implementation, nor are they a standard. Rather, they are a set of guidelines aimed at improving data reusability. Not all principles may apply or be possible to implement in all situations. These principles should be used to make data and the infrastructures supporting them as FAIR as possible given existing constraints and limitations. In practice, Science Europe’s data management plan guide (https://doi.org/10.5281/zenodo.4915862) is a useful tool for implementing FAIR at project level.

Note that FAIR-ness is not a binary state: data or metadata are not simply FAIR or not FAIR. Implementing FAIR is usually an incremental process in which (meta)data becomes increasingly FAIR as the FAIR principles are implemented. Not all FAIR principles and steps apply to all types of data. See Appendix I for questions related to FAIR implementation. Finally, FAIR does not imply openness: FAIR data can be restricted if necessary, and the maxim “As open as possible, as restricted as necessary” is applicable in all cases.