8.2 Financial models

This chapter suggests financial models for data sharing, starting mainly from the point of view of the organisation that has collected the dataset. As the main funding for transport-related research today comes from direct governmental grants, this is also likely to be the case for the re-use of data. Future funding might be directed toward established data-sharing and e-infrastructure activities. In fact, the first two financial models in this chapter (A and B) are based on such activities.

Project-based funding is one of the current methods for financing data sharing. Models C–E consider the pros and cons of directly including data sharing and re-use in the project activities. In models F–H, the costs fall mainly on the end user (e.g., through membership fees or licences). Several funding sources might be required to keep data available and provide services for third parties. Therefore, the financial models can also be complementary.

A) Organisations’ core activity

Digital preservation becomes a part of organisations’ core activities. This model is motivated by conditions set by public grant agreements. A part of the grant for the original projects that collected the data will be directed toward central data preservation activity inside the organisation. This would cover data management and sharing for a certain period after the project is finished. The data availability for third parties should be based on reasonable conditions and costs.

A selection process may be required to decide which data will be stored, how they will be stored and for how long. The operation of a repository can also be outsourced. However, when a dataset containing personal data is stored by a third party, it needs to be strongly encrypted to avoid misuse and liability problems (see chapter 6).

Table 19: Model A (example: social sciences universities, possibly larger datasets)

Pros:
– Data would be considered IPR, valuable datasets would not be lost
– Dedicated professionals would enhance the quality of the data provision procedures and analysis tools

Cons:
– A burden for small organisations not prepared for such requirements
– No existing selection process for funding

B) e-Infrastructures

Public funding is directed to data infrastructures, serving multiple organisations and disciplines. Centralised data management could offer professional data management services, general harmonisation and possibly greater cost-effectiveness when compared to distributed approaches. The roles of public infrastructure would cover certain tasks, but project-specific funding would still be needed when data-users or data owners request additional services.

Table 20: Model B (example: Supercomputing infrastructures and their services to universities)

Pros:
– Professional data management services
– Data and processing services are free (i.e., for academic re-use)

Cons:
– The operators of the data infrastructure will have limited knowledge and means to provide detailed support for analysts, other than existing documentation
– Dataset confidentiality sets limitations for storage by third party services
– No selection process for funding exists
– Valuable datasets from smaller projects might not be considered

C) Archiving included in project budget

The project budget allows for dataset finalisation and archiving in commercial services. In this model, the budget covers final cleaning, documentation and fees for archiving in selected data storage services for a fixed period (e.g., 10 years). The project creates entries in relevant data catalogues.

Table 21: Model C (example: Research team storing its data – or making them open-source)

Pros:
– The commercial service could get part of their funding from advertising, even enabling free storage

Cons:
– Who answers questions regarding the dataset after a few years have passed?
– Is the documentation of the required quality?
– No existing selection process for funding

D) Project extension

The project is awarded a continuation to maintain its data. Model D is like model C, except that the dataset is archived by the project partners. For notable projects, separate grants for operation (including data storage, promotion, calls for analysis proposals, etc.) would be awarded based on a review board decision, under specific conditions.

Table 22: Model D (example: Large research projects apply for extensions)

Pros:
– Targeted promotion activities for datasets can also include funding for analysis activities and effective monitoring of results

Cons:
– No selection process for funding exists
– Valuable datasets from smaller projects might not be considered


E) New project funding

New projects finance maintenance or revival of a dataset. In a chain of projects, the benefit of using past data is obvious, encouraging efforts to be put into maintaining and exploiting the old dataset. Depending on the follow-up activity, the data owner might also be motivated to share data with third parties, to extend analyses for mutual or customer benefit (e.g., offering material for thesis work, benefiting the customer who originally funded the data collection).

If a data request from outside of the organisation meets the business interests of the data owner, it is welcome. Otherwise, it fails to motivate the efforts needed for data sharing.

Table 23: Model E (example: Various research projects analysing and benefiting from previous dataset)

Pros:
– No changes to current funding methods (additions are needed in call texts to promote existing datasets)
– When data is re-used by those who collected the data in a previous project, the re-use is very efficient

Cons:
– Plain project-based funding may not be sufficient to keep datasets available and it should be seen instead as a complementary funding source
– Project owners have difficulties estimating the costs required to access a dataset, when they are making an initial project plan/offer

F) Established network

A network of organisations with participation fees arranges data management jointly. Organisations within the same discipline form networks that share and promote data. Datasets are collected, documented and catalogued using agreed or standardised methods. The networks are likely to be formed for handling continuous operational data which meet their business interests. There could be various membership levels and fees.

Table 24: Model F (example: Accident data collection and sharing)

Pros:
– Business aspects can be applied to fees; high-quality harmonised data
– Could include freemium services, where the basic information is available for free but advanced services have a cost
– Facilitates cooperation in research

Cons:
– Only certain disciplines seem to reach this status

G) Analysis services

An organisation with several valuable datasets uses them to create business, offering both data and related services. This model can enable the original group that carried out a study to get further funding for their work. The model is for organisations with a prominent role in a discipline.

Table 25: Model G (example: Notable data owners / Data providers)

Pros:
– Continuous research is likely to produce high-quality results.

Cons:
– Small organisations and partnership projects have difficulties setting up this sort of business and their data easily get lost.
– Even valuable datasets become old and lose value for organisations purchasing analysis services.

H) Data integrators

Companies acquire and market datasets along with transport and other related datasets. In this model, a data integrator markets particularly useful datasets (such as those containing real-time traffic data, among others) licensed from original sources. Customers are offered a single access point for data, so they do not have to negotiate with several parties, which facilitates, for example, the development of mobile applications. For a dataset to be shared without fees for the re-users, the maintenance would have to be financed through the organisation that contributed the dataset or through supporting business operations.

Table 26: Model H (example: Road operators putting together information services)

Pros:
– Easy licensing of various high-quality information resources.

Cons:
– Data integrators may have little interest in non-commercial academic work.

I) Data space

A number of organisations create a common space for sharing CCAM-related data. The stakeholders agree upon common principles for data exchange, formats, description and taxonomy. The overriding principle is that within this data space, the control of data is in the hands of the data provider, and that access to data is conditioned by an agreement between the data provider and the data user. There is also a third category, “service provider”, which has access to data from a data provider and adds value to the data (or integrates multiple sources). Bilateral agreements are still necessary between all stakeholders, with the service provider acting as both user and provider. It is possible to handle monetary compensation for data access and exchange, but as of 2023 this is not yet implemented in any framework; therefore, it must be handled outside of the data space. Automated contracts could also facilitate access but are not yet available.

Table 27: Model I (example: using data space technologies for data exchange)

Pros:
– The data provider is in full control over who has access to which data.

Cons:
– The data user might not have insight into the raw data, which can lead to misinterpretation.