A PDF (and citable) version of this document is available via Zenodo. DOI: https://doi.org/10.5281/zenodo.1316938
SURF, as the national collaborative ICT organisation for the Dutch education and research environment, has joined the effort to support the implementation and application of the FAIR data principles in the Netherlands. The first product of this endeavor is a report on six case studies conducted by Melanie Imming. The interviewed institutions range from the support services of various universities, through research institutions, to the national health care institute.
The purpose of this report is to build and share expertise on the implementation of FAIR data policy in the Netherlands. The six use cases included in this report describe developments in FAIR data, and different approaches taken, within different domains. For SURF, it is important to gain a better picture of the best way to support researchers who want to make their data FAIR. – Melanie Imming. (2018, April 23). FAIR Data Advanced Use Cases: from principles to practice in the Netherlands (Version Final). Zenodo. http://doi.org/10.5281/zenodo.1250535
On 22nd May 2018 the report was officially launched, accompanied by a lovely workshop at the SURF venue in Utrecht.
This week, we are presenting at the International Digital Curation Conference 2018 in Barcelona.
This presentation can be downloaded from Zenodo.
The pre-print version of the practice paper accepted for the conference is available on OSF Preprints.
Title: From Passive to Active, From Generic to Focused: How Can an Institutional Data Archive Remain Relevant in a Rapidly Evolving Landscape?
Authors: Maria J. Cruz, Jasmin K. Böhmer, Egbert Gramsbergen, Marta Teperek, Madeleine de Smaele, Alastair Dunning
Abstract: Founded in 2008 as an initiative of the libraries of three of the four technical universities in the Netherlands, the 4TU.Centre for Research Data (4TU.Research Data) provides since 2010 a fully operational, cross-institutional, long-term archive that stores data from all subjects in applied sciences and engineering. Presently, over 90% of the data in the archive is geoscientific data coded in netCDF (Network Common Data Form) – a data format and data model that, although generic, is mostly used in climate, ocean and atmospheric sciences. In this practice paper, we explore the question of how 4TU.Research Data can stay relevant and forward-looking in a rapidly evolving research data management landscape. In particular, we describe the motivation behind this question and how we propose to address it.
Slides, including active links, for the presentation at Open Science Days 2017 in Berlin, hosted by the Max Planck Digital Library (MPDL) on 17th October 2017.
On a rainy Thursday a couple of weeks ago, 14th September 2017, the national Platform for eScience/Data Research (ePLAN) invited people to exchange the latest news about FAIR data at the eScience Centre in the Amsterdam Science Park. Close to 30 people from different Dutch universities, research support services, research institutions, and ventures answered the call. The recaps by Wilco Hazeleger (ePLAN), Barend Mons (GoFAIR), Peter Doorn (DANS) and Gareth O’Neill (Eurodoc) thus reached the ears of a quite diverse group of attendees.
For me this event was a good chance to refresh my knowledge about current FAIR processes here in the Netherlands, and to have my interpretation of the FAIR data principles confirmed or contradicted. After nearly half a year away from my own FAIR project at TU Delft Library, I hoped to get some inspiration from conversations with like-minded people on how to implement these principles in everyday research (support) life.
Before I briefly recap the discussion and consensus of the breakout session, I want to share some brain teasers I noted down from the key speakers’ insights:
Aspects of FAIRness by Barend Mons
∴ Much to my relief, Barend confirmed that FAIR is not measured in binary terms but is rather a spectrum.
∴ The TCP/IPv4 protocols are the current bottleneck in the hourglass design of the soon-to-be ‘Internet of FAIR’.
∴ Interoperability can never exist without a purpose. It should therefore be assessed as interoperability with something specific, not as interoperability on its own.
∴ The origin of FAIR emphasizes the machine-actionability of (meta)data.
∴ When talking about a FAIRness evaluation, declare the assessed matter as “re-useless” rather than calling it “unfair”.
∴ The goal of FAIR is R. However, technically I is the key thing of FAIR. “I without F+A makes no sense for R”.
∴ FAIR data can be achieved with FAIR metadata and closed data files.
∴ New perspective on data sharing: establish data visitation instead of data sharing, i.e. your workflow visits the data instead of you receiving data files that were sent to you. To me that is a thrilling shift of perspective: forget sending data files directly via whatever channel, and rather establish a platform where interested people are redirected to the landing page of the dataset. Don’t get me wrong, of course this is what we are doing with our archive already. But I also still hear researchers saying that they share their data via email on request.
∴ A new GoFAIR website is currently under construction and will be launched by the end of the year, with a complete makeover and more functionality as the future European platform for FAIR work. I am intrigued and will keep an eye out for its launch!
The I in FAIR by Peter Doorn
∴ DANS has 2.6 million pictures as top data category (65% of the archive). Therefore, interoperability of images needs to be tackled. Unfortunately, interoperability of images is hard to determine.
∴ Side note: 4TU.Centre for Research Data has nearly 6,500 datasets in netCDF format as its top data category (>90% of the archive). Perhaps this data format has more advantages in terms of interoperability? Want to know more about our current work with netCDF? Bookmark the category on this blog.
∴ Barend’s remark about the image interoperability threshold mentioned by Peter: the rich metadata of images makes the interoperability of pictures possible.
∴ The self-assessment tool for FAIR data created by DANS is also connected to the FAIR metrics group.
The Open Science Survey 2017 by Gareth O’Neill
∴ My conclusion from the open science survey presented by Gareth: the need to improve awareness of open science, open access, open data, open education etc., and of the already existing support services, will most likely never decrease.
∴ But who is responsible for increasing the awareness? The university board? The faculties? The research support staff from e.g. the library?
∴ ‘Research visibility’ seems to be the main driver for complying with open science.
∴ The final report and survey analysis will be published in the next 3-6 months. Keep an eye on the Eurodoc website.
A few bits from the group session
∴ What’s the incentive to re-use existing data (where the origin might be untrustworthy) vs. regenerating the data oneself?
∴ Is metadata sufficient for reusability or is there a need for linked data?
∴ Incentives for researchers to create FAIR data need to be improved as soon as possible.
∴ Better distinctions between “data stewards”, “data managers” and “data scientists”, and improved appreciation for researchers doing these jobs.
∴ Biggest nut to crack: what does FAIR data mean in terms of data quality? The dataset (metadata, documentation, and data files) could be perfectly FAIR, while the actual content of the data files is rubbish. My thoughts on this: first, establish a certified and trusted data archive or repository that enables FAIR datasets; secondly, gather a critical mass of FAIR research data; lastly, enable peer review of these datasets to get an actual evaluation of the data quality.
Current FAIR work in the Netherlands, September 2017
∴ The Dutch Techcentre for Life Sciences (DTLS) in Utrecht provides a lot of valuable information about FAIR in the life science context. In particular, DTLS is focussing on the semantic side of the FAIR data principles and how to implement them.
∴ Data Archiving and Networked Services (DANS) in Den Haag covers the work on these principles predominantly for the humanities and social sciences. One of their practical approaches is a FAIR data assessment tool that rates each FAIR facet.
∴ TU Delft Library and 4TU.Centre for Research Data are concentrating on FAIR data guidance for technological data. A first practical approach was the evaluation of Dutch data repositories and data archives to determine their maturity with respect to the FAIR data demands of funding bodies. The follow-up work investigates researchers’ sentiment towards the FAIR data principles in relation to their research subject.
∴ In reaction to the separate developments at research support services and research institutions regarding the FAIR data principles, the European Commission established an Expert Group on FAIR data to review these developments and to receive feedback. The report produced by this Expert Group will be delivered in the first quarter of 2018.
∴ The Conference of European Schools for Advanced Engineering Education and Research (CESAER) features a task force for Open Science, including a research data management group that also explores FAIR data.
Feedback, input or questions about this blog post? Feel free to comment.
This is an updated version of the slides given on the FAIR principles in Edinburgh at the International Digital Curation Conference in February 2017.
The presentation was given at Open Research Data Management: policies and tools, May 24-25, Milano, Università Statale.
It also included breakout groups looking at how FAIR is interpreted in different subject areas. The 4TU.ResearchData team will be following this up by looking at how discipline-specific guidance can be published.
Here you can find the pre-print, non-peer-reviewed version of the practice paper with the title ‘Are the FAIR Data Principles fair?’.
The corresponding Excel Spreadsheet with the evaluation overview of 37 data repositories, statistical analysis and graphical figures is available in our data archive under the name ‘Evaluation of data repositories based on the FAIR Principles for IDCC 2017 practice paper’.
Our very first attempt at reviewing a data repository, using the FAIR principles as a scoring matrix, resulted in the following overview of the 4TU.Centre for Research Data, called ‘FAIR Principles – review in Context of 4TU.ResearchData’.
The review in the context of 4TU.Research Data helps to understand how we approached the quantitative evaluation of these repositories. Additionally, we blogged about our interpretation of the FAIR principles and facets, to show the exact features the repositories were measured against.
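To make the idea of a scoring matrix more tangible, here is a minimal Python sketch of how a repository could be rated against the 15 facets of the FAIR principles and summarised per letter. The facet labels come from the published principles. The 0-2 rating scale, the function name `score_repository` and the example ratings are assumptions made purely for illustration; they are not the scale or the results used in our evaluation.

```python
from collections import defaultdict

# The 15 facets of the FAIR principles, grouped by their leading letter.
FACETS = [
    "F1", "F2", "F3", "F4",
    "A1", "A1.1", "A1.2", "A2",
    "I1", "I2", "I3",
    "R1", "R1.1", "R1.2", "R1.3",
]

# Hypothetical rating scale (an assumption, not the scale from the paper):
# 0 = facet not met, 1 = partly met, 2 = fully met.
MAX_SCORE = 2


def score_repository(ratings):
    """Return the share of attainable points per FAIR letter (0.0 to 1.0)."""
    missing = [facet for facet in FACETS if facet not in ratings]
    if missing:
        raise ValueError(f"unrated facets: {missing}")

    earned = defaultdict(int)
    possible = defaultdict(int)
    for facet in FACETS:
        rating = ratings[facet]
        if not 0 <= rating <= MAX_SCORE:
            raise ValueError(f"rating for {facet} must be 0..{MAX_SCORE}")
        letter = facet[0]  # "F", "A", "I" or "R"
        earned[letter] += rating
        possible[letter] += MAX_SCORE

    return {letter: earned[letter] / possible[letter] for letter in "FAIR"}


# Made-up ratings for an imaginary repository, for illustration only.
example_ratings = {facet: 2 for facet in FACETS}
example_ratings.update({"I2": 0, "I3": 1, "R1.2": 1})
example_scores = score_repository(example_ratings)
print(example_scores)
```

Reporting one value per letter, rather than a single pass or fail, keeps the outcome on a spectrum and shows at a glance where a repository falls short.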
The initial spark for this research project was lit by the European Commission and their updated demands on data management for the Horizon 2020 projects. There are two versions of the FAIR Principles available online: a short list of the principles and appropriate facets, and the extended and guided version. The Nature article by the contributors and authors of the FAIR principles recaps the rationale behind the principles and the experiences of implementing them.
‘Guidelines on FAIR Data Management in Horizon 2020’ by the European Commission.
The short version of the ‘FAIR DATA PRINCIPLES’.
The extended version of the ‘FAIR DATA PRINCIPLES’.
Read the Nature article ‘The FAIR Guiding Principles for scientific data management and stewardship’.