Category: Data Stewardship

Research Data Management within the 4TU Research Centres

The 4TU.Centre for Research Data announces its report on research data management within the 4TU Research Centres.

The Report

Over the last few months, the 4TU.Centre for Research Data had the chance to make contact and to speak with several of the Scientific Directors of the 4TU Research Centres about research data management. The report published today highlights the findings from these contacts and conversations.

A citable version of the report is available on OSF Preprints (DOI: 10.17605/OSF.IO/SGFTW).

Key Findings

1. Research data management is not addressed at a strategic level by the 4TU Research Centres, but left to individual research groups or to individual researchers connected to the Centres.

2. Within the 4TU Research Centres, there is a broad range of attitudes towards data and a broad range of data types and characteristics, including large datasets; commercially sensitive datasets; privacy and ethical concerns regarding data; software and its sustainability.

3. Software sustainability is an important and much-discussed topic, but there are currently no standards and no systematic approach for looking after software.

4. Research on human subjects and datasets including personally identifiable information or sensitive personal information are more prominent than might be expected in engineering and the technical sciences. Lack of transparency and reproducibility of scientific results can be an issue in these areas because the underlying datasets are often not available.

An Opportunity to Collaborate

Research data management is increasingly viewed as an important part of high-quality research. International and national funding bodies now mandate institutions and researchers to make data available. Data sharing is predicated on good research data management and has the potential to make scientific research more transparent, open, and efficient. In view of these principles and developments, the 4TU.Centre for Research Data wishes to maintain and deepen its links with the 4TU Research Centres and to support the Centres in various aspects of research data management.

We are hiring (again!) – Data Steward position at TU Delft

We have an exciting job opening for a Data Steward at TU Delft at the Faculty of Architecture & Built Environment and the Faculty of Industrial Design (joint appointment): https://www.academictransfer.com/employer/TUD/vacancy/45483/lang/en/

  • Closing date: 15 March 2018
  • Salary: up to € 4084/month
  • We are looking for individuals who are enthusiastic about data management and have a PhD degree in a relevant subject area (or equivalent experience).

This is a great chance to join the rapidly growing team of Data Stewards at TU Delft and to contribute to a cultural change in research data management in a disciplinary manner. The job is really about inspiring the research community and improving day-to-day practices, not about policy compliance.

All informal inquiries can be directed to me: M.Teperek@tudelft.nl

Do as you preach: results of 2017 data management survey now published

Author: Jasper van Dijck, Data Steward at the Faculty of Electrical Engineering, Mathematics and Computer Science


Data. We advise researchers on how to manage theirs, but we are not averse to gathering and sharing some of our own.

The problem

As data stewards at TU Delft we were asked how we are going to keep track of our progress. After some discussion amongst ourselves, we concluded that we could count the number of researchers we helped with their data management (plans) and we would love to measure the number of data sets shared by TU Delft researchers in the public domain. Presumably, an increase in the former would lead to an increase in the latter. That did not seem quite enough though, since there is a time difference between our usual first point of contact with a researcher, at the beginning of a project, and the archiving/sharing of research data, usually at the end. We would have to be quite patient in finding out if our ventures had paid off since most research projects usually last a few years. So we felt we also needed to know how researchers were currently thinking about research data management (RDM) since one of the focus points of being a data steward at TU Delft is creating awareness and facilitating a change in culture.

The solution

That is why we set up a survey. Nothing fancy, but a simple survey asking researchers a couple of questions on their (attitude towards) research data management. If you are Dutch, this would be our infamous “nulmeting.” This will give us a starting point in measuring the change in attitude and behaviour over time (yes, we are planning to re-do the survey regularly): it will give us insight into what effect our presence and actions have had.

The results

So, we would like to present to you the results of the TU Delft “Quantitative assessment of research data management practice 2017,” or RDM survey 2017 for short. This survey has been set up in cooperation with EPFL and the University of Cambridge. EPFL has already finished its survey and Cambridge is currently completing its own. Our goal is to compare the results across the institutions to see what we can learn from each other’s approach.

You can find a visualisation of the survey here: https://public.tableau.com/views/TUDelftQuantativeAssessmentofResearchDataManagementPractice2017/TUDelftRDMsurvey.

And yes, the anonymised(!) data is in the public domain. You can find it here: https://zenodo.org/record/1164398. We practise what we preach.

Feel free to explore the results of the survey in the visualisation or download the data yourself. We will learn a lot from it and we are looking forward to finding out what has changed in the next survey.

If you are a researcher at TU Delft and you are reading this: we are counting on you to fill out the RDM survey 2018, somewhere near the end of the year. Until that time, if you have any questions, please contact us at datastewards@tudelft.nl.

Invitation to collaborate

If you are interested in research data management and would like to run a similar survey at your institution, you are most welcome to join TU Delft, EPFL and the University of Cambridge in our efforts. The survey itself is available on the Open Science Framework: https://osf.io/mz3fx/wiki/home/

So, just drop us an email at datastewards@tudelft.nl.

TU Delft Strategic Framework 2018-2024: what does it mean for Open Science?

TU Delft published its new Strategic Framework 2018-2024 on 12 January, during the Open Science Symposium and its 176th birthday celebration.

The framework is entitled “Impact for a better society” and “openness” is listed as one of the four major guiding principles. The principle of openness was apparent already during the consultation phase of the framework: “more than 600 internal and external stakeholders have been actively participating” in the process.

The purpose of the strategic framework is “to serve as a high-level compass that will guide decision-making bodies at all levels within our university in the years ahead”. But what does the framework really mean for Open Science? In this blog post, I highlight the key quotations from the strategic framework which are likely to have the greatest impact on future Open Science developments at TU Delft.

Impact for a better society

First, Open Science fits neatly with the overall title of the framework, “Impact for a better society”. The framework states in the preface that “societal impact and academic excellence can be mutually reinforcing”. And this is indeed the case. Open Science means that research results can be accessed and re-used by everyone in society, including members of the public. TU Delft also wishes to increase its societal engagement by “promoting public participation in scientific research (‘citizen science’)”, which is deeply in line with the principles of Open Science.

Open Access publishing

Within Open Access publishing, TU Delft wishes to first develop a stronger awareness among its researchers. Second, the strategic framework also emphasises the need for a sustainable transition to Open Access publishing and it thus includes the commitment to “reducing costs for Open Access publishing by negotiating journal subscriptions with publishers.” At the same time, TU Delft will explore “new ways to present and disseminate knowledge”, which will not necessarily rely on publishing via the traditional scientific journals. Finally, researchers are encouraged “to serve on relevant Editorial Boards”, suggesting that TU Delft researchers take an active part in shaping publishers’ policies.

Research Data

The importance of good data management and sharing is also stipulated in the strategic framework. TU Delft wishes to stimulate the sharing of research data, and it realises that in order to achieve this, researchers need to be provided “with the necessary support, for example by appointing data stewards and data engineers within all faculties who advise researchers in managing their data.”

In addition, TU Delft will implement a “policy for research data, and enable researchers to control their own research data in accordance with this policy.” And, quite importantly, the strategic framework states that TU Delft wants to “involve researchers in contributing to TU Delft’s policy for research data management.”

Finally, the strategic framework recognises the importance of the new EU General Data Protection Regulation, and will “set up an integrity policy that protects scientific data and personal data in line with the EU directives.”

Open Source

Software is an integral part of research and is necessary for research reproducibility. It is therefore not surprising that the commitment to open source software has been stated in several locations in the strategic framework. First, TU Delft will develop “best practices for working with open source software, for example in relation to copyright and archiving of source code” and “facilitate a central place of support for researchers who want to use open source software.” Furthermore, TU Delft stresses the importance of communities in raising awareness and reinforcing good practice. It will, therefore, create “an open source software community with active ambassadors.”

Rewards for Open Science

The Strategic Framework is aiming at recognising the engagement with Open Science by changing the ways in which researchers are evaluated. TU Delft wants to include a more explicit recognition of “engagement with Open Science and Open Education” in yearly R&O evaluation cycles. To facilitate this, TU Delft supports “(inter) national initiatives aimed at finding alternative indicators that positively value open access publications” and is “collaborating with (inter)national leaders in the field of non-traditional metrics.”

Supporting researchers in their transition to Open Science

Importantly, TU Delft recognises that researchers need professional support if the objectives of the strategic framework are to be met. Therefore, it aims to “improve the quality of [its] professional services” and wants to provide researchers with clear ‘one-stop-shop’ contacts for requests, which should be “simple and effective”, “digital where possible, and personalised where needed”.

TU Delft also plans to appropriately recognise and reward those supporting researchers in their transition to Open Science. TU Delft will “take the lead in national initiatives aimed at extending the job classification for support staff with positions that support recent developments, such as data stewards that advise researchers in managing their (open) research data”.

Open Education

The strategy for Open Education is also mentioned widely in the framework. The one-page summary outlines TU Delft’s commitment to “promote and facilitate Open Education”, which is then followed by a declaration: “we wholeheartedly support Open Education and want to make Open Educational Resources part of our educational policy”. To achieve this, TU Delft will support lecturers and students in the use of open education resources and will encourage “lecturers to publish their educational material under an open license”.

Importantly, TU Delft also wishes to appropriately reward those engaged in Open Education activities. It wishes to strengthen a culture “in which education and teaching receive more appreciation and recognition” and “will refine [its] HR policy so that it will offer further scope for professional development and career opportunities within education”. In addition, as part of its educational policy, TU Delft wants to make “open education part of the basic teaching qualification programme and the evaluation criteria of courses.”

Last, the framework also states that TU Delft has the ambition to replace “commercial textbooks by open resources in all BSc programmes as much as possible.”

How important is the strategic framework?

So how important is the framework? Will the statements be really implemented?

To answer these questions I will conclude with the final quotation from the framework: “this framework is more than a formal requirement; it is our moral responsibility”.

Data sharing: is it all about trust?

On 30 January 2018, I attended the Research Data Alliance EU Data Innovation Forum in Brussels. The meeting was focused on innovation, but the common issue discussed in every session was trust: its absence sometimes prevented the implementation of innovative solutions, and sometimes was the main driver for innovation. Below you will find some of my key reflections on the topics discussed.

Lack of trust as a barrier to innovation

Business context

First, Marta Pont Guixa introduced the results of her research into the practices of business-to-business data sharing. Not surprisingly, her findings revealed that many businesses relied on access to data and on the exchange of data with other businesses for the development of new products and services and for efficiency gains. What was quite interesting to hear, however, was that sharing typically happened between businesses operating within the same sector, and that usually only a very small proportion of data was shared. Some companies complained about frequent denials of access requests.

Apart from technical challenges, the main obstacles preventing more widespread data sharing were awareness and trust. The absence of a legal framework and uncertainties about the practical meaning of the new EU General Data Protection Regulation (GDPR) meant that businesses were often unsure whether they were allowed to share data and under what conditions. In addition, the risk of competition led to a lack of trust over how the data would be re-used and for what purposes. During the coffee break, one of the attendees mentioned that effective sharing between businesses is often enabled by knowing the right people, and gave an example of sharing facilitated by someone who had known the company’s CEO since primary school. Could the presence of effective legal frameworks increase the efficiency of sharing?

Academic institutions

The discussion about sharing in a business context was extended to a wider discussion about trust: who do we trust? One attendee hypothesised that perhaps academic institutions are more trustworthy. However, while commercial competitiveness did indeed seem less fierce in academia than in business environments, the pressure to publish and to be the first to report academic findings has resulted in a research reproducibility crisis, which in turn has led to distrust in academic research.

In addition, one attendee mentioned that the lack of legal frameworks and adequate legal support for data sharing led to problems at academic institutions. Another reflected that overworked research support officers often preferred to be risk-averse rather than to help researchers decide on appropriate risk management strategies. This, in turn, resulted in datasets not being shared, or in researchers entering into agreements with third parties on their own, bypassing the institutional regulations. As a consequence of the latter, discussions about sharing become extremely convoluted and, with no legal agreements in place, the release of datasets becomes cumbersome. This, of course, not only raises further questions about the transparency and reproducibility of research but also prevents wider society from benefiting from research discoveries.

Innovation leading to distrust

In the second session of the conference, Andreas Rauber reflected on the popular analogy that data is the new oil and, consequently, the notion that Data Scientist is the “sexiest job of the 21st century”. However, Andreas also mentioned that with the vast amounts of data available, data analysis becomes more and more challenging. Decision makers are served the end results and pretty pictures, which often influence not only business decisions, but also policy changes. Do these decision makers see the algorithms used, do they bother to understand the input data and the inherent biases before they decide to include the results in reports and recommendations?

So how can we trust the data? How can we trust the individuals who make their recommendations and policy changes based on “data-driven” decisions? Who scrutinises the process and who takes the ownership and accountability?

Lack of trust as a driver for innovation?

The final session was about possible blockchain applications for data. Monique Morrow from Humanized Internet discussed cases where the use of blockchain technologies could provide solutions in extreme environments where lack of trust is an everyday reality. She described the plight of people in humanitarian crisis situations, where a basic human right, the right to one’s own identity, is denied. How can warzone refugees with no documentation prove their identity? How can they prove their education, degree certificates, and job qualifications? There are numerous hopeful examples of what could be achieved with the use of blockchain in situations of distrust.

It was also interesting to talk about the potential use of blockchain in scholarly communication and in the management of (research) data. Edwin Morley-Fletcher introduced the concept developed by the consortium My Health My Data, which aims to create a novel blockchain-based platform for handling medical data transactions. There have recently been numerous discussions about the potential of blockchain technologies for scholarly communication, including the extensive report “Blockchain for Research”, but it is not yet known whether blockchain will be the game changer. As Peter Wittenburg summarised, blockchain, like any novel technology, is not yet perfect, and there will certainly be issues which will have to be addressed.

And my final reflection: is the distrust in academia enough of a problem to create the need for blockchain-based solutions? Or, perhaps, the technology will develop such attractive offerings that trust issues will no longer be the main driver for adoption. I hope for the latter!

Views on Data Stewardship – report of preliminary findings at TPM faculty

Date: 29 January 2018
Author: Marta Teperek, Data Stewardship Coordinator, TU Delft Library


Executive summary

Qualitative interviews with nine researchers at the Faculty of Technology, Policy and Management (TPM) at TU Delft were undertaken in order to understand data management needs at the faculty in advance of appointing a dedicated Data Steward. The purpose was to aid the recruitment of the Data Steward and to define the skills and experience of an ideal candidate, as well as to help decide on the priority work areas for the Data Steward. The results of this research can also be used as a point-in-time reference to monitor changes in data management practice at the faculty.

The main data management challenges identified were: handling personal sensitive research data; working with big data; managing and sharing commercially confidential information; and software management issues. Despite the diversity of problems, some common issues were identified as well: the need to improve daily data management practice, and the need to revise workflows for students’ research data. With the exception of one researcher, who opposed the Data Stewardship project, all researchers expressed their support for the project and welcomed the idea of having a dedicated Data Steward at the faculty.

Additionally, several follow-up actions have already been undertaken as a result of these interviews:

  • the Data Stewardship Coordinator was invited to give two talks about Data Stewardship to two different groups of researchers;
  • a member of the Research Data Support team from the Library was asked to deliver a training course for students;
  • the Data Stewardship Coordinator was asked to discuss the best way of rolling out data management training for PhD students at TPM in coordination with the TPM Graduate School.

Given that the financial allocation for the Data Steward at TPM faculty is currently at 0.5 FTE for the first year and 1.0 FTE for the two subsequent years (until December 2020), it is recommended that the first year is spent on continuing and extending this research to better understand the needs of the faculty. It is suggested that at the same time, the Data Steward starts addressing the most urgent data management needs at TPM faculty, in particular, the development of a data management policy, as well as the development of solutions and recommendations for working with personal sensitive research data.

The two subsequent years could be devoted to developing resources and solutions for the remaining problems and for critical evaluation of the project and its effect on data management practice at the faculty. This approach should provide the faculty with enough resources and information to decide on the best strategy for Data Stewardship beyond December 2020.


Introduction

Data stewardship has been recognised internationally as a key foundation of future science. Carlos Moedas from the European Commission (EC) said that Open Science “is a move towards better science, to get more value out of our investment in science and to make research more reproducible and transparent. (…) Recent advances such as the discovery of the Higgs boson and gravitational waves, decoding of complex genetic schemas, climate change models, all required thousands of scientists to collaborate (…) on data. And that implies that research data are findable and accessible and that they are interoperable and reusable”. In support of this, the EC anticipated that about 5% of research expenditure should be spent on properly managing and stewarding data. Barend Mons, the Chair of the EC high-level expert group on the European Open Science Cloud, estimated that 500,000 Data Stewards will be needed in Europe to ensure effective research data management. Consequently, all NWO and H2020 projects starting from 2017 onwards must create a Data Management Plan and are required to make their data open. In addition, the European Open Science Cloud promises new tools, and related EC strategy papers suggest new rewards and grant funding schemes (such as FP9) to benefit those practising open science.

TU Delft’s College van Bestuur (CvB) made a strategic decision to be a frontrunner of this global move and a dedicated Data Stewardship programme was initiated. The long-term goal of this programme is to comprehensively address research data management needs across the whole campus in a disciplinary manner. To achieve this, subject-specific Data Stewards are to be appointed at every TU Delft faculty. Strategic funding from the CvB was allocated to support 0.5 FTE of a Data Steward per Faculty until December 2018, and 1.0 FTE of a Data Steward per Faculty for two years from January 2019 to December 2020. Subsequently, faculties are to decide how best to address their research data management needs.

In 2017 the first Data Stewards were appointed at three TU Delft faculties: Faculty of Electrical Engineering, Mathematics and Computer Science, Faculty of Civil Engineering and Geosciences and Faculty of Aerospace Engineering. At the beginning of 2018, Data Stewards are to be appointed at the five remaining faculties, including the Faculty of Technology, Policy and Management (TPM).

In order to inform the decision on the appointment of a Data Steward at TPM faculty, the Data Stewardship Coordinator set out to investigate the faculty’s research data management needs. Qualitative interviews were undertaken with TPM researchers in autumn 2017, which led to the identification of four main data management issues, specific to the types of research done, and revealed some common data problems for the faculty overall. The report below describes the key findings of this research and makes some recommendations for the future work of a Data Steward at TPM Faculty.

Methodology

Semi-structured qualitative interviews were conducted with four full professors, three associate professors and two assistant professors in September and October 2017. Initial interviewees were selected and approached by the Data Stewardship Coordinator based on their online profile content, to ensure representation of the different research methodologies used across the faculty as well as of all three of TPM’s departments: Engineering Systems and Services; Multi-Actor Systems; and Values, Technology and Innovation. In addition, one researcher was interviewed as a result of a recommendation from an initial interviewee, and two other interviewees were suggested by the Secretary General of the faculty.

All interviewees were informed that interview findings would be used to create a preliminary report on data management needs at the faculty and that the report might be made publicly available. Interviewees were assured that no information would be directly attributed to them and that they would not be named in the report. Interviewees agreed for the interview notes, including personal information, to be shared internally with the Secretary General of the faculty.

Interviews lasted for 30 – 60 minutes. Interviews were not recorded, and instead, notes of key discussion points were taken by the interviewer during the interview.

Categories of data management issues

The diverse nature of research topics at TPM suggested that researchers could have different data management needs. The nine interviews conducted so far revealed that this was indeed the case and identified four top data management issues: handling personal sensitive research data; working with big data; software management issues; and managing and sharing commercially confidential information.

Handling personal sensitive research data

Questions about handling of personal sensitive research data were from across the whole research lifecycle: starting with experimental design and ensuring that only the minimum necessary data about people were collected and the right consent forms were in place, all the way through to data anonymisation and deciding which parts of data could be made publicly available, which could be shared only under managed access conditions, and which datasets should never be shared. Researchers also mentioned difficulties of working with sensitive data on a daily basis – the need to use secure servers, encryption to share the data and to ensure that only authorised partners have access to data. Some discussed the challenges of working with sensitive information in fieldwork conditions, especially if the data was politically contentious.

Interviewees wished to have more guidance about recommended workflows and policies, as well as practical support for finding the right storage solutions and means for sharing data with collaborators. In addition, better support was required at the experimental design stage: deciding on the minimal amount of personal information to be collected and drafting the right consent forms. Finally, many expressed the need for resources which could help them with data anonymisation and with managing the risks and benefits of making datasets publicly available.

All these concerns seemed particularly pressing in light of the new EU General Data Protection Regulation, which applies from May 2018. Some interviewees feared that they were unprepared for the new regulation and felt they had not received sufficient information about its impact on their research.
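As a small illustration of the kind of anonymisation resource interviewees asked for, the sketch below replaces a direct identifier with a keyed hash (HMAC-SHA256) and drops other identifying fields. It is an example under stated assumptions, not a workflow described by the interviewees or recommended by TU Delft: the function names, the field names (`email`, `name`, `answer`) and the way the secret key is supplied are all hypothetical. Pseudonymised data remain personal data under the GDPR for as long as re-identification is possible, so a step like this would only be one part of a wider risk assessment.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes, length: int = 12) -> str:
    """Return a stable pseudonym for an identifier using HMAC-SHA256.

    The identifier is normalised (trimmed, lower-cased) so that the same
    person always maps to the same pseudonym. The secret key must be stored
    separately from the data; anyone holding it can re-link pseudonyms.
    """
    normalised = identifier.strip().lower().encode("utf-8")
    digest = hmac.new(secret_key, normalised, hashlib.sha256)
    return digest.hexdigest()[:length]


def pseudonymise_records(records, id_field, drop_fields, secret_key):
    """Replace `id_field` with a pseudonym and remove the fields in `drop_fields`.

    `records` is a list of dicts; the field names are hypothetical and would
    depend on the actual dataset.
    """
    cleaned = []
    for record in records:
        row = {
            key: value
            for key, value in record.items()
            if key != id_field and key not in drop_fields
        }
        row["participant"] = pseudonymise(record[id_field], secret_key)
        cleaned.append(row)
    return cleaned
```

A call such as `pseudonymise_records(rows, "email", {"name"}, key)` would return the survey answers with a `participant` code in place of the email address and without the name column. Free-text answers and rare combinations of attributes can still identify people, which this sketch does not address.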

Challenges of working with big data

Challenges of working with big data were mainly related to infrastructure limitations. For researchers working with very large files, even simple aspects of data management became difficult. For example, due to ever-increasing storage requirements for big datasets, many researchers were unable to back up their data. This occasionally led to irretrievable data loss. Due to their large volumes, big datasets were rarely archived, raising reproducibility concerns. In addition, many researchers had to use third-party computing services in order to process their data effectively. This often resulted in issues associated with very slow data transfer.

Working with big datasets, especially those which needed to be dynamically updated, also meant challenges for data publishing. Many data repository providers did not offer options for big data sharing and had strict limitations on the maximum size of files. In addition, publishing big datasets often meant substantial costs, and it was often more cost-effective to simply re-generate the data when needed.

Software management issues

The third issue was software management. In general, researchers did not have policies within their research groups on how software should be managed, annotated and shared. Often the very platforms for software management differed within the same research group. Some researchers felt they did not have sufficient time to annotate their software properly and that their colleagues, especially students, did not have the right skills to work effectively with tools which could help them manage their software better. One researcher mentioned a missed commercialisation opportunity: the software developed by the group was not understandable to anyone outside the group, including the third party interested in commercialisation.

Interviewees mentioned that due to lack of appropriate skills amongst researchers, there was a need for professional service support in data science. In addition, many suggested that training on the use of software management tools (such as Git, Subversion or Jupyter Notebooks) would be useful, in particular for students. Several wished to receive more information about methods for software archiving and for getting citation credit for code publishing.

Managing and sharing commercially confidential information

Working with commercially confidential data also proved problematic. First, there were tensions between sharing data for the sake of reproducibility and the need to protect third parties’ commercial interests. Interviewees mentioned that navigating between the different contractual clauses could be difficult. One researcher admitted that the inability to share research data obtained from commercial partners made it more difficult to publish papers, because some journals now required that research data supporting publications be made publicly available. Another researcher felt that collaborating with industry negatively affected the progress of his academic career because commercial clauses meant fewer papers published. That researcher thought that when it came to academic promotions, commercial collaborations were valued less than the number of published articles.

Common data management problems

In addition to data management issues related to the type of research conducted, some common problems mentioned by almost all the interviewees were identified as well. These were related to improving daily data management practice, and to better data management procedures for students.

Daily data management practice

Problems related to daily data management practice concerned issues such as designing a data backup strategy and adhering to it, good file and folder naming, and version control. These problems were also shared by researchers who based their research primarily on literature reviews. Overall, very few interviewees had established workflows for good data management which would be followed by entire research groups. Most of the time it was down to individuals whether data was properly managed or not. Many researchers expressed the wish to improve their data management practice and to attend appropriate training.

Students’ data management practice

Almost all interviewees said that data management practice amongst students needed to be improved and that data management training should be part of the Graduate School’s curriculum. Training needs were related both to awareness of general principles, such as data backup, and to knowledge of specific techniques and practices, such as data science skills and software management tools.

In addition, one interviewee expressed his concern about the fact that PhD students were not required to archive their research data at the time of graduation. This, he believed, led to research reproducibility concerns and potential reputational damage. The researcher suggested that all PhD students should be required to archive their research data before leaving TU Delft. This view was shared by researchers from the TPM Policy Analysis section (see ‘Follow up actions undertaken’).

An additional concern regarding students’ data was raised during the meeting with researchers from the Engineering Systems and Services department (see the section ‘Follow up actions undertaken’). When discussing research data ownership, researchers mentioned that according to TU Delft regulations, research data collected by Master students belonged to the students, and not to TU Delft. As a result, in several cases, Master students left TU Delft and took all their research data with them, without leaving a copy with their TU Delft supervisors. Researchers believed that this was a serious issue from the point of view of research integrity and research continuity. To avoid similar issues in the future and to work around the unfavourable regulation, supervisors now avoided offering Master students participation in valuable, larger projects.

Views on Data Stewardship

With the exception of one researcher, who was in strong opposition to the Data Stewardship project, all other researchers welcomed the project and thought that there were data management needs at the faculty which could be addressed by the Data Steward.

The researcher with negative views on the Data Stewardship project thought that appointing a dedicated staff member to support researchers in data management was counterproductive. That researcher believed that a Data Steward would “develop guidelines (…) and hold meetings to raise awareness etc.” instead of solving “any actual operational issue”. He also suggested that a quantitative survey should be done to define the common practices and to decide whether any corrective steps needed to be taken. Interestingly, despite the negative attitude in general, the researcher agreed that there were issues with data management which needed to be solved and thought that training in data management for all PhD students was particularly needed.

Another researcher, who welcomed the overall idea of the Data Stewardship project, raised his concern about the amount of resources allocated to the project and suggested that care be taken to ensure that the project would not result in new compliance expectations.

All remaining researchers were enthusiastic about the project and identified numerous data management issues with which they hoped that a Data Steward could help. These included:

  • Advice on data management workflows and best practices (such as data backup, version control, file and folder naming)
  • Advice on data sharing and citation
  • Advice on working with different types of confidential data (such as personal sensitive and commercially sensitive data)
  • Support in designing strategies for sustainable code management
  • Advice on code sharing and citation
  • Help with managing funders’ and publishers’ expectations
  • Training on data and software management, in particular for PhD students

Follow up actions undertaken

As a result of the initial interviews with researchers at TPM, several actions were undertaken, which might suggest that interviewed researchers were genuinely interested in data management issues. First, the Data Stewardship Coordinator was invited to give two presentations about the Data Stewardship project: to researchers from the Department of Engineering Systems and Services, and to researchers from the Policy Analysis section of the Multi-Actor Systems Department. Second, one of the interviewed researchers asked members of the Research Data Services team to deliver a workshop to his students about using data repositories. Third, one of the interviewees made a suggestion to connect with the TPM’s Graduate School to discuss the possibilities of rolling out data management training for PhD students.

In addition, the Data Stewardship Coordinator initiated discussions with other faculties to determine whether the issues around Master students’ research data ownership were also problematic at other faculties and whether the problem should be tackled centrally or not. Furthermore, the Research Data Services team started liaising with the Human Research Ethics Committee to ensure alignment between research ethics and data management guidelines and policies.

Recommendations

This preliminary report identifies several areas where data management practices at the TPM faculty could be improved with the help of a Data Steward. However, given the preliminary nature of these findings and the risk that they might not be representative of the whole faculty, it is recommended that the work of the newly appointed Data Steward initially focus on a more in-depth investigation of data management needs. While qualitative interviews should be continued, a quantitative survey at the faculty is also needed, in agreement with the advice of the interviewee who was negative about the Data Stewardship project. Indeed, results of quantitative surveys conducted at the three faculties that already have Data Stewards proved valuable for measuring the scale of data management issues and deciding on priority actions. A thorough investigation of data management needs will allow the faculty to decide how to prioritise them. Finally, understanding the faculty-specific requirements will inform the development of a faculty data management policy.

In addition, given that many of the researchers interviewed expressed uncertainty about the recommended procedures for working with personal sensitive data, and that the new EU General Data Protection Regulation (GDPR) becomes legally binding in May 2018, it is suggested that the development of recommendations and training for working with personal sensitive data is also prioritised. This work should be done in collaboration with other teams at TU Delft: the Data Protection Officer, the Research Data Support team at the Library, the ICT team and the Human Research Ethics Committee.

The subsequent two years, during which the Data Steward will be appointed at 1.0 FTE, could be devoted solely to developing solutions for the remaining priority data management needs and to evaluating the project. A comprehensive evaluation of the project should help the faculty make an informed decision on how to take Data Stewardship forward after the end of 2020.


Acknowledgements

I would like to thank: all researchers who agreed to participate in my interviews for their time and valuable feedback; Martijn Blaauw for interviewee suggestions and introduction to the faculty; Alastair Dunning and Heather Andrews for comments on this report.


Citable version

A citable version of this report is available on the Open Science Framework: https://osf.io/8ce5v