Category: Data Management Plan

TU Delft Research Data Framework Policy now approved

On 26 June 2018, the new TU Delft Research Data Framework Policy was approved by TU Delft’s Executive Board. The Framework Policy is an overarching policy on research data management for TU Delft as a whole and it defines the roles and responsibilities at the University level. In addition, the Framework provides templates for faculty-specific data management policies.

From now on, the deans and the faculty management teams, together with the Data Stewards, will lead the development of faculty-specific policies on data management, which will define faculty-level responsibilities.

If you work at TU Delft and would like to be involved in the development of faculty-specific policies, please get in touch with the relevant Data Steward.

The full text of the policy (pdf) is available below.

TU Delft Research Data Framework Policy – Version for CvB – 18 June 2018

Why is this a good Data Management Plan?

This blog post reports from a workshop session led by Marjan Grootveld and Ellen Leenarts from DANS. The workshop was part of a larger event, “Towards cultural change in data management – data stewardship in practice”, organised by TU Delft Library on 24 May 2018.

This blog post was written by Marjan Grootveld from DANS and was previously published on the OpenAIRE blog.


It’s not just Colonel Hannibal Smith who loves it when a plan comes together. Don’t we all? On a more serious note, this also holds for Data Management Plans or DMPs. In a DMP a researcher or research team describes What data goes into a project (reuse) and comes out of it (potential reuse), How the team takes care of the data, and Who is allowed to do What with the data When.

Just like a project plan, a DMP undergoes a review process. Often, however, researchers first share their draft version and questions with research support staff and data stewards (see the results of this survey by OpenAIRE and the FAIR Data Expert Group). About twenty data stewards shared their review and pre-view experiences in a lively session at the Technical University Delft on 24 May. During the day the organisers and speakers highlighted various aspects of data stewardship with a welcome focus on practical situations, especially in the break-out sessions. (When the presentations are available we will add a link to this blog post.)

In the session called “Why is this a good Data Management Plan?” Marjan Grootveld (DANS, OpenAIRE) and Ellen Leenarts (DANS, EOSC-hub) presented text samples taken from DMPs. By raising their hands – or not! – and subsequent discussion the participants gave their view on the quality of the sample DMP texts. For instance, the majority gave a thumbs-up for “A brief description of each dataset is provided in table 2, including the data source, file formats and estimated volume to plan for storage and sharing”. In contrast, the quote “Both the collected and the generated data, anonymised or fictional, are not envisioned to be made openly accessible.” drew a good laugh and the thumbs went down. Similarly, the information that the length of time for which the data will remain re-usable “may vary for the type of data and [is] difficult to specify at this stage of the project” was found not acceptable; the plan should at least explain why it is difficult, and how and when the project team will nevertheless provide a specific answer. And is it really more difficult than for other projects, whose DMPs do provide this information?

Although it can be hard to be specific in the first version of a DMP, it’s essential to demonstrate that you know what Data Management is about, and that you will deliver FAIR and maximally Open data. Does the DMP, for instance, tell what kind of metadata and documentation will be shared to provide the necessary context for others to interpret the data correctly? Does it distinguish between storing the data during the project and sustainably archiving them afterwards? (Yes, we had a sample text neatly describing the file formats during the data processing stage versus the file formats for sharing and preservation.)

There was consensus in the group on the quality of most of the quotes. Where opinions differed, this had mainly to do with the fact that the quotes were brief and therefore open to more lenient or more picky interpretation. In other cases, a sample text had both positive and negative aspects. For instance, “The source code will be released under an open source licensing scheme, whenever IPR of the partners is not infringed.” was found rather hedging (“whenever”) and unspecific (which licensing scheme?), but the plan to also make source code available is good; too often this seems to be forgotten when the notion of “data” is understood in a limited way.

The session participants agreed that a plan with many phrases like “where suitable/ where appropriate/ should/ possibly” is too vague and doesn’t inspire much trust. On the other hand, information on who is responsible for particular data management activities is valuable, and so is planning like “The work package leaders will evaluate and update the DMP at least in months 12, 24 and 36”. Reviewers prefer explicit information and commitment to good intentions – which may be something to keep in mind for your “Open A-Team”.
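
As a rough illustration of the point about vague phrasing, a reviewer could count hedging phrases in a draft before reading it closely. The sketch below is only one possible approach, not something used in the session: the phrase list is taken from the examples quoted above, while the function name `count_hedges` and the sample sentence are made up for illustration.

```python
import re

# Hypothetical phrase list, taken from the examples quoted in the session.
HEDGES = ["where suitable", "where appropriate", "should", "possibly"]


def count_hedges(text):
    """Return a dict mapping each hedging phrase to its number of occurrences."""
    lowered = text.lower()
    counts = {}
    for phrase in HEDGES:
        # Word boundaries prevent matches inside longer words (e.g. "shoulder").
        pattern = r"\b" + re.escape(phrase) + r"\b"
        counts[phrase] = len(re.findall(pattern, lowered))
    return counts


# Invented sample sentence, not taken from a real DMP.
sample = (
    "Data will be shared where appropriate. Metadata should possibly be "
    "added where suitable, and the plan should be updated."
)
hedge_counts = count_hedges(sample)
print(hedge_counts)
```

A high count is only a prompt for a closer look; a sentence such as the “months 12, 24 and 36” example contains none of these phrases and would score zero.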


How to ensure that the costs of data management activities are budgeted in grant proposals?

Written by: Mary Donaldson and Vessela Ensberg

On 21 February 2018, a Birds of a Feather session on ‘Data management costing in grants’ was held as part of the 13th International Digital Curation Conference in Barcelona. The session was proposed and chaired by Marta Teperek of TU Delft.

The session proposal recognised that many research funders now require that research data is properly managed and shared. Consequently, many of them agree that the costs of data management can be budgeted in grant proposals. This is necessary for the sustainability of data management activities. So why is this not yet the norm?

Identifying the problems

We identified two main reasons why data management is not included in grant proposal budgets: lack of awareness among researchers as to what funds they can request, and lack of available support at research institutions.

Researchers’ issues

The usual suspects among the reasons why Research Data Management (RDM) activities are not costed into grant proposals include:

  • researchers prefer to ask for money for other purposes
  • researchers are not aware which costs are eligible
  • researchers believe that RDM costs should come from award overhead

Identifying solutions to researchers’ issues

Some of the RDM activities we identified as eligible for funding are

  • transcription of interviews
  • data anonymization
  • data curation assistance (outside of existing central posts)

We acknowledge that some of these activities are already included in research proposals as parts of the normal research process, and that a specialist, such as a data curator, may be difficult to hire for a less than full-time post. Growing the list of examples and viable options is likely key to having data management included in grant budgets.

Institutional issues

As we moved on from discussing why grants don’t often contain data management costing, we strayed into the related territory of institutional issues. Those included

  • worries about ‘double dipping’ for RDM costs, especially when trying to recover staffing costs
  • need for training for research admin staff who are directly involved in application processes; high staff turn-over in these positions
  • lack of a centralised system which tracks all grant applications or lack of communication between the Office coordinating the grant awards and RDM services
  • preservation costs being incurred after the award has been closed
  • lack of a pool of ‘expert’ staff which can be hired out to research projects

Identifying solutions to institutional issues

Institutional issues can be addressed by investment in the processes. In particular, Utrecht University and the University of Glasgow gave examples of addressing communication and training of research support staff. The RDM team at Glasgow is investigating the possibility of adding a check-box to the central grant review system to indicate that funding for RDM has been costed and included in the application. Utrecht also provides consultations on data management costs and is experimenting with a pool of data managers who can be hired from the library for a certain amount of time to work on specific projects. The library is funding these positions but hopes to be able to recover up to 75% of the cost of each position from research projects in the future.

We also looked for lessons learned from Open Access for publications. In recent years, funders have experimented with different models to pay for the more mature requirement for open access to publications. We explored whether these models could be adapted to help with the requirement for data management and sharing of research data. The first model we discussed was the FP7 pilot for open access, where eligible projects were entitled to apply to a central pot of money, provided certain conditions were met. This pilot is due to end this week (28th Feb 2018), and has encountered administrative issues. In the UK, Research Councils UK (RCUK) have provided large research-intensive institutions with a block grant award to pay for Open Access charges for eligible articles. At the end of the pilot, RCUK will accept longer embargo periods. While we felt that centralized pots of money might work to support data management, the administrative burden of this funding is high.

To summarize, institutions can consider the following options to boost the inclusion of data management in grant budgets.

  1. An institution should have a centralized grant administration system. Such a system can be adapted to ensure data management is included in the budget.
  2. RDM services should do more advocacy with researchers, using vocabulary the researchers understand and relate to, and should match researchers with resources to support costing of RDM activities.
  3. Institutions could provide seed funding to researchers for legacy projects. This might help researchers engage better with RDM and consider their needs earlier in the process on subsequent projects.
  4. Institutions should consider having a core team of RDM specialists (data curators, statisticians etc.) whose time can be bought out by grants, in the way that technicians' time already is in the life sciences.
  5. Institutions could provide in-depth training for technical or other support staff to enable them to deliver data management for a project. This would provide regular subject-specific RDM support for projects and help build capacity in departments.

However, despite all the ways in which institutions could help improve and support costing for RDM activities, we felt that approaching funders to better support this process would be more effective than each institution having to develop its own solutions. We also thought that funders should be alerted that, in cases in which they only require an outline plan at the time of application, the opportunity to identify and cost data management activities has already passed by the time the award is made and a more detailed plan is developed.

Proposed funder interventions

  1. Improve the review process for data management plans. Check for discrepancies between the RDM activities promised and the resources requested.
  2. Provide a clear statement with examples about acceptable and fundable data management activities.
  3. Indicate the proportion of each grant award expected to be spent on RDM activities.

This could be expressed as a percentage or a range (to prevent the figure itself from becoming a point of argument) and would signal to researchers that funders don’t see RDM as a waste of money that could better be spent on generating more research data.

  4. Make it clear who in the funding body is the person/role to contact to discuss RDM issues. RDM requirements are still new enough that clarification is regularly required.
  5. Fund more data re-use.

For researchers, the cost/benefit analysis of making research data available is difficult to assess. Issuing calls specifically to encourage re-use of datasets would improve the understanding of data re-use and drive demand for shared datasets, helping tip the scales in favour of sharing data.

Ultimately, better alignment of funder RDM requirements would make it simpler for researchers to comply. It was mentioned that the Research Data Alliance (RDA) had tried to get a funder working group together. Perhaps this is something Science Europe could also be involved with.

Future work

Jisc have funded a project in the UK to produce centralised guidance by July on the following:

  • What do different funders require in terms of RDM?
  • What do different funders require in terms of data sharing?
  • What are different funders willing to pay for?
  • How should funding for RDM be justified in grant applications?
  • How can funds for RDM be used by institutions?

Training Data Science students on finding and publishing datasets

Written by: Marta Teperek and Madeleine de Smaele

On 3 November 2017 Madeleine de Smaele from TU Delft Library was invited by Scott Cunningham, Associate Professor at the Faculty of Technology, Policy & Management, to deliver a workshop to his Data Science students. Marta Teperek attended the workshop as an observer and, given that she only started working at TU Delft on 15 August, it was also a good opportunity for her to learn more about data management support available to researchers.

Below are our key reflections on that session.

Structure and content

Madeleine’s session was divided into two parts, each lasting 45 minutes, with a 15-minute break in between. The session was a mixture of Madeleine’s presentation and some interactive exercises.

Part I – finding datasets

The first part introduced:

The first part concluded with an interactive exercise where participants were asked to find a repository and a dataset of interest for their research by using re3data.org. Afterwards, we had a roundtable discussion about the datasets found by the participants and what was good and what was not so good about them (e.g. clear licence, citation, DOI).

Part II – publishing own datasets

In the second part of the workshop, we discussed the benefits of publishing one's own research data and the ways of doing so. We thought this was relevant to the course participants, as they had been working on a dataset for their data science course and might be interested in sharing their study results in a repository, and thus getting credit for their work. We spoke about DOIs, visibility and tracking citations.

The second part finished with an exercise as well, where participants were allowed to practise depositing research data into the 4TU.Centre for Research Data.

Feedback

This was the first time that TU Delft Library had delivered a presentation of this kind to students, so we thought it was necessary to ask the participants for feedback afterwards to see how the session could be improved in the future.

What went well

We were happy to see that participants valued the interactive exercise on finding existing datasets and that they liked the information we provided about data sharing possibilities. Many participants were also happy to learn about the various repositories available for them to use (not only for datasets), as well as about the dedicated support available to them at TU Delft.

We were also happy to see that students liked the slides and they valued the presenter.

What could be improved

It was also extremely useful for us to learn how our sessions could be improved in the future.

The primary suggestion was to tailor the content to the level of knowledge of the students. It turned out that students were already familiar with the principles behind good data management and the benefits of data sharing, and therefore wanted the session to move faster and to focus more on the parts they were not aware of. In addition, the participants wanted to see more examples tailored to their discipline and types of research.

The other suggestion was to make the session more interactive: to ask more questions and to facilitate more discussion throughout the session. This would also help the presenter to select the right content for the participants during the presentation.

Conclusions

In the future, we will want to find out more about the audience in advance of the workshop to ensure that we can better tailor the messages, examples and pace of the session. We will also revise the content of the workshop to make it more interactive and to facilitate more discussion with the participants throughout the session.

In addition, we also had issues with accessing the live version of the 4TU.Centre for Research Data during the demo, which was quite unfortunate. To future-proof ourselves, we will prepare some screenshots of a deposit process and always have the slides with us during similar presentations.

Overall, it was a very useful exercise for us and provided us with a lot of ideas on how we could improve the workshop in the future. We are very grateful to both Dr Scott Cunningham and his students for the opportunity.

Retrospect on Data Management Plan Support Work 2017

The year 2017 is drawing to a close, and it was a busy one for DMP (data management plan) support. The funding bodies NWO (Nederlandse Organisatie voor Wetenschappelijk Onderzoek) and the European Commission tightened their requirements for research data management during the research and for the discoverability and accessibility of the research outcomes. Since that turning point, interaction with researchers receiving grants from these funding bodies has steadily increased. In addition to other research support services, such as the valorisation centre, referring researchers with help requests to our services, the RDS (research data services) team proactively contacts researchers to offer advice.

The first step is an introductory talk about the researcher’s project, their data and research data management. The next topic is the ICT solutions available at TU Delft, because researchers are often still not aware of the full spectrum of available technical infrastructure. Subsequently, the DMP section on how to handle research data during the research is discussed. The last part covers preservation, archiving and data availability after the research; here the 4TU.Centre for Research Data is introduced and its benefits explained. Additionally, the use of the 4TU.Research Data instance of DataVerseNL is described. With this full range of support services in view, a possible data management and data deposit workflow is discussed.

For NWO, the first deadline for a DMP draft is 4 months after the official start of the project; for the European Commission's H2020 programme, it is 6 months. The RDS team also offers to give feedback on the DMP between that first deadline and the final submission.
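
To make the two deadlines concrete, here is a minimal sketch of how the first draft deadline could be worked out from a project start date. The 4-month and 6-month offsets come from the text above; the function names, the example dates, and the choice to fall back to the last day of a shorter month are assumptions made for illustration, not funder rules.

```python
import calendar
from datetime import date

# Months between official project start and first DMP draft, as stated above.
DRAFT_DEADLINE_MONTHS = {"NWO": 4, "H2020": 6}


def add_months(start, months):
    """Return the date `months` calendar months after `start`.

    Assumption: if the target month is shorter than the start day
    (e.g. 31 October + 6 months), use the last day of the target month.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def draft_deadline(funder, project_start):
    """Return the first DMP draft deadline for a given funder and start date."""
    return add_months(project_start, DRAFT_DEADLINE_MONTHS[funder])


# Invented example start dates.
print(draft_deadline("NWO", date(2018, 1, 15)))
print(draft_deadline("H2020", date(2017, 10, 31)))
```

The exact deadline should always be taken from the grant documentation; this only shows the calendar arithmetic.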


The ideal involvement of the RDS team from start to finish begins with giving feedback on the data section at the proposal stage. When the project has received funding, the researcher comes back to the RDS team, or the team contacts the researcher again to offer support. Besides helping with filling in the DMP, the RDS team offers training for the project team and the department, if the researcher is open to that. With the data DOI reservation service of the 4TU.Centre for Research Data, researchers are encouraged to deposit the data underlying their scientific publications in the archive early on. The collection creation feature offers researchers a great opportunity to present their research output in the most suitable way, rather than waiting until the end of the project to ‘dump’ some data into the archive to comply with the funding body's open data requirement.

All this is introduced and explained to the researcher in the first session and can lead to a close collaboration throughout the project duration, if the researcher is in favour of that.

So far we have supported 20 NWO research projects and 3 H2020 (open data pilot) projects with their DMP drafting and first submission. We have not yet received any feedback from the funders about the quality of our DMP support. However, the researchers at TU Delft appreciate our advice and help.