Researcher Engagement with Data Management – What Works?


Written by: Maria Cruz and Julien Colomb


A new RDA project, under the umbrella of the Libraries for Research Interest Group and supported by 29 volunteers from three continents, seeks to collect case studies from organisations around the world on how to engage researchers with research data management.

Collectively, our group have put together a survey, now open for contributions, which allows participants to share their stories and approaches for increasing engagement with research data management among researchers. The results from this survey, including the data, will be shared widely with the community in the form of an open book. The goal is to assemble a wealth of information and resources that can be used by institutions to select the methods that are most suitable for their settings.

The importance of research data management has been well emphasized over the last few years, particularly by research funding agencies, universities, and other research and academic institutions. However, the discussions around this topic have often been led by librarians and data professionals, and researcher engagement has been largely limited to those researchers who are already interested in the topic. In order to achieve global cultural change in data management, researchers need to be motivated and properly recognised for good data stewardship efforts. This is not an easy task.

Many organisations have developed dedicated programmes aiming at greater researcher engagement with research data. Examples include the Data Champions initiative at the University of Cambridge, Data Conversations at the University of Lancaster, the Data Stewardship programme at TU Delft, and the Open Data Champions initiative of SPARC Europe. In addition, some institutions, such as the University Medical Centre Utrecht and the Berlin Institute of Health, decided to change the way in which researchers are rewarded.

However, do we know how successful these programmes are in achieving cultural change? And what about their costs and benefits? Are some programmes more suitable than others for certain types of institutions? Are there other strategies out there that achieve similar results with less effort? These are some of the questions this project is trying to address.

Research data management professionals spend a considerable amount of their time doing outreach, teaching, and otherwise engaging with researchers about research data management. Understanding what we can learn from each other and how to exchange practices more effectively are two very important goals of the project.  

The case study collection, review and editing are being led by Iza Witkowska, a Data Consultant from the University of Utrecht in the Netherlands, together with Andrea Medina-Smith from the USA and Elli Papadopoulou from Greece. They are supported in these tasks by 15 enthusiastic volunteers. The first project update will be presented at the RDA Thirteenth Plenary Meeting in Philadelphia in April 2019.

This blog post is distributed under a CC-BY 4.0 licence.


RDA Researcher Engagement Project

Steering Board

Lauren Cadwallader, Julien Colomb, University of Jena, Maria Cruz, Mary Donaldson, Lambert Heller, Rosie Higman, Elli Papadopoulou, Vanessa Proudman, James Savage, Marta Teperek

Project group members

Helene N. Andreassen, Daniel Bangert, Miriam Braskova, Lauren Cadwallader, John Chodacki, Julien Colomb, Philipp Conzett, Maria Cruz, Mary Donaldson, Biswanath Dutta, Esther Fernandez, Joshua Finnell, Raman Ganguly, Patricia Henning, Amy Hodge, Stein Høydalsvik, Greg Janée, Lynda Kellam, Gabor Kismihok, Iryna Kuchma, Narendra Kumar Bhoi, Young-Joo Lee, Leif Longva, Andrea Medina-Smith, Solomon Mekonnen, Remedios Melero, Rising Osazuwa, Elli Papadopoulou, Fernanda Peset, Josiline Phiri, Piyachat Ratana, Gerry Ryder, James Savage, Souleymane Sogoba, Magdalena Szuflita-Żurawska, Ralf Toepfer, Ellen Verbakel, Irena Vipavc Brvar, Jacquelynne Waldron, Anna Wałek, Yan Wang, Iza Witkowska, Joanne Yeomans

Data Champion kick off meeting

On the 14th of December 2018 the Data Champion Kick off Meeting took place at TU Delft. After an update on the Data Champion programme by Yasemin Turkyilmaz-van der Velden, the Data Champion Community Manager, the Data Champions took charge of the meeting by presenting their research, focussing on their research data management practices. In between these presentations there was room for networking activities, to stimulate interaction between Data Champions across the different departments and faculties. Here the researchers learned from each other and gained new insights for their own research data management.

Data Champions

Data Champions are researchers who practise and advocate good research data management and share their experiences and tips with their group/department members. Data Champions can help their Faculty’s Data Steward with the discipline-specific practices of Research Data Management. In return, the Data Champion programme offers (international) network and funding opportunities, training and workshops, and increased visibility to researchers. Being a Data Champion is a chance to be recognised for your leading role in research data management in your department and faculty. For more information on each Data Champion, or information on how to become a Data Champion, please visit the Data Champion page.

Mark tweeting on the Data Champion meeting at TU Delft.

Data and code in waterworks research – talk by Mark van Koningsveld

Mark van Koningsveld, Data Champion of the Faculty of Civil Engineering and Geosciences

Mark van Koningsveld, Data Champion of the Faculty of Civil Engineering and Geosciences and chair of the section Ports and Waterways, presented his research on waterworks in the Netherlands. He examines the spatial planning of regions where logistical change in the use of waterways is foreseen. He presented an example of his research on the water network of Amsterdam, which helps to understand how the traffic network would change if certain canals were closed off for construction work. Mark designed software to see how the planning of the reconstruction work would affect the water traffic. The interoperable nature of the software allows it to be re-used for other case studies, such as the short term effects of the drought that affected the Dutch water networks this summer, or the effects of long term climate change on the logistics of waterworks.

The integrated and interoperable approach that the software offers contrasts with the field’s traditionally linear approach, which can be lengthy and slow. Software enables a parametric design approach, allowing for immediate feedback on multiple aspects under study, such as production costs, different types of vessels and traffic load on the network. To move the research field forward Mark thinks that software should play a more important role, which is why he set up the Ports and Waterways coding lab for his students. He encourages other researchers to set up their own coding labs, as the facilities for this are available at TU Delft. Mark thinks that there is still a lot to be gained from collaborative research within faculties and across the faculties. These unlikely interactions will result in new ways of problem solving.

Time to share is now! Talk by Anton Akhmerov

Anton Akhmerov, Data Champion at the Faculty of Applied Sciences

Anton Akhmerov, Data Champion at the Faculty of Applied Sciences in the Department of Quantum Nanoscience, continued the discussion on the use of software in research. Currently, published papers, data and software are valued differently by the research community. Publications are the most visible output of research, followed by research data, and then finally there is software. Software is a difficult concept to grasp, as not every researcher knows how coding works or how to properly apply coding skills in daily research practices. This gap in knowledge is further complicated by the current policy of the TU Delft requiring researchers to fill out an invention disclosure form before publishing code or software openly on GitHub. Anton argues that training in the proper use of platforms like GitHub should be available to all PhD candidates and postdocs, for example through the software carpentry workshop hosted at TU Delft and the programming course Anton organises, but also within the TU Delft Graduate School programme. He thinks researchers should work together to discuss their problems, learn from each other and familiarise themselves with the available tools.

Gary Steele, another Data Champion at the Faculty of Applied Sciences

Anton and Gary Steele, another Data Champion at the Faculty of Applied Sciences, worked on an open data policy for the Department of Quantum Nanoscience, in which they defined two levels of data sharing: level 0 and level 1. Level 0 means one uploads the numerical data shown in the figures in a format that is readable to others. Since the data is already used to generate the figure anyway, this should not cost the researcher any additional time. Level 1 is the publication of the raw data and scripts that underlie the data processing chain which produced the published data. An example of level 1 sharing is available in this publication, where all the raw data and the processing code (Python scripts and notebooks) are available in Zenodo, a research repository. Level 1 does not mean that all the data collected during the research should be made available. Gary argues that people should think about and discuss what data should be shared, as researchers can get stuck in ‘molasses of useless documentation’ if everything is required to be shared. Anton argues that there is a lot of room in between the current practice of keeping data to ourselves and a possible future where all the data could be shared. Anton thinks that the time to start sharing is now, as pressure from the funding agencies to share research data is starting to build up. To see if the policy has an effect, Gary and Anton will monitor the latest publications and offer support to their colleagues if required.
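To make level 0 a little more concrete, here is a minimal Python sketch of what "uploading the numbers behind a figure" could look like in practice. It is an illustration only and not the department's actual tooling: the function name `write_figure_data`, the column names and the measurement values are all hypothetical. It uses only the Python standard library and writes the plotted values as plain CSV with a header row, a format that others can open without special software.

```python
import csv
import io


def write_figure_data(columns, rows, out):
    """Write the numbers behind a figure as CSV: one header row, then the data."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)   # column names, ideally including units
    writer.writerows(rows)     # one row per plotted point


# Hypothetical values that would also be passed to the plotting code.
voltages = [0.0, 0.5, 1.0]
conductance = [0.12, 0.47, 0.98]

# An in-memory buffer keeps the sketch self-contained; a real script
# would write to a file that is uploaded alongside the figure.
buffer = io.StringIO()
write_figure_data(
    ["gate_voltage_V", "conductance_2e2_per_h"],
    zip(voltages, conductance),
    buffer,
)
figure_csv = buffer.getvalue()
print(figure_csv)
```

In a real analysis script the same function could be called with a file opened via `open("figure_data.csv", "w", newline="")` (the file name is again an assumption), right next to the line that draws the figure, so that the shared numbers and the published plot always come from the same variables.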

Going beyond data

The important role of software and coding in research was a recurrent topic of the Data Champion Kick Off meeting. To support researchers in working with code and software, TU Delft hosted its first software carpentry workshop on 29 November 2018 and will organise more in the future. Additionally, Data Steward Kees den Heijer will organise code consultancy walk-in hours for researchers to support them with their coding and software needs. The first session will take place on the 24th of January from 9:00-11:00 in CEG 2.66 and is open to researchers of all the faculties.

Data Champions networking at the Data Champion Kick Off Meeting

Future of the Data Champion programme

The Data Champions who attended the kick off meeting were enthusiastic about the informal and interactive setting of the meeting, which allowed for interesting discussions and the generation of new ideas, as well as providing networking opportunities. For future meetings they want case-study-specific sessions to discuss the common problems they face, perhaps even in the form of regular common interest meetings. They were also interested in meeting representatives from the various support services of TU Delft, such as the Data Protection Officer, and they were looking forward to Rob Mudde, member of the TU Delft Executive Board, attending the next Data Champion meeting. The next Data Champion meeting is planned for spring 2019 and the Data Champion network is still open to new members.

Many precious first time experiences, thanks to RDA Europe Early Career Programme

Yasemin Turkyilmaz-van der Velden and Marta Teperek had the opportunity to represent TU Delft at the International Data Week 2018 in Gaborone, Botswana. Yasemin was awarded the RDA Europe Early Career grant to attend the conference and wrote a blog post about her experiences, which was originally published on RDA’s website and can be found below.


I am a Data Steward at the TU Delft Faculties of Applied Sciences, and Mechanical, Maritime and Materials Engineering. At the same time, I am a PhD candidate at Erasmus MC Rotterdam, writing my thesis on UV-induced DNA damage repair in mammalian cells. I have worked as a Data Steward since March 2018, and it has been a very enjoyable experience to join TU Delft as well as the Open Science and FAIR data community during the last 8 months.

This was my first RDA plenary and SciDataCon conference, and my first time in Africa. To be honest, I did not know much about either Botswana or RDA and SciDataCon, and I can say that it has been a very pleasant experience to observe the beautiful nature and culture of Botswana and to join RDA and SciDataCon. To comment a bit more on the former, although I could not visit the Okavango Delta, which I had earlier watched in such amazement in Planet Earth II, it was such a nice experience to visit the Mokolodi Nature Reserve and Gaborone Game Reserve and get the chance to see giraffes, zebras, rhinos, warthogs and others which I can normally only see in zoos or documentaries (and Lion King of course). I was also impressed with the friendliness of the local people, which included taxi drivers turning into local guides.

The rhino mother and the baby, and the curious giraffe

To comment on the latter, I was really impressed with the RDA, WDS and CODATA communities and how everyone was so knowledgeable and at the same time so friendly and willing to help and collaborate. Being one of the 8 TU Delft Data Stewards and a member of the TU Delft Research Data Management (RDM) team, which is around 15 people (not even mentioning the relatively high number of RDM experts within the Netherlands), I should probably be one of the last people to complain about not having enough people around me to discuss RDM. Yet, it was so nice to be in an environment with so many RDM experts (or early-career professionals like me) coming from all over the world, joining Interest Group (IG) and Working Group (WG) meetings with them, having interactive discussions during the sessions and continuing the fruitful discussions during the coffee breaks, lunches, dinners or drinks by the pool. It was also a very nice opportunity to meet colleagues from Botswana and other African countries and hear about their experiences.

The local hosts thought of many nice details to give the IDW 2018 a Botswanian touch. It was clear that IDW 2018 was taken seriously in Gaborone. The opening ceremony included the National Anthem and a speech by the President of the Republic of Botswana. Entertainment was not left out; the participants could enjoy shows by the Traditional troupe and the Marimba Band.

The President of the Republic of Botswana and the cheerful Marimba Band

Being an RDA Europe grant winner, I was assigned to take part in the Joint Meetings of WG FAIRSharing Registry and Data Policy Standardisation and Implementation IG and IG Health Data, IG Ethics and Social Aspects of Data, WG Blockchain Applications in Health. My tasks included note taking and helping organizers with their activities during the meetings. I was assigned to groups where my interests lie and I think that it was a great opportunity to get actively engaged in the activities of these IGs and WGs. I would like to thank the meeting organizers for being so friendly and welcoming.

Additionally, I was encouraged to present a poster during the poster session, which I always see as a great way to engage informally and interactively with participants during meetings and conferences. I presented the poster titled “Data Stewardship at Delft University of Technology”, which I prepared together with my TU Delft Data Steward colleague Yan Wang, reusing the materials generated by other TU Delft Data Steward colleagues. The poster can be found on Zenodo and is available for reuse under a CC BY 4.0 license.

I also gave an oral presentation during the SciDataCon session “Motivations and recognition for good data stewardship”. My presentation was based on the abstract written together with my colleagues Maria Cruz and Marta Teperek, and this abstract got us invited to submit a related paper to the Data Science Journal Special Collection for SciDataCon 2018. The presentation can also be found on Zenodo under a CC BY 4.0 license.

To give a bit more insight into this session: it was proposed and chaired by Marta Teperek, and it covered the efforts and approaches taken at different institutes to engage with researchers and achieve cultural change. The session started with the presentation by Rosie Higman from Manchester University, titled “Stewards, Champions or Advisors? An overview of institutional Research Data Management support structures”, which showcased her impressive work comparing the approaches taken at different institutions. I see her work as an invaluable resource, especially for all those who are still not sure where to start and which approach to follow. I continued with my presentation, “Data Stewardship at Delft University of Technology”, which was followed by James Savage’s talk, “Establishing, developing, and sustaining a community of Data Champions”. James Savage is a researcher and a Data Champion at the University of Cambridge. Although both Rosie and I had already talked about Data Champions, having a real Data Champion in the room convinced the audience that Data Champions are not imaginary characters proposed by research data supporters, but researchers who are really willing to become advocates for good RDM practices in their research groups and departments to achieve the desired cultural change. After James, Raman Ganguly from the University of Vienna gave a presentation titled “Building sustainable networks for data management” and talked about their approach, which did not involve any Data Stewards or Champions.

Finally, I was also encouraged by the RDA Europe Early Career Programme to join the IG Early Career and Engagement session. The aim of this IG is to give early career researchers and professionals the opportunity to network among themselves and receive mentorship from senior RDA members. In the current RDM landscape, so many things are changing and new things are regularly being introduced. Considering my role as a research supporter, I can only be helpful to the researchers if I keep myself updated on all these changes. Again, I have been lucky from the beginning to join the TU Delft RDM team and especially to have such an experienced person as Marta Teperek (who is a mentor for this IG), who was always there for all my questions. But I can imagine that not everyone is as lucky as me, and therefore I see great benefit in this IG for many early career researchers and professionals.

I would like to end my post with the monkeys enjoying the coffee table on the last day of the conference, and I would like to offer my sincere thanks to the RDA Europe Early Career Programme for giving me the opportunity to join IDW 2018. I would also like to thank Marta Teperek, the Data Stewardship Coordinator, and André Groenhof and Marja van den Bergh, the executive faculty secretaries of Mechanical, Maritime & Materials Engineering, and Applied Sciences, for being ever so supportive and encouraging.

 


4TU.Centre for Research Data partners with The Carpentries: Impressions from the first workshop at TU Delft


Written by Shalini Kurapati and Marta Teperek


Training needs: research computing skills for open science

In addition to good data management, software sustainability is important for open science.

According to the survey conducted by the Software Sustainability Institute in 2014, 7 out of 10 researchers rely on code for their research. Sharing research data without the supporting code often makes research impossible to reproduce. Good documentation and version control have been highlighted as major contributors to sustainable software. In addition, earlier workshops and survey results indicated that researchers need training on good code writing, code management practices and version control.

Similarly, a TU Delft-wide survey on data management needs revealed that 32% of researchers were interested in training on version control and 18% specifically in software carpentry workshops.

Thus, 4TU.Centre for Research Data made a strategic decision to partner with The Carpentries and became a Silver Member of the organisation.

What are The Carpentries?

The Carpentries “teach foundational coding, and data science skills to researchers worldwide.” It is a community-based organisation which maintains and develops curricula for three different types of workshops: software carpentry, data carpentry, and library carpentry. Detailed and structured lesson plans are available on GitHub, and they are delivered by a network of carpentry instructors.

An important element of The Carpentries is that in order to deliver a workshop, instructors need to be certified. The certification process puts a particular emphasis on the pedagogical skills of the instructors.

First software carpentry at TU Delft

TU Delft hosted the first software carpentry workshop on 29 November 2018 as a pilot before officially joining The Carpentries. We had around 30 researchers participating (and another 45 on the waiting list!). The participants were from four faculties at TU Delft: Civil Engineering and Geosciences, Applied Sciences, Technology Policy & Management, and Architecture and Built Environment. We had three instructors and four helpers in the room.


The GitHub pages with the lesson materials are publicly available and can be found here: https://mariekedirk.github.io/2018-11-29-Delft/. All participants were asked to bring their laptops along and to install some specific software. No prior programming knowledge was required. Collaborative notes were taken with Etherpad.

During the workshop, participants downloaded a prepared dataset and worked with that dataset throughout the two days. They learnt task automation using the Unix shell, version control using Git, and Python programming using Jupyter notebooks.

Feedback

The Carpentries have a special way of organising feedback. Participants receive red and green post-it notes and use them to indicate problems / completion of tasks during the whole course. Similarly, after the end of each day, the participants are asked to indicate all the plus sides and negatives of the workshop on green and red post-it notes, respectively.

The feedback from the participants after the workshop helped us evaluate the training. The participants were overwhelmingly appreciative of the instructors and helpers and seem to have enjoyed the training. Some of the participants felt that the pace of the workshop was fast and they did not have time to experiment with the data set. Some others wished to get a more personal approach and to actually get an opportunity to work with their own disciplinary datasets.

Plans for the future

The waiting list for the workshop was very long and we had to disappoint more than 45 researchers who didn’t manage to get their spot on the day. In addition, faculty graduate schools have been willing to give course credits for PhD students who attend this workshop, which made the course even more attractive to attend for PhD students. Therefore, to meet the demand, we are planning to organise four more workshops in 2019: two workshops at TU Delft, one in Eindhoven and one in Twente. We will continue to monitor the number of interested researchers and if the need arises, we might consider scheduling some additional courses.

In addition, to increase our capacity in delivering carpentry training, some of the TU Delft’s data stewards and data champions will attend the training to become instructors. We hope to have this instructor training organised in April.

To address the feedback about the pace of the course, we will be more selective and include fewer exercises in our future workshops to ensure that the participants get the chance to experiment and play with their datasets and scripts.

In order to provide some more tailored support to researchers who have started to code but need some additional support to make it work, or who might have attended a carpentry workshop but are not sure how to apply the learning into practice, we will host dedicated coding walk-in hours consultations starting in January 2019.

So… watch out for the next carpentry workshop – scheduled for Spring 2019!

Reflections on the report “Scholars ARE Collectors: A Proposal for Re-thinking Research Support”.

Post by Amit Gal, Alastair Dunning and Nicole Will

The research organisation Ithaka S+R recently issued the report “Scholars ARE Collectors: A Proposal for Re-thinking Research Support”. The report takes a user-centred approach to understanding what would be a good way to support researchers in the future, and outlines possible places to invest.

It makes the case that researchers are, in fact, collectors, and that their (often massive) collections vary widely in form across different disciplines. These collections, however, are not properly managed – which is quite understandable, as “collecting” requires a different set of skills and tools than “researching”.

From the context of our own research support services at TU Delft, we made some specific observations:

1. The report has a great focus on the right point of view – the user’s point of view. If we at TU Delft want to support the researcher better, we must understand her better. That means more than just knowing what she does, it means having an empathetic understanding of why she does it and who she is. Understanding is more than just talking.

2. From an empathetic understanding you get a better appreciation of the challenges. TU Delft’s Informed Researcher Training and Open Science MOOC try to fill some of the identified skills gaps.

3. The report points to four different stakeholders that support scholarly collecting – funders, open data advocacy groups, external tool and service providers, and academic institutions. It might be useful to realize that we, as the TU Delft library, represent two of these stakeholders – we are the academic institution, naturally, but we are also the FAIR data advocacy group. Is it possible that these two sometimes clash? Could one role impede the other, and if so – how should we address it?

4. The journey of understanding our users better, improving our services and creating new, better ones is a journey we cannot take on our own. At the very least, ICT and the researcher groups must be partners here. So we should get better at collaborating with these, and other, parties around us.

5. Some of the language (e.g. ‘scholars’, ‘personal collections’) and evidence here is drawn from the humanities and doesn’t feel right in the context of a technical university. The report misses some of the language and developments occurring in a technical university (e.g. there is no mention of data science, data stewards etc., and the importance of writing code or running simulations is underplayed).

6. Our instinct is that scientists (as opposed to humanities scholars) have fewer ‘personal collections’ and more ‘group collections’. For example, a team gets access to data, or a department collects data, or a consortium writes a proposal, or a group writes a paper. While individual roles always play a part, access to these different outputs is managed at a team level.

7. Many of the key points are similar to what we know here at TU Delft, e.g. about the fear of being scooped or the time taken to document data. The metaphor of collection is also important, as it emphasises the emotional ownership scientists feel about their outputs.

8. The conclusion on the final page is definitely worth holding on to: how do we (and by that I mean not just the library but all the relevant support services) offer the kind of support the researcher needs throughout her workflow (not just at the start and end)? The goal is not Open Science per se, but getting to Open Science by responding to specific user needs.