
What does reproducibility mean for qualitative research?

Written by Shalini Kurapati and Marta Teperek


Sebastian Karcher of the Qualitative Data Repository was at TU Delft on 28 January to share his thoughts and expertise on this topic in a two-part seminar organised by the 4TU.Centre for Research Data. Given that we are based at the Delft University of Technology, where qualitative research is not mainstream, we were pleasantly surprised by the level of interest: the talk was attended by around 60 participants. The majority of researchers were from the Faculty of Technology, Policy and Management, followed by Industrial Design Engineering. The second part was a hands-on workshop on qualitative data management techniques, which was attended by 20 participants.

Limits of Reproducibility

During the opening talk “Limits of Reproducibility: Strategies for Transparent Qualitative Research”, Sebastian raised the issue of terminology problems surrounding reproducibility discussions. To clarify, he presented lucid definitions of the various terms that are often used interchangeably.

  1. Reproducibility: Using same data and methods to produce same results
  2. Replicability: Using same methods, different data/sample and arriving at same results
  3. Transparency: Providing all necessary information to evaluate a study

Sebastian argued that reproducibility and replicability are based on a positivist view of science (e.g. testing a hypothesis in a lab), whereas qualitative research may follow interpretive research methods where the study conditions are quite impossible to recreate. Therefore, Sebastian suggested that the rigour and quality of qualitative research should not be judged based on reproducibility or replicability, but by the transparency of the research process.

He distinguished three types of transparency most relevant in qualitative research:

  1. Production transparency: Information on how data are collected or generated, such as research questions, sampling, subject recruitment; this information should be recorded following best practices in the research field.
  2. Analytic transparency: Documentation process of qualitative data preparation and analysis leading to the conclusions of a study.
  3. Data access: Considerations with regards to data sharing and access conditions.

However, Sebastian stressed that the idea of using transparency as a way to evaluate the rigour of qualitative research is only the beginning of a debate and not a solution for all aspects of qualitative research evaluation. In addition, following the path of transparency is not always straightforward. For example, data preparation, data anonymisation and providing access to data can be costly (both in terms of time and resources). In addition, there might be ethical and legal challenges with respect to privacy regulations and additional copyright implications of textual data. Nonetheless, Sebastian was confident that thanks to improved technology, existence of instruments like data management plans and raising awareness of best practices, these problems can be overcome.

Managing Qualitative Data for Sharing and Transparency

In the second part of the seminar, Sebastian focussed on practical issues related to de-identification of personal data, access controls for data and informed consent forms.

De-identification of qualitative data is highly dependent on research questions and context. For example, knowledge of a specific language might influence the level of difficulty in de-identification. In addition, local circumstances need to be understood in order to know what gives away the data subject’s identity. Using a practical exercise based on a politician’s profile, Sebastian asked the participants to de-identify personal data, while also thinking about minimising the loss of richness of the information provided. The final results were truly interesting since different groups de-identified the same text in a myriad of ways, and there was no model answer. An ensuing discussion allowed participants to reflect on the considerations they made during de-identification.
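The mechanical part of such an exercise, replacing direct identifiers with consistent placeholders, can be sketched in a few lines of Python. This is only an illustration and not a tool used in the workshop: the function name `pseudonymise`, the category labels and the sample sentence are all hypothetical, and the list of identifiers still has to be compiled by a person who understands the context.

```python
import re


def pseudonymise(text, identifiers):
    """Replace known direct identifiers with stable placeholders.

    `identifiers` maps each string to remove onto a category label,
    e.g. {"Jane Doe": "PERSON"}. Returns the new text and the mapping
    used, so the researcher can keep the key in a separate, secure place.
    """
    mapping = {}
    counters = {}
    # Handle longer identifiers first so a short one never splits a long one.
    for original in sorted(identifiers, key=len, reverse=True):
        category = identifiers[original]
        counters[category] = counters.get(category, 0) + 1
        placeholder = f"[{category}_{counters[category]}]"
        mapping[original] = placeholder
        pattern = rf"\b{re.escape(original)}\b"
        text = re.sub(pattern, lambda _match, p=placeholder: p, text)
    return text, mapping


# Hypothetical sample text and identifier list.
sample = "Jane Doe, mayor of Exampleville, met Jane Doe's advisor."
clean, key = pseudonymise(sample, {"Jane Doe": "PERSON", "Exampleville": "PLACE"})
print(clean)
```

Note that the word "mayor" survives in the output. Whether such an indirect identifier should be kept, generalised or removed is exactly the kind of context-dependent judgement the groups made differently, and a script cannot make it for you.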

The next exercise was about understanding the guidelines on writing clear and informative text in consent forms. Getting consent right is important to make sure that data collection is efficient and compliant with regulations, and that potential reuse of the research data has been considered. It is also important for researchers to negotiate with ethics committees on the text of the informed consent, and sometimes to seek advice from them, in order to avoid risk-averse decisions by the committees.

Before moving on to questions from the audience, Sebastian spoke about the importance of access controls for sharing qualitative data, which often contain personal information. Even when research data contain information that could potentially identify individuals, their identity can still be protected through access controls. Sebastian mentioned some examples of access control to research data which can be offered by research repositories:

  • Conditional online access – data can be accessed online, but only by registered users, who in addition might be required to fulfil some obligations, e.g. have ethics approval or be affiliated with research institutions.
  • Depositor-approved access – data can be accessed online by registered users, who are in addition approved by the depositor. Sebastian advised caution with relying on this approach to data access, given that these routes might not be sustainable long-term.
  • Offline access – datasets can be accessed by registered users at a secure location, on a computer with no network access.
  • Embargoed data – datasets can only be accessed after a certain date, after a certain amount of time has passed, or after a certain event has happened – the conditions need to be specified and justified.
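To make the differences between these access levels concrete, here is a minimal Python sketch of the decision a repository might make for each of them. It is an illustration under stated assumptions, not the logic of any actual repository: the names `AccessLevel`, `AccessRequest` and `can_access` are hypothetical, the embargo is simplified to a fixed end date, and registration is assumed not to be required once an embargo has lifted.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class AccessLevel(Enum):
    CONDITIONAL_ONLINE = "conditional online access"
    DEPOSITOR_APPROVED = "depositor-approved access"
    OFFLINE = "offline access"
    EMBARGOED = "embargoed data"


@dataclass
class AccessRequest:
    on: date                          # day the request is made
    registered: bool = False          # user has a repository account
    meets_conditions: bool = False    # e.g. ethics approval or affiliation
    depositor_approved: bool = False  # depositor has approved this user
    at_secure_location: bool = False  # on-site, on an offline computer


def can_access(level: AccessLevel, request: AccessRequest,
               embargo_end: Optional[date] = None) -> bool:
    """Return True if the request satisfies the given access level."""
    if level is AccessLevel.EMBARGOED:
        # Assumption: the embargo is expressed as a date, and that date
        # must be specified up front.
        if embargo_end is None:
            raise ValueError("embargoed data needs an explicit end date")
        return request.on >= embargo_end
    if not request.registered:
        # The three remaining levels all require a registered user.
        return False
    if level is AccessLevel.CONDITIONAL_ONLINE:
        return request.meets_conditions
    if level is AccessLevel.DEPOSITOR_APPROVED:
        return request.depositor_approved
    return request.at_secure_location  # AccessLevel.OFFLINE


request = AccessRequest(on=date(2019, 1, 28), registered=True,
                        meets_conditions=True)
print(can_access(AccessLevel.CONDITIONAL_ONLINE, request))
print(can_access(AccessLevel.DEPOSITOR_APPROVED, request))
```

Writing the rules down like this also shows why depositor-approved access is fragile: it is the only level whose outcome depends on a specific person remaining available to approve requests.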

Summary

The take-home message of the workshop was that there is no one-size-fits-all solution when it comes to working responsibly with qualitative research data – contextual information is often key. That said, in all cases, transparent documentation of all the datasets and processes is very important, as is getting appropriate consent and ensuring adequate access controls to data.

Materials

Sebastian’s talk, as well as all the workshop materials, are publicly available:

Talk on Limits of Reproducibility: Strategies for Transparent Qualitative Research
Workshop on Managing Qualitative Data for Sharing and Transparency

Coding problems? Just pop over!

Launch of code walk-in consultations at TU Delft

Authors: Nicolas Dintzner, Kees den Heijer, Marta Teperek

On Wednesday 24th of January, the data stewards at TU Delft organised the first “code walk-in consultation” (the name might change in the future), hosted at the Faculty of Civil Engineering.

The main objective of this event was to provide support to researchers facing software and/or data processing related issues.  To this end, we gathered data stewards (Esther, Kees, Nicolas) and data champions (Joseph Weston, Victor Koppejan) and got ready for… whatever software issue troubled people on that day!

Several people turned up ranging from MSc students to a full professor (Mark van Koningsveld, one of our data champions). The participants came in with rather interesting and diverse problems. From data plots in Python, to Fortran compiler behavior, we had our hands full for a little while! Code was reviewed, some of it was compiled (more than once), tests were run and some participants saw their problems being solved on the spot, while others only got some ideas for resolutions.

Everything happened in a relaxed atmosphere. People came in and were greeted by a member of the team. They described their issue(s) and, based on this, we decided who among the stewards and champions had the most experience in that domain or was the most likely to be able to help. Then, we opened the laptop of the problem-giver and started hacking away.

Here are a few take-away points from this first session:

  • Bring-your-laptop is a great practice: having working code to play with is really valuable to get started quickly and get to the core problem
  • An external point of view is always useful: we did not manage to solve all issues, but at least, we provided some insights on what could be the possible causes and a course of action to move forward.
  • Minimum working examples are welcome: having a small size example of the issue at hand (when relevant) is quite useful to get to the core of the problem quickly. While not necessary for walk-in sessions (we’ll help you with what you have!), such test cases are useful when the error scenario involves remote code execution, or complex setups.

From a pure data stewardship perspective, such sessions are quite valuable as well. We get to see what researchers work on, what tools they use and what kinds of issues those tools bring. For instance, we had no idea that people were still working with Fortran 77 code.

So far, we received little feedback, but the little we have is quite encouraging:

Thank you! That is very helpful to see. I also really appreciated all your help this morning at the coding consultation.

So, we’ll keep organizing those code walk-ins, but most likely with a cooler name.  We will start to do so on a monthly basis.

In the meantime be aware that you can get in touch with your faculty data steward at any time for a bit of help regarding your software/data issues!


Data Stewardship at TU Delft – 2018 Report


Authors: Marta Teperek, Yasemin Turkyilmaz-van der Velden, Shalini Kurapati, Esther Plomp,  Heather Andrews, Robbert Eggermont

TU Delft has been leading the way in fostering a good research data management culture to uphold the quality, transparency and reproducibility of research. Since 2017, TU Delft has piloted the Data Stewardship programme with the aim to provide disciplinary specific data management support to TU Delft researchers. The focus on disciplinary support is motivated by the belief that in research data management (RDM), there are no one-size-fits-all solutions.

TU Delft has eight faculties with a wide range of research topics. In order to provide dedicated disciplinary support to researchers, a Data Steward was appointed at every faculty. Each Data Steward has a PhD degree in a research area relevant to the faculty.

This is a condensed 2018 annual report describing the progress, activities, achievements and future prospects of the project.


Team building and laying the groundwork for the programme

In 2017 the majority of work focused on the recruitment of Data Stewards at three faculties: Electrical Engineering, Mathematics and Computer Sciences (EEMCS), Aerospace Engineering (AE) and Civil Engineering and Geosciences (CEG), and laying the groundwork of the programme. In 2018 Data Stewards were appointed at the remaining faculties, which concluded the team building work and brought the programme to its full speed. Since the beginning of 2019, the team of Data Stewards is at its full capacity, with a dedicated Data Steward per faculty.

The Data Stewards meet weekly for training, information sessions, and knowledge and practice exchange. The weekly meetings focus on the RDM needs of TU Delft researchers and on keeping up to date with the most recent trends in RDM, such as the FAIR principles, the General Data Protection Regulation (GDPR), and research and software reproducibility. Dedicated experts from TU Delft, as well as from the national and international scene, are regularly invited to these meetings. Communication channels and information sharing spaces have also been created and are now effectively used by all team members. To increase the visibility of the programme and to openly share its progress, a Data Stewardship webpage and a dedicated section on the Open Working blog were launched. While the Data Stewards are embedded at each faculty, the Research Data Services (RDS) team operates centrally at the TU Delft library. To establish strong links between these two teams, a joint Away Day is organised once a year. Additionally, members of the RDS team attend the weekly Data Stewards meetings and participate in some of the joint projects and undertakings (e.g. the roll-out of a new data management plan template). In addition, connections with faculty secretaries were developed through dedicated meetings about Data Stewardship, hosted by the Library and attended by all faculty secretaries. All of these activities were overseen and coordinated by the Data Stewardship Coordinator, who is located at the TU Delft library.

Day to day activities of the Data Stewards

The role of the Data Steward at TU Delft is relatively new, so one of the first tasks of the Data Stewards was to become visible to researchers and gather intelligence on the type of support and advice researchers require within the faculty. In the first couple of months, Data Stewards engaged with researchers during faculty meetings, interviews, graduate school seminars, open science roadshows and by sending out a survey on the data management needs (see below for more details).

After researchers were sufficiently aware of the help they could receive, Data Stewards started receiving questions and requests for data management support. The requests varied across the 8 faculties, but there were a few common topics on which Data Stewards were regularly consulted, such as: advice on data management plans, information about data archiving options, data sharing possibilities, GDPR concerns, cross-border data transfers, commercially sensitive data, or data licensing.

Data stewards are also the linking pin to the broader TU Delft research support ecosystem.  Pragmatically speaking, Data Stewards act as general practitioners to all data related questions and issues. If there is a need for a specific intervention from a university wide legal, ethics or ICT specialist, Data Stewards know where to direct the researcher to get the most specific and useful answers.

In addition to advice and consultation, Data Stewards provide and/or facilitate on-request training and workshops on data management topics for researchers and PhD students. Agreements are made with faculty graduate schools to allocate credit points for participation.

At the moment all the Data Stewards are involved in leading the RDM policy development at their respective faculties.

Data Champions

Although embedding Data Stewards at each faculty is a prerequisite for creating awareness and achieving cultural change in RDM, community building efforts are essential to fully accomplish these goals. Additionally, it is impossible for a single Data Steward to have all the necessary disciplinary background to understand and support all types of research carried out in one faculty. Therefore the Data Champions programme was launched in September 2018.

Data Champions are researchers who voluntarily act as local community-based advocates for good data management and sharing practices. In return, they are provided with opportunities to showcase their activities during meetings at the department, faculty and TU Delft level as well as (inter)national conferences to offer increased impact and visibility. Additionally, the Data Champions are offered travel grants to join meetings and conferences to showcase their Data Champion activities, and trainings and workshops to learn new RDM skills to share with their local community members.

Suitable candidates for the programme are identified by faculty Data Stewards and are encouraged to become Data Champions. The general communication with the Data Champions is carried out by the Data Steward at the Faculty of Mechanical, Maritime and Materials Engineering (3mE), who took on the role of the Data Champions Community Manager. The first meeting to officially kick off the programme was on 14 December 2018. This meeting took place in an informal setting to encourage interactive discussions, knowledge exchange and networking. Overall, it was very well received by the Data Champions as well as the research support professionals.  As of December 2018 we already had 27 Data Champions (at least one Data Champion per faculty) and this number is still growing. The AE Faculty, as well as the Faculty of Technology, Policy and Management (TPM), already have at least one Data Champion at every department.

The Dean of the Faculty of Applied Sciences (AS) has recognised the importance of Data Champions for advocating for good data management and sharing practices and aims to also have at least one Data Champion per department. The AS faculty already has six Data Champions and two of them, Anton Akhmerov and Gary Steele, took the lead in creating a dedicated policy on Open Data for their department (Quantum Nanoscience). The importance of the Data Champions programme has been recognised also at a strategic level at TU Delft, evidenced by the wish of Prof. Rob Mudde, the Vice Rector Magnificus of TU Delft, to attend the next meeting of the Data Champions.

RDM Survey

To be able to offer dedicated RDM support, it is necessary to first define the problems and the needs of the researchers. Our survey on research data management needs, which was initiated in 2017 at three faculties (EEMCS, CEG and AE), has been extended and completed in three other faculties in 2018 (TPM, 3mE, AS). The survey gathered 680 responses in total and the data visualisation is publicly available. The survey provided important information on the state of data management practices at TU Delft. The survey will be repeated yearly and this way the results will serve as a benchmark to indicate the effects of the work of Data Stewards on data management awareness and practices at the faculties.

The joint presentation summarising survey results at LIBER conference in July 2018 by the Data Stewards from LT and 3mE faculties was very positively received by the community and downloaded 187 times. Based on this presentation, we got invited to submit a paper about the survey results to LIBER Quarterly. The survey will be run at the two remaining faculties (Architecture and the Built Environment – ABE, and Industrial Design Engineering – IDE) and re-run at the other faculties in 2019.

Data Stewardship in numbers

Summarising, in 2018 the Data Stewards have received at least 245 requests for help with data management (note that not all the requests are recorded, given that it involves manual copy-pasting of the requests received by emails). In addition, in 2018 Data Stewards conducted 68 dedicated interviews with researchers about their data management practices. Notably, the Data Steward at the AE Faculty has met with all the full professors at the faculty, which was positively received by TU Delft’s ex-Rector Magnificus Karel Luyben.

In addition, Data Stewards adhere to the principle “practice as you preach” and therefore share their work as openly as possible. In 2018 the team published 29 blog posts and other publications on the Open Working blog. Our top viewed blog post in 2018 is by the Data Steward at EEMCS, describing the results of the RDM survey (viewed 844 times).

Furthermore, the team attended 46 national and international conferences and meetings in 2018, including 33 occasions where Data Stewards presented as invited speakers or keynote speakers. The Data Steward from the 3mE Faculty was awarded the competitive Research Data Alliance Early Career Researcher Grant to attend the International Data Week 2018 conference in Botswana in November 2018. Again, in adherence with the openness principles, all presentations are publicly shared in a dedicated Data Stewardship at TU Delft community in Zenodo.

Data Stewardship event

On 24 May 2018 the team organised a dedicated event “Engaging researchers with research data – Data Stewardship in practice” to showcase the work of Data Stewards at TU Delft and to exchange views and practices on Data Stewardship with other universities. The event was attended by over 120 individuals (with 35% of the participants from countries other than the Netherlands). All participants judged the event as “good” or “excellent” and responses to open questions were overwhelmingly positive.

All the photos (taken by Jan van der Heul from the RDS team, our Chief Photographer), videos and presentations from the event are publicly available. In addition, three participants wrote blog posts with their reflections and take-home messages (Marjan Grootveld, Danny Kingsley and Martin Donnelly).

Projects

Data stewards have also been involved in many diverse projects. For example, the Data Stewards from the AE and CEG faculties took part in developing domain data protocols, which aim to provide researchers with disciplinary standards for data management in their research domains. The Data Stewards from the 3mE and AS faculties are part of the Electronic Lab Notebooks working group, which, following up on the successful Electronic Lab Notebooks event in March 2018, is now setting up a pilot to test Electronic Lab Notebooks at TU Delft in 2019.

Data stewards from the faculties of TPM, 3mE, AS and CEG have been involved in providing support for researchers working with software in order to improve code management practices and to make software more reproducible. Several workshops on software sustainability were organised, which resulted in a dedicated research paper that was accepted for presentation at the IEEE eScience 2018 conference and published in the conference proceedings. The preprint of this paper has already been downloaded 227 times.

These efforts eventually resulted in 4TU.Center for Research Data joining The Carpentries in December 2018. The Carpentries is a non-profit organization teaching foundational coding and data science skills to researchers worldwide. On 29 and 30 November, the first Software Carpentry workshop took place at TU Delft. The tickets sold out in a matter of days: we had around 30 researchers participating and another 45 on the waiting list, showing the huge interest in and need for such training. Two more Carpentry workshops will take place at TU Delft in 2019. In addition, the Data Steward from the CEG faculty took the lead in organising walk-in coding consultations for researchers wishing to get tailored support on their code management practices. Due to their success and the positive feedback from researchers, these consultations will continue to be organised on a regular basis. Moreover, a meeting with TU Delft researchers took place to discuss community building efforts for good programming practices. A representative from The Carpentries and a researcher from the University of Amsterdam were invited to this meeting so that we could learn lessons from their community building efforts.

Data Stewards have also been instrumental in driving forward the Open Science agenda. Dedicated Open Science roadshows (information sessions on research data management and on Open Access) have taken place at the AE, TPM, IDE and CEG faculties. In addition, the TPM faculty organised a dedicated workshop on Open Science for their PhD students. The presentation “Open Science in a nutshell: what’s in it for me?”, which was uploaded to Zenodo, has been downloaded 324 times and viewed 1,815 times.

In the current changing funding landscape, where researchers are expected to publish their papers and data openly, it is not feasible to base funding and promotion decisions on high impact journal publications alone. This is why the TPM Faculty was also actively involved in discussions about academic rewards and how to make open science count in academic careers. Prof. Bartel Van De Walle was the keynote speaker at the event on Open Science skills which was co-organised by the Data Stewards, 4TU.Centre for Research Data and the EOSCPilot. There were two separate blog posts highlighting the key aspects of the event (one blog post about the event as a whole and another one about the interactive workshop).

Following the principle that good data management should start as early as possible, the Data Steward from the AE Faculty piloted the use of Dataverse for keeping the research data of master students. Valuable and curated datasets can subsequently be published easily with 4TU.Center for Research Data.

Recognising the need for disciplinary support and for community building, Data Stewards from the ABE and IDE faculties identified the need for a Digital Humanities community at TU Delft and are currently in discussion with researchers across TU Delft to scope their interests and needs. A bottom-up approach is taken to encourage researchers to take the lead in forming their own communities and to exchange research ideas, resources and challenges. The first community-driven meeting will take place in early January 2019 at the ABE faculty.

GDPR has been in effect in Europe since 25 May 2018. In August, two events dedicated to GDPR and its implications for research data were co-organised by the Data Stewards and Research Data Netherlands. An important aspect of these two events was that representatives from multiple institutions and countries were present to talk about their individual approaches and considerations.

Policy Development

On 26 June 2018, the TU Delft Research Data Framework Policy was approved by TU Delft’s Executive Board. The Framework Policy is an overarching policy on research data management for TU Delft as a whole and it defines the roles and responsibilities at the University level. In addition, the Framework provides templates for faculty-specific data management policies. It is important to develop the faculty policies according to discipline specific RDM needs of the researchers, so they can use this policy as a roadmap for good RDM practices.

Currently, the deans and the faculty management teams, together with the Data Stewards, are busy with the development of faculty-specific policies on data management which will define faculty-level responsibilities. Any interested researcher and research supporter will be invited to give feedback and therefore contribute to the development of the faculty policy. In the AS and 3mE faculties, which have around 1000 researchers each, a single meeting would not be feasible; therefore the Data Stewards of these faculties will join the meetings of every individual department to introduce the policy and ask for feedback. The Data Champions are particularly encouraged to get involved in the development of the policy in their faculties in order to fine-tune the policy based on their disciplinary needs.

Future Prospects

As can be seen in this report, 2018 has been a very fruitful year for the TU Delft Data Stewardship programme and, with a full team of Data Stewards from the beginning of 2019, we expect 2019 to be even more productive. The faculty policies are expected to be rolled out and published in 2019. Because one of the requirements of the policy is that all PhD candidates starting from 2019 attend data management training, the Data Stewards are currently developing dedicated training suited to the disciplinary needs of the PhD candidates. For this, the Data Stewards are in close contact with the central and faculty graduate schools, PhD councils and colleagues from TU Delft Library.

We already have three events planned in 2019: a seminar titled Limits of Reproducibility: Strategies for Transparent Qualitative Research, followed by a hands-on workshop on Managing Qualitative Data for Sharing and Transparency, on 28 January; the kick-off of the open science seminars on 27 February; and a seminar on publishing reproducible research on 16 May.

Additionally, we will also have a one-day event for all TU Delft’s Data Champions, one workshop on working with software and High Performance Computing (HPC), a conference on collaboration with industry and open science, and two more Software Carpentry workshops.

In addition, a dedicated blog post about our plans for 2019 is going to be published soon, so watch this space!

Researcher Engagement with Data Management – What Works?


Written by: Maria Cruz and Julien Colomb


A new RDA project, under the umbrella of the Libraries for Research Interest Group and supported by 29 volunteers from three continents, seeks to collect case studies from organisations around the world on how to engage researchers with research data management.

Collectively, our group have put together a survey, now open for contributions, which allows participants to share their stories and approaches for increasing engagement with research data management among researchers. The results from this survey, including the data, will be shared widely with the community in the form of an open book. The goal is to assemble a wealth of information and resources that can be used by institutions to select the methods that are most suitable for their settings.

The importance of research data management has been well emphasized over the last few years, particularly by research funding agencies, universities, and other research and academic institutions. However, the discussions around this topic have often been led by librarians and data professionals, and researcher engagement has been largely limited to those researchers who are already interested in the topic. In order to achieve global cultural change in data management, researchers need to be motivated and properly recognised for good data stewardship efforts. This is not an easy task.

Many organisations have developed dedicated programmes aiming at greater researcher engagement with research data. Examples include the Data Champions initiative at the University of Cambridge, Data Conversations at the University of Lancaster, the Data Stewardship programme at TU Delft, and the Open Data Champions initiative of SPARC Europe. In addition, some institutions, such as the University Medical Centre Utrecht and the Berlin Institute of Health, decided to change the way in which researchers are rewarded.

However, do we know how successful these programmes are in achieving cultural change? And what about their costs and benefits? Are some programmes more suitable than others for certain types of institutions? Are there other strategies out there that achieve similar results with less effort? These are some of the questions this project is trying to address.

Research data management professionals spend a considerable amount of their time doing outreach, teaching, and otherwise engaging with researchers about research data management. Understanding what we can learn from each other and how to exchange practices more effectively are two very important goals of the project.  

The case study collection, review and editing are being led by Iza Witkowska, a Data Consultant from the University of Utrecht in the Netherlands, together with Andrea Medina-Smith from the USA and Elli Papadopoulou from Greece. They are supported in these tasks by 15 enthusiastic volunteers. The first project update will be presented at the RDA Thirteenth Plenary Meeting in Philadelphia in April 2019.

This blog post is distributed under a CC-BY 4.0 licence.


RDA Researcher Engagement Project

Steering Board

Lauren Cadwallader, Julien Colomb, University of Jena, Maria Cruz, Mary Donaldson, Lambert Heller, Rosie Higman, Elli Papadopoulou, Vanessa Proudman, James Savage, Marta Teperek

Project group members

Helene N. Andreassen, Daniel Bangert, Miriam Braskova, Lauren Cadwallader, John Chodacki, Julien Colomb, Philipp Conzett, Maria Cruz, Mary Donaldson, Biswanath Dutta, Esther Fernandez, Joshua Finnell, Raman Ganguly, Patricia Henning, Amy Hodge, Stein Høydalsvik, Greg Janée, Lynda Kellam, Gabor Kismihok, Iryna Kuchma, Narendra Kumar Bhoi, Young-Joo Lee, Leif Longva, Andrea Medina-Smith, Solomon Mekonnen, Remedios Melero, Rising Osazuwa, Elli Papadopoulou, Fernanda Peset, Josiline Phiri, Piyachat Ratana, Gerry Ryder, James Savage, Souleymane Sogoba, Magdalena Szuflita-Żurawska, Ralf Toepfer, Ellen Verbakel, Irena Vipavc Brvar, Jacquelynne Waldron, Anna Wałek, Yan Wang, Iza Witkowska, Joanne Yeomans

Data Champion kick off meeting

On the 14th of December 2018 the Data Champion Kick off Meeting took place at TU Delft. After an update on the Data Champion programme by Yasemin Turkyilmaz-van der Velden, the Data Champion Community Manager, the Data Champions took charge of the meeting by presenting their research, focussing on their research data management practices. In between these presentations there was room for networking activities, to stimulate interaction between Data Champions across the different departments and faculties. Here the researchers learned from each other and gained new insights for their own research data management.

Data Champions

Data Champions are researchers who practise and advocate good research data management and share their experiences and tips with their group/department members. Data Champions can help their Faculty’s Data Steward with the discipline-specific practices of Research Data Management. In return, the Data Champion programme offers (international) network and funding opportunities, trainings and workshops, and increased visibility to researchers. Being a Data Champion is a chance to be recognised for your leading role in research data management in your department and faculty. For more information on each Data Champion, or information on how to become a Data Champion, please visit the Data Champion page.

Mark tweeting on the Data Champion meeting at TU Delft.

Data and code in waterworks research – talk by Mark van Koningsveld

Mark van Koningsveld, Data Champion of the Faculty of Civil Engineering and Geosciences

Mark van Koningsveld, Data Champion of the Faculty of Civil Engineering and Geosciences and chair of the section Ports and Waterways, presented his research on waterworks in the Netherlands. He examines the spatial planning of regions where logistical change in the use of waterways is foreseen. He presented an example of his research on the water network of Amsterdam, which helps to understand how the traffic network would change if certain canals were closed off for construction work. Mark designed software to see how the planning of the reconstruction work would affect the water traffic. The interoperable nature of the software allows it to be re-used for other case studies, such as the short term effects of the drought that affected the Dutch water networks this summer, or the effects of long term climate change on the logistics of waterworks.
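To make the idea of testing a canal closure in software more concrete, here is a minimal sketch. It is not Mark's software: the toy network, the node names and the travel times are all invented for illustration, and the route search is plain Dijkstra's algorithm chosen for brevity.

```python
import heapq


def shortest_travel_time(network, start, goal, closed=frozenset()):
    """Return the shortest travel time from start to goal, or None if unreachable.

    network: dict mapping (node_a, node_b) -> travel time in minutes (two-way canals)
    closed:  set of (node_a, node_b) canal segments closed for construction
    """
    # Build an adjacency list, leaving out the closed segments.
    adjacency = {}
    for (a, b), minutes in network.items():
        if (a, b) in closed or (b, a) in closed:
            continue
        adjacency.setdefault(a, []).append((b, minutes))
        adjacency.setdefault(b, []).append((a, minutes))

    # Dijkstra's algorithm with a priority queue of (elapsed time, node).
    queue = [(0, start)]
    best = {start: 0}
    while queue:
        elapsed, node = heapq.heappop(queue)
        if node == goal:
            return elapsed
        if elapsed > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, minutes in adjacency.get(node, []):
            candidate = elapsed + minutes
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return None


# Hypothetical toy network: four junctions, travel times in minutes.
toy_network = {
    ("A", "B"): 10,
    ("B", "D"): 10,
    ("A", "C"): 15,
    ("C", "D"): 20,
}

baseline = shortest_travel_time(toy_network, "A", "D")
detour = shortest_travel_time(toy_network, "A", "D", closed={("B", "D")})
print(f"baseline: {baseline} min, with B-D closed: {detour} min")
```

Changing the `closed` set is enough to test another construction plan, which is the kind of quick feedback a parametric approach is meant to give.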

The integrated and interoperable approach that the software offers contrasts with the field’s traditionally linear approach, which can be lengthy and slow. Software enables a parametric design approach, allowing for immediate feedback on multiple aspects under study, such as production costs, different types of vessels and traffic load on the network. To move the research field forward Mark thinks that software should play a more important role, which is why he set up the Ports and Waterways coding lab for his students. He encourages other researchers to set up their own coding labs, as the facilities for this are available at TU Delft. Mark thinks that there is still a lot to be gained from collaborative research within faculties and across the faculties. These unlikely interactions will result in new ways of problem solving.

Time to share is now! Talk by Anton Akhmerov

Anton Akhmerov, Data Champion at the Faculty of Applied Sciences

Anton Akhmerov, Data Champion at the Faculty of Applied Sciences in the Department of Quantum Nanoscience, continued the discussion on the use of software in research. Currently, papers, data and software are evaluated differently by the research community. Publications are the most visible output of research, followed by research data, and then finally there is software. Software is a difficult concept to grasp, as not every researcher knows how coding works or how to properly apply coding skills in daily research practices. This gap in knowledge is further complicated by the current policy of TU Delft, which requires researchers to fill out an invention disclosure form before publishing code or software openly on GitHub. Anton argues that training on how to properly use platforms like GitHub should be available to all PhD students and postdocs, for example through the software carpentry workshop mentioned above and the programming course Anton organises, but also through the TU Delft Graduate School programme. He thinks researchers should work together to discuss their problems, learn from each other and familiarise themselves with the available tools.

Gary Steele, another Data Champion at the Faculty of Applied Sciences

Anton and Gary Steele, another Data Champion at the Faculty of Applied Sciences, worked on an open data policy for the Department of Quantum Nanoscience, in which they defined two levels of data sharing: level 0 and level 1. Level 0 means one uploads the numerical data as shown in the figures in a format that is readable to others. Since the data is already used to generate the figure anyway, this should not cost the researcher any additional time. Level 1 is the publication of the raw data and scripts that underlie the data processing chain which produced the published data. An example of level 1 data is available in this publication, where all the raw data (Python scripts and notebooks) is available in Zenodo, a research repository. Level 1 does not mean that all the data collected during the research should be made available. Gary argues that people should think about and discuss what data should be shared, as researchers can get stuck in ‘molasses of useless documentation’ if everything is required to be shared. Anton argues that there is a lot of room in between the current practice of keeping data to ourselves and a possible future where all the data could be shared. Anton thinks that the time to start sharing is now, as pressure from the funding agencies to share research data is starting to build up. To see if the policy has an effect, Gary and Anton will monitor the latest publications and offer support to their colleagues if required.
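As an illustration of what level 0 sharing could look like in practice, the sketch below writes the numbers behind a figure to a plain CSV file that others can open without special software. It is an assumption-laden example, not part of the department's policy: the helper names, the file name and the column names are hypothetical, and CSV is just one possible readable format.

```python
import csv
import os
import tempfile


def save_figure_data(path, columns):
    """Write the values plotted in a figure to a CSV file.

    columns: dict mapping a column header to a list of numbers;
             all lists must have the same length.
    """
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same number of values")
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(list(columns))            # header row
        writer.writerows(zip(*columns.values()))  # one row per data point


def load_figure_data(path):
    """Read the CSV back, returning (headers, rows of floats)."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        headers = next(reader)
        rows = [[float(cell) for cell in row] for row in reader]
    return headers, rows


# Hypothetical data for one figure panel (names and values invented).
figure_columns = {
    "gate_voltage_V": [0.0, 0.1, 0.2],
    "conductance_uS": [1.5, 1.7, 2.1],
}

# Written to a temporary directory here so the example runs anywhere.
output_path = os.path.join(tempfile.mkdtemp(), "figure_2a.csv")
save_figure_data(output_path, figure_columns)
headers, rows = load_figure_data(output_path)
print(headers, rows)
```

Putting units in the column headers, as in this made-up example, is one simple way to keep the file understandable without extra documentation.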

Going beyond data

The important role of software and coding in research was a recurrent topic of the Data Champion Kick Off meeting. To support researchers in working with code and software, TU Delft hosted its first software carpentry workshop on 29 November 2018 and will organise more in the future. Additionally, Data Steward Kees den Heijer will organise code consultancy walk-in hours for researchers to support them with their coding and software needs. The first session will take place on the 24th of January from 9:00-11:00 in CEG 2.66 and is open to researchers of all the faculties.

Data Champions networking at the Data Champion Kick Off Meeting

Future of the Data Champion programme

The Data Champions who attended the kick off meeting were enthusiastic about its informal and interactive setting, which allowed for interesting discussions and the generation of new ideas, and provided networking opportunities. For future meetings they would like case study specific sessions to discuss common problems, perhaps even in the form of regular common interest meetings. They were also interested in meeting representatives from the various support services of TU Delft, such as the Data Protection Officer, and they were looking forward to Rob Mudde, member of the TU Delft Executive Board, attending the next Data Champion meeting. The next Data Champion meeting is planned for spring 2019 and the Data Champion network is still open to new members.

Many precious first time experiences, thanks to RDA Europe Early Career Programme

Yasemin Turkyilmaz-van der Velden and Marta Teperek had the opportunity to represent TU Delft at the International Data Week 2018 in Gaborone, Botswana. Yasemin was awarded the RDA Europe Early Career grant to attend the conference and wrote a blog post about her experiences, which was originally published on RDA’s website and can be found below.


I am a Data Steward at the TU Delft Faculties of Applied Sciences, and Mechanical, Maritime and Materials Engineering. At the same time, I am a PhD candidate at Erasmus MC Rotterdam, writing my thesis on UV-induced DNA damage repair in mammalian cells. I have worked as a Data Steward since March 2018 and it has been a very enjoyable experience to join TU Delft as well as the Open Science and FAIR data community during the last 8 months.

This was my first RDA plenary and SciDataCon conference, and my first time in Africa. To be honest, I did not know much about either Botswana or RDA and SciDataCon, and I can say that it has been a very pleasant experience to observe the beautiful nature and culture of Botswana and join RDA and SciDataCon. To comment a bit more on the former, although I could not visit the Okavango Delta, which I had earlier watched in such amazement in Planet Earth II, it was such a nice experience to visit the Mokolodi Nature Reserve and Gaborone Game Reserve and get the chance to see giraffes, zebras, rhinos, warthogs and others which I can normally only see in zoos or documentaries (and Lion King of course). I was also impressed with the friendliness of the local people, which included taxi drivers turning into local guides.

The rhino mother and the baby, and the curious giraffe

To comment on the latter, I was really impressed with the RDA, WDS and CODATA communities and how everyone was so knowledgeable and at the same time so friendly and willing to help and collaborate. Being one of the 8 TU Delft Data Stewards and a member of the TU Delft Research Data Management (RDM) team of around 15 people (not even mentioning the relatively high number of RDM experts within the Netherlands), I should probably be one of the last people to complain about not having enough people around me to discuss RDM. Yet, it was so nice to be in an environment with so many RDM experts (or early career professionals like me) from all over the world, joining Interest Group (IG) and Working Group (WG) meetings with them, having interactive discussions during the sessions and continuing with the fruitful discussions during the coffee breaks, lunches, dinners or drinks by the pool. It was also a very nice opportunity to meet colleagues from Botswana and other African countries and hear about their experiences.

The local hosts thought of many nice details to give the IDW 2018 a Botswanan touch. It was clear that IDW 2018 was taken seriously in Gaborone. The opening ceremony included the National Anthem and a speech by the President of the Republic of Botswana. Entertainment was not left out: the participants could enjoy shows by the Traditional troupe and the Marimba Band.

The President of the Republic of Botswana and the cheerful Marimba Band

Being an RDA Europe grant winner, I was assigned to take part in the Joint Meetings of WG FAIRSharing Registry and Data Policy Standardisation and Implementation IG and IG Health Data, IG Ethics and Social Aspects of Data, WG Blockchain Applications in Health. My tasks included note taking and helping organizers with their activities during the meetings. I was assigned to groups where my interests lie and I think that it was a great opportunity to get actively engaged in the activities of these IGs and WGs. I would like to thank the meeting organizers for being so friendly and welcoming.

Additionally, I was encouraged to present a poster during the poster session, which I always see as a great way to engage informally and interactively with participants during meetings and conferences. I presented the poster titled “Data Stewardship at Delft University of Technology”, which I prepared together with my TU Delft Data Steward colleague Yan Wang, reusing the materials generated by other TU Delft Data Steward colleagues. The poster can be found in Zenodo and is available for reuse with a CC BY 4.0 license.

I also gave an oral presentation during the SciDataCon session “Motivations and recognition for good data stewardship”. My presentation was based on the abstract written together with my colleagues Maria Cruz and Marta Teperek, and this abstract got us invited to submit a related paper to the Data Science Journal Special Collection for SciDataCon 2018. The presentation can also be found in Zenodo with a CC BY 4.0 license.

To give a bit more insight into this session, it was proposed and chaired by Marta Teperek, and it was about different efforts and approaches taken in different institutes to engage with researchers and achieve cultural change. The session started with the presentation of Rosie Higman from Manchester University titled “Stewards, Champions or Advisors? An overview of institutional Research Data Management support structures”, which showcased her impressive work comparing different approaches taken in different institutions. I see her work as an invaluable resource, especially for all those who are still not sure where to start and which approach to follow. Then I continued with my presentation “Data Stewardship at Delft University of Technology” and it was followed by the talk of James Savage, “Establishing, developing, and sustaining a community of Data Champions”. James Savage is a researcher and a Data Champion at the University of Cambridge. Although both Rosie and I had already talked about Data Champions, having a real Data Champion in the room convinced the audience that Data Champions are not some imaginary characters that the research data supporters propose, but that they are really willing to become advocates for good RDM practices in their research groups and departments to achieve the desired cultural change. After James, Raman Ganguly from the University of Vienna gave a presentation titled “Building sustainable networks for data management” and talked about their approach, in which they did not have any Data Stewards or Champions.

Finally, I was also encouraged by the RDA Europe Early Career Programme to join the IG Early Career and Engagement session. The aim of this IG is to give early career researchers and professionals the opportunity to network among themselves and receive mentorship from senior RDA members. In the current RDM landscape, many things are changing and new things are regularly being introduced. Considering my role as a research supporter, I can only be helpful to the researchers if I keep myself updated with all these changes. Again, I have been lucky from the beginning to join the TU Delft RDM team and especially to have such an experienced person as Marta Teperek (who is a mentor for this IG), who was always there for all my questions. But I can imagine that not everyone is as lucky as me and therefore I see great benefit in this IG for many early career researchers and professionals.

I would like to end my post with the monkeys enjoying the coffee table on the last day of the conference and I would like to present my sincere thanks to RDA Europe Early Career Programme for giving me the opportunity to join IDW 2018. I would also like to thank Marta Teperek, the Data Stewardship Coordinator, and André Groenhof and Marja van den Bergh, the executive faculty secretaries of Mechanical, Maritime & Materials Engineering, and Applied Sciences, for being ever so supportive and encouraging.


4TU.Centre for Research Data partners with The Carpentries: Impressions from the first workshop at TU Delft

Written by Shalini Kurapati and Marta Teperek


Training needs: research computing skills for open science

In addition to good data management, software sustainability is important for open science.

According to the survey conducted by the Software Sustainability Institute in 2014, 7 out of 10 researchers rely on code for their research. Sharing research data without the supporting code often makes research impossible to reproduce. Good documentation and version control have been highlighted as major contributors to sustainable software. In addition, earlier workshops and survey results indicated that researchers need training on good code writing, code management practices and version control.

Similarly, a TU Delft-wide survey on data management needs revealed that 32% of researchers were interested in training on version control and 18% specifically in software carpentry workshops.

Thus, 4TU.Centre for Research Data made a strategic decision to partner with The Carpentries and became a Silver Member of the organisation.

What are The Carpentries?

The Carpentries “teach foundational coding, and data science skills to researchers worldwide.” It is a community-based organisation, which maintains and develops curricula for three different types of workshops: software carpentry, data carpentry, and library carpentry. Detailed and structured lesson plans are available on GitHub and they are delivered by a network of carpentry instructors.

An important element of The Carpentries is that in order to deliver a workshop, instructors need to be certified. The certification process puts a particular emphasis on the pedagogical skills of the instructors.

First software carpentry at TU Delft

TU Delft hosted the first software carpentry workshop on 29 November 2018 as a pilot before officially joining The Carpentries. We had around 30 researchers participating (and another 45 on the waiting list!). The participants were from four faculties at TU Delft: Civil Engineering and Geosciences, Applied Sciences, Technology Policy & Management, and Architecture and Built Environment. We had three instructors and four helpers in the room.

The GitHub pages with the lesson materials are publicly available at https://mariekedirk.github.io/2018-11-29-Delft/. All participants were asked to bring their laptops along and to install some specific software. No prior programming knowledge was required. Collaborative notes were taken with Etherpad.

During the workshop, participants downloaded a prepared dataset and worked with that dataset through the two days. They learnt task automation using the Unix shell, version control using Git, and Python programming using Jupyter notebooks.

Feedback

The Carpentries have a special way of organising feedback. Participants receive red and green post-it notes and use them to indicate problems / completion of tasks during the whole course. Similarly, after the end of each day, the participants are asked to indicate all the plus sides and negatives of the workshop on green and red post-it notes, respectively.

The feedback from the participants after the workshop helped us evaluate the training. The participants were overwhelmingly appreciative of the instructors and helpers and seem to have enjoyed the training. Some of the participants felt that the pace of the workshop was fast and they did not have time to experiment with the data set. Some others wished to get a more personal approach and to actually get an opportunity to work with their own disciplinary datasets.

Plans for the future

The waiting list for the workshop was very long and we had to disappoint more than 45 researchers who didn’t manage to get their spot on the day. In addition, faculty graduate schools have been willing to give course credits for PhD students who attend this workshop, which made the course even more attractive to attend for PhD students. Therefore, to meet the demand, we are planning to organise four more workshops in 2019: two workshops at TU Delft, one in Eindhoven and one in Twente. We will continue to monitor the number of interested researchers and if the need arises, we might consider scheduling some additional courses.

In addition, to increase our capacity in delivering carpentry training, some of TU Delft’s data stewards and data champions will attend the training to become instructors. We hope to have this instructor training organised in April.

To address the feedback about the pace of the course, we will be more selective and include fewer exercises in our future workshops to ensure that the participants get the chance to experiment and play with their datasets and scripts.

To provide more tailored support to researchers who have started to code but need some additional help to make it work, or who have attended a carpentry workshop but are not sure how to put what they learnt into practice, we will host dedicated walk-in coding consultation hours starting in January 2019.

So… watch out for the next carpentry workshop – scheduled for Spring 2019!