Written by Shalini Kurapati and Marta Teperek
Training needs: research computing skills for open science
In addition to good data management, software sustainability is important for open science.
According to a survey conducted by the Software Sustainability Institute in 2014, 7 out of 10 researchers rely on code for their research. Sharing research data without the supporting code often makes research impossible to reproduce. Good documentation and version control have been highlighted as major contributors to sustainable software. In addition, earlier workshops and survey results indicated that researchers need training on good code writing, code management practices and version control.
Similarly, a TU Delft-wide survey on data management needs revealed that 32% of researchers were interested in training on version control and 18% specifically in software carpentry workshops.
What are The Carpentries?
The Carpentries “teach foundational coding, and data science skills to researchers worldwide.” It is a community-based organisation that maintains and develops curricula for three different types of workshops: software carpentry, data carpentry, and library carpentry. Detailed and structured lesson plans are available on GitHub, and the workshops are delivered by a network of carpentry instructors.
An important element of The Carpentries is that in order to deliver a workshop, instructors need to be certified. The certification process puts a particular emphasis on the pedagogical skills of the instructors.
First software carpentry at TU Delft
TU Delft hosted the first software carpentry workshop on 29 November 2018 as a pilot before officially joining The Carpentries. We had around 30 researchers participating (and another 45 on the waiting list!). The participants were from four faculties at TU Delft: Civil Engineering and Geosciences, Applied Sciences, Technology Policy & Management, and Architecture and Built Environment. We had three instructors and four helpers in the room.
The lesson materials are publicly available on GitHub Pages (https://mariekedirk.github.io/2018-11-29-Delft/). All participants were asked to bring their laptops along and to install some specific software. No prior programming knowledge was required. Collaborative notes were taken with Etherpad.
During the workshop, participants downloaded a prepared dataset and worked with it throughout the two days. They learnt task automation using the Unix shell, version control using Git, and Python programming using Jupyter notebooks.
The Carpentries have a special way of organising feedback. Participants receive red and green post-it notes and use them to indicate problems / completion of tasks during the whole course. Similarly, after the end of each day, the participants are asked to indicate all the plus sides and negatives of the workshop on green and red post-it notes, respectively.
The feedback from the participants after the workshop helped us evaluate the training. The participants were overwhelmingly appreciative of the instructors and helpers and seemed to have enjoyed the training. Some of the participants felt that the pace of the workshop was fast and that they did not have time to experiment with the dataset. Others wished for a more personal approach and an opportunity to work with their own disciplinary datasets.
Plans for the future
The waiting list for the workshop was very long and we had to disappoint more than 45 researchers who didn’t manage to get a spot on the day. In addition, faculty graduate schools have been willing to give course credits to PhD students who attend this workshop, which made the course even more attractive to them. Therefore, to meet the demand, we are planning to organise four more workshops in 2019: two workshops at TU Delft, one in Eindhoven and one in Twente. We will continue to monitor the number of interested researchers and, if the need arises, we might consider scheduling some additional courses.
In addition, to increase our capacity in delivering carpentry training, some of TU Delft’s data stewards and data champions will attend the training to become instructors. We hope to have this instructor training organised in April.
To address the feedback about the pace of the course, we will be more selective and include fewer exercises in our future workshops to ensure that the participants get the chance to experiment and play with their datasets and scripts.
To provide more tailored support to researchers who have started to code but need some additional help to make it work, or who have attended a carpentry workshop but are not sure how to put what they learnt into practice, we will host dedicated walk-in coding consultation hours starting in January 2019.
So… watch out for the next carpentry workshop – scheduled for Spring 2019!
Post by Amit Gal, Alastair Dunning and Nicole Will
The research organisation Ithaka S+R recently issued the report “Scholars ARE Collectors: A Proposal for Re-thinking Research Support”. The report takes a user-centred approach to understanding how researchers could best be supported in the future, and outlines possible places to invest.
It makes the case that researchers are, in fact, collectors, and that their (often massive) collections vary widely in form across different disciplines. These collections, however, are not properly managed, which is quite understandable, as “collecting” requires a different set of skills and tools than “researching”.
From the context of our own research support services at TU Delft, we made some specific observations:
1. The report has a great focus on the right point of view – the user’s point of view. If we at TU Delft want to support the researcher better, we must understand her better. That means more than just knowing what she does, it means having an empathetic understanding of why she does it and who she is. Understanding is more than just talking.
2. The report points to four different stakeholders that support scholarly collecting – funders, open data advocacy groups, external tool and service providers, and academic institutions. It might be useful to realize that we, as the TU Delft library, represent two of these stakeholders – we are the academic institution, naturally, but we are also the FAIR data advocacy group. Is it possible that these two sometimes clash? Could one role impede the other, and if so – how should we address it?
3. The journey of understanding our users better, improving our services and creating new, better ones – is a journey we cannot be taking on our own. At the very least, ICT and the researcher groups must be partners here. So we should get better at collaborating with these, and other, parties around us.
4. Some of the language (e.g. ‘scholars’, ‘personal collections’) and evidence here is drawn from the humanities and doesn’t feel right in the context of a technical university. The report misses some of the language and developments occurring in a technical university (e.g. there is no mention of data science, data stewards etc., and the importance of writing code or running simulations is underplayed).
5. Our instinct is that scientists (as opposed to humanities scholars) have fewer ‘personal collections’ and more ‘group collections’. For example, a team gets access to data, or a department collects data, or a consortium writes a proposal, or a group writes a paper. While individual roles always play a part, access to these different outputs is managed at a team level.
6. Many of the key points are similar to what we know here at TU Delft, e.g. about fear of being scooped or the time taken to document data. The metaphor of collection is also important, as it emphasises the emotional ownership scientists feel about their outputs.
7. The conclusion on the final page is definitely worth holding on to: how do we (and by that I mean not just the library but all the relevant support services) offer the kind of support the researcher needs throughout her workflow (not just at the start and end)? The goal is not Open Science per se, but getting to Open Science by responding to specific user needs.
Authors: Heather Andrews, Maria Cruz, Angus Whyte, Yasemin Turkyilmaz-van der Velden, Shalini Kurapati
To read Part 1 of the blog post follow this link.
Researchers at all levels should be equipped with skills relevant to open science and FAIR data, and the practice of these skills should be effectively rewarded and recognised. This much is clear and has been recently highlighted in the “Turning FAIR into reality” Report of the European Commission FAIR Data Expert Group. Indeed, that report states “there is an urgent need to develop skills in relation to FAIR data” and “metrics and indicators for research contributions need to be reconsidered and enriched to ensure they act as compelling incentives for Open Science and FAIR.”
To change the academic rewards system, it is necessary to define and agree upon the skills researchers need to have at different stages of their careers. This was the goal of the workshop “It’s time for open science skills to count in academic careers” held at TU Delft on 26 September 2018. This post forms the second part of the report of the workshop. Here we present the outcomes of the workshop, based on the interactive group work described below. The results of this work will be applied in EOSCpilot, which is laying the groundwork for skills development in the European Commission’s European Open Science Cloud.
Overview of the hands-on workshop
The participants were divided into four groups. Each group focused on a specific career level according to the European Commission’s framework for research careers:
- R1: First Stage Researcher (up to the point of PhD)
- R2: Recognized Researcher (PhD holders or equivalent who are not yet fully independent)
- R3: Established Researcher (researchers who have developed a level of independence)
- R4: Leading Researcher (researchers leading their research area or field)
The above-mentioned groups were led by Yasemin Turkyilmaz, Ellen Verbakel, Maria Cruz and Alastair Dunning, respectively. The registered participants of the one-day event included researchers from all career levels, librarians, data stewards, and policymakers. However, R4 researchers were unavailable for the hands-on workshop.
Each group received a list of nine pre-defined open science skills together with a detailed explanation of each skill. The group activity was divided into two parts. In the first part, the groups were asked to shortlist a maximum of 4 skills most relevant to their respective career level (R1-R4). Subsequently, for each skill, they wrote down on post-it notes: 1. Why is this skill relevant to researchers at this career level? 2. What would be the evidence that researchers at this career level have this skill and can apply it in practice? (in other words: what does a person applying the skill do?) 3. What support (from support staff, service providers) will researchers need to apply this skill?
After displaying their ideas on whiteboards, the groups were given 5 minutes each to present a summary of their main findings to the other groups. This was followed by a short discussion with questions from all the groups. Below is the summary of the findings of each of the four groups.
R1 group activity summary
The group focused on early career researchers. This group of researchers is dependent on researchers in higher positions on the academic ladder. At the same time, this group of researchers is often perceived as being the ‘bold’ ones, able to take initiatives and amenable to change. It was also recognised that R1 researchers can benefit greatly from more training and more information on open science possibilities. The four skills shortlisted by this group were:
1) Adherence to the FAIR data and code principles during and after research. The group added the term ‘code’ to this skill as well. The group reasoned that the FAIR principles represent what is necessary to have sustainable sharing and archiving of both data and code, which is important for verifiability of research. The group recognised that in order to adhere to the FAIR principles, early career researchers have to receive training as an inherent part of the curriculum. They also stressed that good supervision and getting good examples would facilitate this skill.
2) Securing funding for open science/support. The group reasoned that receiving specific training on awareness of funding opportunities for Open Science (e.g. funding for Open Access publishing) as well as preparation to secure such grants would support early career researchers to be more independent from their respective supervisors and practise open science more freely. They proposed that this could be achieved in collaboration with grant support offices and graduate schools.
3) Awareness and adherence to relevant ethical and legal policies. This skill was found to be important to overcome fear and uncertainties about ethical and legal requirements. This can make early career researchers more confident when discussing with senior colleagues the parameters these requirements set around sharing their research outputs. The group proposed Q&A catalogues which could help early career researchers better understand complex terms such as codes of conduct, legal terms, informed consent etc.
4) Recognizing and acknowledging the contribution of others. The group found it important for early career researchers to know how to properly cite data, code and methods; and how to acknowledge collaborators, technicians in the lab, etc. The group argued that if people are properly acknowledged, then they are more willing to contribute again, which is a great source of motivation for early career researchers. They expect University and Faculty policies and training to be instrumental for this skill. These should also promote the standards for using persistent identifiers, and the CRediT taxonomy for acknowledging who has contributed what to a publication.
R2 group activity summary
This group saw R2 researchers as typically those at the postdoctoral level. Postdocs were seen as researchers who are neither in charge of funding nor leading projects, unlike R3 and R4 researchers. They were seen as researchers focused on making effective collaborations and working on building up their reputation. The four skills proposed were:
1) Recognising and acknowledging the contribution of others. Recognition was perceived as the main driver for postdoctoral researchers, as well as for researchers working with postdocs. The group thought that researchers need a policy framework that enforces proper recognition, and need training on how to get recognition (e.g. setting up and using ORCID).
2) Making use of open data from others. The group considered this skill as important for verifiability, and as an effective way to start collaborations with other researchers. However, to put this skill into practice, researchers need to know how to search for datasets and assess their quality. Support staff should define the standards for high quality open data, provide support in data curation and give training to researchers on best open data practices.
3) Adherence to good code management practices. This skill was also considered important for verifiability purposes and to stimulate others to reuse the code. It is seen as a quality stamp for the respective code creator, which improves the researcher’s reputation. In order to get this skill, researchers would benefit from training on version control and on writing proper code documentation. It was also suggested that researchers could learn more about these matters from research software engineers.
4) Using or developing research tools open for reuse by others. The group felt that standardisation is crucial to enable effective collaborations. Thus, researchers should receive training on the use of platforms such as GitHub and training on metadata standards.
R3 group activity summary
This group proposed 3 skills, unlike the other groups, which used the maximum of 4 skills set by the workshop organizing committee. This was largely because it was felt that the skill ‘being a role model in practising open science’ would by definition cover most of the practical skills on the list. The group considered this to be the most important skill for R3 researchers. These researchers are already established in their careers and their fields (already developing and leading some projects), but are still very much involved in the day-to-day practice of research (e.g. they still acquire data or write software). As such, they can lead by example and influence not only earlier career researchers, particularly those they directly supervise, but also more senior colleagues. R3 researchers are very aware of the obstacles which researchers encounter when trying to change how research groups work. The proposed skills were the following:
1) Being a role model in practising open science. Stage 3 researchers were seen as researchers who are still very active in research, but who also have a close relation with senior colleagues, the researchers in higher positions. This gives them the opportunity to establish how researchers are evaluated within their team. Stage 3 researchers have an influential role within their team, e.g. in the hiring and promotion decisions within the team. In order to do this, it was acknowledged that stage 3 researchers need support from the R4 researchers. This was a key point of discussion, because even though stage 3 researchers are seen as big influencers in the academic ladder, they still depend on the stage 4 researchers and funding committees. In relation to this skill, it was also felt that R3 researchers should not only lead by example in the way they practise open science, but should also directly influence others by speaking about it. In short, practise what they preach, and preach what they practise.
2) Securing funding for open science/support. Stage 3 researchers are involved in hiring people and applying for funding. When applying for funding for example, they should be explicit about how open science will be carried out throughout the project. When hiring, they should include open science requirements in the hiring criteria. The group also recognised that for this to happen effectively, funders need to be willing to provide funding to pay for the costs of open science activities associated with projects, and research grant offices need to advise researchers on how to include these costs in their grant applications.
3) Recognizing and acknowledging the contribution of others. The group felt this is an important area where R3 researchers can lead by example. In addition, R3 researchers are usually still building up a network and collaborations, and to do this effectively, recognition is always necessary.
R4 group activity summary
The group considered project leaders as researchers who are usually less involved in the day-to-day practice of research. As project leaders they may be in charge of project management and involved in designing research projects, policies and regulations, vision and strategy. Nevertheless, it was acknowledged that researchers at this senior career stage still need substantial support from their institution in order to put their vision into practice. Having this in mind, the group shortlisted the following skills:
1) Being a role model in practising open science. As project leaders, stage 4 researchers can influence a broader community (not only the researchers in their project, but also funding committees, other project leaders, executive boards, etc.). They can promote change in daily practices, but also in research policies. Stage 4 researchers could become open science role models by promoting and discussing it with their network: advocating for open science during meetings and conferences, participating in policy development, and changing the practices within their respective groups. In order to do this, they need platforms and tools at their institutions, recognition for advocating for open science, and support from their team members.
2) Recognising and acknowledging the contribution of others. Just like in the other groups, this group found it relevant for researchers to recognise everyone’s contribution in a project, recognising not only the scientific staff but also the support staff (laboratory technicians, data managers, etc.). In order to do this, the careers of the support staff should also be recognised, for example by creating new job profiles for data managers, data stewards, etc.
3) Developing a vision and strategy on how to integrate OS practices in the normal practice of doing research. This skill was found to help create the link between principles and actual practice. Stage 4 researchers usually work on the ‘big picture’ of research, and thus they have to have a vision and strategy to steer their research and project members. In doing so, they should have the advice of support staff, to ensure the feasibility of their vision. They should also have clear information about who does, or can do, what within their institution, and about financial possibilities for them to turn their vision into reality.
4) Awareness and adherence to relevant ethical and legal policies. This skill was seen as relevant because senior researchers are accountable for their project team’s behaviour. Any risk of ethical and/or legal infringement will jeopardise the reputation not only of the project leader, but also of their entire group and, quite likely, the institution they belong to. Thus, it is important for stage 4 researchers to establish procedures dealing with ethical and legal issues. In order to properly do this, the institution should provide researchers with integrated support. The role of each support staff member should be well-defined (and clearly communicated to the researcher), there should be effective communication within the support staff, and the workflows through which the researchers can receive support need to be clearly stated.
Overall workshop summary
Overall, all groups stressed the importance of peer-to-peer learning: everyone can contribute to changing cultures and daily practices. All groups also agreed that proper infrastructure and policy support from institutions are required for researchers to truly implement open science practices.
Finally, recognition was seen as one of the main drivers for both scientific and non-scientific staff participating in a research project and all groups stressed the importance of proper recognition of open science practices.
The ideas and discussion generated during the workshop have given us a rich body of information with which to reflect on the workshop objectives and to envision a road map for implementing these ideas. The workshop outputs will be applied in EOSCpilot to help focus its Skills Framework on the key skills identified, the rationale for these, and the mapping of skills to researcher career stages together with the support requirements. Watch this space for progress and updates.
And finally a motto for everyone: change is in your hands! Everyone can contribute to change of practice in their own spheres of influence.
Authors: Shalini Kurapati, Marta Teperek, Maria Cruz, Angus Whyte
Disclaimer: In the spirit of openness and transparency, we would like to share that Shalini Kurapati wrote parts of this blog post based on the Zenodo record of the presentations even though she wasn’t present during the event. Her account was verified by the remaining authors who were present.
To read Part 2 of this blog post follow this link.
Open Science is not always easy – skills are urgently needed
Open science is becoming a ubiquitous and recurring theme in the current academic environment. Researchers are increasingly expected to publicly share their research outputs (data, code, models etc.) as well as their publications. This often requires considerable effort from researchers to manage and curate their research outputs to make them shareable. But are these efforts appropriately rewarded? Emphasising the number of publications in high impact factor journals as the only valuable metric for academic promotion and hiring won’t motivate researchers to practise open science.
There is a lot of interest in changing the reward system to better align it with the actions researchers are expected to take towards more open research practices, for instance, the OSPP Rewards WG. Making sure that researchers have the right skills to do that is the other side of that coin. To change the rewards system, we have to understand and identify the skills researchers should be rewarded for, and recognise that these may change at different stages of their academic careers. This was precisely the goal of the workshop on 26 September 2018 that we organised at TU Delft jointly with EOSCpilot. The EOSCpilot project is laying the groundwork for the European Open Science Cloud, and wants to offer a framework for institutions and others to develop the skills needed for researchers, data stewards, and others who support research to help put open science into practice.
The workshop was aptly titled “It’s time for open science skills to count in academic careers” (#openskills18). The workshop format combined presentations on related topics with interactive group work in the afternoon. In this post, we summarise the presentations; in a separate blog post we’ll present the outcomes of the workshop and reflect on the key findings and thoughts on future steps.
The aim and format of the workshop were presented by Valentino Cavalli of LIBER and EOSCpilot. In his welcome note, Mr. Cavalli explained barriers to open science in the European and wider academic context. These barriers include a culture of disincentives, fragmentation between infrastructures, interoperability issues and access to computational resources. He highlighted that the workshop would focus on the culture of disincentives, which has to be changed such that researchers at all career levels are equipped with relevant skills and suitably rewarded for putting them into practice.
The opening talks were delivered by Ms. Anne de Vries (PhD students Network Netherlands), Prof. Bartel van de Walle (TU Delft) and Mr. Rinze Benedictus (UMC Utrecht).
Ms. Anne de Vries shared the perspectives of Eurodoc, the European Council of Doctoral Candidates and Junior Researchers, on open science policy and practices. She stated that it is important to identify and train open science skills for early career researchers based on their disciplines. Early career researchers should also be made aware of how to make their outputs FAIR and of how open science skills will not only take science forward but also positively influence their careers. She also reflected that senior staff should support early career researchers in practising open science and thus also need appropriate education and training.
Prof. Bartel van de Walle spoke on open science policies and practical examples from his domain of information management during humanitarian crisis response. He also presented the challenges of implementing open science due to the inertia in research institutions, which are often resistant to change. He insisted that open science is not just a requirement of funding agencies but is the right way forward to democratise science and achieve the UN’s Sustainable Development Goals. He also pointed out the waves of change and gave an example of the successful implementation of open science policies in practice: McGill University’s Neurological Institute and Hospital. He concluded his speech by saying that open science is just science done right.
Mr. Rinze Benedictus delivered a powerful message with his talk: institutions should not equate the impact of a research work with the impact factor of a journal. He displayed the reinforcing loop in which authorship in high-impact journals brings researchers more funding and recognition, and so encourages them to continue the cycle of publishing to increase citation scores. He showed the damning evidence from The Lancet and Nature about the reproducibility crisis in science due to this focus on publishing to establish scholarship. While referring to global initiatives such as DORA to change attitudes among institutions and individual researchers, he gave a concrete example of how UMC Utrecht is implementing good practices in rewarding researchers. For instance, at evaluation meetings at UMC, a researcher would be asked “How did you arrive at your research question and what are your next steps?” rather than the traditional “What is your measurable output?”
Dr. Simon Kerridge (CASRAI) gave a talk on the CRediT taxonomy. The problem that CRediT tries to solve is that the current authorship criteria in publication don’t give sufficient recognition for the various contributions of researchers. In addition, authorship credit alone doesn’t support accountability for the research results. He stated that since science is increasingly a team effort, credit needs to be given where due to incentivise researchers for their unique contribution. He explained that the CRediT taxonomy aims to offer a role-based credit system, where the contributors can assign themselves credit for 14 tasks: writing, supervision, review, data analysis, project management and so on. Finally, he presented the vision for the future: increasing awareness of the CRediT taxonomy, creating a feedback mechanism to evaluate future versions, and linking it to platforms like ORCID and Crossref.
The closing remarks of the workshop were provided by Ms. Anette Björnsson (European Commission) and by Mr. Kevin Ashley (Digital Curation Centre & EOSCpilot).
Ms. Anette Björnsson reflected on the current initiatives within the European Commission towards changing academic rewards. She highlighted the importance of several recent reports produced by EC Working Groups: Evaluation of Research Careers fully acknowledging Open Science Practices, the report on Next Generation Metrics and Turning FAIR into reality. All of them influence the current thinking at the European Commission and also help shape the mission and vision of the European Open Science Cloud. She also stressed that large collaborative efforts at the European level require cooperation and consensus between all EU Member States, which often requires time and patience. The situation is no different when it comes to the implementation of policies and changing practices in open science: individual Member States are at different stages of implementation and have varying levels of infrastructure and personnel currently available to them. However, Ms. Björnsson assured us that while sometimes slower than desired, change is coming. Given that EOSC is a collaborative, pan-European endeavour, the chances are that changes brought with the EOSC will also be more effective and sustainable long-term.
Mr. Kevin Ashley then continued reflecting on the discussions which took place throughout the day, and in particular, the points raised by researchers during the interactive workshop part. He stressed that the common priority for all researchers, regardless of their career stage, seems to be to get the recognition they deserve for Open Science activities. He reflected that there are (numerous) barriers to practical implementation of Open Science and to rewarding those practising Open Science appropriately, but that these barriers should not stop anyone from changing the status quo. As Dr. Maria Cruz beautifully summarised in her tweet, based on Mr. Ashley’s words: It’s possible to change the academic rewards system. It’s possible for PhD students. It’s possible for senior researchers. And it’s possible for institutions.
The format, content and outcomes of the hands-on workshop during the event, together with some reflections and thoughts on next steps, are published in a separate blog post.
- All presentations of the speakers can be viewed and downloaded here.
This blog post was written and originally published by Loek Brinkman on his own blog.
On the 26th of September, I participated in the event “Time for open science skills to count in academic careers!”, organised by the European Open Science Cloud Pilot (EOSCPilot) and the 4TU.Centre for Research Data. The goal was to define open science skills that we thought should be endorsed (more) in academic career advancement.
The setting was nice: we were divided into four groups, representing different stages of academic careers (from PhD to full professor) and discussed which open science skills are essential for each career stage. What I liked about the event was that the outcomes of the discussion were communicated to representatives of EOSCpilot and the European Commission. So I’m optimistic that some of the recommendations will, in time, affect European research policies regarding career advancement.
On the other hand, I think we might be skipping a step here. Open science is often talked about as a good thing that we should all strive for (in line with the (in)famous sticker present on many laptops of open science advocates: “Open Science: just science done right”), as though open science is a goal in itself. To me, this doesn’t make a lot of sense. There is no clear definition of open science. It is an umbrella term covering many aspects, e.g. open access, open data, open code, citizen science and many more. So, in practice, people use various definitions of open science that in- or exclude some of the aforementioned aspects of open science, and differ in how these aspects should be prioritised. That means that while many people are in favour of open science, they may disagree greatly on what they think should be addressed first and how.
I don’t see open science as a goal. I see open science as a means to achieve a goal. I think we should first agree on the goal: specify what we want to change or improve. The way I see it, the goal is to make science more efficient – to achieve more, faster. Starting from this goal, several subgoals can be defined, such as:
(1) making science more accessible,
(2) making science more transparent & robust,
(3) making science more inclusive.
Open science can be a means to achieve these subgoals. Depending on how you prioritise the subgoals, you might be more interested in (1) open access, (2) open data and code, or (3) citizen science, respectively.
It is not too difficult to come up with a list of open science skills for academics, and it would be awesome if those skills were endorsed more in academic career advancement. But we first need to define the goals we want to achieve, before we can start to prioritise the means by which these can be achieved. If the endorsement of open science skills can be aligned with the overall goals, then we are well on our way to making science more efficient.
This blog post was originally published by the LSE Impact Blog.
Recommendations on how to better support researchers in good data management and sharing practices are typically focused on developing new tools or improving infrastructure. Yet research shows the most common obstacles are actually cultural, not technological. Marta Teperek and Alastair Dunning outline how appointing data stewards and data champions can be key to improving research data management through positive cultural change.
By now, it’s probably difficult to find a researcher who hasn’t heard of journal requirements for sharing research data supporting publications. Or a researcher who hasn’t heard of funder requirements for data management plans. Or of institutional policies for data management and sharing. That’s a lot of requirements! Especially considering data management is just one set of guidelines researchers need to comply with (on top of doing their own competitive research, of course).
All of these requirements are in place for good reasons. Those who are familiar with the research reproducibility crisis and understand that missing data and code is one of the main reasons for it need no convincing of this. Still, complying with the various data policies is not easy; it requires time and effort from researchers. And not all researchers have the knowledge and skills to professionally manage and share their research data. Some might even wonder what exactly their research data is (or how to find it).
Therefore, it is crucial for institutions to provide their researchers with a helping hand in meeting these policy requirements. This is also important in ensuring policies are actually adhered to and aren’t allowed to become dry documents which demonstrate institutional compliance and goodwill but are of no actual consequence to day-to-day research practice.
The main obstacles to data management and sharing are cultural
But how to best support researchers in good data management and sharing practices? The typical answers to this question are “let’s build some new tools” or “let’s improve our infrastructure”. When thinking about how to provide data management support to researchers at Delft University of Technology (TU Delft), we decided to resist this initial temptation and do some research first.
Several surveys asking researchers about barriers to data sharing indicated that the main obstacles are cultural, not technological. For example, in a recent survey by Houtkoop et al. (2018), psychology researchers were given a list of 15 different barriers to data sharing and asked which ones they agreed with. The top three reasons preventing researchers from sharing their data were:
- “Sharing data is not a common practice in my field.”
- “I prefer to share data upon request.”
- “Preparing data is too time-consuming.”
Interestingly, the only two technological barriers – “My dataset is too big” and “There is no suitable repository to share my data” – were among the three at the very bottom of the list. Similar observations can be made based on survey results from Van den Eynden et al. (2016) (life sciences, social sciences, and humanities disciplines) and Johnson et al. (2016) (all disciplines).
At TU Delft, we already have infrastructure and tools for data management in place. The ICT department provides safe storage solutions for data (with regular backups at different locations), while the library offers dedicated support and templates for data management plans and hosts 4TU.Centre for Research Data, a certified and trusted archive for research data. In addition, dedicated funds are made available for researchers wishing to deposit their data into the archive. This being the case, we thought researchers may already receive adequate data management support and no additional resources were required.
To test this, we conducted a survey among the research community at TU Delft. To our surprise, the results indicated that despite all the services and tools already available to support researchers in data management and sharing activities, their practices needed improvement. For example, only around 40% of researchers at TU Delft backed up their data automatically. This was striking, given the fact that all data storage solutions offered by TU Delft ICT are automatically backed up. Responses to open questions provided some explanation for this:
- “People don’t tell us anything, we don’t know the options, we just do it ourselves.”
- “I think data management support, if it exists, is not well-known among the researchers.”
- “I think I miss out on a lot of possibilities within the university that I have not heard of. There is too much sparsely distributed information available and one needs to search for highly specific terminology to find manuals.”
It turns out, again, that the main obstacles preventing people from using existing institutional tools and infrastructure are cultural – data management is not embedded in researchers’ everyday practice.
How to change data management culture?
We believe the best way to help researchers improve data management practices is to invest in people. We have therefore initiated the Data Stewardship project at TU Delft. We appointed dedicated, subject-specific data stewards in each faculty at TU Delft. To ensure the support offered by the data stewards is relevant and specific to the actual problems encountered by researchers, data stewards have (at least) a PhD qualification (or equivalent) in a subject area relevant to the faculty. We also reasoned that it was preferable to hire data stewards with a research background, as this allows them to better relate to researchers and their various pain points as they are likely to have similar experiences from their own research practice.
Vision for data stewardship
There are two main principles of this project. Crucially, the research must stay central. Data stewards are not there to educate researchers on how to do research, but to understand their research processes and workflows and help identify small, incremental improvements in their daily data management practices.
Consequently, data stewards act as consultants, not as police (the objective of the project is to improve cultures, not compliance). The main role of the data stewards is to talk with researchers: to act as the first contact point for any data-related questions researchers might have (be it storage solutions, tools for data management, data archiving options, data management plans, advice on data sharing, budgeting for data management in grant proposals, etc.).
Data stewards should be able to answer around 80% of questions. For the remaining 20%, they ask internal or external experts for advice. But most importantly, researchers no longer need to wonder where to look for answers or who to speak with – they have a dedicated, local contact point for any questions they might have.
Data Champions are leading the way
So has the cultural change happened? This is, and most probably always will be, a work in progress. However, allowing data stewards to get to know their research communities has already had a major positive effect. They were able to identify researchers who are particularly interested in data management and sharing issues. Inspired by the University of Cambridge initiative, we asked these researchers if they would like to become Data Champions – local advocates for good data management and sharing practices. To our surprise, more than 20 researchers have already volunteered as Data Champions, and this number is steadily growing. Having Data Champions teaming up with the data stewards allows for the incorporation of peer-to-peer learning strategies into our data management programme and also offers the possibility to create tailored data management workflows, specific to individual research groups.
Technology or people?
Our case at TU Delft might be quite special, as we were privileged to already have the infrastructure and tools in place which allowed us to focus our resources on investing in the right people. At other institutions circumstances may be different. Nonetheless, it’s always worth keeping in mind that even the best tools and infrastructures, without the right people to support them (and to communicate about them!), may fail to be widely adopted by the research community.
Written by Julie Beardsell and originally published on the ICT innovation blog.
Responding to the challenge
Navigating the often complex legal landscape of software licensing can be a genuine challenge for researchers, particularly when starting up a research project for the first time.
Today’s researchers, when starting out on a PhD, typically need not only to be competent scientists and programmers, but also to be sufficiently knowledgeable to make the right choices about licensing the software that they build. Without the latter, they risk a number of potentially undesirable situations.
To help researchers navigate their way, a working group at TU Delft has put together a set of guidelines for researchers, which can be downloaded here.
In addition, the working group is drafting a document to provide more detailed information and links to related documents and useful sources.
Open, reproducibility, peer-review and building upon others’ work
The very nature of the research itself may be to create or improve upon software, which might be worked on openly and collaboratively with others from institutions other than the one by which the researcher is employed.
In addition, the task of creating scientific software as a research output does not end with the publication of the results generated with that software. Making that software available for inspection and use by other scientists is essential to reproducibility, peer-review, and the ability to build upon others’ work.
Importance of licenses
Licenses are important for setting out the terms on which software may be used, modified, or distributed and by whom. Without a license agreement, software may be left in a state of legal uncertainty in which potential users may not know which limitations owners may want to enforce, and owners may leave themselves vulnerable to legal claims or have difficulty controlling how their work is used. Licenses can also be used to facilitate access to software as well as restrict it.
The working group consists of Julie Beardsell, Merlijn Bazuine, Susan Branchett, Maria Marques de Barros Cruz and Marta Teperek and the group would like to thank those researchers across the faculties who have contributed so far and encouraged the development of this initiative at TU Delft.
About the Author
“Open Source Software Guidelines for Researchers” by Julie Beardsell is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.