First meeting of National Research Data & Software training coordination

Picture credit: Carlos Teijeiro Barjas

Paula Martinez Lavanchy, the Research Data officer coordinating RDM training initiatives at TU Delft and 4TU.ResearchData, co-organized the ‘First meeting of National Research Data & Software training coordination’ together with Mateusz Kuzak from the Netherlands eScience Center and Carlos Teijeiro Barjas from SURFsara.

This is her report about the event that took place on 3 March 2020 at SURF in Utrecht.

How did this initiative start?

In October 2019, TU Delft published its Vision for Research Data and Software Management Training. This ambitious plan aims at covering the different training needs that we consider relevant for researchers (including PhD candidates) at different stages of their research. This is a live document that needs constant evaluation and adjustment. So, since its publication, my task as the coordinator of RDM training activities has been to implement this vision and to investigate the challenges that can be encountered and possible solutions.

One of the first evident challenges is, how can we ensure that we are able to provide all the training we think is relevant for researchers in a sustainable way? 

At the end of 2019, in a meeting with Mateusz Kuzak from the Netherlands eScience Center and Carlos Teijeiro from SURFsara, we discussed all our ideas for developing and organizing training at our respective institutions and already found some ways to collaborate. But we also thought that the sustainability issue of RDM (including software) training is probably common to different institutions and that there are other challenges that we could be solving collaboratively within the training coordinators community. So, we decided to go national!

What was the meeting about?

We contacted everybody we knew to be involved in providing or coordinating training in the field of Research Data and Software skills at the Dutch universities and other organizations and called for a meeting to:

  1. Network and exchange knowledge about the training activities at different research institutions around the Netherlands
  2. Identify the challenges and the needs for training
  3. Identify collaboration opportunities

The session took place on 3 March 2020 and 18 participants (including us – the three hosts) from 12 institutions joined for a whole afternoon to discuss training.

Here you can see the introduction slides of the meeting: https://doi.org/10.5281/zenodo.3712254

We initiated the session with 5 min pitches from each institution to know what type of training is in place. After that, we collected challenges encountered in RDM-generic training, Software-related training and Discipline-specific training.

This first part of the discussion provided useful information about the different approaches towards training for RDM-related topics. Sometimes training on RDM is provided under a broad topic like Research Reproducibility or Open Science, and sometimes it is split to focus on specific RDM-related tasks such as Data Management Planning or Data Publication. 

At this point, I had already identified the people with whom I should exchange experiences in more depth about similar courses organized at TU Delft. It is on my to-do list to discuss with colleagues from the Centre for Digital Scholarship Leiden if and how to join forces to provide Data Carpentries for Social Sciences, which proved to be very relevant for researchers at the TU Delft Faculty of Architecture. I am also hoping to exchange knowledge with colleagues from VU Amsterdam about their approach to training on version control (training on Git). And I could also get some more colleagues engaged in participating in our Code Refinery Train-the-Trainer event planned after the summer.

When discussing challenges, it was easy to identify commonalities such as the lack of trainers and helpers for the training sessions, the low motivation of researchers to attend the generic RDM courses if they are not mandatory, the lack of practical exercises and applied material, or the lack of awareness about innovative ways to provide RDM training.

Then, the most exciting part came: Is there room for collaboration to approach the challenges? Are we interested in collaborating? If yes, how can we collaborate? 

A collaborative future

In the last part of the meeting, we brainstormed ideas for working collaboratively on training. Everybody agreed that there is a lot of potential for collaboration in different areas.

Some examples of collaborative efforts that were mentioned:

  • National pool of trainers for domain-specific, data type-specific and/or software to exchange between institutions.
  • Database of trainers profiles – in case institutions want to hire trainers or facilitate the exchange of trainers within institutions.
  • Coordination of The Carpentries efforts – having a national coordinator of The Carpentries (or training in general) that can support people in creating a community around training, certifying instructors, and the development of training materials.
  • Exchange of course materials from the different institutions in The Netherlands – focus on making them FAIR, exchange course description, modules, visual material and exercises (especially practical exercises).
  • Training in developing open training materials in a collaborative manner (using The Carpentries framework)
  • Train-the-Trainers programme – support staff providing training needs pedagogical skills and to learn creative ways of providing the content/exercises. In addition, there is also a need to learn about tools that researchers use for RDM in order to be able to teach about them. 

What’s next?

In the next few weeks, we need to identify if some of these ideas could be organized within established initiatives already existing in the Netherlands e.g. LCRDM, NPOS, RDNL, etc. It is also necessary to discuss how the resources can be organized to continue the coordination of the Research Data & Software training at a national level. 

The ideas will be shared with the initial group we contacted and we can decide as a community the best way to move forward.

If you coordinate Research Data and Software training at your institution and you are interested in joining this network, please contact Paula Martinez Lavanchy, Mateusz Kuzak or Carlos Teijeiro.

Day 3, part II: Reinventing the library research support services at Griffith University


Griffith University campus

In the afternoon of Day 3, Danny and I met with Belinda Weaver and her colleagues from the library research support services at Griffith University. Belinda shared with us an impressive example of how Griffith Library reinvented its research support services to meet the growing demands of supporting digital research in the 21st century. The Library not only had to upskill their staff but also restructure support services and develop new approaches to training and engagement. Below are some selected snapshots of what Belinda and her team did to support data-intensive research.

Mapping out the needs

In order to decide which services the library should provide and how to provide them, the team organised two types of consultations:

  1. With research staff and students, who needed research data support
  2. With support staff at the Library, who offered such support

The first one was essential to understand what researchers need. Research data management support offered by the Library traditionally focused on issues such as backup strategies, IP and licensing. It turned out that what researchers needed most was support with working with data across the entire research lifecycle, taking into account all the complexities of research projects. The newly surfaced issues were, for example, effectively managing access rights and access control, data security and data governance, but also data clean-up and data-driven research methods.

The internal consultation with the library staff helped to collectively agree which services the library should offer, decide on roles and responsibilities within the library staff members (who should deliver these new services), and to identify the knowledge and skill gaps. Doing the process collaboratively helped everyone understand and accept the need to build new capacity and capability to support data-driven research, and also to realise the roles they needed to play in the process.

Breaking down information silos

After establishing the gaps, the team focused on collaboratively creating a new knowledge base. This was again approached from two different angles:

  • By looking at specific topics – the team has identified 60 topics where knowledge needed to be updated and consolidated (e.g. APIs, data encryption)
  • By looking at disciplinary differences and practices (e.g. tools, research methods, data sources)

To ensure that knowledge can be easily shared and exchanged between colleagues and to counteract information silos, the team created templates for both specific knowledge topics, and for mapping out and understanding research disciplines.

Ensuring that such information is easily shareable between team members is essential when it comes to supporting the increasing amount of interdisciplinary research, and also in situations where team members need to switch roles or share tasks and responsibilities.

Skills and awareness

Understanding the needs of researchers and becoming familiar with the knowledge and disciplinary differences in which researchers operate helped Belinda and her team to adjust the training provided by the Library. It was particularly interesting for me to learn how the team addresses the ever-growing need for data wrangling skills. This is done through a combination of weekly hacky hours, Software Carpentry workshops organised once every two months, and yearly Research Bazaar festivals.

Software Carpentry workshops

Software Carpentry workshops teach researchers basic computational skills. Griffith University Library currently has four certified Software Carpentry instructors, which includes two instructor trainers. In addition, some Library staff act as helpers during these workshops. All these help Griffith University run these workshops on a regular basis. All workshop logistics are managed by Griffith’s eResearch Services unit.

Hacky hours

Weekly hacky hours complement the software carpentry workshops. While Software Carpentry workshops are essential for researchers to learn the basic skills they need to start working with code and data, the content of the carpentry workshops is generic. Therefore, researchers who attend Software Carpentry workshops sometimes struggle in implementing the new learning into their daily practices and workflows. Hacky hours invite researchers to pop over to get help finding solutions to their specific problems, or to get advice on working with their own research data.

ResBaz, or Research Bazaar

ResBaz or Research Bazaar is an impressive, three-day-long festival of digital skills for research. In Brisbane, it is organised jointly by Queensland University of Technology, the University of Queensland, Griffith University, the University of Southern Queensland, and Queensland Cyber Infrastructure Foundation.

The first two days of the festival offer a myriad of workshops helping researchers build digital skills (in addition to Software Carpentry workshops, researchers can also learn how to work with Jupyter Notebooks, how to program with R, or how to do RNA sequencing etc.). The third day consists of talks on various topics: case studies on the use of digital tools and methods; talks on effective collaboration; or seminars on issues pertaining to personal and professional development.

While at TU Delft we do run regular carpentry workshops, and have piloted drop-in consultations for code and data (our “Coding Lunch and Data Crunch” sessions), so far we haven’t run any big festivals like ResBaz – definitely something worth considering!


Day 3, part I: Machine Actionable Data Management Plans at the University of Queensland


University of Queensland Campus

I spent the third and the fourth day of my trip jointly with Danny Kingsley, Scholarly Communication Consultant and the former head of Scholarly Communication at the University of Cambridge (my former boss). Together we met with Fei Yu, Jan Wisgerhof and Kathleen Smeaton from the University of Queensland.

Machine Actionable DMPs: theory and practice

In Europe, there is now a lot of discussion about machine-actionable data management plans (maDMPs). In the European context, traditional DMPs were created mostly as a result of funders’ requirements. Funders wished to have assurance that researchers would manage their research data responsibly. Typically, the information in data management plans was not structured and was rarely re-used. The goal of machine-actionable DMPs is to make information recorded in the DMPs structured and actionable by machines. For example, if a researcher needed X amount of data storage space, appropriate storage requests would be made straight away from the DMP. Under the auspices of the Research Data Alliance, a lot of important theoretical work has already been accomplished in order to agree on a data model for maDMPs. However, at least in the European context, there has not yet been a fully functional implementation of the maDMP concept (or at least none that I am aware of).

It was therefore very interesting for me to visit the University of Queensland, where colleagues from the Library’s research data management team have developed a dedicated tool, the Research Data Manager. The tool, while it is not called a DMP tool, is a beautifully working and functional implementation of maDMPs.

So how did it all start?

Back in 2015, the research data team spoke with some University of Queensland researchers and asked them about the research data underlying their published papers. In most cases, data were either no longer findable, or very difficult to find.

Therefore, the research data team created a tool with the intention of capturing an initial, basic metadata layer about research projects and research data created by university researchers. That’s how the Research Data Manager tool (RDM tool) came to life.

Iterations

The tool went through several rounds of iterations. Initially, it was started as something similar to DMPs in the European context – researchers were asked to describe their plans for data management at the beginning of their projects.

This approach, however, wasn’t very popular among researchers. They weren’t motivated to respond to long questions about their data management strategy, where a lot of information had to be copied from somewhere else, and they didn’t see the real benefit of doing this.

To respond to this feedback, colleagues from the RDM team made a lot of changes in the tool to make it more useful for researchers: 

  1. Substantially limited the number of questions asked in a DMP
  2. Changed all the open text fields into lookup, multiple-choice or checkbox questions, in order to allow for structured responses
  3. Structured responses allowed integrations, which brought actionability to DMPs and provided tangible benefits to researchers.

So what does the Research Data Manager Tool do?

The tool has 20 very easy-to-answer data management questions (all are lookup fields, checkboxes or radio buttons). By replying to these questions, researchers get 1 TB of free storage (capacity can be extended through the tool), which is backed up and maintained by the University. The game-changer was that the storage which researchers could request through the tool allowed them to easily collaborate online with other researchers (authentication through eduGAIN allows easy collaboration with people from 300+ universities worldwide).

As soon as the researcher responds to the questions, the request for project storage space is immediately pushed to their supervisor for approval, and subsequently, a dedicated project space is created. Altogether, it only takes about 15 mins for researchers to receive their allocated storage. 
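The flow described above (structured answers, an automatic storage request, supervisor approval, allocation) can be sketched in a few lines of Python. This is only an illustration of the idea of machine-actionability: the class names, field names and the `storage_request` and `approve` helpers are hypothetical and do not reflect the Research Data Manager's actual implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING_APPROVAL = "pending approval"
    ALLOCATED = "allocated"


@dataclass
class DmpRecord:
    # Structured answers only: no free-text fields (hypothetical schema).
    project: str
    researcher: str
    supervisor: str
    has_personal_data: bool
    storage_tb: int = 1  # default allocation mentioned in the post


@dataclass
class StorageRequest:
    project: str
    size_tb: int
    approver: str
    status: Status = Status.PENDING_APPROVAL


def storage_request(record: DmpRecord) -> StorageRequest:
    """Derive a storage request directly from the structured DMP answers."""
    return StorageRequest(record.project, record.storage_tb, record.supervisor)


def approve(request: StorageRequest, approver: str) -> StorageRequest:
    """Only the named supervisor can release the allocation."""
    if approver != request.approver:
        raise PermissionError(f"{approver} cannot approve {request.project}")
    request.status = Status.ALLOCATED
    return request
```

Because every answer is a typed field rather than free text, the request can be generated without a human reading the plan, which is the point of making a DMP machine-actionable.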

It is all in integrations

The key principles behind the tool are simplicity and integrations. I was impressed to see how many integrations the tool had already in place. In addition to integrations with storage and authentication systems, the tool also has a direct connection with the university finance and ethics application systems. What it means is that when a researcher logs in to the tool and indicates that their project has been externally funded, they can look up their project info coming from the finance system and auto-populate the relevant fields in the RDM tool. Similarly, if a researcher indicates that they will be working with personal research data, some additional questions will appear, including a question about ethics approval. But again, instead of duplicating the information, both the ethics tool and the RDM tool are connected. If a researcher has already started an ethics application, they can look it up in the RDM tool instead of copy-pasting the content.
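To make the integration idea concrete, here is a minimal Python sketch of conditional questions and auto-population. Everything in it is an assumption made for illustration: the `FINANCE` and `ETHICS` dictionaries stand in for the university's finance and ethics systems, and the identifiers, field names and sample values are invented.

```python
# Hypothetical stand-ins for the finance and ethics systems (invented data).
FINANCE = {"GR-001": {"title": "Reef monitoring", "funder": "ARC"}}
ETHICS = {"ETH-042": {"status": "submitted"}}


def questions_for(answers: dict) -> list[str]:
    """Return the follow-up questions triggered by earlier answers."""
    follow_ups = []
    if answers.get("externally_funded"):
        follow_ups.append("grant_id")
    if answers.get("personal_data"):
        follow_ups.append("ethics_id")
    return follow_ups


def autofill(answers: dict) -> dict:
    """Copy fields from linked systems instead of asking for them again."""
    filled = dict(answers)
    grant = FINANCE.get(answers.get("grant_id"))
    if grant:
        filled["project_title"] = grant["title"]
        filled["funder"] = grant["funder"]
    application = ETHICS.get(answers.get("ethics_id"))
    if application:
        filled["ethics_status"] = application["status"]
    return filled
```

For example, `autofill({"grant_id": "GR-001"})` returns the answers plus the project title and funder looked up from the stand-in finance data, so the researcher only supplies an identifier.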

Digital research notebooks

An additional interesting feature of the tool is that it prompts researchers for the use of digital research notebooks (aka electronic lab notebooks, or ELNs). In the form, researchers are also asked if they would like to use digital lab notebooks for their project. If they tick the box, then an account is created for them with LabArchives, which is the institutional digital research notebook product.

At TU Delft we are currently piloting ELNs and it seems that researchers from various disciplines have different requirements for an ELN. Therefore, I was curious to know if the one-size-fits-all approach wasn’t a problem for researchers at the University of Queensland. “It is used across all disciplines. Various researchers use it in different ways. In arts and humanities, researchers simply use it as a digital replacement of their paper notebooks or sketchbooks with the advantage that they can access it anywhere on their laptop or mobile, and are fully backed up” – explained Fei.

New developments

Jan told us about the latest developments in the tool. One of the biggest successes of the team was the integration between the RDM tool and the institutional eSPACE repository. This allows researchers to easily publish selected datasets in the university repository and get a DOI for them, without the need to populate all the metadata fields – these are auto-populated based on the information in the RDM tool. During our visit, the RDM team was just celebrating the first dataset published in the repository through the integration (there are of course more datasets hosted in the repository, which were published earlier through the direct data deposition route).

The two new developments that the team is currently working on are integrations with the thesis submission system and with popular scientific instruments. The integration with the thesis submission system means not only that theses could be automatically uploaded to the university repository, but also that students will be asked to publish their data at the same time. The integration with instruments allows instrument data to be added directly to the RDM tool, with baseline metadata, which makes the data flow much easier for researchers to manage and also allows for easy data publication. In addition, this could also enable facility managers to get statistics on the usage of tools and facilities.

All data and metadata in the RDM tool are version controlled. In addition, researchers can easily export and submit their DMPs from the system to funding bodies (albeit Australian funders don’t have strict requirements for DMPs).

Success story

The usefulness of the tool and the tangible benefits of transforming it into a ‘maDMP’ meant that researchers didn’t need to be convinced to start using it. In the first 10 months, 1000+ researchers started using the tool. At the moment it has 10,000 users. “People create a DMP and see the immediate benefits, without even knowing this is a DMP” – Fei explained. 

In addition, seeing the usefulness of the tool, the graduate school made it mandatory for all PhD students. This made PhD supervisors very happy. Many of them were worried that PhD students would leave the University without leaving their data behind, or would leave them in poor order. Because every student request needs to be approved by the supervisor, supervisors are now aware of where and how students store their research data and have gained better oversight over data management practices in their research groups.

An unintended benefit of these integrations was also much closer cooperation with other university services. Thanks to the joint work on the RDM tool, colleagues from other departments now all see how good data management practices are embedded within their workflows: from grant applications, through ethics approval and finishing with publication.

Back to TU Delft

Our DMP template at TU Delft has a lot of questions with simple, multiple-choice responses. However, we do not yet have integrations in place with various university tools and systems. Visiting colleagues from the University of Queensland was therefore very inspirational and, well, we have a lot of work still to do at TU Delft to transform our DMPs into maDMPs. The work done by Fei, Jan and Kathleen certainly provided us with lots of useful lessons learnt and examples we could try to adapt in our institutional setting.


Day 2: Data Fluency, BRIDGES, MeRC, and reverse Data Management Plans


Monash University campus

Marta Teperek, the Data Stewardship Coordinator at TU Delft Library, is now in Australia: doing a short study trip to exchange practice with Australian colleagues, and subsequently taking part in the RDA Plenary 15 Meeting and associated events in Melbourne.

She aims to post short updates from her trip.


On the second day of my study trip, I met with David Groenewegen and his colleagues from Monash University. I met with colleagues working at the central Monash University Library (David, Neil Dickson, Beth Pearson, Patrick Splawa-Neyman and Andrew Harrison) and with colleagues from the Monash e-Research Centre (Anitha Kannan, Stephen Dart and Nicholas McPhee).

In summary: I wish I had more time to spend with colleagues from Monash University – truly impressive work, despite a very lean team and a massive university (spread around several campuses… and continents) – there is really a lot to learn from their approach to research data.

While I am sure that my poor head wasn’t able to contain everything, see below some key highlights and points which I think are most important for us at TU Delft (and any other institution which aspires to have world-leading data management services).

Data Fluency and the importance of perseverance

At Monash University the main problem when it comes to providing digital skills training is the size of the institution: not only ~80,000 students, ~8,000 staff and ~5,000 doctoral students (source: Wikipedia), but also multiple campuses to take care of. To address the demand for training, the Library is now the hub of the Data Fluency initiative. The aim of this initiative is to empower researchers with skills to transform their own research data and workflows as they see fit, and make them more effective.  

Community engagement

Similarly to Melbourne’s Research Computing Services, the Data Fluency initiative has community engagement at its core. Courses are mostly based on Carpentry-style lessons and are taught by academics and professional staff. In addition, postgraduate students are often hired as professional instructors (this means a workforce of a dozen people paid by the hour for their work). This allows the library to run courses every week. Last year they trained an impressive 1,200 people (this year’s target is even higher at 1,500!). The instructors are part of the Monash University Community of Practice – people who attended previous workshops and want to continue learning through sharing, exchange, and regular interactions. In addition to frequent workshops, the community is sustained by afternoon data seminars (monthly) and Friday drop-in sessions (weekly).

Perseverance can sometimes be very important

At TU Delft we also tried running our Coding Lunch and Data Crunch sessions (monthly) as drop-in opportunities for any code and data questions. However, not many researchers attended these sessions and as a result, they are now on hold. Interestingly, colleagues from Monash explained to me that in their case perseverance was key. They also experienced sessions where no researchers turned up. Instead of stopping the sessions, they experimented with locations. It turned out that what worked best were rooms with glass windows (so that passersby can see that the inside is not scary), proximity to cafeterias, and locations which researchers can reach without much effort.

Reputational benefits

Interestingly, the successful approach to training seems to have brought an immense reputational gain for the Library. Other offices and departments (research platforms, graduate offices, others running specialist training) now partner with the Library, as the central hub where valuable, quality training on digital skills is made available to researchers across the campus.

Show me your data

We also discussed the need for engagement with schools and faculties. Patrick Splawa-Neyman briefly introduced his approach to raising data management awareness at the School of Public Health and Preventive Medicine. He decided to interview individual researchers and ask them about their research data. He had 25 questions and he interviewed 25 people, which not only helped him to map out areas where data management practices needed improvement but also raised awareness about the benefits of good data management within the school. As a result, Patrick was able to introduce REDCap as a tool for research data management. The tool collects basic information about the authors and about the data, and contains a built-in GDPR checklist. The tool and the workflow were endorsed by the School and used so successfully that a separate instance of the tool (with specific customisations) has now been introduced specifically for PhD students.

BRIDGES for online research presence

4TU.ResearchData will soon be moving to figshare for its repository platform software. Monash was one of the first universities worldwide to use an institutional instance of figshare. Thus, it was very timely that Beth, Andrew and Neil shared with me some useful thoughts about running BRIDGES (which is the name of Monash’s figshare instance) as a repository solution.

Community again: let researchers own it

When rolling out BRIDGES, Monash decided to give researchers full responsibility for the content they upload into the repository. Nobody checks what researchers upload into the repository. Everything gets a DOI and goes live straight away. Researchers appreciate this solution: there are no delays when it comes to publishing research outputs (so it is possible to get a DOI on a Friday night when preparing a last-minute grant proposal). In addition, workflows are very simple and intuitive, and researchers feel responsible for their own research outputs.

In addition, researchers have the freedom to decide which research outputs they perceive as valuable and worth sharing. BRIDGES is thus not a data repository. BRIDGES means collections of various research outputs. This not only better resonates with researchers from arts and humanities (to whom the word ‘data’ is sometimes a bit nebulous) but gives researchers the freedom to share all valuable components of their research. 

This approach has been particularly valued by PhD students, who thanks to BRIDGES gain the possibility to promote their own work and establish their own academic profile in the way they want. Before they start publishing serious academic papers, they can already promote their conference presentations, posters, blog articles, reports and theses.

Hands off = opportunity to do other things

Simplifying the workflow and giving researchers the freedom to use BRIDGES as they want not only empowered the research community and encouraged them to explore BRIDGES, but also freed up a lot of time for many colleagues at the Library. The simplicity of the workflow and the intuitive upload process meant that suddenly there was no need for training and PowerPoint slides. Also, researchers no longer needed to be convinced about the benefits of using BRIDGES – they simply saw them.

In addition, Monash’s staff have worked very closely with the figshare team to maximise the use of APIs and ensure integration of various workflows. One of the biggest successes in this field was the integration between the University’s system for digital submission of PhD theses and BRIDGES. This meant that theses could be automatically pushed to BRIDGES and that the cataloguing work was no longer necessary. As a result, the cataloguers’ team could instead join and strengthen the research data team, which needed to increase its capacity.

Other useful notes

There were many other useful tips which colleagues from Monash shared with me:

  • Colleagues from Monash were overall very satisfied with their relationship with figshare – they were very positive about figshare’s responsiveness, regular (fortnightly) releases, lack of downtime, clear roadmaps, transparency about development goals
  • APIs and integrations seem to be working very well:
    • Figshare now enabled data export to Pure through dedicated Pure import files (users are matched by their unique university IDs)
    • Colleagues from Monash are now considering options for importing user profile information from existing systems (such as Pure) into figshare
  • The active data management space proved to be rather confusing and not very useful for active collaborations – the Library is now downplaying this functionality

MeRC and reverse DMPs

The Monash e-Research Centre (MeRC) was launched to help researchers with 21st-century digital research challenges. The MeRC supports researchers working with big data, complex analytics workflows, projects requiring streamlined interfaces between instruments and data, and other research projects whose data management and processing needs are out of scope for corporate IT support. Researchers who wish to benefit from the support and facilities offered by the MeRC need to submit an application in which they make a case for their project: explain the rationale for requesting bespoke support and outline the potential impact of their work.

How to avoid orphan data?

Providing bespoke solutions for complex digital research projects (often involving very large datasets – Monash currently stores 16 PB of data) means that good data governance is essential. Colleagues at Monash learnt their lesson when decommissioning old storage and attempting to move data to a newer storage solution. It turned out that lots of data were ‘orphaned’ – the researchers who produced them had left the university, but their undocumented data were left behind on the server. It proved very difficult to find someone ready to decide what to do with such orphaned assets (makes me think of the ‘bulk data’ situation at TU Delft 😉 – thankfully Nick agreed to share the write-up of their decommissioning workflow).

Reverse DMP

To tackle the problem of ‘orphan’ data, Monash is currently working on a solution which could ensure that information about data creators, data provenance, governance, legal and ethical aspects, as well as information about the datasets themselves could be recorded together with the data files. The goal is to make this process as painless for researchers as possible thanks to system integrations. 

When a researcher comes to use a piece of equipment or a facility, they should be automatically recognised – the system should know what they are doing and why, and record appropriate metadata with this information, instead of asking researchers to endlessly re-enter their credentials or information about their project.

Monash colleagues refer to this vision as a ‘reverse DMP’ – a Data Management Plan which isn’t created for compliance purposes at the start of the project, but is continuously (and automatically) created by all the systems with which the researcher interacts (by recording metadata about all these interactions, data provenance, data storage, access etc.).
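To make the idea a bit more concrete, here is a minimal sketch of how such a continuously assembled record could look. It is purely illustrative and is not Monash's implementation: the class names (`Interaction`, `ReverseDMP`), the fields and the example values are all hypothetical, and a real system would take these events from instrument, storage and identity systems rather than from hand-written calls.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Interaction:
    """One event a system logs on the researcher's behalf (hypothetical schema)."""
    researcher_id: str
    project_id: str
    system: str    # e.g. an instrument or a storage service
    action: str    # e.g. "acquire", "store", "grant_access"
    dataset: str
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class ReverseDMP:
    """A plan built up from logged interactions instead of being written up front."""
    project_id: str
    interactions: list[Interaction] = field(default_factory=list)

    def record(self, interaction: Interaction) -> None:
        # Refuse events that belong to another project, so records stay clean.
        if interaction.project_id != self.project_id:
            raise ValueError("interaction belongs to a different project")
        self.interactions.append(interaction)

    def contributors(self, dataset: str) -> set[str]:
        """Everyone who touched a dataset, so that it always has a known owner."""
        return {i.researcher_id for i in self.interactions if i.dataset == dataset}

    def provenance(self, dataset: str) -> list[tuple[str, str]]:
        """Chronological (system, action) history for one dataset."""
        events = sorted(
            (i for i in self.interactions if i.dataset == dataset),
            key=lambda i: i.timestamp,
        )
        return [(i.system, i.action) for i in events]
```

In this sketch, the questions that were so hard to answer for orphaned data (who produced a dataset, and where did it come from?) become simple lookups, because the answers were recorded at the moment each interaction happened.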

The value of data

Asking researchers to make a case for their projects when they apply for the MeRC’s support, together with the emphasis on better data governance (and the pain points of decommissioning the old storage system), brought people’s attention to the value of research data. 

The introduction of the GDPR (Monash decided to comply with the GDPR, because one of their campuses is based in Italy, because they have a lot of international collaborations, and also because they saw GDPR as an opportunity to improve data management practices) was an important stick: the potential risks and liabilities associated with working with personal data meant more care and responsibility when processing them. The offer of artificial intelligence tools to interrogate data and see completely new qualities provided an important carrot – data became the crucial asset for the development of new tools and insights. The need to make a case for the use of resources and the need to make decisions about data custodianship increased the feeling of ownership of research data among the research community.

As Anita explained, realising the value of data is a cultural change and it is a journey which takes time, but Monash clearly seems to be on the right path.

Resources

Other blog posts from my trip to Australia:

Day 1: From Petascale Campus to Community-driven training: a myriad of innovative data initiatives at the University of Melbourne

Marta Teperek, the Data Stewardship Coordinator at TU Delft Library, is now in Australia: doing a short study trip to exchange practice with Australian colleagues, and subsequently taking part in the RDA Plenary 15 Meeting and associated events in Melbourne.

She aims to post short updates from her trip.


On the first day of my study trip, I visited the University of Melbourne. Peter Neish and his colleagues have been extremely generous hosts and I was able to meet with various colleagues representing the Petascale Campus Initiative, the Melbourne Data Analytics Platform, the Library and the Research Computing Services. In addition, colleagues from the Australian Research Data Commons joined us for lunch. 

All discussions were extremely helpful and I was very much impressed by the myriad of innovative data initiatives at the University of Melbourne. Below is a (biased) selection of some of them.

The Petascale Campus Initiative and the Melbourne Data Analytics Platform

The Petascale Campus Initiative is a strategic plan at the University of Melbourne to increase its digital capacity. It is a collaboration between multiple stakeholders, including the Chancellery, the faculties, IT, the Library, and others. The initiative not only aims to increase the coherence and cohesion of the digital infrastructure support available to researchers across the campus but also invests heavily in people – various experts in data analytics and data stewardship.

Thanks to the investment in experts in data analytics and data stewardship, a new initiative was launched at the University of Melbourne: the Melbourne Data Analytics Platform. The initiative is overseen (and funded) jointly by the Chancellery and Melbourne’s academic community. The data experts engage with projects on a collaborative basis and they make intellectual contributions to them. The exact nature of the partnerships varies and might involve data analysis, visualisation, the set-up of effective workflows, and support with grant writing. And there is clearly a lot of demand at Melbourne for such services. There are calls every 6 months for new collaborative projects. Proposals are then reviewed and scored by a steering committee, which has a representative from each faculty. In the most recent call, 45 applications were received and only 7 of them were awarded as official collaborations.

Data Stewards and Data Champions

The University of Melbourne also has its Data Stewards and Data Champions programmes. Data Stewards are either members of the Melbourne Data Analytics Platform or appointed at the Library. Data Champions, similarly to our programme at TU Delft, are researchers who are passionate about data management issues and are part of a community of like-minded individuals.

I was interested to hear that at the University of Melbourne, similarly to our model at TU Delft, advice and support were favoured over compliance-based approaches. Advice and support were perceived as more likely to result in trust-based relationships with researchers. 

We also had a discussion about the long-term sustainability of the services – who should pay for them and how. One of the observations was that while project money seems like a possible cost-efficient way of funding support services, such models might put those who can’t get competitive grants at a disadvantage, or even exclude them from using a service (e.g. early career researchers, academics from disciplines which are not well funded).

Community-driven and community-delivered training

My last (but definitely not least!) meeting was with Sonia Ramza and her team of Community Managers, who deliver Digital Research Skills training at the University of Melbourne. They have adopted a very innovative and original model for their training provision – the training is community-driven and community-delivered. So how does it work? 

Researchers come to interactive, problem-solving training offered by the team. Some of these researchers become enthusiastic about the training. Some of them then become volunteers: ‘Research Leads’ in their respective communities. As Research Leads they assume the role of local trainers. They either share their learning in dedicated training sessions within their communities or help the central team deliver training for new researcher cohorts, by enriching central training with practical case studies on how to apply the learning in their own research context. Some of the most active ‘Research Leads’ then get a chance to become paid ‘Community Managers’ who, in addition to their research positions, oversee the training curricula and the ‘Research Leads’ communities around each curriculum (each Community Manager oversees around five ‘Research Leads’).

The whole approach to community-driven and community-delivered training is very agile. The team is encouraged to experiment. Each experiment is evaluated. The team then decides what is worth keeping and what needs to be changed. That’s how their current training programme evolved. It was initially heavily based on The Carpentries but was adapted with time. The curricula offered by The Carpentries, while very useful with their focus on a practical approach and pedagogical skills, sometimes felt too rigid and too lengthy for busy researchers at Melbourne. Therefore, the team took certain parts of the Carpentries curricula, modified the content in collaboration with experts on engagement and pedagogical approaches, and made them available as stand-alone, easily digestible, problem-solution focused courses (most of them last for a few hours, and the longest training is about 6h long). In addition to these courses, the team is also offering very short YouTube videos. The philosophy behind these is that they can be accessed by anyone, anywhere, and one can learn a lot in just 3 minutes! (just subscribed to their YouTube channel :)).

Resources

Other blog posts from my trip to Australia:

Data Champions Get Together – 5 March 2020

On Thursday 5 March 2020 we had the first meeting of TU Delft Data Champions this year.


What happened?

A lot! In 2 hours we managed to squeeze in:

  • 4 diverse talks (note! each talk was max. 10 min, in order to allow the same time for discussions and Q&A):
  • Open Science pub quiz Intermezzo – we had lots of fun thanks to our quiz master Femke van Giessen
  • Launch of the book by our internship student Connie Clare: “The Real World of Research Data” containing:
    • Interviews with 16 of our Data Champions
    • A 6-step recipe for other institutions who wish to start Data Champions communities

In summary, lots of discussions, enthusiasm, practice exchange… and fun! And as always, we concluded with networking drinks and snacks.

What’s next?

The next get-together of our Data Champions community will be in September this year. But in the meantime, there will be numerous events and smaller meetings (also at discipline- and community-level), so keep an eye on the announcements in the monthly newsletter.

In the meantime, consider joining the TU Delft Data Champions community.

Resources

All presentations are of course shared openly on Zenodo: https://doi.org/10.5281/zenodo.3699076

The Turing Way Bookdash

Author: Esther Plomp

The Turing Way Book Dash participants working on their contributions (photo by Esther Plomp).

I attended the second Turing Way book dash event (London, 21 and 22 February), which may need some explaining:

  • The Turing Way is a ‘lightly opinionated’ online guide to reproducible data science. The book is collaboratively written using GitHub and Jupyter Book, an effort led by Kirstie Whitaker.
  • A book dash is a short event (1-2 days) where people come together to work on a book. The name book dash derives from a book sprint, where the time taken is longer than a dash (3-5 days).

By attending the Turing Way Book dash I got to contribute to this amazing resource, I met a lot of great people that are part of the Turing Way community, and I gained more confidence in working collaboratively using GitHub (and made my first pull request!).

Malvika Sharan introducing the Book Dash to the participants (photo by Esther Plomp).

Together with my partner in credit-crime, Frances Madden, we added a section about why you should have an ORCID, which was reviewed by Jade Pickering. This resulted in a drawing made by Scriberia.

Photograph of the drawing by Scriberia, which will be available on Zenodo.

Afterwards I contributed a data citation section and revised the Research Data Management chapter, adding in some of the examples we highlight at Delft. (These changes are still under review at the time of writing.)

It even got me back into cross-stitching, thanks to Sarah Gibson! (Stay tuned for a picture of my cross-stitched Binder logo.)

Exciting additions thanks to the book dash participants include:

As you may conclude from this list, I got to meet and collaborate with amazing people during the book dash! I hope to continue to add contributions to the Turing Way through the bi-weekly Turing Way Collaboration Cafés. The next Café is on the fourth of March and starts at 16:00!

If you would like to contribute to the Turing Way please get in touch or visit their contributing guidelines to learn how to start.