All Learning Resources

  • Metadata Management for Spatial Data Infrastructures

    This presentation will focus on creating geospatial metadata for spatial data infrastructures. The growing emphasis on data management practices in recent years has underscored the need for well-structured metadata to support the preservation and reuse of digital geographic information. Despite its value, creation of geospatial metadata is widely recognized as a complex and labor-intensive process, often creating a barrier to effective identification and evaluation of digital datasets. We will discuss our set of best practices for describing a variety of spatial content types using the ISO Series of Geographic Metadata Standards. We will share a series of Python and XSLT routines, which automate the creation of ISO-compliant metadata for geospatial datasets, web services, and feature catalogs. These auto-generation tools are designed to work directly with XML documents, making them suitable for use within any XML-aware cataloging platform. Our goals are to make metadata creation simpler for data providers, and to increase standardization across organizations in order to increase the potential for metadata sharing and data synchronization among the geospatial community.
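
    As a rough illustration of what automated creation of ISO metadata can look like, the sketch below builds a small XML skeleton with Python's standard library. It is not the presenters' routine: the helper names (`char_string`, `build_record`) and the sample values are hypothetical, and the record is deliberately incomplete, so it would not validate against the ISO schemas without further required elements such as contact and date information.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by the ISO 19139 XML encoding of ISO 19115 metadata.
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def char_string(parent, tag, text):
    """Append <gmd:tag><gco:CharacterString>text</gco:CharacterString></gmd:tag>."""
    wrapper = ET.SubElement(parent, f"{{{GMD}}}{tag}")
    ET.SubElement(wrapper, f"{{{GCO}}}CharacterString").text = text
    return wrapper


def build_record(identifier, title, abstract):
    """Build an incomplete metadata skeleton (hypothetical helper, not schema-valid)."""
    root = ET.Element(f"{{{GMD}}}MD_Metadata")
    char_string(root, "fileIdentifier", identifier)
    info = ET.SubElement(root, f"{{{GMD}}}identificationInfo")
    ident = ET.SubElement(info, f"{{{GMD}}}MD_DataIdentification")
    citation = ET.SubElement(ident, f"{{{GMD}}}citation")
    ci_citation = ET.SubElement(citation, f"{{{GMD}}}CI_Citation")
    char_string(ci_citation, "title", title)
    char_string(ident, "abstract", abstract)
    return root


# Sample values are placeholders chosen for illustration only.
record = build_record("example-0001", "Example roads layer", "Placeholder abstract.")
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

    Because the output is an ordinary XML document, a record like this could in principle be passed on to XSLT or to an XML-aware cataloging platform for further processing.
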
  • Dash: Making Data Sharing Easier

    Dash is a self-service tool for researchers to select, describe, identify, upload, update, and share their research data. 

    Dash is the front end to the Dryad repository platform.

  • Rethinking Research Data | Kristin Briney | TEDxUWMilwaukee

    The United States spends billions of dollars every year to publicly support research that has resulted in critical innovations and new technologies. Unfortunately, the outcome of this work, published articles, only provides the story of the research and not the actual research itself. This often results in the publication of irreproducible studies or even falsified findings, and it requires significant resources to discern the good research from the bad. There is a way to improve this process, however, and that is to publish both the article and the data supporting the research. Shared data helps researchers identify irreproducible results. Additionally, shared data can be reused in new ways to generate new innovations and technologies. We need researchers to “React Differently” with respect to their data to make the research process more efficient, transparent, and accountable to the public that funds them.

    Kristin Briney is a Data Services Librarian at the University of Wisconsin-Milwaukee. She has a Ph.D. in physical chemistry and a Master's in library and information studies, and currently works to help researchers manage their data better. She is the author of “Data Management for Researchers” and regularly blogs about data best practices.

    This talk was given at a TEDx event using the TED conference format but independently organized by a local community.

  • Introduction to Lidar

    This self-paced, online training introduces several fundamental concepts of lidar and demonstrates how high-accuracy lidar-derived elevation data support natural resource and emergency management applications in the coastal zone. Note: requires the Adobe Flash plugin.

    Learning objectives:

    • Define lidar
    • Select different types of elevation data for specific coastal applications
    • Describe how lidar data are collected
    • Identify the important characteristics of lidar data
    • Distinguish between different lidar data products
    • Recognize aspects of data quality that impact data usability
    • Locate sources of lidar data
    • Discover additional information and additional educational resources


  • OntoSoft Tutorial: A distributed semantic registry for scientific software

    An overview of the OntoSoft project, an intelligent system to assist scientists in making their software more discoverable and reusable.

    For more information on the OntoSoft project, go to ​

  • Digital Preservation Handbook

    The Handbook provides an internationally authoritative and practical guide to the subject of managing digital resources over time and the issues in sustaining access to them. It is a key knowledge base for digital preservation, peer-reviewed and freely accessible to all. It will be of interest to all those involved in the creation and management of digital materials.
    The contents page provides an "at a glance" view of the major sections and all their component topics. You can navigate the Handbook by clicking and expanding the "Explore the Handbook" navigation bar or by clicking links in the contents page.
    The contents are listed hierarchically and indented to show major sections and sub-sections. Landing pages provide overviews and information for major sections with many sub-sections.
    -Digital preservation briefing [landing page]
    -Getting started
    -Institutional strategies [landing page]
    -Organizational activities [landing page]
    -Technical solutions and tools [landing page]
    -Content-specific preservation [landing page]

  • Data Management Lifecycle and Software Lifecycle Management in the Context of Conducting Science

    This paper examines how comparing digital science data curation lifecycles with software development lifecycles can provide insight into promoting sustainable science software. The goal of this paper is to start a dialog examining the commonalities, connections, and potential complementarities between the data lifecycle and the software lifecycle in support of sustainable software. We argue, based on this initial survey, that delving more deeply into the connections between data lifecycle approaches and software development lifecycles will enhance both in support of science.

  • USGS Data Management Training Modules—Metadata for Research Data

    This is one of six interactive modules created to help researchers, data stewards, managers and the public gain an understanding of the value of data management in science and provide best practices to perform good data management within their organization. This module covers metadata for research data. The USGS Data Management Training modules were funded by the USGS Community for Data Integration and the USGS Office of Organizational and Employee Development's Technology Enabled Learning Program in collaboration with Bureau of Land Management, California Digital Library, and Oak Ridge National Laboratory. Special thanks to Jeffrey Morisette, Dept. of the Interior North Central Climate Science Center; Janice Gordon, USGS Core Science Analytics, Synthesis, and Libraries; National Indian Programs Training Center; and Keith Kirk, USGS Office of Science Quality Information.

  • The Oxford Common File Layout

    The Oxford Common File Layout (OCFL) specification describes an application-independent approach to the storage of digital information in a structured, transparent, and predictable manner. It is designed to promote long-term object management best practices within digital repositories. This presentation was given under the topic of Preservation Tools, Techniques and Policies for the Research Data Alliance Preserving Scientific Annotation Working Group on April 4, 2017.

  • Data Champions: leading the way to proper research data management

    Presentation given by Esther Plomp at the "FAIR Data - the Key to Sustainable Research" seminar at Tartu University Library on the 9th of April 2019.

    Data Champions are experts in data management who share their experience with their group/department members. They are volunteers who act as advocates for good data management and sharing practices, and they help their Faculty’s Data Steward with discipline-specific understanding of Research Data Management (RDM). The Data Champion programme started in September 2018 at TU Delft and is open to researchers at all levels (PhD students to professors) as well as support staff (data managers, software developers and technicians). As the Data Champions are members of all faculties and various departments of Delft University of Technology (TU Delft), they form a network across the university campus. The Champions are invited to meetings where they interact with each other and share their experiences, such as achievements and problems that they encounter in managing data and software. They are also encouraged to participate at a national and international level: they are kept informed of current trends in data management, and a travel grant allows them to participate in RDM events, trainings and workshops. At TU Delft they actively work together with the Data Stewards on RDM policy development and are also involved in more practical activities such as coding support and Software Carpentry workshops. These activities increase the visibility and impact of the Data Champions, recognise their data management efforts, and offer them opportunities to learn new skills which they can share with their local community members.

  • Reproducible Quantitative Methods: Data analysis workflow using R

    Reproducibility and open scientific practices are increasingly being requested or required of scientists and researchers, but training on these practices has not kept pace. This course, offered by the Danish Diabetes Academy, intends to help bridge that gap. This course is aimed mainly at early career researchers (e.g. PhD students and postdocs) and covers the fundamentals and workflow of data analysis in R.

    This repository contains the lesson, lecture, and assignment material for the course, including the website source files and other associated course administration files. 

    By the end of the course, students will have:

    1. An understanding of why an open and reproducible data workflow is important.
    2. Practical experience in setting up and carrying out an open and reproducible data analysis workflow.
    3. Knowledge of how to continue learning methods and applications in this field.

    Students will develop proficiency in using the R statistical computing language, as well as improving their data and code literacy. Throughout this course we will focus on a general quantitative analytical workflow, using the R statistical software and other modern tools. The course will place particular emphasis on research in diabetes and metabolism; it will be taught by instructors working in this field and it will use relevant examples where possible. This course will not teach statistical techniques, as these topics are already covered in university curriculums.

    For more detail on the course, check out the syllabus at:

  • Template Research Data Management workshop for STEM researchers

    These materials are designed as a template for an introductory Research Data Management workshop for STEM postgraduate students and Early Career Researchers. The workshop is interactive and is designed to run for 2-3 hours, depending on which sections are delivered. Because it is a template, it contains enough material to cover all disciplines, so it is unlikely that every section will be of interest to any one group of researchers. The sections are:

    • Backup and file sharing
    • How to organise your data well
    • Data Tools
    • Personal and sensitive data
    • Data sharing
    • Data Management Plans

    The workshop works best when adapted for a particular discipline and with a maximum of 30 participants. This workshop was developed for the Data Champions programme at the University of Cambridge and is an adaptation of workshops which are run on a regular basis for PhD students and Postdoctoral researchers. If you would like any more information please email [email protected] and we would be happy to answer any questions that you have.

  • Developing Data Management Education, Support, and Training

    These presentations were part of an invited guest lecture on data management for CISE graduate students in the CAP 5108: Research Methods for Human-centered Computing course at the University of Florida (UF) on April 12, 2018. Graduate students were introduced to the DCC Checklist for a Data Management Plan, OAIS Model (cessda adaptation), ORCiD, IR, high-performance computing (HPC) storage options at UF, data lifecycle models (USGS and UNSW), data publication guides (Beckles, 2018) and reproducibility guidelines (ACM SIGMOD 2017/2018). This was the first guest lecture on data management for UF computer & information science & engineering (CISE) graduate students in this course. A draft of a reproducibility template is provided in version 3 of the guest lecture.

  • How to motivate researcher engagement?

    Presentation about Data Stewardship at TU Delft and Data Championship at Cambridge University, given at the Dutch LCRDM (Landelijk Coördinatiepunt Research Data Management) Data Steward meeting on 1st December 2017. Topics covered include suggestions by data stewards about how to approach and persuade researchers to engage in data management and stewardship activities.

  • CURATE! The Digital Curator Game

    The CURATE game is designed to be used as an exercise that prompts players to put themselves into digital project scenarios in order to address issues and challenges that arise when institutions engage with digital curation and preservation.

    Developed as a means to highlight the importance of training in digital curation among practitioners and managers working in libraries, museums and cultural heritage institutes, the game has been used as a self-assessment tool, a team-building exercise and a training tool for early career students.

    The CURATE game package includes:

    • Welcome to CURATE Presentation
    • Game Board (PDF)
    • Game Cards (PDF)
    • About the Game (PDF)
    • Rules (PDF)
    • Record Sheet & Closing Questions (PDF)
    • Frequently Asked Questions (DOC)

  • Coffee and Code: R & RStudio

    What is R?

    R is an Open Source programming language that is specifically designed for data analysis and visualization. It consists of the core R system and a collection of (currently) over 13,000 packages that provide specialized data manipulation, analysis, and visualization capabilities. R is an implementation of the *S* statistical language, which was developed in the mid-1970s at Bell Labs; development of R itself started in the early 1990s, and a stable beta version was available by 2000. R has been under continuous development for over 25 years and has hit major development milestones over that time.

    R syntax is relatively straightforward and is based on a core principle of providing reasonable default values for many functions, and allowing a lot of flexibility and power through the use of optional parameters.

  • Train the Trainer Workshop: How do I create a course in research data management?

    Presentations and exercises from a train-the-trainer workshop on how to create a course in research data management, given at the International Digital Curation Conference 2018 in Barcelona.

  • Data and Software Skills Training for Librarians

    Library Carpentry is an open education volunteer network and lesson organization dedicated to teaching librarians data and software skills. The goal is to help librarians better engage with constituents and improve how they do their work. This presentation will serve as an introduction on how Library Carpentry formed in 2015, evolved as a global community of library professionals and will continue as a future sibling of the Carpentries, an umbrella organization of distinct lesson organizations, such as Data and Software Carpentry. We’ll cover existing collaborative lesson development, curricula coverage, workshop activities and the global instructor community. We’ll then talk about the future coordinating activities led by the UC system to align and prepare for a merging with Data and Software Carpentry.

  • Workshop: Research Data Management in a Nutshell

    The workshop Research Data in a Nutshell was part of the Doctoral Day of the Albertus Magnus Graduate Center (AMGC) at the University of Cologne on January 18, 2018.

    The workshop was intended as a brief, interactive introduction to RDM for beginning doctoral students.

  • Ecology Curriculum

    This workshop uses a tabular ecology dataset from the Portal Project Teaching Database and teaches data cleaning, management, analysis, and visualization. There are no pre-requisites, and the materials assume no prior knowledge about the tools. We use a single dataset throughout the workshop to model the data management and analysis workflow that a researcher would use.

    • Data Organization in Spreadsheets
    • Data Cleaning with OpenRefine
    • Data Management with SQL
    • Data Analysis and Visualization in R
    • Data Analysis and Visualization in Python

    The Ecology workshop can be taught using R or Python as the base language.
    Portal Project Teaching Dataset: the Portal Project Teaching Database is a simplified version of the Portal Project Database designed for teaching. It provides a real-world example of life-history, population, and ecological data, with sufficient complexity to teach many aspects of data analysis and management, but with many complexities removed to allow students to focus on the core ideas and skills being taught.

  • Planning for Software Reproducibility and Reuse

    Many research projects depend on the development of scripts or other software to collect data, perform analyses or simulations, and visualize results.  Working in a way that makes it easier for your future self and others to understand and re-use your code means that more time can be dedicated to the research itself, rather than troubleshooting hard-to-understand code, resulting in more effective research. In addition, by following some simple best practices around code sharing, the visibility and impact of your research can be increased.  In this introductory session, you will:

    • learn about best practices for writing, documenting (Documentation), and organizing code (Organization & Automation),
    • understand the benefits of using version control (Version Control & Quality Assurance),
    • learn about how code can be linked to research results and why (Context & Credit),
    • understand why it is important to make your code publishable and citable and how to do so (Context & Credit),
    • learn about intellectual property issues (Licensing),
    • learn about how and why your software can be preserved over time (Archiving).

  • ScienceBase as a Platform for Data Release

    This video tutorial provides information about using ScienceBase as a platform for data release. We will describe the data release workflow and demonstrate, step-by-step, how to complete a data release in ScienceBase.

  • Introduction to GRASS GIS

    GRASS GIS, commonly referred to as GRASS (Geographic Resources Analysis Support System), is a free and open source Geographic Information System (GIS) software suite used for geospatial data management and analysis, image processing, graphics and maps production, spatial modeling, and visualization. GRASS GIS is currently used in academic and commercial settings around the world, as well as by many governmental agencies and environmental consulting companies. It is a founding member of the Open Source Geospatial Foundation (OSGeo).

    This training includes an introduction to raster and vector analysis, image processing, water flow modeling, lidar data import and analysis, solar radiation analysis, shaded relief, network analysis using four interfaces, and Python scripting. The training uses GRASS GIS 7.0 and a GRASS GIS NC SPM sample dataset. The training materials are also available in a GitHub repository.

  • What is Open Science?

    This introductory course will help you to understand what open science is and why it is something you should care about. You'll get to grips with the expectations of research funders and will learn how practising aspects of open science can benefit your career progression. Upon completing this course, you will:

    • understand what Open Science means and why you should care about it
    • be aware of some of the different ways to go about making your own research more open over the research lifecycle
    • understand why funding bodies are in support of Open Science and what their basic requirements are
    • be aware of the potential benefits of practising open science

    It is important to remember that Open Science is not different from traditional science. It just means that you carry out your research in a more transparent and collaborative way. Open Science applies to all research disciplines. While Open Science is the most commonly used term, you may also hear people talking about Open Scholarship or Open Research in the Arts and Humanities.

  • Research Data Management Guide

    This guide can assist you in effectively managing, sharing, and preserving your research data. It provides information and guidance for all aspects of the data lifecycle, from creating data management plans during the proposal phase to sharing and publishing your data at the conclusion of your project. This guide is not specific to any particular funder, discipline, or type of data.  The guide also features data management stories and examples, both good and bad, that would be useful to research data management instructors or other service providers.


    This training module provides you with instructions on how to deploy B2SAFE and B2STAGE with iRODS4. It also shows you how to use these services. Moreover, the module provides hands-on training on Persistent Identifiers, more specifically Handle v8 and the corresponding B2HANDLE Python library.

    B2SAFE is a robust, safe and highly available service which allows community and departmental repositories to implement data management policies on their research data across multiple administrative domains in a trustworthy manner.

    B2STAGE is a reliable, efficient, light-weight and easy-to-use service to transfer research data sets between EUDAT storage resources and high-performance computing (HPC) workspaces.

    Please consult the user documentation on the services for a general introduction, if needed, before following the contents of this git repository. This training material foresees two types of trainees: those who want to learn how to use the EUDAT B2SAFE and B2STAGE services; and those who prefer to deploy and integrate these services. Following the full, in-depth tutorial will allow you to understand how the components of a service are combined and thus enables you to also extend the integration of services at the low-level (technology-level rather than API level). Following just the "use" part of the training will familiarise you with the APIs of the services, but not with the underlying technology and its wiring.

  • Training on using the E2O WCI Data Portal

    Course Content: This course will offer an introduction to the eartH2Observe Water Cycle Integrator (WCI) Data Portal available at

    Course Objectives: Training the users on how to navigate through the E2O WCI portal: navigate around the map, select indicators by searching, perform some analysis on the selected indicators, download data, and other WCI functionalities.

    Why is this topic interesting? With this training we can increase the use of the WCI, build capacity, and further the dissemination of all the available data and tools. Upon completion of this training, users will be able to use the portal and its functionalities more efficiently.

    The course includes 3 lessons:
    Lesson 1: GISportal - An Introduction
    Lesson 2: GISportal - External Data and Collaboration
    Lesson 3: GISportal - Docker Version

    The Earth2Observe (E2O) Water Cycle Integrator (WCI) portal takes data that you select and plots it on a map to help you analyse, export and share it.

    The WCI portal is an open source project built by Plymouth Marine Laboratory's Remote Sensing Group. The portal builds on the development of several other EU funded projects, past and present, that PML have involvement in. You can find the code on GitHub at 

  • Data Management Guidelines

    The guidelines available from this web page cover a number of topics related to research data management. The guidelines are targeted to researchers wishing to submit data to the Finnish Social Science Data Archive, but may be helpful to other social scientists interested in practices related to research data management, with the understanding that the guidelines refer to the situation in Finland and may not be applicable in other countries due to differences in legislation and research infrastructure.
    High level topics (or chapters) covered include:
    - Data management planning (the data, rights, confidentiality and data security, file formats and programs, documentation on data processing and content, lifecycle, data management plan models)
    - Copyrights and agreements
    - Processing quantitative data files
    - Processing qualitative data files
    - Anonymisation and personal data including policies related to ethical review of human sciences
    - Data description and metadata
    - Physical data storage
    - Examples 
    The guidelines are also available in FSD's Guidelines in DMPTuuli, a data management planning tool for Finnish research organisations. It provides templates and guidance for making a data management plan (DMP).

  • Bio-Linux

    Bio-Linux 8 is a powerful, free bioinformatics workstation platform that can be installed on anything from a laptop to a large server, or run as a virtual machine. Bio-Linux 8 adds more than 250 bioinformatics packages to an Ubuntu Linux 14.04 LTS base, providing around 50 graphical applications and several hundred command line tools. The Galaxy environment for browser-based data analysis and workflow construction is also incorporated in Bio-Linux 8.
    Bio-Linux 8 comes with a tutorial document suitable for complete beginners to Linux, though some basic bioinformatics knowledge (e.g. what is a read, assembly, feature, translation) is assumed. The tutorial comprises a general introduction to the Linux system and a set of exercises exploring specific bioinformatics tools. You can find the latest version of the tutorial via the Bio-Linux documentation icon on the desktop. There is also a copy online. Allow yourself around 2 days to work through this, depending on your previous experience. Other, taught courses can be found on the Bio-Linux Training web page.
    Bio-Linux 8 represents the continued commitment of NERC to maintain the platform, and comes with many updated and additional tools and libraries.  With this release we support pre-prepared VM images for use with VirtualBox, VMWare or Parallels.  Virtualised Bio-Linux will power the EOS Cloud, which is in development for launch in 2015.

  • Introduction to HydroShare

    HydroShare is an online, collaborative system for sharing and publishing a broad set of hydrologic data types, models, and code. It enables people to collaborate seamlessly in a high performance computing environment, thereby enhancing research, education, and application of hydrologic knowledge. HydroShare is being developed by a team from the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) supported by National Science Foundation awards ACI-1148453 and ACI-1148090.

    The introduction to HydroShare includes a Getting Started guide, a Frequently Asked Questions guide, and a number of videos on topics such as:
    - Collaborative Data and Model Sharing using HydroShare
    - Delineate Watersheds and Perform Hydrologic Terrain Analysis with HydroShare and CyberGIS
    - Share, Publish and execute your SWAT models with HydroShare and SWATShare

    CUAHSI is an organization representing more than 130 U.S. universities and international water-science-related organizations and is sponsored by the National Science Foundation to provide infrastructure and services to advance the development of hydrologic science and education in the United States.

  • The Realities of Research Data Management

    The Realities of Research Data Management is a four-part series that explores how research universities are addressing the challenge of managing research data throughout the research lifecycle. In this series, we examine the context, influences, and choices higher education institutions face in building or acquiring RDM capacity—in other words, the infrastructure, services, and other resources needed to support emerging data management practices. Our findings are based on case studies of four institutions: University of Edinburgh (UK), the University of Illinois at Urbana-Champaign (US), Monash University (Australia) and Wageningen University & Research (the Netherlands), in four very different national contexts.

    - Part One of the series: A Tour of the Research Data Management (RDM) Service Space
    - Part Two of the series: Scoping the University RDM Service Bundle
    - Part Three of the series: Incentives for Building University RDM Services
    - Part Four of the series: Sourcing and Scaling RDM Services
    In addition, supplemental material has been provided, including in-depth profiles of each collaborating institution's RDM service spaces and a "Works in Progress Webinar: Policy Realities in Research Data Management" with an accompanying three-part Planning Guide.

  • Data Management for the Humanities

    The guidelines available from this web page cover a number of topics related to Data Management. Many of the resources and information found in this guide have been adapted from the UK Data Archive and the DH Curation Guide. The guidelines are targeted to researchers wishing to submit data to the social science research data, and would be useful to new data curators and data librarians in the Arts & Humanities as well.  Each section has useful references for further study, if desired.
    What You Will Find in This Guide:
    -How to Document and Format your Data
    -Examples of Data Management Plans (DMP) and Data Curation Profiles (DCP)
    -Tools to Help You Create DMPs and DCPs
    -California Digital Library Data Repositories
    -Where to Get Help on Campus
    -A list of Federal Funding Agencies and Their Data Management Requirements
    -A Description of Data Curation for the Humanities and What Makes Humanities Data Unique
    -Information on Data Representation
    -Resources on Data Description Standards

  • Intro to SQL for Data Science

    The role of a data scientist is to turn raw data into actionable insights. Much of the world's raw data—from electronic medical records to customer transaction histories—lives in organized collections of tables called relational databases. Therefore, to be an effective data scientist, you must know how to wrangle and extract data from these databases using a language called SQL (pronounced ess-que-ell, or sequel). This course teaches you everything you need to know to begin working with databases today!
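
    To give a flavor of the kind of query such a course builds toward, here is a minimal sketch that runs SQL from Python using the standard-library sqlite3 module. The `transactions` table, its columns, and its rows are invented for illustration and are not taken from the course.

```python
import sqlite3

# In-memory database, so the example leaves no files behind.
conn = sqlite3.connect(":memory:")

# Hypothetical table of customer transactions (illustrative data only).
conn.execute("CREATE TABLE transactions (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [("ana", 20.0), ("ben", 5.0), ("ana", 12.5)],
)

# Total spend per customer, largest first.
rows = conn.execute(
    "SELECT customer, SUM(amount) AS total "
    "FROM transactions GROUP BY customer ORDER BY total DESC"
).fetchall()
print(rows)

conn.close()
```

    The SELECT, GROUP BY, and ORDER BY clauses do the actual wrangling; Python only opens the connection and collects the result rows.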


    ResearchVault is a secure computing environment where scientists and collaborators can conduct research on restricted and confidential data.

    ResearchVault (also known as ResVault) is designed to act as a workstation that is secure and pre-approved with the capacity for large-scale data storage and computation. Researchers can:

    • Securely store restricted data like:
      • electronic protected health information (ePHI) (HIPAA)
      • export-controlled data (ITAR/EAR)
      • student data (FERPA)
      • controlled unclassified information (CUI)
      • intellectual property data (IP)
    • Store and work with larger data sets than is possible on a regular workstation
    • Perform work on stored data sets with familiar software tools running on virtual machines located in the UF data center
    • Concurrently run more programs than on a regular workstation
    • Display work results on a graphical interface that is securely transmitted to remote devices such as desktops, laptops, or iPads
    • Work collaboratively with other researchers on the same data sets using different workstations

    The system is modeled on a bank vault where you receive:

    - An individual deposit box with secure storage for valuables
    - Privacy from other users and bank staff
    - A secure area within the vault to privately access your valuables

    ResVault is available to University of Florida faculty and students.  People not associated with UF can be sponsored by faculty at UF.  Training materials available from the ResVault home page include an introductory / overview video, and a recording of a training session on the research administration for restricted data that was given on Oct 12, 2018. The recording is available from the UF Media Website. It describes how the requirement for using special IT infrastructure is handled and how the right environment for each project is determined, as well as the training requirements for project participants.

  • Managing Creative Arts Research Data

    This post-graduate teaching module for creative arts disciplines is focused on making data and digital documentation that is highly usable and has maximum impact. The module content is particularly well suited for inclusion within MA programmes dealing with ephemeral art forms such as dance, music, visual art, theatre or media design. Learning is self-directed. MCARD-ExcersiceV1.0.pdf is an optional, summative assessment exercise.

    This module, funded by the wider JISC Managing Research Data programme as part of the Curating Artistic Research Output (CAiRO) Project, offers data management knowledge tailored to the special requirements of creative arts researchers who produce non-standard (i.e. non-textual) research outputs. The module aims to develop the skills arts researchers need to effectively self-archive and then disseminate data made through research activities. The module can also help researchers to better understand data management issues and then communicate needs to third parties, such as institutional repositories, in order to negotiate appropriate levels of service.

    Downloadable resources associated with this module include a zip file containing module content as stand-alone .html files, a PDF of the optional summative exercise, and a PDF version of the introduction.  Topics include:
    Unit 1: Introducing art as research data
    Unit 2: Creating art as research data
    Unit 3: Managing art as research data
    Unit 4: Delivering art as research data

    Each unit has a suggested order (accessible via the navigation on the left of each page) and additional ‘Focus on’ content which further illustrates topics covered in the main body.  Module content can be accessed directly online at:

  • Best Practices in Data Collection and Management Workshop

    Ever need to help a researcher share and archive their research data? Would you know how to advise them on managing their data so it can be easily shared and re-used? This workshop will cover best practices for collecting and organizing research data related to the goal of data preservation and sharing. We will focus on best practices and tips for collecting data, including file naming, documentation/metadata, quality control, and versioning, as well as access and control/security, backup and storage, and licensing. We will discuss the library’s role in data management, and the opportunities and challenges around supporting data sharing efforts. Through case studies we will explore a typical research data scenario and propose solutions and services by the library and institutional partners. Finally, we discuss methods to stay up to date with data management related topics.

    This workshop was presented at the NN/LM MAR Research Data Management Symposium: Doing It Your Way: Approaches to Research Data Management for Libraries.  PowerPoint slides are available for download.  Files include a biophysics case study.

    Terms of Access:  There is one restricted file in this dataset, which may be used; however, you are asked not to share the Mock lab notebook, which is completely fictitious. Users may request access to files.

  • Pyunicorn Tutorials

    pyunicorn (Unified Complex Network and RecurreNce analysis toolbox) is a fully object-oriented Python package for the advanced analysis and modeling of complex networks. In addition to the standard measures of complex network theory such as degree, betweenness and clustering coefficient, it provides some uncommon but interesting statistics like Newman’s random walk betweenness. pyunicorn features novel node-weighted (node splitting invariant) network statistics as well as measures designed for analyzing networks of interacting/interdependent networks.

    Moreover, pyunicorn makes it easy to construct networks from uni- and multivariate time series data (functional (climate) networks and recurrence networks). This involves linear and nonlinear measures of time series analysis for constructing functional networks from multivariate data as well as modern techniques of nonlinear analysis of single time series like recurrence quantification analysis (RQA) and recurrence network analysis. Other introductory information about pyunicorn can be found at: .
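
    To make the terms above concrete, here is a minimal pure-Python sketch of a recurrence network built from a single time series, together with two of the standard measures mentioned (degree and local clustering coefficient). It deliberately does not use the pyunicorn API. The function names, the threshold eps, and the sample series are assumptions made for illustration, and the scalar distance is a simplification, since real analyses usually embed the series first.

```python
def recurrence_network(series, eps):
    """Adjacency matrix linking time points whose values differ by at most eps."""
    n = len(series)
    # Self-loops are excluded, so the diagonal stays zero.
    return [
        [1 if i != j and abs(series[i] - series[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


def degree(adj):
    """Number of neighbours of each node."""
    return [sum(row) for row in adj]


def local_clustering(adj):
    """Fraction of each node's neighbour pairs that are themselves linked."""
    n = len(adj)
    result = []
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j]]
        k = len(nbrs)
        if k < 2:
            # Fewer than two neighbours: no pairs, so clustering is set to 0.
            result.append(0.0)
            continue
        links = sum(adj[a][b] for pos, a in enumerate(nbrs) for b in nbrs[pos + 1:])
        result.append(2.0 * links / (k * (k - 1)))
    return result


if __name__ == "__main__":
    series = [0, 1, 2, 10, 11]  # hypothetical observations
    adj = recurrence_network(series, eps=2)
    print(degree(adj))
    print(local_clustering(adj))
```

    The nested-list representation keeps the sketch free of dependencies; for series of realistic length an array library or pyunicorn itself would be the more practical choice.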

    Tutorials for pyunicorn are designed to be self-explanatory.  Besides being online, the tutorials are also available as IPython notebooks.  For further details about the classes and methods used, please refer to the API at:

  • E-Infrastructures and Data Management Toolkit

    This online toolkit provides training and educational resources for data discovery, management, and curation across the globe, in support of an international collaborative effort to enable open access to scientific data.  Tools within the toolkit include:
    - DDOMP Researcher Guide which has resources and tips for creating a successful DDOMP (data management plan)
    - Data Management Training including webinars, courses, certifications, and literature on data management topics
    - Best Practices & Standards which provide guidelines for effective data management.
    Video tutorials about each of these tools are available at: 
    Other capacity-building tools include a Data Skills Curricula Framework for enhancing information management skills in data-intensive science. It was developed by the Belmont Forum’s e-Infrastructures and Data Management (e-I&DM) Project to improve data literacy, security and sharing in data-intensive, transdisciplinary global change research.  More information about the curricula framework, including a full report and an outline of courses important for researchers doing data-intensive research, can be found at: .

  • Introduction to Research Data Management - half-day course (Oxford)

    Teaching resources for a half-day course for researchers (including postgraduate research students), giving a general overview of some major research data management topics. Included are a slideshow with presenter's notes, a key resources hand-out, and two other hand-outs for use in a practical data management planning exercise. These course materials are part of a set of resources created by the JISC Managing Research Data programme-funded DaMaRO Project at the University of Oxford. The original version of the course includes some Oxford-specific material, so delocalized versions (which omit this) of the slideshow and the key resources hand-out are also provided.

  • Introduction to Humanities Research Data Management

    Reusable, machine-readable data are one pillar of Open Science (Open Scholarship). Supporting data
    reuse requires researchers to carefully document their methods and to take good care of
    their research data. Due to this paradigm shift, for Humanities and Heritage researchers, activities and
    issues around planning, organizing, storing, and sharing data and other research results and products
    play an increasing role. Therefore, during two workshop sessions, participants will dive
    into a number of topics, technologies, and methods that are connected with
    Humanities Research Data Management. The participants will acquire knowledge and skills that will
    enable them to draft their own executable research data management plan that will support the
    production of reusable, machine-readable data, a key prerequisite for conducting effective and
    sustainable projects. Topics that will be covered are theoretical reflections on the role of data within
    humanities research and cultural heritage studies, opportunities and challenges of eHumanities and
    eResearch, implementing the FAIR principles and relevant standards, and basics of Data Management.
    Learning outcomes: Participants of this workshop will gain an overview of issues related to
    Humanities Research Data Management and learn about relevant tools and information resources.
    Through a hands-on session, the participants will be especially equipped and skilled to draft the nucleus
    of their own Research Data Management Plan.