All Learning Resources

  • DMP Assistant: bilingual tool for preparing data management plans (DMPs)

    The DMP Assistant is a bilingual tool to assist in the preparation of a Data Management Plan (DMP). This tool, which is based on international standards and best practices in data management, guides the researcher step by step through the key questions to develop a plan. DMP Assistant is powered by an open source application called DMPOnline, which is developed by the Digital Curation Centre (DCC). Site registration is required. Data management planning templates are available for the DMP Assistant after registration and sign-in.

  • Data Management for Clinical Research MOOC

    This course presents critical concepts and practical methods to support planning, collection, storage, and dissemination of data in clinical research.
     
    Understanding and implementing solid data management principles is critical for any scientific domain. Regardless of your current (or anticipated) role in the research enterprise, a strong working knowledge and skill set in data management principles and practice will increase your productivity and improve your science. The instructors' goal is to use these modules to help you learn and practice this skill set.

    This course assumes very little current knowledge of technology other than how to operate a web browser. The course will focus on practical lessons, short quizzes, and hands-on exercises as we explore together best practices for data management.

    The six modules cover these topics:

    • Research Data Collection Strategy
    • Electronic Data Capture Fundamentals
    • Planning a Data Strategy for a Prospective Study
    • Practicing What We've Learned: Implementation
    • Post-Study Activities and Other Considerations
    • Data Collection with Surveys
  • National Network of Libraries of Medicine (NNLM) Research Data Management Webinar Series

    The National Network of Libraries of Medicine (NNLM) Research Data Management (RDM) webinar series is a collaborative, bimonthly series intended to increase awareness of research data management topics and resources.  The series aims to support RDM within the library to better serve librarians and their institutional communities. Topics include, but are not limited to, understanding a library’s role in RDM, getting started, data management planning, and different RDM tools.

    Several NNLM Regional Medical Libraries will collaborate and combine efforts to feature experts from the field for this national webinar series. Each session will include separate objectives based on the featured webinar presenter. Attendee participation will be possible through the WebEx platform chat features and other electronic methods designed by the guest presenter. Sessions are recorded, closed-captioned, and posted for later viewing.

    Each session will last approximately 1 hour, and 1 MLA CE contact hour will be offered per session. CE contact hours are available only during the live presentation of each webinar.

    Watch the webpage for upcoming webinars.

  • RDMRose Learning Materials

    RDMRose was a JISC-funded project to produce and teach professional development learning materials in Research Data Management (RDM) tailored for information professionals. The Slideshare presentations and documents include an overview of RDM, research in higher education, looking at research data, the research data lifecycle, data management plans, research data services, metadata, and data citation.

    RDMRose developed and adapted learning materials about RDM to meet the specific needs of liaison librarians in university libraries, both for practitioners’ CPD and for embedding into the postgraduate taught curriculum. Its deliverables included open educational resources suitable for learning in multiple modes, including face-to-face and self-directed learning.

     

  • Overview of Interdisciplinary Earth Data Alliance (IEDA) Data Management Resources

    In the digital era, documenting and sharing our scientific data is growing increasingly important as an integral part of the scientific process. Data Management not only makes our data resources available for others to build upon, but it also enables data syntheses and new analyses that hold the potential for significant scientific advancement. Effective data management begins during the planning stages of a project and continues throughout the research process from field and/or laboratory work, through analysis, and culminating with scientific literature and data publication. By planning ahead, and following some best practices along the way, the process of data management can be simple and relatively low-effort, enabling rapid contribution and publication of data in the appropriate data systems at the conclusion of a project.

    IEDA offers a variety of tools to support investigators along the full continuum of their data management efforts. Links to these tools and resources are available from the landing page for this resource.

    Pre-Award

    • IEDA Data Discovery Tools
    • IEDA Data Management Plan (DMP) Tool

    Research & Analysis

    • Register sample-based data sets and samples 
      • Register sample metadata and get a unique sample identifier (IGSN)
      • Download Templates for Analytical Data
      • Learn about contributing Analytical Data to the EarthChem Library
    • Register sensor-based data sets and samples 
      • Contribute sensor data files (e.g. geophysical data) and supporting metadata to MGDS
    • IEDA Analysis Tools
      • GeoMapApp earth science exploration and visualization application 
        • Analyze your own geospatial data within the context of other available datasets

    Synthesis & Publication

    • Register final data sets with IEDA 
    • Publish your data 
      • Publishing your data with a DOI ensures that it can be directly referenced in your paper and cited by others.
    • IEDA Data Compliance Reporting (DCR) Tool 
      • Rapidly generate a Data Compliance Report (DCR) based on your NSF award number to demonstrate that your data are registered with IEDA systems and you are compliant with NSF Data Policies.
  • USGS Data Templates Overview

    Creating Data Templates for data collection, data storage, and metadata saves time and increases consistency. Utilizing form validation increases data entry reliability.

    Topics include:

    • Why use data templates?
    • Templates During Data Entry - how to design data validating templates 
    • After Data Entry - ensuring accurate data entry
    • Data Storage and Metadata
    • Best Practices
      • Data Templates
      • Long-term Storage
    • Tools for creating data templates
      • Google Forms
      • Microsoft Excel
      • Microsoft Access
      • OpenOffice - Calc
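
    The idea of a validating data template can be sketched in a few lines of Python. This is only an illustration, not material from the USGS module: the field names (site_id, date, temperature_c), the accepted temperature range, and the function names are all assumptions chosen for the example.

```python
from datetime import datetime


def is_iso_date(value):
    """Return True if value is a YYYY-MM-DD date string."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def in_range(value, low, high):
    """Return True if value parses as a number between low and high."""
    try:
        return low <= float(value) <= high
    except ValueError:
        return False


# Hypothetical template: each field name maps to a validation rule.
TEMPLATE = {
    "site_id": lambda v: v.strip() != "",
    "date": is_iso_date,
    "temperature_c": lambda v: in_range(v, -50.0, 60.0),
}


def validate_row(row, template=TEMPLATE):
    """Return the names of fields that are missing or fail their rule."""
    errors = []
    for field_name, check in template.items():
        value = row.get(field_name)
        if value is None or not check(value):
            errors.append(field_name)
    return errors
```

    Calling validate_row on each record at entry time, and rejecting any record with a non-empty error list, plays the same role as form validation in a spreadsheet or web form.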

     

  • Introduction to Scientific Visualization

    Scientific Visualization transforms numerical data sets obtained through measurements or computations into graphical representations. Interactive visualization systems allow scientists, engineers, and biomedical researchers to explore and analyze a variety of phenomena in an intuitive and effective way. The course provides an introduction to the principles and techniques of Scientific Visualization. It covers methods corresponding to the visualization of the most common data types, as well as higher-dimensional, so-called multi-field problems. It combines a description of visualization algorithms with a presentation of their practical application. Basic notions of computer graphics and human visual perception are introduced early on for completeness. Simple but very instructive programming assignments offer a hands-on exposure to the most widely used visualization techniques.

    Note that the lectures, demonstration, and tutorial content require a Purdue Credentials, Hydroshare, or CILogon account.

    Access the CCSM Portal/ESG/ESGC Integration slide presentation at https://mygeohub.org/resources/50/download/ccsm.pdf. The CCSM/ESG/ESGC collaboration provides a semantically enabled environment that includes modeling, simulated and observed data, visualization, and analysis.
    Topics include:

    • CCSM Overview
    • CCSM on the TeraGrid
    • Challenges
    • Steps in a typical CCSM Simulation
    • Climate Modeling Portal: Community Climate System Model (CCSM) to simulate climate change on Earth
    • CCSM Self-Describing Workflows 
    • Provenance metadata collection
    • Metadata

     

  • Simplifying the Reuse and Interoperability of Hydrologic Data Sets and Models with Semantic Metadata that is Human-Readable & Machine-Actionable

    This slide set discusses the big, generic problem facing geoscientists today, which stems from a lack of interoperability across a huge number of heterogeneous resources, and how to solve it. Practical solutions to tame the inherent heterogeneity involve the collection of standardized, "deep-description" metadata for resources that are then wrapped with standardized APIs that provide callers with access to both the data and the metadata.
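
    The pattern described above, a resource wrapped by a standardized interface that exposes both data and metadata, might look roughly like the following Python sketch. It is not taken from the slide set: the class name, method names, and the required metadata keys (variable_name, units, source) are hypothetical and stand in for whatever metadata standard a community adopts.

```python
# Hypothetical minimum set of metadata keys every resource must carry.
REQUIRED_KEYS = ("variable_name", "units", "source")


class DescribedResource:
    """Wrap raw values together with standardized descriptive metadata."""

    def __init__(self, values, metadata):
        missing = [key for key in REQUIRED_KEYS if key not in metadata]
        if missing:
            raise ValueError(f"missing metadata keys: {missing}")
        self._values = list(values)
        self._metadata = dict(metadata)

    def get_metadata(self):
        """Return a copy of the metadata so callers cannot alter it."""
        return dict(self._metadata)

    def get_values(self):
        """Return a copy of the data values."""
        return list(self._values)


def same_quantity(a, b):
    """Machine-actionable check: do two resources describe the same
    variable in the same units?"""
    ma, mb = a.get_metadata(), b.get_metadata()
    return (ma["variable_name"] == mb["variable_name"]
            and ma["units"] == mb["units"])
```

    Because every resource answers the same two calls, a caller can decide whether two data sets are comparable without knowing how either one is stored.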

  • Data Rescue: Packaging, Curation, Ingest, and Discovery

    Data Conservancy was introduced to Data Rescue Boulder through our long-time partner Ruth Duerr of Ronin Institute. Through our conversations, we recognized that Data Rescue Boulder needs to process a large number of rescued data sets and store them in more permanent homes. We also recognized that Data Conservancy, along with the Open Science Framework, has the software infrastructure to support such activities and bring a selective subset of the rescued data into our own institutional repository. We chose the subset of data based on a selection from one of the Johns Hopkins University faculty members.

    This video shows one of the pathways through which data could be brought into a Fedora-backed institutional repository using our tools and platforms.

    Data Conservancy screen cast demonstrating integration between the Data Conservancy Packaging Tool, the Fedora repository, and the Open Science Framework. Resources referenced throughout the screen cast are linked below.

    • DC Package Tool GUI
    • DC Package Ingest
    • Fedora OSF Storage Provider (under development as of April 2017)


  • ORNL DAAC Data Recipes

    A collection of tutorials, called "data recipes," that describe how to use Earth science data from NASA's Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) with easily available tools and commonly used Earth science data formats. The recipes focus on biogeochemical dynamics data and are intended for those wishing to learn or teach how to obtain and view these data.

  • EarthChem Library: How to Complete a Data Submission Template

    Learn how to complete a data submission template for the EarthChem Library (www.earthchem.org/library). You can access existing templates at www.earthchem.org/data/templates. If you do not see a template appropriate for your data type, please contact EarthChem at info@earthchem.org.

  • iData Tutorial

    A brief tutorial that shows how to upload, preview, and publish from iData. To use the accompanying My Geo Hub tutorial exercises, go to https://mygeohub.org/resources/1217. Note that the number before each step is the time on the YouTube video where it shows how each step is done. Also, note that the video does not contain audio content.

  • EarthChem Library: Submission Guidelines

    Learn general guidelines for data submission to the EarthChem Library (www.earthchem.org/library), including the data types and formats accepted and additional best practices for submission.

  • How to Manage Your Samples in MySESAR

    The System for Earth Sample Registration (SESAR) operates a registry that distributes the International Geo Sample Number (IGSN). SESAR catalogs and preserves sample metadata profiles, and provides access to the sample catalog via the Global Sample Search.

    MySESAR provides a private working space in the System for Earth Sample Registration. This tutorial will introduce you to how to manage samples in MySESAR, including how to search the sample catalog, how to view and edit samples, how to print labels, how to group samples and how to transfer ownership of samples. For details relating to sample registration, please see tutorials for individual sample and batch sample registration here: http://www.geosamples.org/help/registration.

    MySESAR allows you to:

    • obtain IGSNs for your samples by registering them with SESAR.
    • register samples one at a time by entering metadata into a web form.
    • register multiple samples by uploading metadata in a SESAR spreadsheet form.
    • generate customized SESAR spreadsheet forms.
    • view lists of samples that you registered.
    • edit sample metadata profiles.
    • upload images and other documents such as field notes, maps, or links to publications to a sample profile.
    • restrict access to metadata profiles of your samples.
    • transfer ownership of a sample to another SESAR user.
  • GeoBuilder for Exploring Geospatial Data

    A step-by-step tutorial for GeoBuilder. The GeoBuilder tool provides a wizard-type interface that guides users through several steps for loading, selecting, configuring, and analyzing geo-referenced tabular data. The data are then presented on an OpenStreetMap base map with customized annotation, station/site popups, and dynamic filtering and plotting. The tool can be used in two ways. First, an end user can use it to dynamically load and explore a CSV file of interest. Second, a data owner can use it to build a customized view of the data they want to share, save the configuration, and publish the data, configuration, and viewer as a new “tool” specifically for their data. With this, any scientist can develop an interactive, web-enabled GIS interface to share their data within minutes, whereas in the past they would have needed to hire a web developer and spend months to achieve the same result.

    Note that My Geo Hub registration is required to access the GeoBuilder tool.

  • GeoBuilder - How to Share My Session

    A brief tutorial that shows how to share a GeoBuilder session. The GeoBuilder tool provides a wizard-type interface that guides users through several steps for loading, selecting, configuring and analyzing geo-referenced tabular data. To use the accompanying My Geo Hub tutorial exercises, go to https://mygeohub.org/resources/1219. Note that the number before each step is the time on the YouTube video where it shows how each step is done. Also, note that the video does not contain audio content.

    For more information about GeoBuilder, go to https://mygeohub.org/resources/geobuilder.

  • Introduction to Lidar

    This self-paced, online training introduces several fundamental concepts of lidar and demonstrates how high-accuracy lidar-derived elevation data support natural resource and emergency management applications in the coastal zone.

    Learning objectives:

    • Define lidar
    • Select different types of elevation data for specific coastal applications
    • Describe how lidar data are collected
    • Identify the important characteristics of lidar data
    • Distinguish between different lidar data products
    • Recognize aspects of data quality that impact data usability
    • Locate sources of lidar data
    • Discover additional information and additional educational resources

    Note: requires Flash Plugin.

  • Introduction to Lidar

    This course provides an overview of lidar technology; the data collection workflow; data product formats and metadata; lidar and vegetation; QA/QC, artifacts, and other issues to keep in mind; and DEM generation from lidar point cloud data.
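
    As a rough illustration of the last topic, the simplest way to turn a point cloud into a gridded elevation model is to bin points into square cells and average the elevations in each cell. The Python sketch below shows only that binning step and is not drawn from the course; the function name and the (x, y, z) tuple input are assumptions, and production workflows typically add ground classification and an interpolation method to fill empty cells.

```python
def grid_dem(points, cell_size):
    """Bin (x, y, z) points into square cells and return a dict that
    maps each occupied (column, row) cell index to its mean elevation."""
    sums = {}
    for x, y, z in points:
        # Floor division assigns each point to the cell that contains it.
        key = (int(x // cell_size), int(y // cell_size))
        total, count = sums.get(key, (0.0, 0))
        sums[key] = (total + z, count + 1)
    return {key: total / count for key, (total, count) in sums.items()}
```

    Cells with no returns are simply absent from the result, which makes data gaps easy to spot before any interpolation is applied.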

  • Genomics Curriculum

    This workshop focuses on data management and analysis for genomics research. It covers best practices for the organization of bioinformatics projects and data, use of command line utilities, use of command line tools to analyze sequence quality and perform variant calling, and connecting to and using cloud computing.
    Lessons:

    • Project organization and management
    • Introduction to the command line
    • Data wrangling and processing
    • Introduction to cloud computing for genomics
    • Data analysis and visualization in R *beta*
  • Ecology Curriculum

    This workshop uses a tabular ecology dataset from the Portal Project Teaching Database and teaches data cleaning, management, analysis, and visualization. There are no pre-requisites, and the materials assume no prior knowledge about the tools. We use a single dataset throughout the workshop to model the data management and analysis workflow that a researcher would use.
    Lessons:

    • Data Organization in Spreadsheets
    • Data Cleaning with OpenRefine
    • Data Management with SQL
    • Data Analysis and Visualization in R
    • Data Analysis and Visualization in Python


    The Ecology workshop can be taught using R or Python as the base language.
    Portal Project Teaching Dataset: the Portal Project Teaching Database is a simplified version of the Portal Project Database designed for teaching. It provides a real-world example of life-history, population, and ecological data, with sufficient complexity to teach many aspects of data analysis and management, but with many complexities removed to allow students to focus on the core ideas and skills being taught.
     

  • The Agriculture Open Data Package

    The third GODAN Capacity Development Working Group webinar, supported by GODAN Action, focused on the Agriculture Open Data Package (AgPack).
    In 2016 GODAN, ODI, the Open Data Charter and OD4D developed the Agricultural Open Data Package (AgPack) to help governments realize impact with open data in the agriculture sector and food security. Details at http://www.agpack.info
    During the webinar the speakers outlined examples and use cases of governments using open data in support of their agricultural sector and food security. They also discussed the different roles a government can take on to facilitate such development, and how open data can support government policy objectives on agriculture and food security.

  • Publishing Open Data from an Organisational Point of View

    The second GODAN Capacity Building webinar was on “Publishing open data from an organisational point of view” and was led by GODAN Action colleagues from the Open Data Institute in London.
    This webinar focused on key aspects:
    - Why publish open data
    - What benefit can publishing open data bring
    - Why licenses are the most important aspect of publishing open data
    - How to start with publishing open data

  • GODAN Working Group on Capacity Development

    The first webinar organized by the GODAN (Global Open Data for Agriculture & Nutrition) Working Group on Capacity Development gave an overview of GODAN, its objectives and how people can get involved. The webinar also provided information on the purpose of the GODAN Working Group on Capacity Development and explained how to join and get involved in the activities.

  • Curriculum on Open Data and Research Data Management in Agriculture and Nutrition

    This paper details the curriculum for the Open Data Management in Agriculture and Nutrition e-learning course, including background to the course, course design, target audiences, and lesson objectives and outcomes. 
    This free online course aims to strengthen the capacity of data producers and data consumers to manage and use open data in agriculture and nutrition. One of the main learning objectives is for the course to be used widely within agricultural and nutrition knowledge networks, in different institutions. The course also aims to raise awareness of different types of data formats and uses, and to highlight how important it is for data to be reliable, accessible and transparent.
    The course is delivered through the Moodle e-learning platform. Course units include:
    Unit 1: Open data principles
    Unit 2: Using open data
    Unit 3: Making data open
    Unit 4: Sharing open data
    Unit 5: IPR and Licensing
    By the end of the course, participants will be able to:
    - Understand the principles and benefits of open data
    - Understand ethics and responsible use of data
    - Identify the steps to advocate for open data policies
    - Understand how and where to find open data
    - Apply techniques to data analysis and visualisation
    - Recognise the necessary steps to set up an open data repository
    - Define the FAIR data principles
    - Understand the basics of copyright and database rights
    - Apply open licenses to data
    The course is open to infomediaries (including ICT workers, technologist-journalists, communication officers, librarians, and extensionists); policy makers, administrators, and project managers; and researchers, academics, and scientists working in the areas of agriculture, nutrition, weather and climate, and land data.

  • New England Collaborative Data Management Curriculum

    NECDMC is an instructional tool for teaching data management best practices to undergraduates, graduate students, and researchers in the health sciences, sciences, and engineering disciplines. Each of the curriculum’s seven online instructional modules aligns with the National Science Foundation’s data management plan recommendations and addresses universal data management challenges. Included in the curriculum is a collection of actual research cases that provides a discipline-specific context for the content of the instructional modules. These cases come from a range of research settings such as clinical research, biomedical labs, an engineering project, and a qualitative behavioral health study. Additional research cases will be added to the collection on an ongoing basis. Each of the modules can be taught as a stand-alone class or as part of a series of classes. Instructors are welcome to customize the content of the instructional modules to meet the learning needs of their students and the policies and resources at their institutions.

  • Imaging and Analyzing Southern California’s Active Faults with High-Resolution Lidar Topography

    Over the past 5+ years, many of Southern California’s active faults have been scanned with airborne lidar through various community and PI-data collection efforts (e.g., the B4 Project, EarthScope, and the post-El Mayor–Cucapah earthquake). All of these community datasets are publicly available (via OpenTopography: https://www.opentopography.org) and powerfully depict the effect of repeated slip along these active faults as well as surface processes in a range of climatic regimes. These datasets are of great interest to the Southern California Earthquake Center (SCEC) research and greater academic communities and have already yielded important new insights into earthquake processes in southern California.

    This is a short course on LiDAR technology, data processing, and analysis techniques. The foci of the course are fault trace and geomorphic mapping applications, integration with other geospatial data, and data visualization and analysis approaches. Course materials include slide presentations, video demonstrations, and text-based software application tutorials.

  • GODAN Webinar Series

    A series of webinars organised by the GODAN Working Group on Capacity Development in collaboration with CTA. The Global Open Data for Agriculture and Nutrition (GODAN) supports the proactive sharing of open data to make information about agriculture and nutrition available, accessible and usable to deal with the urgent challenge of ensuring world food security. A core principle behind GODAN is that a solution to Zero Hunger lies within existing, but often unavailable, agriculture and nutrition data. At the GODAN Summit in September 2016, GODAN launched a new Working Group on Capacity Development. More info here: https://www.godan.info/news/leveraging-power-webinars-support-open-data-...

  • Sustaining Science Gateways—Finding your "best fit" model

    Digital projects (science gateways, data repositories, educational websites, and others) have a few things in common. They can deliver a great deal of value to users by widely sharing sophisticated tools, large data sets, or access to computing capacity among those in the academic sector who really need them to advance their work. But they share something else in common, too: they are devilishly hard to run in a way that permits ongoing growth and expansion.

    In this webinar, Nancy Maron, a lead instructor in the Science Gateways Bootcamp, introduces participants to the key elements of sustainability planning – the building blocks for developing Science Gateways that have the best chance for ongoing growth.

    The webinar will introduce sustainability models and share some key tactics for identifying the models that are most likely to work for your gateway. We will touch upon funding models, the competitive environment, and audience assessment, to show how these need to be considered in tandem with any plan.

  • Working in the R Ecosystem: Building Applications & Content for Your Gateway

    The R programming language first appeared on the scene in the 1990s as an open source environment for statistical modeling and data analysis. Throughout the last decade, interest in the language has grown alongside researchers' abilities to collect and store larger amounts of data. Today, scientific and business decisions increasingly rely on the interpretation of this data. New libraries for processing data and communicating results are being debuted in ways that break down traditional language silos. Technologies like interactive documents, HTML-based applications, and RESTful APIs have exposed capability gaps between R's interfaces for numerical analysis libraries and its built-in ability for graphical display. In this webinar, Derrick Kearney will survey several R libraries that are helping people bridge the gap between their R-based analysis and the numerous ways people are representing results today, all of which can be published on your science gateway, thus extending your research impact to others in a reproducible way.

  • Webinar: National Data Service (NDS) Labs Workbench

    The growing size and complexity of high-value scientific datasets are pushing the boundaries of traditional models of data access and discovery. Many large datasets are only accessible through the systems on which they were created or require specialized software or computational resources for re-use. In response to this growing need, the National Data Service (NDS) consortium is developing the Labs Workbench platform, a scalable, web-based system intended to support turn-key deployment of encapsulated data management and analysis tools to support exploratory analysis and development on cloud resources that are physically "near" the data and associated high-performance computing (HPC) systems. The Labs Workbench may complement existing science gateways by enabling exploratory analysis of data and the ability for users to deploy and share their own tools. The Labs Workbench platform has also been used to support a variety of training and workshop environments.

    This webinar includes a demonstration of the Labs Workbench platform and a discussion of several key use cases. A presentation of findings from the recent Workshop on Container Based Analysis Environments for Research Data Access and Computing further highlights compatibilities between science gateways and interactive analysis platforms such as Labs Workbench.
     

  • Facing the data challenge: Developing data policy and services

    Overview of research data management (RDM), who is responsible for RDM, the components of a research data service, and policy and research activity roadmap development in compliance with Engineering and Physical Sciences Research Council (EPSRC) funding expectations in the UK.

  • Research Data Management and Sharing MOOC

    This course will provide learners with an introduction to research data management and sharing. After completing this course, learners will understand the diversity of data and their management needs across the research data lifecycle, be able to identify the components of good data management plans and be familiar with best practices for working with data including the organization, documentation, and storage and security of data. Learners will also understand the impetus and importance of archiving and sharing data as well as how to assess the trustworthiness of repositories.

    Note: The course is free to access. However, if you pay for the course, you will have access to all of the features and content you need to earn a Course Certificate from Coursera. If you complete the course successfully, your electronic Certificate will be added to your Coursera Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. Note that the Course Certificate does not represent official academic credit from the partner institution offering the course.
    Also, note that the course is offered on a regular basis. For information about the next enrollment, go to the provided URL.

     
  • GeoMapApp Tutorials

    The video tutorials are available from the home page under the general topics listed below, and also on the GeoMapApp YouTube channel at https://www.youtube.com/user/GeoMapApp. The tutorials demonstrate how to perform common tasks with GeoMapApp. Full information on the functions is available at the provided web address. General topics include:
     - Introduction
     - Import Your Own Data
     - Analyze Data
     - Working with Gridded Data
     - Available Data and examples
     - Portals (including, for example, Ocean Floor Drilling, Multibeam Swath Bathymetry Data, Seismic Data, and Earthquake Data)
     - In-Depth Webinars

    GeoMapApp is an earth science exploration and visualization application that is continually being expanded as part of the Marine Geoscience Data System (MGDS) at the Lamont-Doherty Earth Observatory of Columbia University. The application provides direct access to the Global Multi-Resolution Topography (GMRT) compilation that hosts high resolution (~100 m node spacing) bathymetry from multibeam data for ocean areas and ASTER (Advanced Spaceborne Thermal Emission and Reflection Radiometer) and NED (National Elevation Dataset) topography datasets for the global land masses.  

  • Do-It-Yourself Research Data Management Training Kit for Librarians

    Online training materials on topics designed for small groups of librarians who wish to gain confidence in and understanding of research data management. The DIY Training Kit is designed to contain everything needed to complete a similar training course on your own (in small groups) and is based on open educational materials. The materials have been enhanced with Data Curation Profiles and reflective questions based on the experience of academic librarians who have taken the course.

    The training kit includes:
    - Promotional slides for the RDM Training Kit
    - Training schedule
    - Research Data MANTRA online course by EDINA and Data Library, University of Edinburgh
    - Reflective writing questions
    - Selected group exercises (with answers) from UK Data Archive, University of Essex - Managing and sharing data: Training resources. September, 2011 (PDF). Complete RDM Resources Training Pack available: https://data-archive.ac.uk/create-manage/training-resources
    - Podcasts for short talks by the original Edinburgh speakers if running course without ‘live’ speakers (Windows or Quicktime versions).
    - Presentation files (pptx) if learners decide to take turns presenting each topic.
    - Evaluation forms
    - Independent study assignment: Interview with a researcher, based on Data Curation Profile, from D2C2, Purdue University Libraries and Boston University Libraries.

  • CESSDA Expert Tour Guide on Data Management

    Target audience and mission:
    This tour guide was written for social science researchers who are in an early stage of practising research data management. With this tour guide, CESSDA wants to contribute to increased professionalism in data management and to improving the value of research data.

    Overview:
    If you follow the guide, you will travel through the research data lifecycle from planning, organising, documenting, processing, storing and protecting your data to sharing and publishing them. Taking the whole roundtrip will take you approximately 15 hours. You can also just hop on and off.

    During your travels, you will come across the following recurring topics:
    - Adapt Your DMP
    - European Diversity
    - Expert Tips
    - Tour Operators

    Current chapters include the following topics: Plan; Organise & Document; Process; Store; Protect; Archive & Publish. Other chapters may be added over time.

  • DataONE Data Management Module 01: Why Data Management

    As rapidly changing technology enables researchers to collect large, complex datasets with relative ease, the need to effectively manage these data increases in kind. This is the first lesson in a series of education modules intended to provide a broad overview of various topics related to research data management. This 30-40 minute module covers trends in data collection, storage and loss, the importance and benefits of data management, and an introduction to the data life cycle. It includes a downloadable presentation (PPT or PDF) with a supporting hands-on exercise and handout.

  • DataONE Data Management Module 02: Data Sharing

    When first sharing research data, researchers often raise questions about the value, benefits, and mechanisms for sharing. Many stakeholders and interested parties, such as funding agencies, communities, other researchers, or members of the public, may be interested in research, results and related data. This 30-40 minute lesson addresses data sharing in the context of the data life cycle, the value of sharing data, concerns about sharing data, and methods and best practices for sharing data. It includes a downloadable presentation (PPT or PDF) with a supporting hands-on exercise and handout.

  • DataONE Data Management Module 03: Data Management Planning

    Data management planning is the starting point in the data life cycle. Creating a formal document that outlines what you will do with the data during and after the completion of research helps to ensure that the data is safe for current and future use. This 30-40 minute lesson describes the benefits of a data management plan (DMP), outlines the components of a DMP, details tools for creating a DMP, provides NSF DMP information, and demonstrates the use of an example DMP. It includes a downloadable presentation (PPT or PDF) with a supporting hands-on exercise and handout.

  • DataONE Data Management Module 04: Data Entry and Manipulation

    When entering data, common goals include creating data sets that are valid, organized, and reusable, and that have gone through an established process to ensure quality. This lesson outlines best practices for creating data files. It details options for data entry and integration, and provides examples of processes used for data cleaning, organization and manipulation. It includes a downloadable presentation (PPT or PDF) with a supporting hands-on exercise, handout, and supporting data files.

  • DataONE Data Management Module 05: Data Quality Control and Assurance

    Quality assurance and quality control are phrases used to describe activities that prevent errors from entering or staying in a data set. These activities ensure the quality of the data before it is collected, entered, or analyzed, and they actively monitor and maintain the quality of data throughout the study. In this lesson, we define and provide examples of quality assurance, quality control, data contamination and types of errors that may be found in data sets. After completing this lesson, participants will be able to describe best practices in quality assurance and quality control and relate them to different phases of data collection and entry. This 30-40 minute lesson includes a downloadable presentation (PPT or PDF) with a supporting hands-on exercise and handout.