All Learning Resources

  • USGS Data Management Training Modules – Planning for Data Management

    This is one of six interactive modules created to help researchers, data stewards, managers, and the public understand the value of data management in science and learn best practices for performing good data management within their organizations. In this module, we will provide an overview of data management plans. First, we will define and describe Data Management Plans, or DMPs. We will then explain the benefits of creating a DMP. Finally, we will provide instructions on how to prepare a DMP, including the key components common to most DMPs.

  • USGS Data Management Training Modules – Best Practices for Preparing Science Data to Share

    This is one of six interactive modules created to help researchers, data stewards, managers, and the public understand the value of data management in science and learn best practices for performing good data management within their organizations. In this module, you’ll learn:

    The importance of maintaining well-managed science data
    Nine fundamental practices scientists should implement when preparing data to share
    Associated best practices for each data management habit

  • USGS Data Management Training Modules – Science Data Lifecycle

    This is one of six interactive modules created to help researchers, data stewards, managers, and the public understand the value of data management in science and learn best practices for performing good data management within their organizations. By the end of this module, you should be able to answer the following questions:

    What is a science data lifecycle?
    Why is a science data lifecycle important and useful?
    What are the elements of the USGS science data lifecycle, and how are they connected?
    What are the different roles and responsibilities?
    Where do you go if you need more information?

  • USGS Data Management Training Modules – Planning for Data Management Part II

    This is one of six interactive modules created to help researchers, data stewards, managers, and the public understand the value of data management in science and learn best practices for performing good data management within their organizations. By the end of this course, you should know the difference between data management plans and project plans, know how to use the DMPTool to create a data management plan, and understand the basic information that should go into a data management plan.

  • Template Research Data Management workshop for STEM researchers

    These materials are designed as a template for an introductory Research Data Management workshop for STEM postgraduate students and Early Career Researchers. The workshop is interactive and is designed to run for 2-3 hours, depending on which sections are delivered. Because it is a template workshop, it contains material covering many disciplines, and it is unlikely that all sections will be of interest to any one group of researchers. The sections are:
    Introduction
    Backup and file sharing
    How to organise your data well
    Data Tools
    Personal and sensitive data
    Data sharing
    Data Management Plans

    The workshop works best when adapted for a particular discipline and with a maximum of 30 participants. This workshop was developed for the Data Champions programme at the University of Cambridge and is an adaptation of workshops run on a regular basis for PhD students and postdoctoral researchers. If you would like any more information, please email [email protected] and we would be happy to answer any questions that you have.

  • LEARN Toolkit of Best Practice for Research Data Management

    The LEARN Project's Toolkit of Best Practice for Research Data Management expands on the issues outlined in the LERU Roadmap for Research Data (2013). It is freely downloadable and is a deliverable for the European Commission. It includes:

    • 23 Best-Practice Case Studies from institutions around the world, drawn from issues in the original LERU Roadmap;
    • 8 Main Sections, on topics such as Policy and Leadership, Open Data, Advocacy and Costs;
    • One Model RDM Policy, produced by the University of Vienna and accompanied by guidance and an overview of 20 RDM policies across Europe;
    • An Executive Briefing in six languages, aimed at senior institutional decision makers.

    The Executive Briefing of the LEARN Toolkit is available in English, Spanish, German, Portuguese, French and Italian translations.

  • Planning for Software Reproducibility and Reuse

    Many research projects depend on the development of scripts or other software to collect data, perform analyses or simulations, and visualize results.  Working in a way that makes it easier for your future self and others to understand and re-use your code means that more time can be dedicated to the research itself, rather than troubleshooting hard-to-understand code, resulting in more effective research. In addition, by following some simple best practices around code sharing, the visibility and impact of your research can be increased.  In this introductory session, you will:

    • learn about best practices for writing, documenting (Documentation), and organizing code (Organization & Automation),
    • understand the benefits of using version control (Version Control & Quality Assurance),
    • learn about how code can be linked to research results and why (Context & Credit),
    • understand why it is important to make your code publishable and citable and how to do so (Context & Credit),
    • learn about intellectual property issues (Licensing),
    • learn about how and why your software can be preserved over time (Archiving).
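The documentation and organization practices listed above can be illustrated with a small, self-describing module. Everything here (the module name, function, and units) is an invented example, not material from the session itself:

```python
"""growth.py -- illustrative analysis module.

Sketches two of the practices the session covers: documentation
(module and function docstrings with units) and organization
(no work at import time, so the code is importable and testable).
"""
import math


def doubling_time(rate_per_day: float) -> float:
    """Return the doubling time, in days, for an exponential growth rate.

    rate_per_day: instantaneous growth rate (1/day); must be > 0.
    """
    if rate_per_day <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / rate_per_day


if __name__ == "__main__":
    # Running the file directly gives a quick demonstration, while
    # importing it exposes doubling_time() for reuse and unit testing.
    print(f"{doubling_time(0.1):.2f} days")
```

Keeping logic in importable functions, rather than at the top level of a script, is also what makes the version-control and quality-assurance practices in the session (automated tests, code review) feasible.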
  • Structuring and Documenting a USGS Public Data Release

    This tutorial is designed to help scientists think about the best way to structure and document their USGS public data releases. The ultimate goal is to present data in a logical and organized manner that enables users to quickly understand the data. The first part of the tutorial describes the general considerations for structuring and documenting a data release, regardless of the platform being used to distribute the data. The second part of the tutorial describes how these general considerations can be implemented in ScienceBase. The tutorial is designed for USGS researchers, data managers, and collaborators, but some of the content may be useful for non-USGS researchers who need tips for structuring and documenting their data for public distribution.

  • Unidata Data Management Resource Center

    In this online resource center, Unidata provides information about evolving data management requirements, techniques, and tools. They walk through common requirements of funding agencies to make it easier for researchers to fulfill them. In addition, they explain how to use some common tools to build a scientific data management workflow that makes the life of an investigator easier and enhances access to their work. The resource center provides information about: 1) Agency Requirements, 2) Best Practices, 3) Tools for Managing Data, 4) Data Management Plan Resources, 5) Data Archives, and 6) Scenarios and Use Cases.

  • Data Management Planning Part 1: overview and a USGS program experience

    Emily Fort of the USGS presents an introduction to data management planning and a USGS program experience.

  • Data Management Planning Part 2: theory and practice in research data management

    Steve Tessler and Stan Smith present an example of a data management planning strategy for USGS science centers.

  • Introduction to the ISO 19115-2 Metadata Standard - DISL Data Management Metadata Training Webinar Series - Part 2

    This webinar begins with a brief overview of how the components of the ISO 19115-2 metadata standard are organized, followed by an example of a completed metadata record. It also covers how to use NOAA NCEI's ISO workbooks and EDM Wiki as resources for writing ISO metadata. The video is 34 minutes.

  • Open Principles in Education - Building Bridges, Empowering Communities

    This presentation shared experiences from the “Geo for All” initiative on the importance of open principles in education for empowering communities worldwide. Central to the “Geo for All” mission is the belief that knowledge is a public good and that open principles in education provide great opportunities for everyone. Combining the potential of free and open software, open data, open standards, open access to research publications, and open educational resources in geospatial education and research will enable the creation of a sustainable innovation ecosystem. This is key to widening education opportunities, accelerating new discoveries, and helping solve global, cross-disciplinary societal challenges, from climate change mitigation to sustainable cities. Service for the benefit and betterment of humanity is a fundamental principle of “Geo for All”, and we want to focus our efforts on the United Nations Sustainable Development Goals. We aim to create openness in geospatial education to develop creative and open minds in students, which is critical for open innovation and contributes to building up open knowledge for the benefit of the whole society and of future generations. The bigger aim is to advance STEM education across the world and to bring together schools, teachers, and students in joint projects that help build international understanding and global peace.

  • The Geoscience Paper of the Future: OntoSoft Training

    This presentation was developed to train scientists on best practices for digital scholarship, reproducibility, and data and software sharing.  It was developed as part of the NSF EarthCube Initiative and funded under the OntoSoft project.  More details about the project can be found at http://www.ontosoft.org/gpf.

    A powerpoint version of the slides is available upon request from [email protected].

    These OntoSoft GPF training materials were developed and edited by Yolanda Gil (USC), with contributions from the OntoSoft team including Chris Duffy (PSU), Chris Mattmann (JPL), Scott Peckham (CU), Ji-Hyun Oh (USC), Varun Ratnakar (USC), and Erin Robinson (ESIP). They were significantly improved through input from GPF pioneers Cedric David (JPL), Ibrahim Demir (UI), Bakinam Essawy (UV), Robinson W. Fulweiler (BU), Jon Goodall (UV), Leif Karlstrom (UO), Kyo Lee (JPL), Heath Mills (UH), Suzanne Pierce (UT), Allen Pope (CU), Mimi Tzeng (DISL), Karan Venayagamoorthy (CSU), Sandra Villamizar (UC), and Xuan Yu (UD). Others contributed with feedback on best practices, including Ruth Duerr (NSIDC), James Howison (UT), Matt Jones (UCSB), Lisa Kempler (MathWorks), Kerstin Lehnert (LDEO), Matt Mayernik (NCAR), and Greg Wilson (Software Carpentry). These materials were also improved thanks to the many scientists and colleagues who have taken the training and asked hard questions about GPFs.

  • Data Collection Part 1: How to avoid a spreadsheet mess - Lessons learned from an ecologist

    Most scientists have experienced the disappointment of opening an old data file and not fully understanding the contents. During data collection, we frequently optimize ease and efficiency of data entry, producing files that are not well formatted or described for longer term uses, perhaps assuming in the moment that the details of our experiments and observations would be impossible to forget. We can make the best of our sometimes embarrassing data management errors by using them as ‘teachable moments’, opening our dusty file drawers to explore the most common errors, and some quick fixes to improve day-to-day approaches to data.
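Many of the spreadsheet problems described above, such as inconsistent missing-value codes mixed into a data column, can be caught and repaired programmatically. A minimal sketch, using only the standard library; the column names, codes, and values are invented for illustration:

```python
import csv
import io

# Illustrative raw data showing a common spreadsheet problem:
# three different ad-hoc codes ("", "NA", "-999") all mean "missing".
raw = """date,site,temp_c
2021-06-01,A,14.2
2021-06-02,A,-999
2021-06-03,B,NA
"""

MISSING_CODES = {"", "NA", "N/A", "-999"}


def normalize_missing(value):
    """Map every ad-hoc missing-value code to None."""
    return None if value.strip() in MISSING_CODES else value


rows = list(csv.DictReader(io.StringIO(raw)))
cleaned = [{k: normalize_missing(v) for k, v in row.items()} for row in rows]

# After normalization, two of the three temperature readings are missing.
missing_temps = sum(1 for row in cleaned if row["temp_c"] is None)
print(missing_temps)  # 2
```

Declaring a single, documented missing-value convention up front, rather than improvising codes during data entry, is exactly the kind of habit the webinar's "teachable moments" point toward.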

  • Data Collection Part 2: Relational databases - Getting the foundation right

  • Data Sharing and Management within a Large-Scale, Heterogeneous Sensor Network using the CUAHSI Hydrologic Information System

    Hydrology researchers are collecting data using in situ sensors at high frequencies, for extended durations, and with spatial distributions that require infrastructure for data storage, management, and sharing. Managing streaming sensor data is challenging, especially in large networks with many sites and sensors. The availability and utility of these data in addressing scientific questions related to water availability, water quality, and natural disasters rely on effective cyberinfrastructure that facilitates transformation of raw sensor data into usable data products. They also depend on the ability of researchers to share and access the data in usable formats.

    In this presentation I will describe tools that have been developed for research groups and sites conducting long-term monitoring using in situ sensors. Functionality includes the ability to track equipment, deployments, calibrations, and other events related to monitoring site maintenance, and to link this information to the observational data being collected, which is imperative in ensuring the quality of sensor-based data products.

    I will present these tools in the context of a data management and publication workflow case study for the iUTAH (innovative Urban Transitions and Aridregion Hydrosustainability) network of aquatic and terrestrial sensors. iUTAH researchers have developed and deployed an ecohydrologic observatory to monitor Gradients Along Mountain to Urban Transitions (GAMUT). The GAMUT Network measures aspects of water inputs, outputs, and quality along a mountain-to-urban gradient in three watersheds that share common water sources (winter-derived precipitation) but differ in the human and biophysical nature of land-use transitions. GAMUT includes sensors at aquatic and terrestrial sites for real-time monitoring of common meteorological variables, snow accumulation and melt, soil moisture, surface water flow, and surface water quality.

    I will present the overall workflow we have developed, our use of existing software tools from the CUAHSI Hydrologic Information System, and new software tools that we have deployed both for managing the sensor infrastructure and for storing, managing, and sharing the sensor data.

  • Metadata: Standards, tools and recommended techniques

  • Monitoring Resources: web tools promoting documentation, data discovery and collaboration

    The presentation focuses on the USGS/Pacific Northwest Aquatic Monitoring Partnership's (PNAMP) Monitoring Resources toolset.

  • How high performance computing is changing the game for scientists, and how to get involved

  • Best practices for preparing data to share and preserve

    Scientists spend considerable time conducting field studies and experiments, analyzing the data collected, and writing research papers, but an often overlooked activity is effectively managing the resulting data. The goal of this webinar is to provide guidance on fundamental data management practices that investigators should perform during the course of data collection to improve the usability of their data sets. Topics covered will include data structure, quality control, and data documentation. In addition, I will briefly discuss data curation practices performed by archives to ensure that data can be discovered and used in the future. By following these practices, data will be less prone to error, more efficiently structured for analysis, and more readily understandable for any future questions that they might help address.
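As an illustration of the kind of quality-control check such guidance typically recommends, here is a small sketch of a range check on a tabular variable. The variable, valid range, and values are invented for the example, not taken from the webinar:

```python
def range_check(values, lo, hi):
    """Return indices of values outside the inclusive [lo, hi] range.

    None entries (already-identified missing data) are skipped rather
    than flagged, keeping missing-value handling a separate QC step.
    """
    return [i for i, v in enumerate(values)
            if v is not None and not (lo <= v <= hi)]


# Hypothetical water-temperature series in degrees C; a -999 sensor
# error code and an implausible 55.2 reading both slip through.
temps = [12.4, 13.1, -999.0, 14.0, None, 55.2]
flagged = range_check(temps, lo=-5.0, hi=45.0)
print(flagged)  # [2, 5]
```

Recording the valid range itself in the data documentation (e.g., in the variable's metadata) lets future users rerun the same check rather than guessing what "plausible" meant.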

  • Data citation and you: Where things stand today

  • Open data and the USGS Science Data Catalog

  • Essentials 4 Data Support

    Essentials 4 Data Support is an introductory course for those who (want to) support researchers in storing, managing, archiving, and sharing their research data. The course aims to contribute to the professionalization of data supporters and to coordination between them. The course may be taken online-only (no fee), with or without registration, or online plus face-to-face meetings as a full course with a certificate (for a fee).

  • Ocean Teacher Data Management Courses

    Ocean Teacher is a comprehensive e-Learning platform developed by the International Oceanographic Data and Information Exchange (IODE) to help build equitable capacity related to ocean research, observations, and services in UNESCO's Intergovernmental Oceanographic Commission. This series of presentations and short courses is focused on data management in the ocean sciences and is intended to be used in conjunction with classroom training. Course offerings from this collection that are discoverable in the ESIP Data Management Training Clearinghouse include:

    • Marine GIS Applications (using QGIS)
    • Quality Management System Essentials for IODE National Oceanographic Data Centres (NODC) and Associate Data Units (ADU)
    • Management of Marine Biogeographic Data (Contributing to the Use of OBIS) (2016) (available in English and Spanish)
  • Ocean Teacher Information Management Courses

    Ocean Teacher is a comprehensive e-Learning platform developed by the International Oceanographic Data and Information Exchange (IODE) to help build equitable capacity related to ocean research, observations, and services in UNESCO's Intergovernmental Oceanographic Commission. This series of presentations and short courses focuses on information management in the ocean sciences and is intended to be used in conjunction with classroom training.

  • ISRIC - World Soil Information Educational Videos

    YouTube channel of videos on various topics related to world soils data creation and management. Example categories of videos include Digital Soil Mapping; Screencast: How to use ISRIC's Soil Data Hub; Sustainable Soil Management; and Global Soil Information Facilities.

  • ORNL DAAC Data Management Workshops

    Educational workshops on various scientific data management best practices designed to (1) introduce new data collectors to best practices in data curation and (2) enhance the skillsets of experienced data providers.  New workshops are added as they are made available.

  • Environmental Data Management Best Practices Part 1: Tabular Data

    This webinar is the first in a two-part series focused on Environmental Data Management Best Practices. The topic of this webinar is tabular data. It is one of a series of educational workshops on best practices and tips for managing environmental research data, presented by experts from NASA's Distributed Active Archive Center for Biogeochemical Dynamics.

  • Environmental Data Management Best Practices Part 2: Geospatial Data

    This webinar is the second in a two-part series focused on Environmental Data Management Best Practices. The topic of this webinar is geospatial data. It is one of a series of educational workshops on best practices and tips for managing environmental research data, presented by experts from NASA's Distributed Active Archive Center for Biogeochemical Dynamics.

  • Mozilla Science Lab's Open Data Primers

  • GeoMapApp Education Resources

    From this home page, a variety of GeoMapApp-related education activities and resources is available. The educational materials target audiences ranging from middle school and high school level to the undergraduate and graduate level.  Resources available include, for example:  mini lessons, GeoMapApp learning activities, data exercises and tutorials.  

    GeoMapApp is an earth science exploration and visualization application that is continually being expanded as part of the Marine Geoscience Data System (MGDS) at the Lamont-Doherty Earth Observatory of Columbia University. The application provides direct access to the Global Multi-Resolution Topography (GMRT) compilation that hosts high resolution (~100 m node spacing) bathymetry from multibeam data for ocean areas and ASTER (Advanced Spaceborne Thermal Emission and Reflection Radiometer) and NED (National Elevation Dataset) topography datasets for the global land masses.  

  • GeoMapApp Learning Activities Collection

    GeoMapApp Learning Activities are web-based, high-impact, ready-to-use learning modules. They are aimed at the K-12, community college and introductory university level. Each learning activity provides hands-on participation that allows students to enhance their learning experience through the exploration of geoscience data in a map-based setting. They range from short in-class activities to multi-class modules covering broader content.

    GeoMapApp Learning Activities are tied to ESLI and NYS STEM and PS:ES standards and have a standard format:

    Title, summary, intended learning outcomes
    Prep time, grade level, required prior knowledge
    ESLI and NYS standards and NY Regents questions
    Step-by-step guide and teacher tips
    Student instructions and "grading-friendly" answer sheet
    Pre- and post- student quizzes
    Ideas for further knowledge gain

  • Data Skills Curricula Framework

    The Data Skills Curricula Framework to enhance information management skills for data-intensive science was developed by the Belmont Forum’s e-Infrastructures and Data Management (e-I&DM) Project to improve data literacy, security and sharing in data-intensive, transdisciplinary global change research. 

  • What are Persistent Identifiers and Why are they Important? (Webinar)

    Are you intrigued, interested or simply a bit confused by persistent identifiers and would like to know more? Then this introductory level webinar is for you! The webinar will be especially interesting if you are working with digital archives and digital collections. You will get a clear understanding of what persistent identifiers are, why they are important and how trustworthy they are. We also discuss how you can determine the most appropriate identifier for your needs. Note that this is not a deeply technical webinar.

    Topics that will be covered include:

    - What are persistent identifiers?
    - The case for PIDs – knowing what’s what and who’s who
    - The data architecture of PIDs
    - What is social infrastructure and why is it important?
    - Review of current identifier systems
    - How to choose a PID System
    - Case studies in documents, data, video

    This is part 1 of a 3 part series.

    Copies of version 2 of the slides can be found at:  https://figshare.com/articles/Overview_of_PID_Systems_for_THOR_Webinar/5...
    Version 2 of the slides has the following DOI: https://doi.org/10.6084/m9.figshare.5016803.v2.
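A DOI like the one above is resolved through the doi.org proxy service; the mapping from an identifier to a resolvable URL is simple string composition. A sketch, involving no network access:

```python
from urllib.parse import quote


def doi_to_url(doi: str) -> str:
    """Build the resolvable https://doi.org/ URL for a DOI name.

    Percent-encodes the name defensively while keeping '/' and '.',
    which are structural characters in DOI names.
    """
    return "https://doi.org/" + quote(doi, safe="/.")


print(doi_to_url("10.6084/m9.figshare.5016803.v2"))
# https://doi.org/10.6084/m9.figshare.5016803.v2
```

The persistence promise is in the proxy's redirect, not the URL itself: when a landing page moves, the publisher updates the DOI's registered target and the same identifier keeps resolving.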

  • Persistent Identifiers: Current Features and Future Properties (Webinar)

    This webinar is for people who know what persistent identifiers are, but are interested in knowing much more about what you can actually do with them. In other words, what are the services that are being built on top of identifier systems that could be useful to the digital preservation community? We will cover topics such as party identification, interoperability and (metadata) services such as multiple resolution. Following on from that, we will explain more about the next generation of resolvers and work on extensions, such as specification of the URN r-component semantics.

    This is part 2 of a 3 part series.

  • Persistent Identifier (S)election Guide (Webinar)

    Cultural heritage organisations, regardless of size, are often hesitant to implement PIDs. They lack knowledge of what PIDs are, don’t know about their capabilities and benefits, and fear a possibly complex and costly implementation process as well as the maintenance costs of a sustained service. The Digital Heritage Network and the Dutch Coalition on Digital Preservation address these issues in three ways:

    By raising awareness of (the importance of) PIDs in cultural heritage organisations.
    By increasing the knowledge regarding the use of PIDs within cultural heritage.
    By supporting the technical implementation of PIDs in cultural heritage collection management systems. How we did this on a nationwide scale will be explained in the webinar.

    There are multiple PID systems. But which system is most suited to your situation: Archival Resource Keys (ARKs), Digital Object Identifiers (DOIs), Handle, OpenURL, Persistent Uniform Resource Locators (PURLs) or Uniform Resource Names (URNs)? Each system has its own particular properties, strengths and weaknesses. The PID Guide from the Digital Heritage Network’s Persistent Identifier project helps you learn and think about important PID subjects, and guides your first steps towards selecting a PID system.

    This is part 3 of a 3 part series.

  • Research Data Management and Access: The Basis for Preserving and Providing Access to Research Data

    Brief introduction to research data preservation; research data management (RDM), including curation, documentation, metadata, and controlled vocabulary; and data access.

  • ODINAFRICA Ocean Data Portal Training-of-Trainers Course

    This ODINAFRICA Ocean Data Portal training-of-trainers course will demonstrate the International Oceanographic Data and Information Exchange (IODE) Ocean Data Portal V2. The focus will be on the ODP Data Provider for national nodes (NODCs). This course includes some specific topics, such as an introduction to the Linux OS and an application server, which are used as an operating environment. The course can also be used to gain a better understanding of the IODE Ocean Data Portal and its capabilities.

    The course comprises three sessions, and each session is composed of several video presentations.

    • Session 1: Introduction to IODE Ocean Data Portal V2
    • Session 2: ODP Data Provider - Data Connect and Share
    • Session 3: Metadata Maintenance

    PowerPoint slides for each presentation are available for download from the main course page.

    About the Ocean Data Portal:

    • Formally established in 2007 as a program under the IODE and supported by the Partnership Centre for IODE Ocean Data Portal, Russian Federation
    • Seeks to provide open and seamless access to marine data collections in an enabling and globally distributed environment
    • Facilitates discovery, evaluation, and access to data
    • Provides benefits to both data providers and data users
    • Focuses on standards, technology, people, and capacity planning 
  • Research Data Management and Open Data

    This two-day workshop, combining presentations with interactive activities, exercises, and discussions, will provide Ph.D. students with practical skills directly applicable to their own research. Drawing on a wide range of data – both quantitative and qualitative – the workshop will address the following key topics in data management:

    • Documentation and contextual description
    • Ethical and legal aspects of managing and sharing sensitive data
    • Anonymising research data for reuse
    • Writing a data management plan
    • Data handling (e.g. file organisation and data storage and security)
    • Data preparation

    By the end of the workshop, participants will know how to apply good data management practices in their own research and will be able to work more efficiently and effectively with data individually or as part of a research team, where data are often co-produced and shared.

    From the Agenda on the home page, the PDF, text, and Epub versions of these slide presentations can be accessed by clicking on the link for each presentation.