All Learning Resources

  • Data Management In The Arts and Humanities

    This presentation responds to a few tricky questions often put to the Digital Curation Centre (DCC) in the UK by people providing services to arts and humanities researchers.  Martin Donnelly of the DCC discusses topics including:
    - A brief introduction to the DCC: The Digital Curation Centre is an internationally recognized center of expertise in digital curation, with a focus on building capability and skills for research data management. The DCC provides expert advice and practical help to research organizations wanting to store, manage, protect, and share digital research data.
    - What is data, and what do we mean by research data management?
    - What is the scientific method, and why is it different in the Arts and Humanities?
    - What are the strengths and weaknesses of data in the Arts and Humanities?
    - Archiving issues in the Arts and Humanities
  • Findability of Research Data and Software Through PIDs and FAIR Repositories

    This presentation, "Findability of Research Data and Software Through PIDs and FAIR Repositories," is one of nine webinars on topics related to FAIR Data and Software offered at a Carpentries-based workshop in Hannover, Germany, July 9-13, 2018.  Presentation slides are also available in addition to the recorded presentation.
    Other topics in the series include:
    - Introduction, FAIR Principles and Management Plans
    - Accessibility through Git, Python Functions and Their Documentation
    - Interoperability through Python Modules, Unit-Testing and Continuous Integration
    - Reusability through Community Standards, Tidy Data Formats and R Functions, their Documentation, Packaging, and Unit-Testing
    - Reusability: Data Licensing
    - Reusability: Software Licensing
    - Reusability: Software Publication
    - FAIR Data and Software - Summary

    Links to the other modules in the webinar series can be found at the URL above.

  • Accessibility Through Git, Python Functions and Their Documentation

    This presentation " Accessibility Through Git, Python Functions and Their Documentation" is one of 9 webinars on topics related to FAIR Data and Software that was offered at a Carpentries-based Workshop in Hannover, Germany, Jul 9-13 2018.  Presentation slides are also available in addition to the recorded presentation.
    In this presentation they Talk about:
    - The definitions and role of Accessibility
    - Version control & project management with GIT(HUB)
    - Accessible software & comprehensible code
    - Functions in python & R

    Other topics included in the series include:
    - Introduction, FAIR Principles and Management Plans
    - Findability of Research Data and Software Through PIDs and FAIR Repositories
    - Interoperability through Python Modules, Unit-Testing and Continuous Integration
    - Reusability through Community Standards, Tidy Data Formats and R Functions, their Documentation, Packaging, and Unit-Testing
    - Reusability:  Data Licensing
    - Reusability:  Software Licensing
    - Reusability:  Software Publication
    - FAIR Data and Software - Summary
     
    URL locations for the other modules in the webinar can be found at the URL above.

  • Interoperability Through Python Modules, Unit-Testing and Continuous Integration

    This presentation " Interoperability Through Python Modules, Unit-Testing and Continuous Integration" is one of 9 webinars on topics related to FAIR Data and Software that was offered at a Carpentries-based Workshop in Hannover, Germany, Jul 9-13 2018.  Presentation slides are also available in addition to the recorded presentation.
     
    Other topics included in the series include:
    - Introduction, FAIR Principles and Management Plans
    -Findability of Research Data and Software Through PIDs and FAIR Repositories
    - Accessibility through Git, Python Functions and Their Documentation
    - Reusability through Community Standards, Tidy Data Formats and R Functions, their Documentation, Packaging, and Unit-Testing
    - Reusability:  Data Licensing
    - Reusability:  Software Licensing
    - Reusability:  Software Publication
    - FAIR Data and Software - Summary
     
    URL locations for the other modules in the webinar can be found at the URL above.

  • Reusability Through Community-Standards, Tidy Data Formats and R Functions, Their Documentation, Packaging and Unit-Testing

    This presentation, "Reusability Through Community-Standards, Tidy Data Formats and R Functions, Their Documentation, Packaging and Unit-Testing," is one of nine webinars on topics related to FAIR Data and Software offered at a Carpentries-based workshop in Hannover, Germany, July 9-13, 2018.  Presentation slides are also available in addition to the recorded presentation.

    Other topics in the series include:
    - Introduction, FAIR Principles and Management Plans
    - Findability of Research Data and Software Through PIDs and FAIR Repositories
    - Accessibility through Git, Python Functions and Their Documentation
    - Interoperability through Python Modules, Unit-Testing and Continuous Integration
    - Reusability: Data Licensing
    - Reusability: Software Licensing
    - Reusability: Software Publication
    - FAIR Data and Software - Summary

    Links to the other modules in the webinar series can be found at the URL above.

  • Reusability: Data Licensing

    This presentation "Reusability: Data Licensing" is one of 9 webinars on topics related to FAIR Data and Software that was offered at a Carpentries-based Workshop in Hannover, Germany, Jul 9-13 2018.  Presentation slides are also available in addition to the recorded presentation.
     
    Other topics included in the series include:
    - Introduction, FAIR Principles and Management Plans
    -Findability of Research Data and Software Through PIDs and FAIR Repositories
    - Accessibility through Git, Python Functions and Their Documentation
    - Interoperability through Python Modules, Unit-Testing and Continuous Integration
    - Reusability Through Community-Standards, Tidy Data Formats and R Functions, Their Documentation, Packaging and Unit-Testing
    - Reusability:  Software Licensing
    - Reusability:  Software Publication
    - FAIR Data and Software - Summary
     
    URL locations for the other modules in the webinar can be found at the URL above.

  • Reusability: Software Licensing

    This presentation " Reusability: Software Licensing" is one of 9 webinars on topics related to FAIR Data and Software that was offered at a Carpentries-based Workshop in Hannover, Germany, Jul 9-13 2018.  Presentation slides are also available in addition to the recorded presentation.
     
    Other topics included in the series include:
    - Introduction, FAIR Principles and Management Plans
    -Findability of Research Data and Software Through PIDs and FAIR Repositories
    - Accessibility through Git, Python Functions and Their Documentation
    - Interoperability through Python Modules, Unit-Testing and Continuous Integration
    - Reusability Through Community-Standards, Tidy Data Formats and R Functions, Their Documentation, Packaging and Unit-Testing
    - Reusability: Data Licensing
    - Reusability:  Software Publication
    - FAIR Data and Software - Summary
     
    URL locations for the other modules in the webinar can be found at the URL above.

  • Reusability: Software Publication

    This presentation " Reusability: Software Publication" is one of 9 webinars on topics related to FAIR Data and Software that was offered at a Carpentries-based Workshop in Hannover, Germany, Jul 9-13 2018.  Presentation slides are also available in addition to the recorded presentation.
     
    Other topics included in the series include:
    - Introduction, FAIR Principles and Management Plans
    -Findability of Research Data and Software Through PIDs and FAIR Repositories
    - Accessibility through Git, Python Functions and Their Documentation
    - Interoperability through Python Modules, Unit-Testing and Continuous Integration
    - Reusability Through Community-Standards, Tidy Data Formats and R Functions, Their Documentation, Packaging and Unit-Testing
    - Reusability: Data Licensing
    - Reusability: Software Licensing
    - FAIR Data and Software - Summary
     
    URL locations for the other modules in the webinar can be found at the URL above

  • FAIR Data and Software - Summary

    This presentation, "FAIR Data and Software - Summary," is one of nine webinars on topics related to FAIR Data and Software offered at a Carpentries-based workshop in Hannover, Germany, July 9-13, 2018.  Presentation slides are also available in addition to the recorded presentation.

    Other topics in the series include:
    - Introduction, FAIR Principles and Management Plans
    - Findability of Research Data and Software Through PIDs and FAIR Repositories
    - Accessibility through Git, Python Functions and Their Documentation
    - Interoperability through Python Modules, Unit-Testing and Continuous Integration
    - Reusability Through Community-Standards, Tidy Data Formats and R Functions, Their Documentation, Packaging and Unit-Testing
    - Reusability: Data Licensing
    - Reusability: Software Licensing
    - Reusability: Software Publication

    Links to the other modules in the webinar series can be found at the URL above.

  • Formal Ontologies: A Complete Novice's Guide

    This module is specifically aimed at those who are not yet familiar with ontologies as a means of research data management, and will take you through some of the main features of ontologies, and the reasons for using them.  If you’d like to take a step back to a very basic introduction to knowledge representation systems, you could have a look at the brief summary we have given in the ‘Introduction to Research Infrastructures Module’ before starting.
    By the end of this module, participants should be able to:
    - Understand what we mean by ‘Data Heterogeneity’, and how it affects knowledge representation
    - Understand and explain the basic concept of an ontology
    - Understand and explain how ontologies are used to curate and share research data

    PARTHENOS training provides modules and resources in digital humanities and research infrastructures, with the goal of strengthening the cohesion of research across Linguistic Studies, Humanities, Cultural Heritage, History, Archaeology, and related fields.  To meet this goal, its activities provide common solutions for the humanities and linguistic data lifecycle, considering the specific needs of the sector, including joint training activities and modules, for both learners and trainers, on understanding research infrastructures and on managing, improving, and opening up research and data.
    More information about the PARTHENOS project can be found at:  http://www.parthenos-project.eu/about-the-project-2.
    Other training modules created by PARTHENOS can be found at:  http://training.parthenos-project.eu/training-modules/.

  • PARTHENOS E-Humanities and E-Heritage Webinar Series

    The PARTHENOS eHumanities and eHeritage Webinar Series provides a lens through which a more nuanced understanding of the role of Digital Humanities and Cultural Heritage research infrastructures in research can be obtained.  Participants of the PARTHENOS Webinar Series will delve into a number of topics, technologies, and methods that are connected with an “infrastructural way” of engaging with data and conducting humanities research.

    Topics include: theoretical and practical reflections on digital and analogue research infrastructures; opportunities and challenges of eHumanities and eResearch; finding, working with, and contributing to Research Infrastructure collections; standards; FAIR principles; ontologies; tools and Virtual Research Environments (VREs); and new publication and dissemination types.

    Slides and video recordings of each webinar can be found on the "Wrap Up & Materials" page linked from that webinar's listing on this series landing page.

    Learning Objectives: 
    Each webinar of the PARTHENOS Webinar Series has an individual focus and can be followed independently.  Participants who follow the whole series will gain a complete overview of the role and value of Digital Humanities and Cultural Heritage Research Infrastructures for research, and will be able to identify the Research Infrastructures most valuable for their research and data.
     

  • Analyzing Documents with TF-IDF

    This lesson focuses on a core natural language processing and information retrieval method called Term Frequency - Inverse Document Frequency (tf-idf). You may have heard about tf-idf in the context of topic modeling, machine learning, or other approaches to text analysis. Tf-idf comes up a lot in published work because it’s both a corpus exploration method and a pre-processing step for many other text-mining measures and models.

    Looking closely at tf-idf will leave you with an immediately applicable text analysis method. This lesson will also introduce you to some of the questions and concepts of computationally oriented text analysis. Namely, this lesson addresses how you can isolate a document’s most important words from the kinds of words that tend to be highly frequent across a set of documents in that language. In addition to tf-idf, there are a number of computational methods for determining which words or phrases characterize a set of documents, and I highly recommend Ted Underwood’s 2011 blog post as a supplement.

    Suggested Prior Skills
    - Prior familiarity with Python or a similar programming language. Code for this lesson is written in Python 3.6, but you can run tf-idf in several different versions of Python, using one of several packages, or in various other programming languages. The precise level of code literacy or familiarity recommended is hard to estimate, but you will want to be comfortable with basic types and operations. To get the most out of this lesson, it is recommended that you work your way through something like Codecademy’s “Introduction to Python” course, or that you complete some of the introductory Python lessons on the Programming Historian.
    - In lieu of the above recommendation, you should review Python’s basic types (string, integer, float, list, tuple, dictionary), working with variables, writing loops in Python, and working with object classes/instances.
    - Experience with Excel or an equivalent spreadsheet application if you wish to examine the linked spreadsheet files. You can also use the pandas library in Python to view the CSVs.
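    To make the idea concrete, here is a minimal tf-idf sketch using only the Python standard library. The tiny corpus, whitespace tokenization, and the particular idf formula (log(N / df)) are illustrative assumptions; the lesson itself works with a real corpus and library tooling.

    ```python
    import math

    # Toy corpus: each "document" is a short string, tokenized by whitespace.
    corpus = [
        "the cat sat on the mat",
        "the dog sat on the log",
        "cats and dogs are pets",
    ]
    docs = [doc.split() for doc in corpus]
    n_docs = len(docs)

    def tf_idf(term, doc):
        tf = doc.count(term) / len(doc)           # term frequency in this document
        df = sum(1 for d in docs if term in d)    # number of documents containing the term
        idf = math.log(n_docs / df)               # inverse document frequency
        return tf * idf

    # "the" appears in two of the three documents, so its idf is low;
    # "cat" appears in only one, so it scores higher in that document.
    print(tf_idf("the", docs[0]))
    print(tf_idf("cat", docs[0]))
    ```

    This is the core intuition of the lesson: words common across the corpus are down-weighted, so the words that remain high-scoring are the ones that characterize an individual document.
    
    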

  • Temporal Network Analysis with R

    This tutorial introduces methods for visualizing and analyzing temporal networks using several libraries written for the statistical programming language R. With the rate at which network analysis is developing, there will soon be more user-friendly ways to produce similar visualizations and analyses, as well as entirely new metrics of interest. For these reasons, this tutorial focuses as much on the principles behind creating, visualizing, and analyzing temporal networks (the “why”) as it does on the particular technical means by which we achieve these goals (the “how”). It also highlights some of the unhappy oversimplifications that historians may have to make when preparing their data for temporal network analysis, an area where our discipline may actually suggest new directions for temporal network analysis research.

    One of the most basic forms of historical argument is to identify, describe, and analyze changes in a phenomenon or set of phenomena as they occur over a period of time. The premise of this tutorial is that when historians study networks, we should, insofar as it is possible, also be acknowledging and investigating how networks change over time.

    Lesson Goals
    In this tutorial you will learn:
    - The types of data necessary to model a temporal network
    - How to visualize a temporal network using the NDTV package in R
    - How to quantify and visualize some important network-level and node-level metrics that describe temporal networks using the TSNA package in R.

    Prerequisites:
    This tutorial assumes that you have:
    - a basic familiarity with static network visualization and analysis, which you can get from excellent tutorials on the Programming Historian such as From Hermeneutics to Data to Networks: Data Extraction and Network Visualization of Historical Sources and Exploring and Analyzing Network Data with Python
    - RStudio with R version 3.0 or higher
    - A basic understanding of how R can be used to modify data. You may want to review the excellent tutorial on R Basics with Tabular Data found at:  https://programminghistorian.org/en/lessons/r-basics-with-tabular-data.

  • File Naming Convention Worksheet

    This worksheet walks researchers through the process of creating a file naming convention for a group of files. This process includes: choosing metadata, encoding and ordering the metadata, adding version information, and properly formatting the file names. Two versions of the worksheet are available: a Caltech Library branded version (PDF) and a generic editable version (MS Word).
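    As a sketch of what such a convention can produce, the hypothetical Python helper below assembles a file name from ordered metadata fields, an ISO-style date stamp, and a zero-padded version number; the field names, order, and extension are all illustrative assumptions, not taken from the worksheet.

    ```python
    from datetime import date

    def make_filename(project, sample, version, run_date, ext="csv"):
        """Build a machine-friendly file name from ordered metadata fields."""
        stamp = run_date.strftime("%Y%m%d")   # YYYYMMDD sorts chronologically
        # Zero-padded version keeps v02 sorting before v10; no spaces or
        # special characters that could break scripts or cross-platform use.
        return f"{project}_{sample}_{stamp}_v{version:02d}.{ext}"

    name = make_filename("coralDNA", "site03", 2, date(2020, 6, 15))
    print(name)  # coralDNA_site03_20200615_v02.csv
    ```

    Encoding the convention in a small function like this also documents it: anyone reading the code can see which metadata each file name carries and in what order.
    
    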

  • Data Science Training Camp at Woods Hole Oceanographic Institution: Syllabus and slide presentations in 2020

    With data and software increasingly recognized as scholarly research products, and aiming towards open science and reproducibility, it is imperative for today's oceanographers to learn foundational practices and skills for data management and research computing, as well as practices specific to the ocean sciences. This educational package was developed as a data science training camp for graduate students and professionals in the ocean sciences and implemented at the Woods Hole Oceanographic Institution (WHOI) in 2019 and 2020. Here we provide materials for the 2020 camp.  Contents of this package include the syllabus and slide presentations for each of the four modules:
    1 "Good enough practices in scientific computing,"
    2 Data management,
    3 Software development and research computing,
    and 4 Best practices in the ocean sciences.
    The 3rd module is split into two parts. We also include a poster presented at the 2020 Ocean Science Meeting, which has some results from pre- and post-surveys.
     

  • Project Close-Out Checklist for Research Data

    The close-out checklist describes a range of activities for helping ensure that research data are properly managed at the end of a project or at researcher departure. Activities include: making stewardship decisions, preparing files for archiving, sharing data, and setting aside important files in a "FINAL" folder. Two versions of the checklist are available: a Caltech Library branded version (PDF) and a generic editable version (MS Word).

  • Efficient BIM Data Management & Quality Control of Revit Projects

    This AGACAD webinar provides guidance on speeding up building design, facility management, and BIM data analysis in Revit projects. The contents include:
    • Manage BIM data in your Revit model and set LOD
    • Review, change & easily update BIM Data in your Revit projects
    • Find and modify any element parameters in BIM model with ease
    • Use formulas to make your own data tables
    • Insert elements into your project using various predefined rules
    • Set up and control LOD requirements based on standards, specifications, or framework agreed upon by the IPD team
    • Ensure that BIM models fit the agreed standards.

  • Top 5 Workflows for Precise BIM Data Management

    Do you have the need to easily rename families in your Revit project to match standards?  Do you find it hard to edit and control revisions within Revit?  Do you need accurate Quantity Take-Off information from your Revit model?  How about the need to edit parameter information more easily than a Revit Schedule?  Tired of assigning View Templates and managing view properties manually? Review this webcast as we cover these examples and more, utilizing a powerful Revit add-on application from Ideate Software called Ideate BIMLink.  It’s precise, fast, and easy Data Management of your BIM information.

  • Visualizing Data with Bokeh and Pandas

    The ability to load raw data, sample it, and then visually explore and present it is a valuable skill across disciplines. In this tutorial, you will learn how to do this in Python by using the Bokeh and Pandas libraries. Specifically, we will work through visualizing and exploring aspects of WWII bombing runs conducted by the Allied powers, using the WWII THOR dataset (Theater History of Operations Reports).
    At the end of the lesson you will be able to:
    - Load tabular CSV data
    - Perform basic data manipulation, such as aggregating and sub-sampling raw data
    - Visualize quantitative, categorical, and geographic data for web display
    - Add varying types of interactivity to your visualizations

    Prerequisites
    - This tutorial can be completed using any operating system. It requires Python 3 and a web browser. You may use any text editor to write your code.
    - This tutorial assumes that you have a basic knowledge of the Python language and its associated data structures, particularly lists.
    - If you work in Python 2, you will need to create a virtual environment for Python 3, and even if you work in Python 3, creating a virtual environment for this tutorial is good practice.
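    The lesson itself uses pandas and Bokeh, but its first steps can be sketched with the standard library alone: load tabular CSV data, then aggregate it. The column names and values below are a made-up stand-in for the much larger THOR dataset.

    ```python
    import csv
    import io
    from collections import Counter

    # A tiny made-up sample in the spirit of the THOR data:
    # mission date, attacking country, and tons of munitions dropped.
    sample = io.StringIO("""MSNDATE,COUNTRY,TOTAL_TONS
    1943-03-21,USA,12.5
    1943-03-21,GREAT BRITAIN,9.0
    1943-03-22,USA,4.5
    """)

    # Aggregate total tons per country -- the kind of summary the lesson
    # later visualizes as a Bokeh bar chart.
    tons_by_country = Counter()
    for row in csv.DictReader(sample):
        tons_by_country[row["COUNTRY"]] += float(row["TOTAL_TONS"])

    print(tons_by_country["USA"])  # 17.0
    ```

    With pandas, the same aggregation is a one-liner (`df.groupby("COUNTRY")["TOTAL_TONS"].sum()`), which is why the lesson adopts it before moving on to plotting.
    
    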

  • Introduction To MySQL With R

    MySQL is a relational database used to store and query information. This lesson uses the R language to provide a tutorial and examples to:
    - Set up and connect to a table in MySQL.
    - Store records to the table.
    - Query the table.
    In this tutorial you will make a database of newspaper stories that contain words from a search of a newspaper archive. The program will store the title, date published, and URL of each story in a database. You will use another program to query the database and look for historically significant patterns. Sample data will be provided from the Welsh Newspapers Online newspaper archive. You will work toward having a list of stories you can query for information. At the end of the lesson, you will run a query to generate a graph of the number of newspaper stories in the database to see if there is a significant pattern.

    To do this lesson you will need a computer where you have permission to install software such as R and RStudio, if you are not running them already. In addition to programming in R, you will install some components of a database system called MySQL, which works on Windows, Mac, and Linux.

    Some knowledge of installing software as well as organizing data into fields is helpful for this lesson which is of medium difficulty.

  • Dealing with Big Data and Network Analysis Using Neo4j

    In this lesson, you will learn how to use a graph database to store and analyze complex networked information. Networks are all around us. Social scientists use networks to better understand how people are connected. This information can be used to understand how things like rumors or even communicable diseases can spread throughout a community of people.
    This tutorial will focus on the Neo4j graph database and the Cypher query language that comes with it.
    - Neo4j is a free, open-source graph database written in Java that is available for all major computing platforms.
    - Cypher is the query language for the Neo4j database that is designed to insert and select information from the database.
    By the end of this lesson you will be able to construct, analyze, and visualize networks based on big — or just inconveniently large — data. The final section of this lesson contains code and data to illustrate the key points of this lesson.

  • SPSS Data Curation Primer

    This data curation primer primarily discusses .sav and .por files. SPSS Statistics (.sav): Data files saved in IBM SPSS Statistics format. Portable (.por): Portable format that can be read by other versions of IBM SPSS Statistics and versions on other operating systems.
    This work was created as part of the Data Curation Network “Specialized Data Curation” Workshop #1 co-located with the Digital Library Federation (DLF) Forum 2018 in Las Vegas, Nevada on October 17-18, 2018.
    Table of Contents:
    - Description of Format
    - Example Data
    - Start the Conversation: Broad Questions and Clarifications on Research Data
    - Key Questions
    - Key Clarifications
    - Applicable Metadata Standards, Recommended Elements, and Readme File
    - Tutorials
    - Software
    - Preservation Actions
    - FAIR Principles & SPSS
    - Format Use
    - Documentation of Curation Process
    - Appendix A: Other SPSS File Formats
    - Appendix B: Project Level or Study Level Metadata
    - Appendix C: DDI Metadata
    - Appendix D: Dictionary Schema
    - Bibliography

    Other Data Curation Primers can be found at:  https://conservancy.umn.edu/handle/11299/202810.  Interactive primers available for download and derivatives at: https://github.com/DataCurationNetwork/data-primers.

     

  • STL Data Curation Primer

    An STL file stores information about 3D models. It is commonly used for printing 3D objects. The STL format approximates 3D surfaces of a solid model with oriented triangles (facets) of different size and shape (aspect ratio) in order to achieve a representation suitable for viewing or reproduction using digital fabrication. This format describes only the surface geometry of a three-dimensional object without any representation of color, texture, or other common model attributes. These files are usually generated as an end product of a 3D modeling or spatial capture process. The purpose of this primer is to guide a data curator through the curation process for STL files.
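    Because an STL surface is just a list of oriented triangles, a curator can inspect one programmatically. The sketch below reads the ASCII variant of the format; note that binary STL, which is far more common for large models, needs a different reader, and the embedded one-triangle solid is purely illustrative.

    ```python
    # A minimal one-facet ASCII STL solid, embedded for illustration.
    ascii_stl = """solid demo
    facet normal 0 0 1
      outer loop
        vertex 0 0 0
        vertex 1 0 0
        vertex 0 1 0
      endloop
    endfacet
    endsolid demo
    """

    def read_facets(text):
        """Collect each facet's three vertices from ASCII STL text."""
        facets, vertices = [], []
        for line in text.splitlines():
            parts = line.split()
            if parts[:1] == ["vertex"]:
                vertices.append(tuple(float(x) for x in parts[1:4]))
            elif parts[:1] == ["endfacet"]:
                facets.append(vertices)
                vertices = []
        return facets

    facets = read_facets(ascii_stl)
    print(len(facets))    # 1 triangle in this toy solid
    print(facets[0][1])   # (1.0, 0.0, 0.0)
    ```

    A quick facet count and coordinate spot-check like this can help confirm that a deposited STL file is intact before it is accepted into a repository.
    
    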

    This work was created as part of the Data Curation Network “Specialized Data Curation” Workshop #2 held at Johns Hopkins University on April 17-18, 2019.
    The full set of Data Curation Primers can be found at:  https://conservancy.umn.edu/handle/11299/202810.
    Interactive primers available for download and derivatives at:  https://github.com/DataCurationNetwork/data-primers.
     

  • Virtual Summer Camp: Computer Science

    Online classes and interactive camps taught by expert instructors are available with a focus on Computer Science in the Virtual Summer Camp catalog. The resources are aimed at K-12 students and cover topics such as introductions to Python, arrays, recursion, and Java theory; foundations of web development, including HTML and CSS; JavaScript programming; SQL for beginners; and more. Classes are added all the time. In addition, this site offers other online elementary, middle, and high school classes in large and small groups. Many courses are free; for others, fees apply. Other educational resources are available for instructors and students.

  • The Paper and The Data: Authors, Reviewers, and Editors Webinar on Updated Journal Practices for Data (and Software)

    The Paper and The Data workshop was first presented at the Ocean Science meeting held in February 2020.  Following that conference, it was updated and presented as five recorded modules for the purpose of sharing broadly.  The workshop consists of five modules on updated journal publisher practices for improving data and software sharing, targeted at journal authors, reviewers, and editors.  Both slides and video presentations of the slides are available.  Modules include:

    Module 1: Introduction

    • Challenges with Accessing Data
    • AGU Data Position Statement
    • Recommendations from the NAS
    • Updated Journal Guidelines
    • Benefits for Sharing Data/Software

    Module 2: Data

    • What Data
    • What Repository
    • Availability Statement
    • Citation
    • Examples

    Module 3: Software

    • What Software
    • Availability Statement
    • Citation
    • GitHub? Nope. But now what?
    • Examples

    Module 4: Peer Review

    • Recommendation from AGU
    • Examples

    Module 5: Persistent Identifiers

    • ORCID, DOI…
    • PID Graph

  • GeoJSON Data Curation Primer

    GeoJSON is a geospatial data interchange format for encoding vector geographical data structures, such as point, line, and polygon geometries, as well as their non-spatial attributes. The purpose of this primer is to guide a data curator through the curation process for GeoJSON files. Key questions for curation review:
    ● Are coordinates listed in the following format: [longitude, latitude, elevation]?
    ● Can the file be opened in a text editor and viewed in QGIS?
    ● Does the file pass validation?
    ● Are there associated metadata/README.md files?
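    The first review question can be partially automated: GeoJSON orders positions as [longitude, latitude, elevation], so longitudes should fall within [-180, 180] and latitudes within [-90, 90]. The sketch below, with a made-up single-point feature, shows the idea; note that a swapped pair whose values happen to fall inside both ranges would still pass this check.

    ```python
    import json

    # A hypothetical one-feature GeoJSON document for illustration.
    feature = json.loads("""{
      "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [-93.265, 44.977]},
      "properties": {"name": "Minneapolis"}
    }""")

    # GeoJSON positions are [longitude, latitude, (optional) elevation].
    lon, lat = feature["geometry"]["coordinates"][:2]
    valid = -180 <= lon <= 180 and -90 <= lat <= 90
    print(valid)  # True
    ```

    Because the format is plain JSON, the same approach extends to walking every geometry in a FeatureCollection, which is how a dedicated validator would check an entire file.
    
    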
    This work was created as part of the Data Curation Network “Specialized Data Curation” Workshop #2 held at Johns Hopkins University on April 17-18, 2019.
    The full set of Data Curation Primers can be found at:  https://conservancy.umn.edu/handle/11299/202810.
    Interactive primers available for download and derivatives at:  https://github.com/DataCurationNetwork/data-primers.
     

  • Confocal Microscopy Data Curation Primer

    The purpose of this primer is to guide a data curator through the curation process for confocal images. Confocal microscopy is a microscopy technique used to image objects that are too small to view with the unassisted human eye. The primary fields in which confocal microscopy is used are biology, health, engineering, and chemistry.  This primer describes the image specifics, as well as what details and metadata from the instrumentation and experiment are needed to understand the images and use them for further research or educational purposes.
    This work was created as part of the Data Curation Network “Specialized Data Curation” Workshop #2 held at Johns Hopkins University on April 17-18, 2019.
    The full set of Data Curation Primers can be found at:  https://conservancy.umn.edu/handle/11299/202810.
    Interactive primers available for download and derivatives at:  https://github.com/DataCurationNetwork/data-primers.

  • R Data Curation Primer

    The purpose of this primer is to guide a data curator through the curation process for text files with a “.R” extension that contain code for executing programs in the R language.
    Key questions for curation review:
    - What is the purpose of the file?
    - Are any data associated with the file?
    - Are the referenced data present at the indicated location?
    This work was created as part of the Data Curation Network “Specialized Data Curation” Workshop #2 held at Johns Hopkins University on April 17-18, 2019.
    The full set of Data Curation Primers can be found at:  https://conservancy.umn.edu/handle/11299/202810.
    Interactive primers available for download and derivatives at:  https://github.com/DataCurationNetwork/data-primers.
     

  • Tableau Data Curation Primer

    Tableau Software is a proprietary suite of products for data exploration, analysis, and visualization with an initial concentration in business intelligence. This primer focuses on the Tableau workbook files – .twb and .twbx – produced using Tableau Desktop. Like Microsoft Excel, Tableau Desktop uses a workbook and sheet file structure. Workbooks can contain worksheets, dashboards, and stories.
    Key questions for curation review
    ● Can the Tableau workbook file be opened?
    ● If the Tableau workbook is provided as a .twb file, is there an accompanying data source file or data extract?
    ● Is there documentation for how to navigate and work with the Tableau workbook?
    ● Is there an accompanying snapshot to show how a workbook, dashboard, or story view should be rendered?
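    The second key question above (does a .twb file have an accompanying data source?) is easier for packaged workbooks: a .twbx is a ZIP archive bundling the .twb XML with any extracts or data files. A minimal inspection sketch (the file extensions checked here are illustrative and should be verified against the Tableau version in hand):

    ```python
    import zipfile

    def inspect_twbx(path):
        """List the workbook and data files packaged inside a .twbx archive.

        A .twbx ("packaged workbook") is a ZIP archive containing the .twb
        workbook XML plus any extracts or data sources.
        """
        with zipfile.ZipFile(path) as z:
            names = z.namelist()
        workbooks = [n for n in names if n.lower().endswith(".twb")]
        data_files = [n for n in names
                      if n.lower().endswith((".hyper", ".tde", ".csv", ".xlsx"))]
        return workbooks, data_files
    ```

    If `inspect_twbx` finds a workbook but no data files, the deposit likely relies on a live connection and the curator should request the source data or an extract.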

    This work was created as part of the Data Curation Network “Specialized Data Curation” Workshop #2 held at Johns Hopkins University on April 17-18, 2019.
    The full set of Data Curation Primers can be found at: https://conservancy.umn.edu/handle/11299/202810
    Interactive primers available for download and derivatives at: https://github.com/DataCurationNetwork/data-primers

  • PDF Data Curation Primer

    The purpose of this primer is to guide a data curator through the curation process for Portable Document Format (PDF) files. As a widely used document publication format, PDF documents represent considerable bodies of important information globally and have become commonly used for publishing data and related files.
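    Before a full validation pass, a curator can run cheap sanity checks on a deposited PDF: the file should begin with a `%PDF-` header carrying the format version, and an `%%EOF` marker should appear near the end of the byte stream. A minimal sketch (not a validator; tools like veraPDF or JHOVE do the real work):

    ```python
    def pdf_quick_check(data: bytes):
        """Cheap sanity checks for a PDF byte stream: header version and EOF marker."""
        has_header = data.startswith(b"%PDF-")
        version = data[5:8].decode("ascii", "replace") if has_header else None
        has_eof = b"%%EOF" in data[-1024:]  # marker should sit near the end of the file
        return {"header": has_header, "version": version, "eof": has_eof}
    ```

    A failed header check usually means the file is mislabeled or truncated at the start; a missing `%%EOF` suggests an incomplete transfer.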
    This work was created as part of the Data Curation Network “Specialized Data Curation” Workshop #2 held at Johns Hopkins University on April 17-18, 2019.

    The full set of Data Curation Primers can be found at: https://conservancy.umn.edu/handle/11299/202810
    Interactive primers available for download and derivatives at: https://github.com/DataCurationNetwork/data-primers

  • Atlas.ti Data Curation Primer

    ATLAS.ti is a software application that allows researchers to analyze qualitative data in a systematic and transparent way, increasing the validity of results (Friese 2019). ATLAS.ti handles different types of data that are kept in a project. Project files can contain text documents, images, audio recordings, videos, PDF files, geodata, Twitter data, citations from Evernote and reference managers, and survey data. The purpose of this primer is to guide a data curator through the curation process for ATLAS.ti files.
    Key questions for curation review
    - What ATLAS.ti version was used?
    - Can other researchers open the project without the ATLAS.ti software?
    - Does the project include metadata/documentation/codebook?
    - Are there consent forms/participation agreements? Is there sensitive information that could compromise human subjects’ rights?
    - Are there associated data that have been exported outside the project (e.g., result reports, codebook)?

    This work was created as part of the Data Curation Network “Specialized Data Curation” Workshop #2 held at Johns Hopkins University on April 17-18, 2019.
    The full set of Data Curation Primers can be found at: https://conservancy.umn.edu/handle/11299/202810
    Interactive primers available for download and derivatives at: https://github.com/DataCurationNetwork/data-primers

  • Data Management and Reporting: BCO-DMO Data Management Services and Best Practices

    The University-National Oceanographic Laboratory System (UNOLS) hosted an Early Career Chief Scientist Training Workshop in June 2019. The goal of this workshop was to help early-career marine scientists plan and write effective cruise proposals, develop collaborative sampling strategies and plans, become familiar with shipboard equipment and sampling at sea, and communicate major findings through the writing of manuscripts and cruise reports. This presentation provides information on data management and reporting best practices for chief scientists. It includes information on the National Science Foundation (NSF) data policy requirements, writing a Data Management Plan (DMP), the data lifecycle, data publication, and shipboard data management recommendations.

  • Fundamentals of Remote Sensing [Introductory]

    These webinars are available for viewing at any time. They provide basic information about the fundamentals of remote sensing and are often a prerequisite for other ARSET training.

    OBJECTIVE
    Participants will become familiar with satellite orbits, types, resolutions, sensors, and processing levels. In addition to a conceptual understanding of remote sensing, attendees will also be able to articulate the advantages and disadvantages of remote sensing. Participants will also gain a basic understanding of NASA satellites, sensors, data, tools, portals, and applications for environmental monitoring and management.

    SESSIONS
    Session 1: Fundamentals of Remote Sensing
    A general overview of remote sensing and its application to disasters, health & air quality, land, water resource, and wildfire management.
    Session 1A: NASA's Earth Observing Fleet
    Get familiar with Earth-observing satellites in NASA's fleet, sensors that collect data you can use in ARSET training, and potential applications. 
    Session 2A: Satellites, Sensors, Data and Tools for Land Management and Wildfire Applications
    Specific satellites, sensors, and resources for remote sensing in land management and wildfires. This includes land cover mapping and products, fire detection products, detecting land cover change, and NDVI and EVI. 
    Session 2B: Satellites, Sensors, and Earth Systems Models for Water Resources Management
    Water resources management, an overview of relevant satellites and sensors, an overview of relevant Earth system models, and data and tools for water resources management. 
    Session 2C: Fundamentals of Aquatic Remote Sensing
    Overview of relevant satellites and sensors, and data and tools for aquatic environmental management. 
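    Session 2A mentions NDVI, the Normalized Difference Vegetation Index, which is derived from red and near-infrared surface reflectance. A minimal sketch of the computation (the interpretation thresholds in the comment are rough rules of thumb, not fixed standards):

    ```python
    def ndvi(nir, red):
        """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red).

        Inputs are surface reflectances in [0, 1]; output falls in [-1, 1].
        Dense green vegetation typically scores high (often above 0.5),
        while bare soil sits much lower and water is near or below zero.
        """
        denom = nir + red
        return (nir - red) / denom if denom else 0.0
    ```

    EVI follows a similar ratio construction but adds blue-band and canopy-background correction terms.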

  • Remote Sensing of Coastal Ecosystems [Introductory]

    Coastal and marine ecosystems serve key roles for carbon storage, nutrients, and materials cycling, as well as reservoirs of biodiversity. They also provide ecosystem services such as sustenance for millions of people, coastal protection against wave action, and recreational activities. Remote sensing of coastal and marine ecosystems is particularly challenging. Up to 90% of the signal received by the sensors in orbit comes from the atmosphere. Additionally, dissolved and suspended constituents in the water column attenuate most of the light received through absorption or scattering. When it comes to retrieving information about shallow-water ecosystems, even in the clearest waters under the clearest skies, less than 10% of the signal originates from the water and its bottom surface. Users, particularly those with little remote sensing experience, stand to benefit from this training covering some of the difficulties associated with remote sensing of coastal ecosystems, particularly beaches and benthic communities such as coral reefs and seagrass.

    OBJECTIVES
    By the end of this training, attendees will be able to:

    • Identify the different water column components and how they affect the remote sensing signal of shallow-water ecosystems
    • Outline existing satellite sensors used for ocean color and shallow-water ecosystem characterization
    • Understand the interaction between water constituents, the electromagnetic spectrum, and the remote sensing signal
    • Recognize the different processes used to remove the water column attenuation from the remotely-sensed signal to characterize benthic components
    • Summarize techniques for characterizing shoreline beach environments with remotely-sensed data and field methods for beach profiling
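    The water column attenuation described above is commonly modeled with the Beer–Lambert law: downwelling light decays exponentially with depth at a rate set by the diffuse attenuation coefficient Kd. A minimal sketch (Kd values are wavelength- and water-type-dependent; 0.1 per meter is merely an illustrative clear-water figure):

    ```python
    import math

    def light_fraction(depth_m, kd_per_m):
        """Fraction of surface downwelling light reaching a given depth,
        via the Beer-Lambert law: E(z)/E(0) = exp(-Kd * z)."""
        return math.exp(-kd_per_m * depth_m)
    ```

    At Kd = 0.1 per meter, under 10% of the light survives a 30 m descent, which is consistent with the training's point that only a small fraction of the at-sensor signal originates from the water and its bottom surface.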
    COURSE FORMAT
    • Three one-hour sessions with presentations in English and Spanish
    • One Google Form homework
    • Spanish sessions 
    PREREQUISITES
    Part One: Overview of Coastal Ecosystems and Remote Sensing
    • Introduction to coastal and marine ecosystems
    • Overview of sensors for remote sensing of coastal areas
    • Q&A
    Part Two: Penetration of Light in the Water Column
    • Apparent and inherent optical properties 
    • Field bio-optical measurements 
    • Water column corrections 
    • Deriving bathymetry and benthic characterization from multispectral data 
    • Validation and calibration of ocean color data 
    • Q&A
    Part Three: Remote Sensing of Shorelines
    • Geophysical components of shorelines 
    • The parts of a beach 
    • Field-based measurements in shorelines for image validation 
    • Image processing and analysis for shoreline characterization 
    • Q&A
    Each of the three parts includes links to the recordings, presentation slides, and Question & Answer transcripts.
  • Teledetección de Ecosistemas Costeros

    Marine and coastal ecosystems play vital roles in carbon storage and in the cycling of nutrients and other materials, and they serve as reservoirs of biodiversity. They also provide ecosystem services such as food for millions of people, coastal protection against wave action, and recreational activities. Remote sensing of coastal and marine ecosystems is particularly difficult. Up to 80% of the signal received by sensors in orbit comes from the atmosphere. In addition, dissolved and suspended constituents in the water column attenuate most of the light through absorption or scattering. When it comes to retrieving information about the ocean floor, even in the clearest waters, less than 10% of the signal comes from the seafloor. Users, particularly those with little remote sensing experience, can benefit from this training, which covers some of the difficulties associated with remote sensing of coastal ecosystems, particularly beaches and benthic communities such as coral reefs and seagrasses.

    LEARNING OBJECTIVES
    By the end of this training, attendees will be able to:

    • Identify the different components of the water column and how they affect the remote sensing signal of shallow-water ecosystems.
    • Describe existing satellite sensors used for ocean color analysis and for characterizing shallow-water ecosystems.
    • Understand the interaction between water constituents, the electromagnetic spectrum, and the remote sensing signal.
    • Recognize the different processes used to remove the water column attenuation from the remotely sensed signal in order to characterize benthic components.
    • Summarize techniques for characterizing coastal beach environments with remotely sensed data and field methods for beach profiling.

    COURSE FORMAT

    • Three one-hour sessions with presentations in English and Spanish
    • One homework assignment submitted via Google Forms
    • English

    Part One: An Overview of Coastal Ecosystems and Remote Sensing

    • Introduction to coastal ecosystems
    • An overview of the sensors most commonly used for remote sensing of coastal areas
    • Q&A

    Part Two: Penetration of Light in the Water Column

    • Apparent and inherent optical properties
    • Bio-optical field measurements
    • Water column corrections
    • Deriving bathymetry and benthic characterization from multispectral data
    • Calibration and validation of ocean color data
    • Q&A

    Part Three: Remote Sensing of Shoreline Components

    • Geophysical components of the shoreline
    • The parts of a beach
    • Shoreline field measurements needed for image validation
    • Image processing and analysis for shoreline characterization
    • Q&A
    Materials:
    • View recording
    • Presentation slides
    • Homework
    • Question & Answer transcript
  • Remote Sensing for Freshwater Habitats [Intermediate]

    Freshwater habitats play an important role in ecological function and biodiversity. Remote sensing of these ecosystems is primarily tied to observations of the drivers of biodiversity and ecosystem health. Remote sensing can be used to understand things like land use and land cover change in a watershed, habitat connectivity along a water body, water body location and extent, and water quality parameters. This webinar series will guide participants through using NASA Earth observations for habitat monitoring, specifically for freshwater fish and other species. The training will also provide a conceptual overview, as well as the tools and techniques for applying landscape environmental variables to genetic and habitat diversity in species. 

    Learning Objectives: By the end of this training, attendees will: 


    • understand the limitations of using remote sensing for freshwater habitats
    • find data and models that can be used in their landscape genetics and habitat monitoring work
    • see how remote sensing can be used for habitat restoration, ecological assessments, and climate change assessments relating to freshwater systems
    • be able to use the Riverscape Analysis Project decision support system
    • be familiar with the Freshwater Health Index


    Course Format: 


    • Three one-hour parts that include lectures, demonstrations, and question & answer sessions
    • This training will only be broadcast in English
    • A certificate of completion will be available to participants who attend all parts and complete all homework assignments. Note: certificates of completion only indicate the attendee participated in all aspects of the training, they do not imply proficiency on the subject matter, nor should they be seen as a professional certification.


    Prerequisites: Please complete ARSET's Fundamentals of Remote Sensing or have equivalent knowledge. Attendees who do not complete the prerequisite may not be prepared for the pace of the training.

    Part One: Review of Aquatic Remote Sensing & Freshwater Habitats
    As a result of this part of the webinar series, attendees will be able to: 


    • identify which NASA satellites & sensors can be used for freshwater monitoring
    • understand the limitations of remote sensing of freshwater habitats
    • find data and models they can use in landscape genetics and habitat monitoring work


    Part Two: Overview of the Riverscape Analysis Project (RAP)
    As a result of this part of the webinar series, attendees will be able to: 


    • understand using remote sensing for habitat restoration, ecological assessments, and climate change assessments relating to freshwater systems through case studies
    • use the RAP decision-support system for accessing, downloading, and applying remote sensing data


    Part Three: Overview of the Freshwater Health Index (FHI)
    As a result of this part of the webinar series, attendees will be able to: 


    • understand how to evaluate freshwater ecosystem health
    • have the ability to use the FHI data and tools to assess freshwater ecosystem health
    • identify potential uses of the FHI for their work and decision-making
    • use the FHI to identify vulnerabilities to degradation and/or climate change, as well as opportunities for improvement of infrastructure development within a basin


    Each of the three parts includes links to the recordings, presentation slides, and Question & Answer transcripts.

  • 'Good Enough' Research Data Management: A Brief Guide for Busy People

    This brief guide presents a set of good data management practices that researchers can adopt, regardless of their data management skills and levels of expertise.

  • Remote Sensing for Monitoring Land Degradation and Sustainable Cities Sustainable Development Goals (SDGs) [Advanced]

    The Sustainable Development Goals (SDGs) are an urgent call for action by countries to preserve our oceans and forests, reduce inequality, and spur economic growth. The land management SDGs call for consistent tracking of land cover metrics. These metrics include productivity, land cover, soil carbon, urban expansion, and more. This webinar series will highlight a tool that uses NASA Earth Observations to track land degradation and urban development that meet the appropriate SDG targets. 

    SDGs 11 and 15 relate to sustainable urbanization and land use and cover change. SDG 11 aims to "make cities and human settlements inclusive, safe, resilient, and sustainable." SDG 15 aims to "combat desertification, drought, and floods, and strive to achieve a land degradation neutral world." To assess progress towards these goals, indicators have been established, many of which can be monitored using remote sensing. 

    In this training, attendees will learn to use a freely-available QGIS plugin, Trends.Earth, created by Conservation International (CI) and have special guest speakers from the United Nations Convention to Combat Desertification (UNCCD) and UN Habitat. Trends.Earth allows users to plot time series of key land change indicators. Attendees will learn to produce maps and figures to support monitoring and reporting on land degradation, improvement, and urbanization for SDG indicators 15.3.1 and 11.3.1. Each part of the webinar series will feature a presentation, hands-on exercise, and time for the speaker to answer live questions. 

    Learning Objectives: By the end of this training, attendees will: 

    • Become familiar with SDG Indicators 15.3.1 and 11.3.1
    • Understand the basics of how to compute sub-indicators of SDG 15.3.1 such as productivity, land cover, and soil carbon
    • Understand how to use the Trends.Earth Urban Mapper web interface
    • Learn the basics of the Trends.Earth toolkit, including:
      • Plotting time series
      • Downloading data
      • Using default or custom data for productivity, land cover, and soil organic carbon
      • Calculating SDG 15.3.1 spatial layers and a summary table
      • Calculating urban change metrics
      • Creating urban change summary tables
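    Indicator 11.3.1, mentioned throughout, is defined by the UN as the ratio of the land consumption rate to the population growth rate, each expressed as an annualized logarithmic rate. A minimal sketch of that formula (verify against the UN metadata and the Trends.Earth implementation before using it for reporting):

    ```python
    import math

    def lcrpgr(urban_t1, urban_t2, pop_t1, pop_t2, years):
        """SDG Indicator 11.3.1: land consumption rate / population growth rate.

        Both rates are annualized logarithmic rates over the same period:
        rate = ln(value_t2 / value_t1) / years.
        """
        lcr = math.log(urban_t2 / urban_t1) / years
        pgr = math.log(pop_t2 / pop_t1) / years
        return lcr / pgr
    ```

    A value of 1.0 means built-up area is expanding at exactly the pace of population growth; values well above 1.0 signal sprawl.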



    Course Format: This training has been developed in partnership with Conservation International, United Nations Convention to Combat Desertification (UNCCD), and UN Habitat. 

    • Three, 1.5-hour sessions that include lectures, hands-on exercises, and a question and answer session
    • The first session will be broadcast in English, and the second session will contain the same content, broadcast in Spanish (see the separate record for the Spanish version at: https://dmtclearinghouse.esipfed.org/node/10935)


    ​Prerequisites: 


    Each of the three parts includes links to the recordings, presentation slides, exercises, and Question & Answer transcripts.

  • Teledetección para el Monitoreo de los ODS sobre la Degradación de Tierras y Ciudades Sostenibles

    The Sustainable Development Goals (SDGs) are an urgent call for action by all countries to preserve our oceans and forests, reduce inequality, and spur economic growth. The land management SDGs call for consistent tracking of land cover metrics. These metrics include productivity, land cover, soil carbon, urban expansion, and more. This webinar series will highlight a tool that uses NASA Earth observations to monitor land degradation and urban development in line with the targets of the relevant SDGs.

    SDGs 11 and 15 address sustainable urbanization as well as land use and land cover change. SDG 11 aims to "make cities and human settlements inclusive, safe, resilient, and sustainable." SDG 15 calls for efforts to "combat desertification, drought, and floods, and strive to achieve a land degradation-neutral world." To assess progress towards these goals, indicators have been established, many of which can be monitored through remote sensing.

    In this training, participants will learn to use a freely available QGIS plugin, Trends.Earth, created by Conservation International (CI). Trends.Earth allows users to plot time series of key land change indicators. Participants will learn to produce maps and figures to support monitoring and reporting on land degradation, improvement, and urbanization for SDG indicators 15.3.1 and 11.3.1. Each part of this series will feature a presentation, a hands-on exercise, and time to ask the speaker questions live.

    Learning Objectives:

    By the end of this training, attendees will:

    • Become familiar with SDG Indicators 15.3.1 and 11.3.1
    • Understand the basics of how to compute sub-indicators of SDG 15.3.1 such as productivity, land cover, and soil carbon
    • Learn to use the Trends.Earth Urban Mapper web interface
    • Learn the basics of the Trends.Earth toolkit, including:
      • Plotting time series
      • Downloading data
      • Using default or custom data for productivity, land cover, and soil organic carbon
      • Calculating SDG 15.3.1 spatial layers and a summary table
      • Calculating urban change metrics
      • Creating urban change summary tables


    Course Format:

    • This training was developed in partnership with Conservation International
    • Three 1.5-hour sessions that include lectures, hands-on exercises, and a question and answer session
    • The first session will be broadcast in English, and the second session will contain the same content, broadcast in Spanish.
    • A certificate of completion will be available to those who attend all three sessions and complete the homework assignment, which will be based on the webinar presentations. Note: certificates of completion only indicate that the holder participated in all aspects of the training; they do not imply proficiency in the subject matter, nor should they be seen as a professional certification.


    Part One

    In this session, attendees will learn about the SDG framework and worldwide inter-agency coordination; become familiar with SDG 15, Target 15.3, and Indicator 15.3.1; learn about the concept of net primary productivity and how to monitor that metric with remotely sensed data; and, as a hands-on exercise, learn how to visualize and interpret remotely sensed data associated with SDG 15 within Trends.Earth, a QGIS tool developed by Conservation International.

    • View recording »
      • Presentation slides »
      • Exercise 1 (sub-indicators) »
      • Exercise 1.2 (downloading results) »
      • Question & Answer transcript »


    Part Two

    In this session, attendees will learn about land cover change and soil organic carbon and how to monitor those metrics through remote sensing; learn about the reporting requirements for SDG 15; and visualize and interpret local remotely sensed data associated with the SDG within Trends.Earth.

    • View recording »
      • Presentation slides »
      • Exercise 2 »
      • Question & Answer transcript »


    Part Three

    In this session, attendees will learn about SDG 11, Target 11.3, and Indicator 11.3.1; learn about the inputs needed to calculate Indicator 11.3.1; and visualize and interpret urban area mapping within Trends.Earth.

    • View recording »
      • Presentation slides »
      • Exercise 3 »
      • Homework (complete by August 6) »
      • Question & Answer transcript »

  • SAR for Landcover Applications [Advanced]

    This webinar series will build on the knowledge and skills previously developed in ARSET SAR training. Presentations and demonstrations will focus on agriculture and flood applications. Participants will learn to characterize floods with Google Earth Engine. Participants will also learn to analyze synthetic aperture radar (SAR) for agricultural applications, including retrieving soil moisture and identifying crop types.

    Learning Objectives: By the end of this training, attendees will be able to: 

    1. analyze SAR data in Google Earth Engine
    2. generate soil moisture analyses
    3. identify different types of crops   


    Course Format: 

    • This webinar series will consist of two two-hour parts
    • Each part will include a presentation on the theory of the topic followed by a demonstration and exercise for attendees. 
    • This training is also available in Spanish. Please visit the Spanish page for more information.
    • A certificate of completion will also be available to participants who attend all sessions and complete the homework assignment, which will be based on the webinar sessions. Note: certificates of completion only indicate the attendee participated in all aspects of the training, they do not imply proficiency on the subject matter, nor should they be seen as a professional certification.


    Prerequisites: 
    Prerequisites are not required for this training, but attendees who have not completed ARSET's earlier SAR trainings may not be adequately prepared for the pace of the training.



    Part One: Monitoring Flood Extent with Google Earth Engine
    This session will focus on the use of Google Earth Engine (GEE) to generate flood extent products using SAR images from Sentinel-1. The first third of the session will cover the basic principles of radar remote sensing related to flooded vegetation. The remaining time in the session will be dedicated to a demonstration on how to use GEE to generate flood extent products with Sentinel-1.
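    The flood-extent approach described above rests on a simple physical fact: smooth open water is a specular reflector, so it returns very little radar energy to the sensor and appears dark in SAR backscatter. A minimal sketch of the classification step, independent of Google Earth Engine (the -15 dB default is illustrative only; real workflows calibrate the threshold per scene, e.g., with Otsu's method):

    ```python
    def flood_mask(backscatter_db, threshold_db=-15.0):
        """Classify pixels as open water where SAR backscatter (in dB)
        falls below a threshold; low returns indicate smooth water surfaces."""
        return [[pixel < threshold_db for pixel in row] for row in backscatter_db]
    ```

    In the GEE workflow taught here, the same thresholding is applied server-side to Sentinel-1 image collections, with additional steps (speckle filtering, terrain masking) that this sketch omits.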
    Part Two: Exploiting SAR to Monitor Agriculture
    Featuring guest speaker Dr. Heather McNairn, from Agriculture and Agri-Food Canada, this session will focus on using SAR to monitor different agriculture-related topics, building on the skills learned in the SAR agriculture session from 2018. The first part of the session will cover the basics of radar remote sensing as related to agriculture. The remainder of the session will focus on the use of SAR to retrieve soil moisture, identify crop types, and map land cover.

    Each of the two parts includes links to the recordings, presentation slides, and Question & Answer transcripts.