All Learning Resources

  • SQL for Librarians

    This Library Carpentry lesson introduces librarians to relational database management systems using SQLite. At the conclusion of the lesson you will understand what SQLite does and be able to use SQLite to summarise and link databases. DB Browser for SQLite (https://sqlitebrowser.org/) needs to be installed before the start of the training. The tutorial covers:
    1. Introduction to SQL
    2. Basic Queries
    3. Aggregation
    4. Joins and aliases
    5. Database design supplement
    Exercises are included with most of the sections.
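    As a rough illustration of the kind of aggregation and join query the lesson topics point to, here is a minimal sketch using Python's built-in sqlite3 module. The tables (branches, loans), their columns, and the sample rows are hypothetical and are not taken from the lesson materials, which use DB Browser for SQLite rather than Python.

```python
import sqlite3


def summarise_loans():
    """Build a tiny in-memory database and count loans per branch."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema and data, for illustration only.
    conn.executescript("""
        CREATE TABLE branches (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE loans (id INTEGER PRIMARY KEY, branch_id INTEGER, title TEXT);
        INSERT INTO branches VALUES (1, 'Central'), (2, 'Eastside');
        INSERT INTO loans VALUES
            (1, 1, 'Dune'), (2, 1, 'Emma'), (3, 2, 'Ulysses');
    """)
    # Join the two tables (using aliases b and l) and aggregate with COUNT.
    rows = conn.execute("""
        SELECT b.name AS branch, COUNT(l.id) AS n_loans
        FROM branches AS b
        JOIN loans AS l ON l.branch_id = b.id
        GROUP BY b.name
        ORDER BY b.name
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for branch, n_loans in summarise_loans():
        print(branch, n_loans)
```

    The same SELECT statement could be pasted into the Execute SQL tab of DB Browser for SQLite once equivalent tables exist.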

  • Coffee and Code: Natural Language Processing with Python

    GitHub repository for this workshop: https://github.com/unmrds/cc-nlp

    The processing and analysis of natural languages is a core requirement for extracting structured information from spoken, signed, or written language and for feeding that information into systems or processes that generate insights from, or responses to, the provided language data. Because natural languages evolved naturally and were not designed for a specific purpose, they pose significant challenges when developing automated systems.

    Natural Language Processing - the class of activities in which language analysis, interpretation, and generation play key roles - is used in many disciplines as is demonstrated by this random sample of recent papers using NLP to address very different research problems:

    "Unsupervised entity and relation extraction from clinical records in Italian" (1)
    "Candyflipping and Other Combinations: Identifying Drug–Drug Combinations from an Online Forum" (2)
    "How Can Linguistics Help to Structure a Multidisciplinary Neo Domain such as Exobiology?" (3)
    "Bag of meta-words: A novel method to represent document for the sentiment classification" (4)
    "Information Needs and Communication Gaps between Citizens and Local Governments Online during Natural Disasters" (5)
    "Mining the Web for New Words: Semi-Automatic Neologism Identification with the NeoCrawler" (6)
    "Distributed language representation for authorship attribution" (7)
    "Toward a computational history of universities: Evaluating text mining methods for interdisciplinarity detection from PhD dissertation abstracts" (8)
    "Ecological momentary interventions for depression and anxiety" (9)

  • Data and Software Skills Training for Librarians

    Library Carpentry is an open education volunteer network and lesson organization dedicated to teaching librarians data and software skills. The goal is to help librarians better engage with constituents and improve how they do their work. This presentation introduces how Library Carpentry formed in 2015, evolved into a global community of library professionals, and will continue as a future sibling of the Carpentries, an umbrella organization of distinct lesson organizations such as Data Carpentry and Software Carpentry. We’ll cover existing collaborative lesson development, curricula coverage, workshop activities and the global instructor community. We’ll then talk about the future coordinating activities led by the UC system to align with and prepare for a merger with Data Carpentry and Software Carpentry.

  • Train the Trainer Workshop: How do I create a course in research data management?

    Presentations and exercises from a train-the-trainer workshop on how to create a course in research data management, given at the International Digital Curation Conference 2018 in Barcelona.

  • Developing Data Management Education, Support, and Training

    These presentations were part of an invited guest lecture on data management for computer & information science & engineering (CISE) graduate students in the CAP 5108: Research Methods for Human-centered Computing course (https://www.cise.ufl.edu/class/cap5108sp18/) at the University of Florida (UF) on April 12, 2018. Graduate students were introduced to the DCC Checklist for a Data Management Plan, the OAIS Model (CESSDA adaptation), ORCiD, IR, high-performance computing (HPC) storage options at UF, data lifecycle models (USGS and UNSW), data publication guides (Beckles, 2018), and reproducibility guidelines (ACM SIGMOD 2017/2018). This was the first guest lecture on data management for UF CISE graduate students in this course. A draft of a reproducibility template is provided in version 3 of the guest lecture.

  • Code of Best Practices and Other Legal Tools for Software Preservation: 2019 Webinar Series

    Since 2015, the Software Preservation Network (SPN) has worked to create a space where organizations from industry, academia, government, cultural heritage, and the public sphere can contribute their myriad skills and capabilities toward collaborative solutions that will ensure persistent access to all software and all software-dependent objects. The organization's goal is to make it easier to deposit, discover, and reuse software.

    A key activity of the SPN is to provide webinar series on topics related to software preservation. The 2019 series includes:
    Episode 1: The Code of Best Practices for Fair Use in Software Preservation, Why and How?
    Episode 2: Beginning the Preservation Workflow
    Episode 3: Making Software Available Within Institutions and Networks
    Episode 4: Working with Source Code and Software Licenses
    Episode 5: Understanding the Anti-circumvention Rules and the Preservation Exemptions
    Episode 6: Making the Code Part of Software Preservation Culture
    Episode 7: International Implications

    See information about each episode separately.

  • Coffee and Code: Write Once Use Everywhere (Pandoc)

    Pandoc (http://pandoc.org) is a document processing program that runs on multiple operating systems (Mac, Windows, Linux) and can read and write a wide variety of file formats. In many respects, Pandoc can be thought of as a universal translator for documents. This workshop focuses on a subset of input and output document types, just scratching the surface of the transformations made possible by Pandoc.

    Click 00-Overview.ipynb on the provided GitHub page or go directly to the overview, here:
    https://github.com/unmrds/cc-pandoc/blob/master/00-Overview.ipynb

  • U.S. Fish and Wildlife Service National Conservation Training Center

    The National Conservation Training Center (NCTC) of the U.S. Fish and Wildlife Service (USFWS) provides a search service on top of a catalog of the courses related to data skills and data management that are offered at the NCTC physical location and online. The courses include instructor-led courses, online self-study courses, online instructor-led courses, and webinars. Some courses are free; others have a fee associated with them. Many of the courses use various GIS data sources and systems, including USFWS datasets that can be found at: https://www.fws.gov/gis/data/national/index.html. The NCTC provides a search interface on its home page.

  • LP DAAC Data Recipes

    A collection of tutorials that describe how to use Earth science data from NASA's Land Processes Distributed Active Archive Center (LP DAAC) using easily available tools and commonly used formats for Earth science data.  These tutorials are available to assist those wishing to learn or teach how to obtain and view these data. 

  • Genomics Workshop

    Getting Started
    This lesson assumes no prior experience with the tools covered in the workshop. However, learners are expected to have some familiarity with biological concepts, including nucleotide abbreviations and the concept of genomic variation within a population. 
    Workshop Overview
    Workshop materials include a recommendation for a dataset to be used with the lesson materials.
    Project organization and management:
    Learn how to structure your metadata, organize and document your genomics data and bioinformatics workflow, and access data on the NCBI sequence read archive (SRA) database.
    Introduction to the command line:
    Learn to navigate your file system, create, copy, move, and remove files and directories, and automate repetitive tasks using scripts and wildcards.
    Data wrangling and processing:
    Use command-line tools to perform quality control, align reads to a reference genome, and identify and visualize between-sample variation.
    Introduction to cloud computing for genomics:
    Learn how to work with Amazon AWS cloud computing and how to transfer data between your local computer and cloud resources.

  • The Horizon 2020 Open Research Data Pilot: Introduction to the Requirements of the Open Research Data Pilot

    This course provides an introduction to the European Commission's Open Research Data Pilot in Horizon 2020. It includes two sections: Introduction to the Requirements of the Open Research Data Pilot and How to Comply with the Requirements of the Open Research Data Pilot. Each section may include videos, presentation slides, demonstrations, associated readings, and quizzes which can be found at the URL to the home page for this course.
    Learning objectives:

    • Understand what is required of participants in the Horizon 2020 Open Research Data pilot
    • Learn about the concepts of open data, metadata, licensing and repositories
    • Identify key resources and services that can help you to comply with requirements
    • Undertake short tests to check your understanding

  • Software Preservation Network 2019 Webinar Series Episode 3: Making Software Available Within Institutions and Networks

    This episode is one of 7 in the Software Preservation Network's 2019 Webinar Series on Code of Best Practices and Other Legal Tools for Software Preservation.  Each episode is recorded;  both presentation slides and webinar transcript as well as links to supplementary resources are also available.   Information about the full series can be found at:  https://www.softwarepreservationnetwork.org/events 

    In this third episode in a seven-part series about using the Code of Best Practices in Fair Use, you’ll learn about:

    • How fair use enables institutions to provide access to software for use in research, teaching, and learning settings while minimizing any negative impact on ordinary commercial sales
    • How to provide broader networked access to software maintained and shared across multiple institutions, including off-premise access under some circumstances
    • Safeguards to minimize potential risks, such as the establishment of a mechanism to register concerns by stakeholders

  • Software Preservation Network 2019 Webinar Series Episode 6: Making the Code Part of Software Preservation Culture

    This episode is one of 7 in the Software Preservation Network's 2019 Webinar Series on Code of Best Practices and Other Legal Tools for Software Preservation.  Each episode is recorded;  both presentation slides and webinar transcript as well as links to supplementary resources are also available.   Information about the full series can be found at:  https://www.softwarepreservationnetwork.org/events 

    In this sixth episode in a seven-part series about using the Code of Best Practices in Fair Use, you’ll learn:

    • The difference between a document and a shift in practice
    • How other communities have incorporated fair use into their professional practice
    • How to talk to gatekeepers and to allies in your network, to strengthen field-wide practice

  • Software Preservation Network 2019 Webinar Series Episode 5: Understanding the Anti-circumvention Rules and the Preservation Exemptions

    This episode is one of 7 in the Software Preservation Network's 2019 Webinar Series on Code of Best Practices and Other Legal Tools for Software Preservation.  Each episode is recorded;  both presentation slides and webinar transcript as well as links to supplementary resources are also available.   Information about the full series can be found at:  https://www.softwarepreservationnetwork.org/events 

    In this fifth episode in a seven-part series about using the Code of Best Practices in Fair Use, you’ll learn:

    • What the DMCA anti-circumvention provisions are and how they relate to copyright, fair use, and the Code
    • How the triennial exemption rulemaking works and how SPN obtained an exemption for software (and video game) preservation
    • How to apply the new exemption to your own activities

  • Software Preservation Network 2019 Webinar Series Episode 4: Working with Source Code and Software Licenses

    This episode is one of 7 in the Software Preservation Network's 2019 Webinar Series on Code of Best Practices and Other Legal Tools for Software Preservation.  Each episode is recorded;  both presentation slides and webinar transcript as well as links to supplementary resources are also available.   Information about the full series can be found at:  https://www.softwarepreservationnetwork.org/events 

    In this fourth episode in a seven-part series about using the Code of Best Practices in Fair Use, you’ll learn:

    • How the Code treats preservation and access to source code in your collections
    • How software licenses interact with fair use
    • What kinds of software license provisions might prevent using fair use
    • When licenses bind (and do not bind) owners of physical copies of software
    • Non-copyright concerns associated with software licenses

  • Software Preservation Network 2019 Webinar Series Episode 7: International Implications

    This episode is one of 7 in the Software Preservation Network's 2019 Webinar Series on Code of Best Practices and Other Legal Tools for Software Preservation.  Each episode is recorded;  both presentation slides and webinar transcript as well as links to supplementary resources are also available.   Information about the full series can be found at:  https://www.softwarepreservationnetwork.org/events 

    In this seventh and final episode of a series about using the Code of Best Practices in Fair Use, you’ll learn: 

    • Why licensing isn’t a viable solution to copyright issues in preservation projects with global reach
    • How U.S. fair use law applies to initiatives that involve foreign materials
    • How preservationists in other countries can take advantage of local law (and the Code) to advance their work and the roles they can play in advocacy for better and more flexible copyright exceptions

  • Data Management Lifecycle and Software Lifecycle Management in the Context of Conducting Science

    This paper examines whether comparing digital science data curation lifecycles with software development lifecycles can provide insight into promoting sustainable science software. The goal of this paper is to start a dialog examining the commonalities, connections, and potential complementarities between the data lifecycle and the software lifecycle in support of sustainable software. We argue, based on this initial survey, that delving more deeply into the connections between data lifecycle approaches and software development lifecycles will enhance both in support of science.

  • Software Preservation Network 2019 Webinar Series Episode 2: Beginning the Preservation Workflow

    This episode is one of 7 in the Software Preservation Network's 2019 Webinar Series on Code of Best Practices and Other Legal Tools for Software Preservation.  Each episode is recorded;  both presentation slides and webinar transcript as well as links to supplementary resources are also available.   Information about the full series can be found at:  https://www.softwarepreservationnetwork.org/events 
    In the second episode in the series, speakers cover how to read the Situations in the Code (parsing descriptions, Principles, and Limitations), and explore how fair use applies to the foundational steps in a preservation workflow including stabilizing, describing, evaluating, and documenting software.

  • Software Preservation Network 2019 Webinar Series Episode 1: The Code of Best Practices for Fair Use in Software Preservation, Why and How?

    This episode is one of 7 in the Software Preservation Network's 2019 Webinar Series on Code of Best Practices and Other Legal Tools for Software Preservation.  Each episode is recorded;  both presentation slides and webinar transcript as well as links to supplementary resources are also available.   Information about the full series can be found at:  https://www.softwarepreservationnetwork.org/events 
    In this introduction to the webinar series, you’ll learn: 

    • What the Code is 
    • What the copyright doctrine of fair use is 
    • How it addresses problems such as making preservation copies, patron access onsite and remotely, sharing software with other institutions, and providing access to source code 
    • Why best practices codes are a robust, reliable guide to practice

  • Do-It-Yourself Research Data Management Training Kit for Librarians

    Online training materials designed for small groups of librarians who wish to gain confidence in and understanding of research data management. The DIY Training Kit is designed to contain everything needed to complete a similar training course on your own (in small groups) and is based on open educational materials. The materials have been enhanced with Data Curation Profiles and reflective questions based on the experience of academic librarians who have taken the course.
    The training kit includes:
    - Promotional slides for the RDM Training Kit
    - Training schedule
    - Research Data MANTRA online course by EDINA and Data Library, University of Edinburgh
    - Reflective writing questions
    - Selected group exercises (with answers) from UK Data Archive, University of Essex - Managing and sharing data: Training resources. September, 2011 (PDF). Complete RDM Resources Training Pack available: http://data-archive.ac.uk/create-manage/training-resources
    - Podcasts for short talks by the original Edinburgh speakers if running course without ‘live’ speakers (Windows or Quicktime versions).
    - Presentation files (pptx) if learners decide to take turns presenting each topic.
    - Evaluation forms
    - Independent study assignment: Interview with a researcher, based on Data Curation Profile, from D2C2, Purdue University Libraries and Boston University Libraries.

  • Seismic Data Quality Assurance Using IRIS MUSTANG Metrics

    Seismic data quality assurance involves reviewing data in order to identify and resolve problems that limit the use of the data – a time-consuming task for large data volumes! Additionally, no two analysts review seismic data in quite the same way. Recognizing this, IRIS developed the MUSTANG automated seismic data quality metrics system to provide data quality measurements for all data archived at IRIS Data Services. Knowing how to leverage MUSTANG metrics can help users quickly discriminate between usable and problematic data and it is flexible enough for each user to adapt it to their own working style.
    This tutorial presents strategies for using MUSTANG metrics to optimize your own data quality review. Many of the examples in this tutorial illustrate approaches used by the IRIS Data Services Quality Assurance (QA) staff.