All Learning Resources

  • Data Management: File Organization

    Do you struggle with organizing your research data? This workshop teaches practical techniques for organizing your data files. Topics include: file and folder organizational structures, file naming, and versioning.
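
    As a purely illustrative, hypothetical sketch in Python of the kind of naming convention such a workshop might cover (the helper function and the name pattern below are invented for this example, not taken from the workshop materials):

      from datetime import date

      def build_filename(project, description, version, ext):
          """Build a predictable, sortable file name: project_description_YYYY-MM-DD_vNN.ext."""
          stamp = date.today().isoformat()   # ISO dates sort chronologically
          return f"{project}_{description}_{stamp}_v{version:02d}.{ext}"

      # Produces something like 'soilsurvey_rainfall-cleaned_2024-05-01_v02.csv'
      print(build_filename("soilsurvey", "rainfall-cleaned", 2, "csv"))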

  • Management Challenges in Research Infrastructures

    This module will look at some of the key management issues that arise within research infrastructures with a clarity and urgency they don’t often have below the infrastructural scale.  It will also look at key trends and developments in these areas, and how exemplar projects are applying them.  In particular, this module will cover: User Engagement, Communications and Audiences, Sustainability, and the Macro-level issues, including managing the political environment.

    This training module is targeted to the Intermediate Level student who wants to learn about digital humanities research infrastructures, and it approaches some of the major challenges in building and maintaining research infrastructures.  These materials are somewhat more dense than the Beginner Level module.  Students would benefit from a more comprehensive grounding in digital humanities and the management of research projects.

    PARTHENOS training provides modules and resources in digital humanities and research infrastructures, with the goal of strengthening the cohesion of research in the broad sector of Linguistic Studies, Humanities, Cultural Heritage, History, Archaeology and related fields.  Activities designed to meet this goal address the definition and implementation of joint policies and solutions for the humanities and linguistic data lifecycle, taking into account the specific needs of the sector.  These include joint training activities and modules on topics related to understanding research infrastructures and managing, improving and opening up research and data, for both learners and trainers.

    More information about the PARTHENOS project can be found at:  http://www.parthenos-project.eu/about-the-project-2.  Other training modules created by PARTHENOS can be found at:  http://training.parthenos-project.eu/training-modules/.

  • Introduction to Research Infrastructures

    This training module provides an introduction to research infrastructures targeted to the Beginner Level.  Beginner Level assumes only a moderate level of experience with digital humanities, and none with research infrastructures.  Units, videos and lectures are all kept to short, manageable chunks on topics that may be of general interest, but which are presented with an infrastructural twist.  By the end of this module, you should be able to…

    • Understand the elements of common definitions of research infrastructures
    • Be able to discuss the importance of issues such as sustainability and interoperability
    • Understand how research infrastructure supports methods and communities
    • Be aware of some common critiques of digital research infrastructures in the Humanities.

    PARTHENOS training provides modules and resources in digital humanities and research infrastructures, with the goal of strengthening the cohesion of research in the broad sector of Linguistic Studies, Humanities, Cultural Heritage, History, Archaeology and related fields.  Activities designed to meet this goal address the definition and implementation of joint policies and solutions for the humanities and linguistic data lifecycle, taking into account the specific needs of the sector.  These include joint training activities and modules on topics related to understanding research infrastructures and managing, improving and opening up research and data, for both learners and trainers.

    More information about the PARTHENOS project can be found at:  http://www.parthenos-project.eu/about-the-project-2.  Other training modules created by PARTHENOS can be found at:  http://training.parthenos-project.eu/training-modules/.

  • Manage, Improve and Open Up your Research and Data

    This module will look at emerging trends and best practice in data management, quality assessment and IPR issues.

    We will look at policies regarding data management and their implementation, particularly in the framework of a Research Infrastructure.
    By the end of this module, you should be able to:

    • Understand and describe the FAIR Principles and what they are used for
    • Understand and describe what a Data Management Plan is, and how it is used
    • Understand and explain what Open Data, Open Access and Open Science mean for researchers
    • Describe best practices around data management
    • Understand and explain how Research Infrastructures interact with and inform policy on issues around data management

    PARTHENOS training provides modules and resources in digital humanities and research infrastructures, with the goal of strengthening the cohesion of research in the broad sector of Linguistic Studies, Humanities, Cultural Heritage, History, Archaeology and related fields.  Activities designed to meet this goal address the definition and implementation of joint policies and solutions for the humanities and linguistic data lifecycle, taking into account the specific needs of the sector.  These include joint training activities and modules on topics related to understanding research infrastructures and managing, improving and opening up research and data, for both learners and trainers.

    More information about the PARTHENOS project can be found at:  http://www.parthenos-project.eu/about-the-project-2.  Other training modules created by PARTHENOS can be found at:  http://training.parthenos-project.eu/training-modules/.

  • Introduction to Collaboration in Research Infrastructures

    Is humanities research collaborative?  Some would say that with our traditions of independent research and single authorship, it is not. This is not really true for any humanist, however, as collaboration does occur within classrooms, on-line communities, and within disciplinary networks.  For the digital humanities, this is even more the case, as the hybridity of our methods requires us to work together.  Very few digital humanists can master entirely on their own the domain, information and software challenges their approach presents, and so we tend to work together.

    This training module provides an introduction to research infrastructures targeted to the Advanced Level, and as such, presents some of the exciting new research directions coming out of the PARTHENOS Cluster.  These modules approach some of the theoretical issues that shape the design, delivery and indeed the success of research infrastructure developments, challenging us to think about how we develop and support humanities research at scale in its interaction with technology.

    By the end of this module, you should be able to…

    • Understand what is meant by collaboration in humanities research
    • Be aware of how this model impacts upon the development of digital humanities, and digital humanities research infrastructures

    PARTHENOS training provides modules and resources in digital humanities and research infrastructures, with the goal of strengthening the cohesion of research in the broad sector of Linguistic Studies, Humanities, Cultural Heritage, History, Archaeology and related fields.  Activities designed to meet this goal address the definition and implementation of joint policies and solutions for the humanities and linguistic data lifecycle, taking into account the specific needs of the sector.  These include joint training activities and modules on topics related to understanding research infrastructures and managing, improving and opening up research and data, for both learners and trainers.

    More information about the PARTHENOS project can be found at:  http://www.parthenos-project.eu/about-the-project-2.  Other training modules created by PARTHENOS can be found at:  http://training.parthenos-project.eu/training-modules/.

  • Introduction to Research Data Management

    This slideshow was used in an Introduction to Research Data Management course taught for the Mathematical, Physical and Life Sciences Division, University of Oxford, on 2017-02-15. It provides an overview of some key issues, looking at both day-to-day data management and longer-term issues, including sharing and curation.  Various data policies pertinent to the UK are referenced, including Research Councils UK's Common Principles on Data Policy and the EPSRC Policy Framework on Research Data.  A research data skills guide and tools are also referenced.  The text of the slides is also available.

  • Preparing Your Research Material for the Future

    This slideshow was used in a Preparing Your Research Material for the Future course for the Humanities Division, University of Oxford, on 2018-06-08. It provides an overview of some key issues, focusing on the long-term management of data and other research material, including sharing and curation.

  • Science Data Resources: From Astronomy to Bioinformatics

    In this one-hour workshop, Michelle Hudson, Kayleigh Bohemier and Kristin Bogdan give an overview of the types, formats, sources and general reference resources of scientific data in various disciplines: geology, astronomy, physics, and physical samples. Rolando Garcia Milian, who recently joined the Cushing/Whitney Medical Library as Biomedical Sciences Research Support librarian, gives an overview of bioinformatics as a discipline and the kinds of questions he answers in the course of his work, including tools for data retrieval and data mining.

  • ISO Online Metadata Training

    Course Description: This course presents the concept, principles and value of metadata, utilizing International Organization for Standardization (ISO) metadata standards, in seven online sessions. It provides the content and structure of the ISO 191** series metadata in detail, along with methods for writing quality metadata. Each session will last approximately one hour. The URL brings you to an index (FTP) page for the course, which includes downloadable resources: agenda, exercises, handouts, presentation slides, recorded sessions, sample metadata, schemas, templates, transforms and workbooks.  Please contact [email protected] with any questions.

    Other online courses in the parent directory at ftp://ftp.ncddc.noaa.gov/pub/Metadata/Online_ISO_Training/ include introductions to CSDGM and to Geospatial Metadata.

  • Research Data Management: Can Librarians Really Help?

    This presentation provides information on how librarians can help with research data management, from the point of view of assisting researchers in the research lifecycle.  The presentation was made at GL20, the Twentieth International Conference on Grey Literature.

  • C++ Programming Tutorial Series

    This is everything you need to know to get started as a C++ software developer or software engineer. We start off with the super basics and work our way to intermediate topics.  Videos are available as an "All-in-One Tutorial Series" of 10 hours or as 101 shorter videos that range from introductory concepts to topics such as swap functions, function overloading, makefiles and namespaces, to name a few.

  • JavaScript Tutorial

    This playlist is an introductory course to the concepts behind JavaScript!  This series of 101 short videos will help you better understand how JavaScript works behind the scenes. The series covers everything you need to know to start building applications in JavaScript.  Fundamentals are covered first, followed by topics including object-oriented programming, scoping, hoisting, closures, ES6 classes, factory and constructor functions and more.

  • Introduction to Python GIS- CSC Training 2018

    Introduction to Python GIS is a set of six lessons organized by CSC Finland – IT Center for Science. During the course you will learn how to do different GIS-related tasks in the Python programming language. Each lesson is a tutorial on a specific topic (or topics), plus exercises, where the aim is to learn how to solve common GIS-related problems and tasks using Python tools. The course lecturer is Henrikki Tenkanen, a geo-data scientist and postdoctoral researcher at the Digital Geography Lab, University of Helsinki. These lessons are for those who know the basics of Python programming.  If Python is not familiar to you, we recommend starting with a course focusing on the basics of Python at geo-python.github.io.

    The majority of this course will be spent in front of a computer learning to program in the Python language and working on exercises. The provided exercises will focus on developing basic programming skills using the Python language and applying those skills to various GIS related problems.

    Learning objectives

    At the end of the course you should have a basic idea of how to conduct the following GIS tasks in Python (a short sketch of this kind of workflow follows the list):

    Read / write spatial data from/to different file formats
    Deal with different projections
    Conduct different geometric operations and spatial queries
    Convert addresses to points (+ vice versa) i.e. do geocoding
    Reclassify your data based on different criteria
    Know how to fetch data from OpenStreetMap easily with Python
    Know the basics of raster processing in Python
    Visualize data and create (interactive) maps
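
    As a rough sketch of what such a workflow can look like (assuming the geopandas and shapely packages are installed; the file names and coordinates below are placeholders, not course data):

      import geopandas as gpd
      from shapely.geometry import Point

      districts = gpd.read_file("districts.shp")           # read spatial data from file
      districts = districts.to_crs(epsg=4326)              # manage projections
      pt = Point(24.94, 60.17)                              # a (lon, lat) point near Helsinki
      hits = districts[districts.geometry.contains(pt)]     # a simple spatial query
      hits.to_file("districts_with_point.geojson", driver="GeoJSON")   # write spatial data back out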

    Course information

    Lesson 1: GIS with Python; Spatial data model; Geometric Objects; Shapely
    Lesson 2: Working with Geo Data Frames; Managing projections
    Lesson 3: Geocoding; Table join; Working with Open Street Map data
    Lesson 4: Geometric operations; Spatial queries
    Lesson 5: Visualization, making static and interactive maps
    Lesson 6: Raster processing in Python

  • Data Management Plan Templates

    Do you need a template to draft a data management plan?  Not everyone wants to use the DMPTool, and we understand. Maybe you would like to have a template that you can use in a classroom setting so your students can practice writing a plan.  Perhaps you would simply like to see what the requirements are for a given funder, so you can get a head start on your next grant proposal.

    The NSF and NEH templates are identical to the ones in the DMPTool.  The NIH and DOE templates were created in response to these funders' changing landscapes. Some funders have specific requirements for a program, and those guidance documents are also available here. All templates are in Word format; Rich Text format versions are available by request.

  • Data Management Plan Template

    This link takes you to an MS Word-based, downloadable Data Management Plan (DMP) template with tips on how to complete each section. Your completed DMP can be used in grant applications or put into practice as a protocol for handling data individually or within your research group or lab. This template provides a basic method of organizing your research data management information as you begin a new project. The template was created and made available as part of a workshop series on data management in Winter 2015.
     

  • Research Data Management (RDM) Open Training Materials

    Openly accessible, curated online training materials which can be shared and repurposed for RDM training. All contributions in any language are welcome.  Resources are stored on the Zenodo platform; formats, licenses and terms of use vary.

  • Data Management Plans - EUDAT best practices and case study

    Science, and more specifically projects using HPC, is facing a digital data explosion. Instruments and simulations are producing ever greater volumes of data; data can be shared, mined, cited, preserved… They are a great asset, but they face risks: storage can run out, data can be lost, and data can be misused… To start this session, we reviewed why it is important to manage research data and how to do this by maintaining a Data Management Plan. This was based on the best practices from the EUDAT H2020 project and European Commission recommendations. During the second part we interactively drafted a DMP for a given use case.  Presentation slides and a video recording of this event are available at the link given.

  • Data Services: Data Management Classes

    This LibGuide provides information on managing data and obtaining secondary data for research. The site includes videos on writing a data management plan, data management best practices, and links to tools and data sources. Presentation slides are also available, as well as references more specifically tailored to the University of Tennessee, Knoxville.

  • Essentials 4 Data Support

    Essentials 4 Data Support is an introductory course for those people who (want to) support researchers in storing, managing, archiving and sharing their research data.

    Essentials 4 Data Support is a product of Research Data Netherlands.

  • Metadata Management for Spatial Data Infrastructures

    This presentation will focus on creating geospatial metadata for spatial data infrastructures. The growing emphasis on data management practices in recent years has underscored the need for well-structured metadata to support the preservation and reuse of digital geographic information. Despite its value, creation of geospatial metadata is widely recognized as a complex and labor-intensive process, often creating a barrier to effective identification and evaluation of digital datasets. We will discuss our set of best practices for describing a variety of spatial content types using the ISO Series of Geographic Metadata Standards. We will share a series of Python and XSLT routines, which automate the creation of ISO-compliant metadata for geospatial datasets, web services, and feature catalogs. These auto-generation tools are designed to work directly with XML documents, making them suitable for use within any XML-aware cataloging platform. Our goals are to make metadata creation simpler for data providers, and to increase standardization across organizations in order to increase the potential for metadata sharing and data synchronization among the geospatial community.
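
    As a toy illustration of what metadata auto-generation in Python can look like (this is not the presenters' actual tooling, and the element names below are simplified placeholders rather than the full ISO 19115/19139 schema):

      import xml.etree.ElementTree as ET

      def make_record(title, abstract, bbox):
          """Assemble a minimal XML metadata record from a few descriptive fields."""
          root = ET.Element("metadata")
          ET.SubElement(root, "title").text = title
          ET.SubElement(root, "abstract").text = abstract
          extent = ET.SubElement(root, "extent")
          for name, value in zip(("west", "east", "south", "north"), bbox):
              ET.SubElement(extent, name).text = str(value)
          return ET.ElementTree(root)

      record = make_record("Sample dataset", "A demonstration record", (-109.1, -102.0, 36.9, 41.0))
      record.write("sample-metadata.xml", encoding="utf-8", xml_declaration=True)
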
  • Dash: Making Data Sharing Easier

    Dash is a self-service tool for researchers to select, describe, identify, upload, update, and share their research data. 

    For more information about Dash go to https://cdlib.org/services/uc3/dryad/.   Dash is the front end to the Dryad repository platform.

  • Rethinking Research Data | Kristin Briney | TEDxUWMilwaukee

    The United States spends billions of dollars every year to publicly support research that has resulted in critical innovations and new technologies. Unfortunately, the outcome of this work, published articles, only provides the story of the research and not the actual research itself. This often results in the publication of irreproducible studies or even falsified findings, and it requires significant resources to discern the good research from the bad. There is a way to improve this process, however, and that is to publish both the article and the data supporting the research. Shared data helps researchers identify irreproducible results. Additionally, shared data can be reused in new ways to generate new innovations and technologies. We need researchers to “React Differently” with respect to their data to make the research process more efficient, transparent, and accountable to the public that funds them.

    Kristin Briney is a Data Services Librarian at the University of Wisconsin-Milwaukee. She has a Ph.D. in physical chemistry and a master's in library and information studies, and currently works to help researchers manage their data better. She is the author of “Data Management for Researchers” and regularly blogs about data best practices at dataabinitio.com.

    This talk was given at a TEDx event using the TED conference format but independently organized by a local community. Learn more at http://ted.com/tedx

  • Introduction to Lidar

    This self-paced, online training introduces several fundamental concepts of lidar and demonstrates how high-accuracy lidar-derived elevation data support natural resource and emergency management applications in the coastal zone.  Note: requires the Adobe Flash plugin.

    Learning objectives:

    • Define lidar
    • Select different types of elevation data for specific coastal applications
    • Describe how lidar are collected
    • Identify the important characteristics of lidar data
    • Distinguish between different lidar data products
    • Recognize aspects of data quality that impact data usability
    • Locate sources of lidar data
    • Discover additional information and additional educational resources

     

  • OntoSoft Tutorial: A distributed semantic registry for scientific software

    An overview of the OntoSoft project, an intelligent system to assist scientists in making their software more discoverable and reusable.

    For more information on the OntoSoft project, go to https://imcr.ontosoft.org/.

  • Digital Preservation Handbook

    The Handbook provides an internationally authoritative and practical guide to the subject of managing digital resources over time and the issues in sustaining access to them. It is a key knowledge base for digital preservation, peer-reviewed and freely accessible to all, and will be of interest to all those involved in the creation and management of digital materials.
    The contents page provides an "at a glance" view of the major sections and all their component topics. You can navigate the Handbook by clicking and expanding the "Explore the Handbook" navigation bar or by clicking links in the contents page.
    The contents are listed hierarchically and indented to show major sections and sub-sections. Landing pages provide overviews and information for major sections with many sub-sections.
    Contents:
    -Introduction
    -Digital preservation briefing [landing page]
    -Getting started
    -Institutional strategies [landing page]
    -Organizational activities [landing page]
    -Technical solutions and tools [landing page]
    -Content-specific preservation [landing page]
    -Glossary

  • Data Management Lifecycle and Software Lifecycle Management in the Context of Conducting Science

    This paper examines the potential for comparisons of digital science data curation lifecycles to software lifecycle development to provide insight into promoting sustainable science software. The goal of this paper is to start a dialog examining the commonalities, connections, and potential complementarities between the data lifecycle and the software lifecycle in support of sustainable software. We argue, based on this initial survey, that delving more deeply into the connections between data lifecycle approaches and software development lifecycles will enhance both in support of science.

  • USGS Data Management Training Modules—Metadata for Research Data

    This is one of six interactive modules created to help researchers, data stewards, managers and the public gain an understanding of the value of data management in science and provide best practices to perform good data management within their organization. This module covers metadata for research data. The USGS Data Management Training modules were funded by the USGS Community for Data Integration and the USGS Office of Organizational and Employee Development's Technology Enabled Learning Program in collaboration with Bureau of Land Management, California Digital Library, and Oak Ridge National Laboratory. Special thanks to Jeffrey Morisette, Dept. of the Interior North Central Climate Science Center; Janice Gordon, USGS Core Science Analytics, Synthesis, and Libraries; National Indian Programs Training Center; and Keith Kirk, USGS Office of Science Quality Information.

  • The Oxford Common File Layout

    The Oxford Common File Layout (OCFL) specification describes an application-independent approach to the storage of digital information in a structured, transparent, and predictable manner. It is designed to promote long-term object management best practices within digital repositories.  This presentation was given under the topic of Preservation Tools, Techniques and Policies for the Research Data Alliance Preserving Scientific Annotation Working Group on April 4, 2017.
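
    As a loose, non-normative sketch of the kind of layout the specification describes (simplified from the public OCFL documentation; consult the specification itself for the authoritative structure and the full inventory format), an OCFL-style object root pairs an inventory with immutable version directories, which the Python snippet below mocks up on disk:

      import json
      from pathlib import Path

      root = Path("object-001")
      (root / "v1" / "content").mkdir(parents=True, exist_ok=True)      # first version's payload directory
      (root / "0=ocfl_object_1.0").write_text("ocfl_object_1.0\n")      # conformance marker file
      (root / "v1" / "content" / "data.csv").write_text("a,b\n1,2\n")   # example content file
      inventory = {"id": "object-001", "head": "v1", "digestAlgorithm": "sha512"}   # heavily abridged
      (root / "inventory.json").write_text(json.dumps(inventory, indent=2))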
     

  • Data Champions: leading the way to proper research data management

    Presentation given by Esther Plomp at the "FAIR Data - the Key to Sustainable Research" seminar at Tartu University Library on the 9th of April 2019.

    Data Champions are experts in data management who share their experience with their group/department members. They are volunteers who act as advocates for good data management and sharing practices, and they help their Faculty’s Data Steward with discipline-specific understandings of Research Data Management (RDM). The Data Champion programme started in September 2018 at TU Delft and is open to researchers at all levels (PhD students to professors) as well as support staff (data managers, software developers and technicians). As the Data Champions are members of all faculties and various departments of Delft University of Technology (TU Delft), they form a network across the university campus. The Champions are invited to meetings where they interact with each other and share their experiences, such as achievements and problems that they encounter in managing data and software. They are also encouraged to participate at a national and international level by being kept informed of current trends in data management, and a travel grant is available that allows them to participate in RDM events, training and workshops. At TU Delft they are actively working together with the Data Stewards on RDM policy development, and they are also involved in more practical activities such as coding support and Software Carpentry workshops. These activities increase the visibility and impact of the Data Champions, recognise their data management efforts, and offer them opportunities to learn new skills which they can share with their local community members.

  • Reproducible Quantitative Methods: Data analysis workflow using R

    Reproducibility and open scientific practices are increasingly being requested or required of scientists and researchers, but training on these practices has not kept pace. This course, offered by the Danish Diabetes Academy, intends to help bridge that gap. This course is aimed mainly at early career researchers (e.g. PhD and postdocs) and covers the fundamentals and workflow of data analysis in R.

    This repository contains the lesson, lecture, and assignment material for the course, including the website source files and other associated course administration files. 

     By the end of the course, students will have:

    1. An understanding of why an open and reproducible data workflow is important.
    2. Practical experience in setting up and carrying out an open and reproducible data analysis workflow.
    3. The knowledge needed to continue learning methods and applications in this field.

    Students will develop proficiency in using the R statistical computing language, as well as improving their data and code literacy. Throughout this course we will focus on a general quantitative analytical workflow, using the R statistical software and other modern tools. The course will place particular emphasis on research in diabetes and metabolism; it will be taught by instructors working in this field and it will use relevant examples where possible. This course will not teach statistical techniques, as these topics are already covered in university curricula.

    For more detail on the course, check out the syllabus at:  https://dda-rcourse.lwjohnst.com.
     

  • Template Research Data Management workshop for STEM researchers

    These materials are designed as a template for an introductory Research Data Management workshop for STEM postgraduate students and Early Career Researchers. The workshop is interactive and is designed to be run for 2-3 hours, depending on which sections of the workshop are delivered. As it is a template workshop, there is a lot of material in order to cover all disciplines, and it is unlikely that all sections would be of interest to any one group of researchers. The sections are:
    Introduction
    Backup and file sharing
    How to organise your data well
    Data Tools
    Personal and sensitive data
    Data sharing
    Data Management Plans

    The workshop works best when adapted for a particular discipline and with a maximum of 30 participants. This workshop was developed for the Data Champions programme at the University of Cambridge and is an adaptation of workshops which are run on a regular basis for PhD students and Postdoctoral researchers. If you would like any more information please email [email protected] and we would be happy to answer any questions that you have.

  • Developing Data Management Education, Support, and Training

    These presentations were part of an invited guest lecture on data management for CISE graduate students of the CAP5108: Research Methods for Human-centered Computing course at the University of Florida (UF) on April 12, 2018. Graduate students were introduced to the DCC Checklist for a Data Management Plan, the OAIS Model (CESSDA adaptation), ORCiD, IR, high-performance computing (HPC) storage options at UF, data lifecycle models (USGS and UNSW), data publication guides (Beckles, 2018) and reproducibility guidelines (ACM SIGMOD 2017/2018). This was the first guest lecture on data management for UF computer & information science & engineering (CISE) graduate students in CAP 5108: Research Methods for Human-centered Computing - https://www.cise.ufl.edu/class/cap5108sp18/.  A draft of a reproducibility template is provided in version 3 of the guest lecture.

  • How to motivate researcher engagement?

    Presentation about Data Stewardship at TU Delft and the Data Champions programme at Cambridge University, given at the Dutch LCRDM (Landelijk Coördinatiepunt Research Data Management) Data Steward meeting on 1 December 2017.  Topics covered include suggestions by data stewards about how to approach and persuade researchers to engage in data management and stewardship activities.

  • CURATE! The Digital Curator Game

    The CURATE game is designed to be used as an exercise that prompts players to put themselves into digital project scenarios in order to address issues and challenges that arise when institutions engage with digital curation and preservation.

    Developed as a means to highlight the importance of training in digital curation among practitioners and managers working in libraries, museums and cultural heritage institutes, the game has been used as a self-assessment tool, a team-building exercise and a training tool for early career students.

    The CURATE game package includes:

    • Welcome to CURATE Presentation
    • Game Board (PDF)
    • Game Cards (PDF)
    • About the Game (PDF)
    • Rules (PDF)
    • Record Sheet & Closing Questions (PDF)
    • Frequently Asked Questions (DOC)

  • Coffee and Code: R & RStudio

    What is R?

    R is an [Open Source](https://opensource.org) programming language that is specifically designed for data analysis and visualization. It consists of the [core R system](https://cran.r-project.org) and a collection of (currently) over [13,000 packages](http://cran.cnr.berkeley.edu) that provide specialized data manipulation, analysis, and visualization capabilities. R is an implementation of the *S* statistical language developed in the mid-1970s at Bell Labs; development of R itself started in the early 1990s, with a stable beta version available by 2000. R has been under continuous development for over 25 years and has hit major development [milestones](https://en.wikipedia.org/wiki/R_(programming_language)) over that time.

    R syntax is relatively straightforward and is based on a core principle of providing reasonable default values for many functions while allowing a lot of flexibility and power through the use of optional parameters.

  • Train the Trainer Workshop: How do I create a course in research data management?

    Presentations and exercises from a train-the-trainer workshop on how to create a course in research data management, given at the International Digital Curation Conference 2018 in Barcelona.

  • Data and Software Skills Training for Librarians

    Library Carpentry is an open education volunteer network and lesson organization dedicated to teaching librarians data and software skills. The goal is to help librarians better engage with constituents and improve how they do their work. This presentation serves as an introduction to how Library Carpentry formed in 2015 and evolved as a global community of library professionals, and to how it will continue as a future sibling of the Carpentries, an umbrella organization of distinct lesson organizations such as Data Carpentry and Software Carpentry. We’ll cover existing collaborative lesson development, curricula coverage, workshop activities and the global instructor community. We’ll then talk about the future coordinating activities led by the UC system to align and prepare for a merger with Data and Software Carpentry.

  • Workshop: Research Data Management in a Nutshell

    The workshop Research Data Management in a Nutshell was part of the Doctoral Day of the Albertus Magnus Graduate Center (AMGC) at the University of Cologne on January 18, 2018.

    The workshop was intended as a brief, interactive introduction to RDM for beginning doctoral students.

  • Ecology Curriculum

    This workshop uses a tabular ecology dataset from the Portal Project Teaching Database and teaches data cleaning, management, analysis, and visualization. There are no prerequisites, and the materials assume no prior knowledge about the tools. We use a single dataset throughout the workshop to model the data management and analysis workflow that a researcher would use.
    Lessons:

    • Data Organization in Spreadsheets
    • Data Cleaning with OpenRefine
    • Data Management with SQL
    • Data Analysis and Visualization in R
    • Data Analysis and Visualization in Python


    The Ecology workshop can be taught using R or Python as the base language.
    Portal Project Teaching Dataset: the Portal Project Teaching Database is a simplified version of the Portal Project Database designed for teaching. It provides a real-world example of life-history, population, and ecological data, with sufficient complexity to teach many aspects of data analysis and management, but with many complexities removed to allow students to focus on the core ideas and skills being taught.
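
    A minimal sketch of the Python side of this workflow (assuming pandas is installed and the Portal teaching surveys table has been downloaded locally; the file name is a placeholder, and the column names follow the published teaching dataset):

      import pandas as pd

      surveys = pd.read_csv("surveys.csv")                  # load the tabular ecology data
      surveys = surveys.dropna(subset=["weight"])           # a simple cleaning step
      mean_weight = surveys.groupby("species_id")["weight"].mean()   # summarize by species
      print(mean_weight.head())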
     

  • Planning for Software Reproducibility and Reuse

    Many research projects depend on the development of scripts or other software to collect data, perform analyses or simulations, and visualize results.  Working in a way that makes it easier for your future self and others to understand and reuse your code means that more time can be dedicated to the research itself rather than to troubleshooting hard-to-understand code, which makes for more effective research. In addition, following some simple best practices around code sharing can increase the visibility and impact of your research.  In this introductory session (a small illustration follows the topic list below), you will:

    • learn about best practices for writing, documenting (Documentation), and organizing code (Organization & Automation),
    • understand the benefits of using version control (Version Control & Quality Assurance),
    • learn about how code can be linked to research results and why (Context & Credit),
    • understand why it is important to make your code publishable and citable and how to do so (Context & Credit),
    • learn about intellectual property issues (Licensing),
    • learn about how and why your software can be preserved over time (Archiving).
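
    As a small, hypothetical illustration of the documentation habits the session covers (the function and its comments are invented for this example, not taken from the session materials):

      def normalize(values):
          """Scale a list of numbers to the 0-1 range.

          A short docstring recording inputs and outputs makes the code easier
          for your future self and collaborators to understand and reuse.
          """
          lo, hi = min(values), max(values)
          return [(v - lo) / (hi - lo) for v in values]

      if __name__ == "__main__":
          print(normalize([3, 5, 9]))   # [0.0, 0.333..., 1.0]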