USGS Science Support Framework
USGS Data Management Training Modules – Best Practices for Preparing Science Data to Share
This is one of six interactive modules created to help researchers, data stewards, managers and the public gain an understanding of the value of data management in science and provide best practices to perform good data management within their organization. In this module, you’ll learn:
- The importance of maintaining well-managed science data
- Nine fundamental practices scientists should implement when preparing data to share
- Associated best practices for each data management habit
Structuring and Documenting a USGS Public Data Release
This tutorial is designed to help scientists think about the best way to structure and document their USGS public data releases. The ultimate goal is to present data in a logical and organized manner that enables users to quickly understand the data. The first part of the tutorial describes the general considerations for structuring and documenting a data release, regardless of the platform being used to distribute the data. The second part of the tutorial describes how these general considerations can be implemented in ScienceBase. The tutorial is designed for USGS researchers, data managers, and collaborators, but some of the content may be useful for non-USGS researchers who need tips for structuring and documenting their data for public distribution.
Data Sharing and Management within a Large-Scale, Heterogeneous Sensor Network using the CUAHSI Hydrologic Information System
Hydrology researchers are collecting data using in situ sensors at high frequencies, for extended durations, and with spatial distributions that require infrastructure for data storage, management, and sharing. Managing streaming sensor data is challenging, especially in large networks with large numbers of sites and sensors. The availability and utility of these data in addressing scientific questions related to water availability, water quality, and natural disasters rely on effective cyberinfrastructure that facilitates transformation of raw sensor data into usable data products. They also depend on the ability of researchers to share and access the data in usable formats. In this presentation I will describe tools that have been developed for research groups and sites conducting long-term monitoring using in situ sensors. Functionality includes the ability to track equipment, deployments, calibrations, and other events related to monitoring site maintenance and to link this information to the observational data being collected, which is imperative in ensuring the quality of sensor-based data products. I will present these tools in the context of a data management and publication workflow case study for the iUTAH (innovative Urban Transitions and Aridregion Hydrosustainability) network of aquatic and terrestrial sensors. iUTAH researchers have developed and deployed an ecohydrologic observatory to monitor Gradients Along Mountain to Urban Transitions (GAMUT). The GAMUT Network measures aspects of water inputs, outputs, and quality along a mountain-to-urban gradient in three watersheds that share common water sources (winter-derived precipitation) but differ in the human and biophysical nature of land-use transitions. GAMUT includes sensors at aquatic and terrestrial sites for real-time monitoring of common meteorological variables, snow accumulation and melt, soil moisture, surface water flow, and surface water quality.
I will present the overall workflow we have developed, our use of existing software tools from the CUAHSI Hydrologic Information System, and new software tools that we have deployed for both managing the sensor infrastructure and for storing, managing, and sharing the sensor data.
Best practices for preparing data to share and preserve
Scientists spend considerable time conducting field studies and experiments, analyzing the data collected, and writing research papers, but an often overlooked activity is effectively managing the resulting data. The goal of this webinar is to provide guidance on fundamental data management practices that investigators should perform during the course of data collection to improve the usability of their data sets. Topics covered will include data structure, quality control, and data documentation. In addition, I will briefly discuss the data curation practices that archives use to ensure that data can be discovered and used in the future. When these practices are followed, data will be less prone to error, more efficiently structured for analysis, and more readily understandable for any future questions that they might help address.
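As a rough illustration of the quality control topic mentioned above, the following Python sketch flags missing, non-numeric, and out-of-range values in a tabular data set. It is not taken from the webinar; the function name `flag_quality_issues`, the column name `temp_c`, and the valid range are hypothetical choices made for the example.

```python
def flag_quality_issues(rows, column, valid_min, valid_max):
    """Return (row_index, problem) pairs for one column of a table.

    `rows` is a list of dictionaries, such as csv.DictReader produces.
    The checks are illustrative only: missing values, values that are
    not numeric, and values outside [valid_min, valid_max].
    """
    issues = []
    for index, row in enumerate(rows):
        value = row.get(column)
        raw = "" if value is None else str(value).strip()
        if raw == "":
            issues.append((index, "missing"))
            continue
        try:
            number = float(raw)
        except ValueError:
            issues.append((index, "not numeric"))
            continue
        if not (valid_min <= number <= valid_max):
            issues.append((index, "out of range"))
    return issues


# Hypothetical sample records; real data would usually be read from a file.
sample_rows = [
    {"site": "A", "temp_c": "12.5"},
    {"site": "B", "temp_c": ""},
    {"site": "C", "temp_c": "abc"},
    {"site": "D", "temp_c": "99"},
]
problems = flag_quality_issues(sample_rows, "temp_c", -5, 40)
```

Running checks like these while data are still being collected makes it easier to trace a flagged value back to its source than reviewing the data set only at the end of a project.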
Data citation and you: Where things stand today
Open data and the USGS Science Data Catalog
Python for Data Management
This Python training webinar is part of a technical webinar series created by the USGS Core Science Analytics, Synthesis, and Library section to improve data managers’ and scientists’ skills in using Python to perform basic data management tasks.
Who: These training events are intended for a wide array of users, ranging from those with little or no experience with Python to others who may be familiar with the language but are interested in learning techniques for automating file manipulation, batch generation of metadata, and other data management related tasks.
Requirements: This series will be taught using Jupyter notebook and the Python bundle that ships with the new USGS Metadata Wizard 2.x tool.
- Working with Local Files
- Batch Metadata Handling
- Using the USGS ScienceBase Platform with PySB
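To give a sense of the kind of task the first topic covers, here is a minimal sketch of working with local files using only the Python standard library. It is not drawn from the webinar materials; the function names `build_file_inventory` and `write_inventory`, the `*.csv` pattern, and the inventory columns are assumptions made for illustration.

```python
import csv
import hashlib
from pathlib import Path


def build_file_inventory(data_dir, pattern="*.csv"):
    """Return one record per matching file: name, size in bytes, SHA-256 checksum."""
    records = []
    for path in sorted(Path(data_dir).glob(pattern)):
        if not path.is_file():
            continue
        checksum = hashlib.sha256(path.read_bytes()).hexdigest()
        records.append(
            {
                "file_name": path.name,
                "size_bytes": path.stat().st_size,
                "sha256": checksum,
            }
        )
    return records


def write_inventory(records, out_path):
    """Write the inventory records to a CSV file with a header row."""
    fields = ["file_name", "size_bytes", "sha256"]
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields)
        writer.writeheader()
        writer.writerows(records)
```

A call such as `write_inventory(build_file_inventory("my_data"), "inventory.csv")`, where `my_data` is a hypothetical folder name, would produce a simple file listing that could accompany a data set or feed a later metadata step.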
ScienceBase as a Platform for Data Release
This video tutorial provides information about using ScienceBase as a platform for data release. We will describe the data release workflow and demonstrate, step-by-step, how to complete a data release in ScienceBase.