The Center for Data and Bioinformation Services offers workshops on a wide variety of topics. Below is a list of recently offered workshops and links to available slides, handouts, and recordings. Our most recent instruction offerings are also listed in Open Science Framework (OSF).
If you would like to request a workshop tailored specifically to your interests or group, please contact data@hshsl.umaryland.edu to discuss.
This workshop series included sessions on R Basics, Data Wrangling, and Data Visualization (the 2-session series included only Data Wrangling and Data Visualization).
The R Basics session provides a solid foundation in working with R and RStudio and lays the groundwork to enable participants to explore more advanced topics in R programming. The Data Wrangling in R session introduces participants to the basics of getting started with R and RStudio and introduces the workhorse package dplyr. Participants will get hands-on experience wrangling real datasets. The Data Visualization in R session explores how to use ggplot2, a robust Tidyverse package used to create high-quality graphics for exploring and communicating your data. We go beyond basic graphs and learn how to customize and annotate our graphs for more effective storytelling. Participants will have the best experience if they have attended session one in this series or have some previous experience with R and the Tidyverse.
Materials (2-sessions) (updated July 2024)
This workshop series included sessions introducing the Shell, R and RStudio, and Python. The Intro to Shell session introduces concepts essential for using the command line for bioinformatics, such as navigating the file system, computationally manipulating your files (including copying, moving, and renaming), searching files, redirecting output, and writing shell scripts. The Intro to R and RStudio session introduces how R is used for data science and bioinformatics, including R syntax, data types, and how to set up and use RStudio and the popular Tidyverse collection of packages for data wrangling and visualization. The Intro to Python session introduces how to use Python for data science and bioinformatics, including Python syntax and package management, using both a text editor and Jupyter notebooks to write and execute Python code.
Materials (updated October 2023)
Go beyond the basics and learn important practical skills for working with data in R! This partnership with NNLM Region 1 was a pilot program for librarians to build a community of practice around R/RStudio following an initial introduction to R through Library Carpentry. Over 6 live sessions, we covered topics such as data transformation, visualization, and communication using tidyr, lubridate, ggplot2, and RMarkdown/Quarto packages. Participants were encouraged to work on their own data projects and get feedback in a supportive, collaborative learning environment. Learning objectives included: 1) practice new programming skills with real world data; 2) apply programming techniques to work with and visualize data; and 3) develop the confidence and skills needed to independently solve programming problems.
Materials (updated July 2023)
Get organized and avoid a "data disaster"! This workshop provides basic strategies and best practices for effectively managing research data to ensure its organization and accessibility. Topics covered include: funder and journal requirements for data management and sharing, standards for file naming and structure, resources for data management planning and sharing, and strategies for storing data during research and preserving it for the future.
Materials (updated January 2024)
This seminar provides an overview of the 2023 NIH Data Management and Sharing Policy (DMSP) and resources available to help comply with requirements. The goal of this seminar is to prepare participants to meet requirements. Participants will learn what the NIH DMSP entails and how it differs from the existing policy, as well as tips and resources for preparing data management plans. The seminar covers the free online application DMPTool, strategies for choosing a data repository, and insights into data sharing best practices.
Materials (updated July 2023)
Researchers, do you have an upcoming grant application that requires you to write a data management or data sharing plan? Are you striving to maintain well-organized research projects? In this workshop, we will cover the components of good data management plans with a particular eye toward NIH data sharing requirements. Participants will also be introduced to DMPTool, an online platform that provides plan templates and guidance from most major funders. Additionally, participants will have time during the workshop to work on plans for their own projects, ask questions, and receive guidance from the instructors.
Materials (updated February 2024)
In this interactive, hands-on workshop, we will be learning to use OSF, an open-source project management tool from the Center for Open Science. Learn about the main features, understand use cases, and practice setting up a project. Learning objectives include 1) understand what OSF is and some examples of how it can be used; 2) become familiar with the main features of an OSF project; 3) understand how to affiliate research with UMB and find affiliated research; and 4) practice creating, forking, and linking projects and components.
Materials (updated February 2022)
This workshop focuses on NIH data sharing requirements, UMB options for collecting and storing data, and options for data repositories.
Materials (updated April 2024)
Audience: SOM Center for International Health, Education, and Biosecurity (CIHEB)
This presentation discusses the benefits of data repositories, different repository models (including subject types and access levels), getting your data ready for deposit, and a repository complement or alternative, the UMB Data Catalog. Learning objectives include 1) become familiar with data sharing requirements under the 2023 NIH Data Management and Sharing Policy; 2) understand the benefits of using a data repository; 3) identify several different repository models; 4) become familiar with the UMB Data Catalog; and 5) understand factors that go into choosing a repository.
Materials (updated April 2023)
Wrangling. Munging. Data Sanitation. These and other names describe an aspect of the data analysis life cycle typically thought of as boring and unglamorous, but which occupies the majority of time spent during a data analysis project. The time you spend in preparing your data for analysis, while crucial, cuts into the time available for using software to produce a visualization, calculate a statistic, or run a favorite machine learning algorithm. The goal of this seminar is to provide a reproducible workflow for performing your own data wrangling. I will suggest methods to help you to: 1) get to know your data, 2) cultivate habits that will help you to spend less time on wrangling, and 3) optimally prepare your data for the output you're interested in producing.
Materials (updated August 2020)