Bioinformatics

Covid-19 Cases in the Central Valley

Data Source: USA Facts, downloaded July 6, 2020

    library("tidyverse")
    library("zoo")

    start_date <- "5/28/20"
    end_date <- "7/5/20"
    county_list <- c("Santa Clara", "Stanislaus", "Calaveras", "San Benito", "Merced", "Tuolumne", "Fresno", "Madera", "Mariposa")
    lag <- 7 # number of days for rolling average

    # loads files
    cases_raw <- read_csv("covid_confirmed_usafacts.csv")
    populations <- read_csv("covid_county_population_usafacts.csv")

Data Wrangling

    raw_data_merged <- cases_raw %>%
      full_join(populations, by = c("County Name", "State"))

    # find column positions by date
    column_names <- colnames(raw_data_merged)
    start_loc <- match(start_date, column_names)
    end_loc <- match(end_date, column_names)

    cases_filtered <- cases_raw %>%
      filter(State == "CA") %>%
      select("County Name", all_of(start_loc:end_loc))

    populations_filtered <- populations %>%
      filter(State == "CA") %>%
      select("County Name", "population")

    df_merged <- cases_filtered %>%
      full_join(populations_filtered, by = "County Name")

    df_clean <- df_merged %>%
      # avoids unallocated cases and the cruise ship!
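The date-column lookup and the rolling average set up above can be sketched in base R on a small made-up table. This is only an illustration under stated assumptions: `toy_cases`, its values, and the `rolling_mean` helper are hypothetical, the window is shortened to 3 days so the toy data is enough, and the real analysis (which loads zoo) may well use that package's rolling functions instead.

```r
# Toy stand-in for the wide USA Facts layout: one row per county and one
# column per date. The counts are invented for illustration.
toy_cases <- data.frame(
  "County Name" = c("A", "B"),
  "5/28/20" = c(0, 10),
  "5/29/20" = c(2, 10),
  "5/30/20" = c(4, 13),
  "5/31/20" = c(6, 13),
  "6/1/20"  = c(8, 19),
  check.names = FALSE  # keep the date strings as column names
)

start_date <- "5/29/20"
end_date <- "6/1/20"
toy_lag <- 3  # shortened window (assumption); the post uses lag <- 7

# Find the column positions by date, failing early if a date is missing.
start_loc <- match(start_date, colnames(toy_cases))
end_loc <- match(end_date, colnames(toy_cases))
stopifnot(!is.na(start_loc), !is.na(end_loc))

# Hypothetical helper: trailing mean over the last k values.
# The first k - 1 entries are NA because the window is not yet full.
rolling_mean <- function(x, k) {
  as.numeric(stats::filter(x, rep(1 / k, k), sides = 1))
}

# Series for county "A" within the selected date range.
series <- unlist(toy_cases[1, start_loc:end_loc])
smoothed <- rolling_mean(series, toy_lag)
```

Calling `stats::filter` with its namespace matters here, because loading the tidyverse masks `filter` with the dplyr verb.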

Introduction to Bioconductor

“Bioconductor provides tools for the analysis and comprehension of high-throughput genomic data. Bioconductor uses the R statistical programming language, and is open source and open development.” https://www.bioconductor.org/

    library("dplyr")
    library("ggplot2")

Installation

To install core packages, type the following in an R command window. This may take around 5 minutes. When the option for updating packages appears, type “a” for “all”.

    # leave as eval = FALSE when knitting
    if (!
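For reference, here is a minimal sketch of the install route documented on the Bioconductor site, which may differ in detail from the chunk excerpted above. The wrapper name `install_bioc_core` is hypothetical; it is there only so that nothing is downloaded until the function is actually called.

```r
# Hypothetical wrapper: defining it installs nothing; calling it does.
install_bioc_core <- function() {
  # BiocManager is the CRAN package that manages Bioconductor installs.
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  # Called without package names, as on the Bioconductor install page.
  # If installed packages are out of date, the update prompt offers
  # all/some/none, where "a" means all.
  BiocManager::install()
}
```

Run `install_bioc_core()` interactively in the console rather than in a knitted document, since the update prompt waits for input.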

gganatogram and gganimate

Today I wanted to see if I could create a slideshow of pictures from the gganatogram package. I wanted to combine them with the gganimate package, but I have not figured out how to get that to work. (In particular, the gganatogram() function seems to return a different list layout than ggplot objects.)

    library(gganatogram)
    ## Loading required package: ggpolypath
    ## Loading required package: ggplot2
    library(gganimate)
    library(profvis)

    N <- 25 # number of cell samples
    num_cell_parts <- nrow(cell_key$cell)

    # randomly select a random number of cell parts
    part_picker <- sample(1:num_cell_parts, sample(1:num_cell_parts, 1))
    cell_num <- rep(1, length(part_picker))
    this_cell <- cell_key[['cell']][part_picker, ]
    cell_samples <- cbind(this_cell, cell_num)

    for(j in 2:N){
      part_picker <- sample(1:num_cell_parts, sample(1:num_cell_parts, 1))
      cell_num <- rep(j, length(part_picker))
      this_cell <- cbind(cell_key[['cell']][part_picker, ], cell_num)
      cell_samples <- rbind(cell_samples, this_cell)

      # figure_list[j] <- gganatogram(data = this_cell,
      #   outline = FALSE, fillOutline='steelblue', organism="cell", fill="colour") +
      #   theme_void() +
      #   coord_fixed()

      png(filename = paste0(j, ".
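The sampling step above (pick a random number of cell parts, then pick that many parts, for each of N samples) can also be written without growing `cell_samples` inside the loop. The sketch below is one possible restructuring, not the code from the post: `toy_parts` is a hypothetical stand-in for `cell_key[['cell']]` so that it runs without gganatogram installed, its column names are assumptions, and `sample_cell` is a made-up helper name.

```r
set.seed(1)  # reproducible draws for the illustration

# Hypothetical stand-in for cell_key[['cell']]; columns are assumed.
toy_parts <- data.frame(
  organ = c("nucleus", "cytosol", "mitochondria", "golgi_apparatus"),
  colour = c("red", "blue", "green", "orange"),
  value = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

N <- 25  # number of cell samples
num_cell_parts <- nrow(toy_parts)

# Build one sample: draw how many parts to keep, then which ones.
sample_cell <- function(j) {
  n_parts <- sample(seq_len(num_cell_parts), 1)
  part_picker <- sample(seq_len(num_cell_parts), n_parts)
  # drop = FALSE keeps a data frame even when a single row is selected
  cbind(toy_parts[part_picker, , drop = FALSE], cell_num = j)
}

# Create all samples first, then bind them together in one call.
cell_samples <- do.call(rbind, lapply(seq_len(N), sample_cell))
```

Binding once at the end avoids copying the accumulated data frame on every iteration, and each `cell_num` value still tags the rows that belong to one sampled cell.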