Data mining

TidyTuesday: Roman Emperors

Introduction

Today, for practice with ggplot2, I wish to replicate @JoshuaFeldman’s wonderful #TidyTuesday submission about the dataset of Roman emperors.

library("tidyverse")

TidyTuesday’s Roman Emperor dataset (posted on August 13, 2019)

# TidyTuesday's given line of code to load the data
emperors <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-08-13/emperors.csv")

Exploring the Data

dim(emperors)
## [1] 68 16

colnames(emperors)
## [1] "index" "name" "name_full" "birth" "death"
## [6] "birth_cty" "birth_prv" "rise" "reign_start" "reign_end"
## [11] "cause" "killer" "dynasty" "era" "notes"
## [16] "verif_who"

emperors %>% filter(birth_prv !

Gender of Frasier Characters

“I am not a man…”

For work, I need to take a list of names and try to infer the gender of each. Here I try an R package on the character names in the TV show Frasier.

The gender package

#install.packages("gender") #works fine
## user needs to download database too
#install_genderdata_package() #did not work ("error reading from connection")
## as suggested by the bug report at https://github.com/ropensci/drat/issues/6
#install.packages("devtools")
#library(devtools)
#devtools::install_github("ropensci/genderdata")

Trial Run

library(gender)
library(ggpubr)
library(tidyverse)
gender("frasier", method = "ssa", years = c(1940, 1990))
## # A tibble: 1 x 6
##   name    proportion_male proportion_female gender year_min year_max
##   <chr>             <dbl>             <dbl> <chr>     <dbl>    <dbl>
## 1 frasier               1                 0 male       1940     1990

Cast of Characters

Now I will try to run the gender function over a list of names (criteria: characters that appeared in at least 6 episodes).
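As a rough sketch of that step, the same call from the trial run can take a character vector instead of a single name. The vector `character_names` below is a hypothetical, hand-typed handful of first names used only for illustration; it is not the post's actual cast list, and the example assumes both the gender package and its genderdata companion are installed.

```r
# Hypothetical, illustrative names only; not the full cast list from the post.
character_names <- c("frasier", "niles", "martin", "daphne", "roz")

# Only call gender() when the package and its data are both available.
if (requireNamespace("gender", quietly = TRUE) &&
    requireNamespace("genderdata", quietly = TRUE)) {
  # Same method and year range as the single-name trial run.
  # Names that are absent from the SSA data may be missing from the result.
  results <- gender::gender(character_names,
                            method = "ssa",
                            years = c(1940, 1990))
  print(results)
} else {
  message("Install 'gender' and 'genderdata' to run this example.")
}
```

Passing the whole vector in one call keeps the method and year range identical for every name, which makes the resulting rows directly comparable.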