Feel free to try the exercises below at your leisure. Note: as usual, the answers shown are just one way of solving the prompts!

Data Scraping

  1. Using rvest::html_table, scrape the table of City Council members in Washington, D.C. from Wikipedia.
library(magrittr)  # for the %>% pipe and the . placeholder

wiki_url <- 'https://en.wikipedia.org/wiki/Council_of_the_District_of_Columbia'
council_outputs <- rvest::read_html(wiki_url) %>%
  rvest::html_table() %>%
  .[[3]]  # the members table; this index may shift if the page layout changes
council_outputs %>% head
  2. Using SelectorGadget or a similar inspector tool, scrape two pages of climate-change news article titles and links from Politico.
url <- 'https://www.politico.com/news/climate-change'
item <- 'h3'

# read each page once and reuse the parsed document
page_1 <- rvest::read_html(url)

titles_1 <- page_1 %>%
  rvest::html_elements(item) %>%
  rvest::html_text2()

hyperlink_1 <- page_1 %>%
  rvest::html_elements(item) %>%
  rvest::html_elements('a') %>%
  rvest::html_attr("href")

# page 2(!)
url <- 'https://www.politico.com/news/climate-change/2'

page_2 <- rvest::read_html(url)

titles_2 <- page_2 %>%
  rvest::html_elements(item) %>%
  rvest::html_text2()

hyperlink_2 <- page_2 %>%
  rvest::html_elements(item) %>%
  rvest::html_elements('a') %>%
  rvest::html_attr("href")

# Note: this assumes every h3 contains exactly one link; if any h3 lacks an
# <a> tag, the title and link vectors will differ in length and the
# data.frame() call below will error.
data.frame(title = c(titles_1, titles_2),
           links = c(hyperlink_1, hyperlink_2)) %>%
  head

Working with APIs

  1. Register for an API key with the U.S. Census Bureau. Once it is received, download any data point of interest from the American Community Survey or Decennial Census. (Documentation here)
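One possible solution, calling the Census API directly with jsonlite. The environment variable name (CENSUS_API_KEY) and the chosen data point (total population, variable B01001_001E, from the 2022 5-year ACS) are illustrative assumptions; substitute any variable of interest.

```r
# Pull total population (B01001_001E) for every state from the 2022
# 5-year American Community Survey. The API key is read from an
# environment variable; set it to the key the Census Bureau emails you.
library(jsonlite)

api_key <- Sys.getenv("CENSUS_API_KEY")
acs_url <- paste0(
  "https://api.census.gov/data/2022/acs/acs5",
  "?get=NAME,B01001_001E&for=state:*",
  "&key=", api_key
)

raw <- fromJSON(acs_url)  # a character matrix; the first row is the header
acs_df <- as.data.frame(raw[-1, ], stringsAsFactors = FALSE)
names(acs_df) <- raw[1, ]
acs_df$B01001_001E <- as.numeric(acs_df$B01001_001E)
acs_df %>% head
```

The API returns JSON arrays rather than named records, hence the manual step of promoting the first row to column names.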

  2. Try to replicate #1 using the tidycensus package, which is an API wrapper.