Working with dplyr

Here are my solutions to the “Final Exercises” in the dplyr recap file.

First, load the libraries and the data.

library(tidyverse)
load("travel_weather.RData")

Question 2

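travel_weather %>%
  select(-London, -Venice) %>%
  # keep only the days on which NYC was warmer than Amsterdam
  filter(NYC > Amsterdam) %>%
  group_by(year, month) %>%
  # difference between the monthly mean temperatures, rounded to one decimal
  summarise(NYCwA_diff = round(mean(NYC) - mean(Amsterdam), 1)) %>%
  arrange(desc(NYCwA_diff))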
## # A tibble: 24 x 3
## # Groups:   year [3]
##     year month NYCwA_diff
##    <dbl> <dbl>      <dbl>
##  1  2016     8        8.4
##  2  2016     7        8.1
##  3  2017     9        7.9
##  4  2016     4        7.6
##  5  2017     4        7.4
##  6  2017     7        7.3
##  7  2017     8        6.5
##  8  2016    11        6.4
##  9  2016     3        6.3
## 10  2016     6        6.0
## # ... with 14 more rows

Question 3

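travel_weather %>%
  # reshape to long format: one row per date and city
  gather(key = City, value = Temperature, -year, -month, -day) %>%
  group_by(year, month, day) %>%
  # daily maximum temperature and the city where it was recorded
  summarise(max_Temperature = max(Temperature),
            City = City[which.max(Temperature)])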
## # A tibble: 731 x 5
## # Groups:   year, month [?]
##     year month   day max_Temperature      City
##    <dbl> <dbl> <dbl>           <dbl>     <chr>
##  1  2015    11     1              16       NYC
##  2  2015    11     2              15       NYC
##  3  2015    11     3              16       NYC
##  4  2015    11     4              17       NYC
##  5  2015    11     5              18       NYC
##  6  2015    11     6              21       NYC
##  7  2015    11     7              17       NYC
##  8  2015    11     8              13    Venice
##  9  2015    11     9              13 Amsterdam
## 10  2015    11    10              14 Amsterdam
## # ... with 721 more rows


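I couldn’t find a way to include the City that has the maximum temperature for each day using only the methods explained in our recap file, so I googled for a solution and found City = City[which.max(Temperature)]. which.max() returns the position of the first maximum in Temperature, and indexing City with that position picks the city from the same row. If several cities tie for the maximum, only the first one is reported.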
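
To see how the which.max() indexing behaves, here is a minimal sketch on made-up data. The toy_weather table and its values are hypothetical (they are not taken from travel_weather) and only mimic its wide layout; the second day contains a tie so that the "first maximum wins" behaviour is visible.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data in the same wide layout as travel_weather
toy_weather <- tibble(
  year      = c(2015, 2015),
  month     = c(11, 11),
  day       = c(1, 2),
  NYC       = c(16, 12),
  Amsterdam = c(11, 14),
  Venice    = c(13, 14)   # ties with Amsterdam on day 2
)

# which.max() returns the position of the FIRST maximum in a vector
temps <- c(NYC = 12, Amsterdam = 14, Venice = 14)
first_max_pos <- which.max(temps)   # position 2 (Amsterdam), not 3

# Same pattern as in Question 3, applied to the toy data
hottest <- toy_weather %>%
  gather(key = City, value = Temperature, -year, -month, -day) %>%
  group_by(year, month, day) %>%
  summarise(max_Temperature = max(Temperature),
            City = City[which.max(Temperature)]) %>%
  ungroup()

print(hottest)
# day 1: max_Temperature 16, City "NYC"
# day 2: max_Temperature 14, City "Amsterdam" (first of the tied cities)
```

If all tied cities should be kept, one row per day is not enough; filtering the long table with filter(Temperature == max(Temperature)) inside the same group_by() would be one way to do that, at the cost of returning more than one row on tie days.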