Community gardens provide many benefits to New York City and its residents. They provide an opportunity to escape from the chaos of the city and enter a serene space. Their trees and plants promote cleaner air, which is invaluable in a city with a large population and high levels of air pollution. Some gardens have animals such as chickens and fish (and turtles, like the ones pictured below at Ninth Street Community Garden at Avenue C). While they are a fun feature, they also serve as an opportunity for education and urban agriculture.
Want to see some trees, chickens, or turtles? Check out this map to see which gardens in the city have what you’re interested in!
#loading libraries
library(tidyverse)
library(rvest)
library(httr)
library(leaflet)
library(crosstalk)
library(plotly)
#loading datasets
garden_info =
  GET("http://data.cityofnewyork.us/resource/p78i-pat6.csv") |>
  content("parsed") |>
  janitor::clean_names() |>
  drop_na() |>
  mutate(
    borough =
      recode(
        borough,
        "B" = "Brooklyn",
        "M" = "Manhattan",
        "X" = "Bronx",
        "R" = "Staten Island",
        "Q" = "Queens"
      )
  )
site_visits =
  GET("http://data.cityofnewyork.us/resource/xqbk-beh5.csv") |>
  content("parsed") |>
  janitor::clean_names()
##merging the data sets for flora and fauna analysis
#cleaning site visit data set so it only includes variables involved in flora and fauna
site_visits_flora_fauna =
  site_visits |>
  select(parksid, inspectionid, treesingarden, fruittrees, streettrees, chickens, pond, fishinpond, turtles, totalsidewalkarea) |>
  # across() replaces the superseded mutate_at()
  mutate(across(c(treesingarden, fruittrees, streettrees, chickens, pond, fishinpond, turtles), as.numeric))
flora_fauna_df =
  inner_join(garden_info, site_visits_flora_fauna, by = "parksid")
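#adding borough totals to every row; mutate() keeps one row per garden record for the map
flora_fauna_group =
  flora_fauna_df |>
  group_by(borough) |>
  mutate(
    #na.rm = TRUE so a single missing value does not turn a borough total into NA
    "Trees in Garden" = sum(treesingarden, na.rm = TRUE),
    "Fruit Trees" = sum(fruittrees, na.rm = TRUE),
    "Street Trees" = sum(streettrees, na.rm = TRUE),
    "Chickens" = sum(chickens, na.rm = TRUE),
    "Pond" = sum(pond, na.rm = TRUE),
    "Fish in Pond" = sum(fishinpond, na.rm = TRUE),
    "Turtles" = sum(turtles, na.rm = TRUE)
  )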
#Interactive Map: Flora and Fauna in Gardens of NYC
map_data =
  flora_fauna_group |>
  select(
    garden_name = gardenname,
    latitude = lat,
    longitude = lon,
    Location = address,
    borough, chickens, fruittrees, streettrees, treesingarden, pond, fishinpond, turtles
  ) |>
  mutate(
    #building a comma-separated list of the features present, then dropping the trailing ", "
    floraandfauna = str_remove(
      paste0(
        ifelse(chickens > 0, "Chickens, ", ""),
        ifelse(fruittrees > 0, "Fruit Trees, ", ""),
        ifelse(streettrees > 0, "Street Trees, ", ""),
        ifelse(treesingarden > 0, "Trees in Garden, ", ""),
        ifelse(pond > 0, "Pond, ", ""),
        ifelse(fishinpond > 0, "Fish in Pond, ", ""),
        ifelse(turtles > 0, "Turtles, ", "")
      ),
      ", $"
    )
  )
map1 =
  leaflet(map_data) |>
  addTiles() |>
  setView(
    lng = -74.006, # Longitude of NYC center
    lat = 40.7128, # Latitude of NYC center
    zoom = 11 # NYC zoom level
  ) |>
  addCircleMarkers(
    ~longitude, ~latitude,
    label = ~paste(garden_name, Location, floraandfauna),
    popup = ~paste0("<b>", garden_name, "</b><br>Borough: ", borough, "<br>Flora and Fauna: ", floraandfauna),
    color = "purple",
    radius = 6,
    fillOpacity = 0.8
  )
map1
Which borough has the most chickens in their community gardens? Are there more garden fruit trees in Manhattan or the Bronx? The multi-series bar chart allows for a comparison of the 5 boroughs through the flora and fauna present in their community gardens.
#multi-series bar chart reflecting distribution of flora and fauna features by borough
flora_fauna_tidy =
  pivot_longer(
    flora_fauna_group,
    "Trees in Garden":"Turtles",
    names_to = "item",
    values_to = "total"
  )
plot1 =
  ggplot(flora_fauna_tidy, aes(x = borough, y = total, fill = item)) +
  geom_bar(position = "dodge", stat = "identity") +
  labs(
    title = "Distribution of Flora + Fauna Features in NYC Gardens",
    x = "Borough",
    y = "Number of Gardens with Each Feature by Borough",
    fill = "Flora/Fauna Feature",
    caption = "Data from NYC Open Data"
  ) +
  viridis::scale_fill_viridis(
    name = "Flora/Fauna Feature",
    discrete = TRUE
  )
ggplotly(plot1)
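The code above attaches each borough total to every garden record with mutate(), so the same total is repeated on many rows before plotting. As a minimal sketch of an alternative, the totals can be collapsed to one row per borough with summarise() before reshaping. The data frame toy_gardens and its values are hypothetical and stand in for the real garden data.

```r
library(tidyverse)

# Hypothetical toy data: one row per garden, 1 = feature present, 0 = absent
toy_gardens = tibble(
  borough  = c("Brooklyn", "Brooklyn", "Queens", "Queens", "Bronx"),
  chickens = c(1, 0, 0, 1, 1),
  turtles  = c(1, 1, 0, 0, 0)
)

# Collapse to one row per borough, then reshape to long format for plotting
toy_totals =
  toy_gardens |>
  group_by(borough) |>
  summarise(
    Chickens = sum(chickens, na.rm = TRUE),
    Turtles  = sum(turtles, na.rm = TRUE),
    .groups = "drop"
  ) |>
  pivot_longer(Chickens:Turtles, names_to = "item", values_to = "total")

toy_totals
```

With one row per borough and feature, the long data frame can be passed to ggplot() in the same way as flora_fauna_tidy, without duplicate bars drawn on top of each other.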
Among the 5 boroughs, there are varying amounts of space for community gardens. The benefits of features like trees, which promote clean air and provide shade, may not be equally distributed across the boroughs as a result.
To understand how the presence of different types of trees differs across boroughs, a logistic regression was run for each tree type in the data set (trees in gardens, fruit trees, and street trees). The outcome was the probability of tree presence, and the predictor variable was borough. Because the data set has no variable for garden size, sidewalk area was included as a potential confounder: it was the closest available proxy for garden size, and it is plausibly associated with both tree presence (more space likely means more room for trees) and borough (boroughs with more space likely have larger gardens and more, or less, sidewalk area). Brooklyn was the reference group for the borough variable because it has the largest number of community gardens.
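Written out, each of the three models takes roughly the following form. The notation is introduced here for illustration and is not from the original analysis: p_i is the probability that the tree type is present in garden record i, each x is a 0/1 indicator for the named borough (SI stands for Staten Island; Brooklyn, as the reference group, has no indicator), and s_i is the total sidewalk area.

```latex
\log\frac{p_i}{1 - p_i}
  = \beta_0
  + \beta_{\mathrm{Bronx}}\, x_{i,\mathrm{Bronx}}
  + \beta_{\mathrm{Manhattan}}\, x_{i,\mathrm{Manhattan}}
  + \beta_{\mathrm{Queens}}\, x_{i,\mathrm{Queens}}
  + \beta_{\mathrm{SI}}\, x_{i,\mathrm{SI}}
  + \gamma\, s_i,
\qquad
\mathrm{OR}_b = e^{\beta_b}
```

Each borough coefficient is a log odds ratio relative to Brooklyn with sidewalk area held fixed, and exponentiating it gives the odds ratio reported in the OR column of the tables.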
fit_logistic_df =
  flora_fauna_df |>
  select('parksid', 'treesingarden', 'fruittrees', 'streettrees', 'borough', 'totalsidewalkarea') |>
  drop_na() |>
  mutate(
    borough = as.factor(borough),
    borough = fct_relevel(borough, "Brooklyn")
  )
fit_logistic_treesingarden =
  fit_logistic_df |>
  glm(treesingarden ~ borough + totalsidewalkarea, data = _, family = binomial()) |>
  broom::tidy() |>
  mutate(OR = exp(estimate)) |>
  select(term, log_OR = estimate, OR, p.value) |>
  knitr::kable(digits = 3)
fit_logistic_treesingarden
term | log_OR | OR | p.value |
---|---|---|---|
(Intercept) | 2.118 | 8.317 | 0.000 |
boroughBronx | 0.165 | 1.180 | 0.843 |
boroughManhattan | 0.844 | 2.326 | 0.304 |
boroughQueens | -1.149 | 0.317 | 0.134 |
boroughStaten Island | 12.428 | 249616.805 | 0.993 |
totalsidewalkarea | 0.001 | 1.001 | 0.165 |
fit_logistic_fruittrees =
  fit_logistic_df |>
  glm(fruittrees ~ borough + totalsidewalkarea, data = _, family = binomial()) |>
  broom::tidy() |>
  mutate(OR = exp(estimate)) |>
  select(term, log_OR = estimate, OR, p.value) |>
  knitr::kable(digits = 3)
fit_logistic_fruittrees
term | log_OR | OR | p.value |
---|---|---|---|
(Intercept) | 0.756 | 2.130 | 0.004 |
boroughBronx | 0.983 | 2.671 | 0.061 |
boroughManhattan | -0.155 | 0.856 | 0.656 |
boroughQueens | -1.602 | 0.202 | 0.007 |
boroughStaten Island | 13.500 | 729726.242 | 0.988 |
totalsidewalkarea | 0.000 | 1.000 | 0.194 |
fit_logistic_streettrees =
  fit_logistic_df |>
  glm(streettrees ~ borough + totalsidewalkarea, data = _, family = binomial()) |>
  broom::tidy() |>
  mutate(OR = exp(estimate)) |>
  select(term, log_OR = estimate, OR, p.value) |>
  knitr::kable(digits = 3)
fit_logistic_streettrees
term | log_OR | OR | p.value |
---|---|---|---|
(Intercept) | 0.181 | 1.198 | 0.474 |
boroughBronx | -0.307 | 0.736 | 0.422 |
boroughManhattan | -0.148 | 0.862 | 0.653 |
boroughQueens | -0.055 | 0.946 | 0.923 |
boroughStaten Island | -15.354 | 0.000 | 0.986 |
totalsidewalkarea | 0.000 | 1.000 | 0.016 |
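Most of the coefficients comparing the other four boroughs to Brooklyn were not statistically significant. The only borough comparison significant at the 5% level was for fruit trees in Queens: the odds of a fruit tree being present in a community garden in Queens were 0.202 times the odds of a fruit tree being present in a community garden in Brooklyn, controlling for sidewalk area (p-value: 0.007). Sidewalk area itself was significant in the street tree model (p-value: 0.016), although its odds ratio rounds to 1.000. The Staten Island estimates are extreme, with p-values near 1, which most likely reflects very few Staten Island gardens in the data, so they should not be interpreted.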
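An odds ratio is easier to judge alongside a confidence interval. The sketch below shows one way to get both from broom::tidy(), using its exponentiate and conf.int arguments. The data are simulated and hypothetical (toy_df, with assumed presence probabilities of 0.7 in Brooklyn and 0.3 in Queens), so the numbers it prints say nothing about the real gardens.

```r
library(tidyverse)

# Hypothetical simulated data; not the NYC garden data
set.seed(1)
toy_df = tibble(
  borough = factor(rep(c("Brooklyn", "Queens"), each = 100)),
  # Assumed presence probabilities: 0.7 in Brooklyn, 0.3 in Queens
  fruittrees = rbinom(200, size = 1, prob = rep(c(0.7, 0.3), each = 100))
)

# Brooklyn is the reference level because it comes first alphabetically
toy_fit = glm(fruittrees ~ borough, data = toy_df, family = binomial())

# exponentiate = TRUE reports odds ratios; conf.int = TRUE adds interval bounds
toy_or = broom::tidy(toy_fit, exponentiate = TRUE, conf.int = TRUE)
toy_or
```

Applied to the fitted garden models, the same call would report an interval for each borough odds ratio, which may make unstable estimates such as those for Staten Island easier to spot.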