Community gardens provide many benefits to New York City and its residents. They provide an opportunity to escape from the chaos of the city and enter a serene space. Their trees and plants promote cleaner air, which is invaluable in a city with a large population and high levels of air pollution. Some gardens have animals such as chickens and fish (and turtles, like the ones pictured below at Ninth Street Community Garden at Avenue C). While they are a fun feature, they also serve as an opportunity for education and urban agriculture.

Map of NYC Garden Flora and Fauna

Want to see some trees, chickens, or turtles? Check out this map to see which gardens in the city have what you’re interested in!

#loading libraries
library(tidyverse)
library(rvest)
library(httr)
library(leaflet)
library(crosstalk)
library(plotly)

#loading datasets
garden_info = 
  GET("http://data.cityofnewyork.us/resource/p78i-pat6.csv") |> 
  content("parsed") |>
  janitor::clean_names() |>
  drop_na() |>
   mutate(
    borough = 
      recode(
        borough,
        "B" = "Brooklyn",
        "M" = "Manhattan",
        "X" = "Bronx",
        "R" = "Staten Island",
        "Q" = "Queens"
      )
   )

site_visits = 
  GET("http://data.cityofnewyork.us/resource/xqbk-beh5.csv") |>
  content("parsed") |>
  janitor::clean_names()

##merging the data sets for flora and fauna analysis
#cleaning site visit data set so it only includes variables involved in flora and fauna 
site_visits_flora_fauna = site_visits |>
  select(parksid, inspectionid, treesingarden, fruittrees, streettrees, chickens, pond, fishinpond, turtles, totalsidewalkarea) |>
 mutate_at(c('treesingarden', 'fruittrees', 'streettrees', 'chickens', 'pond', 'fishinpond', 'turtles'), as.numeric)

flora_fauna_df= 
  inner_join(garden_info, site_visits_flora_fauna, by = "parksid") 

flora_fauna_group = flora_fauna_df |>
  group_by(borough) |>
  mutate(
  "Trees in Garden" = sum(treesingarden),
  "Fruit Trees" = sum(fruittrees),
  "Street Trees" = sum(streettrees),
  "Chickens" = sum(chickens),
  "Pond" = sum(pond),
  "Fish in Pond" = sum(fishinpond),
  "Turtles" = sum(turtles))

#Interactive Map: Flora and Fauna in Gardens of NYC 
map_data <- flora_fauna_group |>
  select(
    garden_name = gardenname, 
    latitude = lat, 
    longitude = lon, 
    Location = address,
    borough, chickens, fruittrees, streettrees, treesingarden, pond, fishinpond, turtles
  ) |>
  mutate(
    floraandfauna = paste0(
      ifelse(chickens > 0, "Chickens, ", ""),
      ifelse(fruittrees > 0, "Fruit Trees, ", ""),
      ifelse(streettrees > 0, "Street Trees, ", ""),
      ifelse(treesingarden > 0, "Trees in Garden, ", ""),
      ifelse(pond > 0, "Pond, ", ""),
      ifelse(fishinpond > 0, "Fish in Pond, ", ""),
       ifelse(turtles > 0, "Turtles, ", "")
    )
  )

map1 = leaflet(map_data) |>
  addTiles() |>
  setView(
    lng = -74.006,  # Longitude of NYC center
    lat = 40.7128,  # Latitude of NYC center
    zoom = 11      # NYC zoom level
  ) |>
  addCircleMarkers(
    ~longitude, ~latitude,
    label = ~paste(garden_name, Location, floraandfauna),
    popup = ~paste0("<b>", garden_name, "</b><br>Borough: ", borough, "<br>FloraandFauna: ", floraandfauna),
    color = "purple",
    radius = 6,
    fillOpacity = 0.8
  )

map1

Flora and Fauna by Borough

Which borough has the most chickens in their community gardens? Are there more garden fruit trees in Manhattan or the Bronx? The multi-series bar chart allows for a comparison of the 5 boroughs through the flora and fauna present in their community gardens.

#multi-series bar chart reflecting distribution of flora and fauna features by borough
flora_fauna_tidy = 
   pivot_longer(
    flora_fauna_group, 
    "Trees in Garden":"Turtles",
    names_to = "item", 
    values_to = "total")

plot1 = ggplot(flora_fauna_tidy, aes(x = borough, y= total, fill=item)) + 
    geom_bar(position="dodge", stat="identity") +
  labs(title = "Distribution of Flaura + Fauna Features in NYC Gardens",
    x = "Borough",
    y = "Number of Gardens with Each Feature by Borough",
    color = "Flora/Fauna Feature",
    caption = "Data from NYC Open Data"
  ) +
    viridis::scale_fill_viridis(
    name = "Flora/Fauna Feature", 
    discrete = TRUE
  )

ggplotly(plot1)

The Effect of Borough on Tree Presence

Among the 5 boroughs, there are varying amounts of space for community gardens. The benefits of features like trees, which promote clean air and provide shade, may not be equally distributed across the boroughs as a result.

To understand how the presence of different types of trees differ across boroughs, a logistic regression was run for each tree type in the data set (trees in gardens, fruit trees, and street trees). The outcome was probablity of tree presence, and the predictor variable was borough. As there was no variable for garden size in the data set, sidewalk area was included as a confounder because it was the closest representation of garden size and would be associated with tree presence (more space is likely associated with tree presence) and borough (boroughs with more space likely have larger gardens/ more (or less) sidewalk area). Brooklyn was the reference group for the borough variable because it has the largest amount of community gardens.

fit_logistic_df =
  flora_fauna_df |>
  select('parksid','treesingarden', 'fruittrees', 'streettrees', 'borough', 'totalsidewalkarea') |>
  drop_na() |>
  mutate(
    borough = as.factor(borough),
    borough = fct_relevel(borough, "Brooklyn")
  )

Logistic Regression Model: Trees in the Garden

fit_logistic_treesingarden = 
  fit_logistic_df |>
  glm(treesingarden~ borough + totalsidewalkarea, data = _, family = binomial()) |>
  broom::tidy() |> 
  mutate(OR = exp(estimate)) |>
  select(term, log_OR = estimate, OR, p.value) |> 
  knitr::kable(digits = 3)

fit_logistic_treesingarden
term log_OR OR p.value
(Intercept) 2.118 8.317 0.000
boroughBronx 0.165 1.180 0.843
boroughManhattan 0.844 2.326 0.304
boroughQueens -1.149 0.317 0.134
boroughStaten Island 12.428 249616.805 0.993
totalsidewalkarea 0.001 1.001 0.165

Logistic Regression Model: Fruit Trees

fit_logistic_fruittrees = 
  fit_logistic_df |>
  glm(fruittrees ~ borough + totalsidewalkarea, data = _, family = binomial()) |>
  broom::tidy() |> 
  mutate(OR = exp(estimate)) |>
  select(term, log_OR = estimate, OR, p.value) |> 
  knitr::kable(digits = 3)

fit_logistic_fruittrees
term log_OR OR p.value
(Intercept) 0.756 2.130 0.004
boroughBronx 0.983 2.671 0.061
boroughManhattan -0.155 0.856 0.656
boroughQueens -1.602 0.202 0.007
boroughStaten Island 13.500 729726.242 0.988
totalsidewalkarea 0.000 1.000 0.194

Logistic Regression Model: Street Trees

fit_logistic_streettrees = 
  fit_logistic_df |>
  glm(streettrees ~ borough + totalsidewalkarea, data = _, family = binomial())|> 
  broom::tidy() |> 
  mutate(OR = exp(estimate)) |>
  select(term, log_OR = estimate, OR, p.value) |> 
  knitr::kable(digits = 3)

fit_logistic_streettrees
term log_OR OR p.value
(Intercept) 0.181 1.198 0.474
boroughBronx -0.307 0.736 0.422
boroughManhattan -0.148 0.862 0.653
boroughQueens -0.055 0.946 0.923
boroughStaten Island -15.354 0.000 0.986
totalsidewalkarea 0.000 1.000 0.016

The p-values for a majority of the coefficients comparing Brooklyn to the remaining four boroughs deemed them insignificant. The only significant finding at the 5% level (and its interpretation) was as follows: The odds of a fruit tree being present in a community garden in Queens was 0.202 times the odds of a fruit tree being present in a community gardens in Brooklyn, controlling for sidewalk area (p-value: 0.007).