Code
import folium
import pandas as pd
import os
from folium import plugins
Adam Cseresznye
January 6, 2025
I’m happy to share that I’ve recently completed both IBM’s Data Analyst and Data Science Professional Certificates within the past month. The course content was well-structured, and I learned a great deal from these programs. For instance, I’ve always been interested in learning SQL, and this was the perfect chance to start exploring it.
If you’re curious about these certificates, you can find more information through the links provided below. But my learning journey doesn’t stop here—I’m planning to tackle most of the courses listed in the Data Science learning path on Coursera, so there’s more to come.
While I’m at it, I wanted to introduce you to a neat library called Folium, which is fantastic for working with geospatial data. I came across Folium during the capstone project of the Data Science Specialization, where we had a fun task of predicting and visualizing the success of SpaceX rocket launches.
In this post, I’ll briefly share what I’ve learned about this library. I hope you’ll find it useful too. Let’s dive in!
We’ll be utilizing the dataset made available by https://open.toronto.ca/. This dataset includes the locations of bicycles installed on sidewalks and boulevards across the City of Toronto, wherever there’s a requirement for public bicycle parking facilities. By the way, I discovered this dataset through the Awesome Public Datasets repository on GitHub. If you haven’t already, I recommend checking them out.
Street bicycle parking data - 4326.csv read in as pandas dataframe
Considering the original dataset has over 17,300 entries, we’ll keep things light by working with just 500 rows for now. It’s all for the sake of a demonstration, after all!
_id | OBJECTID | ID | ADDRESSNUMBERTEXT | ADDRESSSTREET | FRONTINGSTREET | SIDE | FROMSTREET | DIRECTION | SITEID | WARD | BIA | ASSETTYPE | STATUS | SDE_STATE_ID | X | Y | LONGITUDE | LATITUDE | geometry | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
784 | 4481427 | 10424 | BP-05283 | 15 | Dundonald St | NaN | NaN | Dundonald St | NaN | NaN | 13.0 | NaN | Ring | Temporarily Removed | 0 | NaN | NaN | NaN | NaN | {'type': 'MultiPoint', 'coordinates': [[-79.38... |
3297 | 4483940 | 15253 | BP-35603 | 49 | Harbour Sq | Queens Quay W | South | Harbour Sq | West | NaN | 10.0 | The Waterfront | Ring | Existing | 0 | NaN | NaN | NaN | NaN | {'type': 'MultiPoint', 'coordinates': [[-79.37... |
13971 | 4494614 | 31121 | BP-22492 | 200 | Elizabeth St | Elizabeth St | West | La Plante Ave | West | NaN | 11.0 | NaN | Ring | Existing | 0 | NaN | NaN | NaN | NaN | {'type': 'MultiPoint', 'coordinates': [[-79.38... |
5139 | 4485782 | 17465 | BP-40070 | 70 | Peter St | King St W | North | Peter St | West | NaN | 10.0 | Toronto Downtown West | Ring | Existing | 0 | NaN | NaN | NaN | NaN | {'type': 'MultiPoint', 'coordinates': [[-79.39... |
7635 | 4488278 | 20375 | BP-27153 | 39 | Prince Arthur Ave | Prince Arthur Ave | South | Bedford Rd | East | NaN | 11.0 | NaN | Ring | Existing | 0 | NaN | NaN | NaN | NaN | {'type': 'MultiPoint', 'coordinates': [[-79.39... |
The geometry column holds the longitude and latitude information, but before we dive in, we need to extract the valuable details. No worries – we’ll make use of pandas’ str.extract
for this task.
ASSETTYPE | STATUS | LONGITUDE | LATITUDE | |
---|---|---|---|---|
784 | Ring | Temporarily Removed | -79.38378423783222 | 43.6660359833018 |
3297 | Ring | Existing | -79.3774934493851 | 43.6407633657936 |
13971 | Ring | Existing | -79.386799735149 | 43.6589303889453 |
5139 | Ring | Existing | -79.3926661761316 | 43.6460273003346 |
7635 | Ring | Existing | -79.3973838724551 | 43.6693038734947 |
Here’s an example of how to create a map without any overlaid data points.
FeatureGroup
After instantiating FeatureGroup
, we can easily add the bike locations using the add_child
method. It is really easy!
# let's start with a clean copy of the map of Toronto
toronto_map = folium.Map(
location=[43.651070, -79.347015], zoom_start=11, tiles="OpenStreetMap"
)
# instantiate a feature group
bike_stations = folium.map.FeatureGroup()
# loop through the bike stations
for lat, long in zip(toronto_df_processed.LATITUDE, toronto_df_processed.LONGITUDE):
bike_stations.add_child(
folium.features.CircleMarker(
[lat, long],
radius=5,
color="red",
fill=True,
fill_color="yellow",
fill_opacity=1,
)
)
# add bike stations to the map
toronto_map.add_child(bike_stations)
We can also enhance this by adding a pop-up box that displays custom text of our choice.
# let's start with a clean copy of the map of Toronto
toronto_map = folium.Map(
location=[43.651070, -79.347015], zoom_start=11, tiles="OpenStreetMap"
)
# instantiate a feature group
bike_stations = folium.map.FeatureGroup()
# loop through the bike stations
for lat, long in zip(toronto_df_processed.LATITUDE, toronto_df_processed.LONGITUDE):
bike_stations.add_child(
folium.features.CircleMarker(
[lat, long],
radius=5,
color="grey",
fill=True,
fill_color="white",
fill_opacity=1,
)
)
# add pop-up text to each marker on the map
latitudes = list(toronto_df_processed.LATITUDE)
longitudes = list(toronto_df_processed.LONGITUDE)
labels = list(toronto_df_processed.STATUS)
for lat, lng, label in zip(latitudes, longitudes, labels):
folium.Marker([lat, lng], popup=label).add_to(toronto_map)
# add bike stations to map
toronto_map.add_child(bike_stations)
MarkerCluster
And the best part, which happens to be my favorite, is that we can also integrate a MarkerCluster
. This comes in handy when we’re dealing with numerous data points clustered closely together on the map. With a MarkerCluster
, you get to see their combined values instead of each one individually. It’s a fantastic feature!
# let's start with a clean copy of the map of Toronto
toronto_map = folium.Map(
location=[43.651070, -79.347015], zoom_start=11, tiles="OpenStreetMap"
)
# instantiate a mark cluster object
bike_stations_cluster = plugins.MarkerCluster().add_to(toronto_map)
# loop through the dataframe and add each data point to the mark cluster
for lat, lng, label, in zip(
toronto_df_processed.LATITUDE,
toronto_df_processed.LONGITUDE,
toronto_df_processed.STATUS,
):
folium.Marker(
location=[lat, lng],
icon=None,
popup=label,
).add_to(bike_stations_cluster)
# display map
toronto_map
That’s a wrap! I hope these examples have been helpful. Feel free to use these techniques in your next data science or geospatial project. Until next time, happy exploring!
---
title: "\U0001F973\U0001F389 Two IBM certificates and some geospatial data"
author: Adam Cseresznye
date: '2023-08-10'
categories:
- celebration
- geospatial data
jupyter: python3
toc: true
format:
html:
code-fold: true
code-tools: true
---
![](https://gisgeography.com/wp-content/uploads/2022/05/IBM-Data-Science-Professional-Certificate.png){fig-align="center" width=50%}
I'm happy to share that I've recently completed both IBM's [Data Analyst](https://www.coursera.org/professional-certificates/ibm-data-analyst) and [Data Science Professional Certificates](https://www.coursera.org/professional-certificates/ibm-data-science) within the past month. The course content was well-structured, and I learned a great deal from these programs. For instance, I've always been interested in learning SQL, and this was the perfect chance to start exploring it.
If you're curious about these certificates, you can find more information through the links provided below. But my learning journey doesn't stop here—I'm planning to tackle most of the courses listed in the [Data Science learning path](https://www.coursera.org/learning-paths/data-science) on Coursera, so there's more to come.
While I'm at it, I wanted to introduce you to a neat library called [Folium](https://python-visualization.github.io/folium/), which is fantastic for working with geospatial data. I came across Folium during the capstone project of the Data Science Specialization, where we had a fun task of predicting and visualizing the success of SpaceX rocket launches.
In this post, I'll briefly share what I've learned about this library. I hope you'll find it useful too. Let's dive in!
```{python}
import folium
import pandas as pd
import os
from folium import plugins
```
We'll be utilizing the dataset made available by https://open.toronto.ca/. [This dataset](https://open.toronto.ca/dataset/street-furniture-bicycle-parking/) includes the locations of bicycles installed on sidewalks and boulevards across the City of Toronto, wherever there's a requirement for public bicycle parking facilities. By the way, I discovered this dataset through the [*Awesome Public Datasets*](https://github.com/awesomedata/awesome-public-datasets) repository on GitHub. If you haven't already, I recommend checking them out.
```{python}
# Let's read in the file
for file in os.listdir():
if file.endswith(".csv"):
toronto_df = pd.read_csv(file)
print(f"{file} read in as pandas dataframe")
```
Considering the original dataset has over 17,300 entries, we'll keep things light by working with just 500 rows for now. It's all for the sake of a demonstration, after all!
```{python}
#| column: page
toronto_df = toronto_df.sample(n=500)
toronto_df.head()
```
The *geometry* column holds the longitude and latitude information, but before we dive in, we need to extract the valuable details. No worries – we'll make use of pandas' `str.extract` for this task.
```{python}
pattern = r"(-?\d+\.\d+),\s*(-?\d+\.\d+)"
toronto_df_processed = toronto_df.assign(
LONGITUDE=lambda df: df.geometry.str.extract(pattern)[0],
LATITUDE=lambda df: df.geometry.str.extract(pattern)[1],
).loc[:, ["ASSETTYPE", "STATUS", "LONGITUDE", "LATITUDE"]]
toronto_df_processed.head()
```
# Creating the map and displaying it
Here's an example of how to create a map without any overlaid data points.
```{python}
#| fig-cap: The City of Toronto
#| label: fig-fig1
toronto_map = folium.Map(
location=[43.651070, -79.347015], zoom_start=11, tiles="OpenStreetMap"
)
toronto_map
```
# Superimposing bike locations on the map with `FeatureGroup`
After instantiating `FeatureGroup`, we can easily add the bike locations using the `add_child` method. It is really easy!
```{python}
#| fig-cap: The City of Toronto with available bike locations
#| label: fig-fig2
# let's start with a clean copy of the map of Toronto
toronto_map = folium.Map(
location=[43.651070, -79.347015], zoom_start=11, tiles="OpenStreetMap"
)
# instantiate a feature group
bike_stations = folium.map.FeatureGroup()
# loop through the bike stations
for lat, long in zip(toronto_df_processed.LATITUDE, toronto_df_processed.LONGITUDE):
bike_stations.add_child(
folium.features.CircleMarker(
[lat, long],
radius=5,
color="red",
fill=True,
fill_color="yellow",
fill_opacity=1,
)
)
# add bike stations to the map
toronto_map.add_child(bike_stations)
```
# Adding pop-up text with relevant information
We can also enhance this by adding a pop-up box that displays custom text of our choice.
```{python}
#| fig-cap: The City of Toronto with available bike locations
#| label: fig-fig3
# let's start with a clean copy of the map of Toronto
toronto_map = folium.Map(
location=[43.651070, -79.347015], zoom_start=11, tiles="OpenStreetMap"
)
# instantiate a feature group
bike_stations = folium.map.FeatureGroup()
# loop through the bike stations
for lat, long in zip(toronto_df_processed.LATITUDE, toronto_df_processed.LONGITUDE):
bike_stations.add_child(
folium.features.CircleMarker(
[lat, long],
radius=5,
color="grey",
fill=True,
fill_color="white",
fill_opacity=1,
)
)
# add pop-up text to each marker on the map
latitudes = list(toronto_df_processed.LATITUDE)
longitudes = list(toronto_df_processed.LONGITUDE)
labels = list(toronto_df_processed.STATUS)
for lat, lng, label in zip(latitudes, longitudes, labels):
folium.Marker([lat, lng], popup=label).add_to(toronto_map)
# add bike stations to map
toronto_map.add_child(bike_stations)
```
# Clustering the rental locations with `MarkerCluster`
And the best part, which happens to be my favorite, is that we can also integrate a `MarkerCluster`. This comes in handy when we're dealing with numerous data points clustered closely together on the map. With a `MarkerCluster`, you get to see their combined values instead of each one individually. It's a fantastic feature!
```{python}
#| fig-cap: Aggregated Bike Locations in the City of Toronto
#| label: fig-fig4
# let's start with a clean copy of the map of Toronto
toronto_map = folium.Map(
location=[43.651070, -79.347015], zoom_start=11, tiles="OpenStreetMap"
)
# instantiate a mark cluster object
bike_stations_cluster = plugins.MarkerCluster().add_to(toronto_map)
# loop through the dataframe and add each data point to the mark cluster
for lat, lng, label, in zip(
toronto_df_processed.LATITUDE,
toronto_df_processed.LONGITUDE,
toronto_df_processed.STATUS,
):
folium.Marker(
location=[lat, lng],
icon=None,
popup=label,
).add_to(bike_stations_cluster)
# display map
toronto_map
```
That's a wrap! I hope these examples have been helpful. Feel free to use these techniques in your next data science or geospatial project. Until next time, happy exploring!