I’m happy to share that I’ve completed both IBM’s Data Analyst and Data Science Professional Certificates within the past month. The course content was well-structured, and I learned a great deal from these programs. For instance, I’ve always been interested in learning SQL, and this was the perfect chance to start exploring it.

If you’re curious about these certificates, you can find more information through the links provided below. But my learning journey doesn’t stop here—I’m planning to tackle most of the courses listed in the Data Science learning path on Coursera, so there’s more to come.

While I’m at it, I wanted to introduce you to a neat library called Folium, which is fantastic for working with geospatial data. I came across Folium during the capstone project of the Data Science Specialization, where we had a fun task of predicting and visualizing the success of SpaceX rocket launches.

In this post, I’ll briefly share what I’ve learned about this library. I hope you’ll find it useful too. Let’s dive in!

Code

import folium
import pandas as pd
import os
from folium import plugins

We’ll use a dataset published on the City of Toronto’s open data portal, https://open.toronto.ca/. It lists the locations of bicycle parking fixtures installed on sidewalks and boulevards across the City of Toronto, wherever there’s a need for public bicycle parking. By the way, I discovered this dataset through the Awesome Public Datasets repository on GitHub. If you haven’t already, I recommend checking it out.

Code

# Let's read in the file

for file in os.listdir():
    if file.endswith(".csv"):
        toronto_df = pd.read_csv(file)

        print(f"{file} read in as pandas dataframe")

Street bicycle parking data - 4326.csv read in as pandas dataframe
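The loop above reads whatever CSV files happen to sit in the working directory, so with several files the last one read wins, and with none `toronto_df` is never defined. A slightly more defensive variant is sketched below. The helper name `find_single_csv` is my own invention for this post and not part of pandas or Folium.

```python
from pathlib import Path


def find_single_csv(directory="."):
    """Return the only CSV file in `directory`, or raise if there is not exactly one."""
    csv_files = sorted(Path(directory).glob("*.csv"))
    if len(csv_files) != 1:
        raise FileNotFoundError(
            f"expected exactly one CSV file in {directory!r}, found {len(csv_files)}"
        )
    return csv_files[0]
```

With that in place, `toronto_df = pd.read_csv(find_single_csv())` either reads the one file you meant or fails loudly.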

Considering the original dataset has over 17,300 entries, we’ll keep things light by working with just 500 rows for now. It’s all for the sake of a demonstration, after all!

Code

toronto_df = toronto_df.sample(n=500)
toronto_df.head()

	_id	OBJECTID	ID	ADDRESSNUMBERTEXT	ADDRESSSTREET	FRONTINGSTREET	SIDE	FROMSTREET	DIRECTION	SITEID	WARD	BIA	ASSETTYPE	STATUS	X	Y	LONGITUDE	LATITUDE	geometry
784	4481427	10424	BP-05283	15	Dundonald St	NaN	NaN	Dundonald St	NaN	NaN	13.0	NaN	Ring	Temporarily Removed	NaN	NaN	NaN	NaN	{'type': 'MultiPoint', 'coordinates': [[-79.38...
3297	4483940	15253	BP-35603	49	Harbour Sq	Queens Quay W	South	Harbour Sq	West	NaN	10.0	The Waterfront	Ring	Existing	NaN	NaN	NaN	NaN	{'type': 'MultiPoint', 'coordinates': [[-79.37...
13971	4494614	31121	BP-22492	200	Elizabeth St	Elizabeth St	West	La Plante Ave	West	NaN	11.0	NaN	Ring	Existing	NaN	NaN	NaN	NaN	{'type': 'MultiPoint', 'coordinates': [[-79.38...
5139	4485782	17465	BP-40070	70	Peter St	King St W	North	Peter St	West	NaN	10.0	Toronto Downtown West	Ring	Existing	NaN	NaN	NaN	NaN	{'type': 'MultiPoint', 'coordinates': [[-79.39...
7635	4488278	20375	BP-27153	39	Prince Arthur Ave	Prince Arthur Ave	South	Bedford Rd	East	NaN	11.0	NaN	Ring	Existing	NaN	NaN	NaN	NaN	{'type': 'MultiPoint', 'coordinates': [[-79.39...
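One caveat: `sample` draws different rows on every run, so your table will not match mine. If you want a repeatable subset, `DataFrame.sample` accepts a `random_state` seed. Here is a minimal sketch on a made-up stand-in dataframe; both the seed 42 and the toy data are arbitrary choices for illustration.

```python
import pandas as pd

# Toy stand-in for the real dataset (values are made up for illustration)
toy_df = pd.DataFrame({"ID": [f"BP-{i:05d}" for i in range(1000)]})

# The same seed always selects the same rows
first_draw = toy_df.sample(n=5, random_state=42)
second_draw = toy_df.sample(n=5, random_state=42)

print(first_draw.index.equals(second_draw.index))
```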

The LONGITUDE and LATITUDE columns are empty (NaN) in this export. The coordinates live inside the geometry column instead, so we need to extract them before we can map anything. We’ll use pandas’ str.extract with a regular expression for this task.

Code

pattern = r"(-?\d+\.\d+),\s*(-?\d+\.\d+)"

toronto_df_processed = toronto_df.assign(
    LONGITUDE=lambda df: df.geometry.str.extract(pattern)[0],
    LATITUDE=lambda df: df.geometry.str.extract(pattern)[1],
).loc[:, ["ASSETTYPE", "STATUS", "LONGITUDE", "LATITUDE"]]
toronto_df_processed.head()

	ASSETTYPE	STATUS	LONGITUDE	LATITUDE
784	Ring	Temporarily Removed	-79.38378423783222	43.6660359833018
3297	Ring	Existing	-79.3774934493851	43.6407633657936
13971	Ring	Existing	-79.386799735149	43.6589303889453
5139	Ring	Existing	-79.3926661761316	43.6460273003346
7635	Ring	Existing	-79.3973838724551	43.6693038734947
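Note that `str.extract` returns strings, which is why the coordinates above are printed at full precision. The maps below still render with them, but numeric columns are safer if you plan to do any arithmetic on the coordinates. A minimal sketch of the conversion follows; the single-row dataframe is a stand-in I assembled from the first row of the output above, not a verbatim record from the file.

```python
import pandas as pd

pattern = r"(-?\d+\.\d+),\s*(-?\d+\.\d+)"

# Stand-in row assembled from the first row of the processed output above
demo_df = pd.DataFrame(
    {
        "geometry": [
            "{'type': 'MultiPoint', 'coordinates': [[-79.38378423783222, 43.6660359833018]]}"
        ]
    }
)

# Extract both capture groups in one pass, then convert the strings to floats
coords = demo_df["geometry"].str.extract(pattern).astype(float)
coords.columns = ["LONGITUDE", "LATITUDE"]

print(coords.dtypes)
```

Calling `str.extract` once and naming the columns afterwards also avoids running the regular expression twice.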

Creating the map and displaying it

Here’s an example of how to create a map without any overlaid data points.

Code

toronto_map = folium.Map(
    location=[43.651070, -79.347015], zoom_start=11, tiles="OpenStreetMap"
)
toronto_map

Figure 1: The City of Toronto

Superimposing bike locations on the map with `FeatureGroup`

After instantiating a FeatureGroup, we can add each bike parking location to it as a CircleMarker using the add_child method, and then add the whole group to the map in one step.

Code

# let's start with a clean copy of the map of Toronto
toronto_map = folium.Map(
    location=[43.651070, -79.347015], zoom_start=11, tiles="OpenStreetMap"
)

# instantiate a feature group
bike_stations = folium.FeatureGroup()

# loop through the bike parking locations and add a circle marker for each
for lat, lng in zip(toronto_df_processed.LATITUDE, toronto_df_processed.LONGITUDE):
    bike_stations.add_child(
        folium.CircleMarker(
            [lat, lng],
            radius=5,
            color="red",
            fill=True,
            fill_color="yellow",
            fill_opacity=1,
        )
    )

# add the feature group to the map
toronto_map.add_child(bike_stations)

Figure 2: The City of Toronto with bike parking locations

Adding pop-up text with relevant information

We can also add pop-ups that display custom text when a marker is clicked. Here, each pop-up shows the STATUS value for that location.

Code

# let's start with a clean copy of the map of Toronto
toronto_map = folium.Map(
    location=[43.651070, -79.347015], zoom_start=11, tiles="OpenStreetMap"
)

# instantiate a feature group
bike_stations = folium.FeatureGroup()

# loop through the bike parking locations and add a circle marker for each
for lat, lng in zip(toronto_df_processed.LATITUDE, toronto_df_processed.LONGITUDE):
    bike_stations.add_child(
        folium.CircleMarker(
            [lat, lng],
            radius=5,
            color="grey",
            fill=True,
            fill_color="white",
            fill_opacity=1,
        )
    )

# place a standard pin marker with a pop-up label at each location
latitudes = list(toronto_df_processed.LATITUDE)
longitudes = list(toronto_df_processed.LONGITUDE)
labels = list(toronto_df_processed.STATUS)

for lat, lng, label in zip(latitudes, longitudes, labels):
    folium.Marker([lat, lng], popup=label).add_to(toronto_map)

# add the feature group to the map
toronto_map.add_child(bike_stations)

Figure 3: The City of Toronto with bike parking locations and pop-up labels

Clustering the bike parking locations with `MarkerCluster`

My favorite feature is MarkerCluster. It comes in handy when many data points sit close together on the map: instead of drawing every marker individually, the map shows a single cluster icon labeled with the number of markers it contains, and the clusters split apart as you zoom in.

Code

# let's start with a clean copy of the map of Toronto
toronto_map = folium.Map(
    location=[43.651070, -79.347015], zoom_start=11, tiles="OpenStreetMap"
)

# instantiate a marker cluster object and attach it to the map
bike_stations_cluster = plugins.MarkerCluster().add_to(toronto_map)

# loop through the dataframe and add each data point to the marker cluster
for lat, lng, label in zip(
    toronto_df_processed.LATITUDE,
    toronto_df_processed.LONGITUDE,
    toronto_df_processed.STATUS,
):
    folium.Marker(
        location=[lat, lng],
        icon=None,
        popup=label,
    ).add_to(bike_stations_cluster)

# display map
toronto_map

Figure 4: Aggregated Bike Locations in the City of Toronto

That’s a wrap! I hope these examples have been helpful. Feel free to use these techniques in your next data science or geospatial project. Until next time, happy exploring!
