End-to-End Project | INFO-664-01: Programming for Cultural Heritage

Mapping Female Scientists and Explorers

Visualizing the WINGS Women of Discovery Network

Cultural Heritage, Data Visualization, Python, Web Mapping, Gender in Science, GeoPandas, ArcGIS

Project Overview

For this final project, I created a Python-based spatial visualization of the WINGS Women of Discovery Fellows — a global network of female scientists and explorers working across fields such as geology, biology, climate science, and cultural documentation.

The goal was to show how computational tools, particularly Python geospatial libraries, can be used to support visibility, representation, and collaboration among women in STEM by making their global presence more legible.

Cleaned Dataset: WINGS Fellows data

Google Colab Notebook: Full documented workflow

Interactive Map: Locations & metadata

Filters: Research area & region

What is WINGS?

WINGS funds women researchers and scientists worldwide who are engaged in exploration and groundbreaking discovery. It is the only organization that awards unrestricted grants to women making pioneering discoveries in science, exploration, and conservation.

< 2% of U.S. charitable giving goes to organizations serving women

33.3% of all researchers worldwide are women

12% of members of national science academies are women

200+ grants awarded across 100+ countries since 2003

Since 2003, WINGS has published the findings of more than 50 women-led field expeditions, creating a growing record of women's contributions to science and exploration.

Why I Chose This Topic

Growing up, I rarely saw images of female scientists or knew any by name, so I didn't realize science was even a possible career path. That changed in 2020, during the pandemic, when I began archiving materials at my parents' home in Baltimore.

While going through family collections, I came across the papers of Florence Bascom, who in 1893 became the first woman to earn a PhD from Johns Hopkins University. My ancestor had run the Geology Department at the time, which is how her archives were preserved alongside his.

As I tried to understand her (and George Huntington Williams') diaries and geology notes, I contacted the Earth and Planetary Sciences department at Johns Hopkins. By coincidence, they were preparing an exhibit on Florence Bascom to inspire more women to enter STEM, and they asked whether I had uncovered any additional materials. That experience showed how women's scientific histories could be made visible — and how easily they can be forgotten.

Years later, after moving to New York City, I discovered the WINGS Women of Discovery community and shortly after became involved. This project reflects my ongoing commitment: to work within the sciences while also elevating and connecting women scientists through spatial visualization, making their contributions more visible and accessible and strengthening collaboration and community-building across regions and disciplines.

Challenges

Data Acquisition

  • WINGS data is available online but not in a structured table
  • Manual compilation and standardization of fields: names, occupation, hometown, years, research areas, expeditions

Geocoding

  • Many Fellows list broad regions rather than exact locations
  • Converting city/country data into latitude–longitude required multiple rounds of cleaning
  • Some locations required manual corrections and verification

Dataset Inconsistencies

  • Research areas varied widely and were not standardized (e.g., "marine biology," "ocean sciences," "biological oceanography" → one category?)
  • Institutions changed over time
  • Missing values had to be filled conservatively
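
One way to handle the research-area inconsistency above is a small lookup table that maps free-text labels to a shared category. The sketch below is illustrative only: `CATEGORY_MAP`, the "Ocean Science" label, and the `standardize_research_area` helper are assumptions, not the categories or code used for the actual dataset.

```python
# Hypothetical mapping from free-text research labels to one shared category.
CATEGORY_MAP = {
    "marine biology": "Ocean Science",
    "ocean sciences": "Ocean Science",
    "biological oceanography": "Ocean Science",
}

def standardize_research_area(label, default="Uncategorized"):
    """Return the shared category for a free-text label, or `default` if unknown."""
    if not isinstance(label, str):
        return default  # missing values (None, NaN) stay uncategorized
    key = " ".join(label.lower().split())  # lowercase and collapse whitespace
    return CATEGORY_MAP.get(key, default)
```

In a pandas workflow this could be applied with something like `df['RESEARCH'].map(standardize_research_area)`. Unknown labels fall back to a visible default rather than being forced into a category, which keeps the decision about merging categories a manual one.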

Interactive Map Styling & Filters

  • Balancing clarity and visual simplicity took iteration
  • Choosing appropriate color schemes for categorical data

From proposal feedback: "Any thoughts on the larger implications of using localization to single out populations in the current techno-political environment?"

Technical Workflow

The project was built entirely in Python using a Google Colab notebook. Below is the annotated workflow showing how the dataset was loaded, cleaned, geocoded, and exported for spatial visualization.

1. Loading & Exploring the Dataset

The WINGS Fellows data was manually compiled into an Excel spreadsheet with fields including names, occupation, hometown, years active, research descriptions, expeditions, and fellowship year.

import pandas as pd

df = pd.read_excel("/WINGS Fellows.xlsx")
df.head()

# Dataset fields: NAMES, OCCUPATION, HOMETOWN, YEARS,
# RESEARCH, FELLOW AWARDED, Expeditions, WEBSITE,
# WINGS WEBSITE LINK, ADVICE

2. Assessing Data Quality

Checked for missing values and data types. Several fields had gaps that needed conservative filling.

# Summary info and missing values
df.info()
df.isna().sum()

# Sample random rows to spot inconsistencies
df.sample(10)
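
As a rough sketch of what "conservative filling" could look like, the helpers below report the share of missing values per column and fill only the text columns that are explicitly named, using a visible placeholder. The function names, the "Unknown" placeholder, and the sample rows are assumptions for illustration, not the notebook's actual code or data.

```python
import pandas as pd

def missing_report(df):
    """Return the share of missing values per column, largest first."""
    return df.isna().mean().sort_values(ascending=False)

def fill_conservatively(df, text_columns, placeholder="Unknown"):
    """Fill gaps in the given text columns with a visible placeholder.

    Returns a copy; columns not listed are left untouched so no value is guessed.
    """
    out = df.copy()
    for col in text_columns:
        out[col] = out[col].fillna(placeholder)
    return out

# Tiny made-up frame standing in for the real spreadsheet.
sample = pd.DataFrame({
    "NAMES": ["A", "B", "C", "D"],
    "OCCUPATION": ["Geologist", None, "Biologist", None],
    "HOMETOWN": ["Lima, Peru", None, None, None],
})
report = missing_report(sample)
filled = fill_conservatively(sample, ["OCCUPATION"])
```

Leaving `HOMETOWN` unfilled here is deliberate: a placeholder in a location column would later be sent to the geocoder as if it were a place name.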

3. Cleaning Location Data

Hometown entries varied wildly — some listed cities, others listed countries or broad regions. All were converted to strings for geocoding.

# Convert hometown column to strings
df['location_clean'] = df['HOMETOWN'].astype(str)
df['location_clean'].head()
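
Because hometown entries mix cities, countries, and broad regions, it may help to normalize each string and flag the ones that are probably too coarse to geocode precisely. The helpers below are a minimal sketch under the assumption that specific entries look like "City, Country"; both function names and the comma heuristic are hypothetical.

```python
def clean_location(value):
    """Return a trimmed place string, or None when the entry is empty or missing."""
    if not isinstance(value, str):
        return None  # None and NaN (a float) are treated as missing
    text = " ".join(value.split())  # collapse repeated whitespace
    text = text.strip(" ,;")
    if not text or text.lower() in {"nan", "n/a", "unknown"}:
        return None
    return text

def looks_broad(place):
    """Heuristic: an entry without a comma is probably a country or region."""
    return place is not None and "," not in place
```

Entries flagged by `looks_broad` are candidates for manual review rather than errors; a single-word city name would also be flagged.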

4. Geocoding with GeoPy

Used the Nominatim geocoder (OpenStreetMap) with rate limiting to convert place names into latitude/longitude coordinates. Locations that failed were manually corrected.

from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter

geolocator = Nominatim(user_agent="wings_mapper")
geocode = RateLimiter(geolocator.geocode, min_delay_seconds=1)

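# Geocode all locations, skipping missing values
# (astype(str) turns NaN into the literal text "nan", which should not be sent to the geocoder)
df['geo'] = df['location_clean'].apply(
    lambda s: geocode(s) if pd.notna(s) and s.strip().lower() != 'nan' else None
)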
df['lat'] = df['geo'].apply(
    lambda x: x.latitude if x else None
)
df['lon'] = df['geo'].apply(
    lambda x: x.longitude if x else None
)

# Check which entries failed geocoding
df[df['lat'].isna()]
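
The manual correction step can be kept reproducible by recording verified coordinates in a lookup table instead of editing cells by hand. This is a sketch only: `MANUAL_COORDS`, its placeholder entry, and `resolve_coords` are hypothetical and do not reflect the corrections actually made.

```python
# Hypothetical manual corrections keyed by the original place string.
# The entry below is a placeholder, not a real location.
MANUAL_COORDS = {
    "Example Region": (10.0, 20.0),
}

def resolve_coords(place, geocoded, overrides=MANUAL_COORDS):
    """Pick coordinates for one row.

    `geocoded` is a (lat, lon) tuple from the geocoder, or None if it failed.
    Manual overrides win, so a verified correction is never overwritten.
    """
    if place in overrides:
        return overrides[place]
    if geocoded is not None:
        return geocoded
    return (None, None)
```

Rows that still return `(None, None)` are the ones left to check by hand, which matches the failed-geocoding filter shown above.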

5. Export for Visualization

The cleaned, geocoded dataset was exported to CSV for use in ArcGIS Online, where it was visualized as an interactive map with filters by research area and region.

# Export cleaned dataset
df.to_csv("WINGS_clean.csv", index=False)
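
Before exporting, a quick range check can catch rows whose coordinates are missing or impossible. The `valid_coord` helper below is an assumed addition, not part of the original notebook.

```python
import math

def valid_coord(lat, lon):
    """True when both values are real numbers inside the valid lat/lon ranges."""
    try:
        lat, lon = float(lat), float(lon)
    except (TypeError, ValueError):
        return False  # None, empty strings, and other non-numeric values
    if math.isnan(lat) or math.isnan(lon):
        return False
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0
```

A check like this only catches out-of-range values; swapped latitude and longitude can still pass when both numbers happen to fall inside the valid ranges, so a visual check on the map remains useful.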

Tools used: Python, Pandas, GeoPy/Nominatim, Google Colab, ArcGIS Online, Excel

Interactive Map

The geocoded dataset was visualized as an interactive map. Below is the Folium-generated map showing WINGS Fellows plotted at their hometown locations, followed by the ArcGIS 3D Scene Layer providing a globe view of the network.

ArcGIS 3D Globe View

The same dataset visualized as a 3D scene layer in ArcGIS Online, providing a globe perspective of the global distribution of WINGS Fellows.

Key Findings & Insights

Geographic Distribution

  • Strong clusters and representation in North America
  • Gaps or sparse representation in Africa, Australia, and South America

Disciplinary Distribution

  • Overrepresentation in certain fields (e.g., biology, ecology, ocean exploration)
  • Underrepresentation in engineering or computational disciplines
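
Observations about over- and underrepresentation like these can be backed by a simple tally of category shares. The sketch below uses made-up labels rather than the real dataset, and `category_shares` is a hypothetical helper.

```python
from collections import Counter

def category_shares(labels):
    """Return {label: share of total}, ordered from most to least common."""
    counts = Counter(label for label in labels if label)  # skip None and ""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.most_common()}

# Made-up labels, not the real dataset.
shares = category_shares(["Biology", "Biology", "Ecology", "Engineering", None])
```

The same function works for regions or institutions, as long as the labels have been standardized first.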

Institutional Patterns

  • Fellows often affiliated with research universities
  • Strong ties to conservation organizations and exploration institutes
  • Connections to science communication nonprofits

Visibility Matters

  • The map demonstrates how a relatively hidden network becomes legible and connected when visualized spatially
  • Spatial visualization transforms abstract data into actionable insight about representation

Future Directions

Automate Data Collection

Build a scraper or API pipeline to keep the dataset updated and reduce manual data entry

Add Filtering & Analytical Tools

Filters by discipline, region, or institution; thematic clustering; time-series analysis by year

Build a Web Dashboard

Convert the map into a public-facing interactive site with search, filters, and improved UI

Expand Data & Metadata

Add fields such as publications, institutions, mentors, and advisors

Grow the Dataset

Include women scientists and explorers beyond WINGS to build a broader resource

Community Collaboration

Enable WINGS Fellows to update their own profiles and contribute to the dataset

Reflection

This project demonstrated that computational tools — particularly Python geospatial libraries like GeoPandas, Folium, and the ArcGIS API — can serve as powerful instruments for visibility and representation. By transforming unstructured data about the WINGS Women of Discovery into a structured, geocoded, and interactive map, I was able to make a hidden network legible and connected.

The project also reinforced the importance of data stewardship: cleaning, standardizing, and responsibly geocoding information about real people requires care, ethical consideration, and transparency about the limitations of the data.
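
One concrete option for geocoding responsibly is to reduce coordinate precision before publishing, so each marker shows an approximate area and not an exact point. This is a sketch of one possible approach, not something the project is stated to have done; `coarsen` and its default precision are assumptions.

```python
def coarsen(lat, lon, decimals=1):
    """Round coordinates so a marker shows an approximate area, not a precise point.

    At one decimal place, latitude is resolved to roughly 11 km;
    fewer decimals means less precision.
    """
    if lat is None or lon is None:
        return (None, None)
    return (round(lat, decimals), round(lon, decimals))
```

For example, `coarsen(39.2904, -76.6122)` yields `(39.3, -76.6)`, which still places a marker near the right city without implying a specific address.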