Advancing Wildlife Conservation

Through innovative data science and machine learning approaches to combat wildlife trafficking

Discovering, Analyzing, and Disrupting Wildlife Trafficking Networks

Wildlife trafficking is a global crisis that threatens biodiversity, ecological systems, and public health. The illicit trade generates an estimated $7-23 billion annually and supplies markets for fashion, exotic pets, traditional medicine, and accessories. During the COVID-19 pandemic, traffickers shifted from face-to-face to online interactions, which created new challenges but also new opportunities for digital detection.

Our NSF-funded interdisciplinary project combines computer science, criminology, and environmental science to develop innovative tools for discovering, analyzing, and disrupting wildlife trafficking networks operating online.


Our Impact

Our research addresses the need for effective detection and prevention of wildlife trafficking. By developing cost-effective, scalable tools, we help researchers and law enforcement agencies worldwide combat this illicit trade.

The Challenge

Online marketplaces publish millions of ads, but identifying wildlife-related products is like finding a needle in a haystack. For example, searching for "shark" on eBay returns toys, shirts, vacuum cleaners, and only a few actual shark products. Even specific queries like "shark jaw" return irrelevant fossil items. This makes data collection, triage, and analysis extremely challenging for researchers studying wildlife trafficking patterns.

What if you could achieve 95% accuracy in wildlife trafficking detection while slashing labeling costs?

Learn to Sample Pipeline

Our Approach

We have developed an end-to-end pipeline to streamline the collection, extraction, and triage of online ads. Our Learn to Sample (LTS) methodology addresses the fundamental challenge of identifying relevant wildlife advertisements in massive online datasets.

LTS uses large language models (LLMs) to label a small, systematically selected set of ads, then uses those labeled ads to automatically train smaller, specialized classifiers that can be deployed at scale to triage large ad collections.
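The label-then-train idea can be illustrated with a minimal, standard-library-only sketch. This is not the LTS implementation. The function `mock_llm_label` is a hypothetical keyword rule standing in for a real LLM call, `select_sample` simply takes every second ad as a placeholder for the systematic selection step, and the small classifier is a basic naive Bayes model chosen only for brevity. The example ads are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def mock_llm_label(ad):
    """Hypothetical stand-in for an LLM call: 1 = animal product, 0 = other."""
    return int(any(word in ad.lower() for word in ("jaw", "tooth", "teeth")))

def select_sample(ads, step=2):
    """Placeholder selection (assumption): keep every `step`-th ad."""
    return ads[::step]

def train_naive_bayes(texts, labels):
    """Count words per class; these counts are the whole 'small classifier'."""
    word_counts = defaultdict(Counter)
    for text, label in zip(texts, labels):
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, Counter(labels), vocab

def predict(model, text):
    """Return the class with the highest Laplace-smoothed log-probability."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in text.lower().split():
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented example ads, all matching the keyword "shark".
ads = [
    "real shark jaw with teeth",
    "shark tooth pendant",
    "shark vacuum cleaner",
    "shark vacuum filter",
    "genuine shark tooth necklace",
    "mako shark jaw specimen",
    "kids shark toy",
    "shark shirt for kids",
]

sample = select_sample(ads)                    # small set sent to the labeler
labels = [mock_llm_label(ad) for ad in sample]  # costly step, done only once
model = train_naive_bayes(sample, labels)       # cheap classifier for triage

unlabeled = [ad for ad in ads if ad not in sample]
predictions = [predict(model, ad) for ad in unlabeled]
print(predictions)  # [1, 0, 1, 0]
```

Only four of the eight ads are sent to the stand-in labeler; the remaining four are triaged by the cheap classifier. In practice the labeler, the selection strategy, and the classifier would all be far more capable than these placeholders.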

Explore LTS on GitHub

Contributing Institutions

New York University
Indiana University
University of Miami
John Jay College of Criminal Justice

This project is funded by the National Science Foundation