Tutorial
April 7, 2026 · 12 min read

How to Build Fuzzy Matching for Sanctions Screening in Python

Sanctions screening has a name matching problem. The OFAC SDN list spells Libya's former leader as "AL-QADHAFI, Muammar." Your customer database might have "Moammar Gaddafi." The Library of Congress uses "Qaddafi." Wikipedia uses "Gaddafi." Researchers have documented 112 distinct English spellings of this single name. Exact string matching catches exactly one of them.

This article walks through building a fuzzy name matching system for sanctions screening in Python, from naive string comparison to a multi-algorithm pipeline that handles transliterations, name reordering, and the common-name false positive problem. Every step includes working code you can run locally. At the end, we will show how to skip all of this and use the Verifex Python SDK to get production-quality screening in five lines of code.

Why naive string matching fails

The simplest approach is exact comparison. Normalize both strings to lowercase, strip whitespace, and check equality. Here is what that looks like:

python
def exact_match(query: str, target: str) -> bool:
    return query.strip().lower() == target.strip().lower()

# These all return False:
exact_match("Muammar Gaddafi", "Muammar al-Qadhafi")   # transliteration
exact_match("Vladimir Putin", "Putin, Vladimir")        # name reordering
exact_match("Jose Garcia", "José García")               # diacritics
exact_match("Ossama bin Laden", "Usama bin Ladin")      # phonetic variant

Every single comparison fails, and every single pair refers to the same person. This is the fundamental problem. Sanctions data comes from government agencies that transliterate names from Arabic, Cyrillic, Chinese, and dozens of other scripts. Customer data comes from user input forms, passport OCR, and bank wire messages. The same person will be spelled differently in each system.

Exact matching misses four categories of variation that are extremely common in sanctions data:

  • Transliterations. Gaddafi, Qadhafi, Gadhafi, Kaddafi — all valid romanizations of the Arabic original.
  • Name reordering. OFAC lists names as "PUTIN, Vladimir Vladimirovich." Customer forms collect "Vladimir Putin."
  • Missing components. A customer might enter "Vladimir Putin" while the sanctions entry includes the patronymic "Vladimirovich."
  • Diacritics and encoding. José vs Jose, Müller vs Mueller, Çelik vs Celik.

We need something better. Let us build up, one technique at a time.
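The first layer is not an algorithm at all: normalize both names before any comparison. At minimum that means case-folding, stripping diacritics, and collapsing whitespace. Here is a minimal stdlib-only sketch — note the caveat that generic Unicode folding maps "Müller" to "muller", not the German convention "mueller"; language-specific folding needs its own tables:

```python
import unicodedata

def normalize(name: str) -> str:
    """Lowercase, strip diacritics, collapse whitespace."""
    # NFKD splits accented characters like 'é' into 'e' plus a
    # combining accent mark; we then drop the combining marks
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.lower().split())

normalize("José García")  # => "jose garcia"
normalize("  Çelik ")     # => "celik"
```

With this in place, the diacritics category disappears entirely, and every later technique operates on clean, comparable strings.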

Step 1: Levenshtein distance

Levenshtein distance counts the minimum number of single-character edits — insertions, deletions, or substitutions — needed to transform one string into another. "Gaddafi" to "Gadafi" is distance 1 (delete one "d"). "Putin" to "Puttin" is distance 1 (insert a "t"). We convert this to a similarity ratio between 0 and 1.

python
from difflib import SequenceMatcher

def levenshtein_similarity(a: str, b: str) -> float:
    """Return similarity ratio between 0 and 1.

    Note: difflib's SequenceMatcher uses the Ratcliff/Obershelp
    algorithm, a close cousin of edit distance that is good enough
    for a prototype; a true Levenshtein library comes later.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Typos and minor spelling differences work well:
levenshtein_similarity("Vladimir Putin", "Vladmir Putin")
# => 0.96  (one missing character)

levenshtein_similarity("Muhammad", "Mohammed")
# => 0.75  (two character substitutions)

# But phonetic variants fall below typical match thresholds:
levenshtein_similarity("Gaddafi", "Qadhafi")
# => 0.71  (the two substitutions are exactly the phonetic ones)

levenshtein_similarity("Ossama", "Usama")
# => 0.73  (same person, borderline score)

Levenshtein works well for typos and minor spelling differences. A missing letter or a swapped character produces a high similarity score. But it under-scores phonetically similar names that are spelled differently. "Gaddafi" and "Qadhafi" differ in exactly the characters a transliteration swaps — "G" vs "Q" and "dd" vs "dh" — so the score lands around 0.7, below any sensible match threshold, even though any English speaker would recognize them as the same name.
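For reference, the true edit distance the section opened with (which difflib only approximates) is a short dynamic-programming recurrence. A minimal stdlib-only sketch, using the standard two-row optimization:

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]

levenshtein_distance("Gaddafi", "Gadafi")   # => 1
levenshtein_distance("Gaddafi", "Qadhafi")  # => 2
```

This is O(len(a) × len(b)) per pair, which is exactly why production systems avoid running it against every entry (more on that later).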

For a faster implementation on large datasets, use the python-Levenshtein library, which is implemented in C:

python
# pip install python-Levenshtein
import Levenshtein

distance = Levenshtein.distance("Gaddafi", "Gadafi")  # => 1
ratio = Levenshtein.ratio("Vladimir Putin", "Vladmir Putin")  # => 0.96

Levenshtein is a good first layer, but we need something that understands pronunciation.

Step 2: Jaro-Winkler similarity

Jaro-Winkler improves on basic edit distance by giving extra weight to matching characters at the beginning of the string. The intuition is that names with the same prefix are more likely to be the same person. It was originally designed for census record linkage, which is essentially the same problem as sanctions screening: matching names across imperfect records.

python
# pip install jellyfish
import jellyfish

def jaro_winkler(a: str, b: str) -> float:
    return jellyfish.jaro_winkler_similarity(a.lower(), b.lower())

# Better than Levenshtein for prefix matches:
jaro_winkler("Muhammad", "Mohammed")
# => 0.85  (shared "M" prefix gets a bonus)

jaro_winkler("Vladimir Putin", "Vladmir Putin")
# => 0.92  (long shared prefix, one typo)

jaro_winkler("Sberbank", "Sberbank of Russia")
# => 0.89  (shared prefix with extra tokens)

# Still struggles with different-sounding starts:
jaro_winkler("Gaddafi", "Qadhafi")
# => 0.81  (different first character means no prefix bonus,
#           leaving the score below a typical 0.85 cutoff)
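The "Winkler" part of the name is a one-line adjustment on top of the base Jaro score: final = jaro + ℓ · p · (1 − jaro), where ℓ is the length of the common prefix (capped at four characters) and p is a scaling factor, conventionally 0.1. A sketch of that standard formula:

```python
def winkler_boost(jaro: float, prefix_len: int, p: float = 0.1) -> float:
    """Apply the Winkler prefix boost: jaro + l * p * (1 - jaro)."""
    return jaro + min(prefix_len, 4) * p * (1 - jaro)

winkler_boost(0.80, 0)  # => 0.80 (no shared prefix, no boost)
winkler_boost(0.80, 4)  # => 0.88 (maximum four-character boost)
```

The boost can only ever raise a score, and only when the first characters agree — which is the behavior the examples above demonstrate.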

Jaro-Winkler gives us better results than Levenshtein for names that share a common prefix, which is surprisingly common in sanctions data where transliterations often preserve the first few characters. But when the first character itself is different — "G" vs "Q" for Gaddafi/Qadhafi, "O" vs "U" for Ossama/Usama — there is no prefix bonus to earn, and the score stays stuck below threshold. We need an approach that compares sounds, not characters.

Step 3: Phonetic matching with Double Metaphone

Phonetic algorithms encode names by how they sound rather than how they are spelled. The classic Soundex algorithm has been around since 1918, but Double Metaphone (developed by Lawrence Philips in 2000) is far more accurate for the kind of cross-language name matching that sanctions screening requires. It generates two possible phonetic encodings for each name, accounting for ambiguous pronunciations.
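Soundex itself is simple enough to sketch from scratch, and doing so shows exactly why it is too coarse for screening. A stdlib-only implementation of the classic rules (first letter preserved, consonants mapped to digits, duplicates collapsed, h/w transparent):

```python
def soundex(name: str) -> str:
    """Classic Soundex: first letter plus three digits."""
    codes = {}
    for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6")]:
        for ch in letters:
            codes[ch] = digit
    name = name.lower()
    encoded, prev = [], codes.get(name[0], "")
    for ch in name[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            encoded.append(code)
        if ch not in "hw":  # h and w do not separate duplicate codes
            prev = code
    return (name[0].upper() + "".join(encoded) + "000")[:4]

soundex("Robert")   # => "R163"
soundex("Rupert")   # => "R163"  (distinct names collide easily)
soundex("Gaddafi")  # => "G310"
soundex("Qadhafi")  # => "Q310"  (first letter always survives, so no match)
```

Robert and Rupert collide, while Gaddafi and Qadhafi fail to match because Soundex never re-maps the first letter — precisely the weakness Double Metaphone was designed to fix.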

python
# pip install metaphone
from metaphone import doublemetaphone

def phonetic_match(a: str, b: str) -> bool:
    """Check if two names share any phonetic encoding."""
    codes_a = doublemetaphone(a)
    codes_b = doublemetaphone(b)
    # Compare all combinations of primary/alternate codes
    for code_a in codes_a:
        for code_b in codes_b:
            if code_a and code_b and code_a == code_b:
                return True
    return False

# Now the transliterations match:
doublemetaphone("Gaddafi")   # => ('KTF', '')
doublemetaphone("Qadhafi")   # => ('KTF', '')
phonetic_match("Gaddafi", "Qadhafi")  # => True!

doublemetaphone("Ossama")    # => ('ASM', '')
doublemetaphone("Usama")     # => ('ASM', '')
phonetic_match("Ossama", "Usama")     # => True!

# Works for European variants too:
phonetic_match("Schmidt", "Smith")    # => True
phonetic_match("Mueller", "Miller")   # => True

Double Metaphone solves the transliteration problem that Levenshtein and Jaro-Winkler cannot handle. "Gaddafi" and "Qadhafi" both encode to "KTF" because the algorithm knows that "G" and hard "Q" produce the same consonant sound, and "dd" and "dh" are equivalent phonetically.

The limitation of phonetic matching is that it is binary — two names either share a code or they do not. There is no confidence score. And phonetic codes are coarse, which means unrelated names sometimes share the same code (false positives). We need to combine phonetic matching with the similarity scores from the earlier steps to get reliable confidence values.

Step 4: Token-based matching with TF-IDF

All the approaches so far compare full name strings. But sanctions screening involves names with different orderings ("Putin, Vladimir" vs "Vladimir Putin"), missing components ("Vladimir Putin" vs "Vladimir Vladimirovich Putin"), and extra tokens ("Sberbank" vs "PUBLIC JOINT STOCK COMPANY SBERBANK OF RUSSIA"). Token-based matching handles all of these by breaking names into individual words and comparing them independently.

TF-IDF (Term Frequency-Inverse Document Frequency) goes further by weighting each token by how rare it is in the dataset. Common tokens like "Mohammed" or "Corporation" get low weights. Rare tokens like "Qadhafi" or "Sberbank" get high weights.

python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

# Simulated sanctions list
sanctions_names = [
    "Vladimir Vladimirovich Putin",
    "Muammar al-Qadhafi",
    "Mohammed Ali al-Houthi",
    "Mohammed Ibrahim Hassan",
    "Kim Jong Un",
    "Sberbank of Russia",
    "Ali Khamenei",
    "Mohammed bin Salman",
]

# Build TF-IDF matrix from sanctions list
vectorizer = TfidfVectorizer(analyzer="word", lowercase=True)
tfidf_matrix = vectorizer.fit_transform(sanctions_names)

def tfidf_screen(query: str, top_k: int = 3) -> list:
    """Screen a query name against sanctions list using TF-IDF."""
    query_vec = vectorizer.transform([query.lower()])
    scores = cosine_similarity(query_vec, tfidf_matrix).flatten()
    ranked = np.argsort(scores)[::-1][:top_k]
    return [(sanctions_names[i], round(scores[i], 3)) for i in ranked if scores[i] > 0]

# Name reordering works perfectly:
tfidf_screen("Putin Vladimir")
# => [("Vladimir Vladimirovich Putin", 0.816)]

# Extra tokens handled gracefully:
tfidf_screen("Sberbank")
# => [("Sberbank of Russia", 0.577)]

# Partial names still match:
tfidf_screen("al-Qadhafi")
# => [("Muammar al-Qadhafi", 0.794), ("Mohammed Ali al-Houthi", 0.315)]

TF-IDF with cosine similarity solves the name reordering problem completely. "Putin Vladimir" matches "Vladimir Vladimirovich Putin" because cosine similarity does not care about token order — it only cares about which tokens are present and how important they are.

The IDF problem: why "Mohammed Smith" matches everything

Here is where TF-IDF earns its keep. Without IDF weighting, a query like "Mohammed Ali Hassan" would match nearly every entry containing "Mohammed" or "Ali." In a real sanctions database, "Mohammed" appears in thousands of entries. "Ali" appears in hundreds more. Without weighting, any query with these tokens generates a wall of false positives.

IDF solves this mathematically. The weight of each token is:

python
import math

total_entries = 30_000  # entries in sanctions database

# IDF = log(total_entries / entries_containing_token)
idf_mohammed = math.log(total_entries / 2400)   # => 2.53 (very common)
idf_ali      = math.log(total_entries / 1800)   # => 2.81 (very common)
idf_qadhafi  = math.log(total_entries / 3)      # => 9.21 (extremely rare)
idf_putin    = math.log(total_entries / 2)      # => 9.62 (extremely rare)

# A match on "Qadhafi" is worth ~3.6x more than a match on "Mohammed"
# This is why "Mohammed Ali Hassan" does NOT match "Mohammed Ali al-Houthi"
# at high confidence: the distinctive token "Hassan" vs "al-Houthi" differs,
# and the common tokens "Mohammed" and "Ali" contribute very little.

Let us see this in action with our TF-IDF screener:

python
# Screen a common name against our sample list
results = tfidf_screen("Mohammed Ali Hassan")

# Without IDF, "Mohammed" and "Ali" would push several entries to
# high confidence. With IDF, those common tokens carry little weight,
# so the ranking is driven by the rarer token "Hassan" and nothing
# comes close to exact-match territory:
# => [("Mohammed Ibrahim Hassan", 0.643),
#     ("Mohammed Ali al-Houthi", 0.48),
#     ("Mohammed bin Salman", 0.221)]
# (An eight-name sample understates the effect; against a real
# 30,000-entry list, "Mohammed" and "Ali" are down-weighted far
# more aggressively.)

# Compare with a distinctive name:
results = tfidf_screen("Muammar Gaddafi")
# => [("Muammar al-Qadhafi", 0.608)]
# "Gaddafi" never matches "Qadhafi" at the string level, so only the
# shared token "Muammar" contributes. Combining this signal with
# phonetic matching (Step 3) is what catches the equivalence.

This is the single most important insight for building a sanctions screening system: not all name tokens are created equal. A match on "Qadhafi" is overwhelmingly more meaningful than a match on "Mohammed." IDF weighting encodes this mathematical reality into your scoring. Without it, your system will drown compliance analysts in false alerts for anyone with a common name.

Putting it all together: a multi-stage pipeline

A production sanctions screening system runs all four techniques in sequence. Each stage catches different types of name variations, and the results are merged into a single ranked list with confidence scores. Here is the architecture:

python
from dataclasses import dataclass

@dataclass
class ScreeningMatch:
    name: str
    source: str           # "OFAC SDN", "UN", "EU", etc.
    confidence: float     # 0.0 to 1.0
    match_type: str       # "exact", "fuzzy", "phonetic", "token"

def screen_name(query: str, sanctions_db: list) -> list[ScreeningMatch]:
    """Run the four stages in order of cost and confidence.

    Relies on the helpers built in Steps 1-4, plus normalize(),
    lookup_source(), and deduplicate(), which are left as simple
    exercises (normalization at minimum lowercases and strips
    diacritics).
    """
    matches = []

    for entry in sanctions_db:
        # Stage 1: Exact match (fastest, highest confidence)
        if normalize(query) == normalize(entry.name):
            matches.append(ScreeningMatch(
                name=entry.name, source=entry.source,
                confidence=1.0, match_type="exact"
            ))
            continue

        # Stage 2: Jaro-Winkler fuzzy match
        jw_score = jaro_winkler(query, entry.name)
        if jw_score > 0.85:
            matches.append(ScreeningMatch(
                name=entry.name, source=entry.source,
                confidence=jw_score, match_type="fuzzy"
            ))
            continue

        # Stage 3: Phonetic match with Double Metaphone
        if phonetic_match(query, entry.name):
            # Use Levenshtein as secondary score
            lev_score = levenshtein_similarity(query, entry.name)
            matches.append(ScreeningMatch(
                name=entry.name, source=entry.source,
                confidence=max(0.65, lev_score), match_type="phonetic"
            ))
            continue

    # Stage 4: TF-IDF token matching (handles reordering, missing tokens)
    tfidf_results = tfidf_screen(query)
    for name, score in tfidf_results:
        if score > 0.5 and not any(m.name == name for m in matches):
            matches.append(ScreeningMatch(
                name=name, source=lookup_source(name),
                confidence=score, match_type="token"
            ))

    # Deduplicate by entity, keep highest confidence
    matches = deduplicate(matches)
    matches.sort(key=lambda m: m.confidence, reverse=True)
    return matches[:10]

This pipeline catches the full spectrum of name variations:

  • Exact match handles the fast path — normalized string equality. O(1) with proper indexing.
  • Jaro-Winkler catches typos and minor spelling variations ("Vladmir" to "Vladimir").
  • Double Metaphone catches phonetic variants across transliteration systems ("Gaddafi" to "Qadhafi").
  • TF-IDF handles name reordering, missing components, and suppresses common-name false positives through IDF weighting.

Why you should not build this yourself

The code above is a reasonable prototype. But the distance between "working prototype" and "production sanctions screening system" is enormous. Here is what the prototype does not handle:

Data ingestion and maintenance. A production system screens against 20+ sanctions and watchlists: OFAC SDN, OFAC Consolidated, UN Security Council, EU Consolidated, UK HM Treasury, Australia DFAT, Canada OSFI, and a dozen more. Each list has its own data format, update schedule, and edge cases. The OFAC SDN list alone updates multiple times per week. You need automated ingestion pipelines that handle schema changes, parse XML/CSV/JSON, normalize entity data, and deduplicate across lists. Verifex maintains 972,000+ entities across all lists with daily automated syncs.

Cross-script name handling. Arabic names have dozens of valid transliterations. Chinese names can be romanized via Pinyin, Wade-Giles, or other systems. Russian names have different romanization conventions in different countries. Korean names can be Kim, Gim, or Ghim depending on the romanization standard. Handling all of these requires language-specific transliteration tables, not just generic phonetic algorithms.

Performance at scale. The naive pipeline above iterates over every entry in the database for every query. With 30,000+ entities, each carrying multiple aliases, that is easily hundreds of thousands of string comparisons per screen. At 100 screens per second you cannot afford to touch every entry at all: you need pre-computed indexes, blocking strategies that narrow each query to a small candidate set, and careful algorithm selection. The Verifex API achieves sub-50ms average response times including network latency.
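To make that concrete, the standard fix is blocking: at index time, bucket every entry under one or more cheap keys, then compare a query only against its bucket. A toy sketch — the key function here is deliberately simplistic and hypothetical; real systems block on phonetic codes or n-gram signatures:

```python
from collections import defaultdict

def build_block_index(names: list[str], key) -> dict[str, list[str]]:
    """Bucket each entry under a cheap key computed at index time."""
    index = defaultdict(list)
    for name in names:
        index[key(name)].append(name)
    return index

# Hypothetical toy key: first letter of the first token plus a
# coarse length bucket, so near-spellings land in the same bucket
def block_key(name: str) -> str:
    token = name.split()[0].lower()
    return token[0] + str(len(token) // 3)

entries = ["Vladimir Putin", "Viktor Bout", "Kim Jong Un"]
index = build_block_index(entries, block_key)

# A query now scans only its own bucket, not the whole database:
candidates = index[block_key("Vladmir Putin")]
# => ["Vladimir Putin", "Viktor Bout"]  (the "Kim" bucket is never touched)
```

The trade-off is recall: a key that is too aggressive can place true matches in different buckets, which is why production systems use several blocking keys in parallel.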

False positive tuning. The thresholds in the prototype (0.85 for Jaro-Winkler, 0.5 for TF-IDF) are arbitrary. In production, these need to be tuned against labeled data, different for person vs. organization entities, and adjusted for name length and script. Read our technical deep-dive on fuzzy matching for more on how confidence scoring works in practice.
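Tuning starts with labeled pairs — candidate scores a human has marked as match or non-match — and a sweep over thresholds to see the precision/recall trade-off at each cutoff. A minimal stdlib-only sketch with made-up sample data:

```python
def precision_recall_at(pairs: list[tuple[float, bool]], threshold: float):
    """pairs: (similarity_score, human_labeled_is_match) tuples."""
    predicted = [(score >= threshold, label) for score, label in pairs]
    tp = sum(1 for p, l in predicted if p and l)
    fp = sum(1 for p, l in predicted if p and not l)
    fn = sum(1 for p, l in predicted if not p and l)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Illustrative labeled sample (real tuning needs thousands of pairs)
sample = [(0.95, True), (0.88, True), (0.86, False), (0.81, True), (0.70, False)]

for t in (0.80, 0.85, 0.90):
    p, r = precision_recall_at(sample, t)
    print(f"threshold {t:.2f}: precision={p:.2f} recall={r:.2f}")
```

Raising the threshold buys precision at the cost of recall; in sanctions screening a missed true match is usually far more costly than an extra analyst review, so thresholds tend to sit lower than this toy data suggests.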

Compliance audit trail. Regulators require a complete audit log of every screen: what was searched, what was matched, what decision was made, and who made it. You need timestamped, immutable logs with the full match context. Our integration guide covers how to implement proper audit logging.

Using the Verifex Python SDK

If you want production-quality sanctions screening without building and maintaining the entire pipeline yourself, the Verifex Python SDK wraps all of the techniques described in this article into a single API call. Install it from PyPI:

bash
pip install verifex

Then screen any name in five lines of code:

python
from verifex import VerifexClient

client = VerifexClient("YOUR_API_KEY")

result = client.screen("Muammar Gaddafi")
print(result.risk_level)   # "critical"

for match in result.matches:
    print(f"{match.name} ({match.source}) - {match.confidence}%")

Behind this call, the Verifex API runs the full multi-stage pipeline: exact matching against pre-computed normalized names, Jaro-Winkler fuzzy matching with prefix blocking, Double Metaphone phonetic matching against indexed phonetic codes, and Soft TF-IDF token matching with IDF weighting calibrated against 972,000+ entities across 20+ sanctions lists.

Here is a more complete example showing how to handle results in a production application:

python
from verifex import VerifexClient

client = VerifexClient("YOUR_API_KEY")

def screen_customer(name: str, customer_id: str) -> dict:
    """Screen a customer and return a compliance decision."""
    result = client.screen(name)

    # Log for audit trail
    log_screening(
        customer_id=customer_id,
        query=name,
        risk_level=result.risk_level,
        match_count=len(result.matches),
        request_id=result.request_id,
    )

    if result.risk_level == "critical":
        return {"approved": False, "action": "block", "reason": "sanctions_match"}
    elif result.risk_level == "high":
        return {"approved": False, "action": "review", "reason": "potential_match"}
    else:
        return {"approved": True, "action": "clear"}

# Screen against all 20+ lists in one call
decision = screen_customer("Vladimir Putin", "cust_12345")
# => {"approved": False, "action": "block", "reason": "sanctions_match"}

The SDK handles retries, rate limiting, and error cases automatically. Response times average under 50ms. The free tier includes 100 screens per month, which is enough for development and low-volume production use.

Summary

Building a fuzzy name matching system for sanctions screening requires layering multiple algorithms. Levenshtein distance catches typos. Jaro-Winkler handles prefix-similar names. Double Metaphone catches phonetic variants across transliteration systems. TF-IDF with IDF weighting handles name reordering and suppresses common-name false positives.

Each technique solves a specific failure mode of the simpler approaches. A production system needs all four running together in a pipeline, plus data ingestion for 20+ sanctions lists, cross-script transliteration support, sub-100ms latency, and a complete audit trail. That is a significant engineering investment to build and maintain.

The Verifex Python SDK gives you all of this in a single pip install. Five lines of code, sub-50ms response times, 972,000+ entities, 20+ sanctions lists. The free tier takes five minutes to set up. Sometimes the best code is the code you do not have to write.

Get started with Verifex

Screen against OFAC, UN, EU & UK sanctions lists in one API call. Free tier available.
