Tutorial
April 7, 2026 · 12 min read

How to Build Fuzzy Matching for Sanctions Screening in Python

Sanctions screening has a name matching problem. The OFAC SDN list spells Libya's former leader as "AL-QADHAFI, Muammar." Your customer database might have "Moammar Gaddafi." The Library of Congress uses "Qaddafi." Wikipedia uses "Gaddafi." Researchers have documented 112 distinct English spellings of this single name. Exact string matching catches exactly one of them.

This article walks through building a fuzzy name matching system for sanctions screening in Python, from naive string comparison to a multi-algorithm pipeline that handles transliterations, name reordering, and the common-name false positive problem. Every step includes working code you can run locally. At the end, we will show how to skip all of this and use the Verifex Python SDK to get production-quality screening in five lines of code.

Why naive string matching fails

The simplest approach is exact comparison. Normalize both strings to lowercase, strip whitespace, and check equality. Here is what that looks like:

python
def exact_match(query: str, target: str) -> bool:
    return query.strip().lower() == target.strip().lower()

# These all return False:
exact_match("Muammar Gaddafi", "Muammar al-Qadhafi")   # transliteration
exact_match("Vladimir Putin", "Putin, Vladimir")        # name reordering
exact_match("Jose Garcia", "José García")               # diacritics
exact_match("Ossama bin Laden", "Usama bin Ladin")      # phonetic variant

Every single comparison fails, and every single pair refers to the same person. This is the fundamental problem. Sanctions data comes from government agencies that transliterate names from Arabic, Cyrillic, Chinese, and dozens of other scripts. Customer data comes from user input forms, passport OCR, and bank wire messages. The same person will be spelled differently in each system.

Exact matching misses four categories of variation that are extremely common in sanctions data:

  • Transliterations. Gaddafi, Qadhafi, Gadhafi, Kaddafi — all valid romanizations of the Arabic original.
  • Name reordering. OFAC lists names as "PUTIN, Vladimir Vladimirovich." Customer forms collect "Vladimir Putin."
  • Missing components. A customer might enter "Vladimir Putin" while the sanctions entry includes the patronymic "Vladimirovich."
  • Diacritics and encoding. José vs Jose, Müller vs Mueller, Çelik vs Celik.

We need something better. Let us build up, one technique at a time.
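The first layer is not an algorithm at all: normalize both names before any comparison. At minimum that means case-folding, stripping diacritics, and collapsing whitespace. Here is a minimal stdlib-only sketch — note the caveat that generic Unicode folding maps "Müller" to "muller", not the German convention "mueller"; language-specific folding needs its own tables:

```python
import unicodedata

def normalize(name: str) -> str:
    """Lowercase, strip diacritics, collapse whitespace."""
    # NFKD splits accented characters like 'é' into 'e' plus a
    # combining accent mark; we then drop the combining marks
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.lower().split())

normalize("José García")  # => "jose garcia"
normalize("  Çelik ")     # => "celik"
```

With this in place, the diacritics category disappears entirely, and every later technique operates on clean, comparable strings.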

Step 1: Levenshtein distance

Levenshtein distance counts the minimum number of single-character edits — insertions, deletions, or substitutions — needed to transform one string into another. "Gaddafi" to "Gadafi" is distance 1 (delete one "d"). "Putin" to "Puttin" is distance 1 (insert a "t"). We convert this to a similarity ratio between 0 and 1.

python
from difflib import SequenceMatcher

def levenshtein_similarity(a: str, b: str) -> float:
    """Return similarity ratio between 0 and 1.

    Note: difflib's SequenceMatcher uses the Ratcliff/Obershelp
    algorithm, a close cousin of edit distance that is good enough
    for a prototype; a true Levenshtein library comes later.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Typos and minor spelling differences work well:
levenshtein_similarity("Vladimir Putin", "Vladmir Putin")
# => 0.96  (one missing character)

levenshtein_similarity("Muhammad", "Mohammed")
# => 0.75  (two character substitutions)

# But phonetic variants fall below typical match thresholds:
levenshtein_similarity("Gaddafi", "Qadhafi")
# => 0.71  (the two substitutions are exactly the phonetic ones)

levenshtein_similarity("Ossama", "Usama")
# => 0.73  (same person, borderline score)

Levenshtein works well for typos and minor spelling differences. A missing letter or a swapped character produces a high similarity score. But it under-scores phonetically similar names that are spelled differently. "Gaddafi" and "Qadhafi" differ in exactly the characters a transliteration swaps — "G" vs "Q" and "dd" vs "dh" — so the score lands around 0.7, below any sensible match threshold, even though any English speaker would recognize them as the same name.
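For reference, the true edit distance the section opened with (which difflib only approximates) is a short dynamic-programming recurrence. A minimal stdlib-only sketch, using the standard two-row optimization:

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]

levenshtein_distance("Gaddafi", "Gadafi")   # => 1
levenshtein_distance("Gaddafi", "Qadhafi")  # => 2
```

This is O(len(a) × len(b)) per pair, which is exactly why production systems avoid running it against every entry (more on that later).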

For a faster implementation on large datasets, use the python-Levenshtein library, which is implemented in C:

python
# pip install python-Levenshtein
import Levenshtein

distance = Levenshtein.distance("Gaddafi", "Gadafi")  # => 1
ratio = Levenshtein.ratio("Vladimir Putin", "Vladmir Putin")  # => 0.96

Levenshtein is a good first layer, but we need something that understands pronunciation.

Step 2: Jaro-Winkler similarity

Jaro-Winkler improves on basic edit distance by giving extra weight to matching characters at the beginning of the string. The intuition is that names with the same prefix are more likely to be the same person. It was originally designed for census record linkage, which is essentially the same problem as sanctions screening: matching names across imperfect records.

python
# pip install jellyfish
import jellyfish

def jaro_winkler(a: str, b: str) -> float:
    return jellyfish.jaro_winkler_similarity(a.lower(), b.lower())

# Better than Levenshtein for prefix matches:
jaro_winkler("Muhammad", "Mohammed")
# => 0.85  (shared "M" prefix gets a bonus)

jaro_winkler("Vladimir Putin", "Vladmir Putin")
# => 0.92  (long shared prefix, one typo)

jaro_winkler("Sberbank", "Sberbank of Russia")
# => 0.89  (shared prefix with extra tokens)

# Still struggles with different-sounding starts:
jaro_winkler("Gaddafi", "Qadhafi")
# => 0.81  (different first character means no prefix bonus,
#           leaving the score below a typical 0.85 cutoff)
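The "Winkler" part of the name is a one-line adjustment on top of the base Jaro score: final = jaro + ℓ · p · (1 − jaro), where ℓ is the length of the common prefix (capped at four characters) and p is a scaling factor, conventionally 0.1. A sketch of that standard formula:

```python
def winkler_boost(jaro: float, prefix_len: int, p: float = 0.1) -> float:
    """Apply the Winkler prefix boost: jaro + l * p * (1 - jaro)."""
    return jaro + min(prefix_len, 4) * p * (1 - jaro)

winkler_boost(0.80, 0)  # => 0.80 (no shared prefix, no boost)
winkler_boost(0.80, 4)  # => 0.88 (maximum four-character boost)
```

The boost can only ever raise a score, and only when the first characters agree — which is the behavior the examples above demonstrate.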

Jaro-Winkler gives us better results than Levenshtein for names that share a common prefix, which is surprisingly common in sanctions data where transliterations often preserve the first few characters. But when the first character itself is different — "G" vs "Q" for Gaddafi/Qadhafi, "O" vs "U" for Ossama/Usama — there is no prefix bonus to earn, and the score stays stuck below threshold. We need an approach that compares sounds, not characters.

Step 3: Phonetic matching with Double Metaphone

Phonetic algorithms encode names by how they sound rather than how they are spelled. The classic Soundex algorithm has been around since 1918, but Double Metaphone (developed by Lawrence Philips in 2000) is far more accurate for the kind of cross-language name matching that sanctions screening requires. It generates two possible phonetic encodings for each name, accounting for ambiguous pronunciations.
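Soundex itself is simple enough to sketch from scratch, and doing so shows exactly why it is too coarse for screening. A stdlib-only implementation of the classic rules (first letter preserved, consonants mapped to digits, duplicates collapsed, h/w transparent):

```python
def soundex(name: str) -> str:
    """Classic Soundex: first letter plus three digits."""
    codes = {}
    for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6")]:
        for ch in letters:
            codes[ch] = digit
    name = name.lower()
    encoded, prev = [], codes.get(name[0], "")
    for ch in name[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            encoded.append(code)
        if ch not in "hw":  # h and w do not separate duplicate codes
            prev = code
    return (name[0].upper() + "".join(encoded) + "000")[:4]

soundex("Robert")   # => "R163"
soundex("Rupert")   # => "R163"  (distinct names collide easily)
soundex("Gaddafi")  # => "G310"
soundex("Qadhafi")  # => "Q310"  (first letter always survives, so no match)
```

Robert and Rupert collide, while Gaddafi and Qadhafi fail to match because Soundex never re-maps the first letter — precisely the weakness Double Metaphone was designed to fix.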

python
# pip install metaphone
from metaphone import doublemetaphone

def phonetic_match(a: str, b: str) -> bool:
    """Check if two names share any phonetic encoding."""
    codes_a = doublemetaphone(a)
    codes_b = doublemetaphone(b)
    # Compare all combinations of primary/alternate codes
    for code_a in codes_a:
        for code_b in codes_b:
            if code_a and code_b and code_a == code_b:
                return True
    return False

# Now the transliterations match:
doublemetaphone("Gaddafi")   # => ('KTF', '')
doublemetaphone("Qadhafi")   # => ('KTF', '')
phonetic_match("Gaddafi", "Qadhafi")  # => True!

doublemetaphone("Ossama")    # => ('ASM', '')
doublemetaphone("Usama")     # => ('ASM', '')
phonetic_match("Ossama", "Usama")     # => True!

# Works for European variants too:
phonetic_match("Schmidt", "Smith")    # => True
phonetic_match("Mueller", "Miller")   # => True

Double Metaphone solves the transliteration problem that Levenshtein and Jaro-Winkler cannot handle. "Gaddafi" and "Qadhafi" both encode to "KTF" because the algorithm knows that "G" and hard "Q" produce the same consonant sound, and "dd" and "dh" are equivalent phonetically.

The limitation of phonetic matching is that it is binary — two names either share a code or they do not. There is no confidence score. And phonetic codes are coarse, which means unrelated names sometimes share the same code (false positives). We need to combine phonetic matching with the similarity scores from the earlier steps to get reliable confidence values.

Step 4: Token-based matching with TF-IDF

All the approaches so far compare full name strings. But sanctions screening involves names with different orderings ("Putin, Vladimir" vs "Vladimir Putin"), missing components ("Vladimir Putin" vs "Vladimir Vladimirovich Putin"), and extra tokens ("Sberbank" vs "PUBLIC JOINT STOCK COMPANY SBERBANK OF RUSSIA"). Token-based matching handles all of these by breaking names into individual words and comparing them independently.

TF-IDF (Term Frequency-Inverse Document Frequency) goes further by weighting each token by how rare it is in the dataset. Common tokens like "Mohammed" or "Corporation" get low weights. Rare tokens like "Qadhafi" or "Sberbank" get high weights.

python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

# Simulated sanctions list
sanctions_names = [
    "Vladimir Vladimirovich Putin",
    "Muammar al-Qadhafi",
    "Mohammed Ali al-Houthi",
    "Mohammed Ibrahim Hassan",
    "Kim Jong Un",
    "Sberbank of Russia",
    "Ali Khamenei",
    "Mohammed bin Salman",
]

# Build TF-IDF matrix from sanctions list
vectorizer = TfidfVectorizer(analyzer="word", lowercase=True)
tfidf_matrix = vectorizer.fit_transform(sanctions_names)

def tfidf_screen(query: str, top_k: int = 3) -> list:
    """Screen a query name against sanctions list using TF-IDF."""
    query_vec = vectorizer.transform([query.lower()])
    scores = cosine_similarity(query_vec, tfidf_matrix).flatten()
    ranked = np.argsort(scores)[::-1][:top_k]
    return [(sanctions_names[i], round(scores[i], 3)) for i in ranked if scores[i] > 0]

# Name reordering works perfectly:
tfidf_screen("Putin Vladimir")
# => [("Vladimir Vladimirovich Putin", 0.816)]

# Extra tokens handled gracefully:
tfidf_screen("Sberbank")
# => [("Sberbank of Russia", 0.577)]

# Partial names still match:
tfidf_screen("al-Qadhafi")
# => [("Muammar al-Qadhafi", 0.794), ("Mohammed Ali al-Houthi", 0.315)]

TF-IDF with cosine similarity solves the name reordering problem completely. "Putin Vladimir" matches "Vladimir Vladimirovich Putin" because cosine similarity does not care about token order — it only cares about which tokens are present and how important they are.

The IDF problem: why "Mohammed Smith" matches everything

Here is where TF-IDF earns its keep. Without IDF weighting, a query like "Mohammed Ali Hassan" would match nearly every entry containing "Mohammed" or "Ali." In a real sanctions database, "Mohammed" appears in thousands of entries. "Ali" appears in hundreds more. Without weighting, any query with these tokens generates a wall of false positives.

IDF solves this mathematically. The weight of each token is:

python
import math

total_entries = 30_000  # entries in sanctions database

# IDF = log(total_entries / entries_containing_token)
idf_mohammed = math.log(total_entries / 2400)   # => 2.53 (very common)
idf_ali      = math.log(total_entries / 1800)   # => 2.81 (very common)
idf_qadhafi  = math.log(total_entries / 3)      # => 9.21 (extremely rare)
idf_putin    = math.log(total_entries / 2)      # => 9.62 (extremely rare)

# A match on "Qadhafi" is worth ~3.6x more than a match on "Mohammed"
# This is why "Mohammed Ali Hassan" does NOT match "Mohammed Ali al-Houthi"
# at high confidence: the distinctive token "Hassan" vs "al-Houthi" differs,
# and the common tokens "Mohammed" and "Ali" contribute very little.

Let us see this in action with our TF-IDF screener:

python
# Screen a common name against our sample list
results = tfidf_screen("Mohammed Ali Hassan")

# Without IDF, "Mohammed" and "Ali" would push several entries to
# high confidence. With IDF, those common tokens carry little weight,
# so the ranking is driven by the rarer token "Hassan" and nothing
# comes close to exact-match territory:
# => [("Mohammed Ibrahim Hassan", 0.643),
#     ("Mohammed Ali al-Houthi", 0.48),
#     ("Mohammed bin Salman", 0.221)]
# (An eight-name sample understates the effect; against a real
# 30,000-entry list, "Mohammed" and "Ali" are down-weighted far
# more aggressively.)

# Compare with a distinctive name:
results = tfidf_screen("Muammar Gaddafi")
# => [("Muammar al-Qadhafi", 0.608)]
# "Gaddafi" never matches "Qadhafi" at the string level, so only the
# shared token "Muammar" contributes. Combining this signal with
# phonetic matching (Step 3) is what catches the equivalence.

This is the single most important insight for building a sanctions screening system: not all name tokens are created equal. A match on "Qadhafi" is overwhelmingly more meaningful than a match on "Mohammed." IDF weighting encodes this mathematical reality into your scoring. Without it, your system will drown compliance analysts in false alerts for anyone with a common name.

Putting it all together: a multi-stage pipeline

A production sanctions screening system runs all four techniques in sequence. Each stage catches different types of name variations, and the results are merged into a single ranked list with confidence scores. Here is the architecture:

python
from dataclasses import dataclass

@dataclass
class ScreeningMatch:
    name: str
    source: str           # "OFAC SDN", "UN", "EU", etc.
    confidence: float     # 0.0 to 1.0
    match_type: str       # "exact", "fuzzy", "phonetic", "token"

def screen_name(query: str, sanctions_db: list) -> list[ScreeningMatch]:
    """Run the four stages in order of cost and confidence.

    Relies on the helpers built in Steps 1-4, plus normalize(),
    lookup_source(), and deduplicate(), which are left as simple
    exercises (normalization at minimum lowercases and strips
    diacritics).
    """
    matches = []

    for entry in sanctions_db:
        # Stage 1: Exact match (fastest, highest confidence)
        if normalize(query) == normalize(entry.name):
            matches.append(ScreeningMatch(
                name=entry.name, source=entry.source,
                confidence=1.0, match_type="exact"
            ))
            continue

        # Stage 2: Jaro-Winkler fuzzy match
        jw_score = jaro_winkler(query, entry.name)
        if jw_score > 0.85:
            matches.append(ScreeningMatch(
                name=entry.name, source=entry.source,
                confidence=jw_score, match_type="fuzzy"
            ))
            continue

        # Stage 3: Phonetic match with Double Metaphone
        if phonetic_match(query, entry.name):
            # Use Levenshtein as secondary score
            lev_score = levenshtein_similarity(query, entry.name)
            matches.append(ScreeningMatch(
                name=entry.name, source=entry.source,
                confidence=max(0.65, lev_score), match_type="phonetic"
            ))
            continue

    # Stage 4: TF-IDF token matching (handles reordering, missing tokens)
    tfidf_results = tfidf_screen(query)
    for name, score in tfidf_results:
        if score > 0.5 and not any(m.name == name for m in matches):
            matches.append(ScreeningMatch(
                name=name, source=lookup_source(name),
                confidence=score, match_type="token"
            ))

    # Deduplicate by entity, keep highest confidence
    matches = deduplicate(matches)
    matches.sort(key=lambda m: m.confidence, reverse=True)
    return matches[:10]

This pipeline catches the full spectrum of name variations:

  • Exact match handles the fast path — normalized string equality. O(1) with proper indexing.
  • Jaro-Winkler catches typos and minor spelling variations ("Vladmir" to "Vladimir").
  • Double Metaphone catches phonetic variants across transliteration systems ("Gaddafi" to "Qadhafi").
  • TF-IDF handles name reordering, missing components, and suppresses common-name false positives through IDF weighting.

Why you should not build this yourself

The code above is a reasonable prototype. But the distance between "working prototype" and "production sanctions screening system" is enormous. Here is what the prototype does not handle:

Data ingestion and maintenance. A production system screens against 20+ sanctions and watchlists: OFAC SDN, OFAC Consolidated, UN Security Council, EU Consolidated, UK HM Treasury, Australia DFAT, Canada OSFI, and a dozen more. Each list has its own data format, update schedule, and edge cases. The OFAC SDN list alone updates multiple times per week. You need automated ingestion pipelines that handle schema changes, parse XML/CSV/JSON, normalize entity data, and deduplicate across lists. Verifex maintains 972,000+ entities across all lists with daily automated syncs.

Cross-script name handling. Arabic names have dozens of valid transliterations. Chinese names can be romanized via Pinyin, Wade-Giles, or other systems. Russian names have different romanization conventions in different countries. Korean names can be Kim, Gim, or Ghim depending on the romanization standard. Handling all of these requires language-specific transliteration tables, not just generic phonetic algorithms.

Performance at scale. The naive pipeline above iterates over every entry in the database for every query. With 30,000+ entities, each carrying multiple aliases, that is easily hundreds of thousands of string comparisons per screen. At 100 screens per second you cannot afford to touch every entry at all: you need pre-computed indexes, blocking strategies that narrow each query to a small candidate set, and careful algorithm selection. The Verifex API achieves sub-50ms average response times including network latency.
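To make that concrete, the standard fix is blocking: at index time, bucket every entry under one or more cheap keys, then compare a query only against its bucket. A toy sketch — the key function here is deliberately simplistic and hypothetical; real systems block on phonetic codes or n-gram signatures:

```python
from collections import defaultdict

def build_block_index(names: list[str], key) -> dict[str, list[str]]:
    """Bucket each entry under a cheap key computed at index time."""
    index = defaultdict(list)
    for name in names:
        index[key(name)].append(name)
    return index

# Hypothetical toy key: first letter of the first token plus a
# coarse length bucket, so near-spellings land in the same bucket
def block_key(name: str) -> str:
    token = name.split()[0].lower()
    return token[0] + str(len(token) // 3)

entries = ["Vladimir Putin", "Viktor Bout", "Kim Jong Un"]
index = build_block_index(entries, block_key)

# A query now scans only its own bucket, not the whole database:
candidates = index[block_key("Vladmir Putin")]
# => ["Vladimir Putin", "Viktor Bout"]  (the "Kim" bucket is never touched)
```

The trade-off is recall: a key that is too aggressive can place true matches in different buckets, which is why production systems use several blocking keys in parallel.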

False positive tuning. The thresholds in the prototype (0.85 for Jaro-Winkler, 0.5 for TF-IDF) are arbitrary. In production, these need to be tuned against labeled data, different for person vs. organization entities, and adjusted for name length and script. Read our technical deep-dive on fuzzy matching for more on how confidence scoring works in practice.
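Tuning starts with labeled pairs — candidate scores a human has marked as match or non-match — and a sweep over thresholds to see the precision/recall trade-off at each cutoff. A minimal stdlib-only sketch with made-up sample data:

```python
def precision_recall_at(pairs: list[tuple[float, bool]], threshold: float):
    """pairs: (similarity_score, human_labeled_is_match) tuples."""
    predicted = [(score >= threshold, label) for score, label in pairs]
    tp = sum(1 for p, l in predicted if p and l)
    fp = sum(1 for p, l in predicted if p and not l)
    fn = sum(1 for p, l in predicted if not p and l)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Illustrative labeled sample (real tuning needs thousands of pairs)
sample = [(0.95, True), (0.88, True), (0.86, False), (0.81, True), (0.70, False)]

for t in (0.80, 0.85, 0.90):
    p, r = precision_recall_at(sample, t)
    print(f"threshold {t:.2f}: precision={p:.2f} recall={r:.2f}")
```

Raising the threshold buys precision at the cost of recall; in sanctions screening a missed true match is usually far more costly than an extra analyst review, so thresholds tend to sit lower than this toy data suggests.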

Compliance audit trail. Regulators require a complete audit log of every screen: what was searched, what was matched, what decision was made, and who made it. You need timestamped, immutable logs with the full match context. Our integration guide covers how to implement proper audit logging.

Using the Verifex Python SDK

If you want production-quality sanctions screening without building and maintaining the entire pipeline yourself, the Verifex Python SDK wraps all of the techniques described in this article into a single API call. Install it from PyPI:

bash
pip install verifex

Then screen any name in five lines of code:

python
from verifex import VerifexClient

client = VerifexClient("YOUR_API_KEY")

result = client.screen("Muammar Gaddafi")
print(result.risk_level)   # "critical"

for match in result.matches:
    print(f"{match.name} ({match.source}) - {match.confidence}%")

Behind this call, the Verifex API runs the full multi-stage pipeline: exact matching against pre-computed normalized names, Jaro-Winkler fuzzy matching with prefix blocking, Double Metaphone phonetic matching against indexed phonetic codes, and Soft TF-IDF token matching with IDF weighting calibrated against 972,000+ entities across 20+ sanctions lists.

Here is a more complete example showing how to handle results in a production application:

python
from verifex import VerifexClient

client = VerifexClient("YOUR_API_KEY")

def screen_customer(name: str, customer_id: str) -> dict:
    """Screen a customer and return a compliance decision."""
    result = client.screen(name)

    # Log for audit trail
    log_screening(
        customer_id=customer_id,
        query=name,
        risk_level=result.risk_level,
        match_count=len(result.matches),
        request_id=result.request_id,
    )

    if result.risk_level == "critical":
        return {"approved": False, "action": "block", "reason": "sanctions_match"}
    elif result.risk_level == "high":
        return {"approved": False, "action": "review", "reason": "potential_match"}
    else:
        return {"approved": True, "action": "clear"}

# Screen against all 20+ lists in one call
decision = screen_customer("Vladimir Putin", "cust_12345")
# => {"approved": False, "action": "block", "reason": "sanctions_match"}

The SDK handles retries, rate limiting, and error cases automatically. Response times average under 50ms. The free tier includes 100 screens per month, which is enough for development and low-volume production use.

Summary

Building a fuzzy name matching system for sanctions screening requires layering multiple algorithms. Levenshtein distance catches typos. Jaro-Winkler handles prefix-similar names. Double Metaphone catches phonetic variants across transliteration systems. TF-IDF with IDF weighting handles name reordering and suppresses common-name false positives.

Each technique solves a specific failure mode of the simpler approaches. A production system needs all four running together in a pipeline, plus data ingestion for 20+ sanctions lists, cross-script transliteration support, sub-100ms latency, and a complete audit trail. That is a significant engineering investment to build and maintain.

The Verifex Python SDK gives you all of this in a single pip install. Five lines of code, sub-50ms response times, 972,000+ entities, 20+ sanctions lists. The free tier takes five minutes to set up. Sometimes the best code is the code you do not have to write.

Get started with Verifex

Screen against OFAC, UN, EU & UK sanctions lists in one API call. Free tier available.
