Finding and Merging Duplicate Individuals
Find potential duplicate individuals in your tree using smart matching, compare their records side-by-side, and learn best practices for merging them.
How duplicates happen
Duplicate individuals appear in trees for several reasons:
- Multiple imports: Merging data from different sources without linking individuals
- Spelling variations: "John Smith" and "Jon Smyth" entered as separate people
- Incomplete merging: Partially merged trees leaving fragments
- Different name forms: Birth name vs. married name entered separately
- Research iterations: Adding someone tentatively, then finding them again under a different name
How the Duplicate Finder works
GEDminer compares individuals using multiple criteria:
- Name matching: Comparing given names and surnames with fuzzy matching
- Date alignment: Birth and death dates within reasonable ranges
- Place matching: Same or similar locations for events
- Parent matching: Individuals with the same parents are likely duplicates
- Sex consistency: Obvious mismatches are filtered out
Matches are assigned a confidence score based on how many criteria align.
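To make the idea of a criteria-based score concrete, here is a minimal sketch in Python. It is not GEDminer's actual algorithm: the `Person` fields, the weights (40 for names, 20 for birth, 10 for death, 10 for place, 20 for parents), the two-year date tolerance, and the similarity thresholds are all assumptions chosen for illustration. Fuzzy name matching is approximated with the standard library's `difflib.SequenceMatcher`.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Person:
    # Hypothetical record layout, not GEDminer's internal model.
    given: str
    surname: str
    sex: Optional[str] = None
    birth_year: Optional[int] = None
    death_year: Optional[int] = None
    birth_place: Optional[str] = None
    father: Optional[str] = None
    mother: Optional[str] = None


def similarity(a: Optional[str], b: Optional[str]) -> float:
    """Case-insensitive fuzzy similarity in [0, 1]; 0 if either value is missing."""
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def years_align(a: Optional[int], b: Optional[int], tolerance: int = 2) -> bool:
    """True when both years are known and differ by at most `tolerance` years."""
    return a is not None and b is not None and abs(a - b) <= tolerance


def confidence(p: Person, q: Person) -> float:
    """Return an illustrative 0-100 score; the weights are assumptions."""
    # Sex consistency: an obvious mismatch filters the pair out entirely.
    if p.sex and q.sex and p.sex != q.sex:
        return 0.0
    # Name matching: average fuzzy similarity of given name and surname.
    score = 40 * (similarity(p.given, q.given) + similarity(p.surname, q.surname)) / 2
    # Date alignment: birth and death years within the tolerance.
    if years_align(p.birth_year, q.birth_year):
        score += 20
    if years_align(p.death_year, q.death_year):
        score += 10
    # Place matching: same or similar birth place.
    if similarity(p.birth_place, q.birth_place) >= 0.8:
        score += 10
    # Parent matching: each parent's name counts separately.
    if similarity(p.father, q.father) >= 0.85:
        score += 10
    if similarity(p.mother, q.mother) >= 0.85:
        score += 10
    return round(score, 1)


john = Person("John", "Smith", "M", 1850, 1910, "Leeds, Yorkshire", "William Smith", "Ann Brown")
jon = Person("Jon", "Smyth", "M", 1851, 1910, "Leeds, Yorkshire", "William Smith", "Ann Brown")
print(confidence(john, jon))
```

In this sketch, missing data simply contributes nothing, so sparse records score lower than well-documented ones. A real matcher might instead rescale by the criteria that could actually be compared.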
Understanding confidence scores
The confidence percentage indicates match likelihood:
- 90-100%: Very likely duplicates - same name, dates, places, and parents
- 75-89%: Probable duplicates - most data matches with minor variations
- 60-74%: Possible duplicates - some matching, some differences to investigate
- Below 60%: Unlikely - shown for completeness but probably not duplicates
Always verify before merging, especially for common names.
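The bands above can be expressed as a small lookup. This is a sketch only: the function name `confidence_band` and the short labels are hypothetical, and it treats each lower bound as inclusive, which is an assumption about how boundary or fractional scores are handled.

```python
def confidence_band(score: float) -> str:
    """Map a 0-100 confidence percentage to the band described in the list above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 90:
        return "Very likely"
    if score >= 75:
        return "Probable"
    if score >= 60:
        return "Possible"
    return "Unlikely"


for value in (95, 80, 62.5, 40):
    print(value, confidence_band(value))
```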
Comparing records side-by-side
Click any duplicate pair to see a detailed comparison:
- Name: Full name as recorded for each
- Birth/Death dates and places: Exact records from each entry
- Parents: Father and mother names if recorded
- Spouses and children: Connected family members
- Sources: What evidence supports each record
Matching fields are highlighted to help you quickly assess the match.
Merging duplicates
GEDminer is an analysis tool - merging happens in your genealogy software:
- Note the IDs: Record the GEDCOM identifiers for both individuals
- Open your software: Find both records in your usual family tree program
- Review carefully: Check all facts, sources, and family links
- Merge: Use your software's merge function, choosing the best data for each field
- Re-export: Generate a new GEDCOM file and re-analyse it to verify that the pair no longer appears
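If you want to double-check the identifiers you noted, the following sketch lists each individual's ID and first recorded name straight from a GEDCOM file. It relies only on the standard GEDCOM line layout (`0 @I1@ INDI` record headers followed by `1 NAME ...` lines). The function name `individual_names` and the sample data are hypothetical, and the parser is deliberately minimal: it ignores character-encoding issues and continuation lines.

```python
from typing import Dict


def individual_names(gedcom_text: str) -> Dict[str, str]:
    """Return {xref_id: first recorded name} for every INDI record."""
    names: Dict[str, str] = {}
    current = None  # xref ID of the INDI record being read, if any
    for raw in gedcom_text.splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2:
            continue
        level = parts[0]
        if level == "0":
            # A new top-level record starts; only track it if it is an individual.
            if len(parts) == 3 and parts[1].startswith("@") and parts[2].strip() == "INDI":
                current = parts[1]
                names.setdefault(current, "")
            else:
                current = None
        elif level == "1" and current and parts[1] == "NAME" and not names[current]:
            # GEDCOM wraps the surname in slashes, e.g. "John /Smith/".
            names[current] = parts[2].replace("/", "").strip() if len(parts) == 3 else ""
    return names


sample = """0 HEAD
0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE 1850
0 @I2@ INDI
1 NAME Jon /Smyth/
0 @F1@ FAM
1 HUSB @I1@
0 TRLR"""

for xref, name in individual_names(sample).items():
    print(xref, name)
```

Running the same function on the re-exported file gives a quick way to see which IDs and names remain after the merge. Note that some programs renumber IDs on export, so compare names as well as identifiers.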
Avoiding false positives
Not every match is a duplicate:
- Common names: John Smith born 1850 might genuinely appear multiple times
- Siblings: Brothers or sisters given the same name, often because an earlier child died in infancy and the name was reused
- Junior/Senior: Father and son with identical names
- Cousins: Same name, similar dates, same general area
When in doubt, research further before merging. It's easier to merge later than to unmerge incorrectly combined records.
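The situations listed above can be turned into simple warning checks to run before merging. This is a rough sketch under stated assumptions, not a GEDminer feature: the function name, its parameters, and the 15-year generation gap are hypothetical, and record A is assumed to be the earlier-born of the pair.

```python
from typing import List, Optional


def false_positive_warnings(
    birth_a: Optional[int],
    death_a: Optional[int],
    birth_b: Optional[int],
    parents: str = "unknown",  # "same", "different", or "unknown"
    parent_child: bool = False,
    generation_gap: int = 15,  # assumed minimum gap between generations, in years
) -> List[str]:
    """Return reasons why a matching pair might be two different people."""
    warnings: List[str] = []
    if parent_child:
        # Junior/Senior: the records are linked as parent and child.
        warnings.append("junior/senior: one record is the other's parent")
    elif birth_a is not None and birth_b is not None and abs(birth_a - birth_b) >= generation_gap:
        warnings.append("generation gap: birth years are a generation apart")
    if parents == "same" and death_a is not None and birth_b is not None and birth_b >= death_a:
        # Siblings: B was born in or after the year A died, so the name may have been reused.
        warnings.append("sibling: name may have been reused after an early death")
    if parents == "different":
        # Cousins or unrelated namesakes: the recorded parents do not match.
        warnings.append("cousin or namesake: recorded parents differ")
    return warnings


print(false_positive_warnings(1850, 1851, 1853, parents="same"))
```

An empty list does not prove the pair is a duplicate. It only means none of these particular checks fired.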
Dismissing non-duplicates
When you've determined that a pair isn't actually a duplicate:
- Click Disregard: Removes the pair from your active list
- Add notes: In your genealogy software, note why they're different
- Consider documentation: If names are very similar, add explanatory notes to prevent future confusion