Entity Extraction
Automatically extract and analyze people, organizations, money amounts, and other key entities from your case documents.
What is Entity Extraction?
Entity extraction uses AI to automatically identify and categorize important information in your documents:
| Entity Type | Examples |
|---|---|
| People | John Smith, Dr. Sarah Martinez, Michael Chen |
| Organizations | Acme Corp, Global Holdings Inc, Department of Justice |
| Money | $500,000, $1.5 million, 125,000 shares |
| Dates | August 7, 2024, March 2024 |
| Emails | john@company.com, contact@acme.com |
| Phones | (555) 123-4567 |
How It Works
BillionLens uses grounded extraction with citation verification. The process runs in six stages:
- Document Search - AI searches all indexed documents
- Entity Detection - Identifies entity mentions with citations
- Verification - Each entity is verified against source documents
- Description Generation - AI creates contextual descriptions
- Alias Detection - Groups name variations (e.g., "Mike" = "Michael Chen")
- Deduplication - Merges duplicate entities intelligently
Every extracted entity must be verified against the source documents before it is kept. This guards against hallucinated entities and supports the accuracy that e-discovery requires.
Running Entity Extraction
Extract Entities
- Navigate to your case
- Click the Entities card
- Click the Extract Entities button
- Wait for extraction to complete
- View extracted entities organized by type
Progress Tracking
During extraction, you'll see:
- Progress percentage
- Current extraction phase
- Entity counts as they're discovered
Viewing Entities
Entity Tabs
Entities are organized by type:
| Tab | Description |
|---|---|
| All | All entity types combined |
| People | Person names |
| Organizations | Companies, agencies, institutions |
| Money | Dollar amounts and financial values |
| Dates | Important dates |
| Emails | Email addresses |
Entity Details
Click the Context link on any entity to see:
- Description - AI-generated explanation of the entity's role
- Also Known As - Alias names (for people)
- Source Documents - Which files mention this entity
Example Entity
John Smith (Person)
- Description: "John Smith is the Chief Technology Officer at Acme Corp, responsible for overseeing product development and engineering teams."
- Also Known As: J. Smith, Johnny
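If you post-process entities outside the app, the example above could be modeled as a simple record. This is a minimal sketch: the `Entity` class, the `EntityType` enum, and all field names are hypothetical and are not BillionLens's actual data model or export format.

```python
from dataclasses import dataclass, field
from enum import Enum


class EntityType(Enum):
    # Labels mirror the entity type table at the top of this page.
    PERSON = "People"
    ORGANIZATION = "Organizations"
    MONEY = "Money"
    DATE = "Dates"
    EMAIL = "Emails"
    PHONE = "Phones"


@dataclass
class Entity:
    # Hypothetical record shape, for illustration only.
    name: str
    entity_type: EntityType
    description: str = ""
    also_known_as: list[str] = field(default_factory=list)
    source_documents: list[str] = field(default_factory=list)


john = Entity(
    name="John Smith",
    entity_type=EntityType.PERSON,
    description=(
        "John Smith is the Chief Technology Officer at Acme Corp, responsible "
        "for overseeing product development and engineering teams."
    ),
    also_known_as=["J. Smith", "Johnny"],
)
```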
Alias Detection & Deduplication
Automatic Alias Detection
BillionLens automatically detects when different names refer to the same person:
Examples:
- "John Smith" = "Johnny Smith" = "J. Smith"
- "Michael Chen" = "Mike Chen" = "M. Chen"
- "Dr. Martinez" = "Sarah Martinez" = "S. Martinez"
How It Works
The AI uses several strategies:
- Middle names - "John Michael Smith" matches "Michael Smith"
- Nicknames - "Mike" is recognized as a nickname for "Michael"
- Abbreviations - "J. Smith" matches "John Smith"
- Typos - "Jhon" matches "John"
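To make the four strategies concrete, here is a rough rule-based approximation in Python. It is not how BillionLens works internally (the product uses AI for this step); the `NICKNAMES` table, the function names, and the 0.7 similarity threshold are all assumptions chosen for illustration. The sketch also ignores titles such as "Dr." and will produce false positives on similar-looking names.

```python
from difflib import SequenceMatcher

# Tiny illustrative nickname table; real coverage would need to be far larger.
NICKNAMES = {"mike": "michael", "johnny": "john"}


def normalize(token: str) -> str:
    """Lowercase, drop trailing punctuation, and expand known nicknames."""
    token = token.lower().strip(".,")
    return NICKNAMES.get(token, token)


def tokens(name: str) -> list[str]:
    return [t for t in (normalize(p) for p in name.split()) if t]


def given_names_match(a: str, b: str) -> bool:
    if a == b:
        return True
    # Abbreviations: a single initial matches a name with the same first letter.
    if len(a) == 1 or len(b) == 1:
        return a[0] == b[0]
    # Typos: tolerate small spelling differences (threshold is an assumption).
    return SequenceMatcher(None, a, b).ratio() >= 0.7


def likely_same_person(name_a: str, name_b: str) -> bool:
    a, b = tokens(name_a), tokens(name_b)
    if not a or not b or a[-1] != b[-1]:
        return False  # surnames must agree in this sketch
    given_a, given_b = a[:-1], b[:-1]
    if not given_a or not given_b:
        return False  # a surname alone is too weak a signal
    # Middle names: any given name on one side may match any on the other.
    return any(given_names_match(x, y) for x in given_a for y in given_b)


print(likely_same_person("John Michael Smith", "Michael Smith"))
print(likely_same_person("J. Smith", "John Smith"))
print(likely_same_person("John Smith", "Sarah Smith"))
```

Requiring the surname to match before comparing given names keeps the fuzzy typo check from linking unrelated people who merely share a first name.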
Merged Entities
When aliases are detected:
- One canonical name is chosen (usually the most complete)
- Other variations are stored as "Also Known As"
- All document references are preserved
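The merge behavior described above can be sketched as follows. The `merge_aliases` function, the output keys, and the file names are hypothetical, and "most complete" is approximated here as "most words, then most characters", which is an assumption rather than the product's actual rule.

```python
def merge_aliases(mentions: dict[str, set[str]]) -> dict:
    """Merge name variants already judged to refer to the same entity.

    `mentions` maps each name variant to the set of documents mentioning it.
    """
    # Approximate "most complete" as most words, then most characters.
    canonical = max(mentions, key=lambda n: (len(n.split()), len(n)))
    return {
        "name": canonical,
        "also_known_as": sorted(n for n in mentions if n != canonical),
        # Union of all references, so no document link is lost in the merge.
        "source_documents": sorted(set().union(*mentions.values())),
    }


merged = merge_aliases({
    "J. Smith": {"contract.pdf"},
    "John Smith": {"contract.pdf", "email_014.eml"},
    "Johnny": {"chat_log.txt"},
})
print(merged["name"])           # John Smith
print(merged["also_known_as"])  # ['J. Smith', 'Johnny']
```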
Entity Descriptions
AI-Generated Descriptions
Each entity gets a contextual description based on document content:
Good description:
"Michael Chen holds the roles of COO/Representative and Chief Executive Officer, and is also listed as a founder and initial member of the board of directors."
Not just quotes:
Descriptions explain the entity's role and significance, not just raw quotes from documents.
Description Sources
Descriptions are generated by:
- Finding all mentions of the entity in documents
- Understanding the context of each mention
- Synthesizing a concise explanation
- Verifying against source citations
Searching Entities
Filter by Type
Click the entity type tabs to filter:
- View only People
- View only Organizations
- View only Money amounts
- View only Dates or Emails
Search Box
Use the search box to find specific entities:
- Search for "Smith" to find all Smith entities
- Search for company names
- Search for specific amounts
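If you work with an entity list outside the app, the same tab filter and search box behavior can be approximated in a few lines. This sketch assumes entities are plain dictionaries with `name` and `type` keys and that search is a case-insensitive substring match; both are assumptions, not documented BillionLens behavior.

```python
def search_entities(entities, query="", entity_type=None):
    """Filter by type (like a tab), then by case-insensitive substring."""
    q = query.casefold()
    return [
        e for e in entities
        if (entity_type is None or e["type"] == entity_type)
        and q in e["name"].casefold()
    ]


entities = [
    {"name": "John Smith", "type": "People"},
    {"name": "Dr. Sarah Martinez", "type": "People"},
    {"name": "Acme Corp", "type": "Organizations"},
    {"name": "$500,000", "type": "Money"},
]

print(search_entities(entities, "smith"))
print(search_entities(entities, entity_type="People"))
print(search_entities(entities, "500,000", entity_type="Money"))
```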
Use Cases
Building Witness Lists
Extract PERSON entities to identify:
- Potential witnesses
- Parties involved
- Key decision makers
- People who signed documents
Identifying Key Players
Find ORGANIZATION entities to see:
- Companies involved
- Law firms
- Government agencies
- Banks and financial institutions
Financial Analysis
Extract MONEY entities to:
- Track investment amounts
- Identify financial transactions
- Find payment records
- Understand ownership stakes
Contact Discovery
Find EMAIL entities to:
- Build contact lists
- Identify key communicators
- Track correspondence patterns
Accuracy & Verification
Citation-Required Extraction
Unlike simple NER (Named Entity Recognition), BillionLens uses grounded extraction:
| Traditional NER | BillionLens Grounded Extraction |
|---|---|
| Pattern matching | AI with document search |
| May return unverified matches | Citation-required |
| No context | Full document context |
| No descriptions | AI-generated descriptions |
Verification Process
Every entity must pass verification:
- Entity name must appear in source documents
- Citation must be retrievable
- Description must be grounded in document text
Rejected Entities
Entities that can't be verified are rejected and logged for review. This:
- Prevents hallucinated entities from appearing in results
- Supports e-discovery accuracy
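The verify-then-reject flow can be illustrated with a small sketch. Everything here is hypothetical: the `Candidate` record, the `verify` and `split_verified` functions, and the exact-substring checks are stand-ins for whatever BillionLens does internally. The sketch covers the first two checks (name present, citation retrievable) and leaves out description grounding.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str    # extracted entity name
    doc_id: str  # document the citation points to
    quote: str   # cited passage that should contain the name


def verify(candidate: Candidate, documents: dict[str, str]) -> tuple[bool, str]:
    """Return (passed, reason) for a single candidate entity."""
    text = documents.get(candidate.doc_id)
    if text is None:
        return False, "cited document not found"
    if candidate.quote not in text:
        return False, "citation not retrievable in document"
    if candidate.name.casefold() not in candidate.quote.casefold():
        return False, "entity name not present in cited text"
    return True, "verified"


def split_verified(candidates, documents):
    """Keep verified candidates; collect the rest with a reason for review."""
    accepted, rejected = [], []
    for c in candidates:
        ok, reason = verify(c, documents)
        if ok:
            accepted.append(c)
        else:
            rejected.append((c, reason))
    return accepted, rejected


documents = {"doc-1": "Acme Corp agreed to pay $500,000 to Global Holdings Inc."}
candidates = [
    Candidate("Acme Corp", "doc-1", "Acme Corp agreed to pay $500,000"),
    Candidate("Michael Chen", "doc-1", "Michael Chen signed the agreement"),
]
accepted, rejected = split_verified(candidates, documents)
print([c.name for c in accepted])
print([(c.name, reason) for c, reason in rejected])
```

Returning a reason alongside each rejection is what makes a review log useful: a reviewer can tell a missing document apart from a quote that does not appear in it.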
Best Practices
Review Important Entities
- Manually verify critical entity extractions
- Cross-reference with source documents
- Check for missing entities
Use Descriptions
- Read AI-generated descriptions for context
- Descriptions explain relationships and roles
- Descriptions are more useful than names alone
Combine with Chat
- Use entity extraction to find candidates
- Use AI chat to explore relationships
- Ask follow-up questions about specific entities
Re-extract When Needed
- Re-run extraction after you add new documents
- Previous entities are cleared before re-extraction
- Clearing first keeps the entity list consistent with the current document set
Next: Learn about Timeline Visualization.