Entity Extract & Graph

Overview

Entity Extract & Graph reads one or more evidence files and extracts entities (for example people, organizations, locations, vehicles, and events) and their relationships. You use it to turn unstructured content into structured Records and an optional Link Chart you can review, edit, and save back into a Workspace. Inputs include documents, images, and audio files. Outputs include extracted entities/relationships, saved workspace records, and an exported graph. AI extraction is a starting point and requires human review, especially for relationships.
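
As a rough mental model of the output, entities and relationships can be pictured as two lists, where each relationship points at two entities by name. The sketch below is illustrative only: the class names, fields, and sample values are assumptions made for this example, not the application's actual schema or API. It also shows one simple review aid, flagging relationships whose endpoints are missing from the entity list.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    # Hypothetical fields, chosen for illustration only.
    name: str
    concept: str          # e.g. "person", "organization", "location"
    saved: bool = False   # whether it has been written to a workspace


@dataclass(frozen=True)
class Relationship:
    source: str           # name of the source entity
    target: str           # name of the target entity
    label: str            # inferred relationship label


@dataclass
class Extraction:
    entities: list[Entity] = field(default_factory=list)
    relationships: list[Relationship] = field(default_factory=list)

    def dangling(self) -> list[Relationship]:
        """Relationships that reference an entity not in the entity list."""
        names = {e.name for e in self.entities}
        return [
            r for r in self.relationships
            if r.source not in names or r.target not in names
        ]


# Invented sample data, not real extraction output.
result = Extraction(
    entities=[Entity("Jane Doe", "person"), Entity("Acme Ltd", "organization")],
    relationships=[
        Relationship("Jane Doe", "Acme Ltd", "works for"),
        Relationship("Jane Doe", "Port of Rotterdam", "visited"),
    ],
)
needs_review = result.dangling()
```

Here `needs_review` holds the "visited" relationship, because its target has no matching entity. A check like this is a starting point for review, not a substitute for confirming each relationship against the evidence.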


When to Use This Application

  • You need a first-pass list of people, organizations, places, and identifiers mentioned in a document set.
  • You want to turn a document (or folder of documents) into structured Records you can search and reuse.
  • You want to generate a Link Chart quickly to explore connections without manual data entry.
  • You need to compare evidence across multiple files and identify repeated entities and actions.
  • You want to link extracted entities to existing records to reduce duplicates and improve data quality.

Before You Begin

  • Confirm you have access to the Workspace where the source files live and where you plan to save outputs.
  • Prepare the evidence you want to analyze:
    • Documents (for example PDFs and text files)
    • Images
    • Audio files
  • If you plan to save results, decide which Workspace and folder path you will use for output.

Step-by-Step Walkthrough

Step 1 — Load your documents

Drag and drop content into the dropzone when the app opens. You can load:

  • One or more files
  • A folder (to process all files inside)
  • A workspace (to process workspace-level content)

After files load, the app shows a file list with a status for each item:

  • File loaded and ready to process
  • File has no readable content
  • File could not be read

If a file shows either of the last two statuses, check that the file is correctly indexed. If it is not, use the Force Re-Index Action.

Step 2 — Configure extraction options (optional)

Before extraction, you can set two options to control how the app reads and prioritizes content.

  • Extraction Goal: enter a short description of what you are looking for. This focuses the extraction on your investigative objective.
  • Examples:
    • “Focus on financial transactions.”
    • “Find supply chain actors.”
  • Leave this blank for a general extraction across all entity types.
  • Deep Mode: enable this option when you want a more exhaustive extraction. Deep Mode can surface more entities and relationships, but it can also increase noise. Use it when completeness matters more than precision.
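
The trade-off between completeness and precision can be made concrete with a small calculation. The sketch below assumes you have a hand-checked list of the entities that truly matter in a document and compares two runs against it. The entity names and both result sets are invented for illustration; they are not measurements of the application.

```python
def precision_recall(extracted: set[str], relevant: set[str]) -> tuple[float, float]:
    """Return (precision, recall) of an extraction against a reviewed list."""
    hits = len(extracted & relevant)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical hand-checked list of entities that actually matter.
relevant = {"Jane Doe", "Acme Ltd", "Rotterdam", "IBAN NL00"}

# Hypothetical results: a standard run and a more exhaustive run.
standard_run = {"Jane Doe", "Acme Ltd", "Rotterdam"}
deep_run = {"Jane Doe", "Acme Ltd", "Rotterdam", "IBAN NL00", "Monday", "Invoice"}

standard = precision_recall(standard_run, relevant)
deep = precision_recall(deep_run, relevant)
```

In this made-up case the standard run misses one relevant entity but returns nothing irrelevant, while the exhaustive run finds everything at the cost of two noisy entries to delete during review.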

Step 3 — Run the extraction

Select Extract Entities to start processing. The application:

  • Reads the loaded files
  • Extracts entities and relationships
  • Updates the graph visualization while extraction is running

While extraction runs, you can monitor:

  • A live counter of entities and relationships found
  • The graph updating in real time

To stop early, select Stop. Processing time depends on the number of files and their size.

Step 4 — Review entities

Open the Entities tab to review and refine extracted entities. Entities appear as cards (10 per page). Use filters to focus your review:

  • Filter by concept type (for example Persons only, Vehicles only)
  • Filter by keyword (search across entity data)
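
If you work with an exported entity list outside the app, the same filter-then-page behavior is easy to reproduce. The sketch below is an assumption-laden illustration: `Card`, `filter_cards`, and `page` are hypothetical names, and the keyword match only looks at the entity name, whereas the app searches across entity data.

```python
from dataclasses import dataclass

PAGE_SIZE = 10  # the Entities tab shows 10 cards per page


@dataclass(frozen=True)
class Card:
    # Hypothetical, simplified stand-in for an entity card.
    name: str
    concept: str


def filter_cards(cards, concept=None, keyword=None):
    """Keep cards matching the concept type and, if given, the keyword."""
    kw = keyword.lower() if keyword else None
    return [
        c for c in cards
        if (concept is None or c.concept == concept)
        and (kw is None or kw in c.name.lower())
    ]


def page(cards, number):
    """Return the 1-based page `number` of a card list."""
    start = (number - 1) * PAGE_SIZE
    return cards[start:start + PAGE_SIZE]


# Invented sample data: 23 persons and one vehicle.
cards = [Card(f"Person {i}", "person") for i in range(1, 24)]
cards.append(Card("Blue Van", "vehicle"))

persons = filter_cards(cards, concept="person")
third_page = page(persons, 3)
vans = filter_cards(cards, keyword="van")
```

With 23 matching persons, the third page holds the remaining three cards, which is why narrowing by concept type or keyword first keeps large extractions manageable.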

Each entity card shows:

  • The entity name and type (color-coded)
  • Saved status (saved or ⚠️ not saved)
  • Relationship count
  • Link count to existing records

Expand a card to review details and make changes. Use the per-entity actions:

  • Edit (✏️) — Change the entity name or type
  • Delete (🗑️) — Remove the entity
  • Save (💾) — Save this entity to your workspace

Review and fix relationships

Inside the entity card, open the relationships section to validate links. You can:

  • Delete a relationship
  • Change the source entity, target entity, or relationship label using dropdowns
  • Link the extracted entity to an existing record using the search icon and a pasted record link

Treat relationships as suggestions. Confirm them against the evidence before saving.


Step 5 — Review the summary and graph

Open the Summary tab for a high-level view of the extraction. This tab includes:

  • Statistics: total entities, relationships, and links to existing records
  • Annotated text: the document text with highlighted entities
  • Interactive graph: a node-edge graph of entities and relationships, color-coded by entity type
  • Word clouds: frequent words and verbs
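
Word clouds are usually sized by how often each word occurs. As a minimal sketch of that idea, assuming plain word counting with a small stop-word list (the app's actual method is not documented here, and separating verbs would additionally need part-of-speech tagging), the counts behind such a view could be computed like this:

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real one would be much longer.
STOPWORDS = frozenset({"the", "a", "to", "and", "of", "in"})


def top_words(text: str, n: int = 3) -> list[tuple[str, int]]:
    """Return the n most frequent non-stop-words with their counts."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)


# Invented sample sentence, not taken from any real evidence file.
text = "The van left the depot. The van returned to the depot and the driver left."
top = top_words(text)
```

In this sample, "van", "left", and "depot" each occur twice and would appear largest, which hints at the recurring objects and actions in the text.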

Use pan and zoom to navigate the graph. Saved entities appear in full color. Unsaved entities appear lighter.


Understanding the Output

After extraction completes, you work with two main outputs:

  • Entities list (cards): use this to validate and correct the extracted data. The saved status icon on each card (⚠️ marks an entity that is not saved) helps you track what has been written back to the workspace.
  • Summary views: use these to understand coverage and context:
    1. Statistics help you assess extraction size and completeness.
    2. Annotated text helps you verify where each entity was found.
    3. The interactive graph helps you spot clusters, key actors, and unexpected connections.
    4. Word clouds help you identify recurring topics and actions.

Color-coding indicates entity type. Relationship labels indicate the relationship type the app inferred. Review relationship labels carefully before saving.

Saving and Exporting Results

Use the left sidebar save controls after you review the extraction. Start by selecting:

  • The target Workspace
  • A folder path inside that workspace

Then choose one or both save options:

  • Save Entities & Relationships: saves extracted entities and relationships as permanent workspace records. Saved records become searchable and reusable across Octostar.
  • Create a Graph: exports the entities and relationships as an interactive Link Chart. You can set a custom filename before creating it. The chart is saved to the selected workspace folder.

Save incrementally when possible. You can save individual entities from the Entities tab before saving the full set.


Tips for Best Results

  • Write a specific Extraction Goal when you have a focused task. A narrow goal reduces noise.
  • Review relationships before saving. Relationship extraction is the most error-prone part of the output.
  • Use concept and keyword filters in Entities when the extraction is large.
  • Link extracted entities to existing records to reduce duplicates and improve consistency across cases.
  • Save high-confidence entities as you review them instead of waiting until the end.
  • If a file shows the no-readable-content status, confirm the file contains selectable text or that OCR/transcription is enabled for your deployment.

Known Limitations

  • Results can vary between runs. Processing the same document twice may produce different entities or relationships.
  • Relationships require manual review. The app can infer incorrect or overly broad links.
  • Very large graphs may not render interactively. Graphs with more than 5,000 combined nodes and edges do not display as an interactive visualization.
  • After you drop files, they may take a moment to be recognized before extraction can begin.
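
Two of these limitations lend themselves to quick checks when planning a run. The sketch below is illustrative: the function names are hypothetical, the 5,000 threshold comes from the limitation above, and the comparison assumes you have collected the entity names from two runs yourself.

```python
INTERACTIVE_LIMIT = 5_000  # combined nodes and edges, per the limitation above


def renders_interactively(node_count: int, edge_count: int) -> bool:
    """True if the graph is within the documented interactive display limit."""
    return node_count + edge_count <= INTERACTIVE_LIMIT


def run_difference(first: set[str], second: set[str]) -> dict[str, set[str]]:
    """Split entity names by whether they appear in one run or both."""
    return {
        "only_first": first - second,
        "only_second": second - first,
        "both": first & second,
    }


# Invented sample values for illustration.
fits = renders_interactively(2_000, 3_000)
too_big = renders_interactively(2_000, 3_001)
diff = run_difference({"Jane Doe", "Acme Ltd"}, {"Jane Doe", "Rotterdam"})
```

Entities that appear in only one run are good candidates for a closer look against the evidence, and a graph that exceeds the limit may be easier to explore after narrowing the input files or the Extraction Goal.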