Why this exists

Entis is an entity enrichment API that uses AI entity extraction to turn raw text into structured profiles of people, companies, and products. I was building daily digests about startups, investments, and tech news. Every article was full of names I did not recognize: companies I had never heard of, founders I could not place, products that might be relevant or might be noise. I kept stopping to Google each one. Open a tab, search, scan the results, go back to reading. Ten unknown names per article, ten interruptions.

The thought was simple: what if I could hover over any name and instantly see what it is? Not a Wikipedia link, but a structured profile: a Series B fintech founded in 2021, backed by Sequoia, 50 employees, with a link to its website. That startup mentioned in passing? It is actually a direct competitor to a tool I use. That person quoted in the article? They are an investor who also funded three other companies in this space.

That is what Entis does. It started as a tool for my own reading workflow and grew into a general-purpose entity enrichment engine.

What it does

You have a document. A news article, a research report, a conversation transcript. Inside it, there are names: people, companies, startups, brands, cryptocurrencies. Right now they are just words on a page. You know they matter, but to understand how, you would need to look each one up manually.

Entis reads that document and identifies every real-world entity in it. Not with pattern matching or keyword lists, but semantically, by understanding context. It knows that "Apple" in a tech article is a company, not a fruit. It recognizes a startup mentioned casually by name without the word "startup" anywhere near it.

Then it goes further. For each entity it finds, Entis returns a structured profile: website, social accounts, description, founding date, key people, funding info, whatever is relevant for that entity type. If the entity is already in its database, the profile comes back instantly. If not, Entis dispatches an AI research agent that searches the web, compiles the data, saves it for next time, and returns the enriched result.

You send in plain text. You get back a document where every important entity is annotated with live, structured context. A dry report becomes an interactive knowledge layer.

Results & Impact

Transforms unstructured text into structured intelligence. A document that would take hours to research manually, looking up each company, each person, each product, gets enriched in seconds.

Self-improving database with over 2,700 entities already catalogued: 1,370 products and startups, 1,133 persons, 184 companies, 73 brands. Every enrichment query that hits the web saves the result to PostgreSQL. The more you use it, the faster it gets. Common entities are served from cache instantly.

Supports 9 entity types with type-specific schemas: persons, companies, brands, startups, products, apps, browser extensions, platform plugins, and digital templates. Each type returns fields relevant to that category, not a generic blob.

Key features

Semantic entity extraction. AI-powered context understanding, not regex. Identifies entities by meaning: "Apple" is a company in a tech context, a fruit in a recipe. Handles informal mentions, abbreviations, and aliases.
Auto-enrichment. Unknown entity? Entis dispatches an AI agent to research it on the web, compile structured data, save it to the database, and return the profile. The database grows with every query.
9 entity types. Person, company, brand, startup, product, app, extension, plugin, template. Each has its own schema and type-specific fields. Extensible for new types.
Sentiment & relevance. Beyond entity extraction: analyze the sentiment of texts, score content relevance, summarize documents, analyze discussions. A toolkit for text intelligence.
URL metadata. Extract structured metadata from any URL: title, description, Open Graph data, key content. Feed it a link, get back what matters.
REST API. Clean versioned API (/v1/). Optional API key auth, rate limiting, request logging. Stateless design; works with or without the database.

How it works

Entis is a FastAPI service that sits between your application and AI. When you send text to the extraction endpoint, Claude analyzes the content and identifies entities with their types and context.

For enrichment, the flow is: check PostgreSQL for a cached profile. If found, return it. If not, or if the profile is incomplete, dispatch a research task. The AI agent uses web search (via Web Surfer) to find current information, structures it into the type-specific schema, saves to the database, and returns the result.
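
The lookup-then-research flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual Entis code: a plain dict stands in for PostgreSQL, `research` stands in for the AI research agent, and `REQUIRED_FIELDS`, `is_complete`, and `enrich` are hypothetical names chosen for the example.

```python
from typing import Callable, Dict, Optional, Tuple

Profile = Dict[str, object]
CacheKey = Tuple[str, str]

# Hypothetical rule for what counts as a "complete" profile per entity type.
REQUIRED_FIELDS = {
    "company": ("website", "description"),
    "person": ("description",),
}


def is_complete(profile: Profile, entity_type: str) -> bool:
    """A profile is complete when every required field has a non-empty value."""
    required = REQUIRED_FIELDS.get(entity_type, ("description",))
    return all(profile.get(field) for field in required)


def enrich(
    name: str,
    entity_type: str,
    cache: Dict[CacheKey, Profile],
    research: Callable[[str, str], Profile],
) -> Profile:
    """Return a cached profile if it is complete, otherwise research and save."""
    key = (name.lower(), entity_type)
    cached: Optional[Profile] = cache.get(key)
    if cached is not None and is_complete(cached, entity_type):
        return cached  # cache hit: no web research needed

    # Missing or incomplete: run the (stand-in) research step,
    # merge over whatever was already known, and save for next time.
    fresh = research(name, entity_type)
    merged = {**(cached or {}), **fresh}
    cache[key] = merged
    return merged
```

With this shape, a second request for the same entity never reaches `research`, which is the behavior that lets lookups get faster as the database fills up.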

The database has separate tables for each major entity category (persons, companies, brands, products) with JSONB fields for flexible data storage. Relationships between entities are tracked in a dedicated table, so you can map connections like "person X founded company Y which makes product Z".
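
To make the relationship mapping concrete, here is a small in-memory sketch of following such connections. The row layout `(source, relation, target)` and the helper names are assumptions for illustration; the real relationships table and its columns are not documented here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Step = Tuple[str, str, str]  # (source, relation, target)

# Hypothetical rows, mirroring the example in the text.
RELATIONSHIPS: List[Step] = [
    ("person X", "founded", "company Y"),
    ("company Y", "makes", "product Z"),
]


def build_index(rows: List[Step]) -> Dict[str, List[Tuple[str, str]]]:
    """Group outgoing (relation, target) pairs by source entity."""
    index: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for source, relation, target in rows:
        index[source].append((relation, target))
    return index


def connections(index, start: str, max_depth: int = 3) -> List[List[Step]]:
    """Return every path of up to max_depth steps reachable from start."""
    paths: List[List[Step]] = []

    def walk(node: str, path: List[Step], seen: frozenset) -> None:
        if len(path) >= max_depth:
            return
        for relation, target in index.get(node, []):
            if target in seen:  # skip cycles
                continue
            step = path + [(node, relation, target)]
            paths.append(step)
            walk(target, step, seen | {target})

    walk(start, [], frozenset({start}))
    return paths
```

Calling `connections(build_index(RELATIONSHIPS), "person X")` yields the one-step path to company Y and the two-step path that continues on to product Z.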

Stack: Python, FastAPI, PostgreSQL with JSONB, Claude API for AI research, Web Surfer for web data. Runs as a single uvicorn process.

Quick Start

Entis runs as a Python service. The database is optional.

# Install
cd entis
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env

# Run
uvicorn main:app --host 127.0.0.1 --port 8200

# Extract entities from text
curl -X POST http://localhost:8200/v1/enrich/entity \
  -H "Content-Type: application/json" \
  -d '{"text": "Elon Musk announced new Tesla features"}'

# Enrich a specific entity
curl -X POST http://localhost:8200/v1/enrich \
  -H "Content-Type: application/json" \
  -d '{"name": "Stripe", "type": "company"}'
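
The same two requests can be made from Python. The sketch below uses only the standard library and reuses the endpoints and payloads from the curl commands above; the helper names `build_post` and `call` are made up for this example, and the shape of the JSON response is not documented here, so it is simply returned as parsed.

```python
import json
from urllib import request

BASE_URL = "http://localhost:8200"  # same host and port as the uvicorn command


def build_post(path: str, payload: dict) -> request.Request:
    """Build a JSON POST request for the given API path."""
    return request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(req: request.Request) -> dict:
    """Send the request and parse the JSON body (requires the service to be running)."""
    with request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


# Mirrors the two curl examples above.
extract_req = build_post(
    "/v1/enrich/entity", {"text": "Elon Musk announced new Tesla features"}
)
enrich_req = build_post("/v1/enrich", {"name": "Stripe", "type": "company"})
```

With the service running locally, `call(extract_req)` and `call(enrich_req)` send the requests and return the parsed responses.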

Use Cases

For content platforms. Turn articles into interactive documents. Feed published articles through Entis. Every company, person, and product mentioned gets a structured profile. Readers see context without leaving the page. A name becomes a knowledge card.

For research teams. Map the landscape from a single report. Drop a market research document into Entis. Get back every startup, investor, and product mentioned, each with current data. What would take a junior analyst hours of Googling happens in one API call.

For AI agent pipelines. Give your agents world knowledge. An agent processing text can call Entis to understand who and what is being discussed. Instead of treating names as opaque strings, the agent gets structured context: this is a Series B fintech company founded in 2021 with 50 employees.

Lessons Learned

Type-specific schemas matter more than generic profiles. The first version stored everything as a flat JSON blob. A person and a company had the same fields. It was technically simpler but practically useless: when you look up a startup, you want funding, team size, launch date. When you look up a person, you want their role, their companies, their social profiles. Splitting into type-specific tables with type-specific fields made the output immediately useful instead of requiring manual parsing.
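
As a rough illustration of the difference, type-specific schemas might look something like the dataclasses below. The field names are taken loosely from the examples in the paragraph above (funding, team size, launch date; role, companies, social profiles); the class names and the `build_profile` helper are hypothetical, not the actual Entis schema.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List, Optional


@dataclass
class PersonProfile:
    name: str
    role: Optional[str] = None
    companies: List[str] = field(default_factory=list)
    social_profiles: Dict[str, str] = field(default_factory=dict)


@dataclass
class StartupProfile:
    name: str
    funding: Optional[str] = None
    team_size: Optional[int] = None
    launch_date: Optional[str] = None


# Hypothetical registry mapping an entity type to its schema.
SCHEMAS = {"person": PersonProfile, "startup": StartupProfile}


def build_profile(entity_type: str, data: dict):
    """Keep only the fields that belong to this entity type's schema."""
    schema = SCHEMAS[entity_type]
    allowed = {f.name for f in fields(schema)}
    return schema(**{k: v for k, v in data.items() if k in allowed})
```

Here `build_profile("startup", {"name": "Acme", "team_size": 50, "role": "CEO"})` drops `role`, because that field only makes sense for a person.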

The cache is the product. The first few hundred queries are slow because everything hits the web. But once you have 2,700+ entities, most lookups are instant. The database becomes a curated knowledge base that grows with every use. After a few months of daily digests, the system knows the entire startup ecosystem you follow.

Semantic extraction beats regex decisively. Early prototypes used pattern matching to find entity names. They missed informal mentions, abbreviations, and context-dependent names. AI-powered extraction understands that "the YC-backed team" refers to a startup even without naming it. The accuracy difference is not incremental; it is the difference between a useful tool and a frustrating one.
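
A toy example shows why pattern matching struggles here. The regex below is a deliberately naive stand-in (a single capitalized word), not the pattern used in the early prototypes, and `naive_entities` is a hypothetical helper.

```python
import re

# Naive stand-in pattern: one capitalized word followed by lowercase letters.
CAPITALIZED_WORD = re.compile(r"\b[A-Z][a-z]+\b")


def naive_entities(text: str) -> list:
    """Return every capitalized word, with no notion of context."""
    return CAPITALIZED_WORD.findall(text)


tech = naive_entities("Apple released a new chip")
recipe = naive_entities("Apple pie needs butter")
indirect = naive_entities("the YC-backed team shipped a new app")

# The pattern cannot tell the company from the fruit...
assert tech == recipe == ["Apple"]
# ...and finds nothing at all in an indirect reference to a startup.
assert indirect == []
```

A more elaborate pattern can patch individual cases, but deciding that one "Apple" is a company and the other is a fruit takes context, which is what the semantic approach relies on.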

FAQ

What does Entis do? Entis takes raw text and identifies real-world entities in it: people, companies, startups, brands, products, cryptocurrencies. Then it enriches each entity with structured data from the web: websites, social profiles, descriptions, founding dates, funding info.

How does entity extraction work? Entis uses AI to understand context, not regex pattern matching. It recognizes that "Apple" in a tech article is a company, not a fruit. The extraction is semantic (meaning-based), so it handles ambiguity and informal mentions.

Where does the enrichment data come from? First, Entis checks its PostgreSQL database for known entities. If missing or incomplete, it dispatches an AI research agent that searches the web, compiles structured data, saves it to the database, and returns the enriched profile. The database grows with every query.

What entity types are supported? Person, company, brand, startup, product, app, browser extension, platform plugin, and digital template. Each type has its own structured schema with type-specific fields.

Can Entis work without a database? Yes. The PostgreSQL database is optional. Without it, Entis still performs AI-powered enrichment for every request but does not cache results. With the database, known entities are served instantly and new discoveries are saved.