MEANS, CLASSIFY, TOPICS, SUMMARIZE — 100+ AI operators that work like native SQL. Connect any database. Query from DBeaver, DataGrip, or Tableau. No Python. No notebooks. No new tools.
```sql
SELECT name, region,
       SENTIMENT(feedback) AS mood
FROM customer_reviews
WHERE feedback IMPLIES
      'might cancel soon'
ORDER BY mood;
```

| name | region | mood |
|---|---|---|
| Sarah K. | Northeast | -0.82 |
| James R. | West | -0.71 |
Start with a single semantic filter. By end of week, you're running verified AI aggregations across your entire warehouse.
```sql
-- Point at your database
CREATE CONNECTION prod
  TYPE postgres HOST 'db.company.com'
  DATABASE 'analytics';

-- Query it with meaning
SELECT *
FROM support_tickets
WHERE body MEANS
      'billing dispute';
```

```sql
SELECT
  TOPICS(tweet, 4) AS topic,
  CLASSIFY(tweet,
    'political, not-political')
    AS political,
  COUNT(*) AS tweets,
  AVG(likes) AS avg_likes
FROM twitter_archive
GROUP BY topic, political;
```

```sql
CREATE SEMANTIC OPERATOR
compliance_risk(clause VARCHAR)
RETURNS VARCHAR
PROMPT 'Evaluate this contract clause
for regulatory risk. Return: low,
medium, or high with one sentence
explaining why.
Clause: {{ input.clause }}';

SELECT compliance_risk(clause):takes(3,
    'Pick the most conservative')
  AS risk_assessment
FROM contracts;
```

Filtering, classification, summarization, parsing, validation, embeddings — all callable from SQL.
Define custom AI functions in SQL DDL — no Python, no deploy pipeline. CREATE it, then SELECT with it.
```sql
CREATE SEMANTIC OPERATOR
compliance_check(text VARCHAR)
RETURNS VARCHAR
PROMPT 'Check if this text violates
our compliance policy.
Return: pass, warn, or fail.
Text: {{ input.text }}';
```

Then use it immediately:

```sql
SELECT clause, compliance_check(clause)
FROM contracts
WHERE compliance_check(clause)
      MEANS 'fail';
```

Scrape web pages, search the internet, parse RSS feeds, or connect any API — all from SQL. Results compose with semantic operators and JOIN with your own data. No ETL. No scripts. No new tools.
Connect any MCP server. Its tools become SQL functions.
```sql
-- Scrape competitor pricing, analyze semantically
WITH competitor AS (
  SELECT * FROM WEB(
    'https://competitor.com/pricing',
    tier VARCHAR, price DECIMAL,
    features TEXT
  )
)
SELECT
  tier, price,
  features MEANS 'AI-powered' AS has_ai
FROM competitor
WHERE price < 100;
```

```sql
-- Connect a service
CREATE MCP SERVER github
  COMMAND 'npx'
  ARGS '-y @anthropic/github-mcp';

-- Discover its tools with semantic search
SELECT tool_name, description
FROM mcp_server_tools('github')
WHERE description MEANS
      'find bugs';

-- Query it — with semantic SQL
SELECT title, assignee,
  CLASSIFY(body,
    'bug, feature, support') AS type,
  SENTIMENT(body) AS frustration
FROM github.issues(
  'myorg/api', 'open')
WHERE body MEANS
      'performance regression'
ORDER BY frustration ASC;
```

DataRabbit speaks the PostgreSQL wire protocol. Connect from psql, DBeaver, DataGrip, Tableau, or anything that talks to Postgres. Queries run where your data lives.
PostgreSQL, MySQL, Snowflake, BigQuery, ClickHouse, S3, Parquet. Data never leaves your infrastructure.
Use your existing SQL client. Add operators like MEANS, CLASSIFY, or SUMMARIZE alongside standard SQL.
The system fingerprints data shapes — not individual values. A million phone numbers might have 10 formats. 10 LLM calls generate SQL expressions. The expressions run on every future row. No LLM needed.
LLM outputs are non-deterministic. DataRabbit has first-class primitives to make them reliable, auditable, and cost-efficient.
Run N model variations in parallel. An evaluator picks the best. No serial retry loops.
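The best-of-N pattern looks roughly like this sketch (the variant functions are stand-ins for real model calls with different prompts or temperatures; the evaluator here just picks the most confident candidate):

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for N model variations of the same operator call.
def variant_a(text): return {"label": "negative", "confidence": 0.62}
def variant_b(text): return {"label": "negative", "confidence": 0.91}
def variant_c(text): return {"label": "neutral",  "confidence": 0.48}

def evaluator(candidates):
    """Pick the best candidate; a real evaluator could itself be a model."""
    return max(candidates, key=lambda c: c["confidence"])

def best_of_n(text, variants):
    # All variations run in parallel, then one evaluation pass picks a winner.
    with ThreadPoolExecutor() as pool:
        candidates = list(pool.map(lambda f: f(text), variants))
    return evaluator(candidates)

result = best_of_n("The invoice was wrong twice in a row.",
                   [variant_a, variant_b, variant_c])
```

Because the variations fan out concurrently, latency is one round trip plus evaluation, instead of N sequential retries.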
Fingerprints data structures, generates SQL expressions, and caches the code, not the values. A million rows, ~10 LLM calls. Cost drops toward zero over time.
Every query is embedded and summarized. Search past analyses by meaning. 'Did anyone look at churn by region?' surfaces the answer — and the SQL.
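Conceptually, this is nearest-neighbor search over embedded query summaries. A toy sketch (a bag-of-words stand-in plays the role of a real embedding model; the history entries are invented):

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would use a model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Each past analysis is stored with its summary and the SQL that produced it.
history = [
    {"summary": "churn rate by region last quarter", "sql": "SELECT region, ..."},
    {"summary": "top support topics by volume",      "sql": "SELECT TOPICS(...)"},
]

def search_memory(question: str):
    q = embed(question)
    return max(history, key=lambda h: cosine(q, embed(h["summary"])))

hit = search_memory("did anyone look at churn by region?")
```

The match returns both the past summary and its SQL, so the earlier analysis can be rerun directly.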
Flag any operator result as 'correct.' Future calls automatically include your validated examples. No ML pipeline. No fine-tuning. Just click 'this was right.'
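Mechanically, this is few-shot prompting from accumulated feedback. A minimal sketch (function and prompt shapes are assumptions, not the product's actual format):

```python
# Validated examples accumulate as users flag results as correct.
validated: list[tuple[str, str]] = []

def flag_correct(text: str, label: str) -> None:
    validated.append((text, label))

def build_prompt(task: str, text: str) -> str:
    """Future calls automatically carry the validated examples as few-shot context."""
    examples = "\n".join(f"Input: {t}\nOutput: {l}" for t, l in validated)
    return f"{task}\n{examples}\nInput: {text}\nOutput:"

flag_correct("Refund has not arrived after 30 days", "billing dispute")
prompt = build_prompt("Classify the support ticket.",
                      "I was charged twice this month")
```

No model weights change; the "training" is just that every confirmed answer becomes an in-context example for the next call.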
Free tier includes 5,000 credits. No credit card. Connect any database in 60 seconds. If you already have a SQL client open, you're ready.
Credits scale with AI usage. Cache hits are free. The more you use it, the cheaper it gets.
Works with PostgreSQL, MySQL, Snowflake, BigQuery, ClickHouse, S3, Parquet. Connects via pgwire — use psql, DBeaver, DataGrip, Tableau, or any Postgres-compatible client.