Description
Reference architecture for a text-to-SQL agent built with the AI SDK.
npx boilerapp oss-data-analyst-agent-reference-architectureDocumentation
oss-data-analyst - Open Source AI Data Science Agent [Reference Architecture]
oss-data-analyst is an intelligent AI agent that converts natural language questions into SQL queries and provides data analysis. Built with the Vercel AI SDK, it features multi-phase reasoning (planning, building, execution, reporting) and streams results in real-time.
Note: This is a reference architecture. The semantic catalog and schemas included are simplified examples for demonstration purposes. Production implementations should use your own data models and schemas.
Features
- Multi-Phase AI Agent: Planning → Building → Execution → Reporting workflow
- Real-time Streaming: Live updates during query processing
- Smart Data Analysis: Automated insights and visualizations
- SQL Validation: Syntax checking and security policy enforcement
- Natural Language: Ask questions in plain English
- Modern UI: Built with Next.js, React, and TailwindCSS
- Extensible Tools: Easy to add custom tools and capabilitiets
Quick Start
Prerequisites
- Node.js 20.19.3+
- pnpm 8.15.0+
- AI Gateway API key
Installation
-
Clone the repository
git clone https://github.com/vercel/oss-data-analyst.git cd oss-data-analyst -
Install dependencies
pnpm install -
Set up environment variables
cp env.local.example .env.localEdit
.env.localand add your Vercel AI Gateway key -
Initialize the database
pnpm initDatabaseThis creates a SQLite database with sample data (Companies, People, Accounts)
-
Run the development server
pnpm dev -
Open your browser Navigate to
http://localhost:3000
Build for Production
pnpm build
pnpm start
Sample Schema
This repository includes a sample database schema with three main entities to demonstrate oss-data-analyst's capabilities:
Companies
Represents organizations in your database. Each company has:
- Basic information (name, industry, employee count)
- Business metrics (founded date, status)
- Example: Technology companies, Healthcare organizations, etc.
Accounts
Represents customer accounts or subscriptions tied to companies. Each account includes:
- Account identification (account number, status)
- Financial metrics (monthly recurring value, contract details)
- Relationship to parent company
- Example: Active subscriptions with monthly values ranging from $10k-$50k
People
Represents individual employees or contacts within companies. Each person has:
- Personal information (name, email)
- Employment details (department, title, salary)
- Relationship to their company
- Example: Engineers, Sales representatives, Managers across different departments
How It Works
oss-data-analyst uses a multi-phase agentic workflow:
-
Planning Phase
- Analyzes natural language query
- Searches semantic catalog for relevant entities
- Identifies required data and relationships
- Generates execution plan
-
Building Phase
- Constructs SQL query from plan
- Validates syntax and security policies
- Optimizes query structure
- Finds join paths between tables
-
Execution Phase
- Estimates query cost
- Executes SQL against database
- Handles errors with automatic repair
- Streams results
-
Reporting Phase
- Formats query results
- Generates visualizations (charts, tables)
- Provides natural language explanations
- Performs sanity checks on data
Extending oss-data-analyst
Customizing Prompts
Modify system prompts in src/lib/prompts/:
planning.ts- Planning phase behaviorbuilding.ts- SQL generation logicexecution.ts- Query execution handlingreporting.ts- Results interpretation
Example Queries
Try asking oss-data-analyst (using the sample database):
- "How many companies are in the Technology industry?"
- "What is the average salary by department?"
- "Show me the top 5 accounts by monthly value"
- "Which companies have the most employees?"
- "What is the total revenue for Active accounts?"
- "How many people work in Engineering?"
Using with Production Databases
The default setup uses SQLite for demonstration. To use with Snowflake or other databases:
- Update
src/lib/oss-data-analyst-agent-advanced.tsto import from./tools/executeinstead of./tools/execute-sqlite - Configure your database credentials in
.env.local - Update the semantic catalog in
src/lib/semantic/with your schema definitions
Troubleshooting
Database Not Found
- Run
pnpm initDatabaseto create and seed the database - Check that
data/oss-data-analyst.dbexists
AI Gateway API Errors
- Verify your API key is valid in
.env.local - Check API rate limits and credits
Build Errors
- Run
pnpm installto update dependencies - Check TypeScript errors with
pnpm run type-check - Clear
.nextfolder and rebuild
Prix
Gratuit