AI Data Warehouse Access with MCP and Trino

Building composable MCP servers for enterprise data

Posted by Craig Johnston on Saturday, January 17, 2026

Every organization wants AI assistants to answer questions from their data. The problem is that data lives everywhere: PostgreSQL for transactions, MySQL for legacy systems, S3 for analytics, Kafka for streams. Writing custom integrations for each source is unsustainable. mcp-trino solves this by exposing Trino’s federated query engine to AI assistants through MCP.

What is MCP?

MCP (Model Context Protocol) is a standard for AI assistants to interact with external tools and data sources. Instead of every AI vendor building proprietary integrations, MCP provides a common interface. OpenAI, Google, and Microsoft adopted it in 2025.

An MCP server exposes capabilities through three primitives:

  • Tools: Functions the AI can call (execute SQL, list schemas)
  • Resources: Data the AI can read (query results, table definitions)
  • Prompts: Pre-built instructions for common tasks

With the stdio transport, the AI assistant spawns the MCP server as a subprocess, communicates over standard input/output, and calls tools as needed. No API keys, no network configuration, no cloud dependencies.
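Under the hood, each tool call is a JSON-RPC 2.0 message over that stdio pipe. A tools/call request to run a query looks roughly like this (the execute_sql tool is described below; the query itself is illustrative):

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "execute_sql",
    "arguments": { "query": "SELECT 1" }
  }
}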

What is Trino?

Trino is a distributed SQL query engine. It does not store data. Instead, it connects to your existing data sources through connectors and lets you query them with standard SQL.

graph TD
    T[Trino] --> PG[PostgreSQL]
    T --> MY[MySQL]
    T --> S3[S3/Object Storage]
    T --> KA[Kafka]
    T --> IC[Iceberg]

One SQL query can join data across PostgreSQL, MySQL, and S3. Trino handles the federation, optimization, and execution. Your AI assistant does not need to know where data lives; it just writes SQL.
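For example, a single query can join transactional orders in PostgreSQL against a customer dimension in the data lake. Catalog, schema, and table names here are illustrative:

SELECT c.region, SUM(o.amount) AS total_amount
FROM postgresql.public.orders o
JOIN hive.analytics.customers c ON o.customer_id = c.id
GROUP BY c.region
ORDER BY total_amount DESC;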

Common connectors include:

  • Relational: PostgreSQL, MySQL, SQL Server, Oracle
  • Data Lakes: Iceberg, Delta Lake, Hive (backed by S3 or other object storage)
  • Streaming: Kafka, Kinesis
  • NoSQL: MongoDB, Elasticsearch, Redis

mcp-trino: The Bridge

mcp-trino is an MCP server that exposes Trino to AI assistants. Install it, configure your Trino connection, and your AI can query your data warehouse.

Install:

go install github.com/txn2/mcp-trino@latest

Configure Claude Code:

claude mcp add --transport stdio mcp-trino -- \
  mcp-trino --host trino.example.com --port 8080 --user analyst

Your AI assistant now has access to every data source connected to your Trino cluster.

How It Works

graph LR
    AI[AI Assistant] -->|MCP stdio| MCP[mcp-trino]
    MCP -->|SQL/HTTP| TR[Trino]
    TR --> DS[Data Sources]

The AI assistant spawns mcp-trino as a subprocess. When it needs data, it calls the execute_sql tool with a query. mcp-trino sends the query to Trino over HTTP, formats the results, and returns them to the AI.
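On the Trino side, this is Trino's standard client REST protocol: the query is POSTed to /v1/statement, and result pages are fetched by following the nextUri links in each response. A rough curl equivalent of the first step:

curl -s -X POST http://trino.example.com:8080/v1/statement \
  -H "X-Trino-User: analyst" \
  --data "SELECT 1"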

Core tools:

  • execute_sql: Run any SQL query against Trino
  • list_catalogs: Show available data sources
  • list_schemas: Show schemas in a catalog
  • list_tables: Show tables in a schema
  • describe_table: Get column definitions

Read-only by default. The AI can explore and query but not modify data.
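The discovery tools map naturally onto Trino's metadata statements; a plausible mapping (not necessarily mcp-trino's literal implementation):

SHOW CATALOGS;                      -- list_catalogs
SHOW SCHEMAS FROM hive;             -- list_schemas
SHOW TABLES FROM hive.analytics;    -- list_tables
DESCRIBE hive.analytics.sales;      -- describe_table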

Standalone Usage

Configure mcp-trino with environment variables or command-line flags:

export TRINO_HOST=trino.example.com
export TRINO_PORT=8080
export TRINO_USER=analyst
export TRINO_CATALOG=hive
export TRINO_SCHEMA=analytics

mcp-trino

Or pass flags directly:

mcp-trino --host trino.example.com --port 8080 --user analyst

Example Conversation

Here is Claude Code with mcp-trino answering questions from a data warehouse:

> What were our top 10 products by revenue last quarter?

I'll query the sales data to find the top products.

⏺ mcp-trino:execute_sql
  ⎿  Running: SELECT product_name, SUM(revenue) as total_revenue
              FROM hive.analytics.sales
              WHERE sale_date >= DATE '2025-10-01'
                AND sale_date < DATE '2026-01-01'
              GROUP BY product_name
              ORDER BY total_revenue DESC
              LIMIT 10

Here are the top 10 products by revenue for Q4 2025:

| Product              | Revenue      |
|---------------------|--------------|
| Enterprise License  | $2,340,000   |
| Professional Plan   | $1,890,000   |
| Team Subscription   | $1,245,000   |
...

The Enterprise License drove the most revenue at $2.34M,
followed by Professional Plan at $1.89M.

> How does that compare to the same quarter last year?

⏺ mcp-trino:execute_sql
  ⎿  Running: WITH current AS (...), previous AS (...)
              SELECT c.product_name, c.revenue, p.revenue as prev_revenue,
                     (c.revenue - p.revenue) / p.revenue * 100 as growth
              FROM current c JOIN previous p ON ...

Year-over-year comparison for Q4:

| Product              | Q4 2025      | Q4 2024      | Growth  |
|---------------------|--------------|--------------|---------|
| Enterprise License  | $2,340,000   | $1,950,000   | +20.0%  |
| Professional Plan   | $1,890,000   | $2,100,000   | -10.0%  |
...

Enterprise License grew 20% year-over-year. Professional Plan
declined 10%, which may warrant investigation into pricing or
competitive pressure.

The AI writes SQL, executes it against Trino, and interprets the results. No manual data exports, no dashboard building, no waiting for analyst availability.

mcp-trino as a Go Library

This is where mcp-trino differs from most MCP servers. It is not just a binary; it is a composable Go library. Import it into your own MCP server and extend it with custom middleware.

package main

import (
    "log"

    "github.com/mark3labs/mcp-go/server"
    "github.com/txn2/mcp-trino/pkg/trino"
)

func main() {
    // Create the Trino MCP tools
    trinoTools := trino.NewTools(trino.Config{
        Host:    "trino.example.com",
        Port:    8080,
        User:    "analyst",
        Catalog: "hive",
        Schema:  "analytics",
    })

    // Create your MCP server
    s := server.NewMCPServer("my-data-server", "1.0.0")

    // Add Trino tools wrapped in your middleware
    for _, tool := range trinoTools.GetTools() {
        s.AddTool(tool, withAuditLog(withAuth(trinoTools.Handler(tool))))
    }

    // Add your own custom tools
    s.AddTool(myCustomTool, myCustomHandler)

    // Serve over stdio, mcp-go's standard entry point
    if err := server.ServeStdio(s); err != nil {
        log.Fatal(err)
    }
}

graph TD
    CS[Your Custom MCP Server] --> MT[mcp-trino library]
    CS --> YH[Your Handlers]
    MT --> AU[Auth Middleware]
    MT --> AD[Audit Middleware]
    MT --> FI[Query Filter]

What You Can Add

Authentication: Verify the user before executing queries.

func withAuth(next server.ToolHandlerFunc) server.ToolHandlerFunc {
    // server.ToolHandlerFunc and mcp.CallToolRequest come from mark3labs/mcp-go;
    // getUserFromContext and isAuthorized are your own policy hooks.
    return func(ctx context.Context, req mcp.CallToolRequest) (*mcp.CallToolResult, error) {
        user := getUserFromContext(ctx)
        if !isAuthorized(user) {
            return nil, errors.New("unauthorized")
        }
        return next(ctx, req)
    }
}

Audit logging: Record every query for compliance.

func withAuditLog(next server.ToolHandlerFunc) server.ToolHandlerFunc {
    return func(ctx context.Context, req mcp.CallToolRequest) (*mcp.CallToolResult, error) {
        // GetArguments returns the tool-call arguments as a map
        query, _ := req.GetArguments()["query"].(string)
        log.Printf("User %s executing: %s", getUser(ctx), query)
        return next(ctx, req)
    }
}

Query filtering: Block dangerous patterns or enforce row-level security.

func withQueryFilter(next server.ToolHandlerFunc) server.ToolHandlerFunc {
    return func(ctx context.Context, req mcp.CallToolRequest) (*mcp.CallToolResult, error) {
        query, _ := req.GetArguments()["query"].(string)
        if containsDangerousPattern(query) { // your own pattern check
            return nil, errors.New("query blocked by policy")
        }
        return next(ctx, req)
    }
}

Custom tools: Add domain-specific operations alongside Trino.
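A sketch of a custom tool using mcp-go's tool builder, registered on the same server as the Trino tools. The tool name, parameter, and triggerRefresh helper are hypothetical:

// Hypothetical domain-specific tool alongside the Trino tools.
refreshTool := mcp.NewTool("refresh_dashboard",
    mcp.WithDescription("Trigger a refresh of an internal dashboard"),
    mcp.WithString("dashboard_id", mcp.Required()),
)

s.AddTool(refreshTool, func(ctx context.Context, req mcp.CallToolRequest) (*mcp.CallToolResult, error) {
    id, _ := req.GetArguments()["dashboard_id"].(string)
    if err := triggerRefresh(ctx, id); err != nil { // triggerRefresh: your own logic
        return nil, err
    }
    return mcp.NewToolResultText("refresh queued for " + id), nil
})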

You control the server. No forking, no patching, no waiting for upstream features.

Why Composable MCPs Matter

Most MCP servers are monolithic binaries. You configure them, run them, and hope they do what you need. When they do not, you fork the project or file a feature request.

Composable MCPs are building blocks. mcp-trino provides the Trino integration; you provide the policy layer. This matters for enterprise adoption:

  • Security: Add authentication without modifying mcp-trino source
  • Compliance: Audit every query without custom logging patches
  • Integration: Combine multiple MCP tools in one server
  • Customization: Add company-specific tools alongside standard ones

The MCP ecosystem is young. The projects that provide composable libraries will become infrastructure. The ones that only ship binaries will be replaced when requirements change.

Quick Trino Setup

If you do not have Trino running, start one locally with Docker:

docker run -d -p 8080:8080 --name trino trinodb/trino

This gives you a single-node Trino with memory and TPC-H connectors for testing. Point mcp-trino at localhost:8080 and you can immediately query sample data.
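For example, the built-in TPC-H catalog ships with a tiny schema you can query immediately:

SELECT name, regionkey
FROM tpch.tiny.nation
LIMIT 5;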

For production, see the Trino documentation for connector configuration. Common setups include:

  • PostgreSQL connector for transactional data
  • Hive/Iceberg connector for data lake tables
  • Kafka connector for streaming data

Each connector requires adding a properties file to Trino’s etc/catalog/ directory.
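For instance, a minimal PostgreSQL catalog could be defined in etc/catalog/postgresql.properties; the connection details below are placeholders:

connector.name=postgresql
connection-url=jdbc:postgresql://db.example.com:5432/mydb
connection-user=trino
connection-password=secret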

Why MCP is the New Digital Transformation

APIs were the last wave of integration. RESTful services let systems talk to each other, enabling the SaaS explosion of the 2010s. But APIs require developers to write integration code. Every new data source means new endpoints, new clients, new maintenance.

MCP is the AI-native interface layer. Instead of developers writing integration code, AI assistants call standardized tools. The integration happens at runtime, driven by natural language.

This changes the economics of data access:

  • Before: Analysts wait for engineers to build dashboards
  • After: Analysts ask questions directly to AI assistants with data access

mcp-trino is one piece of this shift. It gives AI assistants access to federated data without custom development. Combined with other MCP servers for CRM, ticketing, and observability, organizations can build AI assistants that understand their entire operation.

The companies building composable MCP infrastructure now will have significant advantages when AI assistants become the primary interface for enterprise data.

Resources

This blog post, "AI Data Warehouse Access with MCP and Trino: Building composable MCP servers for enterprise data" by Craig Johnston, is licensed under a Creative Commons Attribution 4.0 International License.