Open-source sandbox for exploring everything Pixeltable can do
Pixelbot wires up tables, views, computed columns, embedding indexes, UDFs, tool calling, similarity search, version control, and model orchestration into a single full-stack app, so we can stress-test Pixeltable and ship what we learn as cookbooks.
Chat: Multimodal RAG agent
Semantic search across documents, images, video frames, and audio via .similarity() on embedding indexes. Tool calling with external APIs (NewsAPI, yfinance, DuckDuckGo). Inline image generation (Imagen 4.0 / DALL-E 3), video generation (Veo 3.0), and text-to-speech (OpenAI TTS with 6 voice options). Follow-up suggestions via Gemini structured output with response_schema. Personas with adjustable system prompts and LLM parameters. Persistent chat history and memory bank.
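To make the idea behind similarity search concrete, here is a minimal pure-Python sketch of cosine-similarity ranking. It is not Pixeltable's implementation or API: the helper names `cosine_similarity` and `top_k`, the sample rows, and the 3-dimensional vectors are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, rows, k=2):
    """Return the k (score, text) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, r["embedding"]), r["text"]) for r in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical 3-dimensional embeddings; real models emit far more dimensions.
rows = [
    {"text": "quarterly revenue report", "embedding": [0.9, 0.1, 0.0]},
    {"text": "photo of a cat", "embedding": [0.0, 0.2, 0.9]},
    {"text": "annual earnings summary", "embedding": [0.8, 0.3, 0.1]},
]

results = top_k([1.0, 0.0, 0.0], rows, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

An embedding index does the same ranking at scale, typically with an approximate nearest-neighbor structure instead of a full scan.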
Prompt Lab: Multi-model experimentation
Run the same prompt against Claude, Gemini, Mistral, and GPT-4o in parallel via ThreadPoolExecutor. Editable model IDs: override presets or add custom models. Response time, word count, and character count metrics with "Fastest" highlight and normalized comparison bars. Every experiment stored in agents.prompt_experiments for replay.
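The parallel fan-out and the metrics described above can be sketched roughly as follows. This is a sketch under assumptions, not the app's actual code: `fake_model`, `timed_call`, and `run_experiment` are hypothetical names, and the model calls are local stand-ins that sleep instead of calling a provider.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_model(name, delay, reply):
    """Build a stand-in for an LLM client call (hypothetical, no network)."""
    def call(prompt):
        time.sleep(delay)  # simulate provider latency
        return f"{name}: {reply}"
    return call


def timed_call(model_id, call, prompt):
    """Run one model and collect response time, word count, and character count."""
    start = time.perf_counter()
    text = call(prompt)
    elapsed = time.perf_counter() - start
    return {
        "model": model_id,
        "response": text,
        "seconds": elapsed,
        "words": len(text.split()),
        "chars": len(text),
    }


def run_experiment(prompt, models):
    """Send the same prompt to every model in parallel and flag the fastest."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = [pool.submit(timed_call, mid, call, prompt) for mid, call in models.items()]
        results = [f.result() for f in futures]  # keeps the submission order
    fastest = min(results, key=lambda r: r["seconds"])
    slowest = max(r["seconds"] for r in results)
    for r in results:
        r["fastest"] = r is fastest
        r["relative_time"] = r["seconds"] / slowest  # 1.0 for the slowest model
    return results


models = {
    "model-a": fake_model("A", 0.01, "short answer"),
    "model-b": fake_model("B", 0.20, "a somewhat longer answer"),
}
results = run_experiment("Explain embeddings in one sentence.", models)
```

Threads suit this workload because each call spends nearly all of its time waiting on I/O, so total latency is close to that of the slowest model rather than the sum of all models.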
Studio: File explorer + data wrangler
- Documents: Auto-summaries (Gemini structured JSON), sentence-level chunks
- Images: PIL transforms with live preview, save or download
- Videos: Keyframe extraction, clip creation, text overlay, scene detection, transcriptions
- Audio: Transcriptions with sentence-level breakdown
- CSV: Inline CRUD, infinite undo via table.revert(), version history via table.get_versions()
- Detection & Segmentation: On-demand DETR (ResNet-50/101) with SVG bounding boxes, DETR Panoptic segmentation with color-coded regions, ViT classification with confidence bars
- Search: Cross-modal semantic search via .similarity() on embedding indexes
- Embedding map: Interactive 2D UMAP projection of text/visual embedding spaces
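The undo and version-history behavior listed for CSV tables can be pictured with a toy in-memory model. The class below is a hypothetical stand-in that only borrows the method names `revert` and `get_versions`; the return shapes and storage strategy are assumptions and may differ from the real Pixeltable table.

```python
import copy


class VersionedTable:
    """Toy table that stores a full snapshot of its rows for every change."""

    def __init__(self):
        self._versions = [[]]  # version 0 is the empty table

    @property
    def rows(self):
        return self._versions[-1]

    def insert(self, row):
        snapshot = copy.deepcopy(self.rows)
        snapshot.append(dict(row))
        self._versions.append(snapshot)

    def update(self, index, **changes):
        snapshot = copy.deepcopy(self.rows)
        snapshot[index].update(changes)
        self._versions.append(snapshot)

    def revert(self):
        """Drop the latest version, restoring the previous one."""
        if len(self._versions) == 1:
            raise RuntimeError("already at the initial version")
        self._versions.pop()

    def get_versions(self):
        return [{"version": i, "row_count": len(v)} for i, v in enumerate(self._versions)]


table = VersionedTable()
table.insert({"city": "Paris", "population_m": 2.1})
table.update(0, population_m=2.2)   # an inline edit
table.revert()                      # undo the edit
print(table.rows)
print(table.get_versions())
```

Because every change creates a new version, undo depth is bounded only by how many versions are kept.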
Media Library: Gallery + AI editing
Gallery for generated images and videos. Save to collection triggers CLIP embedding, keyframe extraction, transcription, and RAG indexing automatically. Reve AI editing via reve.edit() (natural language instructions) and reve.remix() (creative blending) with side-by-side preview.
Developer: Export, API reference, SDK, MCP
- Export: Download any table as JSON, CSV, or Parquet with row-limit control and live preview
- API: Categorized endpoint browser with method badges and expandable curl examples
- SDK: Python code snippets for connecting, querying, semantic search, exporting to Pandas, and versioning
- Connect: MCP server config for Claude/Cursor, direct Python access, REST API examples
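As a rough illustration of export with a row limit, the sketch below serializes a list of row dictionaries to JSON or CSV using only the standard library. The function name `export_rows` and the sample rows are hypothetical, and Parquet is left out because it needs a third-party library.

```python
import csv
import io
import json


def export_rows(rows, fmt="json", limit=None):
    """Serialize up to `limit` rows (all rows if None) as a JSON or CSV string."""
    selected = rows if limit is None else rows[:limit]
    if fmt == "json":
        return json.dumps(selected, indent=2)
    if fmt == "csv":
        if not selected:
            return ""
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer, fieldnames=list(selected[0].keys()), lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(selected)
        return buffer.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


# Hypothetical rows standing in for a table's contents.
rows = [
    {"id": 1, "title": "intro"},
    {"id": 2, "title": "setup"},
    {"id": 3, "title": "usage"},
]

print(export_rows(rows, "csv", limit=2))
print(export_rows(rows, "json", limit=1))
```

Applying the limit before serialization keeps a live preview cheap even for large tables.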
Database: Catalog explorer
Tables and views grouped by type (Agent Pipeline, Documents, Images, Videos, Audio, Generation, Memory, Data Tables). Schema inspection with computed vs. insertable column badges. Paginated row browser with client-side search, row filter, and CSV download. Cross-table join panel (INNER/LEFT/CROSS) with table/column pickers and result preview.
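For readers less familiar with the three join modes offered in the join panel, here is a small pure-Python sketch of their semantics on lists of dictionaries. It does not use Pixeltable; the `join` helper and the `docs`/`chunks` data are hypothetical.

```python
from itertools import product


def join(left, right, on=None, how="inner"):
    """Join two lists of dict rows; `on` names the shared key column."""
    if how == "cross":
        # Every left row paired with every right row; on a column-name
        # collision the right-hand value wins in this sketch.
        return [{**l, **r} for l, r in product(left, right)]
    if how not in ("inner", "left"):
        raise ValueError(f"unsupported join mode: {how}")
    right_cols = [c for c in (right[0] if right else {}) if c != on]
    result = []
    for l in left:
        matches = [r for r in right if r[on] == l[on]]
        if matches:
            result.extend({**l, **r} for r in matches)
        elif how == "left":
            # Keep the unmatched left row and pad right-hand columns with None.
            result.append({**l, **{c: None for c in right_cols}})
    return result


docs = [{"doc_id": 1, "title": "A"}, {"doc_id": 2, "title": "B"}]
chunks = [{"doc_id": 1, "text": "x"}, {"doc_id": 1, "text": "y"}]

inner_rows = join(docs, chunks, on="doc_id", how="inner")
left_rows = join(docs, chunks, on="doc_id", how="left")
cross_rows = join(docs, chunks, how="cross")
```

INNER drops document 2 because it has no chunks, LEFT keeps it with an empty `text`, and CROSS ignores the key entirely.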
Architecture: Interactive diagram
React Flow diagram with 38 nodes and 40 edges in swim-lane layout. Click any node to highlight its connections. Covers the full data flow: document chunking, image CLIP, video dual pipeline, audio transcription, 11-step agent pipeline, generation, and feedback edges.
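The click-to-highlight behavior comes down to finding the edges that touch a node. The sketch below shows that lookup in Python for consistency with the backend; the frontend would do the equivalent in TypeScript. The node IDs are hypothetical, and only the `source`/`target` edge fields follow React Flow's edge shape.

```python
def connections(node_id, edges):
    """Return the edges touching node_id and the set of neighboring node IDs."""
    touching = [e for e in edges if node_id in (e["source"], e["target"])]
    neighbors = {e["source"] for e in touching} | {e["target"] for e in touching}
    neighbors.discard(node_id)
    return touching, neighbors


# Hypothetical slice of the diagram; the real one has 38 nodes and 40 edges.
edges = [
    {"id": "e1", "source": "documents", "target": "chunks"},
    {"id": "e2", "source": "chunks", "target": "embedding_index"},
    {"id": "e3", "source": "images", "target": "embedding_index"},
    {"id": "e4", "source": "embedding_index", "target": "agent"},
]

touching, neighbors = connections("embedding_index", edges)
print([e["id"] for e in touching])
print(sorted(neighbors))
```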
History & Memory
Searchable conversation history with workflow detail dialog and JSON export. Unified timeline across all timestamped Pixeltable tables. Memory bank with semantic search and manual entry.
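A unified timeline amounts to tagging rows from several tables with their origin and sorting them by timestamp. The following is a minimal sketch of that idea, not the app's code: `unified_timeline`, the table names other than `prompt_experiments`, and the sample rows are assumptions.

```python
from datetime import datetime


def unified_timeline(tables, limit=None):
    """Merge rows from several tables into one list, newest first."""
    events = [
        {"table": name, **row}
        for name, rows in tables.items()
        for row in rows
    ]
    events.sort(key=lambda e: e["timestamp"], reverse=True)
    return events if limit is None else events[:limit]


# Hypothetical rows; each table only needs a `timestamp` column.
tables = {
    "chat_history": [
        {"timestamp": datetime(2025, 1, 1, 9, 0), "summary": "asked about stock prices"},
        {"timestamp": datetime(2025, 1, 3, 14, 30), "summary": "generated an image"},
    ],
    "memory_bank": [
        {"timestamp": datetime(2025, 1, 2, 8, 15), "summary": "saved a code snippet"},
    ],
    "prompt_experiments": [
        {"timestamp": datetime(2025, 1, 3, 16, 0), "summary": "compared four models"},
    ],
}

timeline = unified_timeline(tables)
for event in timeline:
    print(event["timestamp"].isoformat(), event["table"], event["summary"])
```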
Every row maps to a Pixeltable feature exercised in this app:
| Feature | Usage | Docs |
|---|---|---|
| Tables + multimodal types | Document, Image, Video, Audio, Json | Tables |
| Computed columns | 11-step agent pipeline, thumbnails, summarization | Computed Columns |
| Views + iterators | DocumentSplitter, FrameIterator, AudioSplitter | Iterators |
| Embedding indexes | E5-large-instruct, CLIP ViT-B/32 → .similarity() | Embedding Indexes |
| @pxt.udf | News API, financial data, context assembly | UDFs |
| @pxt.query | search_documents, search_images, search_video_frames | RAG |
| pxt.tools() + invoke_tools() | Agent tool selection + execution | Tool Calling |
| Agent memory | Chat history + memory bank with embedding search | Memory |
| LLM integrations | Anthropic, Google, OpenAI, Mistral | Integrations |
| Reve AI | reve.edit() / reve.remix() for image editing | Reve |
| PIL transforms | Resize, rotate, blur, sharpen, edge detect | PIL |
| Video UDFs | extract_frame, clip, overlay_text, scene_detect_content | Video |
| Document processing | Gemini structured-JSON summarization, chunking | Chunking |
| CSV / tabular data | Dynamic table creation, inline CRUD, type coercion | CSV Import |
| Object detection | On-demand DETR with bounding box overlay | Detection |
| Panoptic segmentation | DETR Panoptic with color-coded segment regions | Segmentation |
| Text-to-speech | OpenAI TTS computed column with 6 voice options | TTS |
| Cross-table joins | table.join() with inner/left/cross modes | Joins |
| Table versioning | tbl.revert(), tbl.get_versions() | Versioning |
| Structured output | Gemini response_schema + Pydantic models | Structured Output |
| Catalog introspection | pxt.list_tables(), tbl.columns(), tbl.count() | Tables |
| Data export | JSON, CSV, Parquet via /api/export/ | Export |
| MCP | Config for Claude, Cursor, AI IDEs | MCP |
Prerequisites: Python 3.10+, Node.js 18+
Required: ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY
Optional: MISTRAL_API_KEY, REVE_API_KEY, NEWS_API_KEY
All providers are swappable. Pixeltable supports local runtimes and 20+ integrations.
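Before the first run, it can help to confirm which keys are present. The sketch below checks the environment against the required and optional lists above; `check_keys` is a hypothetical helper that is not part of the repository, and it assumes the keys have already been loaded into the process environment.

```python
import os

REQUIRED = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "GOOGLE_API_KEY")
OPTIONAL = ("MISTRAL_API_KEY", "REVE_API_KEY", "NEWS_API_KEY")


def check_keys(env=None):
    """Return (missing_required, missing_optional) for the given environment."""
    env = os.environ if env is None else env
    missing_required = [key for key in REQUIRED if not env.get(key)]
    missing_optional = [key for key in OPTIONAL if not env.get(key)]
    return missing_required, missing_optional


missing_required, missing_optional = check_keys()
if missing_required:
    print("Missing required keys:", ", ".join(missing_required))
if missing_optional:
    print("Optional keys not set:", ", ".join(missing_optional))
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.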
# Install
cd backend && python -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt
cd ../frontend && npm install
# Configure: create backend/.env with your API keys
# Run
cd backend && python setup_pixeltable.py # first time only
python main.py # :8000
cd ../frontend && npm run dev # :5173, proxies /api to :8000

Production: run cd frontend && npm run build, which outputs to backend/static/; python main.py then serves the app at :8000.
backend/
├── main.py                FastAPI app, CORS, static serving
├── config.py              model IDs, system prompts, LLM parameters
├── models.py              Pydantic request/response schemas
├── functions.py           @pxt.udf and @pxt.query definitions
├── setup_pixeltable.py    full schema (tables, views, columns, indexes)
└── routers/
    ├── chat.py            11-step agent workflow
    ├── studio.py          transforms, detection, segmentation, CSV, Reve, embeddings
    ├── images.py          Imagen/DALL-E/Veo generation, TTS
    ├── experiments.py     parallel multi-model prompt runs
    ├── export.py          JSON/CSV/Parquet for any table
    ├── database.py        catalog introspection, timeline, joins
    ├── files.py           upload, URL import
    ├── history.py         conversation detail, debug export
    ├── memory.py          memory bank CRUD
    └── personas.py        persona CRUD
frontend/src/
├── components/
│   ├── chat/            agent UI, personas, image/video/voice modes
│   ├── experiments/     prompt lab, model select, metrics
│   ├── studio/          file browser, transforms, CSV, detection, segmentation, embedding map
│   ├── developer/       export, API reference, SDK snippets, MCP config
│   ├── database/        catalog browser, search, filter, download, joins
│   ├── architecture/    React Flow diagram (38 nodes, swim lanes)
│   ├── images/          media library, Reve edit/remix
│   ├── history/         conversations, timeline
│   ├── memory/          memory bank
│   └── settings/        persona editor
├── lib/api.ts           typed fetch wrapper
└── types/index.ts       shared interfaces
| Project | Description |
|---|---|
| Pixeltable | The core library: declarative AI data infrastructure |
| Pixelagent | Lightweight agent framework with built-in memory |
| Pixelmemory | Persistent memory layer for AI apps |
| MCP Server | Model Context Protocol server for Claude, Cursor, AI IDEs |
Rough edges are expected. If you find a Pixeltable feature that's missing or awkward, open an issue or PR.
Apache 2.0; see LICENSE.