FlowGPT Project

The Challenge

Users needed a single, unified interface to interact with various LLM providers (OpenAI, Anthropic, etc.) without switching platforms. The system had to handle varying API schemas, maintain long-term conversational memory, and efficiently process file uploads for RAG-based queries.

The Solution

I architected a platform that integrates multiple LLMs into a seamless conversational UI:

  • Unified Backend: Abstracted different LLM APIs into a single standardized interface.
  • Contextual Memory: Implemented a vector-store memory system to retrieve relevant past interactions.
  • Advanced RAG: Built a Retrieval-Augmented Generation pipeline for file analysis and context-driven answers.
  • Autonomous Agents: Deployed specific agents for complex tasks like "Deep Research".
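The unified backend idea can be sketched as a thin adapter layer: every vendor SDK is wrapped behind one interface, and the rest of the platform never sees a vendor-specific response shape. The provider classes and response shapes below are illustrative stand-ins, not the project's actual code.

```typescript
type ChatMessage = { role: "user" | "assistant" | "system"; content: string };

// The single standardized interface the platform depends on.
interface LLMProvider {
  name: string;
  complete(messages: ChatMessage[]): Promise<string>;
}

// Hypothetical adapter: OpenAI-style responses nest text under choices[].
class OpenAIStyleProvider implements LLMProvider {
  name = "openai";
  constructor(private raw: (m: ChatMessage[]) => Promise<any>) {}
  async complete(messages: ChatMessage[]): Promise<string> {
    const res = await this.raw(messages);
    return res.choices[0].message.content;
  }
}

// Hypothetical adapter: Anthropic-style responses nest text under content[].
class AnthropicStyleProvider implements LLMProvider {
  name = "anthropic";
  constructor(private raw: (m: ChatMessage[]) => Promise<any>) {}
  async complete(messages: ChatMessage[]): Promise<string> {
    const res = await this.raw(messages);
    return res.content[0].text;
  }
}

// Callers talk only to LLMProvider, never to a vendor SDK directly.
async function ask(provider: LLMProvider, prompt: string): Promise<string> {
  return provider.complete([{ role: "user", content: prompt }]);
}
```

Swapping providers then becomes a configuration choice rather than a code change, which is what makes "without switching platforms" possible for the user.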
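The contextual-memory bullet boils down to nearest-neighbor retrieval over embeddings. A minimal sketch, assuming embeddings arrive from some external embedding model as plain number arrays (the real system would use a vector store rather than an in-memory array):

```typescript
type MemoryEntry = { text: string; embedding: number[] };

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the k stored interactions most similar to the query embedding.
function retrieveRelevant(
  store: MemoryEntry[],
  queryEmbedding: number[],
  k: number
): MemoryEntry[] {
  return [...store]
    .sort(
      (x, y) =>
        cosine(y.embedding, queryEmbedding) - cosine(x.embedding, queryEmbedding)
    )
    .slice(0, k);
}
```

The retrieved entries are then prepended to the prompt, so the model sees relevant past interactions without replaying the entire history.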

Data Storage Architecture

  • Chat History - JSON Blob via Turso:
    All user-generated chats are stored as structured JSON blobs in Turso, a distributed edge database optimized for low-latency reads and writes.
    - Format: chat_id, user_id, name, messages[], timestamp
    - Justification: JSON allows flexible storage of metadata such as token usage, model responses, and user edits.
  • Users & Sessions - SQL with Clerk:
    User authentication, session persistence, and metadata (such as roles and permissions) are managed via Clerk on top of a traditional SQL database.
    - Schema: Users, Sessions, API keys, OAuth tokens
    - Security: built-in 2FA, passwordless auth, session encryption
  • Files & Images - Object Storage via UploadThing:
    User-uploaded assets (files, images, media) are stored via UploadThing, which uses object storage (e.g., AWS S3) under the hood.
    - CDN-enabled for fast access
    - Metadata (file type, size, access history) linked to SQL
  • Additional data stores: MongoDB, Redis
  • Cloud & deployment: AWS (EC2, S3), Docker

Challenges

  • Integrating multiple LLMs with varying APIs and response formats.
  • Optimizing real-time communication for low latency.
  • Managing contextual memory across long user sessions.
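The last challenge usually comes down to keeping the prompt under a token budget as a session grows. A minimal sketch of one common approach, sliding-window trimming that preserves the system prompt; the chars/4 token estimate is a rough stand-in for a real tokenizer:

```typescript
type Msg = { role: string; content: string };

// Rough token estimate (~4 characters per token); a real tokenizer
// would be used in practice.
const approxTokens = (m: Msg) => Math.ceil(m.content.length / 4);

// Keep the most recent messages that fit the budget, never dropping the
// system prompt (assumed to be the first message, if present).
function trimToBudget(messages: Msg[], budget: number): Msg[] {
  const system = messages[0]?.role === "system" ? [messages[0]] : [];
  const rest = messages.slice(system.length);
  let used = system.reduce((s, m) => s + approxTokens(m), 0);
  const kept: Msg[] = [];
  for (let i = rest.length - 1; i >= 0; i--) {
    const t = approxTokens(rest[i]);
    if (used + t > budget) break;
    used += t;
    kept.unshift(rest[i]);
  }
  return [...system, ...kept];
}
```

Older turns that fall out of the window are the ones the vector-memory layer can bring back on demand, so trimming and retrieval complement each other.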

The Result

The FlowGPT platform achieved a 95% user satisfaction rate in beta testing, with an average response time of under 200ms. It successfully handled 10,000+ concurrent users and supported multilingual interactions, proving the scalability of the custom architecture.

Back to Portfolio