GOZORA Brain

AI-Powered Knowledge Operating System

"Capture Once. Remember Forever. Retrieve Instantly."

GOZORA is a personal, AI-powered knowledge operating system designed to eliminate information overload. It serves as a unified second brain that captures, synthesizes, and retrieves anything you consume.

Core Capabilities

Universal Ingestion: Send the bot any type of content—YouTube links, GitHub repos, web articles, PDFs, documents, voice notes, or text scribbles.
AI Curation & Synthesis: The background processing engine analyzes, structures, and scores incoming content into enriched "Knowledge Objects"—generating high-quality summaries, tagging topics, extracting action items, and evaluating signal-to-noise ratios.
Intelligent Editorial Delivery: Generates curated daily and weekly editions featuring editorial takes, "Build/Learn This Today" guides, open-source picks, and opportunity radars.
Instant Retrieval: Find exactly what you need through natural-language search that combines keyword (lexical) and semantic (vector) matching.

System Viewport Preview

https://gozora.me/daily

The Challenge

GOZORA: Scalable Knowledge Engine

Key Challenges

Frictionless Ingestion:
- Synchronous Webhook Limits: Handle 30-minute media processing without timeouts
- Network & Rate Limits: Manage rate limits for third-party media downloads and parsing
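
The source does not say how rate limits are managed. One common option is a token bucket placed in front of outgoing third-party requests; the sketch below illustrates that idea only. `TokenBucket`, its parameters, and the explicit `nowMs` clock argument are hypothetical names chosen for this example.

```typescript
// Token bucket: holds up to `capacity` tokens and refills at `refillPerSec`.
// Assumption: this is an illustrative limiter, not GOZORA's actual code.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(
    private readonly capacity: number,
    private readonly refillPerSec: number,
    nowMs: number,
  ) {
    this.tokens = capacity; // start with a full bucket
    this.lastRefillMs = nowMs;
  }

  // Returns true and spends one token if a request may go out at `nowMs`.
  tryAcquire(nowMs: number): boolean {
    const elapsedSec = (nowMs - this.lastRefillMs) / 1000;
    // Add the tokens earned since the last call, never exceeding capacity.
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSec * this.refillPerSec,
    );
    this.lastRefillMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

A worker could call `tryAcquire(Date.now())` before each download and put the job back on the queue with a delay when it returns `false`. Passing the clock in explicitly keeps the limiter easy to test.
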
Multi-Modal Content Extraction & Scraping:
- YouTube Processing: Extract subtitles, timestamps, and metadata without triggering bot-detection blocks
- Web Scraping: Extract clean main-text content from modern SPAs while discarding ads and navigation menus
High-Quality AI Synthesis & Curation:
- JSON Schema Enforcement: Extract complex nested data structures from LLM output without parser errors
- Editorial Signal Scoring: Develop a consistent scoring algorithm (1–100) to measure signal-to-noise ratio and significance
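
As a rough illustration of schema enforcement, the sketch below parses raw LLM text and rejects anything that does not match the expected shape, including a signal score outside 1 to 100. The `KnowledgeObject` field names and the `parseKnowledgeObject` function are assumptions made for this example; the source does not describe the actual schema or the validation library used.

```typescript
// Hypothetical shape of an enriched Knowledge Object; field names are assumptions.
interface KnowledgeObject {
  summary: string;
  tags: string[];
  actionItems: string[];
  signalScore: number; // integer from 1 to 100
}

const isStringArray = (v: unknown): v is string[] =>
  Array.isArray(v) && v.every((item) => typeof item === "string");

// Parse raw LLM text and validate it instead of trusting it.
// Throws an Error describing the first problem found.
function parseKnowledgeObject(raw: string): KnowledgeObject {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    throw new Error("LLM output is not valid JSON");
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("expected a JSON object");
  }
  const { summary, tags, actionItems, signalScore } = data as Record<
    string,
    unknown
  >;
  if (typeof summary !== "string") throw new Error("summary must be a string");
  if (!isStringArray(tags)) throw new Error("tags must be an array of strings");
  if (!isStringArray(actionItems)) {
    throw new Error("actionItems must be an array of strings");
  }
  if (
    typeof signalScore !== "number" ||
    !Number.isInteger(signalScore) ||
    signalScore < 1 ||
    signalScore > 100
  ) {
    throw new Error("signalScore must be an integer from 1 to 100");
  }
  return { summary, tags, actionItems, signalScore };
}
```

A caller can catch the error and re-prompt the model, so malformed output never reaches the database.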

Technical Solutions

Graph-Based Concept Mapping (Semantic Links):
- Entity Resolution: Identify and link new captures to existing concepts
- Relationship Extraction: Automate connection discovery between different files, links, or concepts
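
One plausible way to link a new capture to an existing concept is to compare embeddings and accept the closest match above a similarity threshold. The sketch below assumes embeddings are already computed; `Concept`, `resolveEntity`, and the 0.85 threshold are hypothetical and not taken from the source.

```typescript
// Hypothetical record for a concept already stored in the knowledge graph.
interface Concept {
  id: string;
  embedding: number[];
}

// Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the id of the most similar existing concept, or null when nothing
// reaches the threshold (the caller would then create a new concept).
function resolveEntity(
  candidate: number[],
  existing: Concept[],
  threshold = 0.85, // assumed cut-off, to be tuned on real data
): string | null {
  let bestId: string | null = null;
  let bestScore = threshold;
  for (const concept of existing) {
    const score = cosineSimilarity(candidate, concept.embedding);
    if (score >= bestScore) {
      bestId = concept.id;
      bestScore = score;
    }
  }
  return bestId;
}
```

In a PostgreSQL setup with pgvector, the same nearest-neighbor lookup would normally run inside the database rather than in application code.
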
Dual-Engine Hybrid Search:
- Keyword Search (Lexical): Find exact phrases, URLs, or specific names
- Semantic Search (Vector): Match abstract concepts and conceptual intent
- Unified Ranking: Combine lexical and semantic similarity ranking for highly relevant results
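
The source does not state how the lexical and semantic rankings are combined. Reciprocal Rank Fusion (RRF) is one widely used option, sketched here as an assumption: each document scores 1 / (k + rank) in every list it appears in, and the scores are summed. The function name and the conventional constant k = 60 are illustrative choices.

```typescript
// Fuse several ranked lists of document ids with Reciprocal Rank Fusion.
// score(d) = sum over lists of 1 / (k + rank of d in that list), rank from 1.
// Assumption: RRF stands in for GOZORA's unspecified ranking method.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Usage: ids from a full-text query and from a vector query.
const lexical = ["a", "b", "c"];
const semantic = ["b", "d", "a"];
const fused = reciprocalRankFusion([lexical, semantic]);
// fused is ["b", "a", "d", "c"]: items found by both engines rise to the top.
```

RRF needs only ranks, not raw scores, so it sidesteps the problem of full-text relevance scores and vector distances living on different scales.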

Implementation Process

Phase 1: Architecture Design & Data Modeling
- Designed relational schema with Drizzle ORM for complex knowledge networks
- Configured PostgreSQL with pgvector for embeddings and optimized GIN indices for full-text search
Phase 2: Ingestion Bot & Event-Driven Processing
- Built Telegram bot with grammY framework for webhook reception
- Set up Redis and BullMQ for asynchronous ingestion queue and worker design
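
The pattern behind this phase is to acknowledge the webhook immediately and do the slow work in a separate worker. The sketch below shows that split with an in-memory queue standing in for Redis and BullMQ, so it runs without external services. `IngestionQueue`, `handleWebhook`, `runWorker`, and the job fields are hypothetical names, not the project's actual code.

```typescript
// Hypothetical job payload; a real job would carry more metadata.
interface IngestionJob {
  chatId: number;
  url: string;
}

// In-memory stand-in for a Redis-backed queue (assumption for illustration).
class IngestionQueue {
  private jobs: IngestionJob[] = [];

  enqueue(job: IngestionJob): void {
    this.jobs.push(job);
  }

  // Remove and return the oldest job, or undefined when the queue is empty.
  dequeue(): IngestionJob | undefined {
    return this.jobs.shift();
  }

  size(): number {
    return this.jobs.length;
  }
}

// Webhook side: do no heavy work, only enqueue and acknowledge.
function handleWebhook(
  queue: IngestionQueue,
  job: IngestionJob,
): { status: number } {
  queue.enqueue(job);
  return { status: 200 };
}

// Worker side: runs outside the request cycle, so long jobs cannot time out
// the webhook. Returns the number of jobs processed.
async function runWorker(
  queue: IngestionQueue,
  processJob: (job: IngestionJob) => Promise<void>,
): Promise<number> {
  let processed = 0;
  for (let job = queue.dequeue(); job !== undefined; job = queue.dequeue()) {
    await processJob(job);
    processed++;
  }
  return processed;
}
```

With a durable queue in place of the in-memory array, jobs also survive process restarts and can be retried on failure.
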
Phase 3: AI Enrichment & Entity Resolution
- Implemented LLM prompt structures with schema enforcement for structured JSON parsing
- Integrated AI evaluation for signal scoring (1–100) and graph linking for entity resolution
Phase 4: Hybrid Search & Retrieval Logic
- Implemented query embedding generation and dual-query matching with Postgres full-text search and pgvector indexes
- Created reranking engine for precise results on abstract and exact-word queries
Phase 5: Automated Curation & Web Client
- Designed curation engine for daily captures, topic grouping, and structured editions
- Built React dashboard with Vite and Lucide React, connected to Hono API endpoints
