Knowledge & Context

Retrieval, memory, indexing, knowledge organization, and data connectivity.

91 projects, 6 subcategories, 47 tags
Memory layers, context compression, and state management for AI systems.

Acontext

A context data platform for self-learning agents to store, observe, and distill experiences.

Agentic Context Engine

Agentic Context Engine (ACE) is a framework and implementation for enabling agents to learn from experience through structured context engineering.

AgentMemory

Persistent memory layer for AI coding agents that retains context across sessions and is evaluated against real-world benchmarks.

Basic Memory

A local-first knowledge-as-Markdown system that lets LLMs read and write your memory via the Model Context Protocol (MCP).

Claude Mem

A Claude Code plugin that captures coding-session context, compresses it with AI, and injects relevant memory into future sessions.

Letta

Platform for building stateful agents with advanced memory and self-improvement capabilities, supporting both local and cloud deployments.

LocalRecall

LocalRecall provides a local memory layer and knowledge base management API for agents and RAG scenarios.

Mem0

Mem0 is a scalable memory layer for AI agents that provides long-term, personalized, and efficient memory storage and retrieval.

Memanto

An open-source memory layer for AI agents featuring a 7-layer memory architecture that supports short-term, long-term, and semantic memory with RAG-based retrieval.

MemOS

MemOS is an open-source Memory OS that provides long-term memory capabilities for large language models (LLMs), improving context awareness and long-term consistency.

memU

memU is an open-source memory framework for AI companions, offering high accuracy, fast retrieval, and low cost for personalized AI experiences.

OpenHuman

OpenHuman is an open-source personal AI superintelligence assistant focused on privacy, simplicity, and power, featuring 118+ third-party integrations, local memory trees, an Obsidian wiki, and native voice interaction.

Supermemory

A high-performance, scalable memory engine and app providing a Memory API for storing, retrieving, and interacting with content in the AI era.

TencentDB Agent Memory

Tencent's local long-term memory system for AI agents, powered by a 4-tier progressive pipeline with zero external API dependencies.

Vector stores, ANN engines, and similarity search.

Deep Lake

A database for AI optimized for storing, querying and versioning vectors and multimodal data (images, video, audio, text) for LLM and deep learning workflows.

Faiss

A high-performance library for similarity search and clustering of dense vectors, suitable for large-scale vector retrieval.

Infinity

An AI-native database that delivers hybrid search over dense vectors, sparse vectors, tensors, full-text and structured data.

Milvus

Milvus is a high-performance vector database designed for large-scale unstructured data processing.

pgvector

pgvector is an open-source PostgreSQL extension that adds vector data types and similarity search, supporting exact and approximate search (HNSW, IVFFlat) inside Postgres.

Qdrant

Qdrant is a high-performance vector search engine for similarity search, built for scalable deployment and efficient data retrieval.

SeekDB

An AI-native search database that unifies vector, text, and structured data in a single engine to enable hybrid search and in-database AI workflows.

sqlite-vector

Integrates embedding storage and vector search into SQLite, providing a cross-platform lightweight vector database extension.

Chunking, retrieval, reranking, and indexing pipelines.

Airweave

Airweave lets agents search any app by connecting to apps, productivity tools, databases and document stores and turning their contents into searchable knowledge bases.

BISHENG

An open-source LLM DevOps platform for enterprise scenarios, offering workflows, RAG, model management and observability.

Chroma

Chroma is an open-source embedding database for AI applications, enabling efficient search, storage, and retrieval for intelligent RAG systems.

CocoIndex

A high-performance data processing and indexing framework for AI, supporting incremental processing and semantic indexing.

DB-GPT

DB-GPT is an open-source framework focused on data-native applications, integrating RAG, Text2SQL, and multi-backend adapters to simplify building intelligent database-driven apps.

DocsGPT

An open-source enterprise document agent platform combining RAG and multi-model support to provide citation-backed answers.

Embedding Atlas

A tool that provides interactive visualizations for large embeddings, allowing you to visualize, cross-filter, and search embeddings and metadata.

FastGPT

FastGPT is a platform for data processing and AI workflow orchestration that simplifies building advanced question-answering systems.

FinGPT

Open-source financial large language models with data pipelines, instruction tuning datasets, benchmarks and RAG toolkits.

Firecrawl

The Web Data API for AI that turns entire websites into clean markdown or structured data for RAG and knowledge pipelines.

Generative AI on Google Cloud

Sample code and notebooks demonstrating how to build and deploy generative AI workflows on Vertex AI and Gemini.

GraphRAG

GraphRAG is an open-source project by Microsoft Research that extracts structured knowledge from text to enhance retrieval and enable advanced temporal queries.

Haystack

Haystack is an open-source framework for building retrieval-augmented generation (RAG) and semantic search applications by combining document stores, vector search, and LLMs.

Khoj

A self-hostable 'second brain' platform that turns web pages and documents into a searchable knowledge base and supports custom agents and automations.

LanceDB

Developer-friendly, embedded retrieval engine for multimodal AI. Search More; Manage Less.

LangChain

A framework for building LLM-powered applications with composable components and rich integrations.

LangChain4j

An open-source Java library that provides a unified API for integrating large language models and vector databases into enterprise Java applications.

LEANN

LEANN is an AI platform that turns your laptop into a semantic search tool with zero cloud costs and full privacy.

LightRAG

LightRAG is a lightweight Retrieval-Augmented Generation toolkit that supports document indexing, graph extraction, and deployable server/core modes.

LlamaFarm

LlamaFarm is an open-source platform for deploying AI models, agents, vector databases, and RAG pipelines locally or remotely in minutes.

LlamaIndex

LlamaIndex is a data framework for LLM applications that helps structure and connect private data sources to models for retrieval-augmented generation.

LocalGPT

A private, on-premise document intelligence platform that combines hybrid retrieval and multi-model inference while keeping all data local.

Marker

Converts PDF, image, PPTX, DOCX, XLSX, HTML, EPUB files to markdown, JSON, chunks, and HTML quickly and accurately.

Memori

An open-source SQL-native memory engine that provides persistent, queryable context for Large Language Models.

Memvid

Encode millions of text chunks into portable MP4 files for millisecond semantic search and offline-first AI memory.

mgrep

A CLI-native semantic search tool for code, documents and media, with background indexing and agent integrations.

MineContext

MineContext is a proactive, context-aware AI partner combining Context-Engineering with ChatGPT Pulse to improve dialogue coherence and retrieval in RAG scenarios.

OpenViking

OpenViking is an open-source context database for AI Agents that unifies memories, resources, and skills with a filesystem paradigm for hierarchical retrieval and observability.

PageIndex

PageIndex (by Vectify AI) is an open-source reasoning-based document index designed for high-accuracy retrieval over long documents.

PandaWiki

PandaWiki is an open-source knowledge base system driven by large models, enabling fast building of intelligent documentation, FAQ and blog centers.

Pathway LLM App

Production-ready templates for RAG and AI pipelines that support live data synchronization and large-scale document indexing.

Perplexica

Perplexica is an open source AI-powered search engine positioned as an alternative to Perplexity AI.

RAG-Anything

A multimodal document processing and Retrieval-Augmented Generation (RAG) system supporting unified parsing and intelligent retrieval of text, images, tables, formulas, and more.

RAGFlow

An open-source RAG engine based on deep document understanding, supporting complex document parsing and knowledge Q&A.

SearXNG

A free, privacy-preserving internet metasearch engine that aggregates results from multiple search services and databases without user tracking.

SemTools

A command-line toolkit for semantic search, embedding generation, and document parsing for local and CI workflows.

text-embeddings-inference

Hugging Face's text-embeddings-inference provides an out-of-the-box text vectorization inference service, making it easy to build similarity search and semantic search applications.

Tongyi DeepResearch

An open research agent and toolset for long-horizon information-seeking and agentic tasks, developed by Tongyi Lab (Alibaba-NLP).

UltraRAG

A low-code RAG framework based on MCP, emphasizing visual orchestration and reproducible evaluation workflows.

Unstructured

An open-source ETL solution to convert complex documents into clean, structured formats for language-model workflows.

Vanna

Vanna is an open-source RAG framework that converts natural language questions into executable SQL and runs them against local databases.

Vespa

Vespa is a distributed engine designed for online AI and big-data workloads. It excels at low-latency retrieval and inference, supporting vector search, custom scoring, and near-real-time indexing.

Weaviate

Weaviate is an open-source, cloud-native vector database for storing objects and vectors, enabling scalable semantic search and structured filtering for AI applications.

WeKnora

WeKnora is an open-source document understanding and retrieval framework from Tencent that combines LLMs and RAG for multimodal document search and knowledge graph construction.

Wren AI

Open-source GenBI agent for querying databases in natural language and producing SQL, charts and AI-generated insights.

OCR, parsing, extraction, and document understanding.

Docling

Docling is an open-source framework for document understanding and conversion, supporting PDFs, DOCX, images, audio and more.

LangExtract

A Python library that uses LLMs to extract structured information from unstructured text and provides interactive visualization for review.

MinerU

MinerU is a high-precision PDF document parsing tool that converts complex PDFs into machine-readable Markdown and JSON formats, supporting formula, table, image extraction and multilingual OCR.

pdfly

A command-line tool to extract (meta)data from PDFs and manipulate PDF files at scale.

pdfplumber

An open-source Python library built on pdfminer.six that exposes detailed PDF objects, table extraction, and visual debugging features.

PyMuPDF

A high-performance Python library for data extraction, analysis, conversion, and manipulation of PDF and other documents.

spaCy

A high-performance, production-ready open-source natural language processing library providing pretrained pipelines, training tools, and extensible language components.

Stirling PDF

An open-source, self-hosted web PDF editor and processing platform that supports a wide range of PDF operations.

Tesseract OCR

Tesseract is a powerful open-source Optical Character Recognition (OCR) engine supporting over 100 languages, widely used for text extraction and document digitization.

Entity graphs, relationship modeling, and graph retrieval.

CodeGraph

Pre-indexed code knowledge graph for AI coding agents, supporting Claude Code, Codex, Cursor, and OpenCode with 100% local execution.

DeepTutor

A multi-agent personalized learning system integrating RAG, knowledge graphs, and interactive visualizations.

Understand Anything

Turn any code into an interactive knowledge graph you can explore, search, and ask questions about, with native support for Claude Code, Codex, Cursor, Copilot, and Gemini CLI.

Data ingestion, connectors, and synchronization pipelines.

Airbyte

Open-source data movement platform for ELT pipelines and AI agents, moving data from APIs, databases, and files to warehouses, lakes, and AI applications.

Crawl4AI

An open-source web crawler and scraper optimized for large language model workflows, producing clean Markdown and structured data with browser control and Docker deployment.

Data Prep Kit

Data Prep Kit accelerates unstructured data preparation for LLM applications.

DataTrove

DataTrove provides composable, platform-agnostic pipelines for large-scale text data processing, including extraction, filtering, deduplication and saving.

Gravitino

A high-performance, geo-distributed and federated metadata lake for unified metadata access and governance of data and AI assets.

MindsDB

A query engine for AI: a platform for building AI that can answer questions over large-scale federated data, billed as "the only MCP Server you'll ever need."

OpenMetadata

A unified metadata platform for data discovery, observability and governance with rich connectors and collaboration features.

pandas

pandas is an open-source Python library for structured data manipulation and analysis, a core dependency in ML and AI data preprocessing workflows.

Pixeltable

A declarative data infrastructure for multimodal AI workloads that simplifies storage, indexing, and inference.

Unity Catalog

An open, multimodal catalog for data and AI that provides unified governance, metadata management, and access control.
