Platform & Infrastructure

Cloud-native AI platform capabilities and infrastructure.

37 projects, 4 subcategories, 45 tags

Kubernetes-native AI infrastructure and scheduling.

Agent Substrate

Kubernetes-based system for managing agent workloads at scale, multiplexing many stateful actors onto fewer pods with sub-second activation and persistent state.

Fluid

An elastic data abstraction and acceleration layer for big data and AI applications on Kubernetes, enabling efficient data access through distributed caching.

HolmesGPT

An AI agent platform for cloud-native environments that automates alert investigation, root cause analysis, and remediation suggestions.

k8sgpt

An AI tool that provides diagnostic and analysis capabilities for Kubernetes, using LLMs to locate and explain cluster issues.

NVIDIA AI Cluster Runtime

NVIDIA AI Cluster Runtime (AICR) generates optimized, validated, and reproducible GPU-accelerated Kubernetes cluster configurations for AI training and inference.

NVIDIA GPU Operator

NVIDIA GPU Operator automates deployment, configuration, and management of GPU components and drivers in Kubernetes.

Olares

An open-source personal cloud OS built on Kubernetes, enabling self-hosted AI agents, local model serving, and private data sovereignty.

Open WebUI

A scalable, feature-rich web interface for interacting with large language models, providing a ChatGPT-like experience with support for multiple models and customization options.

Volcano

Volcano is a Kubernetes-native batch scheduling system (a CNCF project) that enhances kube-scheduler with advanced features for batch, HPC, and AI workloads.

Data platforms, lakehouse stacks, and data services.

3FS

A high-performance distributed file system designed for AI training and inference workloads, optimizing parallel I/O and data locality to support large-scale training.

AIPyApp

An open-source tool that integrates an interactive Python environment with LLMs for natural-language-driven Python execution and automation.

Apache Doris

Apache Doris is an easy-to-use, high-performance unified analytics database for real-time and offline analysis.

Apache Iceberg

A high-performance table format for huge analytic tables, offering snapshots, transactions, and multi-engine compatibility. Widely used in AI data pipelines and ML feature stores.

Apache Spark

A unified analytics engine for large-scale data processing, supporting batch, streaming, and machine learning workloads.

Apache Superset

An open-source data visualization and exploration platform supporting interactive dashboards, SQL-based analysis, and multiple data sources.

cuDF

A GPU DataFrame library that accelerates data analysis and tabular computing.

CVAT

CVAT is an industry-leading computer vision annotation tool suitable for annotation at any scale.

Dagster

A cloud-native orchestration and development platform for data assets, with strong observability and a developer-friendly programming model.

Dask

Dask is a Python library for parallel computing and task scheduling, suited to scaling NumPy, pandas, and machine learning workloads across clusters.

Datachain

ETL, analytics, and versioning for unstructured data to build reproducible and auditable data pipelines.

DataFlow

A data preparation and pipeline platform for domain training and retrieval-augmented generation.

DocuTranslate

DocuTranslate is a lightweight document translation tool leveraging LLMs and multiple parsing engines.

DuckDB

An analytical, in-process SQL database suited for interactive queries, ETL, and local analytics.

Jupyter Notebook

An interactive computing environment widely used for data science and machine learning development.

Label Studio

Label Studio is a multi-type data labeling and annotation tool with standardized output formats.

Proton

Proton is a high-performance SQL stream processing engine, written in C++ and shipped as a single binary, designed for real-time analytics and stream ETL.

Unstract

A no-code LLM platform to convert unstructured documents into structured data and quickly launch APIs and ETL pipelines.

Valkey

A high-performance distributed key-value database optimized for caching and real-time workloads.

Security policy, access control, and compliance tooling.

Agent Governance Toolkit

A toolkit for policy enforcement, zero-trust identity, execution sandboxing, and reliability engineering for autonomous AI agents, covering all ten risks in the OWASP Agentic Top 10.

Deployment pipelines and operations tooling.

CSGHub

KitOps

LMDeploy

MLC LLM

MLflow

MLRun

MLX

workerd