Layer

Inference & Hosting

APIs, endpoints, and platforms for serving AI models with low latency and high throughput.

Tools in this layer

2 tools

106.0k

GGML

LLM inference in C/C++ with minimal setup and state-of-the-art performance.

Cloudflare

Fast, affordable, and global open-source AI inference.