Replicate

A platform for running and fine-tuning open models behind a simple hosted API.

Replicate hosts a large catalogue of open models across text, image, audio, and video. Any of them can be called through the same hosted API, and usage is billed per second of compute.

You can also package and deploy your own models, making it a fast path from "found a model" to "calling it in production".
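As a rough sketch of what "calling it" could look like, the Python below builds a request to create a prediction using only the standard library. It assumes the HTTP API accepts a POST to https://api.replicate.com/v1/predictions with a bearer token and a JSON body holding a model version and its input. The helper names, the REPLICATE_API_TOKEN variable, and the placeholder values are illustrative, so check the current API reference before relying on them.

```python
import json
import os
import urllib.request

# Assumed endpoint for creating a prediction; verify against the API reference.
API_URL = "https://api.replicate.com/v1/predictions"


def build_prediction_request(version: str, model_input: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that asks for one prediction."""
    body = json.dumps({"version": version, "input": model_input}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


def create_prediction(version: str, model_input: dict) -> dict:
    """Send the request and return the parsed JSON response.

    Reads the token from REPLICATE_API_TOKEN (assumed variable name).
    """
    token = os.environ["REPLICATE_API_TOKEN"]
    request = build_prediction_request(version, model_input, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Nothing is sent until create_prediction is called with a real model version and input. Keeping request construction separate from sending makes the payload easy to inspect or log first.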

Where it's ideally used

A good fit when you want to run or fine-tune open models, especially multimodal ones, through a single hosted API.

Where it doesn't fit

Per-second compute billing can get expensive at sustained high volume: once the hardware is busy most of the time, dedicated serving is usually cheaper.
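To make the trade-off concrete, the break-even point can be estimated with simple arithmetic. The prices below are invented for illustration and are not Replicate's or any other provider's actual rates. The function names are likewise hypothetical.

```python
def per_second_cost(busy_seconds: float, price_per_second: float) -> float:
    """Cost of metered billing: pay only for seconds of compute actually used."""
    return busy_seconds * price_per_second


def break_even_utilization(price_per_second: float, dedicated_price_per_hour: float) -> float:
    """Fraction of each hour that must be busy before a dedicated machine is cheaper."""
    return dedicated_price_per_hour / (price_per_second * 3600)


# Illustrative numbers only (assumptions, not real pricing).
METERED_PRICE_PER_SECOND = 0.001   # dollars per second of compute
DEDICATED_PRICE_PER_HOUR = 1.80    # dollars per hour, busy or idle

threshold = break_even_utilization(METERED_PRICE_PER_SECOND, DEDICATED_PRICE_PER_HOUR)
print(f"Dedicated serving wins above {threshold:.0%} utilization")  # prints 50%
```

With these made-up rates, a workload that keeps the hardware busy more than half of every hour would cost less on a dedicated machine, while a bursty workload below that level is cheaper on per-second billing.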