Replicate
Model servingA platform for running and fine-tuning open models behind a simple hosted API.
Replicate hosts a large catalogue of open models — across text, image, audio, and video — and runs any of them behind a clean API, billing by the second of compute used.
You can also package and deploy your own models, making it a fast path from "found a model" to "calling it in production".
Where it's ideally used
A fit when you want to use or fine-tune open models, especially multimodal ones, through one hosted API.
Where it doesn't fit
Per-second compute billing can get expensive at sustained high volume, where dedicated serving is cheaper.