LocalAI
Model servingAn open-source, OpenAI-compatible API you can run locally over many model backends.
LocalAI is a drop-in, OpenAI-compatible API that runs on your own hardware. Beyond text it covers image generation, speech, and embeddings, across several inference backends, with no GPU required.
Its appeal is being a single self-hosted endpoint that replaces a hosted API across multiple modalities.
Where it's ideally used
A fit when you want one self-hosted, OpenAI-compatible endpoint covering text, images, and audio.
Where it doesn't fit
Its breadth across modalities is unnecessary when you only need fast, focused text inference.