Comet Lab Atlas

Infrastructure layer

Observability & evals

Tracing, evaluation, and monitoring — knowing whether the AI is actually working.

11 tools · 11 with full write-ups

Langfuse

An open-source platform for tracing, evaluating, and monitoring LLM and agent applications.

Open source · Self-host · Hosted API

Braintrust

A commercial platform for evaluating, logging, and iterating on AI products.

Proprietary · Hosted API

DeepEval

An open-source "Pytest for LLMs" — a unit-testing framework for model outputs.

Open source · Self-host

Helicone

An open-source observability platform that logs and analyzes LLM calls through a proxy.

Open source · Self-host · Hosted API

LangSmith

LangChain's platform for tracing, testing, and evaluating LLM and agent applications.

Proprietary · Hosted API

Lunary

An open-source observability and prompt-management platform for LLM applications.

Open source · Self-host · Hosted API

OpenLLMetry

An open-source set of OpenTelemetry extensions for standardized LLM observability.

Open source · Self-host

Opik

Comet's open-source platform for tracing, evaluating, and monitoring LLM applications.

Open source · Self-host · Hosted API

Phoenix

Arize's source-available tool for tracing, evaluating, and debugging LLM and agent apps.

Source-available · Self-host

promptfoo

An open-source tool for testing, evaluating, and red-teaming prompts and models.

Open source · Self-host

Ragas

An open-source framework for evaluating RAG pipelines on faithfulness and relevance.

Open source · Self-host