If you use public AI services such as OpenAI, Anthropic or Mistral, Sarus Arena is an agent you can easily deploy in your own infrastructure to handle:
- LLM evaluation: A/B testing, user-feedback evaluation, formula-based evaluation and LLM-as-a-judge
- LLM compliance: Request and response filtering and redaction (PII removal, guardrails), evaluation-based routing
- LLM distillation: Train your own model on the best-evaluated responses
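
To make two of these ideas concrete, here is a minimal Python sketch of request redaction and of selecting well-evaluated responses as distillation data. It is not Sarus Arena's API: the names `redact` and `best_responses`, the two regular expressions, and the record layout (`prompt`, `response`, `score`) are assumptions made purely for illustration, and real PII detection needs far broader coverage than two patterns.

```python
import re

# Illustrative patterns only (assumption): production PII detection
# covers many more entity types and formats.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def best_responses(records, min_score):
    """Keep (prompt, response) pairs whose evaluation score is at least min_score.

    `records` is a hypothetical list of dicts with keys
    'prompt', 'response' and 'score'.
    """
    return [
        (r["prompt"], r["response"])
        for r in records
        if r["score"] >= min_score
    ]


if __name__ == "__main__":
    print(redact("Contact jane.doe@example.com or +1 415-555-0100"))
    logged = [
        {"prompt": "Summarize the report", "response": "Short summary", "score": 0.9},
        {"prompt": "Summarize the report", "response": "Off-topic answer", "score": 0.2},
    ]
    print(best_responses(logged, min_score=0.5))
```

In this sketch, redaction runs on the text before it leaves your infrastructure, and the score threshold decides which logged responses would be kept as training examples; where the scores come from (user feedback, a formula, or a judge model) is left open.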