Towards Automated Evaluation of Multi-Turn Agentic AI Systems

We propose an LLM-based framework for the automated evaluation of multi-turn agentic AI systems. By simulating diverse user interactions and assessing performance on individual sub-tasks, the framework enables systematic testing and reduces reliance on expensive manual annotation.

Towards Automated Evaluation of Multi-Turn Agentic AI Systems - Basic Data

Category

Innosuisse

Reference number

129.524 INNO-ICT

Project start

10.07.2025

Project end

09.01.2026

Project duration

6 months

Project status

ongoing

Area

HSG
