We propose an LLM-based framework for the automated evaluation of multi-turn Agentic AI systems. By simulating diverse user interactions and assessing performance on individual sub-tasks, it enables systematic testing while reducing reliance on expensive manual annotation.
Towards Automated Evaluation of Multi-Turn Agentic AI Systems - Basic Data
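The evaluation loop sketched in the abstract can be illustrated as follows. This is a hypothetical sketch, not the authors' implementation: `simulated_user`, `agent`, and `judge` are stand-ins for LLM calls (user simulator, system under test, and judge model scoring each sub-task), and all names and scoring conventions are assumptions for illustration.

```python
# Hypothetical sketch of automated multi-turn evaluation:
# an LLM simulates a user persona, the agent under test responds,
# and a judge model scores each sub-task on the full transcript.
# All three functions below are deterministic stand-ins for LLM calls.

def simulated_user(persona: str, history: list[str]) -> str:
    # Stand-in for an LLM generating the next user turn for a persona.
    turn = len(history) // 2
    return f"{persona} request {turn}"

def agent(history: list[str]) -> str:
    # Stand-in for the agentic system being evaluated.
    return f"response to: {history[-1]}"

def judge(sub_task: str, history: list[str]) -> float:
    # Stand-in for an LLM judge scoring one sub-task in [0, 1].
    return 1.0 if any(sub_task in msg for msg in history) else 0.0

def evaluate(persona: str, sub_tasks: list[str], max_turns: int = 3) -> dict[str, float]:
    # Roll out a simulated multi-turn conversation, then score each sub-task.
    history: list[str] = []
    for _ in range(max_turns):
        history.append(simulated_user(persona, history))
        history.append(agent(history))
    return {task: judge(task, history) for task in sub_tasks}

scores = evaluate("booking", ["request", "response"])
print(scores)
```

Running many such rollouts over varied personas and sub-task checklists is what replaces per-conversation manual annotation in this style of evaluation.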