Version 0.4.1 - 2025-11-13

Added

Added support for Penelope LangChain integration.
Added LangGraph metrics example.
Added multi-turn test synthesizer.
Added scenarios feature for test case generation.
Added cost heuristic for Polyphemus benchmarking.
Added schema support for Hugging Face models.
Added SDK support for metric scope and test set type.
Added example workflow demonstrating MCPAgent usage.
Added schemas for search and extraction results within MCPAgent.
Added stop_on_error parameter to MCPAgent.
Added Endpoint entity with invoke method for easier API interaction.
Implemented structured output for tool calling via Pydantic schemas.
Implemented native Rhesis conversational metrics with Goal Achievement Judge.
Added core conversational metrics infrastructure, including Turn Relevancy and Goal Achievement.
Added goal-achievement-specific template with default metric settings.
Added ConversationalJudge architecture demo.
Added comprehensive GoalAchievementJudge test cases.
Added optional chatbot_role support in conversational metrics.