Rhesis SDK API Reference

This section provides detailed API documentation for the Rhesis SDK.

class ExecutionMode(value)[source]

Bases: str, Enum

Execution mode for test set runs.

Aligns with backend ExecutionMode: - PARALLEL: Tests dispatched concurrently (default) - SEQUENTIAL: Tests run one at a time

PARALLEL = 'Parallel'
SEQUENTIAL = 'Sequential'
classmethod from_string(value)[source]

Normalize a string or enum to ExecutionMode.

Accepts lowercase or capitalized forms: “parallel”, “sequential”, “Parallel”, “Sequential”, or an ExecutionMode enum value.

Parameters:

value (Union[str, ExecutionMode])

Return type:

ExecutionMode

class TestType(value)[source]

Bases: str, Enum

Enum for test types.

These values align with the backend TypeLookup table: - SINGLE_TURN: Traditional single request-response tests - MULTI_TURN: Agentic multi-turn conversation tests using Penelope

SINGLE_TURN = 'Single-Turn'
MULTI_TURN = 'Multi-Turn'
class RhesisClient(api_key=None, base_url=None, project_id=None, environment=None)[source]

Bases: object

Rhesis client with observability and telemetry capabilities.

This is the main user-facing client for applications that need: - OpenTelemetry tracing (@observe decorator) - Remote function execution (@endpoint decorator) - Automatic instrumentation

Users should create this via RhesisClient.from_environment() for automatic configuration from environment variables.

Example

```python from rhesis.sdk import RhesisClient

# Recommended: environment-based initialization client = RhesisClient.from_environment()

# Or explicit configuration client = RhesisClient(

api_key=”your-key”, project_id=”your-project”, environment=”production”

)

param api_key:

type api_key:

Optional[str], default: None

param base_url:

type base_url:

Optional[str], default: None

param project_id:

type project_id:

Optional[str], default: None

param environment:

type environment:

Optional[str], default: None

__init__(api_key=None, base_url=None, project_id=None, environment=None)[source]

Initialize the Rhesis observability client.

Parameters:
  • api_key (Optional[str], default: None) – Optional API key. If not provided, will try to get it from module level variable or environment variable.

  • base_url (Optional[str], default: None) – Optional base URL. If not provided, will try to get it from module level variable or environment variable.

  • project_id (Optional[str], default: None) – Optional project ID for remote endpoint testing. If not provided, will try to get from RHESIS_PROJECT_ID environment variable.

  • environment (Optional[str], default: None) – Optional environment name. If not provided, will try to get from RHESIS_ENVIRONMENT environment variable (default: “development”).

classmethod from_environment()[source]

Create a RhesisClient from environment variables.

This is the recommended way to initialize the client in applications. Returns a DisabledClient if RHESIS_CONNECTOR_DISABLED is set or if required credentials (RHESIS_PROJECT_ID, RHESIS_API_KEY) are missing.

Environment Variables:

RHESIS_CONNECTOR_DISABLED: Set to ‘true’ to disable the connector RHESIS_PROJECT_ID: Required project ID RHESIS_API_KEY: Required API key RHESIS_ENVIRONMENT: Optional, defaults to ‘development’ RHESIS_BASE_URL: Optional, defaults to ‘http://localhost:8080

Return type:

Union[RhesisClient, DisabledClient]

Returns:

RhesisClient or DisabledClient instance

register_endpoint(name, func, metadata)[source]

Register a function as a remotely callable endpoint.

Parameters:
  • name (str) – Endpoint function name

  • func – Function callable

  • metadata (dict) – Additional metadata

Return type:

None

class DisabledClient(*args, **kwargs)[source]

Bases: object

No-op client implementation used when RHESIS_CONNECTOR_DISABLED is enabled.

This client accepts all initialization parameters and method calls but performs no actual operations. It’s used to allow code to run without connector/observability overhead in test and CI environments.

Enabled with: RHESIS_CONNECTOR_DISABLED=true|1|yes|on (case-insensitive)

When DisabledClient is active: - @endpoint and @observe decorators return the original function unmodified - No telemetry initialization occurs - No connector manager is created - All method calls are no-ops

__init__(*args, **kwargs)[source]

Accept any initialization parameters and register as default client.

property is_disabled: bool

Return True to indicate this is a disabled client.

property base_url: str

Return empty string for base_url property.

property project_id: str | None

Return None for project_id property.

property environment: str

Return empty string for environment property.

endpoint(name=None, request_mapping=None, response_mapping=None, serializers=None, span_name=None, observe=True, bind=None, **metadata)[source]

Decorator to register functions as Rhesis endpoints with observability.

This decorator registers functions as remotely callable Rhesis endpoints. It enables two features: 1. OBSERVABILITY (Default On): Traces all executions with OpenTelemetry 2. REMOTE TESTING: Enables remote triggering from Rhesis platform

Parameters:
  • name (Optional[str], default: None) – Optional function name for registration (defaults to function.__name__)

  • span_name (Optional[str], default: None) – Optional semantic span name (e.g., ‘ai.llm.invoke’, ‘ai.tool.invoke’) Defaults to ‘function.<name>’ if not provided. This allows power users to specify AI operation types for better observability.

  • observe (bool, default: True) – Enable tracing (default: True). Set to False to disable tracing while keeping remote testing capability.

  • request_mapping (Optional[dict], default: None) –

    Manual input mappings (Rhesis standard field → function param) Maps incoming API request fields to your function’s parameters. Standard Rhesis REQUEST fields: input, session_id Custom fields: Any additional fields in the request are passed through jinja2.Template syntax: Jinja2 ({{ variable_name }}) Example: {

    ”user_message”: “{{ input }}”, “conv_id”: “{{ session_id }}”, “policy_id”: “{{ policy_number }}” # Custom field

    } For complex types, mapping keys should match function parameter names: Example: {

    ”request”: {

    “messages”: [{“role”: “user”, “content”: “{{ input }}”}], “context”: {“conversation_id”: “{{ session_id }}”},

    }

    }

  • response_mapping (Optional[dict], default: None) –

    Manual output mappings (function output → Rhesis standard field) Maps your function’s return value to Rhesis API response fields. Standard Rhesis RESPONSE fields: output, context, metadata, tool_calls pathlib.Path syntax: Jinja2 or JSONPath ($.path.to.field) Example: {

    ”output”: “$.result.text”, “session_id”: “$.conv_id”, “context”: “$.sources”, “metadata”: “$.stats”

    }

  • serializers (Optional[dict], default: None) –

    Custom serializers for specific types (optional) Provides custom dump (object→dict) and load (dict→object) functions for types that don’t follow standard patterns (Pydantic, dataclass, etc.) Format: {Type: {“dump”: callable, “load”: callable}} Example: {

    MyClass: {

    “dump”: lambda obj: obj.to_custom_format(), “load”: lambda d: MyClass.from_custom(d),

    }

    }

  • bind (Optional[dict], default: None) –

    Infrastructure dependencies to inject into the function (optional) Binds parameters that won’t appear in the remote function signature. Useful for database connections, auth context, configuration, etc. Supports both static values and callables (evaluated at call time). Example: {

    ”db”: lambda: get_db_session(), # Fresh connection per call “config”: AppConfig(), # Static singleton “user”: lambda: get_current_user() # Runtime context

    } Bound parameters are: - Excluded from the registered function signature - Automatically injected when the function is called - Evaluated at call time if callable, used directly if static

  • **metadata – Additional metadata about the function

Return type:

Callable

Returns:

Decorated function

Examples

# Example 1: Auto-mapping (zero config - recommended) @endpoint() def chat(input: str, session_id: str = None):

# REQUEST: input, session_id auto-detected # RESPONSE: output, session_id auto-extracted return {“output”: “…”, “session_id”: session_id}

# Example 2: Manual mapping with custom naming @endpoint(

request_mapping={

“user_query”: “{{ input }}”, # Standard field “conv_id”: “{{ session_id }}”, # Standard field “docs”: “{{ context }}” # Standard field

}, response_mapping={

“output”: “$.result.text”, # Nested output “session_id”: “$.conv_id”, “context”: “$.sources”

}

) def chat(user_query: str, conv_id: str = None, docs: list = None):

return {“result”: {“text”: “…”}, “conv_id”: conv_id, “sources”: […]}

# Example 3: Custom fields with manual mapping @endpoint(

request_mapping={

“question”: “{{ input }}”, “policy_id”: “{{ policy_number }}”, # Custom field from request “tier”: “{{ customer_tier }}” # Custom field from request

}, response_mapping={

“output”: “$.answer”, “metadata”: “$.stats”

}

) def insurance_query(question: str, policy_id: str, tier: str):

# Custom fields (policy_number, customer_tier) must be in API request return {“answer”: “…”, “stats”: {“premium”: tier == “gold”}}

# Example 4: Opt-out of tracing (rare use case) @endpoint(observe=False) def simple_function(x: int) -> int:

# Registered for remote testing but NOT traced return x * 2

# Example 5: Binding infrastructure dependencies @endpoint(

bind={

“db”: lambda: get_db_session(), # Fresh session per call “config”: AppConfig(), # Static singleton

}

) def query_data(db, config, input: str) -> dict:

# db and config are injected, only input appears in remote signature results = db.query(config.table, input) return {“output”: format_results(results)}

# Example 6: Binding with resource dependencies (auto-cleanup) # Option A: Raw generator function def get_db():

‘’’Generator that yields database session with auto-cleanup’’’ with database.get_session() as session:

yield session # Cleanup happens automatically

# Option B: Context manager with bind_context (recommended) from rhesis.sdk.decorators import bind_context

@endpoint(
bind={

“db”: get_db, # Works with raw generators # Or use bind_context for context managers (clearer than partial) # “db”: bind_context(database.get_session_with_tenant, org_id, user_id), “user”: lambda: get_current_user_context(),

}

) async def authenticated_query(db, user, input: str) -> dict:

# Database connection is automatically closed after execution if not user.is_authenticated:

return {“output”: “Unauthorized”}

return {“output”: db.query_for_user(user.id, input)}

Field Separation:

REQUEST fields (function inputs): - input: User query/message (required in API request) - session_id: Conversation tracking (optional in API request) - custom fields: Any additional fields in the API request

RESPONSE fields (function outputs): - output: Main response text (extracted from function return) - context: Retrieved documents/sources - metadata: Response metadata/stats - tool_calls: Available tools/functions - session_id: Can also be in response to preserve conversation ID

Raises:

RuntimeError – If RhesisClient not initialized before using decorator

collaborate(name=None, request_mapping=None, response_mapping=None, serializers=None, span_name=None, observe=True, bind=None, **metadata)

Decorator to register functions as Rhesis endpoints with observability.

This decorator registers functions as remotely callable Rhesis endpoints. It enables two features: 1. OBSERVABILITY (Default On): Traces all executions with OpenTelemetry 2. REMOTE TESTING: Enables remote triggering from Rhesis platform

Parameters:
  • name (Optional[str], default: None) – Optional function name for registration (defaults to function.__name__)

  • span_name (Optional[str], default: None) – Optional semantic span name (e.g., ‘ai.llm.invoke’, ‘ai.tool.invoke’) Defaults to ‘function.<name>’ if not provided. This allows power users to specify AI operation types for better observability.

  • observe (bool, default: True) – Enable tracing (default: True). Set to False to disable tracing while keeping remote testing capability.

  • request_mapping (Optional[dict], default: None) –

    Manual input mappings (Rhesis standard field → function param) Maps incoming API request fields to your function’s parameters. Standard Rhesis REQUEST fields: input, session_id Custom fields: Any additional fields in the request are passed through jinja2.Template syntax: Jinja2 ({{ variable_name }}) Example: {

    ”user_message”: “{{ input }}”, “conv_id”: “{{ session_id }}”, “policy_id”: “{{ policy_number }}” # Custom field

    } For complex types, mapping keys should match function parameter names: Example: {

    ”request”: {

    “messages”: [{“role”: “user”, “content”: “{{ input }}”}], “context”: {“conversation_id”: “{{ session_id }}”},

    }

    }

  • response_mapping (Optional[dict], default: None) –

    Manual output mappings (function output → Rhesis standard field) Maps your function’s return value to Rhesis API response fields. Standard Rhesis RESPONSE fields: output, context, metadata, tool_calls pathlib.Path syntax: Jinja2 or JSONPath ($.path.to.field) Example: {

    ”output”: “$.result.text”, “session_id”: “$.conv_id”, “context”: “$.sources”, “metadata”: “$.stats”

    }

  • serializers (Optional[dict], default: None) –

    Custom serializers for specific types (optional) Provides custom dump (object→dict) and load (dict→object) functions for types that don’t follow standard patterns (Pydantic, dataclass, etc.) Format: {Type: {“dump”: callable, “load”: callable}} Example: {

    MyClass: {

    “dump”: lambda obj: obj.to_custom_format(), “load”: lambda d: MyClass.from_custom(d),

    }

    }

  • bind (Optional[dict], default: None) –

    Infrastructure dependencies to inject into the function (optional) Binds parameters that won’t appear in the remote function signature. Useful for database connections, auth context, configuration, etc. Supports both static values and callables (evaluated at call time). Example: {

    ”db”: lambda: get_db_session(), # Fresh connection per call “config”: AppConfig(), # Static singleton “user”: lambda: get_current_user() # Runtime context

    } Bound parameters are: - Excluded from the registered function signature - Automatically injected when the function is called - Evaluated at call time if callable, used directly if static

  • **metadata – Additional metadata about the function

Return type:

Callable

Returns:

Decorated function

Examples

# Example 1: Auto-mapping (zero config - recommended) @endpoint() def chat(input: str, session_id: str = None):

# REQUEST: input, session_id auto-detected # RESPONSE: output, session_id auto-extracted return {“output”: “…”, “session_id”: session_id}

# Example 2: Manual mapping with custom naming @endpoint(

request_mapping={

“user_query”: “{{ input }}”, # Standard field “conv_id”: “{{ session_id }}”, # Standard field “docs”: “{{ context }}” # Standard field

}, response_mapping={

“output”: “$.result.text”, # Nested output “session_id”: “$.conv_id”, “context”: “$.sources”

}

) def chat(user_query: str, conv_id: str = None, docs: list = None):

return {“result”: {“text”: “…”}, “conv_id”: conv_id, “sources”: […]}

# Example 3: Custom fields with manual mapping @endpoint(

request_mapping={

“question”: “{{ input }}”, “policy_id”: “{{ policy_number }}”, # Custom field from request “tier”: “{{ customer_tier }}” # Custom field from request

}, response_mapping={

“output”: “$.answer”, “metadata”: “$.stats”

}

) def insurance_query(question: str, policy_id: str, tier: str):

# Custom fields (policy_number, customer_tier) must be in API request return {“answer”: “…”, “stats”: {“premium”: tier == “gold”}}

# Example 4: Opt-out of tracing (rare use case) @endpoint(observe=False) def simple_function(x: int) -> int:

# Registered for remote testing but NOT traced return x * 2

# Example 5: Binding infrastructure dependencies @endpoint(

bind={

“db”: lambda: get_db_session(), # Fresh session per call “config”: AppConfig(), # Static singleton

}

) def query_data(db, config, input: str) -> dict:

# db and config are injected, only input appears in remote signature results = db.query(config.table, input) return {“output”: format_results(results)}

# Example 6: Binding with resource dependencies (auto-cleanup) # Option A: Raw generator function def get_db():

‘’’Generator that yields database session with auto-cleanup’’’ with database.get_session() as session:

yield session # Cleanup happens automatically

# Option B: Context manager with bind_context (recommended) from rhesis.sdk.decorators import bind_context

@endpoint(
bind={

“db”: get_db, # Works with raw generators # Or use bind_context for context managers (clearer than partial) # “db”: bind_context(database.get_session_with_tenant, org_id, user_id), “user”: lambda: get_current_user_context(),

}

) async def authenticated_query(db, user, input: str) -> dict:

# Database connection is automatically closed after execution if not user.is_authenticated:

return {“output”: “Unauthorized”}

return {“output”: db.query_for_user(user.id, input)}

Field Separation:

REQUEST fields (function inputs): - input: User query/message (required in API request) - session_id: Conversation tracking (optional in API request) - custom fields: Any additional fields in the API request

RESPONSE fields (function outputs): - output: Main response text (extracted from function return) - context: Retrieved documents/sources - metadata: Response metadata/stats - tool_calls: Available tools/functions - session_id: Can also be in response to preserve conversation ID

Raises:

RuntimeError – If RhesisClient not initialized before using decorator

create_observer(name='custom', base_attributes=None)[source]

Create a custom ObserveDecorator instance for domain-specific use cases.

This enables developers to create their own observability decorators with custom methods and default attributes, following the pattern: myproject.telemetry.decorators import my_custom_observer

Parameters:
  • name (str, default: 'custom') – Name for the custom observer (for debugging/logging)

  • base_attributes (Optional[dict], default: None) – Default attributes to apply to all spans from this observer

Return type:

ObserveDecorator

Returns:

New ObserveDecorator instance that can be extended with custom methods

Example

# myproject/telemetry/decorators.py from rhesis.sdk.decorators import create_observer

# Create domain-specific observer db_observer = create_observer(

name=”database”, base_attributes={“service.name”: “user-service”, “db.system”: “postgresql”}

)

# Add custom methods db_observer.add_method(“query”, “ai.database.query”, operation_type=”database.query”) db_observer.add_method(

“transaction”, “ai.database.transaction”, operation_type=”database.transaction”

)

# myproject/services/user.py from myproject.telemetry.decorators import db_observer

@db_observer.query(table=”users”, operation=”select”) def get_user(user_id: str):

return db.query(“SELECT * FROM users WHERE id = %s”, user_id)

class ObserverBuilder(name)[source]

Bases: object

Builder pattern for creating custom observers with fluent API.

This provides the most ergonomic way to create domain-specific observers.

Example

# myproject/telemetry/decorators.py from rhesis.sdk.decorators import ObserverBuilder

# Create API observer with fluent interface api_observer = (

ObserverBuilder(“api”) .with_base_attributes(service_name=”payment-service”, service_version=”1.2.0”) .add_method(“http_call”, “ai.api.http”, operation_type=”api.http”) .add_method(“webhook”, “ai.api.webhook”, operation_type=”api.webhook”) .add_method(“graphql”, “ai.api.graphql”, operation_type=”api.graphql”) .build()

)

# myproject/services/payment.py from myproject.telemetry.decorators import api_observer

@api_observer.http_call(method=”POST”, endpoint=”/charges”) def create_charge(amount: float):

return stripe.create_charge(amount)

@api_observer.webhook(event_type=”payment.succeeded”) def handle_payment_webhook(payload: dict):

return process_payment_success(payload)

Parameters:

name (str)

__init__(name)[source]
Parameters:

name (str)

with_base_attributes(**attributes)[source]

Add base attributes that will be applied to all spans.

Return type:

ObserverBuilder

add_method(method_name, span_name, operation_type=None, **default_attributes)[source]

Add a convenience method to the observer.

Parameters:
Return type:

ObserverBuilder

build()[source]

Build and return the configured observer.

Return type:

ObserveDecorator

bind_context(func, *args, **kwargs)[source]

Helper to bind a context manager or generator function with arguments.

This is a convenience wrapper for creating fresh context managers per function call in @endpoint bind parameters. It’s clearer than using functools.partial and makes the intent explicit.

Parameters:
  • func (Callable) – A context manager function (decorated with @contextmanager) or generator

  • *args (Any) – Positional arguments to pass to the function

  • **kwargs (Any) – Keyword arguments to pass to the function

Return type:

Callable

Returns:

A callable that creates a fresh context manager when invoked

Examples

# Binding a database session with tenant context @endpoint(bind={

“db”: bind_context(get_db_with_tenant_variables, org_id, user_id)

}) def my_function(db, input: str):

return {“output”: db.query(input)}

# Binding configuration with parameters @endpoint(bind={

“config”: bind_context(get_config, env=”production”, debug=False)

}) def my_function(config, input: str):

return {“output”: config.process(input)}

Note

This is equivalent to: lambda: func(*args, **kwargs) But more explicit and self-documenting.

Core Components

Client

Configuration

class TypeVar(name, *constraints, bound=None, covariant=False, contravariant=False)[source]

Bases: _Final, _Immutable, _TypeVarLike

Type variable.

Usage:

T = TypeVar('T')  # Can be anything
A = TypeVar('A', str, bytes)  # Must be str or bytes

Type variables exist primarily for the benefit of static type checkers. They serve as the parameters for generic types as well as for generic function definitions. See class Generic for more information on generic types. Generic functions work as follows:

def repeat(x: T, n: int) -> List[T]:

‘’’Return a list containing n references to x.’’’ return [x]*n

def longest(x: A, y: A) -> A:

‘’’Return the longest of two strings.’’’ return x if len(x) >= len(y) else y

The latter example’s signature is essentially the overloading of (str, str) -> str and (bytes, bytes) -> bytes. Also note that if the arguments are instances of some subclass of str, the return type is still plain str.

At runtime, isinstance(x, T) and issubclass(C, T) will raise TypeError.

Type variables defined with covariant=True or contravariant=True can be used to declare covariant or contravariant generic types. See PEP 484 for more details. By default generic types are invariant in all type variables.

Type variables can be introspected. e.g.:

T.__name__ == ‘T’ T.__constraints__ == () T.__covariant__ == False T.__contravariant__ = False A.__constraints__ == (str, bytes)

Note that only type variables defined in global scope can be pickled.

__init__(name, *constraints, bound=None, covariant=False, contravariant=False)[source]
get_api_key()[source]

Get the API key from module level variable or environment variable. Raises ValueError if no API key is found.

Return type:

str

get_base_url()[source]

Get the base URL from module level variable or environment variable. Falls back to default if neither is set.

Return type:

str

Command Line Interface

main()[source]
Return type:

None

Utilities

Utility functions for the Rhesis SDK.

count_tokens(text, encoding_name='cl100k_base')[source]

Count the number of tokens in a given text string using tiktoken.

Parameters:
  • text (str) – The input text to count tokens for

  • encoding_name (str, default: 'cl100k_base') – The name of the encoding to use. Defaults to cl100k_base (used by GPT-4 and GPT-3.5-turbo)

Returns:

The number of tokens in the text, or None if encoding fails

Return type:

Optional[int]

Examples

>>> count_tokens("Hello, world!")
4
>>> count_tokens("Complex text", encoding_name="p50k_base")
2
extract_json_from_text(text, fallback_to_partial=True)[source]

Extract JSON from text that may contain markdown, extra text, or malformed JSON.

Parameters:
  • text (str) – The text containing JSON

  • fallback_to_partial (bool, default: True) – Whether to attempt partial extraction if full parsing fails

Return type:

Dict[str, Any]

Returns:

Dict containing the parsed JSON

Raises:

ValueError – If no valid JSON can be extracted

extract_partial_json(text)[source]

Extract partial JSON when full parsing fails. Attempts to find and parse individual objects or arrays.

Parameters:

text (str) – The text containing malformed JSON

Return type:

Dict[str, Any]

Returns:

Dict with extracted data, may contain empty structures

extract_objects_from_array(array_content)[source]

Extract individual JSON objects from array content.

Parameters:

array_content (str) – String content of a JSON array

Return type:

List[Dict[str, Any]]

Returns:

List of parsed objects

safe_json_loads(text, default=None)[source]

Safely load JSON with fallback to default value.

Parameters:
  • text (str) – JSON string to parse

  • default (Optional[Any], default: None) – Default value if parsing fails

Return type:

Any

Returns:

Parsed JSON or default value

validate_test_case(test_case)[source]

Validate that a test case has the required structure.

Parameters:

test_case (Dict[str, Any]) – Dictionary representing a test case

Return type:

bool

Returns:

True if valid, False otherwise

clean_and_validate_tests(tests)[source]

Clean and validate a list of test cases.

Parameters:

tests (List[Dict[str, Any]]) – List of test case dictionaries

Return type:

List[Dict[str, Any]]

Returns:

List of valid test cases

get_file_content(file_path)[source]

Get file content with error handling.

Parameters:

file_path (str)

Return type:

str

ensure_directory_exists(directory_path)[source]

Ensure a directory exists, creating it if necessary.

Parameters:

directory_path (str)

Return type:

None

Module Structure

class rhesis.sdk.client.Client

The main client for interacting with the Rhesis API.