
Mankinds SDK


Integrate responsible AI evaluation into your applications. The Mankinds SDK provides the tools to create systems, connect your data sources, generate test datasets, and run evaluations, with client libraries for JavaScript/TypeScript and Python.

from mankinds_sdk import MankindsClient
import os

client = MankindsClient(api_key=os.environ["MANKINDS_API_KEY"])

# Create a system, generate a dataset, evaluate
system = client.create_system("My AI Assistant", "...", endpoint={...})
client.generate_dataset(system["id"], num_scenarios=10)
result = client.evaluate(system["id"])
print(f"Score: {result['summary']['global_score']}%")

Getting Started

A few steps to integrate AI evaluation into your project.

Install the SDK
pip install mankinds-sdk
Get your API key

Create an account on app.mankinds.io and generate your API key from settings.

Initialize the client
import os
from mankinds_sdk import MankindsClient

client = MankindsClient(api_key=os.environ["MANKINDS_API_KEY"])
Run your first evaluation

Create a system, generate a test dataset, then run the evaluation:

# 1. Create a system with its endpoint
system = client.create_system(
    "GDPR Compliance Assistant",
    "Chatbot that advises companies on their GDPR obligations.",
    endpoint={
        "url": "https://api.example.com/chat",
        "method": "POST",
        "body": {"message": "{{input}}"},
        "response": {"reply": "{{output}}"}
    }
)

# 2. Generate a test dataset (10 scenarios)
dataset = client.generate_dataset(system["id"], num_scenarios=10)

# 3. Run the evaluation
result = client.evaluate(system["id"])
print(f"Global score: {result['summary']['global_score']}%")

Features

Feature | Description
Systems | Create, configure, and manage your AI systems with automatic description validation
Connectors | Connect your logs (files, Datadog) and databases (SQLite, PostgreSQL)
Datasets | Provide your reference scenarios or auto-generate them with AI
Evaluations | Run evaluations across 61 criteria (security, fairness, reliability…) and retrieve scores
Authentication | Secure API key with fine-grained permission management

API Reference

MankindsClient

The main client for interacting with the Mankinds API.

Constructor

MankindsClient(api_key: str, base_url: str = None, timeout: int = 30)
Parameter | Type | Required | Default | Description
api_key | str | Required | - | Mankinds API key (format: mk_...)
base_url | str | Optional | https://app.mankinds.io | API base URL
timeout | int | Optional | 30 | Request timeout in seconds

Systems

createSystem

Creates a new AI system and automatically validates its description.

Parameter | Type | Required | Description
name | str | Required | AI system name
description | str | Required | Description of the expected system behavior
endpoint | dict | Required | API endpoint configuration (url, method, body, response)

Returns | Type | Description
id | string | Unique identifier of the created system
success | boolean | true if the description was validated
recommendations | array | Recommendations to improve the description
Endpoint configuration

In your endpoint's body and response configurations, use the placeholders {{input}} and {{output}} to dynamically inject test scenarios.

system = client.create_system(
    "GDPR Compliance Assistant",
    "Chatbot that advises companies on their GDPR obligations and guides compliance.",
    endpoint={
        "url": "https://api.example.com/chat",
        "method": "POST",
        "body": {"message": "{{input}}"},
        "response": {"reply": "{{output}}"}
    }
)

print(f"System created: {system['id']}")
print(f"Description validated: {system['success']}")

getSystem

Retrieves system details.

Parameter | Type | Required | Description
system_id | str | Required | Unique system identifier

Returns | Type | Description
id | string | Unique identifier of the system
name | string | System name
description | string | System description
is_description_validated | boolean | true if the description has been validated
endpoint | object | Endpoint configuration
system = client.get_system(system_id)

print(f"Name: {system['name']}")
print(f"Description validated: {system['is_description_validated']}")
print(f"Endpoint: {system['endpoint']}")

updateSystem

Updates an existing system.

Parameter | Type | Required | Description
system_id | str | Required | Unique system identifier
name | str | Optional | New system name
description | str | Optional | New description (triggers re-validation)
endpoint | dict | Optional | New endpoint configuration

Returns | Type | Description
success | boolean | true if the update was successful
recommendations | array | Recommendations if the description was re-validated
Endpoint validation

If you provide an endpoint, it must contain the required fields: url, method, body, response. Otherwise, the exception InvalidEndpointError will be raised.

result = client.update_system(
    system_id,
    name="GDPR Compliance Assistant v2",
    description="Improved version with DPO questions support."
)

print(f"Validated: {result['success']}")

If a new description is provided, it is automatically re-validated. On failure, DescriptionNotValidatedError is raised with recommendations.
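That failure mode can be handled with a plain try/except. The sketch below is illustrative only: the exception class is passed in as a parameter so the snippet stays self-contained (in the SDK it would be DescriptionNotValidatedError), and the recommendations attribute is our assumption based on the note in the Errors section that exceptions carry contextual information.

```python
def update_description(update_call, not_validated_exc):
    """Run an update; if the new description fails validation, surface the hints.

    update_call: zero-argument callable wrapping client.update_system(...).
    not_validated_exc: the exception class raised on validation failure
    (DescriptionNotValidatedError in this SDK).
    """
    try:
        return update_call()
    except not_validated_exc as e:
        # Hypothetical attribute: the docs state the error carries recommendations.
        for hint in getattr(e, "recommendations", []):
            print(f"- {hint}")
        raise
```

Usage would wrap the call, e.g. `update_description(lambda: client.update_system(system_id, description="..."), DescriptionNotValidatedError)`.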


Connectors

Connectors allow you to connect your data sources (logs, databases).

One connector per category

Each system can only have one connector per category (logs, database).

Available Types

Connector | Category | Description
FileConnector | logs | Log files (.log, .txt, .json)
DatadogConnector | logs | Logs from Datadog
SqliteConnector | database | SQLite database (.db, .sqlite)
PostgresqlConnector | database | Remote PostgreSQL database

addConnector

Adds a connector to the system.

Parameter | Type | Required | Description
system_id | str | Required | Unique system identifier
connector | BaseConnector | Required | Connector instance (FileConnector, SqliteConnector, etc.)
One connector per category

If a connector of the same category already exists, the exception ConnectorAlreadyExistsError will be raised. Delete the existing one first with deleteConnector().

from mankinds_sdk.connectors import FileConnector, SqliteConnector

# Logs
log_connector = FileConnector(
    file_path="./logs/app.log",
    name="Application Logs",
)
result = client.add_connector(system_id, log_connector)
print(f"Connector added: {result}")

# Database
db_connector = SqliteConnector(
    file_path="./data/users.db",
    name="Users Database",
)
client.add_connector(system_id, db_connector)
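Since each category admits only one connector, swapping a connector in means deleting the existing one first. A minimal sketch of that delete-then-add pattern; the helper name replace_connector is ours, and the conflict exception class is injected as a parameter so the snippet stays self-contained (in the SDK it would be ConnectorAlreadyExistsError).

```python
def replace_connector(client, system_id, connector, conflict_exc):
    """Add a connector; on a category conflict, remove the old one and retry.

    conflict_exc: the exception class raised on duplicates
    (ConnectorAlreadyExistsError in this SDK).
    """
    try:
        return client.add_connector(system_id, connector)
    except conflict_exc:
        # Assumption: deleteConnector takes a connector instance and targets
        # the same-category connector, per the deleteConnector reference below.
        client.delete_connector(system_id, connector)
        return client.add_connector(system_id, connector)
```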

getConnectors

Lists all connectors for a system.

Parameter | Type | Required | Description
system_id | str | Required | Unique system identifier

Returns | Type | Description
name | string | Connector name
category | string | Connector category (logs, database)
type | string | Connector type (file, sqlite, etc.)
connectors = client.get_connectors(system_id)

for c in connectors:
    print(f"{c['name']} ({c['category']}): {c['type']}")

updateConnector

Updates an existing connector's configuration.

Parameter | Type | Required | Description
system_id | str | Required | Unique system identifier
connector | BaseConnector | Required | Connector instance with updated configuration
from mankinds_sdk.connectors import FileConnector

updated_connector = FileConnector(
    file_path="./logs/new-app.log",
    name="Updated Logs",
)
result = client.update_connector(system_id, updated_connector)
print(f"Connector updated: {result}")

deleteConnector

Deletes a connector from the system.

Parameter | Type | Required | Description
system_id | str | Required | Unique system identifier
connector | BaseConnector | Required | Connector instance to delete
result = client.delete_connector(system_id, log_connector)
print(f"Connector deleted: {result}")

Datasets

generateDataset

Creates and validates a reference scenario dataset. You can provide custom scenarios or request automatic generation.

Parameter | Type | Required | Default | Description
system_id | str | Required | - | Unique system identifier
num_scenarios | int | Optional | 10 | Number of scenarios to auto-generate (ignored if scenarios provided)
scenarios | list[dict] | Optional | - | Custom scenarios with input (str) and outputs (list)

Returns | Type | Description
scenarios | array | List of validated scenarios
scenarios[].id | string | Unique scenario identifier
scenarios[].input | object | Input sent to the system
scenarios[].expected_outputs | array | List of expected responses
scenarios[].source | string | Scenario origin (user or generated)
Validated description required

The system description must be validated before generating a dataset. Otherwise, the exception DescriptionNotValidatedError will be raised.

# With custom scenarios
dataset = client.generate_dataset(system_id, scenarios=[
    {"input": "Hello, how does this work?", "outputs": ["Welcome! I'm here to help."]},
    {"input": "I want a refund", "outputs": ["I'll redirect you to our customer service."]},
])

print(f"{len(dataset['scenarios'])} scenarios validated")

# Automatic generation
dataset = client.generate_dataset(system_id, num_scenarios=20)
print(f"{len(dataset['scenarios'])} scenarios generated")

updateDataset

Updates the dataset with instructions or new scenarios.

Parameter | Type | Required | Description
system_id | str | Required | Unique system identifier
orientation | str | Optional | Instructions to refine the dataset
scenarios | list[dict] | Optional | New scenarios to replace existing ones
# Refine with instructions
dataset = client.update_dataset(
    system_id,
    orientation="Add more refund request cases"
)
print(f"{len(dataset['scenarios'])} scenarios after update")
Returns

The updated and re-validated dataset — same structure as generateDataset (see above).


Evaluations

evaluate

Runs a complete system evaluation.

Parameter | Type | Required | Default | Description
system_id | str | Required | - | Unique system identifier
profile | str | Optional | required | Evaluation profile (see table below)
thematics_config | dict | Optional | - | Custom configuration per criterion (replaces profile)
wait | bool | Optional | true | Wait for evaluation to complete before returning
poll_interval | int | Optional | 5 | Seconds between each status check
on_poll | Callable[[str, int], None] | Optional | - | Callback invoked on each status check (status, elapsed seconds)

Returns | Type | Description
run_id | string | Evaluation run identifier
status | string | completed, failed, running, etc.
summary.global_score | number | Global score as a percentage
summary.thematics | object | Detailed scores by thematic (score, passed)
result = client.evaluate(
    system_id,
    profile="required",
    wait=True,
    poll_interval=5,
)

print(f"Status: {result['status']}")
print(f"Global score: {result['summary']['global_score']}%")

With wait=False, only the run_id is returned immediately. Use getEvaluation(run_id) to retrieve the results later.
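The wait=False flow amounts to a small polling loop. A minimal sketch, with assumptions flagged: only get_evaluation and the status/run_id fields come from the reference above; the helper name wait_for_result, the fetch callable, and the timeout parameter are ours, and we assume "running" is the in-progress status.

```python
import time

def wait_for_result(fetch, poll_interval=5, timeout=600):
    """Poll fetch() until the evaluation leaves the 'running' state.

    fetch: zero-argument callable returning the evaluation dict,
    e.g. lambda: client.get_evaluation(run_id).
    """
    elapsed = 0
    while elapsed < timeout:
        result = fetch()
        if result["status"] != "running":
            return result
        time.sleep(poll_interval)
        elapsed += poll_interval
    raise TimeoutError(f"Evaluation still running after {timeout}s")
```

Usage would look like `run = client.evaluate(system_id, wait=False)` followed by `result = wait_for_result(lambda: client.get_evaluation(run["run_id"]))`. With wait=True, the on_poll callback covers the same need without a hand-written loop.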

Evaluation Profiles

Profile | Description
required | Required criteria adapted to the system's scope
extended | Extended criteria adapted to the system's scope
minimum | Fixed minimal profile (quick evaluation)
standard | Fixed standard profile (balances coverage and time)
maximum | Fixed complete profile (all criteria)

Custom Configuration (thematics_config)

For a custom evaluation, use thematics_config instead of profile:

Note: The main key of thematics_config must be the exact name of a thematic among those listed below (e.g., fairness_ethics, privacy_security, etc.).

result = client.evaluate(
    system_id,
    thematics_config={
        "fairness_ethics": {
            "gender": {"nb_tests": 5},
            "age": {"nb_tests": 5}
        }
    },
    wait=True,
)

Available Dimensions and Criteria

privacy_security — Data protection and security (28 criteria)

Privacy: pii_reuse, pii_request, pii_masking_detection, pii_in_logs, pii_in_db, pii_masking_db, pii_masking_logs, refusal_privacy

Security - Exfiltration: pii_exfiltration, tech_exfiltration, tech_exfiltration_logs, tech_exfiltration_db, internal_exfiltration, internal_exfiltration_logs, internal_exfiltration_db, context_exfiltration, context_exfiltration_db, context_exfiltration_logs, traces_exfiltration, traces_exfiltration_logs, traces_exfiltration_db, refusal_security

Security - Resistance: multiturn_resistance

reliability_performance — Reliability and attack resistance (6 criteria)

reproducibility, quality, prompt_injection, social_engineering, obfuscation, context_manipulation

fairness_ethics — Fairness and ethics (9 criteria)

Bias: age, ethnic, gender, health, identity, religious, socioeconomic

Ethics: traceability, human_escalation

explainability_transparency — Transparency and explainability (9 criteria)

justification, purpose_disclosure, ai_nature_disclosure, ai_self_disclosure, control_transparency, ambiguous_scope_clarification, refusal_scope, refusal_nonqualification, limitation_explanation

accountability_responsibility — Accountability and traceability (7 criteria)

usage_conformity, scope_creep_detection, opt_out_capabilities, decision_override, override_refusal_resistance, secure_logging_db, secure_logging_logs

sustainability — Environmental efficiency (2 criteria)

db_environmental_efficiency, log_environmental_efficiency

getEvaluation

Retrieves evaluation status or result.

Parameter | Type | Required | Description
run_id | str | Required | Evaluation run identifier
result = client.get_evaluation(run_id)

print(f"Status: {result['status']}")
if result["status"] == "completed":
    print(f"Score: {result['summary']['global_score']}%")
Returns

Same structure as evaluate (see above).


Errors

The SDK throws typed exceptions for easier error handling.

Exception | Description
CredentialsError | Missing or invalid API key
AuthenticationError | Expired or rejected API key (401)
NotFoundError | Resource not found (404)
ValidationError | Request validation failed (422)
RateLimitError | Too many requests (429)
InvalidEndpointError | Misconfigured endpoint
EndpointNotConfiguredError | Evaluation without a configured endpoint
DescriptionNotValidatedError | Description not validated
ConnectorAlreadyExistsError | Connector of the same category already exists
ServerError | Server error (500)

Each exception contains contextual information for easier debugging:

from mankinds_sdk.exceptions import ConnectorAlreadyExistsError

try:
    client.add_connector(system_id, connector)
except ConnectorAlreadyExistsError as e:
    print(f"Connector {e.existing_type} already exists")
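A RateLimitError (429) is usually transient, so retrying with exponential backoff is a common pattern. A generic sketch under stated assumptions: the exception class is injected so the helper has no SDK dependency, and the names with_retries, retries, and base_delay are ours.

```python
import time

def with_retries(call, rate_limit_exc, retries=3, base_delay=1.0):
    """Run call(); on a rate-limit exception, back off exponentially and retry."""
    for attempt in range(retries):
        try:
            return call()
        except rate_limit_exc:
            if attempt == retries - 1:
                raise
            # Sleep 1s, 2s, 4s, ... between attempts
            time.sleep(base_delay * (2 ** attempt))
```

For example, `with_retries(lambda: client.evaluate(system_id), RateLimitError)` would retry a rate-limited evaluation instead of failing on the first 429.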