Mankinds SDK
Integrate responsible AI evaluation into your applications. The Mankinds SDK provides the tools to create systems, connect your data sources, generate test datasets, and run evaluations — available in JavaScript/TypeScript and Python.
from mankinds_sdk import MankindsClient
import os
client = MankindsClient(api_key=os.environ["MANKINDS_API_KEY"])
# Create a system, generate a dataset, evaluate
system = client.create_system("My AI Assistant", "...", endpoint={...})
client.generate_dataset(system["id"], num_scenarios=10)
result = client.evaluate(system["id"])
print(f"Score: {result['summary']['global_score']}%")
Getting Started
A few steps to integrate AI evaluation into your project.
pip install mankinds-sdk
Create an account on app.mankinds.io and generate your API key from settings.
import os
from mankinds_sdk import MankindsClient
client = MankindsClient(api_key=os.environ["MANKINDS_API_KEY"])
Create a system, generate a test dataset, then run the evaluation:
# 1. Create a system with its endpoint
system = client.create_system(
"GDPR Compliance Assistant",
"Chatbot that advises companies on their GDPR obligations.",
endpoint={
"url": "https://api.example.com/chat",
"method": "POST",
"body": {"message": "{{input}}"},
"response": {"reply": "{{output}}"}
}
)
# 2. Generate a test dataset (10 scenarios)
dataset = client.generate_dataset(system["id"], num_scenarios=10)
# 3. Run the evaluation
result = client.evaluate(system["id"])
print(f"Global score: {result['summary']['global_score']}%")
Features
| Feature | Description |
|---|---|
| Systems | Create, configure and manage your AI systems with automatic description validation |
| Connectors | Connect your logs (files, Datadog) and databases (SQLite, PostgreSQL) |
| Datasets | Provide your reference scenarios or auto-generate them with AI |
| Evaluations | Run evaluations across 61 criteria (security, fairness, reliability…) and retrieve scores |
| Authentication | Secure API key with fine-grained permission management |
API Reference
MankindsClient
The main client for interacting with the Mankinds API.
Constructor
MankindsClient(api_key: str, base_url: str = None, timeout: int = 30)
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| api_key | str | Required | — | Mankinds API key (format: mk_...) |
| base_url | str | Optional | https://app.mankinds.io | API base URL |
| timeout | int | Optional | 30 | Request timeout in seconds |
Systems
createSystem
Creates a new AI system and automatically validates its description.
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | str | Required | AI system name |
| description | str | Required | Description of the expected system behavior |
| endpoint | dict | Required | API endpoint configuration (url, method, body, response) |
| Returns | Type | Description |
|---|---|---|
| id | string | Unique identifier of the created system |
| success | boolean | true if the description was validated |
| recommendations | array | Recommendations to improve the description |
In your endpoint's body and response configurations, use the placeholders {{input}} and {{output}} to dynamically inject test scenarios.
system = client.create_system(
"GDPR Compliance Assistant",
"Chatbot that advises companies on their GDPR obligations and guides compliance.",
endpoint={
"url": "https://api.example.com/chat",
"method": "POST",
"body": {"message": "{{input}}"},
"response": {"reply": "{{output}}"}
}
)
print(f"System created: {system['id']}")
print(f"Description validated: {system['success']}")
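To make the placeholder mechanics concrete, here is a minimal sketch of the substitution the evaluation harness presumably performs on your endpoint's body template. This is an illustration only, not the SDK's actual implementation:

```python
def render_body(template, value):
    """Recursively replace {{input}} placeholders in a body template
    with a scenario input, preserving the JSON structure."""
    if isinstance(template, dict):
        return {k: render_body(v, value) for k, v in template.items()}
    if isinstance(template, list):
        return [render_body(v, value) for v in template]
    if isinstance(template, str):
        return template.replace("{{input}}", value)
    return template

body = render_body({"message": "{{input}}"}, "What are my GDPR obligations?")
print(body)  # {'message': 'What are my GDPR obligations?'}
```

The same idea applies in reverse to the response mapping: the field holding {{output}} tells the evaluator where to read your system's reply.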
getSystem
Retrieves system details.
| Parameter | Type | Required | Description |
|---|---|---|---|
| system_id | str | Required | Unique system identifier |
| Returns | Type | Description |
|---|---|---|
| id | string | Unique identifier of the system |
| name | string | System name |
| description | string | System description |
| is_description_validated | boolean | true if the description has been validated |
| endpoint | object | Endpoint configuration |
system = client.get_system(system_id)
print(f"Name: {system['name']}")
print(f"Description validated: {system['is_description_validated']}")
print(f"Endpoint: {system['endpoint']}")
updateSystem
Updates an existing system.
| Parameter | Type | Required | Description |
|---|---|---|---|
| system_id | str | Required | Unique system identifier |
| name | str | Optional | New system name |
| description | str | Optional | New description (triggers re-validation) |
| endpoint | dict | Optional | New endpoint configuration |
| Returns | Type | Description |
|---|---|---|
| success | boolean | true if the update was successful |
| recommendations | array | Recommendations if the description was re-validated |
If you provide an endpoint, it must contain the required fields: url, method, body, response. Otherwise, the exception InvalidEndpointError will be raised.
result = client.update_system(
system_id,
name="GDPR Compliance Assistant v2",
description="Improved version with DPO questions support."
)
print(f"Validated: {result['success']}")
If a new description is provided, it is automatically re-validated. On failure, DescriptionNotValidatedError is raised with recommendations.
Connectors
Connectors allow you to connect your data sources (logs, databases).
Each system can only have one connector per category (logs, database).
Available Types
| Connector | Category | Description |
|---|---|---|
| FileConnector | logs | Log files (.log, .txt, .json) |
| DatadogConnector | logs | Logs from Datadog |
| SqliteConnector | database | SQLite database (.db, .sqlite) |
| PostgresqlConnector | database | Remote PostgreSQL database |
addConnector
Adds a connector to the system.
| Parameter | Type | Required | Description |
|---|---|---|---|
| system_id | str | Required | Unique system identifier |
| connector | BaseConnector | Required | Connector instance (FileConnector, SqliteConnector, etc.) |
If a connector of the same category already exists, the exception ConnectorAlreadyExistsError will be raised. Delete the existing one first with deleteConnector().
from mankinds_sdk.connectors import FileConnector, SqliteConnector
# Logs
log_connector = FileConnector(
file_path="./logs/app.log",
name="Application Logs",
)
result = client.add_connector(system_id, log_connector)
print(f"Connector added: {result}")
# Database
db_connector = SqliteConnector(
file_path="./data/users.db",
name="Users Database",
)
client.add_connector(system_id, db_connector)
getConnectors
Lists all connectors for a system.
| Parameter | Type | Required | Description |
|---|---|---|---|
| system_id | str | Required | Unique system identifier |
| Returns | Type | Description |
|---|---|---|
| name | string | Connector name |
| category | string | Connector category (logs, database) |
| type | string | Connector type (file, sqlite, etc.) |
connectors = client.get_connectors(system_id)
for c in connectors:
print(f"{c['name']} ({c['category']}): {c['type']}")
updateConnector
Updates an existing connector's configuration.
| Parameter | Type | Required | Description |
|---|---|---|---|
| system_id | str | Required | Unique system identifier |
| connector | BaseConnector | Required | Connector instance with updated configuration |
from mankinds_sdk.connectors import FileConnector
updated_connector = FileConnector(
file_path="./logs/new-app.log",
name="Updated Logs",
)
result = client.update_connector(system_id, updated_connector)
print(f"Connector updated: {result}")
deleteConnector
Deletes a connector from the system.
| Parameter | Type | Required | Description |
|---|---|---|---|
| system_id | str | Required | Unique system identifier |
| connector | BaseConnector | Required | Connector instance to delete |
result = client.delete_connector(system_id, log_connector)
print(f"Connector deleted: {result}")
Datasets
generateDataset
Creates and validates a reference scenario dataset. You can provide custom scenarios or request automatic generation.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| system_id | str | Required | — | Unique system identifier |
| num_scenarios | int | Optional | 10 | Number of scenarios to auto-generate (ignored if scenarios provided) |
| scenarios | list[dict] | Optional | — | Custom scenarios with input (str) and outputs (list) |
| Returns | Type | Description |
|---|---|---|
| scenarios | array | List of validated scenarios |
| scenarios[].id | string | Unique scenario identifier |
| scenarios[].input | object | Input sent to the system |
| scenarios[].expected_outputs | array | List of expected responses |
| scenarios[].source | string | Scenario origin (user or generated) |
The system description must be validated before generating a dataset. Otherwise, the exception DescriptionNotValidatedError will be raised.
# With custom scenarios
dataset = client.generate_dataset(system_id, scenarios=[
{"input": "Hello, how does this work?", "outputs": ["Welcome! I'm here to help."]},
{"input": "I want a refund", "outputs": ["I'll redirect you to our customer service."]},
])
print(f"{len(dataset['scenarios'])} scenarios validated")
# Automatic generation
dataset = client.generate_dataset(system_id, num_scenarios=20)
print(f"{len(dataset['scenarios'])} scenarios generated")
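Custom scenarios must follow the input/outputs shape described in the parameter table above. A quick local check before calling generate_dataset can catch malformed entries early — a sketch that needs no SDK:

```python
def check_scenarios(scenarios):
    """Return a list of error messages for scenarios that do not match
    the expected shape: 'input' is a str, 'outputs' is a non-empty list."""
    errors = []
    for i, s in enumerate(scenarios):
        if not isinstance(s.get("input"), str):
            errors.append(f"scenario {i}: 'input' must be a string")
        if not isinstance(s.get("outputs"), list) or not s.get("outputs"):
            errors.append(f"scenario {i}: 'outputs' must be a non-empty list")
    return errors

print(check_scenarios([{"input": "Hello", "outputs": ["Hi there"]}]))  # []
```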
updateDataset
Updates the dataset with instructions or new scenarios.
| Parameter | Type | Required | Description |
|---|---|---|---|
| system_id | str | Required | Unique system identifier |
| orientation | str | Optional | Instructions to refine the dataset |
| scenarios | list[dict] | Optional | New scenarios to replace existing ones |
# Refine with instructions
dataset = client.update_dataset(
system_id,
orientation="Add more refund request cases"
)
print(f"{len(dataset['scenarios'])} scenarios after update")
The updated and re-validated dataset — same structure as generateDataset (see above).
Evaluations
evaluate
Runs a complete system evaluation.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| system_id | str | Required | — | Unique system identifier |
| profile | str | Optional | required | Evaluation profile (see table below) |
| thematics_config | dict | Optional | — | Custom configuration per criterion (replaces profile) |
| wait | bool | Optional | true | Wait for evaluation to complete before returning |
| poll_interval | int | Optional | 5 | Seconds between each status check |
| on_poll | Callable[[str, int], None] | Optional | — | Callback invoked on each status check (status, elapsed seconds) |
| Returns | Type | Description |
|---|---|---|
| run_id | string | Evaluation run identifier |
| status | string | completed, failed, running, etc. |
| summary.global_score | number | Global score as a percentage |
| summary.thematics | object | Detailed scores by thematic (score, passed) |
result = client.evaluate(
system_id,
profile="required",
wait=True,
poll_interval=5,
)
print(f"Status: {result['status']}")
print(f"Global score: {result['summary']['global_score']}%")
With wait=False, only run_id is returned immediately. Use get_evaluation(run_id) to retrieve the results later.
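If you start the evaluation without waiting, you can poll for completion yourself. Below is a minimal polling sketch; poll_status is a stand-in for a callable wrapping client.get_evaluation(run_id), so the logic can be followed (and run) without the SDK:

```python
import time

def wait_for_completion(poll_status, poll_interval=5, timeout=600, on_poll=None):
    """Poll `poll_status()` until it returns a terminal status.

    poll_status: callable returning a dict with a 'status' key
                 (stand-in for: lambda: client.get_evaluation(run_id)).
    """
    elapsed = 0
    while elapsed <= timeout:
        result = poll_status()
        if on_poll:
            on_poll(result["status"], elapsed)
        if result["status"] in ("completed", "failed"):
            return result
        time.sleep(poll_interval)
        elapsed += poll_interval
    raise TimeoutError(f"evaluation still running after {timeout}s")

# Demo with a stub fetcher that completes on the third poll
statuses = iter(["pending", "running", "completed"])
result = wait_for_completion(lambda: {"status": next(statuses)}, poll_interval=0)
print(result["status"])  # completed
```

The on_poll hook mirrors the SDK's own callback: it receives the current status and the seconds elapsed, which is handy for progress logging.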
Evaluation Profiles
| Profile | Description |
|---|---|
| required | Required criteria adapted to the system's scope |
| extended | Extended criteria adapted to the system's scope |
| minimum | Fixed minimal profile (quick evaluation) |
| standard | Fixed standard profile (balanced coverage/time) |
| maximum | Fixed complete profile (all criteria) |
Custom Configuration (thematics_config)
For a custom evaluation, use thematics_config instead of profile:
Note: The main key of thematics_config must be the exact name of a thematic among those listed below (e.g., fairness_ethics, privacy_security, etc.).
result = client.evaluate(
system_id,
thematics_config={
"fairness_ethics": {
"gender": {"nb_tests": 5},
"age": {"nb_tests": 5}
}
},
wait=True,
)
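Because thematics_config nests criteria under thematic names, a small helper can total the number of tests you are about to request — a local sketch, no SDK required:

```python
def count_tests(thematics_config: dict) -> int:
    """Sum nb_tests across all criteria of all thematics."""
    return sum(
        criterion.get("nb_tests", 0)
        for criteria in thematics_config.values()
        for criterion in criteria.values()
    )

config = {"fairness_ethics": {"gender": {"nb_tests": 5}, "age": {"nb_tests": 5}}}
print(count_tests(config))  # 10
```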
Available Dimensions and Criteria
privacy_security — Data protection and security (28 criteria)
Privacy: pii_reuse, pii_request, pii_masking_detection, pii_in_logs, pii_in_db, pii_masking_db, pii_masking_logs, refusal_privacy
Security - Exfiltration: pii_exfiltration, tech_exfiltration, tech_exfiltration_logs, tech_exfiltration_db, internal_exfiltration, internal_exfiltration_logs, internal_exfiltration_db, context_exfiltration, context_exfiltration_db, context_exfiltration_logs, traces_exfiltration, traces_exfiltration_logs, traces_exfiltration_db, refusal_security
Security - Resistance: multiturn_resistance
reliability_performance — Reliability and attack resistance (6 criteria)
reproducibility, quality, prompt_injection, social_engineering, obfuscation, context_manipulation
fairness_ethics — Fairness and ethics (9 criteria)
Bias: age, ethnic, gender, health, identity, religious, socioeconomic
Ethics: traceability, human_escalation
explainability_transparency — Transparency and explainability (9 criteria)
justification, purpose_disclosure, ai_nature_disclosure, ai_self_disclosure, control_transparency, ambiguous_scope_clarification, refusal_scope, refusal_nonqualification, limitation_explanation
accountability_responsibility — Accountability and traceability (7 criteria)
usage_conformity, scope_creep_detection, opt_out_capabilities, decision_override, override_refusal_resistance, secure_logging_db, secure_logging_logs
sustainability — Environmental efficiency (2 criteria)
db_environmental_efficiency, log_environmental_efficiency
getEvaluation
Retrieves evaluation status or result.
| Parameter | Type | Required | Description |
|---|---|---|---|
| run_id | str | Required | Evaluation run identifier |
result = client.get_evaluation(run_id)
print(f"Status: {result['status']}")
if result["status"] == "completed":
print(f"Score: {result['summary']['global_score']}%")
Same structure as evaluate (see above).
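Since a completed result nests per-thematic scores under summary.thematics, you can break the global score down per thematic. The sample result below is hand-built following the evaluate returns table:

```python
# Sample result shaped like the evaluate/getEvaluation return value
result = {
    "status": "completed",
    "summary": {
        "global_score": 87.5,
        "thematics": {
            "fairness_ethics": {"score": 90, "passed": True},
            "privacy_security": {"score": 85, "passed": True},
        },
    },
}

if result["status"] == "completed":
    for name, details in result["summary"]["thematics"].items():
        flag = "PASS" if details["passed"] else "FAIL"
        print(f"{name}: {details['score']}% [{flag}]")
```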
Errors
The SDK throws typed exceptions for easier error handling.
| Exception | Description |
|---|---|
| CredentialsError | Missing or invalid API key |
| AuthenticationError | Expired or rejected API key (401) |
| NotFoundError | Resource not found (404) |
| ValidationError | Request validation failed (422) |
| RateLimitError | Too many requests (429) |
| InvalidEndpointError | Misconfigured endpoint |
| EndpointNotConfiguredError | Evaluation without configured endpoint |
| DescriptionNotValidatedError | Description not validated |
| ConnectorAlreadyExistsError | Connector already exists (same category) |
| ServerError | Server error (500) |
Each exception contains contextual information for easier debugging:
from mankinds_sdk.exceptions import ConnectorAlreadyExistsError
try:
client.add_connector(system_id, connector)
except ConnectorAlreadyExistsError as e:
print(f"Connector {e.existing_type} already exists")
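RateLimitError (429) is a natural candidate for retries with backoff. The sketch below defines a local stand-in exception so it runs without the SDK; in real code you would import RateLimitError from mankinds_sdk.exceptions and wrap the client call:

```python
import time

class RateLimitError(Exception):
    """Local stand-in for mankinds_sdk.exceptions.RateLimitError."""

def with_retries(call, max_attempts=3, base_delay=1.0):
    """Retry `call` on RateLimitError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

# Demo: a flaky call that is rate-limited twice, then succeeds
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError()
    return "ok"

print(with_retries(flaky, base_delay=0))  # ok
```

In production you would pass something like `lambda: client.evaluate(system_id)` as the call, and keep base_delay at a second or more.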