Anomaly Detection - Wood Wide AI SDK Documentation

Anomaly detection models learn what “normal” looks like from your training data and then flag rows in inference data that deviate from those patterns.

Training

Training fits the model to your data, learning the distribution of normal rows. Anomaly detection is fully unsupervised, so no label_column is needed, and no validation metrics are computed for anomaly models.

import os
import time

import requests

api_key = os.getenv("WOODWIDE_API_KEY")
base_url = "https://api.woodwide.ai"
headers = {"Authorization": f"Bearer {api_key}"}

# Upload data
with open("transactions.csv", "rb") as f:
    resp = requests.post(
        f"{base_url}/datasets",
        headers=headers,
        files={"file": ("transactions.csv", f, "text/csv")},
        data={"dataset_name": "transactions"},
    )
resp.raise_for_status()  # fail fast if the upload was rejected
dataset_id = resp.json()["dataset"]["id"]

# Train an anomaly detection model
resp = requests.post(
    f"{base_url}/models/train",
    headers=headers,
    json={
        "model_name": "fraud_detector",
        "model_type": "anomaly",
        "dataset_id": dataset_id,
    },
)
resp.raise_for_status()  # fail fast if the training request was rejected
model_id = resp.json()["model"]["id"]

# Wait for training
while True:
    model = requests.get(
        f"{base_url}/models/{model_id}", headers=headers
    ).json()
    if model["status"] == "ready":
        break
    time.sleep(5)
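
The loop above polls until the status is "ready" and would keep polling forever if training never reaches that state. A minimal sketch of a bounded variant follows. The helper name wait_until_ready and its parameters are hypothetical and not part of the SDK; the only status value it relies on is "ready", as in the snippet above.

```python
import time
from typing import Callable


def wait_until_ready(
    fetch_status: Callable[[], str],
    timeout_s: float = 600.0,
    poll_interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll fetch_status() until it returns "ready" or timeout_s elapses.

    Hypothetical helper: only the "ready" status comes from the docs above.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if fetch_status() == "ready":
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"model not ready after {timeout_s} seconds")
        sleep(poll_interval_s)
```

With the variables from the training snippet, it could be called as `wait_until_ready(lambda: requests.get(f"{base_url}/models/{model_id}", headers=headers).json()["status"])`. Passing the status lookup in as a callable keeps the timeout logic independent of the HTTP client.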

Inference

Run inference on data you want to scan for anomalies. This can be the training data itself (to find outliers within it) or new data (to detect rows that deviate from the training distribution). The output format depends on the anomaly_format parameter:

Value	Description
`ids_only` (default)	Returns a compact list of row indices flagged as anomalous.
`per_row`	Returns a row for every input instance with an anomaly flag and score.

# Detect anomalies -- compact format (default)
with open("transactions.csv", "rb") as f:
    resp = requests.post(
        f"{base_url}/models/{model_id}/infer",
        headers=headers,
        files={"file": ("transactions.csv", f, "text/csv")},
        data={"output_type": "json", "anomaly_format": "ids_only"},
    )

resp.raise_for_status()  # surface HTTP errors before parsing the body
results = resp.json()["data"]
print(results)  # {"anomalous_ids": [3, 17, 42]}

# Detailed per-row output
with open("transactions.csv", "rb") as f:
    resp = requests.post(
        f"{base_url}/models/{model_id}/infer",
        headers=headers,
        files={"file": ("transactions.csv", f, "text/csv")},
        data={"output_type": "json", "anomaly_format": "per_row"},
    )

resp.raise_for_status()  # surface HTTP errors before parsing the body
results = resp.json()["data"]
print(results)
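
The compact `ids_only` response contains only row indices, so a common next step is mapping them back to the rows of the uploaded file. The sketch below does this with the standard csv module. The function name select_flagged_rows is hypothetical, and the indexing convention is an assumption: it treats the indices as zero-based positions of data rows, not counting the header. Check this against the Output Formats page before relying on it.

```python
import csv
from typing import Dict, Iterable, List


def select_flagged_rows(
    csv_path: str, anomalous_ids: Iterable[int]
) -> List[Dict[str, str]]:
    """Return the CSV rows whose position appears in anomalous_ids.

    Assumption (not confirmed by the docs): index 0 is the first data row
    after the header line.
    """
    flagged = set(anomalous_ids)
    with open(csv_path, newline="") as f:
        # DictReader consumes the header, so enumerate() counts data rows only.
        return [row for i, row in enumerate(csv.DictReader(f)) if i in flagged]
```

For the compact response shown earlier, this could be used as `select_flagged_rows("transactions.csv", results["anomalous_ids"])`.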

See Output Formats for the full output schema.