Factor analysis models discover the latent factors (principal components) that explain variance in your data and generate human-readable descriptions for each factor. Unlike other model types, the output has one row per factor, not per input instance.

Training

Training fits the model to your data, learning a representation that can later be used to extract factors. Factor models need no label_column, and no validation metrics are computed for them.
Python
import os, time, requests

api_key = os.getenv("WOODWIDE_API_KEY")
base_url = "https://api.woodwide.ai"
headers = {"Authorization": f"Bearer {api_key}"}

# Upload data
with open("survey_responses.csv", "rb") as f:
    resp = requests.post(
        f"{base_url}/datasets",
        headers=headers,
        files={"file": ("survey_responses.csv", f, "text/csv")},
        data={"dataset_name": "survey_data"},
    )
resp.raise_for_status()  # stop early on an HTTP error
dataset_id = resp.json()["dataset"]["id"]

# Train a factor analysis model (no label_column required)
resp = requests.post(
    f"{base_url}/models/train",
    headers=headers,
    json={
        "model_name": "survey_factors",
        "model_type": "factors",
        "dataset_id": dataset_id,
    },
)
resp.raise_for_status()
model_id = resp.json()["model"]["id"]

# Poll every 5 seconds until training finishes
while True:
    resp = requests.get(f"{base_url}/models/{model_id}", headers=headers)
    resp.raise_for_status()
    model = resp.json()
    if model["status"] == "ready":
        break
    time.sleep(5)
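
The polling loop above never exits if the model does not reach "ready". A minimal sketch of a bounded wait follows. It assumes only the "status" field and the "ready" value shown above; the helper name wait_until_ready, its parameters, and the 10-minute default are hypothetical and not part of the Woodwide API.

```python
import time


def wait_until_ready(get_status, timeout_s=600.0, interval_s=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns "ready" or timeout_s elapses.

    get_status is any zero-argument callable that returns the model's
    status string. Hypothetical helper, not part of the Woodwide API.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "ready":
            return status
        if clock() >= deadline:
            raise TimeoutError(
                f"model still {status!r} after {timeout_s} seconds"
            )
        sleep(interval_s)


# Possible usage with the variables from the training example:
# wait_until_ready(
#     lambda: requests.get(
#         f"{base_url}/models/{model_id}", headers=headers
#     ).json()["status"]
# )
```

Passing the status lookup in as a callable keeps the waiting logic independent of the HTTP client, so it can be exercised without network access.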

Inference

Run inference to discover the factors in a dataset. Factors and their descriptions are generated from the inference dataset, using the representation learned during training, so you can run factor analysis on different datasets to understand their structure through the lens of the trained model.

Running inference on the training data itself is the most common use case: it tells you which latent factors explain your training data. Running it on new data reveals how those factors manifest in a different dataset.

The number of factors is determined automatically to capture at least 90% of the variance, capped at 10 factors.
Python
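# Reuses base_url, headers, and model_id from the training example
with open("survey_responses.csv", "rb") as f:
    resp = requests.post(
        f"{base_url}/models/{model_id}/infer",
        headers=headers,
        files={"file": ("survey_responses.csv", f, "text/csv")},
        data={"output_type": "json"},
    )
resp.raise_for_status()  # stop early on an HTTP error

# One row per factor, not per input instance
results = resp.json()["data"]
print(results)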
See Output Formats for the full output schema.
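
The factor-count rule described above (at least 90% of the variance, up to 10 factors) can be illustrated with a short sketch. It assumes the rule means "the smallest number of factors whose cumulative explained variance reaches 90%, capped at 10". The function name choose_num_factors and the example ratios are hypothetical, and the service's actual selection logic may differ.

```python
def choose_num_factors(variance_ratios, threshold=0.90, max_factors=10):
    """Return how many leading factors are needed to reach `threshold`.

    variance_ratios: fraction of total variance explained by each factor,
    sorted from largest to smallest. Illustrative only; this is an
    assumed reading of the rule, not the service's implementation.
    """
    cumulative = 0.0
    for count, ratio in enumerate(variance_ratios[:max_factors], start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return count
    # Threshold not reached within the cap: use as many factors as allowed.
    return min(len(variance_ratios), max_factors)


# Made-up ratios: 0.50 + 0.25 + 0.10 + 0.06 = 0.91, so four factors suffice.
print(choose_num_factors([0.50, 0.25, 0.10, 0.06, 0.05, 0.04]))
```

Under this reading, a dataset whose variance is spread thinly across many dimensions would return the full 10 factors even though they explain less than 90% of the variance.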