

Embedding models produce a dense vector representation for each row in your data. These vectors capture the semantic structure of your tabular data and can be used for downstream tasks such as similarity search, clustering, or visualization.

Training

Training fits an embedding model to your dataset. Because embedding models are trained without labels, no label_column is needed and no validation metrics are computed.
Python
import os, time, requests

api_key = os.getenv("WOODWIDE_API_KEY")
base_url = "https://api.woodwide.ai"
headers = {"Authorization": f"Bearer {api_key}"}

# Upload data
with open("products.csv", "rb") as f:
    resp = requests.post(
        f"{base_url}/datasets",
        headers=headers,
        files={"file": ("products.csv", f, "text/csv")},
        data={"dataset_name": "products"},
    )
resp.raise_for_status()  # fail fast on upload errors
dataset_id = resp.json()["dataset"]["id"]

# Train an embedding model
resp = requests.post(
    f"{base_url}/models/train",
    headers=headers,
    json={
        "model_name": "product_embeddings",
        "model_type": "embedding",
        "dataset_id": dataset_id,
    },
)
resp.raise_for_status()  # fail fast if the training request is rejected
model_id = resp.json()["model"]["id"]

# Wait for training
while True:
    model = requests.get(
        f"{base_url}/models/{model_id}", headers=headers
    ).json()
    if model["status"] == "ready":
        break
    time.sleep(5)
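In practice you may not want the polling loop above to block forever if a job stalls. This is an illustrative sketch, not part of the Woodwide API: `wait_until_ready` and its parameters are hypothetical names, and the status-fetching callable is passed in so the helper stays testable.

```python
import time

def wait_until_ready(fetch_status, timeout=600, interval=5):
    """Poll fetch_status() until it returns "ready" or timeout elapses.

    fetch_status is any zero-argument callable returning the model's
    status string, e.g.:
        lambda: requests.get(
            f"{base_url}/models/{model_id}", headers=headers
        ).json()["status"]
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if fetch_status() == "ready":
            return
        time.sleep(interval)
    raise TimeoutError("model did not become ready in time")
```

Passing a callable rather than hard-coding the request also makes it easy to reuse the helper for other long-running jobs.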

Inference

Run inference to generate embeddings. You can embed the training data or new data. The model will produce embeddings that are consistent with the representation learned during training.
Python
with open("products.csv", "rb") as f:
    resp = requests.post(
        f"{base_url}/models/{model_id}/infer",
        headers=headers,
        files={"file": ("products.csv", f, "text/csv")},
        data={"output_type": "json"},
    )
resp.raise_for_status()  # fail fast on inference errors

results = resp.json()["data"]
print(results)
See Output Formats for the full output schema.
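With the embeddings in hand, similarity search (one of the downstream tasks mentioned above) is straightforward with NumPy. A minimal sketch, assuming each row's vector can be extracted as a list of floats — the field name (`embedding` below) is an assumption; consult the Output Formats page for the actual schema:

```python
import numpy as np

def top_k_similar(embeddings, query_index, k=5):
    """Return indices of the k rows most cosine-similar to the query row."""
    vecs = np.asarray(embeddings, dtype=float)
    # Normalize rows so a dot product equals cosine similarity
    vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    sims = vecs @ vecs[query_index]
    order = np.argsort(-sims)  # highest similarity first
    return [int(i) for i in order if i != query_index][:k]

# Hypothetical usage, assuming an "embedding" field per row:
# vecs = [row["embedding"] for row in results]
# neighbors = top_k_similar(vecs, query_index=0, k=5)
```

The same vectors can be fed to a clustering algorithm (e.g. scikit-learn's KMeans) or a 2-D projection for visualization.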