

Embedding models produce a dense vector representation for each row in your data. These vectors capture the semantic structure of your tabular data and can be used for downstream tasks such as similarity search, clustering, or visualization.

Training

Training fits an embedding model to your dataset. Because embedding models are trained without labels, no label_column is needed and no validation metrics are computed.
Python
import os, time, requests

api_key = os.getenv("WOODWIDE_API_KEY")
base_url = "https://api.woodwide.ai"
headers = {"Authorization": f"Bearer {api_key}"}

# Upload data
with open("products.csv", "rb") as f:
    resp = requests.post(
        f"{base_url}/datasets",
        headers=headers,
        files={"file": ("products.csv", f, "text/csv")},
        data={"dataset_name": "products"},
    )
resp.raise_for_status()  # fail fast on upload errors
dataset_id = resp.json()["dataset"]["id"]

# Train an embedding model
resp = requests.post(
    f"{base_url}/models/train",
    headers=headers,
    json={
        "model_name": "product_embeddings",
        "model_type": "embedding",
        "dataset_id": dataset_id,
    },
)
resp.raise_for_status()  # fail fast if the training request is rejected
model_id = resp.json()["model"]["id"]

# Wait for training
while True:
    model = requests.get(
        f"{base_url}/models/{model_id}", headers=headers
    ).json()
    if model["status"] == "ready":
        break
    time.sleep(5)
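In practice you may not want the polling loop above to block forever if a job stalls. This is an illustrative sketch, not part of the Woodwide API: `wait_until_ready` and its parameters are hypothetical names, and the status-fetching callable is passed in so the helper stays testable.

```python
import time

def wait_until_ready(fetch_status, timeout=600, interval=5):
    """Poll fetch_status() until it returns "ready" or timeout elapses.

    fetch_status is any zero-argument callable returning the model's
    status string, e.g.:
        lambda: requests.get(
            f"{base_url}/models/{model_id}", headers=headers
        ).json()["status"]
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if fetch_status() == "ready":
            return
        time.sleep(interval)
    raise TimeoutError("model did not become ready in time")
```

Passing a callable rather than hard-coding the request also makes it easy to reuse the helper for other long-running jobs.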

Inference

Run inference to generate embeddings. You can embed the training data or new data. The model will produce embeddings that are consistent with the representation learned during training.
Python
with open("products.csv", "rb") as f:
    resp = requests.post(
        f"{base_url}/models/{model_id}/infer",
        headers=headers,
        files={"file": ("products.csv", f, "text/csv")},
        data={"output_type": "json"},
    )
resp.raise_for_status()  # fail fast on inference errors

results = resp.json()["data"]
print(results)
See Output Formats for the full output schema.
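With the embeddings in hand, similarity search (one of the downstream tasks mentioned above) is straightforward with NumPy. A minimal sketch, assuming each row's vector can be extracted as a list of floats — the field name (`embedding` below) is an assumption; consult the Output Formats page for the actual schema:

```python
import numpy as np

def top_k_similar(embeddings, query_index, k=5):
    """Return indices of the k rows most cosine-similar to the query row."""
    vecs = np.asarray(embeddings, dtype=float)
    # Normalize rows so a dot product equals cosine similarity
    vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    sims = vecs @ vecs[query_index]
    order = np.argsort(-sims)  # highest similarity first
    return [int(i) for i in order if i != query_index][:k]

# Hypothetical usage, assuming an "embedding" field per row:
# vecs = [row["embedding"] for row in results]
# neighbors = top_k_similar(vecs, query_index=0, k=5)
```

The same vectors can be fed to a clustering algorithm (e.g. scikit-learn's KMeans) or a 2-D projection for visualization.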