Documentation Index
Fetch the complete documentation index at: https://docs.woodwide.ai/llms.txt
Use this file to discover all available pages before exploring further.
Embedding models produce a dense vector representation for each row in your data. These vectors capture the semantic structure of your tabular data and can be used for downstream tasks such as similarity search, clustering, or visualization.
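For example, similarity search compares two row embeddings with cosine similarity. A minimal sketch (the vectors below are toy stand-ins, not real model output):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for two row embeddings
v1 = [0.1, 0.9, 0.2]
v2 = [0.1, 0.8, 0.3]
print(cosine_similarity(v1, v2))
```

Rows with similar semantic structure yield embeddings with similarity close to 1; unrelated rows score near 0.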
Training
Training fits an embedding model to your dataset. No label_column is needed, and no validation metrics are computed for embedding models.
import os, time, requests

api_key = os.getenv("WOODWIDE_API_KEY")
base_url = "https://api.woodwide.ai"
headers = {"Authorization": f"Bearer {api_key}"}

# Upload data
with open("products.csv", "rb") as f:
    resp = requests.post(
        f"{base_url}/datasets",
        headers=headers,
        files={"file": ("products.csv", f, "text/csv")},
        data={"dataset_name": "products"},
    )
dataset_id = resp.json()["dataset"]["id"]

# Train an embedding model
resp = requests.post(
    f"{base_url}/models/train",
    headers=headers,
    json={
        "model_name": "product_embeddings",
        "model_type": "embedding",
        "dataset_id": dataset_id,
    },
)
model_id = resp.json()["model"]["id"]

# Wait for training
while True:
    model = requests.get(
        f"{base_url}/models/{model_id}", headers=headers
    ).json()
    if model["status"] == "ready":
        break
    time.sleep(5)
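The polling loop above waits indefinitely. A small wrapper that adds a timeout is often safer; this is a sketch, and the "failed" status value is an assumption, not something confirmed by the API above:

```python
import time

def wait_until_ready(get_status, timeout_s=600, poll_s=5):
    """Poll get_status() until it returns 'ready'.

    Raises on timeout, or if the status is 'failed'
    (a hypothetical failure status, not confirmed by the docs).
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status == "ready":
            return
        if status == "failed":
            raise RuntimeError("model training failed")
        time.sleep(poll_s)
    raise TimeoutError("model not ready within timeout")
```

Usage: pass a callable that fetches the status, e.g. `wait_until_ready(lambda: requests.get(f"{base_url}/models/{model_id}", headers=headers).json()["status"])`.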
Inference
Run inference to generate embeddings. You can embed the training data or new data. The model will produce embeddings that are consistent with the representation learned during training.
with open("products.csv", "rb") as f:
    resp = requests.post(
        f"{base_url}/models/{model_id}/infer",
        headers=headers,
        files={"file": ("products.csv", f, "text/csv")},
        data={"output_type": "json"},
    )
results = resp.json()["data"]
print(results)
See Output Formats for the full output schema.
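A simple downstream use of the returned embeddings is finding the most similar row to a query vector. This sketch assumes each item in `results` carries an `id` and an `embedding` list; both field names are assumptions, so check Output Formats for the actual schema:

```python
import math

def nearest(query_embedding, rows):
    """Return the row whose 'embedding' is most cosine-similar
    to query_embedding. Field names here are assumed."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)
    return max(rows, key=lambda r: cos(query_embedding, r["embedding"]))

# Toy rows standing in for inference output
rows = [
    {"id": 1, "embedding": [0.9, 0.1]},
    {"id": 2, "embedding": [0.1, 0.9]},
]
print(nearest([1.0, 0.0], rows)["id"])  # → 1
```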