Here are four sample patterns for combining Wood Wide AI tasks to answer questions that no single model can answer alone. Each task type (prediction, clustering, anomaly detection, and factor analysis) is useful on its own, but the most powerful analyses chain them together. These recipes show you how.
Recipe 1: Segment, Then Predict
This recipe helps you cluster your data first, then train a separate prediction model per cluster.
Why try it? A single prediction model trained on all your data learns average patterns. However, the signal that predicts churn in your enterprise accounts looks nothing like the signal in your SMB accounts. Segmenting first lets each model focus on the patterns that actually matter for that group.
1
Cluster your full dataset
Upload your dataset and run Clustering. Use the columns that describe behavior or structure, not the outcome you want to predict. Let the model find the natural groupings.
2
Export each cluster as a separate dataset
Tag each row with its cluster label, then split into separate files, one per cluster.
3
Train a prediction model per cluster
Upload each cluster file and train a Prediction model on each one, using the same target column across all of them.
4
Run inference within each segment
When new data arrives, assign it to a cluster first (using the clustering model), then run inference with the matching prediction model.
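If you want to prototype the four steps above outside the app, here is a minimal scikit-learn sketch on toy data. KMeans and LogisticRegression are illustrative stand-ins for the Clustering and Prediction tasks (not how Wood Wide AI implements them), and all column meanings are invented:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy data: two behavioral segments separated along feature 0, where the
# churn signal lives in a different feature for each segment.
X = rng.normal(size=(200, 3))
X[:100, 0] += 5.0
y = np.where(np.arange(200) < 100,
             X[:, 1] > 0,    # segment A: churn tracks feature 1
             X[:, 2] > 0     # segment B: churn tracks feature 2
             ).astype(int)

# Step 1: cluster on behavioral features only, never on the target.
clusterer = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = clusterer.labels_

# Steps 2-3: split by cluster, then train one model per cluster on the same target.
models = {c: LogisticRegression().fit(X[labels == c], y[labels == c])
          for c in np.unique(labels)}

# Step 4: assign new rows to a cluster first, then score with that cluster's model.
X_new = rng.normal(size=(5, 3))
assignments = clusterer.predict(X_new)
scores = [models[c].predict_proba(row[None, :])[0, 1]
          for c, row in zip(assignments, X_new)]
```

The key detail is that the clusterer is fit only on behavioral features, and every new row is routed through it before being scored by the matching prediction model.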
Churn prediction by customer segment
Cluster customers by usage frequency, contract value, and product adoption metrics. Train a churn prediction model separately for each segment. High-usage enterprise customers and low-usage trial accounts have completely different churn patterns; one model for both will underperform on both.
Deal close probability by account type
Cluster open opportunities by deal size, sales cycle length, and number of stakeholders involved. Train a win/loss prediction model per cluster. A $500K multi-stakeholder deal has different predictive features than a $10K single-buyer deal.
Payment default risk by customer profile
Cluster accounts by payment history, average invoice size, and days-to-pay trends. Train a default prediction model per cluster. Seasonal businesses and steady recurring accounts default for different reasons at different times.
Equipment failure by operating condition
Cluster machines or devices by operating environment (load, temperature range, run hours). Train a failure prediction model per cluster. A machine running at 90% capacity fails differently than one running at 40%.
Recipe 2: Detect, Then Diagnose
This recipe uses anomaly detection to flag unusual records, then runs factor analysis on the flagged set to understand what they have in common.
Why try it? Anomaly detection tells you whether something is off. Factor analysis tells you what pattern the anomalies share. Together they turn a list of outliers into an actionable finding.
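As a rough sketch of this flow, assuming numeric tabular data: IsolationForest and FactorAnalysis below are generic scikit-learn stand-ins for the Anomaly Detection and Factor Analysis tasks, and the toy transaction columns are made up for illustration:

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Toy transactions: (amount, hour of day, items). Most rows are normal;
# 20 have a jointly unusual amount/hour combination.
normal = rng.normal(loc=[50, 12, 3], scale=[10, 3, 1], size=(500, 3))
odd = rng.normal(loc=[400, 3, 3], scale=[30, 1, 1], size=(20, 3))
X = np.vstack([normal, odd])

# Step 1: flag the most anomalous records (highest negated score_samples).
detector = IsolationForest(random_state=0).fit(X)
anomaly = -detector.score_samples(X)              # higher = more anomalous
flagged = X[anomaly > np.quantile(anomaly, 0.95)]

# Step 2: factor analysis on the flagged records only; the loadings show
# which columns move together among the anomalies.
fa = FactorAnalysis(n_components=1, random_state=0).fit(flagged)
loadings = fa.components_
```

Columns with large-magnitude values in `loadings` are the ones that vary together across the flagged records, which is exactly the "what do the anomalies share" question this recipe answers.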
Revenue anomaly diagnosis
Train on 12 months of monthly revenue data by account (ARR, expansion, contraction, churn, new bookings). Flag months where the pattern breaks. Run factor analysis on flagged months to identify whether the anomaly is driven by contraction in a specific segment, a drop in new bookings, or an unusual churn spike.
Transaction fraud pattern discovery
Train on normal transaction data (amount, frequency, merchant category, time of day, geography). Flag transactions with high anomaly scores. Run factor analysis on flagged transactions to surface the combination of features (such as unusually high amounts at unusual hours in unusual locations) that characterizes the fraud pattern.
Pipeline health breakdown
Train on healthy quarter-close pipeline data (deals by stage, average deal age, stage conversion rates, rep activity metrics). Flag quarters or territories where the pattern breaks. Factor analysis on the anomalies reveals whether the breakdown is concentrated in a specific stage, deal size band, or rep cohort.
Supply chain disruption diagnosis
Train on normal procurement and fulfillment data (lead times, order volumes, supplier fill rates, inventory levels). Flag disrupted periods. Factor analysis on the flagged records reveals whether disruptions cluster around specific suppliers, SKUs, or logistics routes.
Recipe 3: Predict, Then Slice by Segment
This recipe helps you train one prediction model on all your data, run inference, then break out the predicted outcomes by cluster to compare risk or opportunity across groups.
Why try it? A single model gives you a score per row. Slicing those scores by segment tells you where the risk or opportunity is concentrated, and by how much. This turns a plain scoring model into a decision-making tool.
1
Train a prediction model on your labeled data
Upload your labeled data and train a Prediction model. This gives you a model that scores any new record.
2
Cluster your inference data
Upload your current (unlabeled) data and run Clustering to assign each record to a segment.
3
Run inference with the prediction model
Run your prediction model on the same dataset. Each record now has both a cluster label and a predicted outcome score.
4
Compare predicted outcomes across clusters
Group by cluster and look at the distribution of predicted scores. Which segments have the highest predicted churn? The highest predicted revenue? The most at-risk accounts?
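The four steps above can be sketched in a few lines of scikit-learn and pandas; the models, toy data, and column names are illustrative assumptions, not the product's internals:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy features: (spend, tenure, engagement). Historical rows are labeled
# with churn; current rows are what we want to score and slice.
X_hist = rng.normal(size=(300, 3))
y_hist = (X_hist[:, 2] < -0.2).astype(int)   # low engagement -> churned
X_now = rng.normal(size=(100, 3))

# Step 1: one prediction model trained on all labeled data.
model = LogisticRegression().fit(X_hist, y_hist)

# Step 2: cluster the current, unlabeled data into segments.
segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_now)

# Step 3: run inference so every record has a predicted score.
churn_prob = model.predict_proba(X_now)[:, 1]

# Step 4: slice predicted risk by segment to see where it concentrates.
summary = (pd.DataFrame({"cluster": segments, "churn_prob": churn_prob})
           .groupby("cluster")["churn_prob"]
           .agg(["mean", "count"]))
```

Sorting `summary` by its mean column shows at a glance which segment carries the highest concentration of predicted risk.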
Churn risk concentration by segment
Train a churn model on historical customer data. Cluster your current active accounts by spend, tenure, and engagement. Run inference and compare predicted churn rates across clusters. You might find that 70% of your predicted churn is concentrated in a single cluster (e.g., mid-market accounts with declining usage in months 8–14).
Upsell opportunity by product usage pattern
Train a model to predict expansion revenue (accounts that upgraded in the past). Cluster current accounts by feature adoption and usage depth. Run inference and rank clusters by predicted expansion probability. Prioritize the highest-scoring cluster for outbound.
Credit risk concentration by borrower profile
Train a default prediction model on historical loan data. Cluster current borrowers by loan size, term, payment history, and utilization. Compare predicted default rates across clusters to identify where portfolio risk is concentrated before it surfaces in actuals.
Conversion rate by lead profile
Train a conversion model on historical closed/lost deal data. Cluster inbound leads by firmographic and behavioral attributes (company size, industry, pages visited, time to first action). Run inference and compare predicted conversion rates across clusters to prioritize outreach.
Recipe 4: Baseline, Then Monitor
This recipe helps you train an anomaly detection model on clean historical data as your baseline, then run inference on each new period's data to detect drift over time.
Why try it? Anomaly detection is relative; it needs to know what normal looks like. By anchoring on a stable historical baseline, you can detect when current data starts behaving differently, before the difference shows up in your lagging indicators.
1
Upload a clean historical dataset as your baseline
Choose a period that represents normal operations (not one with known incidents, seasonality outliers, or data quality issues). Upload it as your baseline dataset.
2
Train an anomaly detection model on the baseline
Run Anomaly Detection on the baseline. The model encodes what normal looks like for this data.
3
Run inference on each new period
Each week, month, or quarter, upload your new data and run inference with the baseline model. Higher anomaly scores = greater deviation from the baseline pattern.
4
Track anomaly scores over time
Compare aggregate anomaly scores across periods. A rising score signals drift. Combine with Recipe 2 (Detect → Diagnose) to identify which factors are driving the drift.
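Here is one way to sketch this monitoring loop with scikit-learn, assuming numeric operational metrics; the aggregate drift statistic used here (mean negated `score_samples` from an IsolationForest) is one reasonable choice, not a prescribed one:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Steps 1-2: fit the detector on a clean baseline period of three
# operational metrics; the model encodes what "normal" looks like.
baseline = rng.normal(loc=[100, 20, 5], scale=[10, 2, 1], size=(500, 3))
detector = IsolationForest(random_state=0).fit(baseline)

# Step 3: score each new period against the frozen baseline model.
def period_drift(detector, X_period):
    # Mean negated score_samples: higher value = further from baseline.
    return float(-detector.score_samples(X_period).mean())

# Step 4: track the statistic period over period; a rising value signals drift.
stable_period = rng.normal(loc=[100, 20, 5], scale=[10, 2, 1], size=(200, 3))
drifted_period = rng.normal(loc=[130, 20, 5], scale=[10, 2, 1], size=(200, 3))
drift_scores = [period_drift(detector, p) for p in (stable_period, drifted_period)]
```

In practice you would persist the fitted detector and append one `period_drift` value per week, month, or quarter to whatever dashboard tracks your leading indicators.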
ARR drift detection
Baseline on 12 months of stable ARR data (new bookings, expansion, contraction, churn by cohort). Run inference on each subsequent month. A spike in anomaly score in month 15 might catch an unusual contraction pattern two months before it shows up in net revenue retention.
Transaction volume and value drift
Baseline on a representative period of transaction data (amounts, frequencies, category distributions, timing). Run inference on each new week or month. Useful for detecting shifts in customer purchasing behavior, seasonal anomalies outside expected ranges, or early signs of fraud pattern changes.
Sales pipeline velocity monitoring
Baseline on quarters where pipeline converted at expected rates (stage progression speed, deal age by size, close rate by rep). Run inference each quarter. Flags when pipeline behavior deviates — deals stalling in a stage they don’t normally stall in, or close rates dropping in a segment before it shows in quota attainment.
Operational metric drift
Baseline on normal operating periods for any numeric operational dataset — logistics (delivery times, fill rates, order volumes), finance (expense ratios, budget utilization, invoice timing), or SaaS usage (logins, feature calls, API volume). Run inference each period to catch shifts before they become incidents.
Not sure how to interpret your results or structure your next step? Paste your data summary into Claude or ChatGPT with one of these prompts.
Recipe 1: Segment, Then Predict
I'm running a two-step machine learning analysis on tabular data.

Step 1: I clustered my dataset using an unsupervised clustering model. Each row now has a cluster label (e.g., Cluster 0, Cluster 1, Cluster 2). Here is a summary of each cluster — the average values of key columns per cluster:

[PASTE CLUSTER SUMMARY HERE]

Step 2: I want to train a separate prediction model on each cluster to predict [TARGET COLUMN — e.g., churned, converted, defaulted].

Please help me:
1. Describe what each cluster likely represents in plain business terms based on the column averages
2. Identify which columns are likely to be the strongest predictors of [TARGET COLUMN] within each cluster, and why they might differ across clusters
3. Flag any clusters that may be too small or too homogeneous to train a reliable prediction model on
4. Suggest whether any clusters should be merged before training
Recipe 2: Detect, Then Diagnose
I ran anomaly detection on my dataset and flagged a set of records with high anomaly scores. I then ran factor analysis on just the flagged records to understand what they have in common.

Here is the factor analysis output — the columns with the highest loadings on each factor:

[PASTE FACTOR ANALYSIS RESULTS HERE]

The dataset contains the following columns:

[LIST YOUR COLUMN NAMES AND WHAT THEY REPRESENT]

Please help me:
1. Interpret each factor in plain terms — what business concept does it represent?
2. Explain what the combination of high-loading columns suggests about why these records were flagged as anomalies
3. Suggest 2-3 hypotheses about the root cause of the anomaly pattern
4. Recommend what additional data or context I should look at to confirm or rule out each hypothesis
Recipe 3: Predict, Then Slice by Segment
I trained a prediction model to predict [TARGET — e.g., churn, conversion, default] and ran inference on my current dataset. I also clustered the same dataset into segments. Each record now has both a predicted score and a cluster label.

Here is the distribution of predicted scores by cluster:

[PASTE CLUSTER × PREDICTION SUMMARY — e.g., average predicted score, % above threshold, count per cluster]

The clusters have these approximate characteristics:

[PASTE CLUSTER SUMMARY OR DESCRIPTION]

Please help me:
1. Identify which cluster(s) represent the highest concentration of risk or opportunity based on the predicted scores
2. Explain in plain terms why that cluster might score higher, based on its characteristics
3. Suggest what action to take for each cluster — prioritize, monitor, investigate, or ignore
4. Flag any clusters where the prediction scores seem surprising or inconsistent with the cluster profile, which might indicate a data issue
Recipe 4: Baseline, Then Monitor
I trained an anomaly detection model on a historical baseline dataset representing normal operations. I've been running inference on each new period's data and tracking aggregate anomaly scores over time.

Here are my anomaly scores by period:

[PASTE PERIOD-BY-PERIOD ANOMALY SCORE SUMMARY]

The dataset tracks the following metrics:

[LIST YOUR COLUMN NAMES AND WHAT THEY REPRESENT]

Known context: [ADD ANY KNOWN EVENTS — e.g., "we ran a promotion in March", "a new pricing tier launched in Q3", "leave blank if none"]

Please help me:
1. Identify which periods show meaningful deviation from baseline and which are within normal variation
2. Suggest whether the drift pattern looks like a sudden shift (single period spike) or a gradual trend (scores rising over multiple periods)
3. Recommend which columns to investigate first based on the period where scores started rising
4. Help me determine whether this drift is likely operational (a real change in behavior) or a data quality issue (a change in how data was recorded)