---
name: predicting-with-tabpfn
description: >
  Predicts outcomes on structured/tabular data using TabPFN via the tabpfn-stg MCP server.
  Handles classification and regression on CSVs, spreadsheets, and inline tables.
  Use when the user wants to classify rows, predict numeric values, forecast, estimate,
  or score tabular data, or mentions TabPFN, Prior Labs, or "predict X from this data."
  Works with and without code execution.
---

# Predicting with TabPFN

Call `tabpfn` MCP tools to make predictions. Do not write ML code.

## Before any tool call, confirm with the user

1. **Target column**: which column to predict. Do not guess.
2. **Task type**: `classification` or `regression`. If ambiguous (e.g. numeric column with few distinct values), ask.
3. **Train/test split**: what's training data vs what needs predictions. If all targets are filled in, suggest 80/20 and confirm.

## Data quality: fix vs leave alone

TabPFN handles most data issues natively. Do NOT pre-process unless the issue is in the "MUST fix" list.

### MUST fix (will cause errors or garbage results)

- **NaN/null in target column**: Remove rows where the target is missing from training data. This is the #1 cause of failures.
- **Target column in test set**: Test CSV must never contain the target column.
- **Column mismatch**: Train and test must have identical feature columns in the same order (excluding target).
- **Entirely empty columns**: Drop columns that are 100% null.
- **Single-value target**: If the target has only one unique value, there's nothing to predict. Tell the user.
- **Duplicate column names**: Rename or drop.

### Flag to user (won't crash but hurts quality)

- **ID / index columns** (row numbers, UUIDs, auto-increment): Leak or add noise. Ask if they should be excluded.
- **Free text columns** (descriptions, comments, long strings): TabPFN is for structured features. Suggest dropping.
- **Timestamp columns**: Raw datetime strings won't help.
  With code execution, offer to extract numeric features (year, month, day_of_week). Without it, flag the column to the user.
- **Extreme class imbalance** (e.g. 99/1): TabPFN handles imbalance, but results may skew toward majority class. Mention when presenting results.
- **High cardinality categoricals** (>100 unique values): Can handle, but performance may degrade. Flag if spotted.
- **Near-duplicate rows**: Large numbers of identical or near-identical rows may distort predictions. Worth mentioning.

### Do NOT touch (TabPFN handles natively)

- Missing values in features: pass as null, do not impute.
- Mixed types in a column: do not encode.
- Categorical features: do not one-hot or label encode.
- Unnormalized numerics: do not standardize or scale.
- Feature selection: TabPFN handles irrelevant features internally.

## Inline data (small tables already in chat)

```
tabpfn:fit_and_predict_inline(
  X_train=<2D array, feature columns only>,
  y_train=<1D array, aligned with X_train>,
  X_test=<2D array, same columns/order as X_train>,
  task_type="classification" | "regression",
  output_type="preds" | "mean"
)
```

Default `output_type`: `"preds"` for classification, `"mean"` for regression. Use `"probas"` or `"quantiles"` only if the user asks.

## File-based data

**Step 1: Upload each file**

```
tabpfn:upload_dataset(filename="train.csv") → dataset_id, upload_url
tabpfn:upload_dataset(filename="test.csv") → dataset_id, upload_url
```

PUT file bytes to each `upload_url`. Do not read file contents into conversation.

If only one file: with code execution, split into train (with target) and test (without target) CSVs, then upload both. Without code execution, ask the user to split or paste a sample for inline use.
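For the single-file case with code execution, the split could look roughly like the sketch below. It uses only the Python standard library. The function name `split_for_tabpfn`, the file paths, and the rule "rows with an empty target become the test set, otherwise hold out 20%" are illustrative assumptions, not part of the TabPFN tools; the split still has to be confirmed with the user.

```python
import csv
import random


def split_for_tabpfn(src_path, target, train_path, test_path,
                     test_fraction=0.2, seed=0):
    """Split one CSV into a train file (with target) and a test file (without)."""
    with open(src_path, newline="") as f:
        reader = csv.DictReader(f)
        columns = list(reader.fieldnames)
        rows = list(reader)

    if target not in columns:
        raise ValueError(f"target column {target!r} not found")
    features = [c for c in columns if c != target]

    # Assumption: an empty target cell marks a row that needs a prediction.
    labeled = [r for r in rows if (r[target] or "").strip() != ""]
    unlabeled = [r for r in rows if (r[target] or "").strip() == ""]

    if unlabeled:
        train_rows, test_rows = labeled, unlabeled
    else:
        # All targets filled in: hold out a random fraction (80/20 by default).
        shuffled = labeled[:]
        random.Random(seed).shuffle(shuffled)
        n_test = max(1, int(len(shuffled) * test_fraction))
        test_rows, train_rows = shuffled[:n_test], shuffled[n_test:]

    # Train keeps the target; test has the same feature columns, same order, no target.
    with open(train_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=features + [target],
                                extrasaction="ignore")
        writer.writeheader()
        writer.writerows(train_rows)
    with open(test_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=features, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(test_rows)
    return len(train_rows), len(test_rows)
```

Rows with a missing target never reach the train file, and the test file never carries the target column, which covers the first two "MUST fix" items in one pass. Feature values are written through unchanged, so nothing is imputed or encoded.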
**Step 2: Fit and predict**

```
tabpfn:fit_and_predict_from_dataset(
  train_dataset_id=<dataset_id from the train upload>,
  test_dataset_id=<dataset_id from the test upload>,
  target_column="<target column name>",
  task_type="classification" | "regression",
  output_type="preds" | "mean"
)
```

## Re-predicting with an existing model

`fit_and_predict_*` calls return a `model_id`. Reuse it for new data:

- `tabpfn:predict_inline(model_id=..., X_test=..., task_type=..., output_type=...)`
- `tabpfn:predict_from_dataset(model_id=..., test_dataset_id=..., task_type=..., output_type=...)`

## output_type options

- Classification: `"preds"` (default), `"probas"`
- Regression: `"mean"` (default), `"median"`, `"mode"`, `"quantiles"`, `"full"`

## Constraints

- Max cells: (train_rows + test_rows) × columns < 20,000,000
- `output_type="full"` regression: test samples must be < 500
- Best for datasets up to ~50k rows, up to ~2000 features
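
With code execution available, the two hard limits can be checked before any upload. The sketch below is one way to do that arithmetic; `check_tabpfn_limits` and the constant names are hypothetical helpers, not part of the MCP server. The list above does not say whether the target column counts toward `columns`, so the caller decides what to pass.

```python
MAX_CELLS = 20_000_000      # (train_rows + test_rows) * columns must stay below this
MAX_FULL_TEST_ROWS = 500    # test-row limit when output_type="full"


def check_tabpfn_limits(train_rows, test_rows, columns, output_type="preds"):
    """Return a list of human-readable problems; an empty list means within limits."""
    problems = []
    cells = (train_rows + test_rows) * columns
    if cells >= MAX_CELLS:
        problems.append(f"{cells:,} cells is not below the {MAX_CELLS:,} limit")
    if output_type == "full" and test_rows >= MAX_FULL_TEST_ROWS:
        problems.append(
            f'output_type="full" needs fewer than {MAX_FULL_TEST_ROWS} test rows'
        )
    return problems


# Example: 40,000 train rows, 10,000 test rows, 100 columns is 5,000,000 cells.
print(check_tabpfn_limits(40_000, 10_000, 100))
```

Both limits are strict inequalities, so a value exactly at the limit is reported as a problem. The "best for" sizes in the last bullet are guidance rather than hard limits and are left out of the check.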