# TAE Pinmap Parser: Complete Rule Reference

> **Version:** 1.0.0 | **Organization:** Tai-Action | **Date:** 2026-03-31
>
> **Abstract:** Excel-based hardware pin requirement parser for Tai-Action MCU series. Converts non-standardized customer pin specification Excel documents into strongly-typed validated JSON using Pydantic v2. Covers ADC, HRPWM, Communication (UART/SPI/CAN), Timer, GPIO, and System pin subsystems with embedded image extraction.
>
> **Note:** This is an auto-compiled version. The canonical source for each rule is the individual file in `rules/`. When in doubt, refer to the source files.

---

## Table of Contents

1. [Pinmap Parser Installation and Setup](#1-pinmap-parser-installation-and-setup) (HIGH)
2. [Excel Pinmap Parsing Workflow](#2-excel-pinmap-parsing-workflow) (HIGH)
3. [Output JSON Schema Reference](#3-output-json-schema-reference) (MEDIUM)
4. [Pinmap Parser Troubleshooting](#4-pinmap-parser-troubleshooting) (HIGH)

---

## 1. Pinmap Parser Installation and Setup

**Impact: HIGH** - Without a correct Python environment and dependencies, the parser cannot execute. openpyxl, Pydantic v2, and Pillow are all required.

### Prerequisites

- Python 3.10+
- pip package manager

### Step 1: Install Dependencies

```bash
pip install -r tools/requirements.txt
```

Required packages:

| Package | Version | Purpose |
|---------|---------|---------|
| openpyxl | >= 3.1 | Excel (.xlsx) file reading |
| pydantic | >= 2.0 | Data validation and JSON serialization |
| Pillow | >= 10.0 | Embedded image extraction |

### Step 2: Verify Installation

```bash
python -c "import openpyxl; import pydantic; import PIL; print('OK')"
```

### Step 3: Basic Usage Test

```bash
python tools/tae_pinmap_parser.py input.xlsx
```

Expected output: chip model, pin counts per subsystem, image count.
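
The import check in Step 2 confirms that the packages load, but not that they meet the minimum versions listed under Step 1. The following is a minimal sketch of a version check. The names `parse_version` and `check_minimums` are hypothetical and not part of the parser, and the version parsing is deliberately naive (leading numeric components only).

```python
from importlib import metadata

# Minimum versions copied from the "Required packages" table.
MINIMUMS = {"openpyxl": (3, 1), "pydantic": (2, 0), "Pillow": (10, 0)}


def parse_version(text: str) -> tuple:
    """Naively parse the leading numeric components of a version string."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_minimums(minimums=MINIMUMS, get_version=metadata.version) -> dict:
    """Return {package: problem} for every missing or outdated package."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if parse_version(installed) < minimum:
            wanted = ".".join(map(str, minimum))
            problems[name] = f"found {installed}, need >= {wanted}"
    return problems


if __name__ == "__main__":
    # Prints nothing when every requirement is satisfied.
    for package, problem in check_minimums().items():
        print(f"{package}: {problem}")
```

The `get_version` parameter exists only so the check can be exercised without touching the real environment.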
### Global Tool Registration (Optional)

Create a symlink for CLI access:

```bash
chmod +x tools/tae_pinmap_parser.py
ln -s "$(realpath tools/tae_pinmap_parser.py)" ~/.local/bin/tae-pinmap-parser
```

After registration:

```bash
tae-pinmap-parser input.xlsx                    # Output to ./input.json
tae-pinmap-parser input.xlsx /tmp               # Output to /tmp/input.json
tae-pinmap-parser input.xlsx /tmp -o out.json   # Output to /tmp/out.json
```

### Virtual Environment (Recommended)

```bash
python -m venv .venv
source .venv/bin/activate  # Linux/macOS
pip install -r tools/requirements.txt
```

### Running Tests

```bash
# Set test Excel file path (if not using default)
export TAE_TEST_EXCEL=/path/to/需求1.xlsx
pytest tools/tests/test_acceptance.py -v
```

> **Note:** Tests require the sample Excel file (`需求1.xlsx`). Default path: `/home/leo/work/workspaces/TAE-CoCreate-Cases/case1/prd/需求1.xlsx`. Override via `TAE_TEST_EXCEL` environment variable.

---

## 2. Excel Pinmap Parsing Workflow

**Impact: HIGH** - The core parsing pipeline transforms non-standardized customer Excel documents into validated structured JSON. Incorrect usage leads to missing or malformed output.

### Pipeline Overview

```
Read Excel → ffill merged cells → Clean data → Group by module → Pydantic validate → JSON output
```

### Input Requirements

Excel file (`.xlsx`) must contain a header row (row 1) with these exact Chinese column names:

| Column Name | Description |
|------------|-------------|
| `芯片型号` | Chip model (e.g., TAE32G5800) |
| `模块` | Module/subsystem name |
| `引脚功能` | Pin function name |
| `IO索引` | IO index (e.g., PA0, PB11) |
| `引脚编号` | Pin number (integer) |
| `编号` | Sequence ID |
| `DMA使用` | DMA enabled (1.0 = true) |
| `波特率` | Baud rate / frequency |
| `功能备注` | Role / function description |

The first worksheet is used. Data starts from row 2.
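
The header requirement above can be checked before running the parser. The following is a minimal sketch, not the parser's own validation. `missing_columns` is a hypothetical helper that takes the values of row 1 as a plain list (reading that row from the workbook is left out) and compares them exactly, with no whitespace trimming.

```python
# Required header names, copied from the Input Requirements table.
REQUIRED_COLUMNS = [
    "芯片型号", "模块", "引脚功能", "IO索引", "引脚编号",
    "编号", "DMA使用", "波特率", "功能备注",
]


def missing_columns(header_row: list) -> list:
    """Return the required column names that are absent from row 1.

    Non-string cells (for example empty cells read as None) are ignored,
    and extra columns do not count as a problem.
    """
    present = {cell for cell in header_row if isinstance(cell, str)}
    return [name for name in REQUIRED_COLUMNS if name not in present]


if __name__ == "__main__":
    # Invented header with two required columns missing and one extra column.
    header = ["芯片型号", "模块", "引脚功能", "IO索引", "引脚编号", "编号", "DMA使用", "备注"]
    print(missing_columns(header))
```

An empty result means every required column is present. Any names returned point to the header cells to correct in the workbook.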
### Step 1: Basic Invocation

```bash
python tools/tae_pinmap_parser.py input.xlsx
# Output: ./input.json + ./images/ (if images present)
```

### Step 2: Command-Line Options

```
tae-pinmap-parser <input> [outdir] [-o output.json] [--no-images]
```

| Argument | Required | Description |
|----------|----------|-------------|
| `input` | Yes | Input Excel file path (.xlsx) |
| `outdir` | No | Output directory (default: current dir) |
| `-o, --output` | No | Custom output filename |
| `--no-images` | No | Skip embedded image extraction |

### Step 3: Understand the Output

Console output summary example:

```
Chip Model: TAE32G5800
ADC: 11 pins
HRPWM: 10 pins
Communication: 3 protocols
  - UART (UART2): 2 pins, rate=115200
  - SPI (SPI0): 4 pins, rate=None
  - CAN (CAN1): 2 pins, rate=250000
Timer Capture: 2 pins
GPIO Output: 5 pins
GPIO Input: 7 pins
System: 6 pins
Images: 2 extracted to images
```

### Step 4: Verify Output JSON

Check that the output JSON contains `Chip_Model` and `Subsystems` with all 7 subsystem categories.

### Data Processing Details

#### Forward Fill (ffill)

`芯片型号` and `模块` columns use merged cells in Excel. The parser performs forward fill: when a cell is empty, it inherits the last non-empty value above it. `引脚功能` is NOT forward-filled.

#### Module Normalization

Raw module names are mapped to normalized keys:

| Raw Value | Normalized Key |
|-----------|---------------|
| `ADC` | `ADC` |
| `HRPWM` | `HRPWM` |
| `COM_UART` | `COM_UART` |
| `COM_SPI0` | `COM_SPI0` |
| `CAN` | `CAN` |
| `NRST` | `NRST` |
| `BOOT` | `BOOT` |
| `TIMER捕获` | `TIMER_CAPTURE` |
| `GPIO` | `GPIO` |
| `EMU` | `EMU` |
| `OSC` | `OSC` |

Matching is case-insensitive. Chinese characters (`TIMER捕获`) require exact raw match.
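
Forward fill and module normalization, as described above, can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the parser's code: `forward_fill` and `normalize_module` are hypothetical names, rows are plain dicts keyed by the Chinese column names, and whitespace-only cells are treated as empty.

```python
FILL_COLUMNS = ("芯片型号", "模块")  # 引脚功能 is deliberately not filled

# Raw module names (uppercased) mapped to normalized keys, per the table above.
MODULE_MAP = {
    "ADC": "ADC", "HRPWM": "HRPWM", "COM_UART": "COM_UART",
    "COM_SPI0": "COM_SPI0", "CAN": "CAN", "NRST": "NRST", "BOOT": "BOOT",
    "TIMER捕获": "TIMER_CAPTURE", "GPIO": "GPIO", "EMU": "EMU", "OSC": "OSC",
}


def is_empty(value) -> bool:
    """Treat None and whitespace-only strings as empty cells (an assumption)."""
    return value is None or (isinstance(value, str) and not value.strip())


def forward_fill(rows: list, columns=FILL_COLUMNS) -> list:
    """Copy rows, replacing empty cells in `columns` with the last value above."""
    last = {}
    filled = []
    for row in rows:
        new_row = dict(row)  # leave the caller's rows untouched
        for col in columns:
            if is_empty(new_row.get(col)):
                new_row[col] = last.get(col)
            else:
                last[col] = new_row[col]
        filled.append(new_row)
    return filled


def normalize_module(raw: str) -> str:
    """Uppercase the name and look up its normalized key.

    Names that are not in the table fall through uppercased.
    """
    key = raw.strip().upper()
    return MODULE_MAP.get(key, key)
```

Uppercasing leaves the Chinese characters unchanged, so `timer捕获` still resolves to `TIMER_CAPTURE`, while a spelling such as `TIMER_CAPTURE` in the workbook would simply pass through without matching the `TIMER捕获` entry.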
#### Baud Rate / Frequency Parsing

Supported formats:

| Input | Parsed Value |
|-------|-------------|
| `10Khz` | 10000 |
| `100KHZ` | 100000 |
| `250Kbps` | 250000 |
| `115200` | 115200 |
| Empty/NaN | None |

#### Communication Protocol Aggregation

Communication rows (`COM_UART`, `COM_SPI0`, `CAN`) are clustered by `(Protocol, Instance)`:

- **UART**: `UART2_TX` → Protocol=`UART`, Instance=`UART2`
- **SPI**: Pin functions `CS/CLK/MISO/MOSI` with module `COM_SPI0` → Protocol=`SPI`, Instance=`SPI0`
- **CAN**: `CAN1_TXD` → Protocol=`CAN`, Instance=`CAN1`

Multiple pins with the same protocol+instance are grouped into a single `CommunicationConfig`.

#### GPIO Direction Handling

The GPIO module uses sub-header rows to separate output and input:

- `IO_OUT` row: all following GPIO rows get `Direction="OUT"`
- `IO_IN` row: all following GPIO rows get `Direction="IN"`

> **Note:** The `IO_IN` sub-header row itself may carry pin data (e.g., PC1). It is treated as both a direction marker and a data row.

#### Image Extraction

When images are embedded in the Excel file:

1. Images are saved to `images/` directory next to the output JSON
2. Each image gets metadata: `filename`, `cell`, `anchor_type`, `format`, `width`, `height`
3. Images are associated to data rows by their anchor row position
4. Use `--no-images` to skip extraction

---

## 3. Output JSON Schema Reference

**Impact: MEDIUM** - Understanding the output schema is essential for downstream consumers that process the parsed pin configuration JSON.

### Top-Level Structure

```json
{
  "Chip_Model": "TAE32G5800",
  "Subsystems": { ... }
}
```

| Field | Type | Description |
|-------|------|-------------|
| `Chip_Model` | `str` | Chip model identifier from Excel |
| `Subsystems` | `Subsystems` | Container for all subsystem pin configurations |

### Subsystems Object

| Field | Type | Description |
|-------|------|-------------|
| `Actuation_HRPWM` | `list[HRPWMConfig]` | HRPWM output pins with frequency |
| `Sensing_ADC` | `list[ADCConfig]` | ADC input pins |
| `Communication_and_Storage` | `list[CommunicationConfig]` | UART/SPI/CAN protocol clusters |
| `Sensing_Timer` | `list[TimerCaptureConfig]` | Timer capture pins |
| `GPIO_Output` | `list[GPIOConfig]` | Digital output pins |
| `GPIO_Input` | `list[GPIOConfig]` | Digital input pins |
| `System` | `list[SystemConfig]` | System pins (NRST, BOOT, EMU, OSC) |

### Data Models

#### PinConfig (Base)

Base model inherited by ADC, HRPWM, Timer, GPIO, System configs.

| Field | Type | Optional | Description |
|-------|------|----------|-------------|
| `ID` | `int` | Yes | Sequence number from Excel |
| `Pin_Function` | `str` | No | Pin function name (e.g., `ADC0_IN1`) |
| `IO_Index` | `str` | Yes | IO port index (e.g., `PA0`) |
| `Pin_Number` | `int` | Yes | Physical pin number |
| `DMA_Enabled` | `bool` | Yes | DMA support (only present when true) |
| `Role` | `str` | Yes | Function description / customer requirement |
| `images` | `list[ImageInfo]` | Yes | Associated embedded images |

#### HRPWMConfig (extends PinConfig)

| Additional Field | Type | Optional | Description |
|-----------------|------|----------|-------------|
| `Frequency_Hz` | `int` | Yes | PWM frequency in Hz (parsed from mixed units) |

#### ADCConfig (extends PinConfig)

No additional fields.
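
Because optional fields are omitted from the JSON when they carry no value, downstream code should read them defensively. The following is a minimal consumer-side sketch that assumes only the field names in the tables above. `summarize_pin` is a hypothetical helper, and the sample document (including the pin name `HRPWM0_A`) is invented for illustration, not real parser output.

```python
import json


def summarize_pin(pin: dict) -> str:
    """Format one PinConfig-style entry, tolerating omitted optional fields."""
    function = pin["Pin_Function"]          # the only required base field
    io_index = pin.get("IO_Index", "?")
    pin_number = pin.get("Pin_Number", "?")
    parts = [f"{function} @ {io_index} (pin {pin_number})"]
    if pin.get("DMA_Enabled"):              # absent means DMA is not used
        parts.append("DMA")
    if "Frequency_Hz" in pin:               # present on HRPWMConfig entries only
        parts.append(f"{pin['Frequency_Hz']} Hz")
    return ", ".join(parts)


# Invented sample shaped like the documented schema.
sample = json.loads("""
{
  "Chip_Model": "TAE32G5800",
  "Subsystems": {
    "Actuation_HRPWM": [
      {"Pin_Function": "HRPWM0_A", "IO_Index": "PA0", "Pin_Number": 12,
       "Frequency_Hz": 100000}
    ],
    "Sensing_ADC": [
      {"Pin_Function": "ADC0_IN1", "DMA_Enabled": true}
    ]
  }
}
""")

for category, pins in sample["Subsystems"].items():
    for pin in pins:
        print(f"{category}: {summarize_pin(pin)}")
```

Using `dict.get` for optional fields keeps the consumer working whether or not a given field was present in the source workbook.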
#### CommunicationConfig

| Field | Type | Optional | Description |
|-------|------|----------|-------------|
| `Protocol` | `str` | No | Protocol type: `UART`, `SPI`, `CAN` |
| `Instance` | `str` | Yes | Protocol instance: `UART2`, `SPI0`, `CAN1` |
| `Pins` | `list[CommPin]` | No | Pins belonging to this protocol instance |
| `Baud_Rate` | `int` | Yes | Baud rate in bps (e.g., 115200, 250000) |
| `Role` | `str` | Yes | Aggregated role descriptions |

#### CommPin

| Field | Type | Optional | Description |
|-------|------|----------|-------------|
| `Pin_Function` | `str` | No | Pin function (e.g., `UART2_TX`, `CS`, `CLK`) |
| `IO_Index` | `str` | Yes | IO port index |
| `Pin_Number` | `int` | Yes | Physical pin number |

#### TimerCaptureConfig (extends PinConfig)

No additional fields.

#### GPIOConfig (extends PinConfig)

| Additional Field | Type | Optional | Description |
|-----------------|------|----------|-------------|
| `Direction` | `str` | Yes | `"OUT"` or `"IN"` |

#### SystemConfig (extends PinConfig)

| Additional Field | Type | Optional | Description |
|-----------------|------|----------|-------------|
| `Category` | `str` | No | One of: `NRST`, `BOOT`, `EMU`, `OSC` |

#### ImageInfo

| Field | Type | Optional | Description |
|-------|------|----------|-------------|
| `filename` | `str` | No | Extracted image filename (e.g., `img_0.png`) |
| `cell` | `str` | Yes | Excel cell reference (e.g., `J9`) |
| `anchor_type` | `str` | Yes | `one_cell`, `two_cell`, `absolute`, `string` |
| `format` | `str` | Yes | Image format (e.g., `png`) |
| `width` | `int` | Yes | Image width in pixels |
| `height` | `int` | Yes | Image height in pixels |

### Serialization Notes

- **`exclude_none=True`**: Optional fields with `None` value are omitted from JSON output
- **`indent=2`**: JSON is formatted with 2-space indentation
- **Encoding**: UTF-8, Chinese characters preserved (no ASCII escaping)

---

## 4. Pinmap Parser Troubleshooting

**Impact: HIGH** - Parsing failures from non-standard Excel formats are the most common issue. A systematic troubleshooting flow prevents users from getting stuck.

### Decision Tree

```
Problem reported
├── Parser crashes on Excel file?
│   └── Symptom 1: File Read Error
├── Output JSON missing subsystem data?
│   └── Symptom 2: Missing Module Data
├── Field values incorrect or unexpected?
│   └── Symptom 3: Data Quality Issues
└── Images not extracted?
    └── Symptom 4: Image Extraction Failure
```

### Symptom 1: File Read Error

Parser throws an exception when reading the Excel file.

**Check 1: File format**

- Must be `.xlsx` (Office Open XML), not `.xls` or `.csv`
- openpyxl does not support legacy `.xls` format

**Check 2: File integrity**

- File is not corrupted, password-protected, or locked by another process
- Try opening in Excel/LibreOffice to verify

**Check 3: Dependencies installed**

```bash
# Quote the specifiers so the shell does not treat ">" as output redirection
pip install "openpyxl>=3.1" "pydantic>=2.0" "Pillow>=10.0"
```

**Check 4: Header row format**

- Row 1 must contain exact Chinese column names: `芯片型号`, `模块`, `引脚功能`, `IO索引`, `引脚编号`, `编号`, `DMA使用`, `波特率`, `功能备注`
- Extra columns are ignored but required columns must be present

### Symptom 2: Missing Module Data

Output JSON has empty arrays for some subsystem categories.
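
To see at a glance which categories are affected, the output file can be inspected with a few lines of Python. This sketch assumes the seven `Subsystems` keys from section 3. `empty_subsystems` and `check_file` are hypothetical helpers, not part of the parser.

```python
import json

# Subsystem keys as listed in the "Subsystems Object" table.
EXPECTED_SUBSYSTEMS = [
    "Actuation_HRPWM", "Sensing_ADC", "Communication_and_Storage",
    "Sensing_Timer", "GPIO_Output", "GPIO_Input", "System",
]


def empty_subsystems(document: dict) -> list:
    """Return the expected subsystem keys that are missing or hold no entries."""
    subsystems = document.get("Subsystems") or {}
    return [key for key in EXPECTED_SUBSYSTEMS if not subsystems.get(key)]


def check_file(path: str) -> list:
    """Load a parser output file (UTF-8 JSON) and report its empty categories."""
    with open(path, encoding="utf-8") as handle:
        return empty_subsystems(json.load(handle))
```

Each key that comes back corresponds to a module group whose rows were not recognized, which narrows down which part of the `模块` column to inspect.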
**Check 1: Module column values**

- The `模块` column must contain recognized module names
- Supported values: `ADC`, `HRPWM`, `COM_UART`, `COM_SPI0`, `CAN`, `NRST`, `BOOT`, `TIMER捕获`, `GPIO`, `EMU`, `OSC`

**Check 2: Forward fill behavior**

- The parser uses forward fill on the `模块` column
- The module name only needs to appear in the first row of each group
- Subsequent rows in the same group must be **empty** (not a different value)
- If a different value appears, it starts a new group

**Check 3: Module name normalization**

- Matching is case-insensitive for Latin characters
- `TIMER捕获` requires exact Chinese characters (not `TIMER_CAPTURE` in Excel)
- Unrecognized module names are used as-is (uppercased) but won't match any builder function

**Check 4: Chip model column**

- `芯片型号` must have a value in at least the first data row
- If all cells in this column are empty, `Chip_Model` will be `None`

### Symptom 3: Data Quality Issues

Field values in the output JSON are incorrect or unexpected.
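
When a value looks wrong, it can help to test the raw cell text against the documented rules in isolation. The sketch below reproduces the baud rate and frequency formats listed in section 2. It is a reconstruction under assumptions: the regular expression, the trimming of surrounding whitespace, and the name `parse_rate` are not taken from the parser, so treat the result as an indication of whether a string would plausibly be recognized.

```python
import re
from typing import Optional

# Digits, optionally followed directly by K plus "hz" or "bps" (any case).
# No whitespace is allowed between number and unit, mirroring the documented
# behavior that "10 KHz" is not recognized.
_RATE = re.compile(r"(\d+)(k(?:hz|bps))?", re.IGNORECASE)


def parse_rate(raw) -> Optional[int]:
    """Parse a baud rate or frequency cell into an integer, or None."""
    if raw is None:
        return None
    if isinstance(raw, float):
        if raw != raw:          # NaN is the only float not equal to itself
            return None
        return int(raw)         # numeric cells such as 115200.0
    if isinstance(raw, int):
        return raw
    match = _RATE.fullmatch(str(raw).strip())
    if match is None:
        return None
    value = int(match.group(1))
    return value * 1000 if match.group(2) else value


if __name__ == "__main__":
    for cell in ["10Khz", "100KHZ", "250Kbps", "115200", "10 KHz", "10k", ""]:
        print(repr(cell), "->", parse_rate(cell))
```

Using `fullmatch` rather than `search` is what makes strings with trailing or embedded extra characters fail, instead of being partially parsed.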
**Check 1: Baud rate / frequency not parsed**

- Must match recognized patterns: `10Khz`, `100KHZ`, `250Kbps`, `115200`
- Unrecognized formats (e.g., `10 KHz` with space, `10k`) return `None`
- Check the raw Excel value for unexpected characters

**Check 2: DMA values**

- Only `1.0` or `1` are treated as `True`
- Any other value or empty cell becomes `False` / omitted
- Text values like "yes" or "Y" are NOT recognized

**Check 3: Pin numbers**

- Must be numeric in Excel; non-numeric values become `None`
- Floating point values (e.g., `71.0`) are automatically converted to integers

**Check 4: Communication pins not clustered**

- Pin function must match patterns: `UART2_TX`, `CAN1_TXD`
- SPI pins must be one of: `CS`, `CLK`, `MISO`, `MOSI` with module `COM_SPI0`
- Unrecognized pin function formats are silently skipped

**Check 5: Invisible characters**

- The parser removes `\n`, extra spaces, and invisible characters
- If the raw Excel value contains special Unicode characters, they may not be cleaned
- The `Role` field uses comma+space to replace newlines

### Symptom 4: Image Extraction Failure

No images extracted or images directory not created.
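
Before working through the checks, it can be useful to confirm that the workbook contains embedded image files at all. An `.xlsx` file is a ZIP archive, and embedded pictures are normally stored under `xl/media/`. The sketch below lists those entries using only the standard library. `list_embedded_media` is a hypothetical helper, independent of the parser's own extraction.

```python
import zipfile


def list_embedded_media(xlsx_path) -> list:
    """Return the archive member names under xl/media/ (embedded pictures).

    `xlsx_path` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(xlsx_path) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.startswith("xl/media/") and not name.endswith("/")
        )
```

An empty list suggests the pictures are linked rather than embedded, or that the workbook has none. A non-empty list points the investigation toward the flag, Pillow, or output permissions instead.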
**Check 1: `--no-images` flag**

- Verify the flag is not set when images are expected

**Check 2: Pillow installed**

```bash
# Quote the specifier so the shell does not treat ">" as output redirection
pip install "Pillow>=10.0"
```

**Check 3: Images must be embedded**

- Linked images (URLs) are not supported
- Images must be embedded directly in the Excel file (Insert > Picture)

**Check 4: Output directory writable**

- The `images/` directory is created next to the output JSON
- Verify write permissions on the parent directory

### Common Error Messages

| Error | Likely Cause | Solution |
|-------|-------------|----------|
| `KeyError: '芯片型号'` | Header row missing or wrong column names | Verify Excel has exact Chinese headers in row 1 |
| `ValidationError` from Pydantic | Data type mismatch after cleaning | Check raw Excel values for unexpected formats |
| `FileNotFoundError` | Input path incorrect | Verify Excel file path exists |
| `ModuleNotFoundError: openpyxl` | Dependencies not installed | `pip install -r tools/requirements.txt` |
| `PermissionError` on images dir | No write permission | Check output directory permissions |
| Empty JSON (all arrays `[]`) | No module values in Excel | Check `模块` column has values in first row of each group |
| `zipfile.BadZipFile` | Corrupted or non-xlsx file | Verify file is valid `.xlsx` format |
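
Several of the errors in the table can be caught before the parser runs. The following is a minimal pre-flight sketch that assumes nothing about the parser's internals. `preflight` is a hypothetical helper and its messages are illustrative.

```python
import zipfile
from pathlib import Path


def preflight(path: str) -> list:
    """Return a list of problems that would likely stop the parser early."""
    file = Path(path)
    if not file.is_file():
        return [f"{path}: file not found (FileNotFoundError)"]
    problems = []
    if file.suffix.lower() != ".xlsx":
        problems.append(f"{path}: expected a .xlsx file, got '{file.suffix}'")
    if not zipfile.is_zipfile(file):
        # A real .xlsx is a ZIP container; failing this check matches the
        # zipfile.BadZipFile row of the table.
        problems.append(f"{path}: not a valid ZIP container (zipfile.BadZipFile)")
    return problems
```

An empty list only means the file exists, has the expected extension, and is a ZIP container. Header names and cell contents still need the checks described under Symptoms 1 to 3.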