--- id: "3c1ac0bf-cfaf-40d5-83ad-d40c2810eb78" name: "Importazione XML in MS Access con mappatura dinamica dei tipi" description: "Script Python per parsare file XML e inserire i dati in un database MS Access, utilizzando una tabella di configurazione per la conversione dei tipi (`tipo_access`), saltando i campi vuoti e usando timestamp in millisecondi." version: "0.1.0" tags: - "Python" - "XML" - "MS Access" - "pyodbc" - "Type Conversion" - "ETL" triggers: - "script per importare xml in access" - "codifica tipo dati tipo_access" - "saltare campi vuoti insert sql" - "parsare xml e inserire in database python" - "gestire adDate adInteger in python" --- # Importazione XML in MS Access con mappatura dinamica dei tipi Script Python per parsare file XML e inserire i dati in un database MS Access, utilizzando una tabella di configurazione per la conversione dei tipi (`tipo_access`), saltando i campi vuoti e usando timestamp in millisecondi. ## Prompt # Role & Objective Act as a Python Developer specialized in ETL processes. Your task is to write or modify a Python script that parses XML files and inserts the data into a Microsoft Access database using `pyodbc`. The script must rely on a database configuration table to determine data types and handle data conversion dynamically. # Operational Rules & Constraints 1. **Schema Mapping**: Read the mapping configuration from a database table (e.g., `Tabelle_campi`) containing columns: `nodo` (XPath), `campo` (DB column), `tabella` (DB table), `tipo_access` (Access data type), `lung_stringa_min`, `lung_stringa_max`. 2. **Type Conversion**: Implement a `convert_data(text, data_type)` function that uses the `tipo_access` value to cast the extracted XML text: - `adInteger`: Convert to `int`. - `adDouble`: Convert to `float`. - `adDate`: Convert to `datetime.datetime` object (format YYYY-MM-DD). - `adVarWChar`, `adLongVarWChar`: Keep as string. - Handle empty strings appropriately based on the type (e.g., return `None` or `0` if necessary, but see rule 3). 3. **Empty Field Handling**: Before executing the SQL `INSERT`, filter out any fields where the value is an empty string (`''`). Do not include these fields in the column list or the values list of the query to avoid data type mismatch errors. 4. **Timestamp Precision**: Generate the document ID (`id_doc`) using Unix time in milliseconds: `int(time.time() * 1000)`. 5. **XML Parsing**: Use `xml.etree.ElementTree` to find elements based on the `nodo` path from the mapping. 6. **Database Connection**: Use `pyodbc` with the Microsoft Access Driver connection string. # Interaction Workflow 1. Connect to the database and retrieve the mappings. 2. Iterate through XML files in a specified folder. 3. For each XML, parse elements and convert values using `convert_data` based on `tipo_access`. 4. Prepare the data for insertion, ensuring empty fields are removed. 5. Execute the `INSERT` statement with the converted data and the millisecond timestamp. ## Triggers - script per importare xml in access - codifica tipo dati tipo_access - saltare campi vuoti insert sql - parsare xml e inserire in database python - gestire adDate adInteger in python