--- name: xlsx description: "Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. When Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc) for: (1) Creating new spreadsheets with formulas and formatting, (2) Reading or analyzing data, (3) Modify existing spreadsheets while preserving formulas, (4) Data analysis and visualization in spreadsheets, or (5) Recalculating formulas" license: Proprietary. LICENSE.txt has complete terms --- # XLSX creation, editing, and analysis ## Overview A user may ask you to create, edit, or analyze the contents of an .xlsx file. You have different tools and workflows available for different tasks. ## CRITICAL: Use Formulas, Not Hardcoded Values **Always use Excel formulas instead of calculating values in Python and hardcoding them.** This ensures the spreadsheet remains dynamic and updateable. ### WRONG - Hardcoding Calculated Values ```python # Bad: Calculating in Python and hardcoding result total = df['Sales'].sum() sheet['B10'] = total # Hardcodes 5000 ``` ### CORRECT - Using Excel Formulas ```python # Good: Let Excel calculate the sum sheet['B10'] = '=SUM(B2:B9)' ``` ## Reading and analyzing data ### Data analysis with pandas ```python import pandas as pd # Read Excel df = pd.read_excel('file.xlsx') # Default: first sheet all_sheets = pd.read_excel('file.xlsx', sheet_name=None) # All sheets as dict # Analyze df.head() # Preview data df.info() # Column info df.describe() # Statistics # Write Excel df.to_excel('output.xlsx', index=False) ``` ## Creating new Excel files ```python from openpyxl import Workbook from openpyxl.styles import Font, PatternFill, Alignment wb = Workbook() sheet = wb.active # Add data sheet['A1'] = 'Hello' sheet['B1'] = 'World' sheet.append(['Row', 'of', 'data']) # Add formula sheet['B2'] = '=SUM(A1:A10)' # Formatting sheet['A1'].font = Font(bold=True, color='FF0000') sheet['A1'].fill = PatternFill('solid', start_color='FFFF00') sheet['A1'].alignment = Alignment(horizontal='center') # Column width sheet.column_dimensions['A'].width = 20 wb.save('output.xlsx') ``` ## Editing existing Excel files ```python from openpyxl import load_workbook # Load existing file wb = load_workbook('existing.xlsx') sheet = wb.active # or wb['SheetName'] for specific sheet # Working with multiple sheets for sheet_name in wb.sheetnames: sheet = wb[sheet_name] print(f"Sheet: {sheet_name}") # Modify cells sheet['A1'] = 'New Value' sheet.insert_rows(2) # Insert row at position 2 sheet.delete_cols(3) # Delete column 3 # Add new sheet new_sheet = wb.create_sheet('NewSheet') new_sheet['A1'] = 'Data' wb.save('modified.xlsx') ``` ## Best Practices ### Library Selection - **pandas**: Best for data analysis, bulk operations, and simple data export - **openpyxl**: Best for complex formatting, formulas, and Excel-specific features ### Working with openpyxl - Cell indices are 1-based (row=1, column=1 refers to cell A1) - Use `data_only=True` to read calculated values: `load_workbook('file.xlsx', data_only=True)` - **Warning**: If opened with `data_only=True` and saved, formulas are replaced with values and permanently lost - For large files: Use `read_only=True` for reading or `write_only=True` for writing ### Working with pandas - Specify data types to avoid inference issues: `pd.read_excel('file.xlsx', dtype={'id': str})` - For large files, read specific columns: `pd.read_excel('file.xlsx', usecols=['A', 'C', 'E'])` - Handle dates properly: `pd.read_excel('file.xlsx', parse_dates=['date_column'])` ## Code Style Guidelines **IMPORTANT**: When generating Python code for Excel operations: - Write minimal, concise Python code without unnecessary comments - Avoid verbose variable names and redundant operations - Avoid unnecessary print statements