# Import & Merge Guide

Guide for importing content from various sources and merging memory slots with intelligent duplicate detection.

## Table of Contents

1. [Content Import System](#content-import-system)
2. [Memory Slot Merging](#memory-slot-merging)
3. [Supported File Formats](#supported-file-formats)
4. [Import Strategies](#import-strategies)
5. [Merge Strategies](#merge-strategies)
6. [Best Practices](#best-practices)
7. [Troubleshooting](#troubleshooting)

## Content Import System

### Overview

The `memcord_import` tool enables importing content from various sources into memory slots, expanding beyond manual text entry to support:

- **Text Files**: Markdown, plain text, documentation
- **PDF Documents**: Research papers, reports, manuals
- **Web Content**: Articles, blog posts, documentation pages
- **Structured Data**: CSV datasets, JSON configurations

### Basic Import Syntax

```bash
memcord_import source="" [options]
```

**Required Parameters:**
- `source`: File path, URL, or data source

**Optional Parameters:**
- `slot_name`: Target memory slot (uses current slot if not specified)
- `description`: Descriptive text for the imported content
- `tags`: Array of tags for categorization
- `group_path`: Hierarchical organization path

### Import Examples

#### Text File Import
```bash
# Import markdown documentation
memcord_import source="./project-docs/README.md" slot_name="project_readme" tags=["docs","readme"] group_path="projects/alpha"

# Import meeting notes
memcord_import source="/notes/meeting_2025_01_15.txt" slot_name="meeting_notes" description="Weekly standup notes" tags=["meeting","standup"]
```

#### PDF Document Import
```bash
# Import research paper
memcord_import source="/research/paper.pdf" slot_name="research_lit" tags=["research","pdf","literature"] description="Key research paper on ML"

# Import technical manual
memcord_import source="./manuals/api_guide.pdf" slot_name="api_docs" tags=["manual","api","reference"] group_path="documentation/api"
```
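
The overview names four source types (text files, PDF documents, web content, structured data), and the import examples use a different kind of `source` for each. The sketch below shows one way a `source` string could be mapped to one of those types, using the protocols and extensions listed under Supported File Formats. It is an illustration only: `classify_source` and the two extension sets are hypothetical names, and memcord's actual handler selection may work differently.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions taken from the "Supported File Formats" section of this guide.
TEXT_EXTENSIONS = {".txt", ".md", ".markdown", ".rst", ".log"}
STRUCTURED_EXTENSIONS = {".json", ".csv", ".tsv"}


def classify_source(source: str) -> str:
    """Return 'web', 'pdf', 'text', or 'structured' for a source string.

    Hypothetical helper for illustration; not part of the memcord API.
    """
    if not source.strip():
        raise ValueError("Source cannot be empty")

    # HTTP/HTTPS URLs are treated as web content.
    if urlparse(source).scheme in ("http", "https"):
        return "web"

    # Everything else is treated as a file path and classified by extension.
    suffix = PurePosixPath(source).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix in TEXT_EXTENSIONS:
        return "text"
    if suffix in STRUCTURED_EXTENSIONS:
        return "structured"
    raise ValueError(f"No suitable import handler found for source: {source}")
```

Under these assumptions, `classify_source("./project-docs/README.md")` yields `"text"` and `classify_source("/research/paper.pdf")` yields `"pdf"`, matching the two example groups above.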

#### Web Content Import
```bash
# Import blog article
memcord_import source="https://example.com/best-practices-guide" slot_name="best_practices" tags=["web","guide"] description="Industry best practices"

# Import documentation page
memcord_import source="https://docs.framework.com/getting-started" slot_name="framework_docs" tags=["docs","web","tutorial"] group_path="learning/frameworks"
```

#### Structured Data Import
```bash
# Import CSV dataset
memcord_import source="/data/sales_q1_2025.csv" slot_name="sales_data" tags=["data","csv","sales"] description="Q1 2025 sales metrics"

# Import JSON configuration
memcord_import source="./config/app_settings.json" slot_name="app_config" tags=["config","json"] group_path="configurations/app"
```

### Import Metadata

Every import automatically includes rich metadata:

```markdown
=== IMPORTED CONTENT ===
Source: /path/to/file.pdf
Type: pdf
Imported: 2025-01-15T10:30:00
Description: Research paper on machine learning
========================

[Original content follows...]
```

## Memory Slot Merging

### Overview

The `memcord_merge` tool consolidates multiple memory slots into a single, organized slot with:

- **Duplicate Detection**: Configurable similarity thresholds
- **Chronological Ordering**: Timeline-based content organization
- **Metadata Consolidation**: Combined tags and groups
- **Preview Mode**: See results before execution

### Basic Merge Syntax

```bash
memcord_merge source_slots=["slot1","slot2"] target_slot="merged_slot" [options]
```

**Required Parameters:**
- `source_slots`: Array of memory slots to merge (minimum 2)
- `target_slot`: Name for the merged result

**Optional Parameters:**
- `action`: `preview` (default) or `merge`
- `similarity_threshold`: 0.0-1.0 (default 0.8)
- `delete_sources`: true/false (default false)

### Merge Workflow

#### 1. Preview Phase
```bash
# Preview merge to see statistics
memcord_merge source_slots=["meeting1","meeting2","meeting3"] target_slot="project_meetings" action="preview"
```

**Preview Output:**
```
=== MERGE PREVIEW: project_meetings ===
Source slots: meeting1, meeting2, meeting3
Total content length: 15,420 characters
Duplicate content to remove: 7 sections
Similarity threshold: 80.0%
Merged tags (8): meeting, project, alpha, weekly, standup, urgent, decisions, action-items
Merged groups (1): meetings/weekly

Chronological order:
  - meeting1: 2025-01-08 09:00:00
  - meeting2: 2025-01-15 09:00:00
  - meeting3: 2025-01-22 09:00:00

⚠️ WARNING: Target slot 'project_meetings' already exists and will be overwritten!

Content preview:
==========================================
=== MERGED MEMORY SLOT ===
Created: 2025-01-22 14:30:00
Source Slots: meeting1, meeting2, meeting3
Total Sources: 3
=========================

--- From meeting1 (2025-01-08 09:00:00) ---
Team Standup - Jan 8, 2025
[Content follows...]
==========================================

To execute the merge, call memcord_merge again with action='merge'
```

#### 2. Execution Phase
```bash
# Execute the merge
memcord_merge source_slots=["meeting1","meeting2","meeting3"] target_slot="project_meetings" action="merge"
```

**Execution Output:**
```
✅ Successfully merged 3 slots into 'project_meetings'
Final content: 14,150 characters
Duplicates removed: 7 sections
Merged at: 2025-01-22 14:30:15
Source slots: meeting1, meeting2, meeting3
Tags merged: meeting, project, alpha, weekly, standup, urgent, decisions, action-items
Groups merged: meetings/weekly
```

### Advanced Merge Options

#### Custom Similarity Threshold
```bash
# More aggressive duplicate detection (70% similarity)
memcord_merge source_slots=["draft1","draft2"] target_slot="final_doc" action="merge" similarity_threshold=0.7

# More conservative duplicate detection (90% similarity)
memcord_merge source_slots=["notes1","notes2"] target_slot="combined_notes" action="merge" similarity_threshold=0.9
```

#### Source Cleanup
```bash
# Merge and delete source slots
memcord_merge source_slots=["temp1","temp2","temp3"] target_slot="consolidated" action="merge" delete_sources=true
```

## Supported File Formats

### Text Files
- **Extensions**: `.txt`, `.md`, `.markdown`, `.rst`, `.log`
- **Encoding**: UTF-8 (automatic detection)
- **Size Limit**: 50MB per file
- **Features**: Preserves formatting, handles large files

### PDF Documents
- **Processing**: Page-by-page text extraction
- **Library**: `pdfplumber` for robust extraction
- **Features**: Page number headers, maintains structure
- **Limitations**: Text-based PDFs only (no OCR)

### Web Content
- **Protocols**: HTTP/HTTPS
- **Processing**: Clean article extraction with `trafilatura`
- **Features**: Removes ads/navigation, preserves main content
- **Metadata**: Page title, content type, extraction method

### Structured Data
- **JSON**: Configuration files, API responses, data exports
- **CSV/TSV**: Datasets, reports, tabular data
- **Processing**: `pandas` for robust data handling
- **Features**: Schema detection, row/column statistics

## Import Strategies

### 1. Hierarchical Organization
```bash
# Organize by project and type
memcord_import source="./docs/api.md" slot_name="api_docs" group_path="projects/alpha/documentation"
memcord_import source="./specs/requirements.pdf" slot_name="requirements" group_path="projects/alpha/specifications"
```

### 2. Thematic Tagging
```bash
# Tag by content themes
memcord_import source="article1.pdf" slot_name="research1" tags=["ai","neural-networks","deep-learning"]
memcord_import source="article2.pdf" slot_name="research2" tags=["ai","computer-vision","cnn"]
```

### 3. Batch Import Workflows
```bash
# Import multiple related files
for file in docs/*.md; do
    memcord_import source="$file" slot_name="doc_$(basename $file .md)" tags=["docs","batch"] group_path="documentation/guides"
done
```

### 4. Source Type Specialization
```bash
# Web content with source attribution
memcord_import source="https://tech-blog.com/article" slot_name="tech_trends" tags=["web","trends","external"] description="External tech trends analysis"

# Internal documentation
memcord_import source="./internal/process.md" slot_name="internal_process" tags=["internal","process","confidential"] description="Internal process documentation"
```

## Merge Strategies

### 1. Chronological Consolidation
```bash
# Merge time-series content (meetings, logs, reports)
memcord_merge source_slots=["jan_meetings","feb_meetings","mar_meetings"] target_slot="q1_meetings" action="merge"
```

### 2. Thematic Consolidation
```bash
# Merge by topic or theme
memcord_merge source_slots=["api_docs1","api_docs2","api_reference"] target_slot="complete_api_docs" action="merge"
```

### 3. Progressive Consolidation
```bash
# Multi-stage merging for large datasets
# Stage 1: Merge weekly reports
memcord_merge source_slots=["week1","week2","week3","week4"] target_slot="month1" action="merge"
memcord_merge source_slots=["week5","week6","week7","week8"] target_slot="month2" action="merge"

# Stage 2: Merge monthly summaries
memcord_merge source_slots=["month1","month2","month3"] target_slot="q1_summary" action="merge"
```

### 4. Cleanup and Archival
```bash
# Merge temporary slots and cleanup
memcord_merge source_slots=["temp_notes1","temp_notes2","temp_drafts"] target_slot="archived_content" action="merge" delete_sources=true
```

## Best Practices

### Import Best Practices

1. **Use Descriptive Slot Names**
   ```bash
   # Good
   memcord_import source="report.pdf" slot_name="q1_sales_report_2025"

   # Avoid
   memcord_import source="report.pdf" slot_name="report1"
   ```

2. **Apply Consistent Tagging**
   ```bash
   # Consistent taxonomy
   memcord_import source="doc.pdf" tags=["finance","quarterly","report","2025"]
   ```

3. **Organize with Group Paths**
   ```bash
   # Hierarchical organization
   memcord_import source="spec.md" group_path="projects/alpha/specifications"
   ```

4. **Add Context with Descriptions**
   ```bash
   # Descriptive context
   memcord_import source="data.csv" description="Customer survey responses Q1 2025 - 1,500 respondents"
   ```

### Merge Best Practices

1. **Always Preview First**
   ```bash
   # Preview before executing
   memcord_merge source_slots=["a","b"] target_slot="merged" action="preview"
   # Review output, then:
   memcord_merge source_slots=["a","b"] target_slot="merged" action="merge"
   ```

2. **Adjust Similarity Thresholds**
   ```bash
   # For technical docs (conservative)
   memcord_merge ... similarity_threshold=0.9

   # For meeting notes (aggressive)
   memcord_merge ... similarity_threshold=0.7
   ```

3. **Use Cleanup Strategically**
   ```bash
   # Only delete sources when confident
   memcord_merge ... delete_sources=true action="merge"
   ```

4. **Meaningful Target Names**
   ```bash
   # Descriptive merge targets
   memcord_merge ... target_slot="project_alpha_complete_documentation"
   ```

### Organization Best Practices

1. **Consistent Naming Conventions**
   - Use descriptive, date-stamped names
   - Follow project/team naming standards
   - Include version numbers for iterations

2. **Strategic Group Hierarchies**
   ```
   projects/
   ├── alpha/
   │   ├── documentation/
   │   ├── meetings/
   │   └── specifications/
   └── beta/
       ├── research/
       └── development/
   ```

3. **Tag Taxonomies**
   ```bash
   # Category tags: [type, priority, status, domain]
   tags=["meeting","high","active","frontend"]
   ```

## Troubleshooting

### Import Issues

#### File Not Found
```
Error: Source cannot be empty
Error: File not found: /path/to/file.pdf
```
**Solution:** Verify file path and permissions

#### Unsupported Format
```
Error: No suitable import handler found for source
```
**Solution:** Check supported formats, convert if necessary

#### Web Content Extraction Failed
```
Import failed: No content could be extracted from URL
```
**Solutions:**
- Check URL accessibility
- Verify content is text-based
- Try different URLs if paywall/login required

#### Large File Handling
```
Import failed: File too large
```
**Solutions:**
- Split large files into smaller sections
- Use compression if applicable
- Consider cloud storage with direct links

### Merge Issues

#### Insufficient Source Slots
```
Error: At least 2 source slots are required for merging
```
**Solution:** Provide minimum 2 valid slot names

#### Missing Source Slots
```
Error: Memory slots not found: slot1, slot3
```
**Solution:** Verify slot names with `memcord_list`

#### Target Slot Conflicts
```
⚠️ WARNING: Target slot 'merged' already exists and will be overwritten!
```
**Solution:**
- Use different target name, or
- Proceed if overwrite is intentional

#### Memory/Performance Issues
```
Merge operation failed: Memory allocation error
```
**Solutions:**
- Reduce content size
- Use higher similarity threshold
- Merge in smaller batches

### Performance Optimization

#### Large Content Handling
```bash
# Use higher similarity thresholds for faster processing
memcord_merge ... similarity_threshold=0.9

# Process in smaller batches
memcord_merge source_slots=["batch1","batch2"] target_slot="intermediate1"
memcord_merge source_slots=["batch3","batch4"] target_slot="intermediate2"
memcord_merge source_slots=["intermediate1","intermediate2"] target_slot="final"
```

#### Web Import Optimization
```bash
# Batch web imports to avoid rate limiting
for url in $urls; do
    memcord_import source="$url" ...
    sleep 2  # Rate limiting
done
```

#### Resource Management
```bash
# Cleanup after major operations
memcord_merge ... delete_sources=true  # Remove temporary slots
```

This guide covers all aspects of using the import and merge features effectively. For additional help, refer to the [Tools Reference](tools-reference.md) for detailed parameter specifications and the [Examples](examples.md) for practical workflows.
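
### Appendix: How a Similarity Threshold Behaves

This guide recommends a `similarity_threshold` of 0.7 for aggressive duplicate removal and 0.9 for conservative removal. The sketch below illustrates why the value matters. It is not memcord's implementation: the guide does not state which similarity measure the tool uses, so this example substitutes `difflib.SequenceMatcher` from the Python standard library, and `dedupe_sections` is a hypothetical helper name.

```python
from difflib import SequenceMatcher


def dedupe_sections(sections, similarity_threshold=0.8):
    """Keep the first occurrence of each section and drop later near-duplicates.

    Illustrative stand-in only: the similarity score here is SequenceMatcher's
    ratio, which may differ from the measure memcord_merge actually applies.
    """
    kept = []
    removed = 0
    for section in sections:
        # A section counts as a duplicate if it is similar enough to any
        # section that has already been kept.
        is_duplicate = any(
            SequenceMatcher(None, section, earlier).ratio() >= similarity_threshold
            for earlier in kept
        )
        if is_duplicate:
            removed += 1
        else:
            kept.append(section)
    return kept, removed


if __name__ == "__main__":
    # "abcd" and "abce" share 3 of 4 characters, giving a ratio of 0.75.
    pair = ["abcd", "abce"]
    print(dedupe_sections(pair, similarity_threshold=0.7))  # second item dropped
    print(dedupe_sections(pair, similarity_threshold=0.9))  # both items kept
```

With this measure, a pair scoring 0.75 is removed at a threshold of 0.7 but kept at 0.9. A lower threshold removes more near-duplicates, at the risk of dropping content that only looks similar.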