{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "view-in-github"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/neon-aiart/chirp-whisper-link/blob/main/chirp-whisper-link%20v5.0.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Cen6EE6H_dGo"
      },
      "source": [
        "<details>\n",
        "<summary><b>🗾 使い方</b></summary>\n",
        "\n",
        "1. ⚙️ **ランタイムのタイプを変更**  \n",
        "  上部メニューの「ランタイム」→「ランタイムのタイプを変更」からハードウェアを選択します  \n",
        "   * **T4 GPU**: < おすすめ > 高速に処理できます  \n",
        "   * **CPU**: 時間がかかってもいい場合や、GPU枠を節約したい時  \n",
        "\n",
        "2. 🔌 **ランタイムに接続**  \n",
        "  右上の「接続」をクリックして準備します  \n",
        "\n",
        "3. 📑 **モードの選択**  \n",
        "  **mode** を設定します: パソコン内のファイルなら `Upload` 、GoogleDriveなら `GoogleDrive`を選択  \n",
        "\n",
        "4. 📂 **フォルダー名を入力 (GoogleDriveモード)**  \n",
        "  **drive_folder** にフォルダー名を入力します (初期値: `/Whisper/`)  \n",
        "\n",
        "5. ▶️ **再生ボタンを押して実行**  \n",
        "  セルの左側にある再生ボタンをクリックして実行します  \n",
        "\n",
        "6. 📤 **ファイルのアップロード (Uploadモードの場合)**  \n",
        "  途中で「ファイル選択」ボタンが表示されるので、ファイルを選択してください  \n",
        "\n",
        "7. 🔄 **続けて実行する場合**  \n",
        "  手順3に戻り、設定を変更して再度再生ボタンを押します  \n",
        "\n",
        "8. ⚠️ **終わったら接続解除 (ゼッタイ！)**  \n",
        "  **「ランタイムを接続解除して削除」** を必ず行ってください  \n",
        "\n",
        "</details>\n",
        "\n",
        "<details>\n",
        "<summary><b>🔑 Gemini API キーの設定</b></summary>\n",
        "\n",
        "Gemini 3 Flash による「下読み」機能を有効にするために設定が必要です  \n",
        "\n",
        "1. **APIキーを取得**: [Google AI Studio](https://aistudio.google.com/app/apikey) でキーを作成します  \n",
        "2. **Colabに登録**: 画面左側の **鍵アイコン（シークレット）** をクリック  \n",
        "3. **追加**: 名前を `GEMINI_API_KEY` とし、値を貼り付けます  \n",
        "4. **許可**: 「ノートブックからのアクセス」のチェックを **ON** にしてください  \n",
        "\n",
        "> [!TIP] APIキーがなくても動作します  \n",
        "> キーが設定されていない場合、Geminiによる抽出プロセスのみがスキップされ、通常のWhisper文字起こしとして動作します  \n",
        "\n",
        "### ⚠️ 無料枠での利用に関する注意 (Free Tier)  \n",
        "\n",
        "* **データの取り扱い**: 無料枠（Free Tier）でファイルをアップロードして解析する場合、**入力データが Google のモデル改善（学習）に利用される可能性**があります  \n",
        "* **機密情報の扱い**: 機密性の高い音声ファイルを扱う場合は、有料枠（Pay-as-you-go）への切り替え、またはAPIキーを設定せずに実行することを検討してください  \n",
        "\n",
        "</details>\n",
        "\n",
        "<details>\n",
        "<summary><b>🛠️ 各モードの詳細</b></summary>\n",
        "\n",
        "### 📥 Upload Mode  \n",
        "\n",
        "* **手軽な実行**: 実行中に表示されるボタンからファイルを選択するだけ  \n",
        "* **再利用機能**: `execute_file_exists` にチェックを入れると、最後にアップロードしたファイルを再利用できます（パラメータを調整して試したい時に便利！）  \n",
        "* **自動ダウンロード**: 完了後、結果ファイル（`.srt` / `.log`）がブラウザから自動でダウンロードされます  \n",
        "\n",
        "### ☁️ GoogleDrive Mode  \n",
        "\n",
        "* **事前準備**: 実行前に、処理したいファイルを Drive 内の指定フォルダ（初期値: `/Whisper/`）に入れておいてください  \n",
        "* **自動保存**: 生成されたファイルは、音源と同じ Drive フォルダ内に直接保存されます  \n",
        "* **一括処理**: フォルダ内の未実行ファイルのみを賢く選別して、まとめて文字起こしします  \n",
        "\n",
        "### 📄 出力ファイル  \n",
        "\n",
        "* **字幕ファイル (`.srt`)**: 動画編集や再生プレイヤーでそのまま使える標準形式（常に生成）  \n",
        "* **議事録ログ (`.log`)**: タイムスタンプが記録された、内容確認に最適なテキスト（オプション）  \n",
        "* **プレーンテキスト (`.txt`)**: タイムスタンプなしの純粋な本文テキスト（隠しオプション）  \n",
        "\n",
        "</details>\n",
        "\n",
        "<details>\n",
        "<summary><b>⚙️ 設定の詳細</b></summary>\n",
        "\n",
        "### 💫 モデルとプロンプト  \n",
        "\n",
        "* **`model_type`**  \n",
        "  * **`auto`**: ブラウザ言語を判定し、日本語なら `Kotoba-Whisper`、英語なら `turbo` を自動選択  \n",
        "  * **`turbo`**: 早くしてほしい時に  \n",
        "  * **`large-v3`**: ガンバってほしい時に  \n",
        "  * **`Kotoba-Whisper`**: `turbo`をベースにした高速・軽量な日本語特化モデル  \n",
        "  * **`Distil-Whisper`**: `large-v3` をベースにした推論速度向上版  \n",
        "\n",
        "* **`initial_prompt`**  \n",
        "  特定の固有名詞や専門用語の認識、句読点、漢字の変換ミスを防ぐために事前に伝えるヒント  \n",
        "  * **空欄の場合**: **Gemini 3 Flash** が音声を下読みし、最適なプロンプトを自動生成（APIキーが必要）  \n",
        "\n",
        "### 🔄 動作モード  \n",
        "\n",
        "* **`mode`**:  \n",
        "  * `Upload`: パソコン内のファイルを読み込む（手軽な単発処理）  \n",
        "  * `GoogleDrive`: 指定フォルダからファイルを読み込む（大量・一括処理）  \n",
        "  * `YouTube`: (棚上げ)\n",
        "* **`drive_folder`**:  \n",
        "  * Google Drive内の対象フォルダ名（初期値: `Whisper`）  \n",
        "* **`execute_file_exists`** (Uploadモード専用)  \n",
        "  * **ON**: アップロード済みの最新ファイルを再利用します  \n",
        "  * **OFF**: 常に新しいファイルをアップロードします  \n",
        "\n",
        "### 📄 出力オプション  \n",
        "\n",
        "* **`records_text_download`**: タイムスタンプ付きの議事録（.log）を保存します  \n",
        "* **`drive_batch_mode`** (GoogleDriveモード専用):  \n",
        "  * `未実行のみ一括処理`: まだ `.srt` が生成されていないファイルだけを探して実行します  \n",
        "  * `最新の１件のみ`: フォルダ内の最新ファイル１つだけを処理します  \n",
        "* **`plain_text_download`** (隠しオプション): タイムスタンプなしの純粋なテキスト本文（`.txt`）を保存します  \n",
        "\n",
        "### 🚀 効率化機能：既存ファイルの再利用  \n",
        "\n",
        "アップロード・ダウンロード済みの最新ファイルを再利用することで、パラメータ調整時の待ち時間を大幅に短縮できます  \n",
        "\n",
        "#### `mode`を`Upload`にする  \n",
        "  * **通常**: $\\text{File Upload (60s)} + \\text{Whisper (120s)} = 180\\text{s}$  \n",
        "  * **再利用モード**: $\\text{Whisper (120s)}$ only = **120s (33% OFF!)**  \n",
        "\n",
        "</details>\n",
        "\n",
        "<br>\n",
        "\n",
        "<details>\n",
        "<summary><b>🌎 How to Use</b></summary>\n",
        "\n",
        "1. ⚙️ **Change Runtime Type**  \n",
        "   Go to \"Runtime\" -> \"Change runtime type\" in the top menu and select your hardware.  \n",
        "   * **T4 GPU**: < Recommended > For high-speed processing.  \n",
        "   * **CPU**: If you don't mind it taking longer or want to save your GPU quota.  \n",
        "2. 🔌 **Connect to Runtime**  \n",
        "   Click \"Connect\" in the top right corner to prepare the environment.  \n",
        "3. 📑 **Select Mode**  \n",
        "   Set the **mode**: Select `Upload` for local files or `GoogleDrive` for Google Drive.  \n",
        "4. 📂 **Enter Folder Name (for GoogleDrive Mode)**  \n",
        "   Enter your target folder name in the **drive_folder** field. (default: `/Whisper/`)  \n",
        "5. ▶️ **Click the Play Button**  \n",
        "   Click the play button on the left side of the cell to start the process.  \n",
        "6. 📤 **Upload File (for Upload Mode)**  \n",
        "   When the \"Choose Files\" button appears during execution, select your audio file.  \n",
        "7. 🔄 **To Continue**  \n",
        "   Go back to step 3, adjust settings, and click the play button again.  \n",
        "8. ⚠️ **Disconnect (Crucial!)**  \n",
        "   Always select **\"Disconnect and delete runtime\"** from the menu when finished.  \n",
        "\n",
        "</details>\n",
        "\n",
        "<details>\n",
        "<summary><b>🗝️ Gemini API Key Setup</b></summary>\n",
        "\n",
        "Setup is required to enable the \"Pre-reading\" feature using Gemini 3 Flash.  \n",
        "\n",
        "1. **Get API Key**: Create your key at [Google AI Studio](https://aistudio.google.com/app/apikey).  \n",
        "2. **Register in Colab**: Click the **Key icon (Secrets)** on the left sidebar.  \n",
        "3. **Add Secret**: Set the Name to `GEMINI_API_KEY` and paste your key into the Value.  \n",
        "4. **Grant Access**: Toggle the \"Notebook access\" switch to **ON**.  \n",
        "\n",
        "> [!TIP] Works without an API Key  \n",
        "> If no key is set, the Gemini extraction process is skipped, and the tool functions as a standard Whisper transcription.  \n",
        "\n",
        "### ⚠️ Precautions (Free Tier)  \n",
        "\n",
        "* **Data Privacy**: When using the Free Tier, **your input data may be used by Google to improve their models (training)**.  \n",
        "* **Sensitive Information**: For highly confidential audio, consider switching to the Pay-as-you-go tier or running the tool without an API key.  \n",
        "\n",
        "</details>\n",
        "\n",
        "<details>\n",
        "<summary><b>🔧 Mode Details</b></summary>\n",
        "\n",
        "### 📥 Upload Mode  \n",
        "\n",
        "* **Easy Execution**: Simply select your file using the button that appears during execution.  \n",
        "* **Reuse Feature**: Checking `execute_file_exists` allows you to reuse the last uploaded file (useful for fine-tuning parameters!).  \n",
        "* **Auto-Download**: Result files (`.srt` / `.log`) are automatically downloaded to your browser upon completion.  \n",
        "\n",
        "### ☁️ GoogleDrive Mode  \n",
        "\n",
        "* **Preparation**: Before running, place your audio files in the designated Drive folder (default: `/Whisper/`).  \n",
        "* **Auto-Save**: Generated files are saved directly in the same Drive folder as the source audio.  \n",
        "* **Batch Processing**: Smartly identifies and processes only the files that haven't been transcribed yet.  \n",
        "\n",
        "### 📄 Outputs  \n",
        "\n",
        "* **Subtitle File (`.srt`)**: Standard format for video editing and players (always generated).  \n",
        "* **Transcription Log (`.log`)**: Text with timestamps, ideal for reviewing content (optional).  \n",
        "* **Plain Text (`.txt`)**: Pure transcript without timestamps (hidden option).  \n",
        "\n",
        "</details>\n",
        "\n",
        "<details>\n",
        "<summary><b>⚒️ Parameter Details</b></summary>\n",
        "\n",
        "### 💫 Model & Prompt  \n",
        "\n",
        "* **`model_type`**  \n",
        "  * **`auto`**: Detects browser language. Selects `Kotoba-Whisper` for Japanese and `turbo` for English.  \n",
        "  * **`turbo`**: Use when you want it fast.  \n",
        "  * **`large-v3`**: Use when you want the best possible accuracy.  \n",
        "  * **`Kotoba-Whisper`**: High-speed, lightweight model optimized for Japanese.  \n",
        "  * **`Distil-Whisper`**: A distilled version of `large-v3` with faster inference speed.  \n",
        "* **`initial_prompt`**  \n",
        "  A prompt provided in advance to improve recognition of proper nouns, technical terms, and punctuation.  \n",
        "  * **If empty**: **Gemini 3 Flash** analyzes the audio and automatically generates the optimal prompt (Requires API key).  \n",
        "\n",
        "### 🔄 Mode  \n",
        "\n",
        "* **`mode`**:  \n",
        "  * `Upload`: Process files from your computer (Single task).  \n",
        "  * `GoogleDrive`: Process files from a specific folder (Bulk/Batch task).  \n",
        "  * `YouTube`: (shelved)  \n",
        "* **`drive_folder`**: The target folder name in Google Drive (Default: `Whisper`).  \n",
        "* **`execute_file_exists`** (Upload mode only):  \n",
        "  * **ON**: Reuses the most recently uploaded file.  \n",
        "  * **OFF**: Always prompts for a new file upload.  \n",
        "\n",
        "### 📄 Outputs Options  \n",
        "\n",
        "* **`records_text_download`**: Saves a transcription log with timestamps (`.log`).  \n",
        "* **`drive_batch_mode`** (GoogleDrive mode only):  \n",
        "  * `Unprocessed Only`: Searches for and processes only files without `.srt`.  \n",
        "  * `Latest Only`: Processes only the single newest file in the folder.  \n",
        "* **`plain_text_download`** (Internal option): Saves the raw text body without timestamps (`.txt`).  \n",
        "\n",
        "### 🚀 Optimization: Reusing Existing Files  \n",
        "\n",
        "Reuse the most recently uploaded or downloaded file to significantly reduce wait times during parameter tuning.  \n",
        "\n",
        "#### switching `mode` to `Upload`  \n",
        "  * **Standard**: $\\text{File Upload (60s)} + \\text{Whisper (120s)} = 180\\text{s}$  \n",
        "  * **Reuse**: $\\text{Whisper (120s)}$ only = **120s (33% OFF!)**  \n",
        "\n",
        "</details>\n",
        "\n",
        "<br>"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "cellView": "form",
        "id": "n1gBBE_c691M"
      },
      "outputs": [],
      "source": [
        "# @title 🐦 Chirp Whisper Link v5.0\n",
        "APP_VERSION = '5.0'\n",
        "LICENSE = 'PolyForm Noncommercial 1.0.0'\n",
        "LINK = 'github.com/neon-aiart/chirp-whisper-link'\n",
        "\n",
        "# @markdown ---\n",
        "model_type = \"auto\"            # @param [\"auto\", \"tiny\", \"base\", \"small\", \"medium\", \"large-v3\", \"turbo\", \"systran/faster-distil-whisper-large-v3\", \"kotoba-tech/kotoba-whisper-v2.0-faster\"]\n",
        "initial_prompt = \"\"            # @param {type:\"string\"}\n",
        "mode = \"GoogleDrive\"           # @param [\"Upload\", \"GoogleDrive\", \"YouTube(shelved)\"]\n",
        "drive_folder = \"Whisper\"       # @param {type:\"string\"}\n",
        "youtube_url = \"Youtube URL\"    # shelved @param {type:\"string\"}\n",
        "# @markdown ---\n",
        "# @markdown #### 🚀 GoogleDriveモード専用設定 (Advanced)\n",
        "drive_batch_mode = \"未実行のみ一括処理 (Unprocessed Only)\" # @param [\"未実行のみ一括処理 (Unprocessed Only)\", \"最新の１件のみ (Latest Only)\"]\n",
        "# @markdown ---\n",
        "# @markdown #### 🛠️ オプション (Options)\n",
        "language = \"ja\"\n",
        "condition_on_previous_text = True\n",
        "execute_file_exists = False    # @param {type:\"boolean\"}\n",
        "records_text_download = False  # @param {type:\"boolean\"}\n",
        "plain_text_download = False\n",
        "# @markdown ---\n",
        "\n",
        "import os, sys, time, torch, gc, subprocess, warnings, re, pathlib, logging\n",
        "from datetime import datetime\n",
        "from IPython.display import clear_output, Audio, display\n",
        "import numpy as np\n",
        "\n",
        "# 1. 環境判定とライブラリのインポート\n",
        "try:\n",
        "    from google.colab import drive, files, output, _shell, userdata\n",
        "    IS_COLAB = True\n",
        "except ImportError:\n",
        "    import locale\n",
        "    IS_COLAB = False\n",
        "\n",
        "# パスの設定\n",
        "if IS_COLAB:\n",
        "    ROOT_PATH = \"/content/\"\n",
        "else:\n",
        "    ROOT_PATH = \"./\"\n",
        "\n",
        "print(f\"🐦 Chirp Whisper Link v{APP_VERSION}\")\n",
        "print(f\"ねおん (neon-aiart) © 2026 | {LICENSE}\")\n",
        "print(f\"Official: {LINK}\")\n",
        "print(\"━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\")\n",
        "\n",
        "# --- ffmpeg 存在チェック関数の定義 ---\n",
        "def check_ffmpeg_installed():\n",
        "    \"\"\"ローカル環境においてffmpegがインストールされているか確認する\"\"\"\n",
        "    if IS_COLAB:\n",
        "        return True  # Google Colabは標準搭載のためチェック不要\n",
        "\n",
        "    try:\n",
        "        # ffmpeg -version を実行して存在を確認\n",
        "        subprocess.run([\"ffmpeg\", \"-version\"], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)\n",
        "        return True\n",
        "    except FileNotFoundError:\n",
        "        return False\n",
        "\n",
        "# --- 実行チェック ---\n",
        "HAS_FFMPEG = check_ffmpeg_installed()\n",
        "if not HAS_FFMPEG:\n",
        "    print(\"━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\")\n",
        "    print(\"❌ エラー: ffmpeg がシステムに見つかりません。\")\n",
        "    print(\"ローカル環境で実行するには、FFmpeg のインストールとパスの設定（環境変数）が必要です。\")\n",
        "    print(\"インストール後、ターミナル/コマンドプロンプトを再起動してから再度実行してください。\")\n",
        "    print(\"━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\")\n",
        "\n",
        "# 2. ライブラリ準備\n",
        "packages = [\"faster-whisper\", \"transformers\", \"google-genai\"]\n",
        "\n",
        "def install_packages():\n",
        "    print(\"📦 ライブラリをインストール中...\")\n",
        "    # IS_COLAB かどうかで pip の叩き方を変える（!pip はノートブック専用のため）\n",
        "    if IS_COLAB:\n",
        "        _shell.Shell().run_line_magic('pip', f'install -U {\" \".join(packages)} -q --no-cache-dir')\n",
        "    else:\n",
        "        # ローカル環境では OS のコマンドとして実行\n",
        "        subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"-U\"] + packages)\n",
        "\n",
        "try:\n",
        "    from faster_whisper import WhisperModel\n",
        "    from google import genai\n",
        "except ImportError:\n",
        "    if HAS_FFMPEG:\n",
        "        install_packages()\n",
        "        from faster_whisper import WhisperModel\n",
        "        from google import genai\n",
        "from huggingface_hub.utils import disable_progress_bars\n",
        "disable_progress_bars()\n",
        "\n",
        "# Hugging Faceのトークン警告を非表示にする\n",
        "warnings.filterwarnings(\"ignore\", category=UserWarning, module=\"huggingface_hub\")\n",
        "# Hugging Faceのログレベルを「ERROR」以上に設定（Warningを無視する）\n",
        "logging.getLogger(\"huggingface_hub\").setLevel(logging.ERROR)\n",
        "# 環境変数で「トークンなしでOK」と明示的に伝える\n",
        "os.environ[\"HF_HUB_DISABLE_SYMLINKS_WARNING\"] = \"1\"\n",
        "\n",
        "# 3. タイムゾーン・言語設定\n",
        "if IS_COLAB:\n",
        "    try:\n",
        "        browser_languages = output.eval_js('navigator.languages')\n",
        "        # 一番上の優先言語を取得\n",
        "        primary_lang = browser_languages[0].lower() if browser_languages else \"en\"\n",
        "    except:\n",
        "        primary_lang = \"en\"\n",
        "else:\n",
        "    # ローカルならOSの言語設定を取得\n",
        "    loc = locale.getdefaultlocale()[0] # 'ja_JP' などが返る\n",
        "    primary_lang = 'ja' if loc and loc.startswith('ja') else 'en'\n",
        "\n",
        "# 第一言語が日本語（ja）で始まる場合のみ日本設定にする\n",
        "if primary_lang.startswith('ja'):\n",
        "    language = \"ja\"\n",
        "    os.environ['TZ'] = 'Asia/Tokyo'\n",
        "else:\n",
        "    language = \"en\"\n",
        "    os.environ['TZ'] = 'UTC'\n",
        "\n",
        "# Windows対策\n",
        "if os.name != 'nt':\n",
        "    try:\n",
        "        time.tzset() # タイムゾーンの設定を反映\n",
        "    except:\n",
        "        pass\n",
        "\n",
        "# 長いファイル名エラー対策\n",
        "def get_safe_filename(original_name, ext=\".srt\"):\n",
        "    # 拡張子を除いたベース名を取得\n",
        "    base = os.path.splitext(os.path.basename(original_name))[0]\n",
        "    # 特殊文字を置換 & 前方の50文字にカット（OSのパス制限対策）\n",
        "    safe_base = re.sub(r'[\\\\/:*?\"<>|]', '', base)[:50].strip()\n",
        "    timestamp = datetime.now().strftime(\"%Y%m%d_%H%M%S\")\n",
        "    return f\"{safe_base}_{timestamp}{ext}\"\n",
        "\n",
        "# 4. モデルのロード (Faster版)\n",
        "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
        "\n",
        "if model_type == \"auto\":\n",
        "    selected_model = \"kotoba-tech/kotoba-whisper-v2.0-faster\" if language == \"ja\" else \"turbo\"\n",
        "else:\n",
        "    selected_model = model_type\n",
        "\n",
        "if 'model' not in globals() or globals().get('current_model_type') != selected_model:\n",
        "    if HAS_FFMPEG:\n",
        "        print(f\"📦 Whisperモデル({selected_model})を読み込み中...\")\n",
        "        # Faster-Whisper特有の呼び出し方\n",
        "        model = WhisperModel(selected_model, device=device, compute_type=\"float16\" if device == \"cuda\" else \"int8\", download_root=\"./models\")\n",
        "        current_model_type = selected_model\n",
        "        clear_output()\n",
        "        print(f\"✅ モデル({selected_model})のロードが完了しました\")\n",
        "else:\n",
        "    print(f\"⚡ モデル({selected_model})はロード済みです\")\n",
        "\n",
        "# メモリ掃除\n",
        "gc.collect()\n",
        "if torch.cuda.is_available():\n",
        "    torch.cuda.empty_cache()\n",
        "\n",
        "# 5. 入力ソースの選択\n",
        "target_files = [] # 処理対象フルパスのリスト\n",
        "valid_ext = ('.mp3', '.mp4', '.wav', '.m4a', '.webm', '.ogg', '.flac')\n",
        "\n",
        "if mode == \"GoogleDrive\":\n",
        "    if IS_COLAB:\n",
        "        # 1. ドライブをマウント\n",
        "        if not os.path.exists('/content/drive'):\n",
        "            print(\"Google Driveをマウントしています...\")\n",
        "            drive.mount('/content/drive')\n",
        "\n",
        "        # 2. 検索パスの設定\n",
        "        target_path = f\"/content/drive/MyDrive/{drive_folder}\"\n",
        "    else:\n",
        "        # ローカルなら、PC内の特定のパスをDrive代わりにする\n",
        "        target_path = os.path.join(ROOT_PATH, drive_folder)\n",
        "\n",
        "    if not os.path.exists(target_path): os.makedirs(target_path)\n",
        "\n",
        "    all_drive_files = sorted([os.path.join(target_path, f) for f in os.listdir(target_path) if f.lower().endswith(valid_ext)], key=os.path.getmtime, reverse=True)\n",
        "\n",
        "    if drive_batch_mode == \"未実行のみ一括処理 (Unprocessed Only)\":\n",
        "        existing_srts = [f for f in os.listdir(target_path) if f.endswith('.srt')]\n",
        "        for f_path in all_drive_files:\n",
        "            f_base = os.path.splitext(os.path.basename(f_path))[0][:20] # 前方一致判定用\n",
        "            if not any(f_base in s for s in existing_srts):\n",
        "                target_files.append(f_path)\n",
        "        print(f\"📂 未実行ファイル: {len(target_files)}件を検出しました\")\n",
        "    else:\n",
        "        if all_drive_files: target_files = [all_drive_files[0]]\n",
        "\n",
        "elif mode == \"YouTube\":\n",
        "    try:\n",
        "        import yt_dlp\n",
        "    except ImportError:\n",
        "        print(\"📦 YouTube処理用ライブラリをインストール中...\")\n",
        "        if IS_COLAB:\n",
        "            _shell.Shell().run_line_magic('pip', f'install -U yt-dlp -q --no-cache-dir')\n",
        "        else:\n",
        "            subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"-U\", \"yt-dlp\"])\n",
        "        import yt_dlp\n",
        "\n",
        "    if youtube_url == \"Youtube URL\" or not youtube_url.startswith(\"http\"):\n",
        "        print(\"⚠️ 有効なYouTube URLを入力してください\")\n",
        "    else:\n",
        "        ydl_opts = {\n",
        "            'format': 'bestaudio/best',\n",
        "            'outtmpl': 'youtube_audio.%(ext)s',\n",
        "            'noplaylist': True,\n",
        "            'quiet': True,\n",
        "            'no_warnings': True,\n",
        "        }\n",
        "\n",
        "        max_retries = 3  # 最大3回試行\n",
        "        for attempt in range(max_retries):\n",
        "            try:\n",
        "                print(f\"📺 YouTubeから音声を抽出中... (試行 {attempt + 1}/{max_retries})\")\n",
        "                with yt_dlp.YoutubeDL(ydl_opts) as ydl:\n",
        "                    info = ydl.extract_info(youtube_url, download=True)\n",
        "                    target_files = [ydl.prepare_filename(info)]\n",
        "\n",
        "                    # .webm などに変わっている可能性を考慮\n",
        "                    if not os.path.exists(target_files[0]):\n",
        "                        target_files = [info.get('requested_downloads', [{}])[0].get('filepath', target_files[0])]\n",
        "                break  # 成功したらループを抜ける\n",
        "            except Exception as e:\n",
        "                if attempt < max_retries - 1:\n",
        "                    print(f\"⚠️ 失敗しました。5秒後に再試行します... ({e})\")\n",
        "                    time.sleep(5)  # 少し間を置くのがコツ\n",
        "                else:\n",
        "                    print(f\"❌ {max_retries}回試行しましたが失敗しました\")\n",
        "                    target_files = [] # Ensure file_name is None if download fails\n",
        "\n",
        "else: # Upload / Local\n",
        "    # フォルダ内の対象ファイルをリストアップ（更新日時が新しい順）\n",
        "    local_files = sorted(\n",
        "        [f for f in os.listdir('.') if f.endswith(valid_ext)],\n",
        "        key=os.path.getmtime, reverse=True\n",
        "    )\n",
        "\n",
        "    if local_files and execute_file_exists:\n",
        "        # ファイルが存在し、かつ execute_file_exists が True の場合のみ既存ファイルを使用\n",
        "        target_files = [local_files[0]]\n",
        "    else:\n",
        "        if IS_COLAB:\n",
        "            print(\"📂 文字起こしするファイルをアップロードしてください\")\n",
        "            uploaded = files.upload()\n",
        "            if uploaded:\n",
        "                target_files = [list(uploaded.keys())[0]]\n",
        "        else:\n",
        "            if local_files: target_files = [local_files[0]]\n",
        "\n",
        "if not HAS_FFMPEG:\n",
        "    target_files = []\n",
        "\n",
        "# --- Gemini 解析セクション ---\n",
        "def get_ai_initial_prompt(file_path):\n",
        "    try:\n",
        "        if IS_COLAB:\n",
        "            api_key = userdata.get('GEMINI_API_KEY')\n",
        "        else:\n",
        "            api_key = os.getenv('GEMINI_API_KEY')\n",
        "        if not api_key: return \"\"\n",
        "\n",
        "        # 新SDKの初期化\n",
        "        client = genai.Client(api_key=api_key)\n",
        "        print(f\"💫 Gemini 3 Flash が音声を分析中...\")\n",
        "\n",
        "        # 1. アップロード\n",
        "        p = pathlib.Path(file_path)\n",
        "        mime_map = {\".mp3\": \"audio/mpeg\", \".wav\": \"audio/wav\", \".m4a\": \"audio/mp4\", \".ogg\": \"audio/ogg\", \".flac\": \"audio/flac\"}\n",
        "        # 拡張子からMIME取得（なければデフォルトで mpeg）\n",
        "        current_mime = mime_map.get(p.suffix.lower(), \"audio/mpeg\")\n",
        "\n",
        "        with p.open('rb') as f:\n",
        "            audio_file = client.files.upload(\n",
        "                file=f,\n",
        "                config={\n",
        "                    'display_name': 'temp_audio',\n",
        "                    'mime_type': current_mime\n",
        "                }\n",
        "            )\n",
        "\n",
        "        # 2. 処理待ち\n",
        "        while audio_file.state.name == \"PROCESSING\":\n",
        "            time.sleep(5)\n",
        "            audio_file = client.files.get(name=audio_file.name)\n",
        "\n",
        "        # 3. 解析\n",
        "        instruction = \"\"\"\n",
        "            Analyze this audio and create an 'initial_prompt' to improve transcription accuracy.\n",
        "\n",
        "            【Instructions】\n",
        "            1. Extract proper nouns (names, companies, products), technical terms, and speaker-specific speech patterns.\n",
        "            2. Keep the original spelling and language as heard in the audio.\n",
        "               (e.g., Do not forcibly translate native proper nouns into English if they are in another language.)\n",
        "            3. Output the result as a comma-separated list within 244 tokens.\n",
        "\n",
        "            Return ONLY the comma-separated keywords for the prompt. No intro or outro.\n",
        "        \"\"\"\n",
        "\n",
        "        response = client.models.generate_content(\n",
        "            model=\"gemini-3-flash-preview\",\n",
        "            contents=[instruction, audio_file]\n",
        "        )\n",
        "\n",
        "        # Gemini側のファイルを削除して掃除\n",
        "        client.files.delete(name=audio_file.name)\n",
        "\n",
        "        return response.text.strip()\n",
        "    except Exception as e:\n",
        "        print(f\"⚠️ Gemini解析スキップ: {e}\")\n",
        "        return \"\"\n",
        "\n",
        "# 文字起こしメインループ\n",
        "if target_files:\n",
        "\n",
        "    for idx, file_path in enumerate(target_files):\n",
        "        current_fname = os.path.basename(file_path)\n",
        "        print(f\"\\n🚀 [{idx+1}/{len(target_files)}] 実行中 (Fasterモード): {current_fname}\")\n",
        "\n",
        "        # ユーザーが指定していない場合のみ、Geminiに助けてもらう\n",
        "        current_prompt = initial_prompt # グローバルの設定をコピー\n",
        "        if not current_prompt:\n",
        "            generated_prompt = get_ai_initial_prompt(file_path)\n",
        "            if generated_prompt:\n",
        "                current_prompt = generated_prompt\n",
        "                # --- 表示用：50文字ごとに改行を入れてプリント ---\n",
        "                display_text = \"\\n   \".join([current_prompt[i:i+50] for i in range(0, len(current_prompt), 50)])\n",
        "                print(f\"✨ 生成されたプロンプト:\\n   {display_text}\")\n",
        "\n",
        "        # 文字起こし実行\n",
        "        segments, info = model.transcribe(\n",
        "            file_path,\n",
        "            initial_prompt=current_prompt,\n",
        "            language=None,\n",
        "            condition_on_previous_text=condition_on_previous_text,\n",
        "            beam_size=5,\n",
        "            chunk_length=30,                  # 30秒ずつ区切って処理\n",
        "            # --- ここからが精度のための追加設定 ---\n",
        "            vad_filter=True,                  # 余計なノイズや無音をカット\n",
        "            vad_parameters=dict(min_silence_duration_ms=500),\n",
        "            no_speech_threshold=0.6,          # 喋っていない場所を無理に訳さない\n",
        "            compression_ratio_threshold=2.4,  # 変なループ（同じ言葉の繰り返し）を防ぐ\n",
        "            log_prob_threshold=-1.0,          # 自信がない時に適当なことを言わせない\n",
        "            max_new_tokens=128,               # 1つの字幕の最大文字数を制限\n",
        "            repetition_penalty=1.2,           # ループ（繰り返し）をより厳しく抑制\n",
        "        )\n",
        "\n",
        "        results = []\n",
        "        full_text = \"\"\n",
        "        last_t = 0\n",
        "\n",
        "        # 進捗バーをこのファイル用に作成\n",
        "        from tqdm.notebook import tqdm\n",
        "        pbar = tqdm(total=info.duration, unit=\"sec\", desc=f\"Progress: {current_fname[:20]}\")\n",
        "\n",
        "        for segment in segments:\n",
        "            results.append(segment)\n",
        "            full_text += segment.text\n",
        "            # 進捗バーを更新\n",
        "            pbar.update(segment.end - last_t)\n",
        "            last_t = segment.end\n",
        "\n",
        "        pbar.n = pbar.total\n",
        "        pbar.refresh()\n",
        "        pbar.close()\n",
        "\n",
        "        # 出力ファイル名の生成 (元の名 + 日時 + 各拡張子)\n",
        "        srt_name = get_safe_filename(file_path, \".srt\")\n",
        "        log_name = get_safe_filename(file_path, \".log\")      # 議事録用\n",
        "        txt_name = get_safe_filename(file_path, \".txt\")      # プレーンテキスト用\n",
        "\n",
        "        # SRTとLOGの作成\n",
        "        srt_content = \"\"\n",
        "        log_content = \"\"\n",
        "        for i, seg in enumerate(results):\n",
        "            def f_ts(s):\n",
        "                td = time.gmtime(s)\n",
        "                ms = int((s - int(s)) * 1000)\n",
        "                return f\"{time.strftime('%H:%M:%S', td)},{ms:03d}\"\n",
        "            srt_content += f\"{i+1}\\n{f_ts(seg.start)} --> {f_ts(seg.end)}\\n{seg.text.strip()}\\n\\n\"\n",
        "            log_content += f\"[{f_ts(seg.start)[:8]}] {seg.text.strip()}\\n\"\n",
        "\n",
        "        # --- 保存とダウンロード ---\n",
        "\n",
        "        # 1. 保存先ディレクトリの決定\n",
        "        # GoogleDriveモードなら指定フォルダへ、それ以外ならルートパスへ\n",
        "        save_dir = target_path if mode == \"GoogleDrive\" else ROOT_PATH\n",
        "\n",
        "        # 2. ファイル書き出し（全環境共通）\n",
        "        # 字幕(SRT)は常に保存\n",
        "        with open(os.path.join(save_dir, srt_name), \"w\", encoding=\"utf-8\") as f:\n",
        "            f.write(srt_content)\n",
        "\n",
        "        # 議事録(LOG)\n",
        "        if records_text_download:\n",
        "            with open(os.path.join(save_dir, log_name), \"w\", encoding=\"utf-8\") as f:\n",
        "                f.write(log_content)\n",
        "\n",
        "        # プレーン(TXT)\n",
        "        if plain_text_download:\n",
        "            with open(os.path.join(save_dir, txt_name), \"w\", encoding=\"utf-8\") as f:\n",
        "                f.write(full_text)\n",
        "\n",
        "        # 3. 完了メッセージの表示\n",
        "        if mode == \"GoogleDrive\":\n",
        "            print(f\"✅ Driveに保存完了: {srt_name}\")\n",
        "        else:\n",
        "            if not IS_COLAB:\n",
        "                # ローカル環境の場合\n",
        "                print(f\"✅ カレントディレクトリに保存完了: {srt_name}\")\n",
        "\n",
        "        # 4. ブラウザダウンロード処理（ColabかつGoogleDriveモード以外のみ実行）\n",
        "        if IS_COLAB and mode != \"GoogleDrive\":\n",
        "            # 常にSRTをダウンロード\n",
        "            files.download(srt_name)\n",
        "\n",
        "            # オプションに応じてLOGとTXTも確実にダウンロード\n",
        "            if records_text_download:\n",
        "                files.download(log_name)\n",
        "            if plain_text_download:\n",
        "                files.download(txt_name)\n",
        "\n",
        "            print(f\"✅ ダウンロード完了: {srt_name}\")\n",
        "\n",
        "        # メモリ掃除\n",
        "        del results, srt_content, log_content, full_text\n",
        "        gc.collect()\n",
        "        if torch.cuda.is_available():\n",
        "            torch.cuda.empty_cache()\n",
        "\n",
        "    # 結果表示\n",
        "    # clear_output()\n",
        "\n",
        "    # --- 🐤 Chirp Sound （完了通知音） ---\n",
        "    sample_rate = 44100 # Hz\n",
        "\n",
        "    # 音を構成する要素の数\n",
        "    num_sparkles = 3\n",
        "    # 各きらめき音の長さ\n",
        "    sparkle_duration = 0.08 # 秒\n",
        "    # 周波数の全体的な開始範囲と終了範囲\n",
        "    base_start_frequency = 2000 # Hz\n",
        "    base_end_frequency = 4000 # Hz (全体的に上昇するような効果)\n",
        "    # 急速な減衰率\n",
        "    decay_rate = 20\n",
        "\n",
        "    all_data = []\n",
        "    for i in range(num_sparkles):\n",
        "        # 各きらめき音のタイムベクトル\n",
        "        t_sparkle = np.linspace(0, sparkle_duration, int(sparkle_duration * sample_rate), endpoint=False)\n",
        "\n",
        "        # 各きらめき音の開始周波数と終了周波数を計算\n",
        "        # これにより、全体のきらめき音が上昇するシーケンスになります\n",
        "        f_start_current_sparkle = base_start_frequency + (base_end_frequency - base_start_frequency) * (i / num_sparkles)\n",
        "        f_end_current_sparkle = base_start_frequency + (base_end_frequency - base_start_frequency) * ((i + 1) / num_sparkles)\n",
        "\n",
        "        # 各きらめき音内でわずかな上昇スイープを導入し、きらめき効果を強調\n",
        "        # 実際の周波数は f_start_current_sparkle から f_end_current_sparkle まで変化します\n",
        "        instantaneous_frequency = np.linspace(f_start_current_sparkle, f_end_current_sparkle, len(t_sparkle))\n",
        "\n",
        "        # 波形を生成\n",
        "        sparkle_wave = np.sin(2 * np.pi * instantaneous_frequency * t_sparkle)\n",
        "\n",
        "        # 指数関数的減衰を適用\n",
        "        envelope_sparkle = np.exp(-decay_rate * t_sparkle)\n",
        "        sparkle_data = sparkle_wave * envelope_sparkle\n",
        "\n",
        "        all_data.append(sparkle_data)\n",
        "\n",
        "        # 各きらめき音の間に非常に短い無音を挿入（最後の音以外）\n",
        "        # これにより、それぞれの音がはっきりと聞こえ、混ざり合うのを防ぎます\n",
        "        if i < num_sparkles - 1:\n",
        "            silence_duration = 0.02 # 秒\n",
        "            silence = np.zeros(int(silence_duration * sample_rate))\n",
        "            all_data.append(silence)\n",
        "\n",
        "    data = np.concatenate(all_data)\n",
        "\n",
        "    # ボリュームを正規化（クリッピング防止と音量調整）\n",
        "    # データが空の場合はエラーにならないように処理\n",
        "    data = data / np.max(np.abs(data)) * 0.8 if len(data) > 0 else np.array([0.0])\n",
        "\n",
        "    display(Audio(data, rate=sample_rate, autoplay=True))\n",
        "\n",
        "    print(\"\\n🎉 すべての処理が完了しました！\")\n",
        "\n",
        "    print(\"\\n\" + \"!\"*40)\n",
        "    print(\"⚠️ ATTENTION: PLEASE DISCONNECT RUNTIME\")\n",
        "    print(\"⚠️ 接続解除を忘れないでください！残り時間が削られます。\")\n",
        "    print(\"!\"*40)\n",
        "\n",
        "    # YouTube用の一時ファイル削除\n",
        "    if mode == \"YouTube\" and target_files and os.path.exists(target_files[0]):\n",
        "        os.remove(target_files[0])\n",
        "else:\n",
        "    print(\"⚠️ 処理対象のファイルがありませんでした。\")"
      ]
    }
  ],
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "authorship_tag": "ABX9TyPqNtfKuat0scv8xPpZ4jd7",
      "gpuType": "T4",
      "include_colab_link": true,
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}