# subaligner

[![Build Status](https://github.com/baxtree/subaligner/actions/workflows/ci-pipeline.yml/badge.svg?branch=master)](https://github.com/baxtree/subaligner/actions/workflows/ci-pipeline.yml?query=branch%3Amaster) ![Codecov](https://img.shields.io/codecov/c/github/baxtree/subaligner) [![python](https://img.shields.io/badge/python-3.9%20%7C%203.10%20%7C%203.11%20%7C%203.12-blue)](https://www.python.org/) [![Documentation Status](https://readthedocs.org/projects/subaligner/badge/?version=latest)](https://subaligner.readthedocs.io/en/latest/?badge=latest) [![GitHub license](https://img.shields.io/github/license/baxtree/subaligner)](https://github.com/baxtree/subaligner/blob/master/LICENSE) [![PyPI](https://badge.fury.io/py/subaligner.svg)](https://badge.fury.io/py/subaligner) [![Docker Pulls](https://img.shields.io/docker/pulls/baxtree/subaligner)](https://hub.docker.com/r/baxtree/subaligner) [![Citation](https://zenodo.org/badge/DOI/10.5281/zenodo.5603083.svg)](https://doi.org/10.5281/zenodo.5603083)

## Supported Formats

Subtitle: SubRip, TTML, WebVTT, (Advanced) SubStation Alpha, MicroDVD, MPL2, TMP, EBU STL, SAMI, SCC and SBV.

Video/Audio: MP4, WebM, Ogg, 3GP, FLV, MOV, Matroska, MPEG TS, WAV, MP3, AAC, FLAC, etc.

:information_source: Subaligner relies on file extensions as default hints to process a wide range of audiovisual or subtitle formats. It is recommended to use extensions widely accepted by the community to ensure compatibility.

## Dependent Package

Required by the basic installation: [FFmpeg](https://www.ffmpeg.org/)

Install FFmpeg
apt-get install ffmpeg
brew install ffmpeg
## Basic Installation
Install from PyPI
pip install -U pip && pip install wheel
pip install subaligner
Install from source
git clone git@github.com:baxtree/subaligner.git && cd subaligner
pip install -U pip
pip install .

:information_source: It is highly recommended to create a virtual environment prior to installation.

## Installation with Optional Packages Supporting Additional Features

Install dependencies for enabling translation and transcription
pip install 'subaligner[llm]'
Install dependencies for enabling forced alignment
pip install --no-build-isolation 'subaligner[stretch]'
Install dependencies for setting up the development environment
pip install --no-build-isolation 'subaligner[dev]'
Install all extra dependencies
pip install --no-build-isolation 'subaligner[harmony]'
Note that `subaligner[stretch]`, `subaligner[dev]` and `subaligner[harmony]` require [eSpeak](https://espeak.sourceforge.net/) to be pre-installed:
Install eSpeak
apt-get install espeak libespeak1 libespeak-dev espeak-data
brew install espeak
Also, if Python 3.12+ is used, you will need to install the following patch for those extras to fully function:
Install patched aeneas
pip install --no-build-isolation -r requirements-py312-patched.txt

## Container Support

If you prefer using a containerised environment over installing everything locally:

Run subaligner with a container
docker run -v `pwd`:`pwd` -w `pwd` -it baxtree/subaligner bash

Windows users can install Subaligner under Windows Subsystem for Linux ([WSL](https://learn.microsoft.com/en-us/windows/wsl/install)). Alternatively, you can use [Docker Desktop](https://docs.docker.com/docker-for-windows/install/) to pull and run the image. Assuming your media assets are stored under `d:\media`, open the built-in Command Prompt, PowerShell, or Windows Terminal:

Run the subaligner container on Windows
docker pull baxtree/subaligner
docker run -v "/d/media":/media -w "/media" -it baxtree/subaligner bash
## Usage
Single-stage alignment (high-level shift with lower latency)
subaligner -m single -v video.mp4 -s subtitle.srt
subaligner -m single -v https://example.com/video.mp4 -s https://example.com/subtitle.srt -o subtitle_aligned.srt
Dual-stage alignment (low-level shift with higher latency)
subaligner -m dual -v video.mp4 -s subtitle.srt
subaligner -m dual -v https://example.com/video.mp4 -s https://example.com/subtitle.srt -o subtitle_aligned.srt
Generate subtitles by transcribing audiovisual files
subaligner -m transcribe -v video.mp4 -ml eng -mr whisper -mf small -o subtitle_aligned.srt
subaligner -m transcribe -v video.mp4 -ml zho -mr whisper -mf medium -o subtitle_aligned.srt
Pass in a global prompt for the entire audio transcription
subaligner -m transcribe -v video.mp4 -ml eng -mr whisper -mf turbo -ip "your initial prompt" -o subtitle_aligned.srt
Use the full subtitle content as a prompt
subaligner -m transcribe -v video.mp4 -s subtitle.srt -ml eng -mr whisper -mf turbo -o subtitle_aligned.srt
Use the previous subtitle segment as the prompt when transcribing the following segment
subaligner -m transcribe -v video.mp4 -s subtitle.srt --use_prior_prompting -ml eng -mr whisper -mf turbo -o subtitle_aligned.srt
(For details on the prompt crafting for transcription, please refer to [Whisper prompting guide](https://cookbook.openai.com/examples/whisper_prompting_guide).)
Alignment on segmented plain texts (double newlines as the delimiter)
subaligner -m script -v video.mp4 -s subtitle.txt -o subtitle_aligned.srt
subaligner -m script -v https://example.com/video.mp4 -s https://example.com/subtitle.txt -o subtitle_aligned.srt
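
The `script` mode expects plain text whose segments are separated by double newlines. As a rough illustration of that input convention (not Subaligner's own code), the sketch below splits a text into segments in Python. The helper name `split_script` is hypothetical, and treating whitespace-only lines or runs of more than two newlines as a single delimiter is an assumption.

```python
import re


def split_script(text: str) -> list[str]:
    """Split plain text into segments separated by blank lines."""
    # Normalise Windows line endings so "\r\n\r\n" behaves like "\n\n".
    normalised = text.replace("\r\n", "\n")
    # A delimiter is a newline, optional whitespace, then another newline.
    parts = re.split(r"\n\s*\n", normalised)
    # Drop empty pieces and trim surrounding whitespace from each segment.
    return [part.strip() for part in parts if part.strip()]


if __name__ == "__main__":
    sample = "Hello there.\n\nHow are you?\nFine, thanks.\n\nGoodbye."
    for number, segment in enumerate(split_script(sample), start=1):
        print(number, repr(segment))
```

Single newlines stay inside a segment, so a multi-line cue remains one unit. Running this over `subtitle.txt` before alignment is a quick way to check how many segments the file holds.
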
Generate JSON raw subtitle with per-word timings
subaligner -m transcribe -v video.mp4 -ml eng -mr whisper -mf turbo -ip "your initial prompt" --word_time_codes -o raw_subtitle.json
subaligner -m script -v video.mp4 -s subtitle.txt --word_time_codes -o raw_subtitle.json
Alignment on multiple subtitles against a single media file
subaligner -m script -v video.mp4 -s subtitle_lang_1.txt -s subtitle_lang_2.txt
subaligner -m script -v video.mp4 -s subtitle_lang_1.txt subtitle_lang_2.txt
Alignment on embedded subtitles
subaligner -m single -v video.mkv -s embedded:stream_index=0 -o subtitle_aligned.srt
subaligner -m dual -v video.mkv -s embedded:stream_index=0 -o subtitle_aligned.srt
Translative alignment with the ISO 639-3 language code pair (src,tgt)
subaligner --languages
subaligner -m single -v video.mp4 -s subtitle.srt -t src,tgt
subaligner -m dual -v video.mp4 -s subtitle.srt -t src,tgt
subaligner -m script -v video.mp4 -s subtitle.txt -o subtitle_aligned.srt -t src,tgt
subaligner -m dual -v video.mp4 -s subtitle.srt -tr helsinki-nlp -o subtitle_aligned.srt -t src,tgt
subaligner -m dual -v video.mp4 -s subtitle.srt -tr facebook-mbart -tf large -o subtitle_aligned.srt -t src,tgt
subaligner -m dual -v video.mp4 -s subtitle.srt -tr facebook-m2m100 -tf small -o subtitle_aligned.srt -t src,tgt
subaligner -m dual -v video.mp4 -s subtitle.srt -tr whisper -tf small -o subtitle_aligned.srt -t src,tgt
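
The `-t` option takes a pair of ISO 639-3 codes written as `src,tgt`. When a wrapper script builds that argument, it can help to check its shape first. The following is a minimal sketch under that assumption: `parse_language_pair` is a hypothetical helper, it only verifies that both codes are three lowercase letters, and it does not know which languages are supported (`subaligner --languages` is the authority for that).

```python
import re


def parse_language_pair(value: str) -> tuple[str, str]:
    """Split 'src,tgt' into two three-letter codes, e.g. 'eng,zho'."""
    parts = [part.strip() for part in value.split(",")]
    if len(parts) != 2:
        raise ValueError(f"expected 'src,tgt', got {value!r}")
    for code in parts:
        # ISO 639-3 codes are exactly three lowercase ASCII letters.
        if not re.fullmatch(r"[a-z]{3}", code):
            raise ValueError(f"not a three-letter language code: {code!r}")
    return parts[0], parts[1]


if __name__ == "__main__":
    source, target = parse_language_pair("eng,zho")
    print(f"translate from {source} to {target}")
```
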
Transcribe audiovisual files and generate translated subtitles
subaligner -m transcribe -v video.mp4 -ml src -mr whisper -mf small -tr helsinki-nlp -o subtitle_aligned.srt -t src,tgt
Shift subtitle manually by offset in seconds
subaligner -m shift --subtitle_path subtitle.srt -os 5.5
subaligner -m shift --subtitle_path subtitle.srt -os -5.5 -o subtitle_shifted.srt
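
To make the effect of a manual offset concrete, here is a minimal Python sketch that applies the same kind of shift to SubRip timing lines. It is an illustration rather than Subaligner's implementation: `shift_srt` is a hypothetical helper, it handles SubRip only, and clamping negative results to zero is an assumption.

```python
import re

# Matches the timing line of a SubRip cue: "HH:MM:SS,mmm --> HH:MM:SS,mmm".
_TIMING_LINE = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})",
    re.MULTILINE,
)


def _format(total_ms: int) -> str:
    """Render milliseconds as a SubRip timestamp."""
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def shift_srt(text: str, offset_seconds: float) -> str:
    """Shift every cue in SubRip text by offset_seconds (may be negative)."""
    offset_ms = round(offset_seconds * 1000)

    def shift_line(match: re.Match) -> str:
        fields = [int(group) for group in match.groups()]
        shifted = []
        for hours, minutes, seconds, millis in (fields[:4], fields[4:]):
            total = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
            # Assumption: timestamps pushed before zero are clamped to zero.
            shifted.append(_format(max(total + offset_ms, 0)))
        return " --> ".join(shifted)

    return _TIMING_LINE.sub(shift_line, text)


if __name__ == "__main__":
    cue = "1\n00:00:01,000 --> 00:00:02,500\nHello\n"
    print(shift_srt(cue, 5.5))
```

Only lines that begin with a cue timing are rewritten, so timestamps quoted inside the subtitle text are left alone.
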
Run batch alignment against directories
subaligner_batch -m single -vd videos/ -sd subtitles/ -od aligned_subtitles/
subaligner_batch -m dual -vd videos/ -sd subtitles/ -od aligned_subtitles/
subaligner_batch -m dual -vd videos/ -sd subtitles/ -od aligned_subtitles/ -of ttml
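
If you need per-file control that the batch command does not give you, one option is to build individual `subaligner` invocations yourself. The sketch below is one way to do that in Python. It assumes that a video and its subtitle share the same base name, which is a convention chosen for this example and not a documented rule of `subaligner_batch`; `build_alignment_commands` is a hypothetical helper and only the `-m`, `-v`, `-s` and `-o` options shown above are used.

```python
from pathlib import Path


def build_alignment_commands(video_dir, subtitle_dir, output_dir, mode="dual"):
    """Return one subaligner argument list per video with a matching subtitle."""
    # Index subtitles by base name, e.g. "episode1" for "episode1.srt".
    subtitles = {
        path.stem: path for path in Path(subtitle_dir).iterdir() if path.is_file()
    }
    commands = []
    for video in sorted(Path(video_dir).iterdir()):
        subtitle = subtitles.get(video.stem)
        if not video.is_file() or subtitle is None:
            continue  # skip directories and videos without a subtitle
        commands.append([
            "subaligner", "-m", mode,
            "-v", str(video),
            "-s", str(subtitle),
            "-o", str(Path(output_dir) / subtitle.name),
        ])
    return commands


if __name__ == "__main__":
    if Path("videos").is_dir() and Path("subtitles").is_dir():
        for command in build_alignment_commands("videos", "subtitles", "aligned_subtitles"):
            print(" ".join(command))
```

Each returned list can be passed to `subprocess.run` once the output directory exists.
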
Run alignments with pipx
pipx run subaligner -m single -v video.mp4 -s subtitle.srt
pipx run subaligner -m dual -v video.mp4 -s subtitle.srt
Run the module as a script
python -m subaligner -m single -v video.mp4 -s subtitle.srt
python -m subaligner -m dual -v video.mp4 -s subtitle.srt
Run alignments with the docker image
docker pull baxtree/subaligner
docker run -v `pwd`:`pwd` -w `pwd` -it baxtree/subaligner subaligner -m single -v video.mp4 -s subtitle.srt
docker run -v `pwd`:`pwd` -w `pwd` -it baxtree/subaligner subaligner -m dual -v video.mp4 -s subtitle.srt
docker run -it baxtree/subaligner subaligner -m single -v https://example.com/video.mp4 -s https://example.com/subtitle.srt -o subtitle_aligned.srt
docker run -it baxtree/subaligner subaligner -m dual -v https://example.com/video.mp4 -s https://example.com/subtitle.srt -o subtitle_aligned.srt

![](figures/screencast.gif)

The aligned subtitle will be saved at `subtitle_aligned.srt`. To obtain the subtitle in raw JSON format for downstream processing, replace the output file extension with `.json`. For details on the CLIs, run `subaligner -h` or `subaligner_batch -h`. Additional utilities are described by `subaligner_convert -h`, `subaligner_train -h` and `subaligner_tune -h`. `subaligner_1pass` and `subaligner_2pass` are shortcuts for running `subaligner` with the `-m single` and `-m dual` options, respectively.

## Advanced Usage

You can train a new model with your own audiovisual files and subtitle files:

Train a custom model
subaligner_train -vd VIDEO_DIRECTORY -sd SUBTITLE_DIRECTORY -tod TRAINING_OUTPUT_DIRECTORY

Then you can apply it to your subtitle synchronisation with the aforementioned commands. For more details on how to train and tune your own model, please refer to [Subaligner Docs](https://subaligner.readthedocs.io/en/latest/advanced_usage.html).

For larger media files taking longer to process, you can reconfigure various timeouts using the following:

Options for tuning timeouts

## Anatomy

Subtitles can be out of sync with their companion audiovisual media files for a variety of reasons, including latency introduced by Speech-To-Text on live streams, or calibration and rectification involving human intervention during post-production.

A model has been trained with synchronised video and subtitle pairs and is later used for predicting shifting offsets and directions under the guidance of a dual-stage aligning approach.

First Stage (Global Alignment):

![](figures/1st_stage.png)

Second Stage (Parallelised Individual Alignment):

![](figures/2nd_stage.png)

## Acknowledgement

This tool wouldn't be possible without the following packages: [librosa](https://librosa.github.io/librosa/), [tensorflow](https://www.tensorflow.org/), [scikit-learn](https://scikit-learn.org), [pycaption](https://pycaption.readthedocs.io), [pysrt](https://github.com/byroot/pysrt), [pysubs2](https://github.com/tkarabela/pysubs2), [aeneas](https://www.readbeyond.it/aeneas/), [transformers](https://huggingface.co/transformers/) and [whisper](https://openai.com/index/whisper/).

Thanks to Alan Robinson and Nigel Megitt for their invaluable feedback.