# Token Streamer

Token Streamer is a Python tool that streams token-by-token completions from a locally hosted language model (LLM), providing real-time feedback for interactive applications. This tool is designed to help developers create dynamic and responsive systems while keeping all processing offline for enhanced privacy.

## Features

- Streams tokens generated by a locally hosted LLM in real-time.
- Supports adjustable streaming speed.
- Optionally saves the generated output to a file.
- Provides a command-line interface (CLI) for ease of use.

## Installation

To use Token Streamer, you need to install the required dependencies. You can do this using pip:

```bash
pip install transformers rich pytest
```

## Usage

### CLI Usage

Run the tool from the command line with the following options:

```bash
python token_streamer.py --model-path --input [--stream-speed ] [--output-file ]
```

- `--model-path`: Path to the locally hosted model.
- `--input`: Input prompt for the model.
- `--stream-speed`: (Optional) Delay in seconds between streaming tokens. Default is 0.5 seconds.
- `--output-file`: (Optional) File path to save the output.

### Example

```bash
python token_streamer.py --model-path ./models/gpt2 --input "Once upon a time" --stream-speed 0.2 --output-file output.txt
```

## Testing

To run the tests, use pytest:

```bash
pytest test_token_streamer.py
```

The tests include:

1. Basic functionality test.
2. Test with output file saving.
3. Error handling test.

## License

This project is licensed under the MIT License.
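
## Streaming Sketch

The `--stream-speed` and `--output-file` options described under CLI Usage can be pictured with a small, model-free sketch. This is not the code in `token_streamer.py`; the function name `stream_tokens` and its parameters are hypothetical, and the tokens are a hard-coded stand-in for what a model would produce. It only shows the general pattern: print each token as it arrives, pause for the configured delay, and optionally write the joined text to a file.

```python
import sys
import time
from typing import Iterable, Optional


def stream_tokens(
    tokens: Iterable[str],
    stream_speed: float = 0.5,
    output_file: Optional[str] = None,
) -> str:
    """Print tokens one at a time, pausing `stream_speed` seconds after each.

    Hypothetical helper for illustration only; the real tool may differ.
    """
    pieces = []
    for token in tokens:
        sys.stdout.write(token)
        sys.stdout.flush()  # show the token immediately, not at end of line
        pieces.append(token)
        time.sleep(stream_speed)
    sys.stdout.write("\n")

    text = "".join(pieces)
    if output_file is not None:
        # Save the full output once streaming has finished.
        with open(output_file, "w", encoding="utf-8") as fh:
            fh.write(text)
    return text


if __name__ == "__main__":
    # Stand-in tokens; a real run would take these from the local model.
    stream_tokens(["Once", " upon", " a", " time"], stream_speed=0.2)
```

With a real model, the iterable could plausibly come from a streaming helper such as `TextIteratorStreamer` in `transformers`, though whether Token Streamer uses it is an assumption.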