# LLM Edge Deployer

LLM Edge Deployer is a Python library and CLI tool designed to streamline the process of deploying optimized LLMs on edge hardware. It provides utilities to convert models to hardware-efficient formats like ONNX, export them, and run compatibility checks for edge accelerators such as NVIDIA TensorRT.

## Features

- Convert models to ONNX format.
- Check compatibility with edge devices (e.g., NVIDIA TensorRT, OpenVINO).
- Run test inference on ONNX models with sample input data.

## Installation

Install the required dependencies using pip:

```bash
pip install onnx onnxruntime optimum numpy pytest
```

## Usage

### CLI

Run the tool from the command line:

```bash
python llm_edge_deployer.py --input_model \
    --target_device \
    --test_sample \
    --output_model
```

- `--input_model`: Path to the optimized model file.
- `--target_device`: Target edge device type (e.g., `tensorrt`, `openvino`).
- `--test_sample`: Path to the test sample JSON file.
- `--output_model`: (Optional) Path to save the converted ONNX model. Default is `output_model.onnx`.

### Library

You can also use the tool as a Python library:

```python
from llm_edge_deployer import convert_to_onnx, check_device_compatibility, run_test_inference

# Convert model to ONNX
output_model_path = convert_to_onnx("path/to/input_model", "path/to/output_model.onnx")

# Check device compatibility
check_device_compatibility("tensorrt")

# Run test inference
result = run_test_inference("path/to/output_model.onnx", "path/to/test_sample.json")
print("Inference result:", result)
```

## Testing

Run the tests using pytest:

```bash
pytest test_llm_edge_deployer.py
```

## License

This project is licensed under the MIT License.
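
## Example: checking device availability

This README does not show how `check_device_compatibility` works internally, so the following is only a minimal sketch of one way such a check could be done, not the library's actual implementation. It assumes the target device names `tensorrt` and `openvino` map to the ONNX Runtime execution providers `TensorrtExecutionProvider` and `OpenVINOExecutionProvider`. The helper name `is_device_available` and the `PROVIDER_FOR_DEVICE` table are hypothetical and exist only for illustration.

```python
from typing import Iterable, Optional

# Assumed mapping from this tool's device names to ONNX Runtime
# execution provider names (the mapping itself is an assumption).
PROVIDER_FOR_DEVICE = {
    "tensorrt": "TensorrtExecutionProvider",
    "openvino": "OpenVINOExecutionProvider",
}


def is_device_available(target_device: str,
                        available: Optional[Iterable[str]] = None) -> bool:
    """Hypothetical helper: True if the provider for target_device is listed.

    If `available` is None, the list is queried from ONNX Runtime;
    when onnxruntime is not installed, the device is reported as unavailable.
    """
    key = target_device.lower()
    if key not in PROVIDER_FOR_DEVICE:
        raise ValueError(f"Unsupported target device: {target_device!r}")
    if available is None:
        try:
            import onnxruntime as ort
        except ImportError:
            return False
        available = ort.get_available_providers()
    return PROVIDER_FOR_DEVICE[key] in set(available)


# Passing the provider list explicitly keeps the example independent
# of the local ONNX Runtime build.
print(is_device_available("tensorrt",
                          ["TensorrtExecutionProvider", "CPUExecutionProvider"]))  # True
print(is_device_available("openvino", ["CPUExecutionProvider"]))  # False
```

Calling `is_device_available("tensorrt")` without a provider list checks the ONNX Runtime build installed on the current machine. A provider being listed only means the build includes it; whether a given model actually runs on that device still needs a test inference.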