# LLM Code Eval

LLM Code Eval is a Python tool for evaluating code generated by large language models (LLMs) on esoteric programming tasks. It runs the generated code through an interpreter for the target esoteric language, checks the output for correctness, and logs detailed results. This is useful for validating generated programs and for profiling LLM performance.

## Features

- Executes code in esoteric programming languages using a specified interpreter.
- Compares the program's output against expected results.
- Logs detailed results in JSON format.

## Installation

This tool requires Python 3.7 or higher. No additional Python packages are required.

## Usage

Run the tool from the command line:

```bash
python llm_code_eval.py --code CODE_FILE --interpreter INTERPRETER --expected-output EXPECTED_OUTPUT [--log LOG_FILE]
```

### Arguments

- `--code`: Path to the generated code file.
- `--interpreter`: Path to the esoteric language interpreter.
- `--expected-output`: Path to the file containing the expected output.
- `--log`: (Optional) Path to save the execution log as a JSON file.

A complete example invocation appears in the Example section at the end of this README.

## Testing

The tool includes a comprehensive test suite built on `pytest`. To run the tests, install `pytest` and execute:

```bash
pytest test_llm_code_eval.py
```

The tests mock the file system and subprocess calls, so they run without requiring actual code files or interpreters; a sketch of this mocking approach also appears at the end of this README.

## License

This project is licensed under the MIT License.
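
## Example

An invocation might look like the following. The file names here are illustrative, not shipped with the tool: a Brainfuck program, a Brainfuck interpreter, and a text file holding the expected output.

```bash
python llm_code_eval.py \
  --code hello.bf \
  --interpreter /usr/local/bin/brainfuck \
  --expected-output expected.txt \
  --log result.json
```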
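
## How It Works

The core flow is: run the interpreter on the code file, capture its output, compare it against the expected text, and record the result. The sketch below illustrates that flow under two assumptions: the interpreter takes the code file as its only argument, and the function name and JSON field names are illustrative rather than the tool's actual API.

```python
import json
import subprocess

def evaluate(code_path, interpreter_path, expected_output_path, log_path=None):
    """Run one generated program and compare its stdout to the expected output."""
    with open(expected_output_path) as f:
        expected = f.read()

    # Run the interpreter on the generated code, capturing stdout as text.
    result = subprocess.run(
        [interpreter_path, code_path],
        capture_output=True,
        text=True,
        timeout=30,  # guard against non-terminating esoteric programs
    )

    passed = result.returncode == 0 and result.stdout == expected

    # Assemble the detailed record that would be logged as JSON.
    record = {
        "code_file": code_path,
        "interpreter": interpreter_path,
        "return_code": result.returncode,
        "actual_output": result.stdout,
        "expected_output": expected,
        "passed": passed,
    }
    if log_path is not None:
        with open(log_path, "w") as f:
            json.dump(record, f, indent=2)
    return passed
```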
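
Building on the sketch above, a test can patch `builtins.open` and `subprocess.run` so that no real files or interpreters are touched, which is the mocking approach the test suite takes. The `evaluate` import below is hypothetical and matches the sketch, not necessarily the actual module layout.

```python
from unittest import mock

from llm_code_eval import evaluate  # hypothetical import; see the sketch above

def test_evaluate_passes_on_matching_output():
    fake_run = mock.Mock(returncode=0, stdout="Hello, World!")
    # Patch the expected-output read and the interpreter call.
    with mock.patch("builtins.open", mock.mock_open(read_data="Hello, World!")), \
         mock.patch("subprocess.run", return_value=fake_run):
        assert evaluate("hello.bf", "/usr/bin/bf", "expected.txt") is True
```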