---
date: "2025-09-12T00:00:00Z"
categories:
  - linkedin
  - llms
description: "LLM-assisted grading dramatically compresses assessment design, scoring, and analysis cycles while keeping quality close to human evaluators."
keywords: ["LLM evaluation", "grading automation", "AI education", "assessment design", "scoring", "Tools in Data Science"]
---

My _Tools in Data Science_ course uses LLMs for assessments. We use LLMs to

1. Suggest project ideas (I pick), e.g. https://chatgpt.com/share/6741d870-73f4-800c-a741-af127d20eec7
2. Draft the project brief (we edit), e.g. https://docs.google.com/document/d/1VgtVtypnVyPWiXied5q0_CcAt3zufOdFwIhvDDCmPXk/edit
3. Propose scoring rubrics (we tweak), e.g. https://chatgpt.com/share/68b8eef6-60ec-800c-8b10-cfff1a571590
4. Score code against the rubric (we test), e.g. https://github.com/sanand0/tds-evals/blob/5cfabf09c21c2884623e0774eae9a01db212c76a/llm-browser-agent/process_submissions.py
5. Analyze the results (we refine), e.g. https://chatgpt.com/share/68b8f962-16a4-800c-84ff-fb9e3f0c779a

This changed our assessments process. It's easier _and_ better.

Earlier, TAs took 2 **weeks** to evaluate 500 code submissions. In the example above, it took 2 **hours**. Quality held up: LLMs match my judgement as closely as TAs do but run fast and at scale.

LLM-graded reviews aren't just a cost hack. They're a **scale** and **quality** lever.

1. We create new assessments fast. The example took ~2 hours to ideate.
2. We run, analyze and iterate just as fast. This full loop now takes ~2 hours.

I _no longer have an excuse_ to teach outdated content.

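
One way to check a claim like "LLMs match my judgement as closely as TAs do" is to score the same submissions three ways and compare each grader's gap from the instructor. The sketch below is illustrative only: it is not the code in the linked repository, the helper `mean_abs_diff` is a hypothetical name, and the scores are made up rather than course data.

```python
from statistics import mean


def mean_abs_diff(reference, candidate):
    """Average absolute gap between two graders' scores on the same submissions."""
    if len(reference) != len(candidate):
        raise ValueError("Both graders must score the same submissions")
    if not reference:
        raise ValueError("Need at least one submission")
    return mean(abs(r - c) for r, c in zip(reference, candidate))


# Hypothetical scores out of 10 for the same five submissions (invented, not course data)
instructor = [8, 6, 9, 4, 7]
ta = [7, 6, 10, 5, 7]
llm = [8, 5, 9, 5, 8]

ta_gap = mean_abs_diff(instructor, ta)
llm_gap = mean_abs_diff(instructor, llm)
print(f"TA gap: {ta_gap:.2f}, LLM gap: {llm_gap:.2f}")
```

A smaller gap means closer agreement with the instructor. Mean absolute difference is one simple choice among several; a rank correlation or a per-rubric-item comparison could be swapped in depending on what "match" should mean.
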
Prompts & code: https://github.com/sanand0/tds-evals/tree/main/llm-browser-agent

![](https://files.s-anand.net/images/2025-09-12-tds-llm-evaluation-linkedin.jpg)

[LinkedIn](https://www.linkedin.com/posts/sanand0_my-%F0%9D%98%9B%F0%9D%98%B0%F0%9D%98%B0%F0%9D%98%AD%F0%9D%98%B4-%F0%9D%98%AA%F0%9D%98%AF-%F0%9D%98%8B%F0%9D%98%A2%F0%9D%98%B5%F0%9D%98%A2-%F0%9D%98%9A%F0%9D%98%A4%F0%9D%98%AA%F0%9D%98%A6%F0%9D%98%AF%F0%9D%98%A4%F0%9D%98%A6-activity-7369317805403369473-TjAV)