# Thom Lake

AI | Deep Learning | Machine Learning

Austin, Texas · [thom.l.lake@gmail.com](mailto:thom.l.lake@gmail.com) · [thomlake.github.io](https://thomlake.github.io)

I work on language-model systems for open-ended problems. My work brings together agentic systems, simulation, principled evaluation, and human–computer interaction.

*Research interests: agent runtimes, evaluation, interactive environments, recommender systems, and post-training for reasoning, coherence, and memory management*

## Selected Publications

- From Distributional to Overton Pluralism: Investigating Large Language Model Alignment, NAACL 2025 [arXiv](https://arxiv.org/abs/2406.17692)
- ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models, NeurIPS 2025 [arXiv](https://arxiv.org/abs/2505.13444)
- Flexible Job Classification with Zero-Shot Learning, RecSys in HR'22 [arXiv](https://arxiv.org/abs/2209.12678)
- Large-scale Collaborative Filtering with Product Embeddings, 2018 [arXiv](https://arxiv.org/abs/1901.04321)

## Experience

### Indeed — Austin, Texas

#### AI Technical Fellow (2025–present)

I lead science for employer-facing AI systems, spanning LLM ranking and decision support systems, conversational assistants, and asynchronous agents. As a senior IC working across dozens of teams, I help shape system architecture, context management strategy, and evaluation methodology for high-stakes AI systems using traditional metrics, model-based assessments (LLM-as-a-Judge), and interactive environments.

#### Principal Data Scientist (2023–2025)

I led early LLM work for hiring workflows, with emphasis on fine-tuning, prompt design, and evaluation. Key applications included candidate assessment, document summarization, and personalized employer outreach, where model improvements increased application starts by 20%. I also worked across ML and data science teams to establish evaluation practices for generative AI systems beyond standard closed-form benchmarks.

#### Staff Data Scientist (2021–2022)

I helped teams shift from bespoke feature engineering to fine-tuning pre-trained language models for ranking, retrieval, parsing, and taxonomy. I led a taxonomy modernization effort that reduced new-market rollout from years to months. In parallel, I built neural recommender systems and introduced fairness evaluation into production workflows.

#### Senior Data Scientist (2019–2020)

I joined as Indeed's first deep learning specialist and helped modernize core hiring systems by adopting pre-trained language models and representation learning techniques. I also introduced GPU-backed experimentation workflows that made these methods practical for applied ML teams.

### Amazon — Austin, Texas

#### Machine Learning Scientist (2016–2019)

I worked on homepage and mobile personalization systems that rank content from multiple recommendation strategies. I developed an attention-based collaborative filtering method that became the top-performing candidate source by engagement, worked on contextual bandit methods for mobile ranking, and helped build shared representation learning infrastructure for millions of products.

### Atlas Wearables — Austin, Texas

#### Lead Data Scientist (2014–2016)

I built on-device machine learning for wearable fitness devices, including exercise recognition, repetition counting, and form analysis. My work included one-shot learning, clustering, and inference pipelines optimized for resource-constrained embedded hardware.

### Zoetis — Kalamazoo, Michigan

#### Data Scientist (2013–2014)

I built data pipelines to normalize messy semi-structured external sources using statistical NLP, heuristics, and human-in-the-loop feedback. I also worked on genotype search and probabilistic inference.

### Western Michigan University — Kalamazoo, Michigan

#### Research Assistant (2010–2013)

I developed neural network models for agricultural disease risk prediction, with emphasis on spatiotemporal generalization, loss design, and evaluation procedures suited to spatiotemporal data.

### Missouri University of Science and Technology — Rolla, Missouri

#### NSF Undergraduate Research (2010)

I worked on wireless sensor networks in simulation and distributed low-resource settings, including unsupervised outlier detection and dynamic tree-based routing.

## Research & Education

### University of Texas at Austin

Doctoral Researcher, TAUR Lab (2023–2025)

### Western Michigan University

MS Computer Science (2012–2015)

BS Computer Science, Mathematics (2009–2012)