{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "provenance": [], "authorship_tag": "ABX9TyMflk/ikTyYALqe18fySLnX", "include_colab_link": true }, "kernelspec": { "name": "python3", "display_name": "Python 3" }, "language_info": { "name": "python" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [ "\"Open" ] }, { "cell_type": "markdown", "source": [ "# Build a Smart Resume Ranker with Python and Natural Language Processing\n", "\n", "## Use AI to evaluate and score resumes against a job description\n", "\n", "| ![space-1.jpg](https://github.com/Tanu-N-Prabhu/Python/blob/master/Img/resume_ranker.png?raw=true) |\n", "|:--:|\n", "| Image Generated Using Canva|\n", "\n", "### Introduction\n", "Tired of manually skimming through resumes to match the right candidate? In today’s world, AI can do a first-level screening by ranking resumes based on their relevance to a job description. In this project, we'll build a smart resume ranking system using TF-IDF and cosine similarity, simple NLP techniques that can save a lot of time.\n", "\n", "---\n", "\n", "### What You'll Learn\n", "* How to parse text from resumes and job descriptions\n", "* Clean and preprocess resume content\n", "* Convert documents into numerical form using TF-IDF\n", "* Calculate similarity scores using cosine similarity\n", "* Rank resumes based on relevance\n", "\n", "---\n", "\n", "### Code Implementation\n" ], "metadata": { "id": "ywY_EoH2-Fvc" } }, { "cell_type": "code", "source": [ "import os\n", "from sklearn.feature_extraction.text import TfidfVectorizer\n", "from sklearn.metrics.pairwise import cosine_similarity\n", "\n", "# Sample job description\n", "job_desc = \"\"\"\n", "We are looking for a Python developer with experience in data analysis, pandas, NumPy,\n", "machine learning, and writing clean, maintainable code.\n", "\"\"\"\n", "\n", "# Sample resumes\n", "resumes = [\n", " \"Experienced Python developer with knowledge of pandas, NumPy, and machine learning models.\",\n", " \"Software engineer skilled in Java, C++, and cloud services like AWS and Azure.\",\n", " \"Data scientist with strong Python skills, data cleaning, pandas, sklearn, and model evaluation.\",\n", " \"Content writer with a strong verbal and written skills.\",\n", " \"I have no experience in programming\",\n", " \"Python developer with experience in data analysis, pandas, NumPy, machine learning, and writing clean, maintainable code.\"\n", "]\n", "\n", "# Combine all documents\n", "documents = [job_desc] + resumes\n", "\n", "# TF-IDF Vectorization\n", "vectorizer = TfidfVectorizer()\n", "tfidf_matrix = vectorizer.fit_transform(documents)\n", "\n", "# Calculate cosine similarity between job description and each resume\n", "cosine_similarities = cosine_similarity(tfidf_matrix[0:1], tfidf_matrix[1:]).flatten()\n", "\n", "# Rank resumes\n", "ranking = sorted(list(enumerate(cosine_similarities)), key=lambda x: x[1], reverse=True)\n", "\n", "# Display results\n", "print(\"Resume Ranking:\")\n", "for idx, score in ranking:\n", " print(f\"Resume {idx+1} - Relevance Score: {score:.2f}\")" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "igiNh_VR_h-b", "outputId": "4c79870b-731e-49bb-92e6-deeb8df537f5" }, "execution_count": 7, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Resume Ranking:\n", "Resume 6 - Relevance Score: 0.82\n", "Resume 1 - Relevance Score: 0.35\n", "Resume 3 - Relevance Score: 0.21\n", "Resume 5 - Relevance Score: 0.13\n", "Resume 2 - Relevance Score: 0.07\n", "Resume 4 - Relevance Score: 0.06\n" ] } ] }, { "cell_type": "markdown", "source": [ "### Explanation\n", "* TF-IDF captures how important each word is in each resume relative to the job description.\n", "\n", "* Cosine similarity then measures how closely the resume matches the job.\n", "\n", "* This is a scalable way to automate initial screening in HR workflows or freelance platforms.\n", "\n", "---\n", "\n", "### Understanding Resume Scores\n", "Each resume is compared to the job description using text similarity, and a Relevance Score between 0 and 1 is generated:\n", "\n", "* A score closer to 1 means very relevant (e.g., 0.95 = strong match)\n", "\n", "* A score around 0.5 means moderately relevant\n", "\n", "* A score below 0.3 means weak or poor match\n", "\n", "Think of it as a percentage match: just multiply by 100 to get an idea:\n", "\n", "* 0.38 → 38% match\n", "\n", "* 0.23 → 23% match\n", "\n", "* 0.15 → 15% match\n", "\n", "---\n", "\n", "### Conclusion\n", "AI doesn’t always have to be complicated. This simple NLP-based resume ranker shows how machine learning concepts can create real-world impact, especially in recruitment, HR tech, or freelance hiring systems. You can expand this further by adding PDF parsing, named entity recognition (NER), or using large language models for deeper understanding. Thanks for reading my article, let me know if you have any suggestions or similar implementations via the comment section. Until then, see you next time. Happy coding!\n", "\n", "---\n", "\n", "### Before you go\n", "* Be sure to Like and Connect Me\n", "* Follow Me : [Medium](https://medium.com/@tanunprabhu95) | [GitHub](https://github.com/Tanu-N-Prabhu) | [LinkedIn](https://ca.linkedin.com/in/tanu-nanda-prabhu-a15a091b5) | [Python Hub](https://github.com/Tanu-N-Prabhu/Python)\n", "* [Check out my latest articles on Programming](https://medium.com/@tanunprabhu95)\n", "* Check out my [GitHub](https://github.com/Tanu-N-Prabhu) for code and [Medium](https://medium.com/@tanunprabhu95) for deep dives!" ], "metadata": { "id": "FpxFJj3-BEFb" } } ] }