Hangrui Cao

I am a graduate student majoring in Computational Data Science at Carnegie Mellon University's School of Computer Science. I graduated with a bachelor's degree in Computer Science and a minor in Mathematics from the University of Michigan, and a dual degree in Electrical and Computer Engineering from Shanghai Jiao Tong University. My fields of interest include Machine Learning & Data Analysis, Full-stack Engineering, Databases, and Web & App Development.

Links:

  • CV (Nov 2021)
  • Experience

    Carnegie Mellon University, Pittsburgh, PA, US

    M.S. in Computational Data Science, School of Computer Science | GPA: 4.0/4.0 | May 2022 - Dec 2023 (Expected)
    Selected Courses:
    • Intro to Computing Systems (A)
    • Search Engine (Ongoing)
    • Machine Learning (Ongoing)
    • Foundations of Computational Data Science (Ongoing)

    University of Michigan, Ann Arbor, US

    B.S.E in Computer Science | Minor in Mathematics | GPA: 3.98/4.00 | Sept 2020 - May 2022
    Artificial Intelligence
    • Computer Vision (A)
    • Intro to Machine Learning (A)
    • Conversation AI (A)
    • Human Computer Interaction (A+)
    • Natural Language Processing (A)
    Systems & Programming
    • Database and Management System (A+)
    • Foundations of Computer Science (A)
    • Data Structure and Algorithms (A)
    • Intro to Computer Organization (A)
    • Computer Networks (A+)
    Mathematics
    • Numerical Analysis (A)
    • Matrix Theory (A)
    • Algorithms (A)

    Shanghai Jiao Tong University, Shanghai, China

    B.S.E in Electrical and Computer Engineering | GPA: 3.874/4.00 | Sept 2018 - Aug 2022
    Electrical and Computer Engineering
    • Programming and Data Structure (A+)
    • Logic Design (A)
    • Electronic Circuits (A)
    Engineering Foundations
    • Introduction to Engineering (A+)
    • Introduction to Computer and Programming (A+)
    • Probabilistic Methods in Engineering (A)
    Mathematics & Physics
    • Discrete Mathematics (A)
    • Honors Mathematics II-IV (A-, A+)
    • Honors Physics I & II (A, A+)

    Ritsumeikan University

    Winter Program | Jan 2020 - Feb 2020

    Work Experience

    Deep Learning Software Intern at Intel

    [Public Code]

    • Implemented machine learning models such as XGBoost and ResNet in Scala and Java, improving training speed by 9.8%
    • Developed a model inference pipeline with Python and OpenVINO and designed a YOLOv3 model to detect cigarettes in photos
    • Engineered and revised the PPML module, Dockerfile, and unit tests to support the new Graphene version 1.2RC and Intel SGX
    • Deployed a federated learning framework (FATE) in Intel SGX and tested model inference for cluster serving with Flink & Spark

    Machine Learning Intern & Research Assistant at Transportation Research Institute, University of Michigan

    [Introductory Slides]

    • Advised by Professor Carol Flannagan.
    • Developed a lightweight multi-class CNN model to classify drivers' behavior with OpenCV and achieved 90.2% accuracy
    • Designed loss functions, confident learning, and a probability model to address the “Shaky Ground Truth” uncertainty problem
    • Implemented a Bayesian CNN that models probability distributions over weights and obtained an aleatoric uncertainty of 0.0162
    • Investigated different eye-gaze software and built a system to gather labelers’ and drivers’ eye-gaze data with PyGaze

    Personal Projects

    I've participated in research projects related to machine learning, distributed systems, mobile computing, and data analysis.

    Birds of a Feather Help: Context-aware Client Selection for Efficient Federated Learning

    [Paper] Accepted by FL-AAAI-22 for oral presentation

    • Advised by Professor Yifei Zhu.
    • Invented a novel neural combinatorial contextual bandit (NCCB) that intelligently selects clients in each federated round while satisfying the privacy requirements of federated learning. The method surpassed the state-of-the-art client selection method (Oort) in both speed and final accuracy.

    Large-Scale Data Analysis and Group Work Pattern Recognition with the GitHub Archive Dataset

    • Advised by Professor Qiaozhu Mei.
    • Presented a large-scale empirical study of how sentimental emoji and text usage relate to GitHub workers' behaviour across different metrics. Methods include GNNs (for social graph embedding), LINE (Large-scale Information Network Embedding), regression, clustering, and NLP.

    Networked Control System Under Denial-of-Service Attack Simulation

    [Paper in CAC 2020]

    • Advised by Professor Jing Wu.
    • Implemented a networked control system and simulated DoS attacks on it using TrueTime

    Selected Course Projects

    Automatic Image Colorization with CNN and GAN

    [Course Paper] [Code]

    • Implemented a GAN (Generative Adversarial Network) for image colorization and compared it with classification approaches in terms of PSNR and SSIM.

    Real-time Carbon Emission Evaluation and Optimization for Future Low-Carbon Buildings

    [Website]

    • Used a reinforcement-learning-based method to control carbon emissions and power consumption in large buildings
    • Researched multiple software packages for simulating building carbon emissions
    • Implemented a simulation, data processing, and analysis pipeline and built a real-time database backend
    • Developed a website with a real-time dashboard to track the performance of our strategies.

    Email Voice Assistant

    [Course Paper] [Code]

    • Designed an interface for smart voice email control with React.js and a backend pipeline to process email requests with Flask
    • Developed a speech-to-text model with 7.8% WER and generated smart replies to users with Dialogflow and Rasa

    Contact Me

    Feel free to leave your message here!