Vu Trong Chau — AI/ML Engineer

Best Model Accuracy: 90%

Records Processed: 2.26M

Inference Latency: 340 ms

Best R² Score: 94.5%

01 — Experience

Work Experience

TechX

May 2026 – Present

AI Systems & LLM Engineering Intern

Architected a multi-LLM inference orchestration pipeline unifying Claude 3.5, GPT-4, Gemini 1.5 Pro, and Llama 3 via a secure API gateway with retry logic and adaptive rate limiting, reducing the end-to-end API failure rate and improving pipeline throughput.
Optimized on-device inference via 4-bit quantization, FP16 mixed precision, and CPU offloading using Hugging Face Transformers, reducing GPU memory footprint and cutting per-query inference latency.
Deployed GPU-accelerated inference services on GCP Vertex AI with Kubernetes-managed autoscaling, batch inference queuing, and response caching, achieving low latency at scale while reducing idle compute costs.

02 — Profile

About Me

I build AI systems that work in production. My focus is on transforming complex data and modern language models into scalable, reliable applications — from LLM-powered pipelines to full-stack AI backends. I specialize in retrieval-augmented generation, multi-agent architectures, and fine-tuned transformer models designed for real-world performance.

I engineer end-to-end ML lifecycles covering raw data ingestion, feature engineering, model training, hyperparameter optimization, quantization, and API deployment. On the retrieval side, I build intelligent search systems using vector databases (FAISS, Chroma) and semantic embeddings for context-aware reasoning over large-scale knowledge bases.

My MLOps experience spans Docker, Kubernetes, CI/CD pipelines, GCP Vertex AI, and AWS — with a strong focus on post-production monitoring, inference optimization, and responsible AI guardrails. With an M.S. in AI and a background in Business Analytics, I combine technical depth with a clear sense of how AI drives measurable impact.

03 — Expertise

Technical Skills

Programming Languages

Python · SQL · C/C++ · JavaScript

ML / DL Frameworks

PyTorch · TensorFlow · Scikit-learn · XGBoost · Hugging Face Transformers · LangChain · LangGraph

Generative AI & LLMs

RAG · Prompt Engineering · Vector DBs (FAISS, Chroma) · LLM Evaluation (RAGAS) · Fine-tuning · Quantization (4-bit, FP16) · Agentic Pipelines

MLOps & Cloud Infrastructure

Docker · Kubernetes · GitHub Actions CI/CD · GCP Vertex AI · AWS (EC2, S3) · FastAPI · Triton Inference Server · Prometheus · Gunicorn · Streamlit

Data Engineering & Tools

ETL Pipelines · Pandas · NumPy · NLTK · Tableau · D3.js · FRED/BLS/BEA APIs · Stata

04 — Work

Selected Projects

Healthcare Chatbot

Agentic RAG Platform

Built a 5-agent LangGraph RAG pipeline over 500K+ medical records, benchmarking Linear SVM, Naive Bayes, and Logistic Regression — achieving 90.0% accuracy and 0.898 macro F1 with Linear SVM as the production model.
Integrated AWS S3 and FAISS vector storage with ETL pipelines that reduced dataset noise by 35%; applied RAGAS to diagnose retrieval gaps and guide chunking strategy improvements.
Implemented responsible-AI guardrails with PII detection, hallucination filters, and emergency escalation routing — unsafe response rate reduced to <0.3% on a 2K red-team eval set.
Deployed Dockerized microservices with GitHub Actions CI/CD, Gunicorn, and Prometheus monitoring; full deployment pipeline completes in under 4 minutes.

Python · LangGraph · LangChain · FAISS · Groq · AWS · Docker · RAGAS

Credit Risk Analytics

End-to-End ML Platform

Built full ML lifecycle pipelines for loan-default prediction over 2.26M financial records (110K+ loans, 7 states), covering ingestion, feature engineering, training, and a Flask REST serving layer.
Trained XGBoost & Logistic Regression models achieving AUC-ROC 0.79, R²=0.91, and 89.7% accuracy; applied SMOTE oversampling and threshold tuning to lift minority-class F1 by 12 pp.
Automated ETL pipelines integrating FRED, BLS, and BEA macroeconomic APIs, producing a zero-null feature store of 40+ engineered features.

Scikit-learn · XGBoost · Pandas · Flask · FRED API · Tableau

Threat Detection

NLP Classification System

Fine-tuned Hugging Face Toxic-BERT (PyTorch) on 130K+ labeled texts, achieving 84.9% accuracy and an F1-score of 0.855; benchmarked against a TF-IDF + Logistic Regression baseline to quantify a transformer lift of +9.3 F1 pp.
Designed a FastAPI inference service with async handling and SQLite result caching, serving real-time NLP classification at <80 ms median latency under concurrent load.
Established full MLOps CI/CD via GitHub Actions, Docker, and AWS EC2/S3 with automated model regression tests gating every deployment.

PyTorch · Hugging Face · FastAPI · Docker · AWS · CI/CD

Sleep Quality Prediction

Health Analytics ML Service

Built end-to-end ML pipelines on 110K+ health records, benchmarking Random Forest, Gradient Boosting, Logistic Regression, and KNN across classification and regression tasks.
Achieved 82.4% accuracy and 0.747 macro F1 (Random Forest) on disorder classification; regression model reached R²=0.671, RMSE=0.737 (Logistic Regression) via 5-fold cross-validated model selection.
Deployed a Flask prediction API and real-time analytics dashboard serving personalized health recommendations, with model artifacts versioned per best-performing pipeline.

Scikit-learn · Pandas · NumPy · Flask

Global Population Prediction

Interactive Analytics Dashboard

Engineered a scalable ETL pipeline integrating 12 World Bank indicators across 195+ countries and 60+ years (1960–2023), producing a zero-null feature store for both ML training and the D3.js frontend.
Benchmarked Linear Regression, RNN, and CNN forecasting models — Linear Regression achieved the strongest performance with R²=94.53%, MAE=3.11%, MSE=0.60%, outperforming both deep learning baselines.
Built an interactive D3.js dashboard with choropleth map, time-series analytics, and demographic comparison charts (birth rate, death rate, fertility, life expectancy, age distribution) across 195+ countries.

Python · Scikit-learn · Pandas · D3.js · ETL Pipelines · Flask

05 — Academic

Education

M.S. Computer Science — Artificial Intelligence (GPA: 3.5 / 4.0)

Troy University · Troy, Alabama · Jul 2025

Coursework: Machine Learning, Advanced AI, Analysis of Algorithms, Data Visualization, Business Analytics (MBA electives)

B.Eng. Electronic & Electrical Engineering (UK 2:1 Honours)

University of Sunderland · Sunderland, UK · Jul 2021

06 — Credentials

Certifications

LLM Application Engineering & Development

Simplilearn · Oct 2025

Generative AI with Large Language Models

DeepLearning.AI · Sep 2025

Data Science Methodology

IBM · Sep 2025

07 — Communication

Languages

English

Vietnamese
