Available for hire

Vu Trong Chau

AI / ML Engineer · LLM Systems · MLOps · GenAI

Building systems that think, retrieve, and decide at scale, from multi-agent RAG pipelines and fine-tuned transformers to production ML infrastructure on GCP and AWS.

Best Model Accuracy: 90%
Records Processed: 2.26M
Inference Latency: 340 ms
Best R² Score: 94.5%

Work Experience

TechX
May 2026 – Present
AI Systems & LLM Engineering Intern
  • Architected a multi-LLM inference orchestration pipeline unifying Claude 3.5, GPT-4, Gemini 1.5 Pro, and Llama 3 behind a secure API gateway with retry logic and adaptive rate limiting, reducing the end-to-end API failure rate and improving pipeline throughput.
  • Optimized on-device inference via 4-bit quantization, FP16 mixed precision, and CPU offloading using Hugging Face Transformers, reducing GPU memory footprint and cutting per-query inference latency.
  • Deployed GPU-accelerated inference services on GCP Vertex AI with Kubernetes-managed autoscaling, batch inference queuing, and response caching, achieving low latency at scale while reducing idle compute costs.
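
A minimal sketch of the retry and fallback pattern named in the first bullet above, not the production gateway code. It assumes each provider is wrapped in a zero-argument callable that raises a transient error on failure; `ProviderError`, `call_with_retry`, and `first_available` are hypothetical names introduced only for illustration.

```python
import random
import time
from typing import Callable, Optional, Sequence


class ProviderError(Exception):
    """Hypothetical transient failure raised by a provider call."""


def call_with_retry(
    call: Callable[[], str],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Retry a call using capped exponential backoff with full jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except ProviderError:
            if attempt == max_attempts - 1:
                raise  # retries exhausted, surface the last error
            # The delay ceiling doubles on each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))


def first_available(
    providers: Sequence[Callable[[], str]], **retry_kwargs
) -> str:
    """Try providers in order, falling back when one exhausts its retries."""
    last_error: Optional[ProviderError] = None
    for provider in providers:
        try:
            return call_with_retry(provider, **retry_kwargs)
        except ProviderError as exc:
            last_error = exc
    raise ProviderError("all providers failed") from last_error
```

The `sleep` parameter is injectable so the backoff schedule can be tested without real waiting. Jitter spreads retries out so that many clients do not hit a recovering endpoint at the same moment.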

About Me

I build AI systems that work in production. My focus is on transforming complex data and modern language models into scalable, reliable applications — from LLM-powered pipelines to full-stack AI backends. I specialize in retrieval-augmented generation, multi-agent architectures, and fine-tuned transformer models designed for real-world performance.
I engineer end-to-end ML lifecycles covering raw data ingestion, feature engineering, model training, hyperparameter optimization, quantization, and API deployment. On the retrieval side, I build intelligent search systems using vector databases (FAISS, Chroma) and semantic embeddings for context-aware reasoning over large-scale knowledge bases.
My MLOps experience spans Docker, Kubernetes, CI/CD pipelines, GCP Vertex AI, and AWS — with a strong focus on post-production monitoring, inference optimization, and responsible AI guardrails. With an M.S. in AI and a background in Business Analytics, I combine technical depth with a clear sense of how AI drives measurable impact.

Technical Skills

Programming Languages
Python · SQL · C/C++ · JavaScript
ML / DL Frameworks
PyTorch · TensorFlow · Scikit-learn · XGBoost · Hugging Face Transformers · LangChain · LangGraph
Generative AI & LLMs
RAG · Prompt Engineering · Vector DBs (FAISS, Chroma) · LLM Evaluation (RAGAS) · Fine-tuning · Quantization (4-bit, FP16) · Agentic Pipelines
MLOps & Cloud Infrastructure
Docker · Kubernetes · GitHub Actions CI/CD · GCP Vertex AI · AWS (EC2, S3) · FastAPI · Triton Inference Server · Prometheus · Gunicorn · Streamlit
Data Engineering & Tools
ETL Pipelines · Pandas · NumPy · NLTK · Tableau · D3.js · FRED/BLS/BEA APIs · Stata

Selected Projects

Healthcare Chatbot
Agentic RAG Platform
  • Built a 5-agent LangGraph RAG pipeline over 500K+ medical records and benchmarked Linear SVM, Naive Bayes, and Logistic Regression classifiers; Linear SVM reached 90.0% accuracy and 0.898 macro F1 and was selected as the production model.
  • Integrated AWS S3 and FAISS vector storage with ETL pipelines that reduced dataset noise by 35%; applied RAGAS to diagnose retrieval gaps and guide chunking strategy improvements.
  • Implemented responsible-AI guardrails with PII detection, hallucination filters, and emergency escalation routing, reducing the unsafe response rate to <0.3% on a 2K red-team eval set.
  • Deployed Dockerized microservices with GitHub Actions CI/CD, Gunicorn, and Prometheus monitoring; full deployment pipeline completes in under 4 minutes.
Python · LangGraph · LangChain · FAISS · Groq · AWS · Docker · RAGAS
Credit Risk Analytics
End-to-End ML Platform
  • Built full ML lifecycle pipelines for loan-default prediction over 2.26M financial records (110K+ loans, 7 states), covering ingestion, feature engineering, training, and a Flask REST serving layer.
  • Trained XGBoost & Logistic Regression models achieving AUC-ROC 0.79, R²=0.91, and 89.7% accuracy; applied SMOTE oversampling and threshold tuning to lift minority-class F1 by 12 pp.
  • Automated ETL pipelines integrating FRED, BLS, and BEA macroeconomic APIs, producing a zero-null feature store of 40+ engineered features.
Scikit-learn · XGBoost · Pandas · Flask · FRED API · Tableau
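
A minimal sketch of the threshold tuning mentioned in the Credit Risk Analytics bullets above, assuming binary labels and predicted default probabilities from a held-out validation set. This is an illustration in plain Python, not the project's code; `f1_at_threshold` and `best_threshold` are hypothetical helper names, and the toy data is made up.

```python
from typing import Sequence, Tuple


def f1_at_threshold(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> float:
    """F1 of the positive (minority) class when score >= threshold predicts 1."""
    tp = fp = fn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(
    y_true: Sequence[int], scores: Sequence[float]
) -> Tuple[float, float]:
    """Scan every observed score as a candidate cut-off and keep the best F1.

    Ties resolve to the lowest threshold because candidates are sorted
    ascending and max() returns the first maximum.
    """
    candidates = sorted(set(scores))
    best = max(candidates, key=lambda t: f1_at_threshold(y_true, scores, t))
    return best, f1_at_threshold(y_true, scores, best)


# Toy validation data (made up): 3 defaults among 8 loans.
y_val = [0, 0, 0, 0, 1, 0, 1, 1]
p_val = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.60]

default_f1 = f1_at_threshold(y_val, p_val, 0.5)
threshold, tuned_f1 = best_threshold(y_val, p_val)
print(f"F1 at 0.5: {default_f1:.3f}; tuned threshold {threshold}: {tuned_f1:.3f}")
```

With imbalanced classes, the default 0.5 cut-off tends to miss positives, so choosing the threshold on validation data (never on the test set) is a common way to trade a little precision for recall.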
Threat Detection
NLP Classification System
  • Fine-tuned Hugging Face Toxic-BERT (PyTorch) on 130K+ labeled texts, achieving 84.9% accuracy and F1-score 0.855; benchmarked TF-IDF + Logistic Regression baseline to quantify transformer lift of +9.3 F1 pp.
  • Designed a FastAPI inference service with async handling and SQLite result caching, serving real-time NLP classification at <80 ms median latency under concurrent load.
  • Established full MLOps CI/CD via GitHub Actions, Docker, and AWS EC2/S3 with automated model regression tests gating every deployment.
PyTorch · Hugging Face · FastAPI · Docker · AWS · CI/CD
Sleep Quality Prediction
Health Analytics ML Service
  • Built end-to-end ML pipelines on 110K+ health records, benchmarking Random Forest, Gradient Boosting, Logistic Regression, and KNN across classification and regression tasks.
  • Achieved 82.4% accuracy and 0.747 macro F1 (Random Forest) on disorder classification; regression model reached R²=0.671, RMSE=0.737 (Logistic Regression) via 5-fold cross-validated model selection.
  • Deployed a Flask prediction API and real-time analytics dashboard serving personalized health recommendations, with model artifacts versioned per best-performing pipeline.
Scikit-learn · Pandas · NumPy · Flask
Global Population Prediction
Interactive Analytics Dashboard
  • Engineered a scalable ETL pipeline integrating 12 World Bank indicators across 195+ countries and 60+ years (1960–2023), producing a zero-null feature store for both ML training and the D3.js frontend.
  • Benchmarked Linear Regression, RNN, and CNN forecasting models — Linear Regression achieved the strongest performance with R²=94.53%, MAE=3.11%, MSE=0.60%, outperforming both deep learning baselines.
  • Built an interactive D3.js dashboard with a choropleth map, time-series analytics, and demographic comparison charts (birth rate, death rate, fertility, life expectancy, age distribution) across 195+ countries.
Python · Scikit-learn · Pandas · D3.js · ETL Pipelines · Flask

Education

M.S. Computer Science — Artificial Intelligence (GPA: 3.5 / 4.0)
Troy University · Troy, Alabama · Jul 2025
Coursework: Machine Learning, Advanced AI, Analysis of Algorithms, Data Visualization, Business Analytics (MBA electives)
B.Eng. Electronic & Electrical Engineering (UK 2:1 Honours)
University of Sunderland · Sunderland, UK · Jul 2021

Certifications

LLM Application Engineering & Development
Simplilearn · Oct 2025
Generative AI with Large Language Models
DeepLearning.AI · Sep 2025
Data Science Methodology
IBM · Sep 2025

Languages

English
Vietnamese