Experience
- Developed embodied task-planning modules using multi-agent workflows and Vision-Language Models for autonomous task execution and decision-making in humanoid robots
- Built and deployed model-based and model-free 6DoF object pose estimation modules to improve robotic grasping accuracy
- Developed an instruction-following data generation and benchmarking pipeline for Vision-and-Language Navigation (VLN) training, automating dataset creation and evaluation workflows
ROS 2, Python, PyTorch, LangGraph, Tool Calling, VLA, VLM, Gazebo
- Built a Computer Vision Acceleration Pipeline cutting development-phase time cost by 50–70%
- Deployed Warehouse Package Tracking with RF-DETR on warehouse surveillance cameras, automating storage processes
- Deployed a Food-Ordering Agentic Workflow with LangGraph in kiosks to automate the ordering process, from customer interaction (STT/NLP) to order fulfillment and customer feedback collection (TTS/NLP)
Python, FastAPI, ONNX, LangChain, LangGraph, LLM, NLP, ASR, TTS
- Developed an Android VR See-Through System for medical assessment robots to support clinical observation
- Designed an Unreal-based VR Driving Simulator with an integrated AI driving assistant for driver training and cognition evaluation (Hitachi Corp collaboration)
Flutter, Python, Unity, Unreal Engine, ROS 2
- Built and deployed large-scale speaker verification for the Viettel Call Center, serving millions of customers daily with >99% accuracy, cutting manual workload by more than 50%, preventing fraud, and improving customer experience
- Developed a voice quality assessment system for call centers to monitor and improve the audio quality of speech datasets, cutting manual review time by 80% and improving ASR performance through better data quality
- Participated in VoxCeleb Speaker Recognition Challenge (VoxSRC) 2022 (Top 20) and Vietnamese Language and Speech Processing (VLSP 2022)
- Prepared data for and integrated an LLM into the My Viettel chatbot platform
Python, PyTorch, Triton, Docker, Kafka, ASR, VAD, NLP, LLM, Speaker Verification
Internal Contributions
- Tech Talk: Speaker Verification for Call Center — internal knowledge-sharing session
- Mentor — Viettel Digital Talent Program 2022
Achievements
- Participated in the VLSP (Vietnamese Language & Speech Processing) competition
- Placed in the Top 20 of the VoxCeleb Speaker Recognition Challenge (VoxSRC) 2022
- Completed intensive AI & Data Science training under Viettel's national talent initiative
- Contributed to a Vietnamese speaker verification system later deployed in production
Achievement
- Selected as a Top 10 candidate in the Viettel Digital Talent Program 2021 cohort

Master of Engineering — Human Robotics Interaction
Nara Institute of Science and Technology (NAIST), Japan
Oct 2023 – Sep 2025
Research: Personalized human-robot interface for APMVs using generative models tailored to passenger personality.
Funded by MEXT Scholarship, Japan (2023).

Bachelor of Science — Automation & Electrical Engineering
Hanoi University of Science and Technology (HUST), Vietnam
Sep 2018 – Sep 2022
GPA: 3.52 / 4.0. Graduated with Distinction; placed 2nd in the thesis defense committee.
Thesis: Speaker verification system for telecommunication service.
Task: Speaker Verification
Track 1 (closed, no external data) and Track 2 (open, with external dataset)
Approach:
- Model fusion: RawNet (raw audio input) + ECAPA-TDNN (spectrogram input)
- Multi-loss optimization: AAM-Softmax (classification) + Proxy Anchor (metric learning)
- n-fold data augmentation pipeline
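For readers unfamiliar with the AAM-Softmax term above, the following is a minimal sketch of the additive angular margin idea in plain Python. It is not the competition code: the function name `aam_softmax_loss`, the margin and scale defaults, and the toy cosine values are all illustrative assumptions, and the Proxy Anchor term is not covered.

```python
import math

def aam_softmax_loss(cosines, target, margin=0.2, scale=30.0):
    """Cross-entropy over scaled cosine logits, with an additive angular
    margin applied to the target class only (hypothetical helper).

    cosines: cosine similarity between one embedding and each class weight.
    target:  index of the true speaker class.
    """
    logits = []
    for idx, cos_theta in enumerate(cosines):
        if idx == target:
            # Clamp before acos to guard against floating-point drift.
            theta = math.acos(max(-1.0, min(1.0, cos_theta)))
            # The margin makes the target harder to satisfy. This sketch
            # ignores the theta + margin > pi case that full
            # implementations handle separately.
            cos_theta = math.cos(theta + margin)
        logits.append(scale * cos_theta)
    # Numerically stable log-sum-exp for the softmax normaliser.
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[target]

# Toy example: three speaker classes, class 0 is the true speaker.
toy_cosines = [0.8, 0.1, -0.3]
loss_with_margin = aam_softmax_loss(toy_cosines, target=0, margin=0.2)
loss_without_margin = aam_softmax_loss(toy_cosines, target=0, margin=0.0)
```

With the margin enabled, the same embedding incurs a larger loss than plain softmax would give, which pushes same-speaker embeddings closer together during training.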
Task: O-COCOSDA and VLSP 2022 - MSV Shared Task: Multilingual Speaker Verification
Track 1 (seen languages), Track 2 (unseen languages), and Track 3 (cross-lingual)
Approach:
- Model fusion: RawNet (raw audio input) + ECAPA-TDNN (spectrogram input)
- Pretrained self-supervised models: Wav2Vec 2.0 and WavLM as feature extractors
- Multi-loss optimization for cross-lingual speaker representation
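One common way to realise the model fusion listed above is score-level fusion, sketched below in plain Python. Whether the original systems fused scores or embeddings is not stated here, so this is an assumption, as are the function names, the equal weights, and the toy embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def fused_score(enroll_embeddings, test_embeddings, weights):
    """Weighted average of per-system cosine scores (hypothetical helper).

    enroll_embeddings / test_embeddings: one embedding per system, e.g.
    index 0 from a raw-waveform model and index 1 from a spectrogram model.
    weights: one non-negative weight per system.
    """
    scores = [
        cosine_similarity(enroll, test)
        for enroll, test in zip(enroll_embeddings, test_embeddings)
    ]
    total_weight = sum(weights)
    return sum(w * s for w, s in zip(weights, scores)) / total_weight

# Toy example: system A sees a perfect match, system B sees no similarity.
enroll = [[1.0, 0.0], [1.0, 0.0]]
test = [[1.0, 0.0], [0.0, 1.0]]
score = fused_score(enroll, test, weights=[0.5, 0.5])
```

In practice the fusion weights would be tuned on a held-out trial list, and the fused score compared against a decision threshold.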
Task: Improve human detection with YOLOv5 using a data-driven approach only (no model architecture changes)
Approach:
- Data preprocessing: noisy label removal, annotation standardization, image quality enhancement
- Ensemble with diverse augmentation strategies and hyperparameter sets
- Knowledge distillation to reduce manual data evaluation overhead
