Research
2024
Recreating LLaVA for Emergency Landing Use Cases
Recreated the "Visual Instruction Tuning" paper (LLaVA) by implementing the architecture from scratch: we trained a Llama language model paired with a visual encoder on University of Michigan GPUs, then used GPT-4o to evaluate our LLaVA implementation on the COCO and Skyview datasets, measuring its ability to identify emergency landing spots in aerial images.
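The core idea of the LLaVA architecture described above is a small trainable projector that maps vision-encoder patch features into the language model's embedding space, so image tokens can be prepended to text tokens. The sketch below illustrates that connection in PyTorch; the class name and all dimensions are illustrative placeholders, not the sizes used in our implementation.

```python
import torch
import torch.nn as nn

class LlavaStyleConnector(nn.Module):
    """Minimal sketch of a LLaVA-style vision-language bridge: features
    from a (frozen) vision encoder are projected into the LLM embedding
    space and concatenated in front of the text token embeddings.
    Dimensions here are hypothetical, for illustration only."""

    def __init__(self, vision_dim=768, llm_dim=4096, vocab_size=32000):
        super().__init__()
        # Trainable projection from vision-feature space to LLM space
        self.projector = nn.Linear(vision_dim, llm_dim)
        # Stand-in for the LLM's own token embedding table
        self.token_embed = nn.Embedding(vocab_size, llm_dim)

    def forward(self, image_features, input_ids):
        # image_features: (batch, num_patches, vision_dim)
        # input_ids:      (batch, seq_len) text token ids
        img_embeds = self.projector(image_features)   # (B, P, llm_dim)
        txt_embeds = self.token_embed(input_ids)      # (B, T, llm_dim)
        # Image embeddings are prepended to the text sequence,
        # and the combined sequence is fed to the language model
        return torch.cat([img_embeds, txt_embeds], dim=1)  # (B, P+T, llm_dim)

connector = LlavaStyleConnector()
patch_feats = torch.randn(2, 16, 768)              # fake encoder output
token_ids = torch.randint(0, 32000, (2, 8))        # fake text tokens
combined = connector(patch_feats, token_ids)
print(combined.shape)  # torch.Size([2, 24, 4096])
```

In the actual training recipe, only the projector (and later the LLM) is updated while the vision encoder stays frozen; this sketch omits the transformer layers that would consume the combined sequence.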
Technology Stack
PyTorch
Llama
Computer Vision
GPU Computing
GPT-4o API
COCO Dataset
Skyview Dataset
Research Report