Research
2024
Recreating LLaVA for Emergency Landing Use Cases
Recreated the "Visual Instruction Tuning" paper (LLaVA) by implementing the architecture from scratch: we trained a Llama language model paired with a visual encoder on University of Michigan GPUs, then used GPT-4o to evaluate our LLaVA implementation on the COCO and Skyview datasets, measuring its ability to identify emergency landing spots in aerial images.
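The core idea of the LLaVA architecture described above is a small trainable projector that maps vision-encoder patch features into the language model's embedding space, so image tokens can be prepended to text tokens. The sketch below illustrates that connection in PyTorch; the class name and all dimensions are illustrative placeholders, not the sizes used in our implementation.

```python
import torch
import torch.nn as nn

class LlavaStyleConnector(nn.Module):
    """Minimal sketch of a LLaVA-style vision-language bridge: features
    from a (frozen) vision encoder are projected into the LLM embedding
    space and concatenated in front of the text token embeddings.
    Dimensions here are hypothetical, for illustration only."""

    def __init__(self, vision_dim=768, llm_dim=4096, vocab_size=32000):
        super().__init__()
        # Trainable projection from vision-feature space to LLM space
        self.projector = nn.Linear(vision_dim, llm_dim)
        # Stand-in for the LLM's own token embedding table
        self.token_embed = nn.Embedding(vocab_size, llm_dim)

    def forward(self, image_features, input_ids):
        # image_features: (batch, num_patches, vision_dim)
        # input_ids:      (batch, seq_len) text token ids
        img_embeds = self.projector(image_features)   # (B, P, llm_dim)
        txt_embeds = self.token_embed(input_ids)      # (B, T, llm_dim)
        # Image embeddings are prepended to the text sequence,
        # and the combined sequence is fed to the language model
        return torch.cat([img_embeds, txt_embeds], dim=1)  # (B, P+T, llm_dim)

connector = LlavaStyleConnector()
patch_feats = torch.randn(2, 16, 768)              # fake encoder output
token_ids = torch.randint(0, 32000, (2, 8))        # fake text tokens
combined = connector(patch_feats, token_ids)
print(combined.shape)  # torch.Size([2, 24, 4096])
```

In the actual training recipe, only the projector (and later the LLM) is updated while the vision encoder stays frozen; this sketch omits the transformer layers that would consume the combined sequence.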
Technology Stack
PyTorch
Llama
Computer Vision
GPU Computing
GPT-4o API
COCO Dataset
Skyview Dataset
Research Report