Integrating Vision-Language Models with Knowledge Graphs for Advancing AI-Driven Robotics and Precision Agriculture

dc.contributor.advisor Zhang, Hongbo
dc.contributor.author Vundavilli, Venkata
dc.contributor.committeemember Dong, Zhijang
dc.contributor.committeemember Gu, Yi
dc.date.accessioned 2024-12-16T20:03:06Z
dc.date.available 2024-12-16T20:03:06Z
dc.date.issued 2024
dc.date.updated 2024-12-16T20:03:06Z
dc.description.abstract Precision agriculture is at the forefront of modern innovations in farming, using advanced technologies to optimize resource use, increase crop yields, and promote sustainable agricultural practices. This thesis, therefore, addresses some of the critical challenges in precision agriculture, including accurate weed detection, efficient resource allocation, and integration of multimodal data from diverse sources such as drones and IoT devices. In order to address these challenges, a novel strategy is suggested that combines MiniGPT-4, a multimodal vision-language model, with a systematic Knowledge Graph (KG) derived from credible datasets, specifically FAOSTAT and USDA PLANTS. This KG, integrated in the inference pipeline of MiniGPT-4, further expands the model’s contextual understanding and increases its reasoning capabilities; hence, more accurate results are generated in tasks like weed detection and crop monitoring. Empirical evaluations demonstrate that the KG-enhanced MiniGPT-4 significantly outperforms the baseline model on various performance metrics, including BLEU scores, METEOR, ROUGE, and CIDEr, in addition to lowering hallucination rates and improving object and relation coverage. While the quantized model exhibits a slight trade-off with respect to some performance measures, it still retains good functionality, which is acceptable for real-time agricultural applications. This work not only contributes to the technical integration of vision-language models with structured knowledge bases but also provides practical solutions to enhance precision agriculture robotics. The proposed system fosters more sustainable and productive farming practices by enabling smarter decision-making and automating complex agricultural tasks. Future research will look into dynamic Knowledge Graph updates, mechanisms for continual learning, and further applications in both agricultural and non-agricultural domains to further solidify the role of AI-driven solutions in modern agriculture.
dc.description.degree M.S.
dc.identifier.uri https://jewlscholar.mtsu.edu/handle/mtsu/7561
dc.language.rfc3066 en
dc.publisher Middle Tennessee State University
dc.source.uri http://dissertations.umi.com/mtsu:11966
dc.subject Deep learning
dc.subject Image captioning
dc.subject Natural language processing
dc.subject Precision agriculture
dc.subject Quantization
dc.subject Vision language model
dc.subject Computer science
dc.subject Agriculture
dc.thesis.degreelevel masters
dc.title Integrating Vision-Language Models with Knowledge Graphs for Advancing AI-Driven Robotics and Precision Agriculture
Files
Original bundle
Name:
Vundavilli_mtsu_0170N_11966.pdf
Size:
2.09 MB
Format:
Adobe Portable Document Format