Integrating Vision-Language Models with Knowledge Graphs for Advancing AI-Driven Robotics and Precision Agriculture
dc.contributor.advisor | Zhang, Hongbo | |
dc.contributor.author | Vundavilli, Venkata | |
dc.contributor.committeemember | Dong, Zhijang | |
dc.contributor.committeemember | Gu, Yi | |
dc.date.accessioned | 2024-12-16T20:03:06Z | |
dc.date.available | 2024-12-16T20:03:06Z | |
dc.date.issued | 2024 | |
dc.date.updated | 2024-12-16T20:03:06Z | |
dc.description.abstract | Precision agriculture is at the forefront of modern innovations in farming, using advanced technologies to optimize resource use, increase crop yields, and promote sustainable agricultural practices. This thesis, therefore, addresses some of the critical challenges in precision agriculture, including accurate weed detection, efficient resource allocation, and integration of multimodal data from diverse sources such as drones and IoT devices. In order to address these challenges, a novel strategy is suggested that combines MiniGPT-4, a multimodal vision-language model, with a systematic Knowledge Graph (KG) derived from credible datasets, specifically FAOSTAT and USDA PLANTS. This KG, integrated in the inference pipeline of MiniGPT-4, further expands the model’s contextual understanding and increases its reasoning capabilities; hence, more accurate results are generated in tasks like weed detection and crop monitoring. Empirical evaluations demonstrate that the KG-enhanced MiniGPT-4 significantly outperforms the baseline model on various performance metrics, including BLEU scores, METEOR, ROUGE, and CIDEr, in addition to lowering hallucination rates and improving object and relation coverage. While the quantized model exhibits a slight trade-off with respect to some performance measures, it still retains good functionality, which is acceptable for real-time agricultural applications. This work not only contributes to the technical integration of vision-language models with structured knowledge bases but also provides practical solutions to enhance precision agriculture robotics. The proposed system fosters more sustainable and productive farming practices by enabling smarter decision-making and automating complex agricultural tasks. Future research will look into dynamic Knowledge Graph updates, mechanisms for continual learning, and further applications in both agricultural and non-agricultural domains to further solidify the role of AI-driven solutions in modern agriculture. | |
dc.description.degree | M.S. | |
dc.identifier.uri | https://jewlscholar.mtsu.edu/handle/mtsu/7561 | |
dc.language.rfc3066 | en | |
dc.publisher | Middle Tennessee State University | |
dc.source.uri | http://dissertations.umi.com/mtsu:11966 | |
dc.subject | Deep learning | |
dc.subject | Image captioning | |
dc.subject | Natural language processing | |
dc.subject | Precision agriculture | |
dc.subject | Quantization | |
dc.subject | Vision language model | |
dc.subject | Computer science | |
dc.subject | Agriculture | |
dc.thesis.degreelevel | masters | |
dc.title | Integrating Vision-Language Models with Knowledge Graphs for Advancing AI-Driven Robotics and Precision Agriculture |
Files
Original bundle
- Name: Vundavilli_mtsu_0170N_11966.pdf
- Size: 2.09 MB
- Format: Adobe Portable Document Format
License bundle
- Name: license.txt
- Size: 2.27 KB
- Format: Item-specific license agreed upon to submission