
Software developers often struggle to locate and adapt existing models efficiently, which leads to delays. Retrieval-Augmented Generation (RAG) can streamline this process by automatically retrieving relevant artifacts from large repositories, accelerating model generation in Model-Driven Engineering (MDE).
However, implementing RAG systems presents challenges such as ensuring contextual relevance, managing scalability in large datasets, and maintaining optimal performance. The complexity further increases when dealing with diverse unstructured data types like text, images, SmartArt, and charts.
This paper presents an advanced RAG pipeline designed to enhance MDE processes, applied to querying a corpus of more than 2,000 PowerPoint documents. Key focus areas include:
- Pre-processing Layer for Enhanced Retrieval Efficiency
- Scalable Architecture for Large-Scale Unstructured Data Handling
- Evaluation Framework Tailored for Unstructured Data Retrieval
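
To make the retrieval step concrete, the following is a minimal, standard-library-only sketch of how pre-processed slide text might be ranked against a query and assembled into a prompt. It is an illustration under stated assumptions, not the pipeline described in this paper: the slide records in `SLIDES`, the function names, and the bag-of-words cosine scoring are all hypothetical stand-ins, and a production system would more likely use learned embeddings and a vector index.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, slides, k=2):
    """Return up to k slide records with non-zero similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(s["text"]))), s) for s in slides]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for score, s in scored[:k] if score > 0]


def build_prompt(query, slides, k=2):
    """Assemble retrieved slide text into a context block for a generator."""
    hits = retrieve(query, slides, k)
    context = "\n".join(
        f"[{h['deck']} slide {h['slide']}] {h['text']}" for h in hits
    )
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical output of a pre-processing layer: one record per slide,
# with text already extracted from shapes, SmartArt, and chart captions.
SLIDES = [
    {"deck": "arch.pptx", "slide": 3,
     "text": "Class diagram for the order management domain model"},
    {"deck": "arch.pptx", "slide": 7,
     "text": "Chart: quarterly deployment frequency"},
    {"deck": "mde.pptx", "slide": 2,
     "text": "State machine model for payment workflow"},
]

if __name__ == "__main__":
    print(build_prompt("domain model for order management", SLIDES))
```

Keeping the deck name and slide number on each record means the retrieved context can be traced back to its source slide, which is also what a retrieval evaluation would need in order to compare results against labeled relevant slides.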