
Implementing a Zero Trust RAG Architecture in Education

A joint project by aihorizon R&D, Microsoft’s AI Factory and the Educational Foundation Freiburg

This project originated from a nomination process, positioning the Educational Foundation Freiburg as a pivotal partner for AI-driven educational innovation. It initiated a strategic collaboration between aihorizon R&D and Microsoft’s AI Factory, dedicated to developing a secure, enterprise-ready Retrieval-Augmented Generation (RAG) solution using the Azure GPT-RAG Solution Accelerator.

The targeted use cases focused on enhancing curriculum navigation and administrative knowledge management, two domains characterized by structured and semi-structured content that has traditionally been difficult to query and automate. Adopting Zero Trust principles from the outset ensured that the AI capabilities met the stringent, enterprise-grade security requirements of the educational sector.

Project Objectives

The initiative revolved around three primary objectives:

Deploy a secure, enterprise-class GPT-RAG infrastructure within Azure, grounded in Zero Trust principles.
Showcase practical educational use cases leveraging real-world, operationally relevant data, and create and use a gold standard data set to enhance RAG performance and reliability.
Enable AI-driven workflows to boost productivity and streamline access to institutional knowledge, while ensuring stringent governance and transparent audit trails.
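
A gold standard data set such as the one named in the second objective is typically used to measure retrieval quality before and after a change. The following is a minimal sketch under stated assumptions: the source does not describe the data set's format or the metric used, so the `GoldExample` structure, the hit rate at k metric, the document IDs, and the `toy_retrieve` function are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class GoldExample:
    question: str
    relevant_ids: Set[str]  # IDs of documents a correct answer should draw on


def hit_rate_at_k(examples: List[GoldExample],
                  retrieve: Callable[[str], List[str]],
                  k: int = 3) -> float:
    """Fraction of questions with at least one relevant document in the top k."""
    if not examples:
        return 0.0
    hits = 0
    for ex in examples:
        top_k = retrieve(ex.question)[:k]
        if ex.relevant_ids.intersection(top_k):
            hits += 1
    return hits / len(examples)


# Hypothetical retriever returning ranked document IDs; a real system
# would query the search index here.
def toy_retrieve(question: str) -> List[str]:
    ranking = {
        "Which competences are required for grade 9 English?":
            ["eng-9", "eng-8", "geo-7"],
        "Identify cross-curricular connections in geography.":
            ["his-9", "eng-9", "geo-7"],
    }
    return ranking.get(question, [])


gold = [
    GoldExample("Which competences are required for grade 9 English?", {"eng-9"}),
    GoldExample("Identify cross-curricular connections in geography.", {"geo-7"}),
]

score_at_1 = hit_rate_at_k(gold, toy_retrieve, k=1)
score_at_3 = hit_rate_at_k(gold, toy_retrieve, k=3)
print(score_at_1, score_at_3)
```

Tracking a score like this across changes to chunking or prompts makes regressions visible, although which metric suits a given deployment is a judgment call.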

Architecture Overview

Rooted in Zero Trust

Meeting rigorous security and compliance mandates required embedding the GPT-RAG solution within a dedicated Zero Trust landing zone, adhering strictly to Microsoft's Cloud Adoption Framework and Enterprise-Scale Architecture best practices.

Core Architectural Components

Virtual Network Isolation: Deployment of all services (App Service, Azure Cognitive Search, Cosmos DB, Azure OpenAI, Key Vault) through private endpoints exclusively, eliminating public internet exposure.
Hub & Spoke Networking: Secure network traffic management via a central Hub VNet interconnected through VNet Peering to distinct service-specific VNets.
Azure Front Door and Web Application Firewall: Ensured secure ingress with global reach and robust protection against OWASP vulnerabilities.
Azure Bastion: Provided secure administrative access to resources without exposing public IP addresses.
Private DNS and Azure DNS Resolver: Maintained secure internal DNS resolution confined within network boundaries.
Microsoft Entra ID-Based AI Access Control: Enabled granular permissions aligned with specific operational roles.
Governance and Monitoring Tools (Azure Policy, Monitor, Defender for Cloud): Enforced proactive governance, continuous system monitoring, and robust security posture.
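
The private-endpoint-only requirement from the list above lends itself to an automated audit. The following is a minimal sketch, not the project's tooling: the `Resource` fields and the sample inventory are hypothetical, and a real check would read these properties from Azure (for example via Azure Policy) rather than from hand-written records.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Resource:
    name: str
    kind: str
    public_network_access: bool  # hypothetical flag: True if reachable publicly
    private_endpoints: int       # hypothetical count of attached private endpoints


def find_exposed(resources: List[Resource]) -> List[str]:
    """Return one finding per violation of the private-endpoint-only rule."""
    findings = []
    for r in resources:
        if r.public_network_access:
            findings.append(f"{r.name}: public network access is enabled")
        if r.private_endpoints == 0:
            findings.append(f"{r.name}: no private endpoint configured")
    return findings


# Illustrative inventory; the last entry is deliberately misconfigured
# so that the audit has something to report.
inventory = [
    Resource("search", "Azure Cognitive Search", False, 1),
    Resource("openai", "Azure OpenAI", False, 1),
    Resource("cosmos", "Cosmos DB", True, 0),
]

findings = find_exposed(inventory)
for line in findings:
    print(line)
```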

Deployment relied exclusively on infrastructure-as-code (IaC) through the Azure Developer CLI (azd), which automated provisioning, network configuration, secret management, and application deployment.

Use Case 1: Teacher-Focused Curriculum Assistant

A prominent use case involved indexing curriculum documentation from official educational frameworks provided by the federal state of Baden-Württemberg.

Data Pipeline: Documents were systematically chunked, embedded using OpenAI-generated embeddings, and indexed within Azure Cognitive Search.
Interactive Chat Interface: Teachers used a secure, intuitive web app to query curricula in natural language (e.g., "Which competences are required for grade 9 English?", "Identify cross-curricular connections in geography.").
Pilot Feedback: Teacher evaluations highlighted significant improvements in usability and response accuracy, substantially reducing time spent searching complex educational resources.
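
The chunk, embed, and index steps of the data pipeline can be illustrated with a small, self-contained sketch. It is not the project's implementation: `toy_embed` is a word-count placeholder standing in for the OpenAI embeddings, `InMemoryIndex` stands in for Azure Cognitive Search, and the chunk size, overlap, and sample texts are assumptions chosen only to keep the example short.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def chunk_words(text: str, size: int = 40, overlap: int = 10) -> List[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text: str) -> Counter:
    """Placeholder embedding: lowercase word counts (NOT a learned embedding)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class InMemoryIndex:
    """Hypothetical stand-in for a search index: stores chunks and their vectors."""

    def __init__(self) -> None:
        self.entries: List[Tuple[str, str, Counter]] = []

    def add(self, doc_id: str, text: str, size: int = 40, overlap: int = 10) -> None:
        for i, chunk in enumerate(chunk_words(text, size, overlap)):
            self.entries.append((f"{doc_id}#{i}", chunk, toy_embed(chunk)))

    def search(self, query: str, k: int = 3) -> List[Tuple[str, str]]:
        q = toy_embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[2]), reverse=True)
        return [(chunk_id, text) for chunk_id, text, _ in ranked[:k]]


index = InMemoryIndex()
index.add("english-9",
          "Grade 9 English competences include listening reading speaking and writing",
          size=6, overlap=2)
index.add("geography",
          "Geography links to history and biology through cross-curricular projects",
          size=6, overlap=2)

top = index.search("grade 9 English competences", k=1)
print(top)
```

Overlapping windows reduce the chance that a sentence relevant to a query is split across two chunks, at the cost of indexing some text twice.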

This prototype notably improved accessibility to detailed curricular information typically buried in PDFs or convoluted website structures.

Use Case 2: AI Assistant for Administrative Operations

The administrative division of the Educational Foundation Freiburg benefitted significantly from an AI assistant designed to facilitate navigating internal legal, procedural, and policy documentation.

Conversational Querying: Enabled natural language interaction with indexed administrative content.
Secure Data Management: Institutional knowledge securely stored in Cosmos DB, augmented with rich metadata, and effectively indexed for rapid retrieval.
Knowledge Access: Enhanced access to previously siloed documentation, fostering greater administrative efficiency.
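
Metadata attached to stored documents can also be used to narrow what a query is allowed to see. The following is a minimal sketch of that idea and rests on assumptions: the source does not say which metadata fields were stored or whether retrieval was filtered by role, so the `Record` fields, role names, and sample records are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Record:
    doc_id: str
    doc_type: str                  # e.g. "legal", "procedural", "policy"
    allowed_roles: FrozenSet[str]  # hypothetical metadata: roles that may read it
    text: str


def visible_records(records: Iterable[Record],
                    user_roles: Iterable[str],
                    doc_type: Optional[str] = None) -> List[Record]:
    """Keep only records the user may read, optionally restricted to one type."""
    roles = set(user_roles)
    result = []
    for r in records:
        if not (r.allowed_roles & roles):
            continue  # user holds none of the roles permitted for this record
        if doc_type is not None and r.doc_type != doc_type:
            continue
        result.append(r)
    return result


records = [
    Record("hr-policy", "policy", frozenset({"administration"}), "Leave policy"),
    Record("bylaws", "legal", frozenset({"administration", "board"}), "Bylaws"),
    Record("board-minutes", "procedural", frozenset({"board"}), "Minutes"),
]

admin_view = [r.doc_id for r in visible_records(records, {"administration"})]
admin_legal = [r.doc_id for r in visible_records(records, {"administration"}, "legal")]
print(admin_view, admin_legal)
```

Applying the filter before retrieval, rather than after generation, keeps restricted text out of the model's context entirely.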

Results and Impact

✅ Zero Trust Adoption: Successfully demonstrated Zero Trust's efficacy in securing advanced AI implementations, leading the foundation to adopt Zero Trust principles in its broader IT strategy.
✅ Enhanced Productivity: Substantial adoption by educators and administrative staff validated the value of AI-driven conversational access to complex institutional information.
✅ Foundation for Future Expansion: Encouraged by pilot outcomes, partners committed to broadening the scope and scale of RAG implementations.

Key Lessons Learned

Integrated Security is Essential: Adopting a Zero Trust approach from inception accelerated deployment and instilled stakeholder confidence.
Responsible AI in Education: Effective governance, transparency, and auditable design are critical to gaining and maintaining stakeholder trust.
Data Quality Enhances RAG Performance: Domain-specific performance significantly improved through careful content curation, embedding strategies, and refined prompt engineering.
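
As an illustration of the prompt engineering mentioned in the last lesson, a common pattern is to restrict the model to numbered, retrieved passages. The following is a minimal sketch only: the project's actual prompts are not given in the source, and the wording, the `build_prompt` helper, and the `max_chars` budget are assumptions.

```python
from typing import List, Tuple


def build_prompt(question: str,
                 passages: List[Tuple[str, str]],
                 max_chars: int = 2000) -> str:
    """Assemble a prompt that asks for an answer grounded in numbered sources."""
    lines = [
        "Answer the question using only the sources below.",
        "Cite sources by number, and say so if the sources do not contain the answer.",
        "",
        "Sources:",
    ]
    used = 0
    for n, (source_id, text) in enumerate(passages, start=1):
        if used + len(text) > max_chars:
            break  # stop once the character budget would be exceeded
        lines.append(f"[{n}] ({source_id}) {text}")
        used += len(text)
    lines += ["", f"Question: {question}", "Answer:"]
    return "\n".join(lines)


passages = [
    ("eng-9", "Grade 9 English covers listening, reading, speaking and writing."),
    ("geo-7", "Geography connects to history through regional case studies."),
]
prompt = build_prompt("Which competences are required for grade 9 English?", passages)
short_prompt = build_prompt("Which competences?", passages, max_chars=70)
print(prompt)
```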

Next Steps

The collaboration between aihorizon R&D, Microsoft, and the Educational Foundation Freiburg will continue to:

Broaden departmental RAG integrations.
Advance model fine-tuning and sophisticated prompt orchestration employing tools like AutoGen or Semantic Kernel.

Contributors: Dr. Michael Jülich (aihorizon R&D), Sonja Kellner (aihorizon R&D, Microsoft), Moritz Goeke (aihorizon R&D), Yacine Yakoubi (Microsoft), Tobias Oberrauch (Microsoft), Brij Gupta (Microsoft) and Maral Demirseçen (Microsoft).