Automation & AI

AI Gateway (Light LLM Proxy)

Secure Internal AI Endpoint with Cost Management & Guardrails

Developed a secure, centralized internal AI gateway (Light LLM proxy) that manages AI queries across multiple vendors, provides custom security guardrails, cost tracking, rate limiting, and integrates with enterprise infrastructure for factually dense, context-aware AI interactions.

Client

Luxottica

Completion

3 months

Automation & AI

Situation

The organization needed a secure, centralized solution for managing AI interactions across multiple vendors while addressing security concerns around sensitive data exposure. Existing solutions lacked cost visibility, rate limiting capabilities, and the ability to enforce custom security guardrails. There was also a need to integrate real-time infrastructure data for context-aware AI interactions without permanently storing sensitive information.

Task

Design and implement a vendor-agnostic AI gateway that provides a single internal endpoint for AI queries, enforces custom security guardrails, tracks costs, implements rate limiting, supports vector stores for enhanced context, and integrates with enterprise systems like Azure for real-time infrastructure data.

Action

→Architected a flexible, vendor-agnostic backend supporting multiple AI models from Anthropic (Claude), OpenAI, Perplexity, Google's RAG solution, and local providers like Ollama
→Implemented full reverse compatibility with OpenAI API framework, enabling seamless integration into existing development environments and tools like Visual Studio and Cursor
→Developed custom guardrails system that forces security snippets into queries regardless of user input, enforcing boundaries such as professional tone and preventing exposure of internal company information
→Implemented pattern detection and blocking for sensitive data like credit card numbers, PII, and Social Security numbers
→Built comprehensive logging and traceability system providing accountability for all AI usage across the organization
→Designed access control system using user or team-specific API keys, allowing granular assignment of access to specific models, vectors, and MCP tools
→Integrated cost tracking and import functionality enabling detailed cost-benefit analysis for different flows and models
→Implemented rate and budget limiting capabilities, allowing throttling of usage (tokens per minute, requests per minute) based on models, teams, or individual users
→Developed dashboard for comparing how different models operate on the same flows, enabling cost-effective model selection based on required features
→Created vector store support for SQL databases and Azure AI vector stores, enabling recall of factually dense artifacts like KC exported PDFs or SOPs for high-quality outputs
→Implemented Multi-protocol Communication Protocol (MCP) integration, specifically AZ Tools, leveraging Python AZ CLI authentication to retrieve bearer tokens for Azure Management API
→Designed ephemeral data handling for Azure infrastructure data, ensuring bearer tokens and associated data stores are erased upon conclusion of MCP discussions

Results

✓Established a secure, centralized endpoint reducing vendor risk and enhancing compliance by keeping sensitive queries within internal infrastructure
✓Enabled cost optimization through comprehensive tracking and comparison capabilities, allowing selection of cost-effective models (e.g., Gemini for large input tokens vs GPT-4o)
✓Improved security posture through custom guardrails preventing accidental exposure of sensitive information and enforcing organizational policies
✓Provided granular access control enabling teams to access only appropriate models and tools based on their needs and authorization levels
✓Enhanced AI output quality by integrating vector stores with factually dense organizational artifacts, avoiding context window limitations of large, generalized data stores
✓Enabled real-time infrastructure insights through AZ Tools MCP integration, providing factually accurate Azure infrastructure data without permanent storage
✓Demonstrated proof of concept for secure, internal AI management that can scale across the organization
✓Reduced reliance on external AI endpoints for sensitive queries, improving data privacy and compliance

Technologies Used

Light LLMOpenAI APIAnthropic ClaudePerplexityGoogle RAGOllamaVector StoresSQL DatabaseAzure AIMCP (Multi-protocol Communication Protocol)PythonAzure CLIAzure Management APIRESTful APIAPI GatewayCost TrackingRate LimitingCustom Guardrails

Security Skills Applied

AI Security & AlignmentData Security & PrivacyAccess Control & RBACAPI SecurityCost OptimizationRate LimitingCustom GuardrailsCompliance ManagementIdentity & Access Management (IAM)Secure API Design

Get in touch

AI Gateway (Light LLM Proxy)

Luxottica

3 months

Automation & AI

Situation

Task

Action

Results

Technologies Used

Security Skills Applied

Let's connect

Ilya Sulakov

[email protected]

Cincinnati, Ohio, United States