Documentation
Comprehensive guides and API documentation for the Micro AI platform
Overview
Micro AI is a comprehensive microservices platform that provides LLM servers (LLAMA_CPP_PY, VLLM), intelligent routing through LiteLLM and an NGINX gateway, and a complete observability stack. The platform enables easy deployment and management of AI services with real-time monitoring and scaling capabilities.
Architecture Components
- NGINX Gateway - Front-facing request router
- LiteLLM Router - AI model management and routing
- Service Manager - Lifecycle orchestration
- Text Tools - Chunking, tokenization, NLP
- Translator - Machine translation via LiteLLM
- Adapters - Model-to-API mapping
- LangFuse - LLM application monitoring
- Grafana - Metrics visualization
- Prometheus - Metrics collection
- Checkmate - Uptime monitoring
- PostgreSQL - Primary database
- Redis - Caching and temporary storage
- ClickHouse - Analytics database
- MinIO - Object storage
Available Endpoints
- /llm_router/v1 - OpenAI API compatible endpoint for LLM services
- /service-manager - Service management, with UI at /service-manager/ui
- /text_tools - Text processing APIs (chunking, tokenization)
- /translator - Machine translation service
- /langfuse - LLM observability platform
- /grafana - Monitoring dashboards
- /services - List available local models and services
Complete Endpoint List: For a comprehensive list of all available endpoints and their documentation, visit https://microai.staging.sirenanalytics.com/microservices
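As a quick sanity check, the /services endpoint can be queried directly. This is a minimal sketch using only the Python standard library; the bearer-token header and the JSON response shape are assumptions based on the authentication scheme described below, not confirmed API details.

```python
import json
import urllib.request

BASE_URL = "https://microai.staging.sirenanalytics.com"

def list_services(token: str) -> dict:
    """Fetch the catalogue of local models and services from /services."""
    req = urllib.request.Request(
        f"{BASE_URL}/services",
        # Assumed auth scheme: the same bearer token used for the LLM router.
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage (replace the placeholder with your Micro AI API key):
# services = list_services("microai-key-xxx")
# print(json.dumps(services, indent=2))
```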
Quick Start Guide
1. Service Management
- Navigate to Service Manager to load/unload local models
- Create new services by selecting available models
- Configure memory utilization and scaling parameters
- Monitor service health and resource usage in real-time
2. API Integration
Base URL: https://microai.staging.sirenanalytics.com/llm_router/v1
Authorization: Bearer microai-key-xxx

Use the base URL above as your OpenAI client endpoint to properly route through the LiteLLM gateway. All requests require a valid Micro AI API key for authentication.
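Because the endpoint is OpenAI API compatible, any OpenAI client can target it by overriding the base URL. The sketch below uses only the standard library to POST to the standard /chat/completions route; the model name is a placeholder, since the actual names depend on what the Service Manager has loaded.

```python
import json
import urllib.request

API_BASE = "https://microai.staging.sirenanalytics.com/llm_router/v1"
API_KEY = "microai-key-xxx"  # your Micro AI API key

def chat(model: str, prompt: str) -> str:
    """Send a single chat-completion request through the LiteLLM gateway."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=payload,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Standard OpenAI response shape: first choice, assistant message content.
    return body["choices"][0]["message"]["content"]

# Usage ("llama-3-8b" is a hypothetical model name):
# print(chat("llama-3-8b", "Hello!"))
```

Equivalently, the official `openai` Python client works by passing this URL as `base_url` and the Micro AI key as `api_key` when constructing the client.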
3. Monitoring & Observability
- Access Grafana dashboards for system metrics
- Use LangFuse for LLM application performance monitoring
- Monitor container resources with cAdvisor
- Track GPU utilization with DCGM integration
Service Profiles
The platform uses Docker Compose profiles to organize services by functionality: