Micro AI Service Platform (v1.0.0)

Comprehensive guides and API documentation for the Micro AI platform.

Getting Started with Micro AI
A microservices-oriented (MSO) architecture for AI model deployment, routing, logging, and monitoring

Overview

Micro AI is a comprehensive microservices platform that provides LLM servers (LLAMA_CPP_PY, VLLM), intelligent routing through LiteLLM and an NGINX gateway, and a complete observability stack. The platform enables easy deployment and management of AI services with real-time monitoring and scaling.

Architecture Components

Core: Gateway & Routing
  • NGINX Gateway - Front-facing request router
  • LiteLLM Router - AI model management and routing
  • Service Manager - Lifecycle orchestration
AI/ML: Processing Services
  • Text Tools - Chunking, tokenization, NLP
  • Translator - Machine translation via LiteLLM
  • Adapters - Model-to-API mapping
Monitoring: Observability Stack
  • LangFuse - LLM application monitoring
  • Grafana - Metrics visualization
  • Prometheus - Metrics collection
  • Checkmate - Uptime monitoring
Infrastructure: Data & Storage
  • PostgreSQL - Primary database
  • Redis - Caching and temporary storage
  • ClickHouse - Analytics database
  • MinIO - Object storage

Available Endpoints

  • /llm_router/v1 - OpenAI API compatible endpoint for LLM services
  • /service-manager - Service management with UI at /service-manager/ui
  • /text_tools - Text processing APIs (chunking, tokenization)
  • /translator - Machine translation service
  • /langfuse - LLM observability platform
  • /grafana - Monitoring dashboards
  • /services - List available local models and services

Complete endpoint list: for documentation of all available endpoints, visit https://microai.staging.sirenanalytics.com/microservices
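As a sketch of how these endpoints can be queried, the snippet below fetches the /services catalogue with Python's standard library. The response shape and whether this particular route requires an API key are assumptions, not confirmed by this page:

```python
import json
import urllib.request

GATEWAY = "https://microai.staging.sirenanalytics.com"

def list_services(opener=urllib.request.urlopen):
    """Return the gateway's catalogue of local models and services.

    `opener` is injectable so the call can be stubbed in tests.
    Depending on your deployment, an Authorization header may also
    be required (assumption).
    """
    with opener(f"{GATEWAY}/services", timeout=10) as resp:
        return json.load(resp)
```

The same pattern applies to the other read-only endpoints such as /grafana or /langfuse, which are browser-facing UIs rather than JSON APIs.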

Quick Start Guide

1. Service Management

  • Navigate to Service Manager to load/unload local models
  • Create new services by selecting available models
  • Configure memory utilization and scaling parameters
  • Monitor service health and resource usage in real-time

2. API Integration

Base URL: https://microai.staging.sirenanalytics.com/llm_router/v1
Authentication: Bearer microai-key-xxx

Use the base URL above as the endpoint for your OpenAI client so requests are routed through the LiteLLM gateway. All requests require a valid Micro AI API key for authentication.
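As an illustrative sketch, the snippet below constructs an OpenAI-compatible chat completion request against the base URL above. The model name is a placeholder, and "microai-key-xxx" must be replaced with your actual key:

```python
import json
import urllib.request

BASE_URL = "https://microai.staging.sirenanalytics.com/llm_router/v1"
API_KEY = "microai-key-xxx"  # placeholder from the docs; substitute your real key

def build_chat_request(model, prompt):
    """Construct (but do not send) a POST to the OpenAI-compatible
    /chat/completions route exposed through the LiteLLM gateway."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("my-model", "Hello from Micro AI!")
# send with urllib.request.urlopen(req) once a valid key is configured
```

Any OpenAI SDK can be pointed at the same gateway by setting its base_url and api_key options to the values shown above.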

3. Monitoring & Observability

  • Access Grafana dashboards for system metrics
  • Use LangFuse for LLM application performance monitoring
  • Monitor container resources with cAdvisor
  • Track GPU utilization with DCGM integration

Service Profiles

The platform uses Docker Compose profiles to organize services by functionality:

  • text_utils
  • logging
  • monitoring
  • service_manager
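A minimal compose sketch of how such profiles gate which services start; the service and image names here are illustrative, not the platform's actual compose file:

```yaml
# docker-compose.yml (illustrative fragment)
services:
  grafana:
    image: grafana/grafana
    profiles: ["monitoring"]   # started only when --profile monitoring is given
  text_tools:
    image: microai/text-tools  # hypothetical image name
    profiles: ["text_utils"]
```

With profiles in place, `docker compose --profile monitoring up -d` starts only the monitoring services, leaving the other groups untouched.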

Additional Resources

Migration Guide

Complete guide for migrating existing applications to Micro AI


LiteLLM Documentation

Complete API reference for all available endpoints
