Under the hood

How AuditGuardX Works

No black box. Here is exactly how AuditGuardX processes your documents, analyzes compliance, and generates audit-ready output, going from upload to report in under 2 minutes.

Document Intelligence Pipeline

Every document passes through 5 stages. Each stage is designed for accuracy, speed, and traceability.

Document Upload & Extraction

Documents are uploaded via the web interface or API and stored encrypted in Google Cloud Storage.

Supported formats: PDF, DOCX, XLSX, images (OCR via Tesseract)
Max file size: 50 MB per document
Text extraction with layout preservation
Automatic content type detection and metadata extraction
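
As a rough illustration of the upload checks listed above, the following is a minimal sketch, not AuditGuardX's actual code. The function name classify_upload, the specific image extensions, and the reading of 50 MB as 50 × 1024 × 1024 bytes are all assumptions made for the example.

```python
from pathlib import Path

# Assumption: "50 MB" is treated as binary megabytes.
MAX_BYTES = 50 * 1024 * 1024

# Assumption: the image extensions are illustrative; the page only says "images".
SUPPORTED = {
    ".pdf": "pdf",
    ".docx": "docx",
    ".xlsx": "xlsx",
    ".png": "image",
    ".jpg": "image",
    ".jpeg": "image",
    ".tiff": "image",
}


def classify_upload(filename: str, size_bytes: int) -> str:
    """Return the document kind, or raise ValueError if the upload is rejected."""
    if size_bytes > MAX_BYTES:
        raise ValueError(f"{filename}: exceeds the {MAX_BYTES} byte limit")
    suffix = Path(filename).suffix.lower()
    kind = SUPPORTED.get(suffix)
    if kind is None:
        raise ValueError(f"{filename}: unsupported format {suffix!r}")
    return kind
```

In a sketch like this, files classified as "image" would be the ones routed to OCR, while the other kinds go straight to text extraction.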

Semantic Chunking & Embedding

Extracted text is broken into semantically meaningful sections and converted to vector embeddings.

Context-preserving semantic chunking (not naive splitting)
384-dimensional embeddings via all-MiniLM-L6-v2 model
Vectors stored in PostgreSQL with pgvector extension
Enables hybrid search: semantic similarity + keyword matching
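
To make the idea of hybrid search concrete, here is a small pure-Python sketch that blends cosine similarity with keyword overlap. It is an illustration under stated assumptions, not the product's query path: in practice the vector comparison would run inside PostgreSQL via pgvector, the vectors would be 384-dimensional, and the blending weight alpha and the function names are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of query words that appear verbatim in the chunk text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def hybrid_rank(query, query_vec, chunks, alpha=0.7):
    """Rank (text, vector) chunks by a weighted mix of both signals.

    alpha is an assumed weight: 1.0 means purely semantic, 0.0 purely keyword.
    """
    scored = [
        (alpha * cosine(query_vec, vec) + (1 - alpha) * keyword_score(query, text), text)
        for text, vec in chunks
    ]
    return sorted(scored, key=lambda item: item[0], reverse=True)
```

The semantic term lets a query about "password rotation" find a chunk that says "passwords must rotate", while the keyword term rewards exact matches on identifiers such as control numbers that embeddings tend to blur.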

AI Compliance Analysis

Multi-provider AI maps document content to regulatory controls, scores compliance, and identifies gaps.

Multi-provider AI routing: Vertex AI, Groq, Cerebras with automatic fallbacks
3,485+ controls across 39 frameworks evaluated per document
Confidence scoring with evidence citations for each finding
Severity classification: critical, high, medium, low, informational
AI-generated remediation suggestions with corrected clause text
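
The automatic fallback between providers can be pictured with a sketch like the one below. It is a simplified assumption of how such routing might look: the function and exception names are hypothetical, and each provider is represented by a plain callable rather than a real Vertex AI, Groq, or Cerebras client.

```python
from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has errored."""


def call_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, callable) in order; return (provider_name, response)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider error triggers the next fallback
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

Returning the provider name alongside the response keeps each finding traceable to the model that produced it.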

Conversational Voice AI Interface

Ask questions and get answers by voice: Whisper transcribes spoken input to text, and a multi-provider text-to-speech chain reads the response aloud.

Multi-provider TTS: Groq Orpheus → Gemini TTS → local Piper (sherpa-onnx) fallback chain
Whisper speech-to-text (whisper-large-v3-turbo) for voice input transcription
Streaming TTS with sentence prefetching for near-zero playback gaps
Natural language access to compliance, document, and knowledge base tools via voice
2 input modes: push-to-talk (spacebar) and voice-activation (hands-free)
Silero VAD v5 (Voice Activity Detection) with barge-in support for hands-free operation
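
Sentence prefetching, as mentioned in the list above, can be sketched roughly as follows: while one sentence is playing, the next is already being synthesized. This is an illustrative sketch only. The synthesize callable stands in for whichever TTS provider is active, and the regex-based sentence splitter and function names are assumptions.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def stream_with_prefetch(
    text: str, synthesize: Callable[[str], bytes]
) -> Iterator[bytes]:
    """Yield audio per sentence, synthesizing sentence n+1 while n is consumed."""
    sentences = split_sentences(text)
    if not sentences:
        return
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(synthesize, sentences[0])
        for upcoming in sentences[1:]:
            audio = pending.result()
            # Start the next sentence before handing the current one to the player.
            pending = pool.submit(synthesize, upcoming)
            yield audio
        yield pending.result()
```

Because the next request is submitted before the current audio is yielded, synthesis latency overlaps with playback instead of adding a pause between sentences.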

Report Generation & Output

AI generates structured compliance reports with executive summaries, gap analyses, and remediation roadmaps.

Executive summary with AI-generated narrative
Control-by-control assessment with pass/fail/partial status
Gap analysis per framework with severity-weighted prioritization
Remediation suggestions with corrected policy language
PDF export with audit-ready formatting
Compliance score trends and historical tracking
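
One way to picture severity-weighted prioritization is the sketch below. The numeric weights, the half credit for a partial status, the control IDs, and the function names are all assumptions chosen for illustration; the actual scoring formula is not specified here.

```python
# Assumed weights: higher severity counts more toward the score.
SEVERITY_WEIGHT = {"critical": 5, "high": 4, "medium": 3, "low": 2, "informational": 1}

# Assumed credit per control status.
STATUS_CREDIT = {"pass": 1.0, "partial": 0.5, "fail": 0.0}


def compliance_score(findings: list[dict]) -> float:
    """Severity-weighted percentage of credit earned across all controls."""
    total = sum(SEVERITY_WEIGHT[f["severity"]] for f in findings)
    if total == 0:
        return 0.0
    earned = sum(
        SEVERITY_WEIGHT[f["severity"]] * STATUS_CREDIT[f["status"]] for f in findings
    )
    return round(100 * earned / total, 1)


def prioritized_gaps(findings: list[dict]) -> list[dict]:
    """Non-passing controls, most severe first; outright failures before partials."""
    gaps = [f for f in findings if f["status"] != "pass"]
    return sorted(
        gaps,
        key=lambda f: (-SEVERITY_WEIGHT[f["severity"]], STATUS_CREDIT[f["status"]]),
    )


findings = [
    {"control": "AC-1", "severity": "critical", "status": "fail"},
    {"control": "AC-2", "severity": "low", "status": "pass"},
    {"control": "AC-3", "severity": "medium", "status": "partial"},
]
print(compliance_score(findings))                         # 35.0
print([g["control"] for g in prioritized_gaps(findings)])  # ['AC-1', 'AC-3']
```

With weighting of this kind, a single failed critical control pulls the score down far more than a failed low-severity one, which matches the ordering of the remediation roadmap.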

Real Processing Timeline

Actual timestamps from a 42-page policy document analysis.