Designing Lightweight AI Agents for Edge Deployment

A Minimal Capability Framework with Insights from Literature Synthesis

Appendix D: Agent Layer Diagrams

Contents Overview

This appendix provides detailed architectural diagrams for the three MCD layers (the Prompt Layer, the Stateless Control Layer, and the Execution Layer) and for the integrated fallback mechanisms. The diagrams show how MCD avoids orchestration-heavy pipelines while maintaining architectural discipline.

Purpose Statement

This appendix visually links the subsystem designs from Chapter 4 to the instantiated agent architecture in Chapter 5. It shows how the MCD principles (Minimality by Default, Bounded Rationality, Degeneracy Detection) manifest in concrete system architecture without requiring complex orchestration frameworks.

D.1 MCD Three-Layer Architectural Stack

Figure D.1: Complete MCD Layer Architecture

┌─────────────────────────────────────────────────────────────┐
│                 PROMPT LAYER (Section 4.3.1)                │
├─────────────────────────────────────────────────────────────┤
│  • 90-130 token capability plateau (Bounded Rationality)    │
│  • Zero-shot baseline prompting (Minimality by Default)     │
│  • Embedded fallback logic (Degeneracy Detection)           │
│  • Symbolic routing with IF-THEN decision trees             │
│                                                             │
│  Input: User Query → Intent Router → Decision Prompt        │
│  Output: Symbolic routing tokens + Execution instructions   │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│            STATELESS CONTROL LAYER (Section 4.3.2)          │
├─────────────────────────────────────────────────────────────┤
│  • In-prompt routing logic (No external orchestration)      │
│  • Deterministic fallback paths (Bounded Rationality)       │
│  • Symbolic decision trees (≤3 depth, ≤4 branches)          │
│  • Context regeneration without persistent memory           │
│                                                             │
│  Flow: Intent Classification → Route Selection → Context    │
│        Anchoring → Execution Triggering                     │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│               EXECUTION LAYER (Section 4.3.3)               │
├─────────────────────────────────────────────────────────────┤
│  • Q1/Q4/Q8 quantization tiers (Hardware-aware)             │
│  • Local inference only (WebAssembly/llama.cpp)             │
│  • Dynamic tier routing: Q1→Q4→Q8 (drift >10% threshold)    │
│  • Resource constraints: <512MB RAM, <500ms latency         │
│                                                             │
│  Components: Quantized LLM → Local Runtime → Response       │
└─────────────────────────────────────────────────────────────┘
                              ↓
                        RESPONSE OUTPUT
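
The decision-tree bound stated for the Stateless Control Layer (≤3 depth, ≤4 branches) can be checked mechanically. The following is a minimal sketch, not the thesis implementation: it assumes a routing tree is represented as nested dictionaries whose leaves are logic-branch names, and that depth counts decision levels (a bare leaf has depth 0). The names `Tree`, `depth`, `max_branches`, and `within_bounds` are hypothetical.

```python
from typing import Dict, Union

# Assumed representation: a leaf is a branch name (str); an inner node maps
# a condition label to a subtree.
Tree = Union[str, Dict[str, "Tree"]]


def depth(tree: Tree) -> int:
    """Number of decision levels; a bare leaf has depth 0."""
    if isinstance(tree, str):
        return 0
    return 1 + max((depth(child) for child in tree.values()), default=0)


def max_branches(tree: Tree) -> int:
    """Largest number of outgoing branches at any node in the tree."""
    if isinstance(tree, str):
        return 0
    return max([len(tree)] + [max_branches(child) for child in tree.values()])


def within_bounds(tree: Tree, max_depth: int = 3, max_width: int = 4) -> bool:
    """True when the tree respects the depth and branching limits."""
    return depth(tree) <= max_depth and max_branches(tree) <= max_width


# The intent router from Figure D.2, written in the assumed representation.
router: Tree = {
    "booking": "appointment_logic",
    "navigation": "spatial_logic",
    "diagnostic": "heuristic_logic",
    "else": "clarification_logic",
}
print(within_bounds(router))
```

Under these assumptions the Figure D.2 router has one decision level and four branches, so it sits exactly at the branching limit.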

D.2 Prompt Layer Internal Architecture

Figure D.2: Prompt Layer Design Pattern

USER INPUT
     ↓
┌─────────────────────────────────────────────────┐
│           PROMPT STRUCTURE                      │
├─────────────────────────────────────────────────┤
│  System: [Lightweight stateless assistant]      │
│  Context: [Compressed state tokens]             │
│  Intent Router (Symbolic Decision Tree):        │
│    • IF intent=booking → appointment_logic      │
│    • IF intent=navigation → spatial_logic       │
│    • IF intent=diagnostic → heuristic_logic     │
│    • ELSE → clarification_logic                 │
│  Fallback: [Bounded loops ≤2 iterations]        │
│  Output Format: [Structured symbolic tokens]    │
└─────────────────────────────────────────────────┘
     ↓
SYMBOLIC ROUTING DECISION
     ↓
EXECUTION PATHWAY

Key Components:

Token-efficient context packing: intent=book, time=today, specialty=neuro (explicit slot passing, T4 validation)
Embedded routing logic: Decision branches encoded as IF-THEN token patterns (Section 5.2.1)
Fallback safety: Bounded clarification loops (≤2 iterations, Anti-Pattern 4)
Adaptation patterns: Dynamic (W1/W3), Semi-Static (W2) routing strategies (Table 5.1)
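
To make the slot packing and IF-THEN routing concrete, here is a minimal sketch under stated assumptions. The function names (`pack_slots`, `unpack_slots`, `route`) and the `ROUTES` table are hypothetical; only the `key=value` slot format and the four branch names come from Figure D.2 and the list above. In the agent itself the routing is expressed inside the prompt, so this code only mirrors that logic for illustration.

```python
from typing import Dict

# Hypothetical route table mirroring the IF-THEN branches in Figure D.2.
ROUTES: Dict[str, str] = {
    "booking": "appointment_logic",
    "navigation": "spatial_logic",
    "diagnostic": "heuristic_logic",
}


def pack_slots(slots: Dict[str, str]) -> str:
    """Serialize slots as 'key=value, key=value' in insertion order."""
    return ", ".join(f"{key}={value}" for key, value in slots.items())


def unpack_slots(packed: str) -> Dict[str, str]:
    """Invert pack_slots; fragments without '=' or without a key are skipped."""
    slots: Dict[str, str] = {}
    for fragment in packed.split(","):
        key, sep, value = fragment.strip().partition("=")
        if sep and key:
            slots[key] = value
    return slots


def route(intent: str) -> str:
    """Return the logic branch for an intent, else the clarification branch."""
    return ROUTES.get(intent, "clarification_logic")


packed = pack_slots({"intent": "book", "time": "today", "specialty": "neuro"})
print(packed)            # intent=book, time=today, specialty=neuro
print(route("booking"))  # appointment_logic
print(route("weather"))  # clarification_logic
```

Keeping the packed form a flat string means the whole state can be reinjected into the next prompt without any session store.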

D.3 Stateless Control Layer Flow

Figure D.3: Control Layer Decision Logic

PROMPT INPUT
     ↓
┌─────────────────────────────────────────────────┐
│         INTENT CLASSIFICATION                   │
│  ┌─────────────┐  ┌─────────────┐  ┌──────────┐ │
│  │   BOOKING   │  │  NAVIGATION │  │DIAGNOSTIC│ │
│  │   Route A   │  │   Route B   │  │ Route C  │ │
│  │  (Dynamic)  │  │(Semi-Static)│  │(Dynamic) │ │
│  └─────────────┘  └─────────────┘  └──────────┘ │
└─────────────────────────────────────────────────┘
     ↓                  ↓                  ↓
ROUTE A: Booking      ROUTE B: Navigation   ROUTE C: Diagnostic
┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
│ • Dynamic slot  │  │ • Deterministic │  │ • Heuristic     │
│   extraction    │  │   coordinate    │  │   category      │
│ • Clarification │  │   calculation   │  │   routing       │
│ • Confirmation  │  │ • Landmark refs │  │ • Priority      │
│   (W1 pattern)  │  │   (W2 pattern)  │  │   (W3 pattern)  │
└─────────────────┘  └─────────────────┘  └─────────────────┘
     ↓                  ↓                  ↓
           FALLBACK ROUTE (if needed)
         ┌─────────────────────────┐
         │ • Bounded clarification │
         │ • Safe limitation exit  │
         │ • Controlled failure    │
         └─────────────────────────┘
                  ↓
            EXECUTION LAYER

Control Flow Characteristics:

No persistent state: Each decision cycle is self-contained (T4: 5/5 stateless success)
Symbolic routing: Token patterns trigger execution paths (Section 5.2.1)
Bounded fallback: A maximum of 2 recovery loops prevents semantic drift (T5: more than 3 steps cause drift)
Context regeneration: State reconstructed from explicit slot reinjection (Section 4.2)
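
The stateless property and context regeneration described above can be illustrated with a pure function that rebuilds the context on every turn from explicit inputs. This is a sketch, not the thesis code: `REQUIRED_SLOTS`, `regenerate_context`, `missing_slots`, `decision_cycle`, and the `clarify:<slot>` action format are all hypothetical, and the required slots are assumed for the booking route only.

```python
from typing import Dict, List, Tuple

# Hypothetical required slots for the booking route (Route A).
REQUIRED_SLOTS: Tuple[str, ...] = ("intent", "time", "specialty")


def regenerate_context(carried: Dict[str, str], new: Dict[str, str]) -> Dict[str, str]:
    """Build this turn's context from explicit inputs only; newer values win."""
    return {**carried, **new}


def missing_slots(context: Dict[str, str]) -> List[str]:
    """List required slots that are absent or empty, in declaration order."""
    return [name for name in REQUIRED_SLOTS if not context.get(name)]


def decision_cycle(carried: Dict[str, str], new: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """One self-contained cycle: no globals are read or written."""
    context = regenerate_context(carried, new)
    gaps = missing_slots(context)
    action = "execute" if not gaps else f"clarify:{gaps[0]}"
    return action, context


# Turn 1: the caller passes slots in and receives the updated slots back.
action, slots = decision_cycle({"intent": "booking"}, {"time": "today"})
print(action)  # clarify:specialty

# Turn 2: the previous slots are reinjected explicitly.
action, slots = decision_cycle(slots, {"specialty": "neuro"})
print(action)  # execute
```

Because the function neither reads nor mutates shared state, any turn can be replayed from its inputs alone.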

D.4 Execution Layer Quantization Architecture

Figure D.4: Tiered Execution Model

TASK COMPLEXITY ASSESSMENT
     ↓
┌─────────────────────────────────────────────────┐
│            TIER SELECTION LOGIC (T10)           │
├─────────────────────────────────────────────────┤
│  Q1: Ultra-minimal (Qwen2-0.5B, 300MB RAM)      │
│      ↓ (if semantic drift >10%)                 │
│  Q4: Optimal balance (TinyLlama-1.1B, 560MB)    │
│      ↓ (if performance <80% or timeout)         │
│  Q8: Strategic fallback (Llama-3.2-1B, 800MB)   │
│                                                 │
│  Evidence: Q4 optimal for 80% of tasks (T10)    │
└─────────────────────────────────────────────────┘
     ↓
┌─────────────────────────────────────────────────┐
│          LOCAL EXECUTION RUNTIME (T8)           │
├─────────────────────────────────────────────────┤
│  WebAssembly Runtime (Browser deployment)       │
│  OR                                             │
│  llama.cpp (Native/Raspberry Pi deployment)     │
│  OR                                             │
│  WebLLM (JavaScript-based inference)            │
│                                                 │
│  Validated Constraints (T8):                    │
│  • No backend servers (edge-first principle)    │
│  • Local inference only                         │
│  • <500ms average latency (Q4 tier: 430ms)      │
│  • <512MB memory stable deployment              │
└─────────────────────────────────────────────────┘
     ↓
RESPONSE OUTPUT
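
The escalation rules in the tier selection box of Figure D.4 can be sketched as a small decision function. The thresholds (drift above 10%, performance below 80%, or timeout) are taken from the figure; how drift and performance are measured is not specified here, so the `Observation` fields, expressed as fractions, are assumptions, as are the names `Observation` and `next_tier`.

```python
from dataclasses import dataclass
from typing import Optional

DRIFT_THRESHOLD = 0.10  # Figure D.4: escalate Q1 -> Q4 when drift exceeds 10%
MIN_PERFORMANCE = 0.80  # Figure D.4: escalate Q4 -> Q8 when performance is below 80%


@dataclass(frozen=True)
class Observation:
    """Assumed per-task measurements, expressed as fractions in [0, 1]."""
    drift: float
    performance: float
    timed_out: bool = False


def next_tier(current: str, obs: Observation) -> Optional[str]:
    """Return the tier to escalate to, or None to stay on the current tier."""
    if current == "Q1" and obs.drift > DRIFT_THRESHOLD:
        return "Q4"
    if current == "Q4" and (obs.performance < MIN_PERFORMANCE or obs.timed_out):
        return "Q8"
    return None  # Q8 is the last tier, so it never escalates


print(next_tier("Q1", Observation(drift=0.12, performance=0.90)))  # Q4
print(next_tier("Q4", Observation(drift=0.02, performance=0.75)))  # Q8
print(next_tier("Q4", Observation(drift=0.02, performance=0.90)))  # None
```

The sketch only moves upward through the tiers, matching the arrows in the figure; whether the agent ever steps back down to a smaller tier is left open.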

D.5 Integrated Fallback Layer

Figure D.5: Fallback Recovery Paths

TASK EXECUTION
     ↓
MONITORING LAYER (Continuous Validation)
┌───────────────────────────────────────────────────┐
│  • Semantic Drift Detection (>10% threshold, T10) │
│  • Confidence Scoring (below threshold triggers)  │
│  • Response Timeout (>latency limit detection)    │
│  • Input Ambiguity (unclear intent classification)│
└───────────────────────────────────────────────────┘
     ↓ (if failure detected)
┌─────────────────────────────────────────────────┐
│            BOUNDED FALLBACK SEQUENCE            │
├─────────────────────────────────────────────────┤
│  Loop 1: Specific clarification request         │
│    "Please specify [missing_slot]"              │
│         ↓ (if still unclear)                    │
│  Loop 2: Bounded options or constraints         │
│    "Choose: [option_A, option_B, option_C]"     │
│         ↓ (if continued failure, max depth=2)   │
│  Safe Exit: Transparent limitation              │
│    "Unable to complete [task]. Limitation:      │
│     [specific_constraint]. Please [action]."    │
└─────────────────────────────────────────────────┘
     ↓
CONTROLLED TERMINATION (T7: 80% success)

Fallback Characteristics (Empirically Validated):

Bounded loops: Maximum 2 recovery attempts (T5: more than 3 steps cause semantic drift)
Progressive degradation: Each loop reduces complexity, narrows scope
Transparent limitation: Clear acknowledgment of constraint boundaries (W2/W3 safety-critical)
Stateless recovery: No dependency on session memory (T4: 5/5 stateless success)
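
The bounded fallback sequence of Figure D.5 can be sketched as a function that asks at most two questions and then exits with a transparent limitation message. This is an illustrative sketch under assumptions: `FallbackResult`, `run_fallback`, and the `ask` callback (which stands in for one user round trip) are hypothetical, and the wording of the safe-exit constraint and action is filled in only as an example of the figure's template.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

MAX_LOOPS = 2  # Figure D.5: max depth=2


@dataclass(frozen=True)
class FallbackResult:
    resolved: bool
    message: str  # the recovered value, or the safe-exit text
    loops_used: int


def run_fallback(
    task: str,
    missing_slot: str,
    options: Sequence[str],
    ask: Callable[[str], Optional[str]],
) -> FallbackResult:
    """Run Loop 1, then Loop 2, then the safe exit; never ask a third time."""
    # Loop 1: specific clarification request; any non-empty answer resolves it.
    answer = ask(f"Please specify {missing_slot}")
    if answer:
        return FallbackResult(True, answer, 1)
    # Loop 2: bounded options; only a listed option counts as resolved.
    answer = ask("Choose: " + ", ".join(options))
    if answer in options:
        return FallbackResult(True, answer, 2)
    # Safe exit: transparent limitation instead of a third attempt.
    text = (
        f"Unable to complete {task}. Limitation: {missing_slot} "
        "could not be resolved. Please rephrase the request."
    )
    return FallbackResult(False, text, MAX_LOOPS)


# Scripted replies stand in for a user who never gives a usable answer.
replies = iter(["", "not sure"])
result = run_fallback("booking", "time", ["morning", "afternoon"], lambda _: next(replies))
print(result.resolved, result.loops_used)  # False 2
```

The function keeps no memory between calls, so the recovery path stays as stateless as the main decision cycle.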

D.6 Cross-Layer Integration Diagram

Figure D.6: Complete MCD Agent Lifecycle

USER QUERY
     ↓
┌─────────────────────────────────────────────────┐
│  PROMPT LAYER: Intent parsing + Route selection │
│    • Adaptation pattern determination (W1/W2/W3)│
├─────────────────────────────────────────────────┤
│  CONTROL LAYER: Symbolic routing + Context mgmt │
│    • Decision trees (≤3 depth, ≤4 branches)     │
├─────────────────────────────────────────────────┤
│  EXECUTION LAYER: Q-tier selection + Local exec │
│    • Dynamic tier routing Q1→Q4→Q8 (T10)        │
├─────────────────────────────────────────────────┤
│  FALLBACK MONITORING: Error detection + Recovery│
│    • Bounded loops ≤2, transparent limitations  │
└─────────────────────────────────────────────────┘
     ↓
┌─────────────────────────────────────────────────┐
│               SUCCESS PATH                      │
│  Task Completion → Validated Response Output    │
│  Performance: 85% retention under Q1 (T10)      │
└─────────────────────────────────────────────────┘
     OR
┌─────────────────────────────────────────────────┐
│               FALLBACK PATH                     │
│  Controlled Degradation → Safe Limitation Exit  │
│  Transparency: Clear constraint acknowledgment  │
└─────────────────────────────────────────────────┘

References

Chapter 4, Section 4.6: MCD Subsystem Definitions
Chapter 5: Instantiated Agent Design Patterns
Chapter 6, Tests T1-T10: Empirical validation of layer interactions
Chapter 7, Walkthroughs W1-W3: Applied layer architecture in domain scenarios

This configuration framework ensures reproducible, statistically valid results while maintaining the ecological validity of real-world deployment constraints. All parameters were optimized for browser-based execution environments typical of edge AI deployment scenarios.
