Micro-LMs

Micro-LMs are lightweight, domain-specialized AI models that run on NGF rails, turning natural language into deterministic, auditable actions with built-in safety and abstain guarantees. We are piloting the idea first on ARC (Abstraction & Reasoning Corpus) testing to demonstrate its reasoning power, then on DeFi (Decentralized Finance), one of many possible verticals, to demonstrate its applicability. Both pilots are built on top of the ngeodesic Python package.

Attributes

Determinism: Same inputs → same decisions (traceable, reproducible).
Domain focus: Small, curated primitive sets (e.g., ARC ops, DeFi ops).
Safety-first: Refuse (ABSTAIN) when uncertain instead of hallucinating.
Composability: Plug into existing apps and LLMs without retraining.

LLMs vs. micro-LMs

LLM = generalist: broad knowledge, flexible language, but stochastic and unsafe for mission-critical execution.
micro-LM = specialist: slim, deterministic, auditable, and more accurate within its target domain (DeFi/Finance, Manufacturing & Industrial Robotics, Supply Chain & Logistics, Energy & Grid Management, etc.).

Dimension	LLMs (ChatGPT, Claude, Meta, Perplexity, etc.)	micro-LMs (ARC, DeFi)
Domain accuracy	Broad coverage, but DeFi primitives are not a training focus. Accuracy drifts under phrasing changes.	Mapper trained on 1k–5k use-case prompts per domain (e.g., DeFi, ARC). Benchmarked accuracy > 98% on 8 DeFi primitives; abstains correctly when uncertain.
Determinism	Outputs vary run-to-run (sampling drift). Even `temperature=0` doesn’t guarantee identical results.	Stage-11 NGF rails (Warp → Detect → Denoise) yield reproducible traces. Perturbation tests confirm stable decisions.
Safety / Policy enforcement	Can be prompted with “stay under LTV 0.75,” but no hard guarantees — may still propose unsafe actions.	Built-in verifiers: Loan-to-Value (LTV), Health Factor (HF), Oracle freshness. Unsafe paths always block or abstain.
Abstain behavior	Rarely abstains — tends to “make something up” even when uncertain.	Explicit abstain mode: non-exec prompts (balance checks, nonsense) → abstain with clear reason (`abstain_non_exec`).
Auditability	Opaque; no structured rationale.	Every run produces machine-readable artifacts: mapper score, abstain reason, verifier tags, plan trace. Auditable for compliance.
Efficiency / Cost	Tens to hundreds of billions of parameters; inference is slow and expensive.	SBERT (~22M parameters) + lightweight classifier. Fast, cheap, deployable in CI.
Regulatory / Compliance fit	Hard to certify (stochastic, unexplainable).	Deterministic + auditable by design. Built for domains where regulators demand safety.
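
The Auditability row lists the artifacts each run produces (mapper score, abstain reason, verifier tags, plan trace). As a rough illustration of what such a machine-readable record could look like, here is a minimal sketch. The `RunArtifact` class, its field names, and the example score are hypothetical and are not taken from the ngeodesic package; only the `abstain_non_exec` reason comes from the table above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class RunArtifact:
    """Hypothetical audit record; field names are illustrative only."""
    prompt: str
    primitive: Optional[str]        # None when the run abstained
    mapper_score: float
    abstain_reason: Optional[str]   # e.g. "abstain_non_exec"
    verifier_tags: List[str] = field(default_factory=list)
    plan_trace: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        # Sorted keys and fixed separators make equal records serialize
        # to byte-identical strings, which keeps diffs and hashes stable.
        return json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))


# A non-executable prompt recorded as an abstain (score value is made up).
artifact = RunArtifact(
    prompt="check my balance",
    primitive=None,
    mapper_score=0.31,
    abstain_reason="abstain_non_exec",
)
print(artifact.to_json())
```

Serializing with a fixed key order is one simple way to make two runs on the same input comparable byte for byte, which is the property an auditor would most likely want to check.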

Core Components

Mapper: Interprets user intent and maps to domain primitives.
Rails: The NGF pipeline (Warp → Detect → Denoise → Verify) that stabilizes decisions.
Verifiers: Domain rules/policies (e.g., LTV/HF/oracle in DeFi).
Executor: Applies the verified plan, or aborts if any verifier fails.
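
The way these components hand off to one another can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the project's implementation: `toy_mapper` is a keyword lookup standing in for the trained mapper, the rails stage is omitted entirely, and `Decision`, `run`, `ABSTAIN_THRESHOLD`, and the `ltv_exceeded` tag are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

ABSTAIN_THRESHOLD = 0.7  # assumed cutoff, not a value from the project

Verifier = Callable[[str], Tuple[bool, str]]


@dataclass
class Decision:
    status: str               # "EXECUTE" or "ABSTAIN"
    primitive: Optional[str]
    reason: str


def toy_mapper(prompt: str) -> Tuple[Optional[str], float]:
    """Stand-in for the real mapper: keyword lookup with a fixed score."""
    keywords = {"deposit": "deposit_asset"}
    tokens = prompt.lower().split()
    for word, primitive in keywords.items():
        if word in tokens:
            return primitive, 0.95
    return None, 0.0


def run(prompt: str,
        mapper: Callable[[str], Tuple[Optional[str], float]],
        verifiers: List[Verifier]) -> Decision:
    primitive, score = mapper(prompt)
    # Abstain when no primitive matches or the mapper is not confident.
    if primitive is None or score < ABSTAIN_THRESHOLD:
        return Decision("ABSTAIN", None, "abstain_non_exec")
    # Every verifier must pass; the first failure blocks execution.
    for verify in verifiers:
        ok, tag = verify(primitive)
        if not ok:
            return Decision("ABSTAIN", primitive, tag)
    return Decision("EXECUTE", primitive, "verified")


def always_ok(primitive: str) -> Tuple[bool, str]:
    return True, "ok"


def over_limit(primitive: str) -> Tuple[bool, str]:
    return False, "ltv_exceeded"


print(run("deposit 10 ETH into aave", toy_mapper, [always_ok]))
print(run("what is my balance", toy_mapper, [always_ok]))
print(run("deposit 10 ETH into aave", toy_mapper, [over_limit]))
```

Because every step is a pure function of its inputs, the same prompt and the same verifier set always yield the same `Decision`, which mirrors the determinism attribute described earlier.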

Examples

DeFi: A prompt like “deposit 10 ETH into aave” is mapped to the `deposit_asset` primitive. Verifiers check collateralization, oracle age, and policy limits. If every check passes, the plan is emitted; otherwise the system ABSTAINS with an explicit reason.
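
To make the verifier step concrete, the checks might look something like the sketch below. It assumes the common lending-protocol definitions LTV = debt / collateral and health factor = collateral × liquidation threshold / debt. The 0.75 LTV limit is the figure quoted in the comparison table; the health-factor floor, the oracle freshness window, the failure tag names, and all position values are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List

MAX_LTV = 0.75            # limit quoted in the comparison table
MIN_HEALTH_FACTOR = 1.0   # assumed floor
MAX_ORACLE_AGE_S = 60.0   # assumed freshness window, in seconds


@dataclass
class Position:
    """Projected account state after the planned action (hypothetical)."""
    collateral_usd: float
    debt_usd: float
    liquidation_threshold: float  # fraction of collateral counted toward HF
    oracle_age_s: float


def ltv(pos: Position) -> float:
    if pos.debt_usd == 0:
        return 0.0
    if pos.collateral_usd == 0:
        return float("inf")
    return pos.debt_usd / pos.collateral_usd


def health_factor(pos: Position) -> float:
    if pos.debt_usd == 0:
        return float("inf")
    return pos.collateral_usd * pos.liquidation_threshold / pos.debt_usd


def verify(pos: Position) -> List[str]:
    """Return failure tags; an empty list means the plan may be emitted."""
    failures = []
    if pos.oracle_age_s > MAX_ORACLE_AGE_S:
        failures.append("oracle_stale")
    if ltv(pos) > MAX_LTV:
        failures.append("ltv_exceeded")
    if health_factor(pos) < MIN_HEALTH_FACTOR:
        failures.append("hf_too_low")
    return failures


# Made-up positions: one healthy, one with a stale price, one over-borrowed.
safe = Position(20_000.0, 12_000.0, 0.8, oracle_age_s=5.0)
stale = Position(20_000.0, 12_000.0, 0.8, oracle_age_s=300.0)
risky = Position(20_000.0, 18_000.0, 0.8, oracle_age_s=5.0)

for name, pos in [("safe", safe), ("stale", stale), ("risky", risky)]:
    tags = verify(pos)
    print(name, "EMIT" if not tags else f"ABSTAIN {tags}")
```

Returning the full list of failure tags, rather than stopping at the first one, gives the abstain reason enough detail to be audited later.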