.. _concepts-micro-lms:

Micro-LMs
===================

Micro-LMs are lightweight, domain-specialized AIs that run on NGF rails, turning natural language into deterministic, auditable actions with built-in safety and abstain guarantees. We are piloting the idea first on ARC (Abstraction & Reasoning Corpus) to highlight its reasoning power, then on DeFi (Decentralized Finance) to highlight its applicability to one of many possible verticals. Both pilots are built on top of the ngeodesic Python package.

**Attributes**

- **Determinism:** Same inputs → same decisions (traceable, reproducible).
- **Domain focus:** Small, curated primitive sets (e.g., ARC ops, DeFi ops).
- **Safety-first:** Refuse (ABSTAIN) when uncertain instead of hallucinating.
- **Composability:** Plug into existing apps and LLMs without retraining.
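
The determinism and safety-first attributes can be illustrated with a toy mapper. This is a minimal sketch under stated assumptions, not the ngeodesic API: the keyword table, the ``map_prompt`` function, the ``withdraw_asset`` and ``swap_asset`` names, and the 0.5 threshold are all hypothetical, and a real mapper would use a trained classifier rather than keyword matching.

.. code-block:: python

   from typing import Tuple

   # Hypothetical keyword table standing in for a trained mapper.
   PRIMITIVE_KEYWORDS = {
       "deposit_asset": {"deposit", "supply"},
       "withdraw_asset": {"withdraw", "remove"},
       "swap_asset": {"swap", "trade"},
   }

   ABSTAIN = "ABSTAIN"


   def map_prompt(prompt: str, threshold: float = 0.5) -> Tuple[str, float]:
       """Return (primitive, score), or (ABSTAIN, score) below the threshold."""
       tokens = set(prompt.lower().split())
       best, best_score = ABSTAIN, 0.0
       # Iterate in sorted order so ties always resolve the same way.
       for name in sorted(PRIMITIVE_KEYWORDS):
           keywords = PRIMITIVE_KEYWORDS[name]
           score = len(tokens & keywords) / len(keywords)
           if score > best_score:
               best, best_score = name, score
       if best_score < threshold:
           return ABSTAIN, best_score
       return best, best_score


   print(map_prompt("deposit 10 ETH into aave"))
   print(map_prompt("what is my balance"))

Because the function has no sampling step and no hidden state, repeated calls with the same prompt return the same pair, and a prompt that matches no primitive falls through to ``ABSTAIN`` instead of a guess.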


LLMs vs. micro-LMs 
-------------------

- **LLM = generalist**: broad knowledge, flexible language, but *stochastic and unsafe* for mission-critical execution.  
- **micro-LM = specialist**: slim, deterministic, auditable, and **more accurate where it matters** (DeFi/Finance, Manufacturing & Robotics, Industrial Robotics, Supply Chain & Logistics, Energy & Grid Management, etc.).

+-------------------------+-------------------------------------------------------------+-------------------------------------------------------------------+
| Dimension               | LLMs (ChatGPT, Claude, Meta, Perplexity, etc.)              | **micro-LMs (ARC, DeFi)**                                         |
+=========================+=============================================================+===================================================================+
| **Domain accuracy**     | Broad coverage, but DeFi primitives are not a training      | Mapper trained on 1k–5k use-case prompts (e.g., DeFi, ARC).       |
|                         | focus. Accuracy drifts under phrasing changes.              | Benchmarked accuracy > 98% on 8 DeFi primitives; abstains         |
|                         |                                                             | correctly when uncertain.                                         |
+-------------------------+-------------------------------------------------------------+-------------------------------------------------------------------+
| **Determinism**         | Outputs vary run-to-run (sampling drift). Even              | Stage-11 NGF rails (Warp → Detect → Denoise) yield reproducible   |
|                         | ``temperature=0`` doesn’t guarantee identical results.      | traces. Perturbation tests confirm stable decisions.              |
+-------------------------+-------------------------------------------------------------+-------------------------------------------------------------------+
| **Safety / Policy       | Can be prompted with “stay under LTV 0.75,” but no hard     | Built-in verifiers: Loan-to-Value (LTV), Health Factor (HF),      |
| enforcement**           | guarantees — may still propose unsafe actions.              | Oracle freshness. Unsafe paths always block or abstain.           |
+-------------------------+-------------------------------------------------------------+-------------------------------------------------------------------+
| **Abstain behavior**    | Rarely abstains — tends to “make something up” even when    | Explicit abstain mode: non-exec prompts (balance checks, nonsense)| 
|                         | uncertain.                                                  | → abstain with clear reason (``abstain_non_exec``).               |
+-------------------------+-------------------------------------------------------------+-------------------------------------------------------------------+
| **Auditability**        | Opaque; no structured rationale.                            | Every run produces machine-readable artifacts: mapper score,      |
|                         |                                                             | abstain reason, verifier tags, plan trace. Auditable for          |
|                         |                                                             | compliance.                                                       |
+-------------------------+-------------------------------------------------------------+-------------------------------------------------------------------+
| **Efficiency / Cost**   | 10s–100s of billions of params; inference is slow/expensive.| SBERT (~22M params) + lightweight classifier. Fast, cheap,        |
|                         |                                                             | deployable in CI.                                                 |
+-------------------------+-------------------------------------------------------------+-------------------------------------------------------------------+
| **Regulatory /          | Hard to certify (stochastic, unexplainable).                | Deterministic + auditable by design. Built for domains where      |
| Compliance fit**        |                                                             | regulators demand safety.                                         |
+-------------------------+-------------------------------------------------------------+-------------------------------------------------------------------+
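
The auditability row lists the artifacts a run produces (mapper score, abstain reason, verifier tags, plan trace). One way such a record could be made machine-readable and comparable across runs is sketched below. The ``RunArtifact`` class and its field names are assumptions made for illustration and are not taken from the ngeodesic package; only the ``abstain_non_exec`` reason comes from the table above.

.. code-block:: python

   import hashlib
   import json
   from dataclasses import asdict, dataclass, field
   from typing import List, Optional


   @dataclass(frozen=True)
   class RunArtifact:
       """Hypothetical audit record; field names are illustrative only."""
       prompt: str
       primitive: Optional[str]           # None when the run abstained
       mapper_score: float
       abstain_reason: Optional[str]      # e.g. "abstain_non_exec"
       verifier_tags: List[str] = field(default_factory=list)
       plan_trace: List[str] = field(default_factory=list)

       def to_json(self) -> str:
           # Sorted keys and fixed separators give one canonical serialization.
           return json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))

       def digest(self) -> str:
           # Equal artifacts hash to the same value, so runs can be compared.
           return hashlib.sha256(self.to_json().encode("utf-8")).hexdigest()


   artifact = RunArtifact(
       prompt="what is my balance",
       primitive=None,
       mapper_score=0.0,
       abstain_reason="abstain_non_exec",
   )
   print(artifact.to_json())

Comparing digests from two runs on the same prompt is a cheap reproducibility check: identical decisions give identical digests, and any drift shows up as a mismatch.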


Core Components
---------------

- **Mapper:** Interprets user intent and maps it to domain primitives.
- **Rails:** The NGF pipeline (Warp → Detect → Denoise → Verify) that stabilizes decisions.
- **Verifiers:** Domain rules/policies (e.g., LTV/HF/oracle in DeFi).
- **Executor:** Applies the verified plan, or aborts if any verifier fails.
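
How the four components might compose can be sketched with plain callables. Everything here is hypothetical: the type aliases, the ``run`` function, and the return shape are illustrative and do not reflect the package's actual interfaces. The rails stage is modeled as an opaque function, since the Warp, Detect, and Denoise internals are outside the scope of this page.

.. code-block:: python

   from typing import Callable, List, Optional, Tuple

   Mapper = Callable[[str], Optional[str]]     # prompt -> primitive, or None
   Rails = Callable[[str], str]                # stabilizes the mapped decision
   Verifier = Callable[[str], Optional[str]]   # primitive -> failure reason, or None
   Executor = Callable[[str], str]             # primitive -> execution result


   def run(prompt: str, mapper: Mapper, rails: Rails,
           verifiers: List[Verifier], executor: Executor) -> Tuple[str, str]:
       """Return ("EXECUTED", result) or ("ABSTAIN", reason)."""
       primitive = mapper(prompt)
       if primitive is None:
           # Nothing executable was recognized in the prompt.
           return "ABSTAIN", "abstain_non_exec"
       primitive = rails(primitive)
       for verify in verifiers:
           reason = verify(primitive)
           if reason is not None:
               # The first failing verifier blocks execution.
               return "ABSTAIN", reason
       return "EXECUTED", executor(primitive)

The executor is only reached after every verifier has passed, so an unsafe plan can never be applied by accident; the only outcomes are a verified execution or an abstain with a reason.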

Examples 
--------------

**DeFi**

A prompt like *"deposit 10 ETH into aave"* is mapped to a ``deposit_asset`` primitive.
Verifiers check collateralization, oracle age, and policy limits. If safe, the plan
is emitted; otherwise the system **ABSTAINS** with an explicit reason.
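
The verifier step for this deposit can be sketched as follows. This is an illustration under stated assumptions rather than the shipped verifiers: the ``Position`` type, the ``verify_deposit`` function, the failure tags, the 0.80 liquidation threshold, and the 60-second oracle limit are all hypothetical. LTV is taken as debt divided by collateral value, and the health factor as collateral value times the liquidation threshold, divided by debt.

.. code-block:: python

   from dataclasses import dataclass
   from typing import List


   @dataclass(frozen=True)
   class Position:
       collateral_usd: float   # value of posted collateral
       debt_usd: float         # value of outstanding debt
       oracle_age_s: float     # seconds since the last price update


   def verify_deposit(pos: Position, deposit_usd: float, *,
                      max_ltv: float = 0.75,
                      liq_threshold: float = 0.80,
                      max_oracle_age_s: float = 60.0) -> List[str]:
       """Return failure tags for the post-deposit position; empty means safe."""
       failures: List[str] = []
       if deposit_usd <= 0:
           failures.append("invalid_amount")
       if pos.oracle_age_s > max_oracle_age_s:
           failures.append("oracle_stale")
       collateral = pos.collateral_usd + max(deposit_usd, 0.0)
       if pos.debt_usd > 0:
           # With debt outstanding, check both ratios after the deposit.
           ltv = pos.debt_usd / collateral if collateral > 0 else float("inf")
           hf = collateral * liq_threshold / pos.debt_usd
           if ltv > max_ltv:
               failures.append("ltv_exceeded")
           if hf < 1.0:
               failures.append("hf_below_one")
       return failures


   fresh = Position(collateral_usd=40_000.0, debt_usd=30_000.0, oracle_age_s=12.0)
   stale = Position(collateral_usd=40_000.0, debt_usd=30_000.0, oracle_age_s=600.0)
   print(verify_deposit(fresh, 20_000.0))
   print(verify_deposit(stale, 20_000.0))

In the first call the deposit lowers LTV from 0.75 to 0.5 and no tag is raised, so the plan would be emitted. In the second call the price data is ten minutes old, so the same deposit is blocked with an explicit reason.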