§ Lab / Open Research

Open Research

North Star

We research the systems around frontier LLMs: retrieval, small-model optimization, and runtime verification — the layer that makes AI grounded, auditable, and production-ready.

§ The Catalog

Five threads.

Each entry is a working artifact: a paper, a project, an engine, a library, or an open model. Together they run from theory to deployable systems.

§01 · 2025.Q2 · paper · Published

LLM pruning and knowledge distillation on HPC.

Joint research with HLRS, AMD, and HPE on pruning and distilling LLMs at scale on AMD MI300A hardware inside Germany's secure national HPC infrastructure.

HLRS
AMD
HPE

Read the paper ↗

§05 · FFplus project · Published

HALO — Hallucination-Aware Layered Optimization.

Dennis, our Co-Founder, laid the foundation for LLM auditability at runtime during the EU-funded FFplus project.

Explainability
Sovereign AI

Read the success story ↗

§02 · Enterprise search · Open Source

ColSearch — document-native multi-vector retrieval.

High-performance multi-vector retrieval that treats the document, not the chunk, as the unit of search. It simplifies pipeline plumbing and adds a new quantization scheme aimed at a strong efficiency-to-recall trade-off.

Multi-vector
Quantization

View on GitHub ↗

§03 · Research Repo · Open Source

LLM-Opt — pruning & knowledge distillation.

Structured pruning of LLMs and encoders, with distillation back into smaller, deployment-ready students. Built for cost, latency, and recoverable quality.

Pruning
Distillation

View on GitHub ↗

§04 · Open Models

Open models on Hugging Face.

Specialist small models for the boring parts of production AI: compression, PII detection, relation extraction, agent routing, privacy-aware classification.

PaliGLiNER (soon)
PrivacyPali (soon)

View on Hugging Face ↗

§ Collaborate

Build Search 2.0 with us.

We work with universities, infrastructure partners, and enterprise labs on the hard parts of retrieval, small-model optimization, and runtime verification. If your research or product touches that surface, we want to talk.

View on GitHub ↗