Model Training in Progress

Language Models
Beyond Attention

We're building a fundamentally new architecture for language understanding. No attention mechanism. No tokenizer. Pure linear complexity — processing raw bytes at unprecedented scale.

O(L)
Complexity
500M+
Parameters
256
Vocabulary
Zero
Attention Layers
Not an optimization.
A departure.

Others are making attention faster. We removed it entirely — and built something new from first principles.

Conventional

Transformer-Based Models

  • Quadratic O(n²) complexity — cost explodes with context
  • Tokenizer-dependent — language bias baked in
  • Fixed architecture — manually designed, static capacity
  • Context window capped by memory constraints

Our Approach

Attention-Free Architecture

  • Linear O(L) complexity — scales without cost explosion
  • Byte-native — processes raw UTF-8, zero language bias
  • Self-growing — model expands its own architecture during training
  • Theoretically unlimited context from state-based design
// Transformer
attention: O(n²) quadratic
tokenizer: required (30K-100K vocab)
context: bounded by memory

// Dreamera
attention: none
tokenizer: none (raw bytes, vocab=256)
context: theoretically unbounded
complexity: O(L) linear

Core Capabilities

Each component is designed to solve a fundamental limitation of current language models.

Attention-Free Computation

Not sparse attention. Not linear attention. No attention mechanism at all. A completely different approach to sequence modeling that processes information in constant time per step.
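
To make "constant time per step" concrete, here is a minimal sketch of a generic fixed-size-state recurrence in Python. It is illustrative only: the state dimension, the tanh update, and both weight matrices are placeholder assumptions, not Dreamera's (undisclosed) architecture. The point is the shape of the computation: each byte updates a constant-size state, so the cost of a step never depends on how much input came before.

import numpy as np

# Illustrative only: a generic fixed-size-state recurrence, not Dreamera's
# (undisclosed) architecture. Each step reads one byte and updates a
# constant-size state, so per-step cost does not grow with sequence length.

rng = np.random.default_rng(0)
STATE_DIM = 64                                            # placeholder size

W_state = rng.standard_normal((STATE_DIM, STATE_DIM)) * 0.01
W_byte = rng.standard_normal((256, STATE_DIM)) * 0.01     # one row per byte value

def step(state, byte):
    """Consume one byte in O(1) time with respect to sequence length."""
    return np.tanh(state @ W_state + W_byte[byte])

def encode(data: bytes):
    """L bytes -> L constant-cost steps -> O(L) total time, O(1) extra memory."""
    state = np.zeros(STATE_DIM)
    for b in data:                                        # b is an int in 0..255
        state = step(state, b)
    return state

print(encode("attention-free".encode("utf-8"))[:4])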

Byte-Native Processing

Operates directly on raw UTF-8 bytes with a vocabulary of just 256. No tokenizer means no language bias, no information loss, and no vocabulary mismatch across languages.
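
In practice, "vocabulary of 256" simply means the token IDs are the raw UTF-8 bytes themselves. A quick sketch of what that looks like (the helper names are hypothetical):

# Illustration of byte-native "tokenization": token IDs are just the raw
# UTF-8 bytes, so the vocabulary is exactly the 256 possible byte values
# and every script takes the same path. Helper names are hypothetical.

def to_ids(text: str) -> list[int]:
    return list(text.encode("utf-8"))         # every id is in range(256)

def to_text(ids: list[int]) -> str:
    return bytes(ids).decode("utf-8")          # lossless round trip

for sample in ["Hello", "안녕하세요", "مرحبا"]:
    ids = to_ids(sample)
    assert to_text(ids) == sample and max(ids) < 256
    print(sample, len(sample), "chars ->", len(ids), "byte ids")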

Linear Scaling

Time and memory complexity scale linearly with sequence length. Double the input, double the cost — not quadruple it. This makes long-context processing economically viable.
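
A rough, unitless comparison shows why this matters at long contexts. The counts below are illustrative orders of growth, not measured costs of any particular model:

# Order-of-growth illustration only (not measured costs). Doubling the input
# doubles the linear count but quadruples the quadratic one.

for length in (1_000, 10_000, 100_000, 1_000_000):
    quadratic = length * length          # attention-style pairwise interactions
    linear = length                      # one constant-cost step per byte
    print(f"L={length:>9,}  O(n^2)={quadratic:>16,}  O(L)={linear:>9,}")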

Self-Growing Architecture

The model autonomously expands its own capacity during training — adding depth and parameters exactly when needed, without manual architecture search.
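
No details of the growth rule are given here, so the sketch below is purely hypothetical: it assumes a loss plateau as the trigger and "append one fresh layer" as the growth step, just to illustrate what capacity expansion during training can look like.

# Purely hypothetical sketch: the growth trigger (a loss plateau) and the
# growth step (append one fresh layer) are assumptions for illustration,
# not the criterion the model actually uses.

class GrowableStack:
    def __init__(self, make_layer, initial_layers=2):
        self.make_layer = make_layer
        self.layers = [make_layer() for _ in range(initial_layers)]

    def maybe_grow(self, recent_losses, patience=3, min_improvement=1e-3):
        """Append a new layer when the loss has stopped improving."""
        if len(recent_losses) < patience + 1:
            return False
        improvement = recent_losses[-patience - 1] - min(recent_losses[-patience:])
        if improvement < min_improvement:
            self.layers.append(self.make_layer())
            return True
        return False

stack = GrowableStack(make_layer=lambda: object())
grew = stack.maybe_grow([2.00, 2.00, 2.00, 2.00])    # plateau -> add capacity
print(grew, len(stack.layers))                        # True 3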

Universal Multilingual Support


Byte-level processing treats every language and script identically from the ground up. Korean, English, Arabic, CJK — all first-class citizens by design.

Unbounded Context

State-based architecture enables theoretically unlimited context. No sliding windows, no chunking, no retrieval augmentation needed.
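
As an illustration of why a state-based design has no window to manage, the toy stream below carries one fixed-size state across arbitrarily many chunks; memory stays constant however long the stream runs. The state update is a trivial stand-in, not a real model:

# Toy illustration: one fixed-size running state, no window, no chunk cache.
# The "update" below is a trivial stand-in, not a real model.

def update(state: int, byte: int) -> int:
    return (state * 257 + byte) % (1 << 64)      # constant-size state

state = 0
for chunk_id in range(1_000):                     # an arbitrarily long stream...
    for b in f"chunk {chunk_id}\n".encode("utf-8"):
        state = update(state, b)                  # ...consumed one byte at a time
print(hex(state))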

Building the future of language AI.

We're a Seoul-based research lab. Our model is actively training.
Interested in what we're building? Let's talk.

dreamera.co.kr@gmail.com →