NIKOLA Chess Engine

Overview

NIKOLA uses a state-of-the-art Efficiently Updatable Neural Network (NNUE) for position evaluation. Unlike traditional hand-crafted evaluation functions, NNUE learns complex positional patterns from millions of high-quality games and engine self-play.

HalfKA Architecture

NIKOLA employs the HalfKA (Half-King-All) feature set, which encodes:

King position - Relative location of each side's king
Piece placement - All pieces indexed relative to king position
Perspective - Separate views for white and black

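This representation enables efficient incremental updates as pieces move. A typical move changes only a handful of input features, so the first layer can be updated with a few sparse additions and subtractions instead of being recomputed from scratch.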
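The mapping from (king square, piece, square) to an input index can be sketched as follows. The exact index layout NIKOLA uses is not documented here, so everything in this block is an assumption for illustration: squares are numbered a1 = 0 through h8 = 63, black's view is obtained by flipping ranks, and the names `orient`, `feature_index`, and `active_features` are hypothetical.

```python
# Hypothetical HalfKA-style feature indexing (illustrative layout, not NIKOLA's actual scheme).
NUM_SQUARES = 64
NUM_PIECE_KINDS = 12  # 6 piece types x (own, opponent), seen from one perspective
WHITE, BLACK = 0, 1
KING = 5  # piece types assumed to be 0..5 = pawn, knight, bishop, rook, queen, king


def orient(square: int, perspective: int) -> int:
    """Flip ranks for black so each side sees the board from its own back rank."""
    return square if perspective == WHITE else square ^ 56


def feature_index(perspective: int, king_sq: int, piece_type: int,
                  piece_color: int, piece_sq: int) -> int:
    """Index of one (king square, piece, square) feature for the given perspective."""
    relative_color = 0 if piece_color == perspective else 1  # own vs opponent piece
    piece_kind = relative_color * 6 + piece_type
    k = orient(king_sq, perspective)
    s = orient(piece_sq, perspective)
    return (k * NUM_PIECE_KINDS + piece_kind) * NUM_SQUARES + s


def active_features(perspective: int, pieces: list) -> list:
    """pieces is a list of (piece_type, color, square); returns sorted active indices."""
    king_sq = next(sq for pt, c, sq in pieces if pt == KING and c == perspective)
    return sorted(feature_index(perspective, king_sq, pt, c, sq) for pt, c, sq in pieces)
```

Because every index depends on the king square, the two perspectives produce separate feature lists for the same position, which is what the "Perspective" item above refers to.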

Network Topology

Input Layer: 768 x 2 (HalfKA features)

Hidden Layer 1: 1024 neurons (ClippedReLU)

Hidden Layer 2: 512 neurons (ClippedReLU)

Hidden Layer 3: 256 neurons (ClippedReLU)

Output Layer: 1 (centipawn evaluation)
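A minimal sketch of a forward pass through a stack with these layer sizes is shown below. It assumes plain dense layers with biases, floating-point weights, and a ClippedReLU clamped to [0, 1]; production NNUE networks are usually quantized to integers, and NIKOLA's actual weight format is not described here. All function names are hypothetical.

```python
import random

# Layer sizes taken from the topology above: 768 x 2 inputs, three hidden layers, one output.
LAYER_SIZES = [768 * 2, 1024, 512, 256, 1]


def clipped_relu(x: float, lo: float = 0.0, hi: float = 1.0) -> float:
    """Clamp an activation into [lo, hi]; the [0, 1] range is an assumption."""
    return min(max(x, lo), hi)


def parameter_count(sizes: list) -> int:
    """Weights plus biases for a stack of dense layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


def init_layers(sizes: list, seed: int = 0) -> list:
    """Random placeholder weights; a real network would be loaded from a file."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers: list, inputs: list) -> float:
    """Dense layers with ClippedReLU on every layer except the output."""
    x = inputs
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [clipped_relu(v) for v in x]
    return x[0]


# Tiny demo network with the same shape of computation but far fewer neurons.
demo_layers = init_layers([8, 4, 2, 1], seed=1)
demo_score = forward(demo_layers, [1.0] * 8)
```

Under these assumptions `parameter_count(LAYER_SIZES)` comes to 2,230,273 parameters, with about 70% of them in the first layer, which is the layer that incremental updates avoid recomputing.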

GPU-Batched Evaluation

NIKOLA's neural network inference is fully GPU-accelerated using native MIND tensor operations. The GPU-batched NNUE system collects positions from Lazy SMP search threads and evaluates them in batches for maximum throughput:

Batched inference - 32-256 positions per GPU kernel, achieving 500M+ positions/sec
Tensor cores - FP8/INT8 mixed precision on Blackwell and Vera Rubin architectures
Multi-GPU distribution - Scale to 8+ GPUs per node with NVLink 5.0
Async CUDA streams - Overlap compute with memory transfers for maximum utilization
Virtual loss - Enable parallel MCTS expansion without tree corruption
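The batching step can be sketched on the CPU side as a shared queue that search threads push into and an evaluator that drains it in chunks. This is only an illustration under stated assumptions: the names `drain_batch` and `evaluate_batch` are hypothetical, `evaluate_fn` stands in for a single GPU kernel launch, and each request is assumed to carry its own reply queue.

```python
import queue

MAX_BATCH = 256  # upper end of the 32-256 batch range quoted above


def drain_batch(pending: queue.Queue, max_batch: int = MAX_BATCH) -> list:
    """Collect up to max_batch waiting requests without blocking."""
    batch = []
    while len(batch) < max_batch:
        try:
            batch.append(pending.get_nowait())
        except queue.Empty:
            break
    return batch


def evaluate_batch(batch: list, evaluate_fn) -> None:
    """Score every position in one call and route each result to its requester."""
    positions = [position for position, _ in batch]
    scores = evaluate_fn(positions)  # placeholder for one batched GPU inference call
    for (_, reply), score in zip(batch, scores):
        reply.put(score)
```

A search thread would put a `(position, reply_queue)` pair on the shared queue and then block on `reply_queue.get()` until the evaluator has processed the batch containing its position.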

Supported GPU Architectures

NVIDIA (CUDA)

Ampere (A100, RTX 30xx)
Ada Lovelace (RTX 40xx)
Blackwell Consumer (RTX 5090, RTX 5080)
Hopper (H100, H200)
Blackwell Data Center (B200, GB200, GB300)
Vera Rubin (next-gen architecture)

AMD (ROCm)

CDNA 2 (MI200 series)
CDNA 3 (MI300 series)
RDNA 3 (RX 7000 series)

Apple (Metal)

M1 / M1 Pro / M1 Max / M1 Ultra
M2 / M2 Pro / M2 Max / M2 Ultra
M3 / M3 Pro / M3 Max
M4 / M4 Pro / M4 Max

WebGPU

Chrome 113+
Firefox 121+
Safari 18+

Incremental Updates

The NNUE architecture enables "efficiently updatable" evaluation: when a move is made, only the first-layer contributions of the features that changed need to be recalculated, not the entire network input. This provides a 10-100x speedup over full evaluation, which is critical for deep search.
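The idea can be illustrated with a first-layer accumulator that is either rebuilt from all active features or patched using only the features a move removed and added. The sketch assumes integer (quantized) first-layer weights stored as one row per feature; the function names are hypothetical and not taken from NIKOLA's code.

```python
def full_refresh(weights: list, biases: list, active: list) -> list:
    """Recompute the accumulator from scratch: bias plus the weight row of every active feature."""
    acc = list(biases)
    for feature in active:
        for j, w in enumerate(weights[feature]):
            acc[j] += w
    return acc


def incremental_update(acc: list, weights: list, removed: list, added: list) -> list:
    """Patch an existing accumulator using only the features that changed."""
    new_acc = list(acc)
    for feature in removed:
        for j, w in enumerate(weights[feature]):
            new_acc[j] -= w
    for feature in added:
        for j, w in enumerate(weights[feature]):
            new_acc[j] += w
    return new_acc
```

A quiet move removes one feature and adds one, so the patch touches two weight rows instead of one row per piece on the board. In a king-relative feature set such as HalfKA, a king move changes every feature for that side's perspective, so engines typically fall back to a full refresh in that case.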

Training Data

NIKOLA's network is trained on a proprietary dataset of over 10 billion positions generated through:

Self-play games at various time controls
High-depth analysis of master games
Curated endgame positions from tablebases
Adversarial positions designed to expose weaknesses

Custom Networks

Advanced users can load custom NNUE networks via the UCI option:

setoption name NNUEPath value /path/to/custom.nnue
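When driving the engine from a script, the same command can be written to the engine's standard input. The sketch below only formats and writes the text; the helper names are hypothetical, and following the option with `isready` is standard UCI practice rather than something specific to NIKOLA.

```python
import io


def setoption_command(name: str, value: str) -> str:
    """Format a UCI setoption line, e.g. for the NNUEPath option shown above."""
    return f"setoption name {name} value {value}"


def send_custom_network(stream, network_path: str) -> None:
    """Write the option, then isready so the caller can wait for the engine's readyok."""
    stream.write(setoption_command("NNUEPath", network_path) + "\n")
    stream.write("isready\n")
    stream.flush()


# Demo against an in-memory stream; a real caller would pass the engine process's stdin.
demo_stream = io.StringIO()
send_custom_network(demo_stream, "/path/to/custom.nnue")
```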
