NIKOLA Chess Engine

Supercomputer Chess Engine

NIKOLA is a supercomputer-class chess engine built for modern data-center GPUs—from A100 through Blackwell B200/B300. Written entirely in Mind Language, it uses Mind IR for NNUE evaluation and distributed endgame intelligence—without Python, without Rust, without dependencies. Scales from single-node to multi-GPU clusters.

3200+
Elo Target
SPRT-verified
50M+
NNUE Positions
scaling to 1B+
100M+
Opening Book
quality-filtered
B300
Blackwell Ready
multi-GPU clusters
UCI
Compatible
+ Lichess Bot

Core Technology

Artificial Intelligence

Alexander Kronrod, a pioneering Russian AI researcher, famously called chess "the Drosophila of artificial intelligence." Pieces and strategic positions have no intrinsic utility in chess; the overriding goal is winning. For humans, the "self" acts as a profound, unifying symbol, encompassing the player, their goal-driven system, and the interplay of mind and body that perceives and acts from this perspective.

To replicate expert human strategy, we are developing an advanced AI engine that prioritizes the strategic essence of chess. This engine employs deep analytical evaluation, drawing from a vast dataset of every recorded game and the styles of legendary players. Using cutting-edge machine learning, it discerns unique personalities and tactical signatures, distinguishing one player's approach from another.

The AI engine continuously evaluates the current game state and all algorithms running in parallel, then chooses the optimal strategy for the next move. A game of chess can only be lost when one side makes a mistake, so playing perfectly is a far harder problem than merely being unbeatable, and perfection is our goal.

Numerical Analytics

Recent breakthroughs in unified memory architectures have revolutionized data access, enabling CPUs and GPUs to share ultra-high-speed memory seamlessly. NVIDIA's Hopper architecture pushed this integration further by pairing stacked HBM3/HBM3e modules with advanced memory virtualization, reaching bandwidths beyond 3 terabytes per second and dramatically reducing data-transfer latencies.

NVIDIA data-center GPUs from the A100 through Blackwell carry large HBM capacities, from 80 GB of HBM2e on the A100 to 192 GB of HBM3e on the B200, with NVLink connectivity (NVLink 5.0 on Blackwell) for optimized cluster operations. These architectures enable unprecedented scale for AI inference and chess computation.

In collaboration with Dell, Supermicro, and NVIDIA, modern clusters harness Spectrum-X networking, the NVLink 5.0 interface, and cutting-edge InfiniBand NDR to achieve GPU throughput efficiencies of up to 98%. This distributed architecture, bolstered by sophisticated RDMA and parallel programming frameworks, is redefining computational limits for revolutionary AI applications.

Blackwell Architecture

Next-generation supercomputers like xAI's Colossus cluster are scaling toward hundreds of thousands of NVIDIA GPUs, with Blackwell deployments underway. This CPU-GPU hybrid architecture represents a massive leap over earlier systems, enabling distributed chess computation at unprecedented scale.

NVIDIA GPUDirect lets GPUs communicate directly, while Remote Direct Memory Access (RDMA) provides high-throughput, low-latency transfers that bypass operating-system overhead, a necessity for global-scale clusters. With HBM3e memory surpassing 3 TB/s of bandwidth, these systems stand generations beyond the V100 era.

Leveraging Spectrum-X networking, NVLink 5.0, and InfiniBand NDR, the Colossus cluster sustains 95% throughput across its GPUs. This distributed system, fortified by RDMA and parallel programming, pushes computational boundaries, advancing AI applications such as deep learning and chess-solving algorithms to unprecedented levels.

Dynamic Parallelism

Dynamic parallelism marks a leap in GPU computing, allowing on-demand kernel spawning on the GPU without CPU involvement. Embedded in NVIDIA's CUDA framework, it empowers threads within a grid to configure, launch, and synchronize new grids, boosting flexibility for parallel tasks. Yet, it faces challenges: kernel launch overheads degrade performance, and dynamically spawned kernels often underutilize GPU cores.

The hardware-based SPAWN framework addresses these issues, optimizing dynamically generated kernels. By controlling scheduling and resource allocation, SPAWN cuts launch overheads and queuing delays, alleviating bottlenecks in dynamic parallelism. This enhances GPU efficiency, leveraging CUDA's latest features like grid synchronization and multi-grid management for complex, recursive workloads.

Within systems powered by Blackwell GPUs, with HBM3e memory exceeding 3 TB/s of bandwidth and FP8 precision, integrating SPAWN optimizes workloads like chess game-tree traversal, spawning kernels to probe midgame positions efficiently. This architecture minimizes latency and maximizes core utilization across distributed GPU clusters.

Built with Mind Language

NIKOLA is written entirely in Mind, a systems programming language designed for high-performance computing with first-class support for parallelism and hardware optimization. Mind compiles to native code that rivals hand-tuned assembly while providing safe, expressive abstractions.

IR

Mind IR

Mind Intermediate Representation provides a low-level abstraction that maps directly to hardware, enabling optimal code generation for CPUs and GPUs. The IR supports SSA form, explicit memory management, and hardware-specific intrinsics for maximum control over generated code.

MIC

Mind Intrinsics Compiler

MIC compiles Mind code to native machine instructions with full support for AVX-512, AVX-VNNI, and AMX on Intel processors, plus SM90 PTX for NVIDIA Hopper and Blackwell GPUs. The compiler performs aggressive optimizations while preserving programmer intent.

MAP

Mind Array Processing

MAP enables efficient array operations with automatic parallelization across CPU cores and GPU streaming multiprocessors. Perfect for batch position evaluation and neural network inference, MAP handles memory layout optimization and kernel fusion automatically.
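MAP itself is part of the Mind toolchain and not shown here; as a rough conceptual illustration in Python/NumPy (all names below are illustrative, not MAP's API), here is what batching buys: evaluating many positions as one matrix-matrix product with a fused activation, instead of many separate matrix-vector products.

```python
import numpy as np

# 32 toy "positions", each encoded as a 64-float feature vector.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 16)).astype(np.float32)
positions = rng.standard_normal((32, 64)).astype(np.float32)

# One-at-a-time evaluation: 32 separate matrix-vector products.
single = np.stack([np.maximum(p @ W, 0.0) for p in positions])

# Batched evaluation: one matrix-matrix product plus a fused ReLU,
# the kind of layout/fusion decision MAP is described as automating.
batched = np.maximum(positions @ W, 0.0)

assert np.allclose(single, batched)
```

The batched form gives hardware one large, regular workload, which is what keeps GPU streaming multiprocessors (or CPU SIMD units) saturated.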

Key Features

NNUE Evaluation

HalfKAv2 neural network architecture trained on 50M+ positions from Lichess master games, scaling toward 1B+. Features incremental accumulator updates for efficient position evaluation. Supports AVX-512, AVX-VNNI, and AMX on CPU, plus CUDA 12.x with FP8/INT8 tensor cores for GPU acceleration.
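The incremental-accumulator idea is what makes NNUE fast: a move changes only a few input features, so the first-layer sums are patched rather than recomputed. A minimal Python/NumPy sketch of the technique (toy sizes, not NIKOLA's actual network):

```python
import numpy as np

HIDDEN = 8          # toy width; real HalfKAv2 layers are far wider
NUM_FEATURES = 64   # toy feature space

rng = np.random.default_rng(42)
W = rng.standard_normal((NUM_FEATURES, HIDDEN)).astype(np.float32)

def full_refresh(active_features):
    """Recompute the accumulator from scratch: sum the active weight rows."""
    acc = np.zeros(HIDDEN, dtype=np.float32)
    for f in active_features:
        acc += W[f]
    return acc

def incremental_update(acc, added, removed):
    """Apply a move by adding/removing a few rows instead of refreshing."""
    acc = acc.copy()
    for f in added:
        acc += W[f]
    for f in removed:
        acc -= W[f]
    return acc

# A "move" that deactivates feature 10 and activates feature 15
# yields the same accumulator as a full recomputation.
before = full_refresh([3, 10, 20])
after = incremental_update(before, added=[15], removed=[10])
assert np.allclose(after, full_refresh([3, 15, 20]))
```

A quiet move touches only a handful of features, so the update cost is a few row additions regardless of network size.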

Opening Book

Comprehensive opening book with over 100 million positions extracted from master-level games, stored in mmap-friendly Polyglot format for instant lookups. Includes weighted move selection based on win rates and supports multiple opening repertoires.
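The Polyglot format stores fixed-size 16-byte big-endian entries, which is what makes it mmap-friendly. A hedged sketch of decoding entries and doing weighted move selection (toy keys and move encodings, not real book data):

```python
import random
import struct

# A Polyglot book entry is 16 bytes, big-endian:
#   8-byte Zobrist key | 2-byte move | 2-byte weight | 4-byte learn value
ENTRY = struct.Struct(">QHHI")

def parse_book(raw):
    """Decode raw book bytes into (key, move, weight, learn) tuples."""
    return [ENTRY.unpack_from(raw, off) for off in range(0, len(raw), ENTRY.size)]

def pick_move(entries, key, rng=None):
    """Weighted random choice among the book moves stored for `key`."""
    rng = rng or random.Random()
    moves = [(m, w) for (k, m, w, _learn) in entries if k == key and w > 0]
    if not moves:
        return None
    r = rng.randrange(sum(w for _, w in moves))
    for move, weight in moves:
        r -= weight
        if r < 0:
            return move

# Two candidate moves for the same position key, weighted 7:3.
raw = ENTRY.pack(0xABCD, 0x1C34, 7, 0) + ENTRY.pack(0xABCD, 0x0D24, 3, 0)
book = parse_book(raw)
assert pick_move(book, 0xABCD) in (0x1C34, 0x0D24)
assert pick_move(book, 0xFFFF) is None
```

In a real book the entries are sorted by key, so lookups binary-search the mmapped file instead of scanning a list.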

GPU Acceleration

Native support for NVIDIA data-center GPUs from A100 through Blackwell (B200/B300) architectures via Mind MAP. Leverages FP8 tensor cores for blazing-fast neural network inference and parallel search tree exploration.

Search Algorithm

State-of-the-art alpha-beta pruning with principal variation search (PVS), null move pruning, late move reductions (LMR), transposition tables with Zobrist hashing, killer moves, history heuristics, aspiration windows, and iterative deepening with pondering support.
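The core of this family of techniques is negamax alpha-beta with a transposition table. A minimal Python sketch on a toy game tree (here `id(node)` stands in for a real Zobrist hash, and LMR/PVS/null-move refinements are omitted):

```python
# Leaves are integers (scores from the side to move); interior nodes are
# lists of children. EXACT/LOWER/UPPER are the usual TT bound flags.
EXACT, LOWER, UPPER = 0, 1, 2

def negamax(node, depth, alpha, beta, tt):
    if isinstance(node, int):                # leaf: static evaluation
        return node
    key = id(node)                           # stand-in for a Zobrist hash
    hit = tt.get(key)
    if hit is not None and hit[0] >= depth:  # TT entry deep enough to use
        _, flag, score = hit
        if flag == EXACT:
            return score
        if flag == LOWER:
            alpha = max(alpha, score)
        else:
            beta = min(beta, score)
        if alpha >= beta:
            return score
    a0, best = alpha, -10**9
    for child in node:
        best = max(best, -negamax(child, depth - 1, -beta, -alpha, tt))
        alpha = max(alpha, best)
        if alpha >= beta:                    # beta cutoff: prune this node
            break
    flag = EXACT if a0 < best < beta else (LOWER if best >= beta else UPPER)
    tt[key] = (depth, flag, best)
    return best

# Minimax value of this 2-ply tree is 3 (opponent picks the smaller leaf).
tree = [[3, 5], [2, 9]]
assert negamax(tree, 2, -10**9, 10**9, {}) == 3
```

Killer moves, history heuristics, and aspiration windows all slot into this loop as move-ordering and window-sizing refinements.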

UCI Protocol

Full UCI (Universal Chess Interface) protocol support for seamless integration with popular chess GUIs including Arena, ChessBase, CuteChess, and Fritz. Also supports Lichess Bot API for automated online play and tournament participation.
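UCI is a line-oriented text protocol over stdin/stdout, which is why any GUI can drive any engine. A minimal Python sketch of the handshake (the `bestmove` is a placeholder, not NIKOLA's search):

```python
def uci_loop(read=input, write=print):
    """A minimal UCI command loop: enough to register with a chess GUI.

    A real engine parses `position` into a board and runs its search on
    `go`; here those steps are stubbed out.
    """
    while True:
        try:
            line = read().strip()
        except EOFError:
            return
        if line == "uci":
            write("id name NIKOLA")
            write("uciok")
        elif line == "isready":
            write("readyok")
        elif line.startswith("position"):
            pass                      # parse "startpos"/FEN plus moves here
        elif line.startswith("go"):
            write("bestmove e2e4")    # placeholder: the search would run here
        elif line == "quit":
            return

# Drive the loop with canned commands instead of stdin/stdout.
cmds = iter(["uci", "isready", "go depth 12", "quit"])
out = []
uci_loop(read=lambda: next(cmds), write=out.append)
assert out == ["id name NIKOLA", "uciok", "readyok", "bestmove e2e4"]
```

Injecting `read`/`write` keeps the loop testable; in production they are simply stdin and stdout.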

Data Service

Connects to dedicated chess data servers via HTTP for opening book queries, NNUE position cache lookups, and Syzygy tablebase probing. Supports local caching and fallback modes for offline operation.
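The fallback pattern is straightforward: try the remote service, and serve from the local cache when the network is unavailable. A Python sketch under assumed names (the endpoint path and JSON shape are illustrative, not NIKOLA's actual wire format):

```python
import json
import urllib.parse
import urllib.request

class BookClient:
    """Fetch book moves from a data server, with a local-cache fallback."""

    def __init__(self, base_url, cache=None):
        self.base_url = base_url
        self.cache = {} if cache is None else cache

    def lookup(self, fen):
        # Hypothetical endpoint: GET {base_url}/book?fen=<urlencoded FEN>
        url = f"{self.base_url}/book?fen={urllib.parse.quote(fen)}"
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                moves = json.load(resp)
            self.cache[fen] = moves          # keep the local cache warm
            return moves
        except OSError:                      # network down: serve from cache
            return self.cache.get(fen)

# The .invalid TLD is guaranteed unresolvable, forcing the offline path.
client = BookClient("http://invalid.invalid", cache={"start": ["e2e4", "d2d4"]})
assert client.lookup("start") == ["e2e4", "d2d4"]
assert client.lookup("unknown") is None
```

The same shape works for NNUE cache lookups and tablebase probes: every remote call degrades to a cache hit or a clean miss rather than an error.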

Technology Stack

NVIDIA CUDA

  • GPUDirect RDMA
  • NVLink 5.0
  • CUDA cuBLAS
  • Dynamic Parallelism
  • Triton Inference Server
  • cuDNN

Mind Language

  • Mind IR Compiler
  • MIC Intrinsics
  • MAP Array Processing
  • SM90 PTX Backend
  • AVX-512/AMX Support
  • OpenCL 3.0

Chess Engine

  • NNUE HalfKAv2
  • Polyglot Opening Book
  • Syzygy Tablebases
  • UCI Protocol
  • Lichess Bot API
  • FIDE Time Controls

Infrastructure

  • Spectrum-X Networking
  • InfiniBand NDR
  • HBM3e Memory
  • RAPIDS Analytics
  • Nsight Systems
  • Omniverse USD

Get in Touch

Have questions about NIKOLA or want to contribute to the project? We would love to hear from you. Join our community of chess enthusiasts and AI researchers.