Axono: Deep Learning Framework Overview


Here’s a comprehensive breakdown of Axono:


What It Is

A from-scratch deep learning framework built in C++ with Python bindings — similar in spirit to a mini PyTorch, targeting education, research, and prototyping. Made by ByteRainTech, version 0.2.0, Apache 2.0 licensed.


Architecture

Layer Stack

Python API (axono.*)
      ↕  pybind11
C++ Core (libaxono) — shared .so/.dll
      ↕
CPU backend (OpenMP + AVX2/AVX SIMD)
CUDA backend (NVIDIA GPUs)

Build System


Core Components

axono.core.Tensor

Python wrapper around a C++ Tensor class. Key capabilities:
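The capability list itself did not survive here, but based on the implemented surface noted later (from_numpy/to_numpy, shape handling), a wrapper in this spirit can be sketched in pure Python. Everything below is a hypothetical stand-in, not Axono's actual API; data lives in a flat buffer plus a shape tuple, mirroring how a C++ core typically stores a contiguous array:

```python
# Illustrative stand-in for a Tensor wrapper; all names are hypothetical.
# A flat row-major list plays the role of the C++ core's contiguous buffer.

class Tensor:
    def __init__(self, data, shape):
        self.data = list(data)    # flat buffer, row-major
        self.shape = tuple(shape)

    @classmethod
    def from_nested(cls, rows):
        """Build a 2-D tensor from a list of lists (analogous to from_numpy)."""
        flat = [x for row in rows for x in row]
        return cls(flat, (len(rows), len(rows[0])))

    def to_nested(self):
        """Convert back to a list of lists (analogous to to_numpy)."""
        r, c = self.shape
        return [self.data[i * c:(i + 1) * c] for i in range(r)]

t = Tensor.from_nested([[1, 2], [3, 4]])
print(t.shape)        # (2, 2)
print(t.to_nested())  # [[1, 2], [3, 4]]
```

The point of the wrapper pattern is that Python owns only metadata and conversion logic, while the real storage and kernels stay on the C++ side.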

axono.core.operators / axono.core.ops

Thin Python wrappers that delegate to C++ implementations:

Each operator has separate CPU and CUDA kernel headers under include/axono/ops/{cpu,cuda}/.
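In this layered design, a Python-side operator typically just validates arguments and forwards to whichever backend kernel is registered. A stdlib-only sketch of that dispatch pattern (hypothetical names; the real wrappers call into libaxono via pybind11 rather than Python functions):

```python
# Sketch of thin operator wrappers dispatching to per-backend kernels,
# mirroring the include/axono/ops/{cpu,cuda}/ split. Names are hypothetical.

def _cpu_relu(xs):
    return [x if x > 0 else 0 for x in xs]

def _cpu_add(xs, ys):
    return [a + b for a, b in zip(xs, ys)]

# One kernel table per backend; a "cuda" table would hold device kernels.
_KERNELS = {
    "cpu": {"relu": _cpu_relu, "add": _cpu_add},
}

def dispatch(op, backend, *args):
    """Look up the kernel for (backend, op) and invoke it."""
    try:
        kernel = _KERNELS[backend][op]
    except KeyError:
        raise NotImplementedError(f"{op!r} has no {backend!r} kernel")
    return kernel(*args)

print(dispatch("relu", "cpu", [-1.0, 2.0]))    # [0, 2.0]
print(dispatch("add", "cpu", [1, 2], [3, 4]))  # [4, 6]
```

Keeping the wrappers thin means new backends only need to register kernels in the table; the Python surface stays unchanged.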

axono.nn


C++ Internals (from headers)

include/axono/core/
├── tensor.h          # Core Tensor class
├── types.h           # DataType enum, Status codes
├── ops.h             # Op dispatch interface
├── module.h          # C++ Module base
├── macros.h          # Cross-platform macros
├── cpu/tensor/
│   ├── kernel.h      # CPU compute kernels
│   └── transpose.h   # CPU transpose
└── cuda/
    ├── detail.h      # CUDA helpers
    └── tensor/
        ├── kernel.h  # CUDA kernels
        └── transpose.h

Performance (CPU benchmarks vs PyTorch & PaddlePaddle)

Op           Axono vs PyTorch (4000×4000)
from_numpy   ~15× slower
add          comparable (~0.06 s vs ~0.07 s)
relu         ~3× slower
matmul       ~10× slower

Matmul is the weakest point: PyTorch calls highly optimized BLAS libraries (MKL/OpenBLAS), while Axono uses hand-written SIMD kernels that are educational but not production-tuned.
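The gap is less about SIMD per se than about memory access patterns: tuned BLAS kernels reorder and block loops to keep data in cache. One such trick can be shown in pure Python, swapping the classic i-j-k loop order for i-k-j so the inner loop streams over contiguous rows (this is a generic illustration, not Axono's kernel):

```python
# Two pure-Python matmuls over row-major nested lists, illustrating why
# kernel structure matters: i-k-j walks B and C row-wise (cache-friendly),
# while i-j-k strides down B's columns.

def matmul_ijk(A, B):
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = 0.0
            for p in range(k):
                s += A[i][p] * B[p][j]   # column-strided access to B
            C[i][j] = s
    return C

def matmul_ikj(A, B):
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        Ci = C[i]
        for p in range(k):
            a, Bp = A[i][p], B[p]
            for j in range(m):           # contiguous, row-wise access
                Ci[j] += a * Bp[j]
    return C

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
assert matmul_ijk(A, B) == matmul_ikj(A, B) == [[19.0, 22.0], [43.0, 50.0]]
```

Production BLAS goes much further (cache blocking, register tiling, vectorization, threading), which is why a ~10× gap against hand-written kernels is unsurprising.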


Examples

The examples/ directory shows the aspirational API (not all implemented yet):


Current State (v0.2.0)

Implemented: Tensor, from_numpy/to_numpy, matmul, add, relu, Linear layer, Module base class

Not yet implemented (referenced in examples): DataLoader, CNN models, Trainer, optimizers (Adam), visualization tools, no_grad() context
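The implemented surface (a Module base class plus a Linear layer) can be sketched in pure Python. All names and signatures here are hypothetical, patterned on the list above; the real classes wrap C++ objects:

```python
# Minimal Module/Linear sketch in the spirit of axono.nn. Hypothetical
# signatures; parameters are plain nested lists instead of Tensors.

class Module:
    def __call__(self, *args):
        return self.forward(*args)

    def parameters(self):
        """Collect attributes that look like parameters (lists) for this sketch."""
        return [v for v in vars(self).values() if isinstance(v, list)]

class Linear(Module):
    def __init__(self, in_features, out_features):
        # Deterministic init to keep the sketch reproducible; a real layer
        # would draw random weights.
        self.weight = [[0.1] * in_features for _ in range(out_features)]
        self.bias = [0.0] * out_features

    def forward(self, x):
        """x: flat list of in_features values -> list of out_features values."""
        return [
            sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(self.weight, self.bias)
        ]

layer = Linear(3, 2)
y = layer([1.0, 2.0, 3.0])     # each output ≈ 0.6 with the 0.1 init
print(len(layer.parameters())) # 2 (weight and bias)
```

The Module/forward/parameters trio is the core contract a future optimizer (e.g. the planned Adam) would build on: it iterates parameters() and updates them after each backward pass.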

This is an early-stage educational framework with a solid C++/Python foundation and a clearly mapped-out roadmap toward a more complete PyTorch-like API.

