Moonlight

Moonlight is a C++/CUDA sandbox for experimenting with GPU-focused numerical computing. The goal is to prototype low-level GPU kernels and pipelines, while keeping the workflow strict and ergonomic (sanitizers, LSP, Make targets).

Rust is my main environment for systems + numerical work, but GPU kernels will live here — with ffi bridges back into Rust where needed.

Build & Run

Development

make              # build debug target
make run          # run debug build

Release

make release      # optimized build (versioned under ./target/build-<VERSION>/)

Diagnostics / Checks

Moonlight integrates multiple safety nets:

Host C++ checks:
- -fsanitize=address,undefined,leak
Device CUDA checks:
- make memcheck → run under compute-sanitizer --tool memcheck
- make racecheck → run under compute-sanitizer --tool racecheck

make memcheck
make racecheck

These are analogous to Rust’s strict runtime checks — immediate feedback when you trip on UB.

Example Pipeline

Currently, the project has a simple demo pipeline:

vector<float> pipeline(
    int n,
    float c,
    vector<float>& x,
    vector<float>& y
) {
    int blocks = (n + BLOCKSIZE - 1) / BLOCKSIZE;
    float *d_x, *d_y;

    cudaMalloc(&d_x, n * sizeof(float));
    cudaMalloc(&d_y, n * sizeof(float));

    cudaMemcpy(d_x, x.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, y.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    add<<<blocks, BLOCKSIZE>>>(n, d_x, d_y);
    scale<<<blocks, BLOCKSIZE>>>(n, c, d_y);

    cudaMemcpy(y.data(), d_y, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    cudaFree(d_y);

    return y;
}

This kernel-pipeline pattern is the core abstraction:

low-level CUDA kernels (e.g. add, scale)
C++ host pipelines wrapping them
future Rust FFI bindings will connect at the vector<float>& boundary.

Next Steps

Implement core matrix operations (matmul, transpose, reductions).
Explore memory layouts (row-major, shared memory tiling).
Build minimal Rust FFI surface on top.
Compare performance vs. pure Rust StellarMath experiments.

Notes

Built & tested on Arch Linux with CUDA 13.0.
GPU: NVIDIA RTX 2070 (compute capability sm_75).
Compiler: nvcc with clangd LSP integration.

Name		Name	Last commit message	Last commit date
Latest commit History 33 Commits
src		src
.clangd		.clangd
.gitignore		.gitignore
.ignore		.ignore
CMakeLists.txt		CMakeLists.txt
Makefile		Makefile
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Moonlight

Build & Run

Development

Release

Diagnostics / Checks

Example Pipeline

Next Steps

Notes

About

Uh oh!

Releases

Packages

Languages

cyancirrus/moonlight

Folders and files

Latest commit

History

Repository files navigation

Moonlight

Build & Run

Development

Release

Diagnostics / Checks

Example Pipeline

Next Steps

Notes

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages