[BUG] Overflow in _C_make_dataobj due to c_int type #2777

@kelv1n9

Description

When running large 3D grids (e.g., 1295^3 points), Devito fails due to an integer overflow in the C struct definition in devito/types/dense.py.
The problem originates from the use of ctypes.c_int (a 32-bit signed integer), which overflows when the array size exceeds 2^31-1 elements, even if index-mode=int64 and linearize=True are enabled.

The overflowed value is then misinterpreted as the memory size, and GPU memory allocation crashes.
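The wraparound can be illustrated outside Devito with plain ctypes. This is a minimal sketch, not Devito code: the variable names are hypothetical, and it assumes a platform where C int is 32 bits wide (true on mainstream systems).

```python
import ctypes

n_points = 1295 ** 3          # 2_171_747_375 grid points
int32_max = 2 ** 31 - 1       # 2_147_483_647

# ctypes performs no range check on integer types: the value is
# silently truncated to the width of the C type.
wrapped = ctypes.c_int(n_points).value
as_int64 = ctypes.c_int64(n_points).value

print(n_points > int32_max)   # True
print(wrapped)                # -2123219921 (with a 32-bit C int)
print(as_int64)               # 2171747375
```

The 32-bit field turns the point count into a negative number, while a 64-bit field preserves it.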

File

devito/types/dense.py

Affected Section

_C_ctype = POINTER(type(_C_structname, (Structure,),
                            {'_fields_': [(_C_field_data, c_restrict_void_p),
                                          (_C_field_size, POINTER(c_int)),
                                          (_C_field_nbytes, c_ulong),
                                          (_C_field_nopad_size, POINTER(c_ulong)),
                                          (_C_field_domain_size, POINTER(c_ulong)),
                                          (_C_field_halo_size, POINTER(c_int)),
                                          (_C_field_halo_ofs, POINTER(c_int)),
                                          (_C_field_owned_ofs, POINTER(c_int)),
                                          (_C_field_dmap, c_void_p)]}))

Proposed Fix

Replace c_int with c_long in the _C_ctype struct definition to safely handle arrays with more than 2^31-1 elements:

from ctypes import c_long

_C_ctype = POINTER(type(_C_structname, (Structure,),
                            {'_fields_': [(_C_field_data, c_restrict_void_p),
                                          (_C_field_size, POINTER(c_long)),
                                          (_C_field_nbytes, c_ulong),
                                          (_C_field_nopad_size, POINTER(c_ulong)),
                                          (_C_field_domain_size, POINTER(c_ulong)),
                                          (_C_field_halo_size, POINTER(c_long)),
                                          (_C_field_halo_ofs, POINTER(c_long)),
                                          (_C_field_owned_ofs, POINTER(c_long)),
                                          (_C_field_dmap, c_void_p)]}))

After this modification, all large-domain runs succeed without overflow.
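One caveat worth checking before adopting the change: the width of c_long depends on the platform ABI (64 bits on 64-bit Linux and macOS, 32 bits on 64-bit Windows), whereas c_int64 is always 64 bits. The sketch below uses a hypothetical helper, holds_point_count, to compare the candidates; using c_int64 instead of c_long is an assumption on my side, not something verified against Devito, and any generated C struct declaration would have to use the matching type.

```python
import ctypes

def holds_point_count(ctype, n_points):
    """Return True if a signed integer ctype can represent n_points."""
    bits = 8 * ctypes.sizeof(ctype)
    return -(2 ** (bits - 1)) <= n_points <= 2 ** (bits - 1) - 1

n_points = 1295 ** 3

# Report width and suitability of each candidate field type.
for ctype in (ctypes.c_int, ctypes.c_long, ctypes.c_int64):
    print(ctype.__name__, ctypes.sizeof(ctype), holds_point_count(ctype, n_points))
```

On the Linux cluster described in this report, c_long and c_int64 should both be 8 bytes wide, so the proposed fix behaves the same there.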

Steps to Reproduce:
1. Run a large 3D simulation with more than 2^31-1 grid points (e.g., a 1295^3 grid).
2. Observe the runtime error: a CUDA memory allocation failure triggered by an invalid size.
3. Inspect dense.py and note the use of c_int for the size fields.

Observed Behavior

Out of memory allocating 18446744065653020036 bytes of device memory
Failing in Thread:1
Accelerator Fatal Error: call to cuMemAlloc returned error 2 (CUDA_ERROR_OUT_OF_MEMORY)

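The reported allocation size is clearly invalid (≈ 1.8×10^19 bytes, just below 2^64). This is consistent with a negative size, produced by signed 32-bit overflow, being reinterpreted as an unsigned 64-bit value.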
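The reported number can be decoded by reading it back as a signed 64-bit integer. This is a small arithmetic sketch with hypothetical variable names; it only reinterprets the value from the error message and does not reproduce how the size is computed internally.

```python
reported = 18446744065653020036   # byte count from the error message

# Reinterpret the unsigned 64-bit value as a signed 64-bit integer.
signed = reported - 2 ** 64 if reported >= 2 ** 63 else reported

print(signed)                     # -8056531580
print(2 ** 64 - reported)         # 8056531580
```

The request is therefore about -8.06×10^9 bytes: a negative size, which matches what an overflowed signed size field would produce.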

Expected Behavior

Correct allocation for the 24.5 GB domain (~2.17×10^9 elements), with no overflow and normal execution.

Environment
• Devito version: 4.8.20
• Backend: OpenACC / NVHPC 25.1
• MPI: Enabled
• Hardware: NVIDIA H100 (80 GB HBM)
• OS: HPC cluster environment - Linux
• Python: 3.10 (NVHPC env)

Fix verified on GPU backends - replacing c_int with c_long solves the issue fully.
