DESCRIPTION

    Interface to the runtime CUDA kernel compilation module.

Constructor

    MXRtc object in MXNet.
    This class allows you to write CUDA kernels in Perl
    and call them with NDArrays.

    Parameters
    ----------
    name : str
        name of the kernel
    inputs : tuple of (str, mxnet.ndarray)
        list of (name, ndarray) pairs for the inputs
    outputs : tuple of (str, mxnet.ndarray)
        list of (name, ndarray) pairs for the outputs
    kernel : str
        the actual kernel code.
        Note that this is only the body of the kernel, i.e.
        the code after { and before }. Rtc will decorate the kernel.
        For example, if name = "mykernel" and
        inputs = [('x', mx.nd.zeros((10,)))]
        outputs = [('y', mx.nd.zeros((10,)))]
        kernel = "y[threadIdx.x] = x[threadIdx.x];",
        the kernel that is compiled will be:
        extern "C" __global__ void mykernel(float *x, float *y) {
            const int x_ndim = 1;
            const int x_dims[] = { 10 };
            const int y_ndim = 1;
            const int y_dims[] = { 10 };

            y[threadIdx.x] = x[threadIdx.x];
        }
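
    The decoration step described above can be sketched as plain string
    building. This is an illustration only, not the code Rtc runs:
    decorate_kernel is a hypothetical helper, shapes are passed as plain
    tuples instead of ndarrays, and the exact text Rtc generates may differ.
    The sketch emits a void return type and array-typed _dims[] declarations
    so that the result is valid CUDA for any number of dimensions.

```python
def decorate_kernel(name, inputs, outputs, body):
    """Wrap a kernel body in a signature plus per-argument shape constants.

    inputs / outputs: lists of (name, shape) pairs, where shape is a tuple
    of ints. Hypothetical helper for illustration; not part of MXNet.
    """
    args = inputs + outputs
    # Every argument becomes a float pointer parameter, inputs first.
    params = ", ".join(f"float *{arg_name}" for arg_name, _ in args)
    lines = [f'extern "C" __global__ void {name}({params}) {{']
    # Each argument gets its rank and its dimensions as constants.
    for arg_name, shape in args:
        dims = ", ".join(str(d) for d in shape)
        lines.append(f"    const int {arg_name}_ndim = {len(shape)};")
        lines.append(f"    const int {arg_name}_dims[] = {{ {dims} }};")
    lines.append("")
    lines.append(f"    {body}")
    lines.append("}")
    return "\n".join(lines)


source = decorate_kernel(
    "mykernel",
    inputs=[("x", (10,))],
    outputs=[("y", (10,))],
    body="y[threadIdx.x] = x[threadIdx.x];",
)
print(source)
```

    Because the shape constants are baked into the generated source, the
    kernel body can refer to names such as x_ndim and x_dims directly.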

push

        Run the kernel.

        Parameters
        ----------
        inputs : list of ndarray
            list of inputs. These can be different ndarrays than those used
            for the constructor, but they must have the same shapes and be
            in the same order.
        outputs : list of ndarray
            list of outputs. These can be different ndarrays than those used
            for the constructor, but they must have the same shapes and be
            in the same order.
        grid_dims : tuple of 3 uint
            grid dimensions for the kernel launch
        block_dims : tuple of 3 uint
            block dimensions for the kernel launch
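
        As a rough sketch of how the launch parameters might be chosen for
        a one-dimensional kernel, and of the shape rule stated above, the
        helpers below work on plain tuples. Both launch_dims_1d and
        check_shapes are hypothetical names introduced for illustration,
        and the default of 256 threads per block is an assumption, not a
        value taken from MXNet.

```python
def launch_dims_1d(num_elements, threads_per_block=256):
    """Return (grid_dims, block_dims) as 3-tuples with one thread per element."""
    if num_elements <= 0:
        raise ValueError("num_elements must be positive")
    threads = min(num_elements, threads_per_block)
    # Integer ceiling division: enough blocks to cover every element.
    blocks = (num_elements + threads - 1) // threads
    return (blocks, 1, 1), (threads, 1, 1)


def check_shapes(constructor_shapes, push_shapes):
    """Raise ValueError if the arrays differ in count, order, or shape."""
    if len(constructor_shapes) != len(push_shapes):
        raise ValueError("wrong number of arrays")
    for i, (expected, got) in enumerate(zip(constructor_shapes, push_shapes)):
        if tuple(expected) != tuple(got):
            raise ValueError(f"array {i}: expected shape {expected}, got {got}")


grid_dims, block_dims = launch_dims_1d(10)
print(grid_dims, block_dims)
check_shapes([(10,)], [(10,)])
```

        For the ten-element example, this gives a single block of ten
        threads, which matches a kernel body that indexes by threadIdx.x
        alone. With more than one block, the body would need to compute a
        global index from blockIdx.x, blockDim.x and threadIdx.x, and
        guard against running past the end of the array.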