LeetGPU Solutions and Notes

Notes and solutions in PyTorch, Triton, and CUDA. Runtimes are reported for a T4 GPU.

  • LeetGPU-5: Matrix Addition

    Jan 04, 2026
    Implement a program that performs element-wise addition of two matrices containing 32-bit floating point numbers on a GPU. The program should take two input matrices of equal dimensions and produce a single output matrix containing their element-wise sum.
  • LeetGPU-3: Matrix Transpose

    Jan 03, 2026
    Write a program that transposes a matrix of 32-bit floating point numbers on a GPU. The transpose of a matrix switches its rows and columns. Given a matrix of dimensions \(rows \times cols\), the transpose will have dimensions \(cols \times rows\). All matrices are stored in row-major format.
  • LeetGPU-12: Simple Inference

    Dec 28, 2025
    Run inference on a PyTorch model. Given an input tensor and a trained torch.nn.Linear model, compute the forward pass and store the result in the output tensor.
  • LeetGPU-2: Matrix Multiplication

    Dec 17, 2025
    Write a program that multiplies two matrices of 32-bit floating point numbers on a GPU. Given matrix A of dimensions \(M \times N\) and matrix B of dimensions \(N \times K\), compute the product matrix C, which will have dimensions \(M \times K\). All matrices are stored in row-major format.
  • LeetGPU-1: Vector Addition

    Dec 14, 2025
    Implement a program that performs element-wise addition of two vectors containing 32-bit floating point numbers on a GPU. The program should take two input vectors of equal length and produce a single output vector containing their sum.
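
Several of the problems above state that matrices are stored in row-major format. As a rough illustration of what that implies for indexing, here is a minimal CPU reference sketch in plain Python. It is not taken from any of the posts and is not a GPU solution; the function names `transpose_row_major` and `matmul_row_major` are hypothetical, and flat Python lists stand in for device buffers.

```python
def transpose_row_major(a, rows, cols):
    """Transpose a rows x cols matrix stored as a flat row-major list."""
    out = [0.0] * (rows * cols)
    for r in range(rows):
        for c in range(cols):
            # Input element (r, c) lives at r * cols + c.
            # It becomes output element (c, r); the output has `rows` columns.
            out[c * rows + r] = a[r * cols + c]
    return out


def matmul_row_major(a, b, m, n, k):
    """Multiply an m x n matrix by an n x k matrix, both flat row-major lists."""
    c = [0.0] * (m * k)
    for i in range(m):
        for j in range(k):
            acc = 0.0
            for p in range(n):
                # A[i, p] is at i * n + p and B[p, j] is at p * k + j.
                acc += a[i * n + p] * b[p * k + j]
            c[i * k + j] = acc  # C[i, j] in an m x k row-major layout
    return c


if __name__ == "__main__":
    print(transpose_row_major([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2, 3))
    print(matmul_row_major([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], 2, 2, 2))
```

In a GPU version, the loops over output elements would typically be replaced by one thread (or one Triton program instance) per output element, with the same index arithmetic inside the kernel.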