1
Fork 0
mirror of https://github.com/Steffo99/unimore-hpc-assignments.git synced 2024-11-25 01:24:22 +00:00
hpc-2022-g3/README.md
2022-12-12 01:10:41 +01:00

2.9 KiB

 **Stefano Pigozzi** + **Caterina Gazzotti** + **Fabio Zanichelli** | Topic CUDA | High Performance Computing Laboratory | Unimore 

C code optimization using NVIDIA CUDA

Assignment #2

Every team is called to optimize (parallellize) the execution time of the assigned applications on multi-processor system.

Expected outcomes

  • Repository of the code (github/gitlab is ok, or .zip )
  • Oral presentation (5 min + 5 min Q&A) of your work

Assigned application

Group 3: OpenMP/linear-algebra/kernels/atax

Results

Flags: -DMINI_DATASET
CB***  Average of 3 runs:  1.03e-05 seconds
Flags: -DMINI_DATASET -DHPC_INCLUDE_INIT
CB***  Average of 3 runs:  1.27e-05 seconds
Flags: -DMINI_DATASET -DHPC_USE_CUDA
CB***  Average of 3 runs:  0.00123 seconds
Flags: -DMINI_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
CB***  Average of 3 runs:  0.00161 seconds
Flags: -DSMALL_DATASET
CB***  Average of 3 runs:  0.0014 seconds
Flags: -DSMALL_DATASET -DHPC_INCLUDE_INIT
CB***  Average of 3 runs:  0.00344 seconds
Flags: -DSMALL_DATASET -DHPC_USE_CUDA
CB***  Average of 3 runs:  0.00971 seconds
Flags: -DSMALL_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
CB***  Average of 3 runs:  0.0112 seconds
Flags: -DSTANDARD_DATASET
CB***  Average of 3 runs:  0.0876 seconds
Flags: -DSTANDARD_DATASET -DHPC_INCLUDE_INIT
CB***  Average of 3 runs:  0.188 seconds
Flags: -DSTANDARD_DATASET -DHPC_USE_CUDA
CB***  Average of 3 runs:  0.201 seconds
Flags: -DSTANDARD_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
CB***  Average of 3 runs:  0.0647 seconds
Flags: -DLARGE_DATASET
CB***  Average of 3 runs:  0.35 seconds
Flags: -DLARGE_DATASET -DHPC_INCLUDE_INIT
CB***  Average of 3 runs:  0.746 seconds
Flags: -DLARGE_DATASET -DHPC_USE_CUDA
CB***  Average of 3 runs:  0.26 seconds
Flags: -DLARGE_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
CB***  Average of 3 runs:  0.278 seconds
Flags: -DEXTRALARGE_DATASET
CB***  Average of 3 runs:  0.789 seconds
Flags: -DEXTRALARGE_DATASET -DHPC_INCLUDE_INIT
CB***  Average of 3 runs:  1.68 seconds
Flags: -DEXTRALARGE_DATASET -DHPC_USE_CUDA
CB***  Average of 3 runs:  0.647 seconds
Flags: -DEXTRALARGE_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
CB***  Average of 3 runs:  0.665 seconds

Validation

Compiler used: nvcc

Built on Mon_Mar_11_22:13:24_CDT_2019
Cuda compilation tools, release 10.0, V10.0.326

Device used: Unimore Jetson Nano #8

To reproduce the obtained results:

  1. Load the CUDA module:

    $ module load cuda
    
  2. Clone the repository on @Steffo99's computer:

    $ git clone https://github.com/Steffo99/unimore-hpc-assignments
    
  3. Checkout the exact commit the tests were executed on:

    $ git checkout d13a9b786a53d5195ae17ef7afa776e2600ce8e0
    
  4. Access our group's assigned folder:

    $ cd unimore-hpc-assignments/atax
    
  5. Run the benchmarking script:

    $ make bench