mirror of https://github.com/Steffo99/unimore-hpc-assignments.git synced 2024-11-25 17:44:23 +00:00
Completed High Performance Computing laboratory projects

 **Stefano Pigozzi** + **Caterina Gazzotti** + **Fabio Zanichelli** | Topic OpenMP | High Performance Computing Laboratory | Unimore 

C code optimization using OpenMP

Assignment #1

Every team is asked to optimize the execution time of its assigned application on a multi-processor system by parallelizing it.

Expected outcomes

  • Repository of the code (GitHub/GitLab is fine, or a .zip archive)
  • Oral presentation (5 min + 5 min Q&A) of your work

Assigned application

Group 3: OpenMP/linear-algebra/kernels/atax
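
For context, atax computes y = Aᵀ(Ax). The block below is a minimal sketch of how the two phases of that computation could be parallelized with OpenMP; it is not the code in this repository. The function name `atax_sketch`, the row-major layout, and the `SKETCH_PARALLEL_*` macros are assumptions made for illustration, and the `TOGGLE_*` flags used in the benchmark may guard different changes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch of atax: y = A^T * (A * x).
 * A is an nx-by-ny matrix stored row-major; tmp has nx elements,
 * x and y have ny elements. The SKETCH_PARALLEL_* macros are
 * illustrative names, not flags from the repository. */
void atax_sketch(size_t nx, size_t ny, const double *A, const double *x,
                 double *y, double *tmp)
{
    assert(nx > 0 && ny > 0);

    /* Phase 1: tmp = A * x. Each row i writes only tmp[i],
     * so the outer loop can be split across threads. */
#ifdef SKETCH_PARALLEL_TMP
#pragma omp parallel for
#endif
    for (size_t i = 0; i < nx; i++) {
        double acc = 0.0;
        for (size_t j = 0; j < ny; j++)
            acc += A[i * ny + j] * x[j];
        tmp[i] = acc;
    }

    /* Phase 2: y = A^T * tmp. Using the column index j as the outer
     * loop means each iteration writes only y[j], which avoids
     * concurrent updates to the same element. */
#ifdef SKETCH_PARALLEL_Y
#pragma omp parallel for
#endif
    for (size_t j = 0; j < ny; j++) {
        double acc = 0.0;
        for (size_t i = 0; i < nx; i++)
            acc += A[i * ny + j] * tmp[i];
        y[j] = acc;
    }
}
```

With gcc, the pragmas only take effect when the file is compiled with `-fopenmp` and the corresponding macro is defined (for example `-DSKETCH_PARALLEL_Y`); otherwise the function runs serially and produces the same result.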

Developed features

Results

$ make bench
Flags: -DMINI_DATASET
.........................  Average of 25 runs:  1.35e-05 seconds
Flags: -DMINI_DATASET -DTOGGLE_INIT_ARRAY_1
.........................  Average of 25 runs:  1.92e-05 seconds
Flags: -DMINI_DATASET -DTOGGLE_INIT_ARRAY_2
.........................  Average of 25 runs:  2.16e-05 seconds
Flags: -DMINI_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2
.........................  Average of 25 runs:  2.61e-05 seconds
Flags: -DMINI_DATASET -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  1.9e-05 seconds
Flags: -DMINI_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  2.12e-05 seconds
Flags: -DMINI_DATASET -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  2.36e-05 seconds
Flags: -DMINI_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  2.58e-05 seconds
Flags: -DMINI_DATASET -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  1.72e-05 seconds
Flags: -DMINI_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  1.91e-05 seconds
Flags: -DMINI_DATASET -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  2.12e-05 seconds
Flags: -DMINI_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  2.32e-05 seconds
Flags: -DMINI_DATASET -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  1.92e-05 seconds
Flags: -DMINI_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  2.11e-05 seconds
Flags: -DMINI_DATASET -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  2.3e-05 seconds
Flags: -DMINI_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  2.6e-05 seconds
Flags: -DSMALL_DATASET
.........................  Average of 25 runs:  0.00751 seconds
Flags: -DSMALL_DATASET -DTOGGLE_INIT_ARRAY_1
.........................  Average of 25 runs:  0.00752 seconds
Flags: -DSMALL_DATASET -DTOGGLE_INIT_ARRAY_2
.........................  Average of 25 runs:  0.00279 seconds
Flags: -DSMALL_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2
.........................  Average of 25 runs:  0.0028 seconds
Flags: -DSMALL_DATASET -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  0.00761 seconds
Flags: -DSMALL_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  0.00761 seconds
Flags: -DSMALL_DATASET -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  0.00289 seconds
Flags: -DSMALL_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  0.00293 seconds
Flags: -DSMALL_DATASET -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.00707 seconds
Flags: -DSMALL_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.00703 seconds
Flags: -DSMALL_DATASET -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.00228 seconds
Flags: -DSMALL_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.00227 seconds
Flags: -DSMALL_DATASET -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.00707 seconds
Flags: -DSMALL_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.00706 seconds
Flags: -DSMALL_DATASET -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.00228 seconds
Flags: -DSMALL_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.00228 seconds
Flags: -DSTANDARD_DATASET
.........................  Average of 25 runs:  0.419 seconds
Flags: -DSTANDARD_DATASET -DTOGGLE_INIT_ARRAY_1
.........................  Average of 25 runs:  0.419 seconds
Flags: -DSTANDARD_DATASET -DTOGGLE_INIT_ARRAY_2
.........................  Average of 25 runs:  0.162 seconds
Flags: -DSTANDARD_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2
.........................  Average of 25 runs:  0.162 seconds
Flags: -DSTANDARD_DATASET -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  0.42 seconds
Flags: -DSTANDARD_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  0.42 seconds
Flags: -DSTANDARD_DATASET -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  0.162 seconds
Flags: -DSTANDARD_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  0.162 seconds
Flags: -DSTANDARD_DATASET -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.386 seconds
Flags: -DSTANDARD_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.386 seconds
Flags: -DSTANDARD_DATASET -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.128 seconds
Flags: -DSTANDARD_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.128 seconds
Flags: -DSTANDARD_DATASET -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.386 seconds
Flags: -DSTANDARD_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.386 seconds
Flags: -DSTANDARD_DATASET -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.128 seconds
Flags: -DSTANDARD_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.129 seconds
Flags: -DLARGE_DATASET
.........................  Average of 25 runs:  1.83 seconds
Flags: -DLARGE_DATASET -DTOGGLE_INIT_ARRAY_1
.........................  Average of 25 runs:  1.83 seconds
Flags: -DLARGE_DATASET -DTOGGLE_INIT_ARRAY_2
.........................  Average of 25 runs:  0.707 seconds
Flags: -DLARGE_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2
.........................  Average of 25 runs:  0.707 seconds
Flags: -DLARGE_DATASET -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  1.82 seconds
Flags: -DLARGE_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  1.82 seconds
Flags: -DLARGE_DATASET -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  0.704 seconds
Flags: -DLARGE_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  0.703 seconds
Flags: -DLARGE_DATASET -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  1.64 seconds
Flags: -DLARGE_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  1.64 seconds
Flags: -DLARGE_DATASET -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.527 seconds
Flags: -DLARGE_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.527 seconds
Flags: -DLARGE_DATASET -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  1.63 seconds
Flags: -DLARGE_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  1.64 seconds
Flags: -DLARGE_DATASET -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.527 seconds
Flags: -DLARGE_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  0.525 seconds
Flags: -DEXTRALARGE_DATASET
.........................  Average of 25 runs:  4.24 seconds
Flags: -DEXTRALARGE_DATASET -DTOGGLE_INIT_ARRAY_1
.........................  Average of 25 runs:  4.23 seconds
Flags: -DEXTRALARGE_DATASET -DTOGGLE_INIT_ARRAY_2
.........................  Average of 25 runs:  1.65 seconds
Flags: -DEXTRALARGE_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2
.........................  Average of 25 runs:  1.65 seconds
Flags: -DEXTRALARGE_DATASET -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  4.22 seconds
Flags: -DEXTRALARGE_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  4.16 seconds
Flags: -DEXTRALARGE_DATASET -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  1.62 seconds
Flags: -DEXTRALARGE_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1
.........................  Average of 25 runs:  1.62 seconds
Flags: -DEXTRALARGE_DATASET -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  3.69 seconds
Flags: -DEXTRALARGE_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  3.68 seconds
Flags: -DEXTRALARGE_DATASET -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  1.2 seconds
Flags: -DEXTRALARGE_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  1.2 seconds
Flags: -DEXTRALARGE_DATASET -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  3.67 seconds
Flags: -DEXTRALARGE_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  3.67 seconds
Flags: -DEXTRALARGE_DATASET -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  1.19 seconds
Flags: -DEXTRALARGE_DATASET -DTOGGLE_INIT_ARRAY_1 -DTOGGLE_INIT_ARRAY_2 -DTOGGLE_KERNEL_ATAX_1 -DTOGGLE_KERNEL_ATAX_2
.........................  Average of 25 runs:  1.19 seconds
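
To read the numbers above as speedups, each average can be divided into the average of the run with no toggles for the same dataset. The sketch below does this for the first and last configuration of each dataset; the timings are copied from the output above, while the names `bench_row`, `BENCH_ROWS`, `speedup` and `print_speedups` are hypothetical helpers introduced only for this example.

```c
#include <assert.h>
#include <stdio.h>

/* One row per dataset: average time with no toggles and with all four
 * toggles enabled, as reported by `make bench` above. */
struct bench_row {
    const char *dataset;
    double baseline_s;
    double all_toggles_s;
};

static const struct bench_row BENCH_ROWS[] = {
    {"MINI",       1.35e-05, 2.6e-05},
    {"SMALL",      0.00751,  0.00228},
    {"STANDARD",   0.419,    0.129},
    {"LARGE",      1.83,     0.525},
    {"EXTRALARGE", 4.24,     1.19},
};

enum { BENCH_ROW_COUNT = sizeof BENCH_ROWS / sizeof BENCH_ROWS[0] };

/* Speedup relative to the baseline: values above 1 mean the variant
 * is faster, values below 1 mean it is slower. */
double speedup(double baseline_s, double variant_s)
{
    assert(baseline_s > 0.0 && variant_s > 0.0);
    return baseline_s / variant_s;
}

/* Print one line per dataset with the computed speedup. */
void print_speedups(void)
{
    for (int i = 0; i < BENCH_ROW_COUNT; i++) {
        printf("%-10s %.2fx\n", BENCH_ROWS[i].dataset,
               speedup(BENCH_ROWS[i].baseline_s, BENCH_ROWS[i].all_toggles_s));
    }
}
```

Calling `print_speedups()` from a `main` function shows a speedup between roughly 3.2x and 3.6x on the SMALL to EXTRALARGE datasets, and a value below 1 on MINI, where the variant with all toggles is slower than the baseline.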

Validation

  • Compiler used: gcc
  • Jetson Nano used: 8

To reproduce the obtained results:

  1. Clone the repository on a Jetson Nano:

    $ git clone https://github.com/Steffo99/unimore-hpc-1
    
  2. Access our group's assigned folder:

    $ cd unimore-hpc-1/OpenMP/linear-algebra/kernels/atax
    
  3. Check out the exact commit the tests were executed on:

    $ git checkout 28479dfb4b730fb50a50e6da02b9b1fe4fb298db
    
  4. Run the benchmarking script:

    $ make bench