mirror of https://github.com/Steffo99/unimore-hpc-assignments.git
synced 2024-11-21 23:54:25 +00:00
Update repository metadata
parent 65b88f2913
commit 12d2ff7af3
3 changed files with 59 additions and 134 deletions
@@ -1,34 +0,0 @@
-# How to reproduce the results
-
-So that the whole team can collaborate on the project, it is important that everyone knows how a given result was obtained.
-
-## How to compile
-
-To compile the code assigned to us:
-
-1. Enter the folder that contains it:
-```console
-$ cd ./atax
-```
-
-2. Run the Makefile:
-```console
-$ make atax.elf
-```
-
-## How to debug and profile
-
-I configured the [Makefile](OpenMP/linear-algebra/kernels/atax/Makefile) with a phony target that runs the program 25 times and computes the average execution time:
-
-1. Enter the folder that contains it:
-```console
-$ cd ./atax
-```
-
-2. Run the Makefile:
-```console
-$ make bench
-```
-
-> Note: this works only on UNIX-like systems!
-> Note 2: remember to run `module load cuda` and then set the correct nvcc path (both in `.vscode/c_cpp_properties.json` and in the Makefile).
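The `bench` target described above is not itself part of this diff. As a rough illustration of what "run the program N times and report the mean" involves, here is a minimal Python sketch. The function name `average_runtime` and the wall-clock measurement are assumptions made for illustration; the real Makefile may instead rely on the timer built into the benchmark, which would exclude process start-up time.

```python
import statistics
import subprocess
import sys
import time


def average_runtime(command, runs=25):
    """Run `command` `runs` times and return the mean wall-clock time in seconds.

    Note: this measures the whole process (start-up included), which is an
    assumption of this sketch, not necessarily what `make bench` reports.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True aborts the benchmark if the program exits with an error.
        subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


if __name__ == "__main__":
    # Hypothetical stand-in command; use ["./atax.elf"] after `make atax.elf`.
    mean = average_runtime([sys.executable, "-c", "pass"], runs=3)
    print(f"Average of 3 runs: {mean:.3g} seconds")
```

Unlike the Makefile target, this sketch does not depend on UNIX shell utilities, so it could also serve as a portable fallback.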
69 README.md
@@ -17,37 +17,86 @@

 ## Results

-Results can be read in the ex.txt file where we stored all the
-experiments done.
+```
+Flags: -DMINI_DATASET
+CB*** Average of 3 runs: 1.03e-05 seconds
+Flags: -DMINI_DATASET -DHPC_INCLUDE_INIT
+CB*** Average of 3 runs: 1.27e-05 seconds
+Flags: -DMINI_DATASET -DHPC_USE_CUDA
+CB*** Average of 3 runs: 0.00123 seconds
+Flags: -DMINI_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
+CB*** Average of 3 runs: 0.00161 seconds
+Flags: -DSMALL_DATASET
+CB*** Average of 3 runs: 0.0014 seconds
+Flags: -DSMALL_DATASET -DHPC_INCLUDE_INIT
+CB*** Average of 3 runs: 0.00344 seconds
+Flags: -DSMALL_DATASET -DHPC_USE_CUDA
+CB*** Average of 3 runs: 0.00971 seconds
+Flags: -DSMALL_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
+CB*** Average of 3 runs: 0.0112 seconds
+Flags: -DSTANDARD_DATASET
+CB*** Average of 3 runs: 0.0876 seconds
+Flags: -DSTANDARD_DATASET -DHPC_INCLUDE_INIT
+CB*** Average of 3 runs: 0.188 seconds
+Flags: -DSTANDARD_DATASET -DHPC_USE_CUDA
+CB*** Average of 3 runs: 0.201 seconds
+Flags: -DSTANDARD_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
+CB*** Average of 3 runs: 0.0647 seconds
+Flags: -DLARGE_DATASET
+CB*** Average of 3 runs: 0.35 seconds
+Flags: -DLARGE_DATASET -DHPC_INCLUDE_INIT
+CB*** Average of 3 runs: 0.746 seconds
+Flags: -DLARGE_DATASET -DHPC_USE_CUDA
+CB*** Average of 3 runs: 0.26 seconds
+Flags: -DLARGE_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
+CB*** Average of 3 runs: 0.278 seconds
+Flags: -DEXTRALARGE_DATASET
+CB*** Average of 3 runs: 0.789 seconds
+Flags: -DEXTRALARGE_DATASET -DHPC_INCLUDE_INIT
+CB*** Average of 3 runs: 1.68 seconds
+Flags: -DEXTRALARGE_DATASET -DHPC_USE_CUDA
+CB*** Average of 3 runs: 0.647 seconds
+Flags: -DEXTRALARGE_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
+CB*** Average of 3 runs: 0.665 seconds
+```

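To make the CPU versus CUDA comparison in the log above easier to read, the following sketch parses the `Flags:` / `CB***` line pairs and computes, for each flag set, the ratio between the CPU time and the CUDA time. The helper names `parse_bench_log` and `cuda_speedups` are hypothetical, and the regular expression is inferred from the output format shown here rather than taken from the benchmarking script. Only two datasets are pasted into the sample to keep it short.

```python
import re

# Excerpt of the log above (mini and extra-large datasets only).
SAMPLE_LOG = """\
Flags: -DMINI_DATASET
CB*** Average of 3 runs: 1.03e-05 seconds
Flags: -DMINI_DATASET -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.00123 seconds
Flags: -DEXTRALARGE_DATASET
CB*** Average of 3 runs: 0.789 seconds
Flags: -DEXTRALARGE_DATASET -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.647 seconds
"""

_TIME_RE = re.compile(r"Average of \d+ runs: (\S+) seconds")


def parse_bench_log(text):
    """Map each frozenset of compiler flags to its average time in seconds."""
    results = {}
    flags = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Flags:"):
            flags = frozenset(line[len("Flags:"):].split())
        else:
            match = _TIME_RE.search(line)
            if match and flags is not None:
                results[flags] = float(match.group(1))
                flags = None
    return results


def cuda_speedups(results):
    """Return {cpu_flags: cpu_time / cuda_time} for every CPU/CUDA pair."""
    speedups = {}
    for flags, cpu_time in results.items():
        if "-DHPC_USE_CUDA" in flags:
            continue
        cuda_time = results.get(flags | {"-DHPC_USE_CUDA"})
        if cuda_time:
            speedups[flags] = cpu_time / cuda_time
    return speedups


if __name__ == "__main__":
    for flags, speedup in cuda_speedups(parse_bench_log(SAMPLE_LOG)).items():
        print(" ".join(sorted(flags)), f"-> {speedup:.2f}x")
```

A ratio below 1 means the CUDA build was slower than the CPU build for that flag set, as happens for the mini dataset in the sample; for the extra-large dataset the ratio is about 1.22.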
 ### Validation

-* Compiler used: **nvcc**
-* Device used: `JETSON NANO DEVELOPER KIT`
-* Built on: Mon_Mar_11_22:13:24_CDT_2019 Cuda compilation tools, release 10.0, V10.0.326
+> Compiler used: **nvcc**
+> ```
+> Built on Mon_Mar_11_22:13:24_CDT_2019
+> Cuda compilation tools, release 10.0, V10.0.326
+> ```
+>
+> Device used: **Unimore Jetson Nano #8**

 To reproduce the obtained results:

-1. Clone the repository on @Steffo99's computer:
+1. Load the CUDA module:
+
+```console
+$ module load cuda
+```
+
+2. Clone the repository on @Steffo99's computer:

 ```console
 $ git clone https://github.com/Steffo99/unimore-hpc-assignments
 ```

-2. Checkout the exact commit the tests were executed on:
+3. Checkout the exact commit the tests were executed on:

 ```console
 $ git checkout d13a9b786a53d5195ae17ef7afa776e2600ce8e0
 ```

-3. Access our group's assigned folder:
+4. Access our group's assigned folder:

 ```console
 $ cd unimore-hpc-assignments/atax
 ```

-4. Run the benchmarking script:
+5. Run the benchmarking script:

 ```console
 $ make bench
90 ex.txt
@@ -1,90 +0,0 @@
-2d6448e5aa3707370b837a37db4eb880ca06ddb7
-Performed on a GTX 1070 (driver 525.60.11) with atomic add on double values.
-
-Flags: -DMINI_DATASET
-CB*** Average of 3 runs: 3.33e-06 seconds
-Flags: -DMINI_DATASET -DHPC_INCLUDE_INIT
-CB*** Average of 3 runs: 8.33e-06 seconds
-Flags: -DMINI_DATASET -DHPC_USE_CUDA
-CB*** Average of 3 runs: 6.8e-05 seconds
-Flags: -DMINI_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
-CB*** Average of 3 runs: 7.2e-05 seconds
-Flags: -DSMALL_DATASET
-CB*** Average of 3 runs: 0.000563 seconds
-Flags: -DSMALL_DATASET -DHPC_INCLUDE_INIT
-CB*** Average of 3 runs: 0.00139 seconds
-Flags: -DSMALL_DATASET -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.000229 seconds
-Flags: -DSMALL_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.000309 seconds
-Flags: -DSTANDARD_DATASET
-CB*** Average of 3 runs: 0.0276 seconds
-Flags: -DSTANDARD_DATASET -DHPC_INCLUDE_INIT
-CB*** Average of 3 runs: 0.0664 seconds
-Flags: -DSTANDARD_DATASET -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.00938 seconds
-Flags: -DSTANDARD_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.0128 seconds
-Flags: -DLARGE_DATASET
-CB*** Average of 3 runs: 0.109 seconds
-Flags: -DLARGE_DATASET -DHPC_INCLUDE_INIT
-CB*** Average of 3 runs: 0.243 seconds
-Flags: -DLARGE_DATASET -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.0449 seconds
-Flags: -DLARGE_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.0459 seconds
-Flags: -DEXTRALARGE_DATASET
-CB*** Average of 3 runs: 0.248 seconds
-Flags: -DEXTRALARGE_DATASET -DHPC_INCLUDE_INIT
-CB*** Average of 3 runs: 0.584 seconds
-Flags: -DEXTRALARGE_DATASET -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.0971 seconds
-Flags: -DEXTRALARGE_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.108 seconds
-
---------------------------------------------------------
-d13a9b786a53d5195ae17ef7afa776e2600ce8e0
-Experiment after changing an index of the vector Y;
-nothing significant changed, but I record it here.
-Performed on a Jetson Nano with atomic add on float values.
-
-Flags: -DMINI_DATASET
-CB*** Average of 3 runs: 1.03e-05 seconds
-Flags: -DMINI_DATASET -DHPC_INCLUDE_INIT
-CB*** Average of 3 runs: 1.27e-05 seconds
-Flags: -DMINI_DATASET -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.00123 seconds
-Flags: -DMINI_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.00161 seconds
-Flags: -DSMALL_DATASET
-CB*** Average of 3 runs: 0.0014 seconds
-Flags: -DSMALL_DATASET -DHPC_INCLUDE_INIT
-CB*** Average of 3 runs: 0.00344 seconds
-Flags: -DSMALL_DATASET -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.00971 seconds
-Flags: -DSMALL_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.0112 seconds
-Flags: -DSTANDARD_DATASET
-CB*** Average of 3 runs: 0.0876 seconds
-Flags: -DSTANDARD_DATASET -DHPC_INCLUDE_INIT
-CB*** Average of 3 runs: 0.188 seconds
-Flags: -DSTANDARD_DATASET -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.201 seconds
-Flags: -DSTANDARD_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.0647 seconds
-Flags: -DLARGE_DATASET
-CB*** Average of 3 runs: 0.35 seconds
-Flags: -DLARGE_DATASET -DHPC_INCLUDE_INIT
-CB*** Average of 3 runs: 0.746 seconds
-Flags: -DLARGE_DATASET -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.26 seconds
-Flags: -DLARGE_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.278 seconds
-Flags: -DEXTRALARGE_DATASET
-CB*** Average of 3 runs: 0.789 seconds
-Flags: -DEXTRALARGE_DATASET -DHPC_INCLUDE_INIT
-CB*** Average of 3 runs: 1.68 seconds
-Flags: -DEXTRALARGE_DATASET -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.647 seconds
-Flags: -DEXTRALARGE_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
-CB*** Average of 3 runs: 0.665 seconds