mirror of https://github.com/Steffo99/unimore-hpc-assignments.git synced 2025-02-16 17:13:57 +00:00

Document and improve a bit more

Steffo 2022-12-02 01:10:05 +01:00
parent 2d6448e5aa
commit 86601b266e
Signed by: steffo
GPG key ID: 6965406171929D01
2 changed files with 47 additions and 27 deletions


@@ -15,30 +15,61 @@
>
> Group 3: `OpenMP/linear-algebra/kernels/atax`
## Developed features
* [Workaround for unavailable `M_PI`](https://github.com/Steffo99/unimore-hpc-1/blob/bffa0502393d97e7cda4ac34c57dd9c3ac9ac9dc/OpenMP/linear-algebra/kernels/atax/atax.c#L13-L18)
* [Enabled `POLYBENCH_TIME` by default](https://github.com/Steffo99/unimore-hpc-1/blob/bffa0502393d97e7cda4ac34c57dd9c3ac9ac9dc/OpenMP/linear-algebra/kernels/atax/Makefile#L4-L5)
* [Enabled extra warnings by default](https://github.com/Steffo99/unimore-hpc-1/blob/bffa0502393d97e7cda4ac34c57dd9c3ac9ac9dc/OpenMP/linear-algebra/kernels/atax/Makefile#L6-L8)
* [Applied the maximum level of compiler optimizations](https://github.com/Steffo99/unimore-hpc-1/blob/bffa0502393d97e7cda4ac34c57dd9c3ac9ac9dc/OpenMP/linear-algebra/kernels/atax/Makefile#L9-L10)
* [Replaced `tmp` array with an iteration-local variable](https://github.com/Steffo99/unimore-hpc-1/commit/7fc2506cc7c6743288a56047cbb44e960abec4fc)
* [Allowed the addition of `CFLAGS` from `make` calls](https://github.com/Steffo99/unimore-hpc-1/blob/bffa0502393d97e7cda4ac34c57dd9c3ac9ac9dc/OpenMP/linear-algebra/kernels/atax/Makefile#L13-L14)
* [Disabled `make` output](https://github.com/Steffo99/unimore-hpc-1/commit/f655df0eb7e539b06965de7c79dbc1c7bc6a5950)
* [Created `make bench` target to run](https://github.com/Steffo99/unimore-hpc-1/blob/bffa0502393d97e7cda4ac34c57dd9c3ac9ac9dc/OpenMP/linear-algebra/kernels/atax/Makefile#L20-L23) [a script performing multiple parametrized tests on the code to determine the best optimizations](https://github.com/Steffo99/unimore-hpc-1/blob/master/OpenMP/linear-algebra/kernels/atax/.bench.sh)
* [Moved `polybench_start_instruments` to include the `init_array` function](https://github.com/Steffo99/unimore-hpc-1/commit/0ba75336e60b1cf149684a5f259fa933a36e2c5c)
## Results
TBD
```console
steffo@nitro:/s/D/W/S/u/atax[130]$ make bench
./.bench.sh
Flags: -DMINI_DATASET
CB*** Average of 3 runs: 3.33e-06 seconds
Flags: -DMINI_DATASET -DHPC_INCLUDE_INIT
CB*** Average of 3 runs: 8.33e-06 seconds
Flags: -DMINI_DATASET -DHPC_USE_CUDA
CB*** Average of 3 runs: 6.8e-05 seconds
Flags: -DMINI_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
CB*** Average of 3 runs: 7.2e-05 seconds
Flags: -DSMALL_DATASET
CB*** Average of 3 runs: 0.000563 seconds
Flags: -DSMALL_DATASET -DHPC_INCLUDE_INIT
CB*** Average of 3 runs: 0.00139 seconds
Flags: -DSMALL_DATASET -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.000229 seconds
Flags: -DSMALL_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.000309 seconds
Flags: -DSTANDARD_DATASET
CB*** Average of 3 runs: 0.0276 seconds
Flags: -DSTANDARD_DATASET -DHPC_INCLUDE_INIT
CB*** Average of 3 runs: 0.0664 seconds
Flags: -DSTANDARD_DATASET -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.00938 seconds
Flags: -DSTANDARD_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.0128 seconds
Flags: -DLARGE_DATASET
CB*** Average of 3 runs: 0.109 seconds
Flags: -DLARGE_DATASET -DHPC_INCLUDE_INIT
CB*** Average of 3 runs: 0.243 seconds
Flags: -DLARGE_DATASET -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.0449 seconds
Flags: -DLARGE_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.0459 seconds
Flags: -DEXTRALARGE_DATASET
CB*** Average of 3 runs: 0.248 seconds
Flags: -DEXTRALARGE_DATASET -DHPC_INCLUDE_INIT
CB*** Average of 3 runs: 0.584 seconds
Flags: -DEXTRALARGE_DATASET -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.0971 seconds
Flags: -DEXTRALARGE_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.108 seconds
```
### Validation
* Compiler used: **nvcc**
* Jetson Nano used: `8`
* Device used: `NVIDIA GTX 1070` with `525.60.11` driver
To reproduce the obtained results:
1. Clone the repository on a Jetson Nano:
1. Clone the repository on @Steffo99's computer:
```console
$ git clone https://github.com/Steffo99/unimore-hpc-assignments
@@ -47,7 +78,7 @@ To reproduce the obtained results:
2. Checkout the exact commit the tests were executed on:
```console
$ git checkout TBD
$ git checkout 2d6448e5aa3707370b837a37db4eb880ca06ddb7
```
3. Access our group's assigned folder:


@@ -66,27 +66,16 @@ __host__ inline static void print_cudaError(cudaError_t err, std::string txt) {
#ifndef HPC_USE_CUDA
__host__ static void init_array(DATA_TYPE* A, DATA_TYPE* X, DATA_TYPE* Y)
{
/* X = [ 0.00, 3.14, 6.28, ... ] */
for (unsigned int y = 0; y < NY; y++)
{
X[y] = y * M_PI;
}
/* Y = [ 0.00, 0.00, 0.00, ... ] */
for (unsigned int x = 0; x < NY; x++)
{
Y[x] = 0;
}
/*
* A = [
* [ 0, 0, 0, 0, ... ],
* [ 1 / NX, 2 / NX, 3 / NX, 4 / NX, ... ],
* [ 2 / NX, 4 / NX, 6 / NX, 8 / NX, ... ],
* [ 3 / NX, 6 / NX, 9 / NX, 12 / NX, ... ],
* ...
* ]
*/
for (unsigned int x = 0; x < NX; x++)
{
for (unsigned int y = 0; y < NY; y++)