mirror of
https://github.com/Steffo99/unimore-hpc-assignments.git
synced 2025-02-16 17:13:57 +00:00
Document and improve a bit more
parent 2d6448e5aa
commit 86601b266e
2 changed files with 47 additions and 27 deletions
README.md (63 changed lines)
@@ -15,30 +15,61 @@
>
> Group 3: `OpenMP/linear-algebra/kernels/atax`

## Developed features

* [Workaround for unavailable `M_PI`](https://github.com/Steffo99/unimore-hpc-1/blob/bffa0502393d97e7cda4ac34c57dd9c3ac9ac9dc/OpenMP/linear-algebra/kernels/atax/atax.c#L13-L18)
* [Enabled `POLYBENCH_TIME` by default](https://github.com/Steffo99/unimore-hpc-1/blob/bffa0502393d97e7cda4ac34c57dd9c3ac9ac9dc/OpenMP/linear-algebra/kernels/atax/Makefile#L4-L5)
* [Enabled extra warnings by default](https://github.com/Steffo99/unimore-hpc-1/blob/bffa0502393d97e7cda4ac34c57dd9c3ac9ac9dc/OpenMP/linear-algebra/kernels/atax/Makefile#L6-L8)
* [Applied the maximum level of compiler optimizations](https://github.com/Steffo99/unimore-hpc-1/blob/bffa0502393d97e7cda4ac34c57dd9c3ac9ac9dc/OpenMP/linear-algebra/kernels/atax/Makefile#L9-L10)
* [Replaced `tmp` array with an iteration-local variable](https://github.com/Steffo99/unimore-hpc-1/commit/7fc2506cc7c6743288a56047cbb44e960abec4fc)
* [Allowed the addition of `CFLAGS` from `make` calls](https://github.com/Steffo99/unimore-hpc-1/blob/bffa0502393d97e7cda4ac34c57dd9c3ac9ac9dc/OpenMP/linear-algebra/kernels/atax/Makefile#L13-L14)
* [Disabled `make` output](https://github.com/Steffo99/unimore-hpc-1/commit/f655df0eb7e539b06965de7c79dbc1c7bc6a5950)
* [Created `make bench` target to run](https://github.com/Steffo99/unimore-hpc-1/blob/bffa0502393d97e7cda4ac34c57dd9c3ac9ac9dc/OpenMP/linear-algebra/kernels/atax/Makefile#L20-L23) [a script performing multiple parametrized tests on the code to determine the best optimizations](https://github.com/Steffo99/unimore-hpc-1/blob/master/OpenMP/linear-algebra/kernels/atax/.bench.sh)
* [Moved `polybench_start_instruments` to include the `init_array` function](https://github.com/Steffo99/unimore-hpc-1/commit/0ba75336e60b1cf149684a5f259fa933a36e2c5c)

## Results

TBD
```console
steffo@nitro:/s/D/W/S/u/atax[130]$ make bench
./.bench.sh
Flags: -DMINI_DATASET
CB*** Average of 3 runs: 3.33e-06 seconds
Flags: -DMINI_DATASET -DHPC_INCLUDE_INIT
CB*** Average of 3 runs: 8.33e-06 seconds
Flags: -DMINI_DATASET -DHPC_USE_CUDA
CB*** Average of 3 runs: 6.8e-05 seconds
Flags: -DMINI_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
CB*** Average of 3 runs: 7.2e-05 seconds
Flags: -DSMALL_DATASET
CB*** Average of 3 runs: 0.000563 seconds
Flags: -DSMALL_DATASET -DHPC_INCLUDE_INIT
CB*** Average of 3 runs: 0.00139 seconds
Flags: -DSMALL_DATASET -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.000229 seconds
Flags: -DSMALL_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.000309 seconds
Flags: -DSTANDARD_DATASET
CB*** Average of 3 runs: 0.0276 seconds
Flags: -DSTANDARD_DATASET -DHPC_INCLUDE_INIT
CB*** Average of 3 runs: 0.0664 seconds
Flags: -DSTANDARD_DATASET -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.00938 seconds
Flags: -DSTANDARD_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.0128 seconds
Flags: -DLARGE_DATASET
CB*** Average of 3 runs: 0.109 seconds
Flags: -DLARGE_DATASET -DHPC_INCLUDE_INIT
CB*** Average of 3 runs: 0.243 seconds
Flags: -DLARGE_DATASET -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.0449 seconds
Flags: -DLARGE_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.0459 seconds
Flags: -DEXTRALARGE_DATASET
CB*** Average of 3 runs: 0.248 seconds
Flags: -DEXTRALARGE_DATASET -DHPC_INCLUDE_INIT
CB*** Average of 3 runs: 0.584 seconds
Flags: -DEXTRALARGE_DATASET -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.0971 seconds
Flags: -DEXTRALARGE_DATASET -DHPC_INCLUDE_INIT -DHPC_USE_CUDA
CB*** Average of 3 runs: 0.108 seconds
```

### Validation

* Compiler used: **nvcc**
* Jetson Nano used: `8`
* Device used: `NVIDIA GTX 1070` with `525.60.11` driver

To reproduce the obtained results:

1. Clone the repository on a Jetson Nano:
1. Clone the repository on @Steffo99's computer:

```console
$ git clone https://github.com/Steffo99/unimore-hpc-assignments

@@ -47,7 +78,7 @@ To reproduce the obtained results:
2. Check out the exact commit the tests were executed on:

```console
$ git checkout TBD
$ git checkout 2d6448e5aa3707370b837a37db4eb880ca06ddb7
```

3. Access our group's assigned folder:

atax/atax.cu (11 changed lines)
@@ -66,27 +66,16 @@ __host__ inline static void print_cudaError(cudaError_t err, std::string txt) {
#ifndef HPC_USE_CUDA
__host__ static void init_array(DATA_TYPE* A, DATA_TYPE* X, DATA_TYPE* Y)
{
    /* X = [ 0.00, 3.14, 6.28, ... ] */
    for (unsigned int y = 0; y < NY; y++)
    {
        X[y] = y * M_PI;
    }

    /* Y = [ 0.00, 0.00, 0.00, ... ] */
    for (unsigned int x = 0; x < NY; x++)
    {
        Y[x] = 0;
    }

    /*
     * A = [
     *   [      0,      0,      0,       0, ... ],
     *   [ 1 / NX, 2 / NX, 3 / NX,  4 / NX, ... ],
     *   [ 2 / NX, 4 / NX, 6 / NX,  8 / NX, ... ],
     *   [ 3 / NX, 6 / NX, 9 / NX, 12 / NX, ... ],
     *   ...
     * ]
     */
    for (unsigned int x = 0; x < NX; x++)
    {
        for (unsigned int y = 0; y < NY; y++)