|
ff36abefcd
|
Update README.md
|
2022-11-28 17:08:26 +01:00 |
|
|
72d867c287
|
Remove .clang-format
|
2022-11-28 17:02:20 +01:00 |
|
|
9972695a9f
|
Remove unused files
|
2022-11-28 17:02:03 +01:00 |
|
|
2bcfb17f7d
|
Inline i in print_array
|
2022-11-28 16:59:10 +01:00 |
|
|
4435d8325f
|
Fix y being allocated with the wrong size
|
2022-11-28 16:54:34 +01:00 |
|
|
bda70940f8
|
Re-add clean target to the makefile
|
2022-11-28 16:26:14 +01:00 |
|
|
32850e8131
|
Significantly simplify the makefile
|
2022-11-28 16:22:38 +01:00 |
|
|
da121a4cdc
|
Move polybench inside the main directory
To simplify the build chain
|
2022-11-28 16:21:51 +01:00 |
|
|
c812d85783
|
Configure VSC environment to use nvcc
|
2022-11-28 16:21:05 +01:00 |
|
|
f979f9332b
|
Remove trailing space
|
2022-11-28 16:11:00 +01:00 |
|
|
682d161b16
|
Remove -Wall and -Wextra
They do not exist in nvcc
|
2022-11-28 16:10:47 +01:00 |
|
|
44cfb43dac
|
Indent polybench_start_instruments in atax.cu
|
2022-11-28 15:48:27 +01:00 |
|
|
0eb45d1249
|
Tell VSC that .hu files are CUDA C++
|
2022-11-28 15:48:06 +01:00 |
|
|
6c97ed5107
|
Reformat atax.hu
|
2022-11-28 15:47:41 +01:00 |
|
|
518040a414
|
Allow including or excluding init_array via the POLYBENCH_INCLUDE_INIT macro
|
2022-11-28 15:44:40 +01:00 |
|
|
bf873d846c
|
Clean up and format much of the atax.cu file
|
2022-11-28 15:43:05 +01:00 |
|
|
2f476affee
|
Exclude init_array from the benchmark again
|
2022-11-28 15:26:52 +01:00 |
|
|
f4a903371a
|
Run format document to indent code using tabs
|
2022-11-28 15:26:10 +01:00 |
|
|
f0394d1b3b
|
Indent M_PI definition
|
2022-11-28 15:23:27 +01:00 |
|
|
2ab3f9b06b
|
Try fixing the makefile
|
2022-11-28 15:08:26 +01:00 |
|
|
118b18a2a1
|
Move --silent to the bench script
|
2022-11-28 14:46:43 +01:00 |
|
|
fbfa6f3b47
|
Use tabs in Makefile
|
2022-11-28 14:41:35 +01:00 |
|
|
43deb504c9
|
Remove OpenMP pragmas
|
2022-11-28 14:38:13 +01:00 |
|
|
a2a070bb3a
|
Configure makefile to use nvcc
|
2022-11-28 14:37:37 +01:00 |
|
|
be3a4ec301
|
Update README
|
2022-11-28 14:30:41 +01:00 |
|
|
140c40bf6c
|
Improve README
|
2022-11-17 20:59:38 +01:00 |
|
|
28479dfb4b
|
Cleanup comments
|
2022-11-17 19:53:46 +01:00 |
|
|
44fe50bd4a
|
Reduce EXTRALARGE_DATASET to 12000
|
2022-11-17 19:53:39 +01:00 |
|
|
bffa050239
|
Merge branch 'master' of github.com:Steffo99/unimore-hpc-1
|
2022-11-17 19:14:17 +01:00 |
|
|
097efddbe3
|
Remove -g3 CFLAG
|
2022-11-17 19:13:26 +01:00 |
|
Fabio Zanichelli
|
0ba75336e6
|
Move polybench_start to also time the initializations
|
2022-11-17 17:33:59 +01:00 |
|
|
a86d078546
|
Run bench with all dataset sizes
|
2022-11-17 14:58:32 +01:00 |
|
|
c91361ba88
|
Add indicator of progress for single runs
|
2022-11-17 03:02:20 +01:00 |
|
|
7cd8707bb9
|
Make some optimizations toggleable, so results can be compared easily
|
2022-11-17 02:59:31 +01:00 |
|
|
60a061991b
|
main : Remove commented duplicate polybench_start_instruments
|
2022-11-17 02:07:57 +01:00 |
|
|
20e653ea70
|
Fix EXTRALARGE_DATASET so it does not overflow anymore
|
2022-11-17 02:03:18 +01:00 |
|
|
e49e89817d
|
kernel_atax : Format for loops
|
2022-11-17 01:59:27 +01:00 |
|
|
1850c42a9f
|
kernel_atax : Remove nested parallelization
Seems to improve the execution time on my PC
0.0340s → 0.0317s
|
2022-11-17 01:59:09 +01:00 |
|
|
a16813dc01
|
main : Remove blank lines
|
2022-11-17 01:57:25 +01:00 |
|
|
0eb63cb684
|
init_array : Format second for loop
|
2022-11-17 01:57:13 +01:00 |
|
|
ac1ec275d7
|
print_array : Format for loop
|
2022-11-17 01:56:52 +01:00 |
|
|
c5c79e00c4
|
print_array : Add (obvious) comment
|
2022-11-17 01:56:27 +01:00 |
|
|
1e45a5adca
|
print_array : Remove newline after every 20 elements
The terminal will handle wrapping as necessary.
|
2022-11-17 01:56:08 +01:00 |
|
|
a23fd895e9
|
init_array : Parallelize the second loop
The performance hit is gone?
0.0437s → 0.0342s
|
2022-11-17 01:54:13 +01:00 |
|
FABIO ZANICHELLI
|
9dc24a3367
|
Add a reduction (it does little for now, maybe it works better with an accelerator); remove a *4 because the Jetson has 4 CPU cores
|
2022-11-16 15:01:58 -05:00 |
|
|
d89c501b59
|
kernel_atax : Parallelizing the second loop gives a nice speedup
|
2022-11-16 18:05:12 +01:00 |
|
|
9c153bb89f
|
Hide insignificant digits in the bench target
|
2022-11-16 18:03:24 +01:00 |
|
|
c104caa1a6
|
Use THREAD_COUNT instead of a fixed amount of threads
|
2022-11-16 17:51:27 +01:00 |
|
|
cc23d73254
|
Update CONTRIBUTING with the bench target
|
2022-11-16 17:44:49 +01:00 |
|
|
e23d565fd2
|
Create bench target for calculating the average of 9 runs
|
2022-11-16 17:39:09 +01:00 |
|