bda-6-steffo

mirror of https://github.com/Steffo99/unimore-bda-6.git synced 2024-11-25 17:24:20 +00:00

Author	SHA1	Message	Date
Stefano Pigozzi	3d9eeecb2a	Improve the tokenizer situation	2023-02-10 05:12:07 +01:00
Stefano Pigozzi	0ce584e856	Fix tensorspec error	2023-02-10 04:21:08 +01:00
Stefano Pigozzi	0a4ce38982	Use the correct sentiment analyzer in main	2023-02-10 04:17:50 +01:00
Stefano Pigozzi	8907272002	Configure file logging	2023-02-10 04:07:34 +01:00
Stefano Pigozzi	2bd97cbad2	Count runs	2023-02-10 03:32:31 +01:00
Stefano Pigozzi	b4573d5eab	Improve logging	2023-02-10 03:30:41 +01:00
Stefano Pigozzi	14dd8045e0	Add docstring for the root module	2023-02-10 03:27:35 +01:00
Stefano Pigozzi	48ac00b548	Document and fix imports in `.tokenizer`	2023-02-10 03:19:17 +01:00
Stefano Pigozzi	1294335da0	Document and fix imports in `.database`	2023-02-10 03:18:45 +01:00
Stefano Pigozzi	218c91bcc1	Document and fix imports in `.analysis`	2023-02-10 03:17:36 +01:00
Stefano Pigozzi	7e1b4cfc71	Use staging area in IntelliJ	2023-02-10 03:17:20 +01:00
Stefano Pigozzi	1809db5f00	Refactor code	2023-02-09 18:54:58 +01:00
Stefano Pigozzi	704624507a	Add missing file	2023-02-08 19:48:41 +01:00
Stefano Pigozzi	027f8e07e8	Add validation to gitignore	2023-02-08 19:48:32 +01:00
Stefano Pigozzi	e3005ab8b0	enough	2023-02-08 19:46:05 +01:00
Stefano Pigozzi	4d6c8f0fee	stuff's working	2023-02-08 10:54:14 +01:00
Stefano Pigozzi	c31743f066	back to i have no idea of what's happening, but at least it works	2023-02-07 10:22:09 +01:00
Stefano Pigozzi	e9a4421acd	Now I understand text vectorization (but this still does not work)	2023-02-06 01:12:30 +01:00
Stefano Pigozzi	3abba24ca2	Made good progress. How does text vectorization in tensorflow work?	2023-02-05 17:40:22 +01:00
Stefano Pigozzi	dcfc4fbc3b	Getting closer...	2023-02-04 06:14:24 +01:00
Stefano Pigozzi	02f10e6ae4	Use a class as DataTuple	2023-02-04 05:34:56 +01:00
Stefano Pigozzi	4af654a2fa	Make the namedtuple verbose	2023-02-04 05:29:03 +01:00
Stefano Pigozzi	2675f5ead8	Use `float` instead of `str` as `Category`	2023-02-04 05:28:18 +01:00
Stefano Pigozzi	4f24d399b8	Convert `DataTuple` to a `collections.namedtuple`	2023-02-04 05:16:54 +01:00
Stefano Pigozzi	e6dcf6e423	stop here for now	2023-02-04 01:36:42 +01:00
Stefano Pigozzi	6ef81c1c19	New version working nicely	2023-02-03 23:27:44 +01:00
Stefano Pigozzi	379cbdd13a	PEP8	2023-02-03 17:50:40 +01:00
Stefano Pigozzi	f7ef9b5ac2	Make more progress	2023-02-03 03:24:23 +01:00
Stefano Pigozzi	32cd81bca6	it works, but at what cost	2023-02-03 02:49:14 +01:00
Stefano Pigozzi	4e1a9f842f	Fix VanillaSA to work with iterators	2023-02-03 02:10:00 +01:00
Stefano Pigozzi	767a6087a8	Improve logging	2023-02-02 17:46:21 +01:00
Stefano Pigozzi	965cea692a	Refactor things to work better	2023-02-02 17:24:11 +01:00
Stefano Pigozzi	4c3f892038	Use composition instead of inheritance	2023-02-02 16:03:07 +01:00
Stefano Pigozzi	3ae43b2714	Do not create a dataset with just 2 and 4 reviews	2023-02-02 15:16:46 +01:00
Stefano Pigozzi	4344752cf6	Make some more progress for the night. Many things still do not work properly	2023-02-02 05:01:31 +01:00
Stefano Pigozzi	b347031663	Allow registration of multiple custom loggers	2023-02-02 04:36:55 +01:00
Stefano Pigozzi	ab5f12f8fc	Implement basic Potts sentiment analyzer	2023-02-02 04:34:05 +01:00
Stefano Pigozzi	e2b9133bd5	Remove language reference from VanillaSA docstrings	2023-02-02 04:28:44 +01:00
Stefano Pigozzi	a34baebeb5	Completely remove `language` parameter from VanillaSA	2023-02-02 04:26:58 +01:00
Stefano Pigozzi	c212be37c3	Move `language` VanillaSA parameter to `_tokenize_text`	2023-02-02 04:26:20 +01:00
Stefano Pigozzi	aa980012d7	Add licensing note	2023-02-02 04:23:11 +01:00
Stefano Pigozzi	cf37d13cb4	Remove `__main__`	2023-02-02 04:18:46 +01:00
Stefano Pigozzi	ce959f18be	Convert `unichr` calls to `chr`	2023-02-02 04:18:24 +01:00
Stefano Pigozzi	29c3d05b6c	Rename `__html2unicode` to `__html2string`	2023-02-02 04:18:08 +01:00
Stefano Pigozzi	569f9e5359	Include `typing` module in Potts' tokenizer	2023-02-02 04:17:43 +01:00
Stefano Pigozzi	c7345cb3a3	In Potts' tokenizer, use `html.entities` instead of `htmlentitydefs`	2023-02-02 04:12:56 +01:00
Stefano Pigozzi	a85131cb58	Vendor Potts' tokenizer	2023-02-02 04:12:25 +01:00
Stefano Pigozzi	b8acf5fc7c	Group documents in just two categories	2023-02-02 04:07:28 +01:00
Stefano Pigozzi	ded20c33e1	Configure working set	2023-02-02 04:07:17 +01:00
Stefano Pigozzi	14d1e1a22f	Working prototype	2023-02-02 02:56:37 +01:00

58 commits