mirror of https://github.com/Steffo99/unimore-bda-6.git synced 2024-11-25 17:24:20 +00:00
Commit graph

58 commits

SHA1 Message Date
3d9eeecb2a
Improve the tokenizer situation 2023-02-10 05:12:07 +01:00
0ce584e856
Fix tensorspec error 2023-02-10 04:21:08 +01:00
0a4ce38982
Use the correct sentiment analyzer in main 2023-02-10 04:17:50 +01:00
8907272002
Configure file logging 2023-02-10 04:07:34 +01:00
2bd97cbad2
Count runs 2023-02-10 03:32:31 +01:00
b4573d5eab
Improve logging 2023-02-10 03:30:41 +01:00
14dd8045e0
Add docstring for the root module 2023-02-10 03:27:35 +01:00
48ac00b548
Document and fix imports in .tokenizer 2023-02-10 03:19:17 +01:00
1294335da0
Document and fix imports in .database 2023-02-10 03:18:45 +01:00
218c91bcc1
Document and fix imports in .analysis 2023-02-10 03:17:36 +01:00
7e1b4cfc71
Use staging area in IntelliJ 2023-02-10 03:17:20 +01:00
1809db5f00
Refactor code 2023-02-09 18:54:58 +01:00
704624507a
Add missing file 2023-02-08 19:48:41 +01:00
027f8e07e8
Add validation to gitignore 2023-02-08 19:48:32 +01:00
e3005ab8b0
enough 2023-02-08 19:46:05 +01:00
4d6c8f0fee
stuff's working 2023-02-08 10:54:14 +01:00
c31743f066
back to i have no idea of what's happening, but at least it works 2023-02-07 10:22:09 +01:00
e9a4421acd
Now I understand text vectorization (but this still does not work) 2023-02-06 01:12:30 +01:00
3abba24ca2
Made good progress. How does text vectorization in tensorflow work? 2023-02-05 17:40:22 +01:00
dcfc4fbc3b
Getting closer... 2023-02-04 06:14:24 +01:00
02f10e6ae4
Use a class as DataTuple 2023-02-04 05:34:56 +01:00
4af654a2fa
Make the namedtuple verbose 2023-02-04 05:29:03 +01:00
2675f5ead8
Use float instead of str as Category 2023-02-04 05:28:18 +01:00
4f24d399b8
Convert DataTuple to a collections.namedtuple 2023-02-04 05:16:54 +01:00
e6dcf6e423
stop here for now 2023-02-04 01:36:42 +01:00
6ef81c1c19
New version working nicely 2023-02-03 23:27:44 +01:00
379cbdd13a
PEP8 2023-02-03 17:50:40 +01:00
f7ef9b5ac2
Make more progress 2023-02-03 03:24:23 +01:00
32cd81bca6
it works, but at what cost 2023-02-03 02:49:14 +01:00
4e1a9f842f
Fix VanillaSA to work with iterators 2023-02-03 02:10:00 +01:00
767a6087a8
Improve logging 2023-02-02 17:46:21 +01:00
965cea692a
Refactor things to work better 2023-02-02 17:24:11 +01:00
4c3f892038
Use composition instead of inheritance 2023-02-02 16:03:07 +01:00
3ae43b2714
Do not create a dataset with just 2 and 4 reviews 2023-02-02 15:16:46 +01:00
4344752cf6
Make some more progress for the night. Many things still do not work properly. 2023-02-02 05:01:31 +01:00
b347031663
Allow registration of multiple custom loggers 2023-02-02 04:36:55 +01:00
ab5f12f8fc
Implement basic Potts sentiment analyzer 2023-02-02 04:34:05 +01:00
e2b9133bd5
Remove language reference from VanillaSA docstrings 2023-02-02 04:28:44 +01:00
a34baebeb5
Completely remove language parameter from VanillaSA 2023-02-02 04:26:58 +01:00
c212be37c3
Move language VanillaSA parameter to _tokenize_text 2023-02-02 04:26:20 +01:00
aa980012d7
Add licensing note 2023-02-02 04:23:11 +01:00
cf37d13cb4
Remove __main__ 2023-02-02 04:18:46 +01:00
ce959f18be
Convert unichr calls to chr 2023-02-02 04:18:24 +01:00
29c3d05b6c
Rename __html2unicode to __html2string 2023-02-02 04:18:08 +01:00
569f9e5359
Include typing module in Potts' tokenizer 2023-02-02 04:17:43 +01:00
c7345cb3a3
In Potts' tokenizer, use html.entities instead of htmlentitydefs 2023-02-02 04:12:56 +01:00
a85131cb58
Vendor Potts' tokenizer 2023-02-02 04:12:25 +01:00
b8acf5fc7c
Group documents in just two categories 2023-02-02 04:07:28 +01:00
ded20c33e1
Configure working set 2023-02-02 04:07:17 +01:00
14d1e1a22f
Working prototype 2023-02-02 02:56:37 +01:00