mirror of https://github.com/Steffo99/unimore-bda-6.git synced 2024-11-21 15:34:18 +00:00
Commit graph

121 commits

SHA1        Date                        Message
e2b9133bd5  2023-02-02 04:28:44 +01:00  Remove language reference from VanillaSA docstrings
a34baebeb5  2023-02-02 04:26:58 +01:00  Completely remove language parameter from VanillaSA
c212be37c3  2023-02-02 04:26:20 +01:00  Move language VanillaSA parameter to _tokenize_text
aa980012d7  2023-02-02 04:23:11 +01:00  Add licensing note
cf37d13cb4  2023-02-02 04:18:46 +01:00  Remove __main__
ce959f18be  2023-02-02 04:18:24 +01:00  Convert unichr calls to chr
29c3d05b6c  2023-02-02 04:18:08 +01:00  Rename __html2unicode to __html2string
569f9e5359  2023-02-02 04:17:43 +01:00  Include typing module in Potts' tokenizer
c7345cb3a3  2023-02-02 04:12:56 +01:00  In Potts' tokenizer, use html.entities instead of htmlentitydefs
a85131cb58  2023-02-02 04:12:25 +01:00  Vendor Potts' tokenizer
b8acf5fc7c  2023-02-02 04:07:28 +01:00  Group documents in just two categories
ded20c33e1  2023-02-02 04:07:17 +01:00  Configure working set
14d1e1a22f  2023-02-02 02:56:37 +01:00  Working prototype
2f7237ebfa  2023-02-01 17:46:25 +01:00  Make some progress
0f37d206a1  2023-02-01 16:03:41 +01:00  Ignore dict alignment warnings
079ac312a1  2023-02-01 16:03:22 +01:00  Configure MongoDB integration
9a6e041035  2023-02-01 16:03:10 +01:00  Make config values optional
f544850403  2023-02-01 16:02:52 +01:00  Fix MongoDB queries
c2e195845e  2023-02-01 04:20:09 +01:00  Second commit
8dd77d0e3e  2023-02-01 02:33:56 +01:00  Remove dependabot
31e813bc15  2023-02-01 02:33:42 +01:00  First commit