| e2b9133bd5 | Remove language reference from VanillaSA docstrings | 2023-02-02 04:28:44 +01:00 |
| a34baebeb5 | Completely remove language parameter from VanillaSA | 2023-02-02 04:26:58 +01:00 |
| c212be37c3 | Move language VanillaSA parameter to _tokenize_text | 2023-02-02 04:26:20 +01:00 |
| aa980012d7 | Add licensing note | 2023-02-02 04:23:11 +01:00 |
| cf37d13cb4 | Remove __main__ | 2023-02-02 04:18:46 +01:00 |
| ce959f18be | Convert unichr calls to chr | 2023-02-02 04:18:24 +01:00 |
| 29c3d05b6c | Rename __html2unicode to __html2string | 2023-02-02 04:18:08 +01:00 |
| 569f9e5359 | Include typing module in Potts' tokenizer | 2023-02-02 04:17:43 +01:00 |
| c7345cb3a3 | In Potts' tokenizer, use html.entities instead of htmlentitydefs | 2023-02-02 04:12:56 +01:00 |
| a85131cb58 | Vendor Potts' tokenizer | 2023-02-02 04:12:25 +01:00 |
| b8acf5fc7c | Group documents in just two categories | 2023-02-02 04:07:28 +01:00 |
| ded20c33e1 | Configure working set | 2023-02-02 04:07:17 +01:00 |
| 14d1e1a22f | Working prototype | 2023-02-02 02:56:37 +01:00 |
| 2f7237ebfa | Make some progress | 2023-02-01 17:46:25 +01:00 |
| 0f37d206a1 | Ignore dict alignment warnings | 2023-02-01 16:03:41 +01:00 |
| 079ac312a1 | Configure MongoDB integration | 2023-02-01 16:03:22 +01:00 |
| 9a6e041035 | Make config values optional | 2023-02-01 16:03:10 +01:00 |
| f544850403 | Fix MongoDB queries | 2023-02-01 16:02:52 +01:00 |
| c2e195845e | Second commit | 2023-02-01 04:20:09 +01:00 |
| 8dd77d0e3e | Remove dependabot | 2023-02-01 02:33:56 +01:00 |
| 31e813bc15 | First commit | 2023-02-01 02:33:42 +01:00 |