bda-6-steffo/unimore_bda_6/tokenization/nltk_based.py at 965cea692aebfc4f4a21879f62a55e650a7f17ed

unimore/bda-6-steffo

mirror of https://github.com/Steffo99/unimore-bda-6.git synced 2024-11-25 09:14:19 +00:00

Stefano Pigozzi 965cea692a

Refactor things to work better

2023-02-02 17:24:11 +01:00

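import nltk
import nltk.sentiment.util


def tokenizer(text: str) -> list[str]:
    """
    Convert a text string into a list of tokens, marking the tokens that fall within the scope of a negation.
    """
    # word_tokenize needs the NLTK tokenizer data to be downloaded beforehand.
    tokens = nltk.word_tokenize(text)
    # With shallow=True, mark_negation appends "_NEG" to the tokens in place instead of working on a copy.
    nltk.sentiment.util.mark_negation(tokens, shallow=True)
    return tokens


__all__ = (
    "tokenizer",
)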
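
To illustrate what the negation-marking step in `tokenizer` does, here is a minimal pure-Python sketch of the idea. It is not NLTK's implementation: the function name `mark_negation_sketch` and the two regular expressions are hypothetical and deliberately simplified, and the example token list is hand-written rather than produced by `nltk.word_tokenize`.

```python
import re

# Hypothetical, simplified patterns; NLTK's own expressions cover more negation words.
NEGATION = re.compile(r"^(?:not|no|never|cannot)$|n't$", re.IGNORECASE)
CLAUSE_END = re.compile(r"^[.:;!?]$")


def mark_negation_sketch(tokens: list[str]) -> list[str]:
    """Return a copy of tokens with "_NEG" appended to every token inside a negation scope."""
    marked = []
    in_scope = False
    for token in tokens:
        if CLAUSE_END.match(token):
            # Clause-level punctuation closes any open negation scope.
            in_scope = False
            marked.append(token)
        elif in_scope:
            marked.append(token + "_NEG")
        else:
            marked.append(token)
            # The negation word itself stays unmarked; the scope starts after it.
            if NEGATION.search(token):
                in_scope = True
    return marked


example = ["I", "did", "n't", "like", "it", ".", "Great", "plot"]
print(mark_negation_sketch(example))
```

Unlike the `shallow=True` call in `tokenizer`, this sketch builds a new list and leaves its input untouched, which keeps the example easy to reason about at the cost of an extra copy.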