Data Preparation (Latin NLP with Python 05)
In this video, I explain how to prepare your Latin text data for natural language processing: normalizing j/v spelling with CLTK and stripping bracketed material from the text.
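import re

from cltk.stem.latin.j_v import JVReplacer

# Read the raw text from disk.
with open("data/pl.txt", "r") as f:
    text = f.read()

# Normalize spelling with CLTK: j becomes i and v becomes u.
j = JVReplacer()
text = j.replace(text)

def clean_pl(text, lower=False):
    # Remove anything in parentheses or square brackets
    # (pattern reconstructed from the garbled original; an assumption).
    cleaned = re.sub(r"[\(\[].*?[\)\]]", "", text)
    # Collapse the double spaces that the removals leave behind.
    cleaned = cleaned.replace("  ", " ").replace("  ", " ")
    # Return the lowercased text only when lower=True.
    return cleaned.lower() if lower else cleaned

text = clean_pl(text, lower=True)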
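To try these steps without the data file or CLTK installed, here is a minimal sketch on a made-up sentence. The names `replace_jv` and `clean_brackets` are hypothetical, and `replace_jv` is only a plain-string stand-in for what JVReplacer does, not CLTK's own implementation. The whitespace handling uses a regular expression instead of chained `replace` calls, which is my substitution rather than the approach in the video.

```python
import re

def replace_jv(text):
    """Hypothetical stand-in for CLTK's JVReplacer: map j -> i and v -> u."""
    table = str.maketrans({"j": "i", "J": "I", "v": "u", "V": "U"})
    return text.translate(table)

def clean_brackets(text, lower=False):
    """Drop (...) and [...] spans, collapse whitespace, optionally lowercase."""
    # Non-greedy match so each bracketed span is removed separately.
    cleaned = re.sub(r"[\(\[].*?[\)\]]", "", text)
    # Collapse any run of whitespace into a single space.
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return cleaned.lower() if lower else cleaned

# Made-up sample sentence, not taken from the data file.
sample = "Justus [col. 12] vivit (sic)  ex fide."
normalized = replace_jv(sample)
print(normalized)                              # Iustus [col. 12] uiuit (sic)  ex fide.
print(clean_brackets(normalized, lower=True))  # iustus uiuit ex fide.
```

Running the j/v replacement before the cleanup means the bracket removal and lowercasing work on text that already has consistent spelling.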
Video on Functions:
If you enjoy this video, please subscribe. I provide all my content at no cost. If you want to support my channel, please donate via
Patreon: https://www.patreon.com/WJBMattingly (it's my www.themedievalworld.com account as well).
If there’s a specific video or tutorial series you would like to see, let me know in the comments and I will try to make it.
If you liked this video, check out www.PythonHumanities.com, where I have Coding Exercises, Lessons, on-site Python shells where you can experiment with code, and a text version of the material discussed here.