Part of Speech Tagging – Natural Language Processing With Python and NLTK p.4





Part of Speech (PoS) tagging does exactly what it sounds like: it tags each word in a sentence with the part of speech for that word, labeling words as noun, adjective, verb, and so on. PoS tags also encode finer distinctions such as tense, so a past-tense verb (VBD) gets a different tag than a base-form verb (VB).
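
To make the idea concrete, here is a minimal sketch of what tagged output looks like. It uses only the standard library: the sentence is tagged by hand with Penn Treebank tags (the tag set NLTK's default English tagger uses), so the sample sentence, the small `TAG_MEANINGS` table, and the `coarse_counts` helper are all illustrative assumptions rather than real tagger output.

```python
from collections import Counter

# Hand-written (word, tag) pairs in the same shape that nltk.pos_tag returns.
# The tags were assigned by hand for illustration, not produced by a tagger.
tagged = [
    ("The", "DT"),
    ("president", "NN"),
    ("delivered", "VBD"),
    ("two", "CD"),
    ("long", "JJ"),
    ("speeches", "NNS"),
    (".", "."),
]

# A few Penn Treebank tags; the full tag set is much larger.
TAG_MEANINGS = {
    "DT": "determiner",
    "NN": "noun, singular",
    "NNS": "noun, plural",
    "VBD": "verb, past tense",
    "JJ": "adjective",
    "CD": "cardinal number",
}


def coarse_counts(tagged_words):
    """Count nouns, verbs and adjectives by looking at the tag prefix."""
    counts = Counter()
    for _word, tag in tagged_words:
        if tag.startswith("NN"):
            counts["noun"] += 1
        elif tag.startswith("VB"):
            counts["verb"] += 1
        elif tag.startswith("JJ"):
            counts["adjective"] += 1
    return counts


for word, tag in tagged:
    print(f"{word:10} {tag:4} {TAG_MEANINGS.get(tag, 'other')}")

print(coarse_counts(tagged))
```

Grouping by tag prefix works because the Penn Treebank tags for one word class share a stem: `NN`, `NNS`, `NNP` for nouns, `VB`, `VBD`, `VBG` for verbs, and so on.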

This is normally quite the challenge, but NLTK makes this pretty darn simple!
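
As a rough sketch of how little code this takes, the functions below split text into sentences, split each sentence into words, and tag the words. This is not the video's exact code. The names `tag_text` and `format_tagged` are made up for this example, and the sketch assumes NLTK is installed along with its tokenizer and tagger data (fetched once with `nltk.download()`; the exact resource names vary between NLTK versions). The commented-out usage assumes the `state_union` corpus and two of its file IDs.

```python
def tag_text(text, train_text=None, max_sentences=5):
    """Return tagged sentences; each one is a list of (word, tag) pairs.

    If train_text is given, a Punkt sentence tokenizer is trained on it first;
    otherwise NLTK's pre-trained sentence tokenizer is used.
    """
    # Lazy imports: NLTK is a third-party package that also needs its
    # data packages downloaded before these calls will work.
    import nltk
    from nltk.tokenize import PunktSentenceTokenizer

    # First pass: text -> sentences.
    if train_text is not None:
        sentences = PunktSentenceTokenizer(train_text).tokenize(text)
    else:
        sentences = nltk.sent_tokenize(text)

    tagged_sentences = []
    for sentence in sentences[:max_sentences]:
        words = nltk.word_tokenize(sentence)  # second pass: sentence -> words
        tagged_sentences.append(nltk.pos_tag(words))
    return tagged_sentences


def format_tagged(tagged_sentence):
    """Render [(word, tag), ...] as 'word/TAG word/TAG ...'."""
    return " ".join(f"{word}/{tag}" for word, tag in tagged_sentence)


# Example use (left commented out because it needs NLTK and its data):
# from nltk.corpus import state_union
# train = state_union.raw("2005-GWBush.txt")
# sample = state_union.raw("2006-GWBush.txt")
# for sent in tag_text(sample, train_text=train):
#     print(format_tagged(sent))
```

The text is tokenized twice on purpose: the tagger works on one sentence at a time, so the text is first cut into sentences and each sentence is then cut into words.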

sample code: http://pythonprogramming.net
http://hkinsley.com
https://twitter.com/sentdex
http://sentdex.com
http://seaofbtc.com


Comment List

  • sentdex
    November 24, 2020

    Thank you so much for sharing!

  • sentdex
    November 24, 2020

    it misclassified "strategery"

  • sentdex
    November 24, 2020

    "Hold on. Everybody settle down. This is the entire state of the union address" <Ad break> "Aaah This will be fun. Lets do it .Full steam ahead" 😊 😊 😊 (Its called chiong https://en.wiktionary.org/wiki/chiong )

  • sentdex
    November 24, 2020

    I want to know what is rule based and what is machine learning. Tokenizer and Stemmer in the previous videos were rule based? It's hard to find out since none of it is really labeled. What else is rule based in nltk? What else is model based? I'd really appreciate a thorough and complete answer!!!

  • sentdex
    November 24, 2020

    Hi,
    Can you make a follow-up to the video and create a table of all the POS tags?

  • sentdex
    November 24, 2020

    New follower/subscriber – and still really enjoy these old videos! Thank you! Do you have any videos or suggestions on how to take tokenized/stemmed words and put them in a dataframe for analytics use? Thanks again for the great content!

  • sentdex
    November 24, 2020

    haha what a nerd

  • sentdex
    November 24, 2020

    Sir, you resemble Mark Zuckerberg ;)

  • sentdex
    November 24, 2020

    What is the use of POS tagging?

  • sentdex
    November 24, 2020

    Use state_union.fileids() to get a list of all the speeches available if you want a democrat president instead. Of course you can always use help(state_union) to see all functions available.

  • sentdex
    November 24, 2020

    Is it better to use PunktSentenceTokenizer rather than SentenceTokenizer?
    If yes, then why?
    And what data should I train it on to tokenize a specific type of text (for example, medical records)?

  • sentdex
    November 24, 2020

    Hello, nice explanation, thank you.
    What is the use of part of speech tagging?

  • sentdex
    November 24, 2020

    Why did you tokenize twice?

  • sentdex
    November 24, 2020

    really awesome work. This vid could also have been started by sample sentence & then moving on to state_union. Anyways you rock & keep up the awesome stuff.

  • sentdex
    November 24, 2020

    The following can be used to print the list of POS tags:
    import nltk
    nltk.help.upenn_tagset()

  • sentdex
    November 24, 2020

    Thanks for sharing. If the tag is "None", how can you add the unknown word?

  • sentdex
    November 24, 2020

    please make a video on ML kit

  • sentdex
    November 24, 2020

    Would love to see more videos on NLP, keep up the great work! 🙂

  • sentdex
    November 24, 2020

    For those following along from Brazil who want to do tagging in Portuguese, Nltk-Tagger-Portuguese, available at https://github.com/fmaruki/Nltk-Tagger-Portuguese, works reasonably well.

  • sentdex
    November 24, 2020

    It is a bit confusing. Why are we using PunktSentenceTokenizer, and what does it do?

  • sentdex
    November 24, 2020

    No module named 'nltk.corpus'; 'nltk' is not a package

  • sentdex
    November 24, 2020

    How can we use tagging to count the number of verbs, nouns, etc. in a sentence?

  • sentdex
    November 24, 2020

    Can I know why you throw the exception in the function you have written ?

  • sentdex
    November 24, 2020

    more than everything, I'm amazed just by your type speed. :))

  • sentdex
    November 24, 2020

    I got an error while tokenizing a different text:
    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
    I don't know how to solve it. I need your help, please.

  • sentdex
    November 24, 2020

    How can I create my own POS, please?

  • sentdex
    November 24, 2020

    How can I tokenize each sentence of a collected document, in order, using Python?

  • sentdex
    November 24, 2020

    Wow, thanks man! Quick question: what if you stem first and then apply PunktSentenceTokenizer? Past tense and present continuous forms would probably be stemmed to the present simple, so would the tense no longer be recognized after PunktSentenceTokenizer?

  • sentdex
    November 24, 2020

    How can we do noun-adjective pair extraction?
