Building Database – Creating a Chatbot with Deep Learning, Python, and TensorFlow p.5




Welcome to part 5 of the chatbot with Python and TensorFlow tutorial series. Leading up to this tutorial, we’ve been working with our data and preparing the logic for how we want to insert it. Now we’re ready to start inserting.

Text tutorials and sample code: https://pythonprogramming.net/
https://pythonprogramming.net/support-donate/
https://twitter.com/sentdex
https://www.facebook.com/pythonprogramming.net/
https://www.twitch.tv/sentdex
https://plus.google.com/+sentdex


Comment List

  • sentdex
    January 7, 2021

    If you have the parent_id, then why is it that you can't find out the parent_data? I didn't understand this. Can someone explain please?

  • sentdex
    January 7, 2021

    I'm following along exactly as you're showing (I've even copied and pasted the code to prove the problem wasn't just in my own version, since I did my own thing while following along), but I keep getting "Invalid control character at: line 1 column 497 (char 496)" no matter what I do. I've been trying to figure it out for hours.

  • sentdex
    January 7, 2021

    getting an OSError: [Errno 22] Invalid argument: 'C:\Chatboxdatareddit_data/2015/RC_2015-07' helllllllppp

  • sentdex
    January 7, 2021

    My total rows read is currently going beyond 5 million, how the hell did his end at 100,000 while he is using a much bigger RC file???

  • sentdex
    January 7, 2021

    If anyone's using 2019-12 data,
    parent_id = row['parent_id'][3:]
    comment_id = row['id'] #Not sure if this is right!
    Check sentdex's pythonprogrammingtutorial website for his complete code. He did not write everything the same way in his video.

  • sentdex
    January 7, 2021

    parent_id = row['parent_id'].split('_')[1]

    comment_id = row['id']
    This is my part of the code. It gives paired_rows: 0.
    How to fix this?

    I use RC_2019-12 .

  • sentdex
    January 7, 2021

    I am getting the error: duplicate key value violates unique constraint "parent_reply_pkey". Why are you inserting a duplicate parent_id?
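
A note on the error above: the constraint name suggests a table whose primary key is parent_id, so a second reply to the same parent cannot simply be inserted. The sketch below shows one way to handle that, keeping a single reply per parent and preferring the higher score. It is an illustration under assumptions, not the tutorial's code: the table layout, the column names other than parent_id, and the helper names `make_demo_db` and `store_reply` are hypothetical, and it uses sqlite3 so it can run on its own, while the wording of the error looks like PostgreSQL's.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory table keyed on parent_id (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE parent_reply ("
        "parent_id TEXT PRIMARY KEY, comment_id TEXT UNIQUE, "
        "comment TEXT, score INT)"
    )
    return conn


def store_reply(conn, parent_id, comment_id, comment, score):
    """Keep at most one reply per parent_id, preferring the higher score."""
    cur = conn.cursor()
    cur.execute("SELECT score FROM parent_reply WHERE parent_id = ?", (parent_id,))
    existing = cur.fetchone()
    if existing is None:
        # First reply seen for this parent: a plain insert is safe.
        cur.execute(
            "INSERT INTO parent_reply (parent_id, comment_id, comment, score) "
            "VALUES (?, ?, ?, ?)",
            (parent_id, comment_id, comment, score),
        )
    elif score > existing[0]:
        # A better reply replaces the stored one instead of being inserted.
        cur.execute(
            "UPDATE parent_reply SET comment_id = ?, comment = ?, score = ? "
            "WHERE parent_id = ?",
            (comment_id, comment, score, parent_id),
        )
    conn.commit()
```

With this shape, a lower-scored reply to an already stored parent is silently dropped rather than raising a constraint error.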

  • sentdex
    January 7, 2021

    The 2015-01 db has no matching parent and comment IDs.

  • sentdex
    January 7, 2021

    I got paired_rows: 0, dont know why

  • sentdex
    January 7, 2021

    i have inserted around 30 million rows to the db, shall i continue or is it enough??

  • sentdex
    January 7, 2021

    16:52 lmao xD xD

  • sentdex
    January 7, 2021

    Hey, if you are using 2018 or more recent data, you have to change the comment id to:
    comment_id = 't1_' + row['id']
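
Several comments here report paired rows of 0. A likely reason, judging from the suggestions in this thread, is that the stored comment_id and the looked-up parent_id end up in different forms (one with the `t1_` prefix, one without), so no lookup can ever match. The sketch below keeps both in the prefixed form. It assumes that some dumps carry a `name` field and others only a bare `id`, which is what the comments suggest but is not confirmed here; `comment_fullname` and `extract_ids` are hypothetical helper names.

```python
def comment_fullname(row):
    """Return the comment's ID in 't1_xxxx' form, with or without a 'name' field."""
    name = row.get("name")
    if name:
        return name
    cid = row["id"]
    # Newer dumps are assumed to store only the bare id, so add the prefix.
    return cid if cid.startswith("t1_") else "t1_" + cid


def extract_ids(row):
    """Return (parent_id, comment_id), both in prefixed form."""
    return row["parent_id"], comment_fullname(row)
```

Whichever form you choose, the point is to use the same one for both fields; stripping the prefix from one and adding it to the other cannot produce a match.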

  • sentdex
    January 7, 2021

    for 2011-08 data what will be comment_id?

  • sentdex
    January 7, 2021

    I get all comments and parent false! What should I do?

  • sentdex
    January 7, 2021

    json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0) any help??

  • sentdex
    January 7, 2021

    my problem is the weirdest, i have compiled everything correctly. But my parent reply seems to be empty. It took me 19hrs to make the database. Plz help me

  • sentdex
    January 7, 2021

    "Getting through all of the data will depend on the size of the starting file. Inserting will slow down the larger the database gets. To do the entire May 2015 file, it will probably take 5-10 hrs."

    Should I wait until all of the entire May 2015 is loaded or something?

  • sentdex
    January 7, 2021

    and im getting paired rows = 0

  • sentdex
    January 7, 2021

    Sir, why is my data not getting loaded into the db file? I have no errors either!
    Please do reply.

  • sentdex
    January 7, 2021

    why am I getting ''s0 insertion', "'ascii' codec can't encode character u'\xe9' in position 53: ordinal not in range(128)"' when I run this? :/

  • sentdex
    January 7, 2021

    The code returns an error saying charmap codec can't decode byte 0x90…
    After putting encoding = utf8 inside file open
    Again it throws error utf8 can't decode byte 0xb5
    Then changing encoding to latin1
    It throws error – json.decoder.JSONDecodeError
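
The decode errors and JSONDecodeError reports in this thread can be worked around by reading the dump defensively: open it as UTF-8, replace undecodable bytes, and skip lines that are not valid JSON. This is a minimal sketch with a hypothetical helper name, `iter_json_rows`, and it assumes the file holds one JSON object per line. If every line fails, the file may still be compressed or incomplete, and no choice of encoding will fix that.

```python
import json


def iter_json_rows(path, encoding="utf-8"):
    """Yield (line_number, dict) for each parseable line and skip the rest."""
    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
    with open(path, encoding=encoding, errors="replace") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # blank lines would raise "Expecting value"
            try:
                row = json.loads(line)
            except json.JSONDecodeError:
                continue  # truncated or malformed line: skip it
            yield line_number, row
```

The line number is returned so that skipped or suspicious rows can be traced back to the file.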

  • sentdex
    January 7, 2021

    im getting 0 paired rows with 2015-05. Dont understand why though

  • sentdex
    January 7, 2021

    Hey, I'm still trying to get the hang of all of this. How would I combine all the files into one database?
    Something like timeframe = '2007-10', '2007-11' and so on, most likely a list, so that the script reads all the files and builds a single database file. Yes, it might take a long time to complete,
    but I'm a bit confused about how to actually put it into code.
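
One way to approach the question above is to keep a list of timeframes and turn each into a file path before processing, using the same `<year>/RC_<timeframe>` layout that appears in the `open(...)` calls quoted in this thread. The helper names `dump_paths` and `existing_dumps` are hypothetical, and the base directory is whatever folder holds your data.

```python
from pathlib import Path


def dump_paths(base_dir, timeframes):
    """Map each 'YYYY-MM' timeframe to base_dir/YYYY/RC_YYYY-MM."""
    base = Path(base_dir)
    return [base / tf.split("-")[0] / "RC_{}".format(tf) for tf in timeframes]


def existing_dumps(base_dir, timeframes):
    """Split the expected paths into files that exist and files that do not."""
    found, missing = [], []
    for path in dump_paths(base_dir, timeframes):
        (found if path.is_file() else missing).append(path)
    return found, missing
```

Looping over `found` and feeding each file into the same database connection gives one combined database. Printing `missing` first shows exactly which path Python is looking for, which also helps with the "No such file or directory" errors reported here.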

  • sentdex
    January 7, 2021

    Traceback (most recent call last):
      File "C:/Users/Maniech/Desktop/chatbot/b.py", line 93, in <module>
        with open("C:/Users/Maniech/Desktop/chatbot/{}/RC_{}".format(timeframe.split('-')[0],timeframe), buffering=1000) as f:
    FileNotFoundError: [Errno 2] No such file or directory: 'C:/Users/Maniech/Desktop/chatbot/2015/RC_2015-01'

    Can anyone tell me what the error is?

  • sentdex
    January 7, 2021

    Everytime i run the code i get this error:
    Traceback (most recent call last):
      File "C:/Desktop/coding/AI/chatbot/database.py", line 77, in <module>
        row=json.loads(row)
      File "C:\Users\sebas\Anaconda3\envs\tensor\lib\json\__init__.py", line 354, in loads
        return _default_decoder.decode(s)
      File "C:\Users\sebas\Anaconda3\envs\tensor\lib\json\decoder.py", line 339, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "C:\Users\sebas\Anaconda3\envs\tensor\lib\json\decoder.py", line 355, in raw_decode
        obj, end = self.scan_once(s, idx)
    json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 374 (char 373)

    but before that it prints out a large amount of messages like "find_parent no such column: t1_cnapz1h"
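
The message "no such column: t1_cnapz1h" is what SQLite reports when a value is pasted into the SQL text without quotes, so it gets parsed as a column name. Passing the value as a `?` parameter avoids this. The sketch below is an illustration, not the tutorial's code: the function name `find_parent` is taken from the error message, while the table and column names are assumptions.

```python
import sqlite3


def find_parent(conn, pid):
    """Return the stored comment text whose comment_id equals pid, or None."""
    # The ? placeholder lets sqlite3 pass pid as a value, never as SQL text.
    cur = conn.execute(
        "SELECT comment FROM parent_reply WHERE comment_id = ? LIMIT 1",
        (pid,),
    )
    row = cur.fetchone()
    return row[0] if row else None
```

The same placeholder style also copes with comment text that contains quote characters, which string formatting does not.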

  • sentdex
    January 7, 2021

    Hi, a bit of help here please.
    It's been 12 hours and my process is not yet completed.

    I am using 2015-01 data as in the tutorial. The .db file has reached over 6.5 GB in size. Is it supposed to take this long?

  • sentdex
    January 7, 2021

    For those who are getting paired == 0 you can try
    parent_id = row['parent_id'].split("_")[1]
    comment_id = "t1_" + row['id']
    worked for 2019_07 Database

  • sentdex
    January 7, 2021

    Can anyone help me? I am a beginner. In this tutorial he does not write the SQL queries; he just copies and pastes them and says they are in a link in the description, but I don't see these queries. Please help me.

  • sentdex
    January 7, 2021

    Harrison! Too many nested ifs! There's an operator called and in python. Use it my son!

  • sentdex
    January 7, 2021

    has anybody attempted this on kaggle? there is data set of one month of may there but somehow i cant open that dataset. there is some zip error.

  • sentdex
    January 7, 2021

    I have a 5gb compressed file of jan 2015. could anyone tell me what will be the exact size of the unzipped file. I am having space issues.

  • sentdex
    January 7, 2021

    Is it possible to make this code run faster on 8x core 16 threads processor? I mean like make it use all the threads possible?

  • sentdex
    January 7, 2021

    I'm getting the parent column completely as NULL. I copied the same code from his website and used the same file too, but the issue persists.

  • sentdex
    January 7, 2021

    I'm getting constant rows of,
    "Expecting value: line 1 column 1 (char 0)"

    what do I do?

  • sentdex
    January 7, 2021

    Hi @sentdex could you please help me out I am getting error : comment_id = row['data']
    keyError: 'data'

  • sentdex
    January 7, 2021

    Paired rows = 0 for RC_2015-01 data..Tried all possible options """parent_id = row['parent_id'].split('_')[1]

    comment_id = row['name'].split('_')[1]

    comment_id = row['id']""" no luck …Anyone tried this data ???

  • sentdex
    January 7, 2021

    The same mistake I do my whole life: Following outdated descriptions

    Edit: It works! Shame on me! Thx Sentdex and Daniel Kukila | if that was his name👍🏾🤪🤙🏾

  • sentdex
    January 7, 2021

    I don't know why but the database isn't being created for me. Its working fine and its reading it too but no database… Can someone please help me out???

  • sentdex
    January 7, 2021

    Can anyone share the code? It seems that the code has been removed from the website.

  • sentdex
    January 7, 2021

    !! EDIT !!
    For anyone having the same issue as I have, I fixed it by using the code from the website and changing comment_id = row['name'] to comment_id = row['id']. Change parent_id to parent_id = row['parent_id'].split('_')[1]
    Also don't forget to change the timeframe at the top. I've changed nothing else, and I was able to generate it with the error message I got in the replace_comment function where most of the comments tell you to change the ? to {}. You don't need to do this for it to function! I hope this helps someone in the future

    For some reason I'm not getting any paired rows, the total rows reads ups just fine. Anyone knows what might cause this? Using this on dataset RC_2019-02

  • sentdex
    January 7, 2021

    the database is growing but my command prompt is showing this

    ('s-PARENT insertion', "'ascii' codec can't encode character u'\u2019' in position 10: ordinal not in range(128)")
    ('s-NO_PARENT insertion', "'ascii' codec can't encode character u'\u2019' in position 100: ordinal not in range(128)")
    ('s-NO_PARENT insertion', "'ascii' codec can't encode character u'\u2019' in position 112: ordinal not in range(128)")
    Total Rows Read: 900000, Paired Rows: 35575, Time: 2019-05-17 18:47:16.282000
    ('s-NO_PARENT insertion', "'ascii' codec can't encode character u'\xe9' in position 64: ordinal not in range(128)")
    ('s-NO_PARENT insertion', "'ascii' codec can't encode character u'\xe9' in po

    is this a problem?…..please help

  • sentdex
    January 7, 2021

    Unterminated string starting at: line 1 column 274 (char 273)
    i got this error in my code. Anyone else?
    how to fix it?

  • sentdex
    January 7, 2021

    Ok, I'm about to crash my pc against the wall…. I'm getting a loop printing "s-NO_PARENT insertion name 'transaction_bldr' is not defined." I'm a beginner and I don't know whats happening I did it step by step but… Now what?
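
The "name 'transaction_bldr' is not defined" message means Python had not seen a definition with that exact name when the call ran, so the definition is either missing or spelled differently. The sketch below is a hypothetical stand-in that shows the general idea of batching statements into one transaction. It is not the tutorial's function, and the class name `TransactionBuilder` and its methods are assumptions made for illustration.

```python
import sqlite3


class TransactionBuilder:
    """Collect SQL statements and run them in batches inside one transaction."""

    def __init__(self, conn, batch_size=1000):
        self.conn = conn
        self.batch_size = batch_size
        self.pending = []

    def add(self, sql, params=()):
        """Queue one statement; flush automatically when the batch is full."""
        self.pending.append((sql, params))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Execute all queued statements and commit them together."""
        batch, self.pending = self.pending, []
        if not batch:
            return
        # "with conn" commits on success and rolls back if a statement fails.
        with self.conn:
            for sql, params in batch:
                self.conn.execute(sql, params)
```

Calling `flush()` once more after the read loop ends makes sure the last partial batch is written.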

  • sentdex
    January 7, 2021

    Does the script close or stop when the reddit data is finished being paired, or when it is "cleaning up"?

  • sentdex
    January 7, 2021

    For what it's worth, using global and pass in a run-once import script isn't really a sin … well, either that or I'm seriously doomed.

  • sentdex
    January 7, 2021

    with open('E:/reddit_data/{}/RC_{}'.format(timeframe.split('-')[0], timeframe), buffering=1000) as f:
    FileNotFoundError: [Errno 2] No such file or directory: 'E:/reddit_data/2015/RC_2015-01'
