Matplotlib Tutorial 8 – getting data from the internet
Aside from loading data from files, another popular source of data is the internet. We can load data from the internet in a variety of ways, but here we're simply going to read the source code of the website, then use simple splitting to separate the data.
sample code: http://pythonprogramming.net
http://hkinsley.com
https://twitter.com/sentdex
http://sentdex.com
http://seaofbtc.com
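A minimal sketch of the read-then-split approach described above. The sample text is made-up placeholder data standing in for the page source, and `fetch_text` and `split_rows` are hypothetical helper names. The fetch is defined but not called, so the example runs offline.

```python
import urllib.request

def fetch_text(url):
    """Download a page and return its source as text (not called below)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode()

def split_rows(source_code, n_fields):
    """Split text into lines and keep those with the expected field count."""
    rows = []
    for line in source_code.split('\n'):
        fields = line.strip().split(',')
        if len(fields) == n_fields:
            rows.append(fields)
    return rows

# Placeholder text, not real market data.
sample = "Date,Open,Close\n2000-01-03,1.0,2.0\n2000-01-04,2.0,3.0\nnot a data line\n"
rows = split_rows(sample, 3)
print(rows[0])   # header row
print(rows[1:])  # data rows, still as strings
```

With a live source you would pass the result of `fetch_text(url)` to `split_rows` instead of `sample`.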
Unfortunately, Yahoo decided to remove their API. To handle this, you can change stock_price_url in the code to the following:
stock_price_url = 'https://pythonprogramming.net/yahoo_finance_replacement'
Only AAPL is supported. If you have your own source, feel free to adapt the code to suit that, or you can look into pandas for pulling data.
The below code may be useful for someone.
%matplotlib inline
# bytespdate2num was deprecated in Matplotlib 3.1 and removed in later releases,
# so this import needs an older Matplotlib.
from matplotlib.dates import bytespdate2num
import matplotlib.pyplot as plt
import numpy as np
import urllib.request

def graph_data(stock):
    # The replacement URL only serves AAPL, whatever `stock` is.
    stock_price_url = 'https://pythonprogramming.net/yahoo_finance_replacement'
    source_code = urllib.request.urlopen(stock_price_url).read().decode()
    stock_data = []
    split_source = source_code.split('\n')
    for line in split_source:
        split_line = line.split(',')
        # Keep 7-field data rows and skip the header line.
        if len(split_line) == 7:
            if 'Volume' not in line and 'labels' not in line:
                stock_data.append(line)
    date, openp, highp, lowp, closep, adjustedp, Volume = np.loadtxt(stock_data,
                                                                     delimiter=',',
                                                                     unpack=True,
                                                                     converters={0: bytespdate2num('%Y-%m-%d')})
    plt.plot_date(date, closep, '-', label='Price')
    plt.xlabel('Date')
    plt.ylabel('Price')
    plt.title('Interesting Graph\nCheck it out')
    plt.legend()
    plt.show()

graph_data('TSLA')
How did you get the data on the web looking like that? When I try to manually navigate to Tesla's stock, I can't see anywhere the option to open the data as CSV.
where is the link??
Solution for Python 3+
https://pastebin.com/KNJURDDR
module 'urllib' has no attribute 'request', it shows an error
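One likely cause, stated as an assumption since the full traceback isn't shown: in Python 3, `import urllib` on its own does not load the `request` submodule, so `urllib.request` only exists if something else has already imported it. Importing the submodule explicitly avoids the error:

```python
# Import the submodule explicitly; a bare "import urllib" is not enough.
import urllib.request

# urlopen is now reachable through the package name.
print(callable(urllib.request.urlopen))
```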
I am trying to plot data from an Arabic file. The plot succeeds, but the letters in the result appear reversed.
my code:
import matplotlib.pyplot as plt
import numpy as np
X, Y = [], []
for line in open('example.txt', 'r', encoding="utf-8"):
    values = [str(s) for s in line.split(':')]
    X.append(values[0])
    Y.append(values[1])
plt.plot(X, Y)
plt.show()
Too bad I didn't buy TSLA when this video came out. $30/shr then, >$1000 today.
Hey, how do you start a new line exactly under the end of the previous line rather than under its beginning?
Hello sir, I am doing this using Anaconda.
Since Yahoo has suspended its API, it is tough to run the code. Please help, as I am a beginner.
Waiting for your reply.
Hello sir, I am using Spyder IDE in anaconda. I could only get HTTPError: NOT FOUND. Can you help sir?
Thanks for posting this video. I got my code to work with the new URL, but it was a bit confusing going through the for loop. I might have to watch it another time. Regardless, your videos in general so far are GREAT. Thanks
Link is now different, which means the data is different, which means the code in the video is wrong. I fixed all this before discovering Tomer Cna'an's comment below, but honestly, this video should either be redone, deleted, or have some sort of disclaimer upfront mentioning these issues…
import matplotlib.pyplot as plt
import numpy as np
import urllib
import matplotlib.dates as mdates
def graph_data(stock):
    stock_price_url = 'https://pythonprogramming.net/yahoo_finance_replacement'
    source_code = urllib.request.urlopen(stock_price_url).read().decode()
    stock_data = []
    split_source = source_code.split('\n')
    #plt.title('MY PLOT')
    plt.xlabel('x')
    plt.ylabel('y')
    plt.legend()
    plt.show()
graph_data('TSLA')
This is my code, exactly the same, but when I run it, it gives me a blank graph, like an empty square box. Please help me out.
My code:
import matplotlib.pyplot as plt
import numpy as np
import urllib.request
import matplotlib.dates as mdates
import pandas as pd

def graph_data(stock):
    stock_price_url = 'https://pythonprogramming.net/yahoo_finance_replacement'
    source_code = urllib.request.urlopen(stock_price_url).read().decode()
    stock_data = []
    date = []
    openP = []
    highP = []
    lowP = []
    closeP = []
    volume = []
    split_source = source_code.split('\n')
    for line in split_source:
        split_line = line.split(',')
        if len(split_line) == 7:
            if 'Date' not in line:
                stock_data.append(line)
                date.append(split_line[0])
                # Fields arrive as strings; convert so the bars get numeric heights.
                openP.append(float(split_line[1]))
                highP.append(float(split_line[2]))
                lowP.append(float(split_line[3]))
                closeP.append(float(split_line[4]))
                volume.append(float(split_line[6]))
    # Plot only the first five rows.
    date = date[:5]
    highP = highP[:5]
    plt.bar(date, highP, label='date vs max price', color='cyan')
    plt.xlabel('date')
    plt.ylabel('maximum price')
    plt.legend()
    plt.show()

graph_data('TSLA')
With the API changed this tutorial is now a mess. Please redo it or remove it. It's not beneficial in the state that it is in. Thanks.
How do I load the graph in an HTML page?
You can get data from Yahoo like this:
import pandas as pd
from pandas_datareader import data as wb
stock_data=wb.DataReader('TSLA',start='2008-1-1',data_source='yahoo')
Can't you just use pandas read_csv function to make this a lot simpler?
thank you so much sentdex
you can use this too.
import pandas_datareader as data
from datetime import datetime
data.get_data_yahoo('TSLA',start=datetime(2019, 4, 5),end=datetime(2019, 4, 7))
Up to which video in the Matplotlib series should I watch to learn enough to start machine learning?
I find this rather confusing. I keep getting this error…. NameError: name 'date' is not defined
Anyone know why? Any help is appreciated.
Here's the code:
https://codeshare.io/axYEEN
import datetime as dt
import pandas_datareader.data as web
import matplotlib.pyplot as plt
import pandas as pd
start = dt.datetime(2016,1,1)
end = dt.datetime(2018,1,1)
df = web.DataReader('TSLA','iex',start,end)
print(df)
Can we pass a list to np.loadtxt? PyCharm says 'expected a string, but got a list instead'. What should I do now?
For me, the URL I am trying to open is not opening; it says the server IP address can't be found. url='http://chartapi.finance.yahoo.com/instrument/1.0/'+stock+'/chardata;type=quote;range=10y/csv', with TSLA in place of stock.
I am trying to do the same as you did to get data from the internet, but I am getting an error. I am using Python 2.7. What change do I need to make? Please help me out.
Thanks for your video. Could you please tell me how you moved several columns to the right at the same time?
I am talking about plt.xlabel( ….? Thanks
hey,#sentdex
When the URL is read, the values come back as str, and that prevents further execution.
can anyone help in resolving this………:(
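A minimal sketch of one way around this, assuming each line looks like date,number,number: the fields returned by split are always strings, so numeric columns need an explicit float() before plotting or arithmetic. The line below is placeholder data, and `parse_line` is a hypothetical helper.

```python
from datetime import datetime

def parse_line(line):
    """Turn 'YYYY-MM-DD,open,close' text into a date and two floats."""
    date_text, open_text, close_text = line.strip().split(',')
    date = datetime.strptime(date_text, '%Y-%m-%d').date()
    return date, float(open_text), float(close_text)

# Placeholder values, not real prices.
date, openp, closep = parse_line('2000-01-03,1.5,2.25\n')
print(date, openp + closep)
```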
Hi, I haven't had any problem with understanding your tutorials, but from Tutorial 8, I can't understand anything… what is 'urllib', what is 'request', what is 'urlopen' 'read()' 'decode()'.. the difficulty suddenly soared up to the advanced level from the beginner's level…
I wanted to know how to handle the case where the response is JSON, as is common nowadays. How would you handle something like the URL below?
https://www.alphavantage.co/query?apikey=demo&function=TIME_SERIES_DAILY_ADJUSTED&symbol=MSFT
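A rough sketch of handling a JSON response with the standard-library json module. The payload below is a tiny hand-written stand-in shaped like a daily time series; its key names and values are assumptions for illustration, so check them against the real response before relying on them.

```python
import json

# Hand-written stand-in for the response body; keys and values are assumptions.
body = '''
{
  "Time Series (Daily)": {
    "2000-01-04": {"open": "2.0", "close": "3.0"},
    "2000-01-03": {"open": "1.0", "close": "2.0"}
  }
}
'''

data = json.loads(body)              # JSON text -> nested dicts
series = data["Time Series (Daily)"]

dates = sorted(series)               # ISO dates sort correctly as strings
closes = [float(series[d]["close"]) for d in dates]
print(dates, closes)
```

For a live response, `body` would come from `urllib.request.urlopen(url).read().decode()`, and the rest stays the same apart from the key names.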
Hey sentdex, I know this is an old tutorial, but quick question: in the data we had to account for the 'values' line because it had 6 fields, but the 'labels' line right above also has 6, each one a date, and we didn't have to account for that line. Also, if this is a CSV, why does each line besides the actual values we're looking for start with a title, then a colon, then the CSV data? It kind of looks like JSON. Example: values: date, close, high, low, open, volume. Is each line a dict of some sort, or is it JSON?
Also, no matter what I do, I'm getting a urllib open error: SSL cert verify failed!
Does Google have an API? Any link, please?
Can someone explain the function bytespdate2num in a little detail? I am totally lost…
Can someone explain why, after reading the data, the date is an array like this:
array([736536., 736535., 736534., …, 730124., 730123., 730122.])
Do the numbers represent ticks?
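These are date numbers: days counted from a fixed starting point, not ticks. In the Matplotlib releases current when this video was made, the count matched Python's proleptic Gregorian ordinal (day 1 is 0001-01-01), so the standard library can decode the values above. Newer Matplotlib releases (3.3 and later) use a different default epoch, 1970-01-01, so the same dates produce smaller numbers there.

```python
from datetime import date

# Values copied from the array in the question.
first = date.fromordinal(736536)
last = date.fromordinal(730122)

print(first)  # 2017-07-26
print(last)   # 2000-01-03
```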
Hi sentdex, thanks for the wonderful tutorial 🙂
If I may ask, in the line
source_code = urllib.request.urlopen(source_url).read().decode()
where source_url = "https://pythonprogramming.net/yahoo_finance_replacement",
1) what does urllib.request.urlopen(source_url) do or return?
2) what do .read() and .decode() do, respectively? Why are they necessary?
Thank you!
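A small sketch that walks through the three steps. To stay offline it opens a `data:` URL, which urllib.request handles like any other URL; the contents are placeholder text, not the real page.

```python
import urllib.request

# A data: URL embeds its own content, so no network access is needed.
# %0A is a URL-encoded newline; the text is a placeholder.
source_url = 'data:text/plain;charset=utf-8,Date,Close%0A2000-01-03,2.0'

response = urllib.request.urlopen(source_url)  # file-like response object
raw = response.read()                          # the body as bytes
text = raw.decode()                            # bytes -> str (UTF-8 by default)

print(type(raw).__name__, type(text).__name__)
print(text.split('\n'))
```

In short: urlopen returns a response object, read() pulls its body out as bytes, and decode() turns those bytes into a str so that string operations such as split('\n') and split(',') work on it.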
https://www.google.com/finance/getprices?i=60&p=1d&f=d,o,h,l,c,v&df=unix&q=IBM
To parse fractional seconds, use %f:
# %f = microseconds (up to 6 digits)
A date like 2018-05-27 14:42:11.163881 should use the following format:
bytespdate2num('%Y-%m-%d %H:%M:%S.%f')
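The format string can be checked on its own with the standard library before passing it to a converter. In Python's strptime, %f matches the fractional-seconds digits (one to six of them) and stores them as microseconds:

```python
from datetime import datetime

fmt = '%Y-%m-%d %H:%M:%S.%f'
stamp = datetime.strptime('2018-05-27 14:42:11.163881', fmt)

print(stamp.microsecond)  # 163881
print(stamp.isoformat())
```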
I'm getting an error saying : stock_data is not defined? Please help
First of all, great tutorial!
Since Yahoo has removed their API, could you make a video or some tutorials on how to use other currently available APIs, such as the IEX API?
Thank you
Thanks a lot for the easy explanation. You saved me a lot of time too.
Please help!
I'm not sure what's wrong with my code.
import matplotlib.pyplot as plt
import numpy as np
import urllib
import matplotlib.dates as mdates
def bytespdate2num(fmt, encoding='utf-8'):
    strconverter = mdates.strdate2num(fmt)
    def bytesconverter(b):
        s = b.decode(encoding)
        return strconverter(s)
    return bytesconverter
def graph_data(stock):
    stock_price_url = 'https://pythonprogramming.net/yahoo_finance_replacement'
    source_code = urllib.request.urlopen(stock_price_url).read().decode()
    stock_data = []
    split_source = source_code.split('\n')
    for line in split_source:
        split_line = line.split(',')
        if len(split_line) == 6:
            if 'values' not in line and 'labels' not in line:
                stock_data.append(line)
    date, closep, highp, lowp, openp, volume = np.loadtxt(stock_data,
                                                          delimiter=',',
                                                          unpack=True,
                                                          converters={0: bytespdate2num('%Y%m%d')})
    plt.plot_date(date, closep)
    plt.xlabel('Date')
    plt.ylabel('Price')
    plt.title('Intereting Graph\nCheck it out')
    plt.legend()
    plt.show()
graph_data('AAPL')
This may be wrongheaded hairsplitting. I think this video title should be "Matplotlib Tutorial 8 – Getting data from the internet".
I've been losing my mind trying to learn Python and data. Your vids are THE BEST resource I've found so far. THANK YOU VERY MUCH.
you could filter out the un-needed data by checking for a number at the beginning of the line.
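A minimal sketch of that idea, assuming every data line starts with a digit (for example the year of a date) and header or label lines do not. The lines below are placeholders, and `keep_data_lines` is a hypothetical helper name.

```python
def keep_data_lines(lines):
    """Keep only lines whose first character is a digit (e.g. a year)."""
    # line[:1] is '' for an empty line, and ''.isdigit() is False.
    return [line for line in lines if line[:1].isdigit()]

# Placeholder lines, not real data.
lines = ['Date,Open,Close', '2000-01-03,1.0,2.0', '', 'labels:a,b', '2000-01-04,2.0,3.0']
print(keep_data_lines(lines))
```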
stock_price_url = 'https://pythonprogramming.net/yahoo_finance_replacement'
Works for me
  File "matplotlib010.py", line 33
    plt.xlabel('date')
    ^
IndentationError: unexpected indent
I did include 'import matplotlib.pyplot as plt', but it still gives this error. Please help me.
Hi, your tutorials are very useful. Why don't you use Jupyter Notebook?
Correct Code:
import matplotlib.pyplot as plt
import numpy as np
import urllib.request
from matplotlib.dates import bytespdate2num

stock_url = 'https://pythonprogramming.net/yahoo_finance_replacement'
src = urllib.request.urlopen(stock_url).read().decode()
split_src = src.split('\n')
data = []
for line in split_src:
    split_line = line.split(',')
    # Keep 7-field rows and skip the header line.
    if len(split_line) == 7:
        if 'Date' not in line:
            data.append(line)
Date, Open, High, Low, Close, Adjusted_close, Volume = np.loadtxt(
    data, delimiter=',', unpack=True,
    converters={0: bytespdate2num('%Y-%m-%d')})
Huge thanks for the great videos. I have a question: why did we use decode()? I didn't clearly understand…
Huge thanks for these videos! I found a way to pull data for this module.
1. First, you must have a current installation of pandas (check version with: import pandas as pd, pd.__version__).
2. Second, you have to install pandas_datareader (pip install pandas_datareader)
3. Then, at least if you're running version 20.3, these lines will pull in the data:
from pandas_datareader import data, wb
df = data.DataReader('TSLA','yahoo')
df
My finished function (for this module) looked like this, and I didn't have to do all the date manipulation:
def graph_data(stock):
    df = data.DataReader(stock,'yahoo')
    plt.plot_date(df.index, df.Close, '-')
    plt.xlabel('Date')
    plt.ylabel('Price')
    plt.title('Interesting Graph')
    plt.legend()
    plt.show()
graph_data('TSLA')
Hope this helps!