2023.06.09
1. urlretrieve
1-1. What is urlretrieve?
urlretrieve is a function that downloads a network object, denoted by a URL, into a local file.
Its one required argument is the URL of the file you want to download.
It returns a two-element tuple: the name (path) of the local copy and the headers.
The headers are the same object you get from calling .info() on the response returned by urlopen.
1-2. Code Example
import urllib.request as req

img_url = 'https://ichef.bbci.co.uk/news/640/cpsprodpb/E172/production/_126241775_getty_cats.png'  # A random cat pic
html_url = 'http://google.com'
dest_path1 = 'mypath/img.jpg'
dest_path2 = 'mypath/index.html'

try:
    file1, header1 = req.urlretrieve(img_url, dest_path1)
    file2, header2 = req.urlretrieve(html_url, dest_path2)
except Exception as e:
    print('Download failed')
    print(e)
else:
    print(header1)
    print(header2)
    # Downloaded file info
    print('file1 {}'.format(file1))
    print('file2 {}'.format(file2))
Result
header1 info (date, expiration date, content-type, connection, etc...)
header2 info (date, expiration date, content-type, connection, etc...)
file1 mypath/img.jpg
file2 mypath/index.html
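urlretrieve also accepts an optional third argument, a reporthook callback, which it calls once before the transfer starts and once after each block is copied. That makes it easy to report download progress. A minimal sketch of the callback's signature, using a file:// URL and a temporary file so it runs without a network connection (the source file and paths here are just illustrative):

```python
import tempfile
import urllib.request as req

# Create a small local file to act as the download source, so the
# sketch works offline via a file:// URL.
src = tempfile.NamedTemporaryFile(delete=False, suffix='.txt')
src.write(b'hello' * 1000)  # 5000 bytes
src.close()

def progress(block_num, block_size, total_size):
    # Called by urlretrieve with: number of blocks copied so far,
    # block size in bytes, and total size from Content-Length
    # (-1 if the server did not send one).
    done = block_num * block_size
    if total_size > 0:
        done = min(done, total_size)
    print('{} / {} bytes'.format(done, total_size))

dest = src.name + '.copy'
filename, headers = req.urlretrieve('file://' + src.name, dest,
                                    reporthook=progress)
print('saved to', filename)
```

The same reporthook works unchanged for http/https URLs; total_size is -1 when the server omits Content-Length.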
2. urlopen
2-1. What is urlopen?
As its name suggests, urlopen opens a URL.
The required argument can be either a string containing a valid, properly encoded URL, or a Request object from the urllib.request module.
It returns a response object that carries the url, headers, and status.
2-2. Code Example
# URLError: raised for protocol-level errors (e.g. unreachable host)
# HTTPError: raised when an HTTP request returns an unsuccessful status code
import os
from dotenv import load_dotenv
import urllib.request as req
from urllib.error import URLError, HTTPError  # For exception handling

load_dotenv()
PATH = os.getenv('DEST_PATH')
path_list = [PATH + '/img.jpg', PATH + '/index.html']
target_list = ['https://img.freepik.com/free-photo/adorable-kitty-looking-like-it-want-to-hunt_23-2149167099.jpg?w=2000', 'http://google.com']

for i, url in enumerate(target_list):
    try:
        # Read web response info
        res = req.urlopen(url)
        contents = res.read()
        print("---------------------------------")
        # Print status info
        print('Header Info [{}]: {}'.format(i, res.info()))
        print('HTTP Status Code: {}'.format(res.getcode()))
        print("---------------------------------")
        with open(path_list[i], 'wb') as c:
            c.write(contents)
    except HTTPError as e:
        print("Download failed")
        print("HTTPError code: ", e.code)
    except URLError as e:
        print("URL Error Reason: ", e.reason)
    # Success
    else:
        print("Download succeeded.")
Result
---------------------------------
Header Info [0]: (date, expiration date, content-type, connection, etc...)
HTTP Status Code: 200
---------------------------------
Download succeeded.
---------------------------------
Header Info [1]: (date, expiration date, content-type, connection, etc...)
HTTP Status Code: 200
---------------------------------
Download succeeded.
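As noted above, urlopen accepts a Request object in place of a plain URL string. That is useful when you need to attach request headers before opening the URL, for example a custom User-Agent, which some servers require. A small sketch (the URL and header value are just illustrative):

```python
from urllib.request import Request, urlopen

url = 'http://google.com'
# Build a Request carrying a custom User-Agent header.
request = Request(url, headers={'User-Agent': 'Mozilla/5.0 (my-script/1.0)'})

# Request stores header names capitalized, e.g. 'User-agent'.
print(request.get_header('User-agent'))
print(request.full_url)

# urlopen takes the Request anywhere a URL string would go:
# res = urlopen(request)
# print(res.getcode())
```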