[python] Fetching web page information with requests and BeautifulSoup


woonizzooni 2019. 6. 14. 01:32

Using very (!!) simple code...

Target: fetch the '다음을 시작페이지로' ("Set Daum as start page") link text from https://www.daum.net
Code (no exception handling or the like; this is literally just an example...)

#!/usr/bin/python
# -*- coding: utf-8 -*-
import requests
from bs4 import BeautifulSoup

def get(url):
    # Browser-like request headers. Host is hard-coded for this example URL.
    headers = {
        'Host': 'www.daum.net',
        'Connection': 'close',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)',
        'Accept': '*/*',
        'Accept-Encoding': 'gzip, deflate',
        'Accept-Language': 'ko-KR,ko;q=0.9,en-US;q=0.8,en;q=0.7',
        'Cache-Control': 'max-age=0'
    }

    res = requests.get(url, headers=headers, allow_redirects=True)
    if res.status_code == 200:
        dom = BeautifulSoup(res.content, 'html.parser')
        # Text of the first <a id="homePage"> element
        homepage = dom.find_all('a', {'id': 'homePage'})[0].string
        print(homepage)
    else:
        # res.url is the final URL of the response
        print("Error! [%d : %s]" % (res.status_code, res.url))

url = 'https://www.daum.net'
get(url)

Output

> python get.py
다음을 시작페이지로
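
Since the example skips exception handling, here is a minimal sketch of one way to add it. The helper names `extract_text_by_id` and `fetch_html`, the 5-second timeout, and the sample HTML string are assumptions made for illustration; they are not part of the original code, and the sample is not the real Daum markup.

```python
import requests
from bs4 import BeautifulSoup


def extract_text_by_id(html, element_id):
    """Return the text of the first <a> with the given id, or None if absent."""
    dom = BeautifulSoup(html, 'html.parser')
    link = dom.find('a', id=element_id)
    return link.get_text(strip=True) if link is not None else None


def fetch_html(url, timeout=5.0):
    """Return the response body as bytes, or None if the request fails."""
    try:
        res = requests.get(url, timeout=timeout)
        res.raise_for_status()  # raises HTTPError on 4xx/5xx responses
    except requests.RequestException as exc:
        # Covers connection errors, timeouts, malformed URLs and HTTP errors
        print("Error! [%s]" % exc)
        return None
    return res.content


if __name__ == '__main__':
    # Offline demonstration on a hand-written snippet (hypothetical markup).
    sample = '<div><a id="homePage" href="#">다음을 시작페이지로</a></div>'
    print(extract_text_by_id(sample, 'homePage'))
    print(extract_text_by_id(sample, 'missing'))
```

Separating fetching from parsing means the parsing step can be tried on a saved or hand-written HTML string without touching the network, and a missing element yields None instead of an IndexError.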
