I Scraped Zhihu's "Silly Questions" and Nearly Died Laughing!
import re

import pandas as pd
import requests
from bs4 import BeautifulSoup

# questions (relative question URLs), cookie and header are assumed to be defined beforehand
questions_df = pd.DataFrame(columns=['title', 'visit', 'follower', 'answer', 'is_open'])

for i in range(len(questions)):
    try:
        url = 'https://www.zhihu.com/' + questions[i]
        html = requests.get(url, cookies=cookie, headers=header).content
        bsObj = BeautifulSoup(html.decode('utf-8'), "html.parser")
        text = str(bsObj)
        # Question title from the page header
        title = bsObj.find('h1', attrs={'class': 'QuestionHeader-title'}).text
        # The counters appear in the page source as "key":number pairs
        visit = int(re.findall(r'"visitCount":\d+', text)[0].replace('"visitCount":', ''))
        follower = int(re.findall(r'"followerCount":\d+', text)[0].replace('"followerCount":', ''))
        answer = int(re.findall(r'"answerCount":\d+', text)[0].replace('"answerCount":', ''))
        # 1 if the "问题已关闭" (question closed) notice is absent, otherwise 0
        is_open = int(len(re.findall('问题已关闭', text)) == 0)
        questions_df = questions_df.append({'title': title, 'visit': visit,
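The listing above breaks off partway through the loop, so here is a small self-contained sketch of just the parsing step, so it can be tried without hitting the network. The helper name `parse_question_stats`, the `COUNT_FIELDS` mapping and the sample text are all made up for illustration; only the field names (`visitCount`, `followerCount`, `answerCount`) and the "问题已关闭" check come from the code above. It is one possible way to write it, not necessarily how the original continues.

```python
import re

# Hypothetical mapping from DataFrame column name to the key found in the page source.
COUNT_FIELDS = {"visit": "visitCount", "follower": "followerCount", "answer": "answerCount"}


def parse_question_stats(text):
    """Extract the counters and the open/closed flag from a question page's HTML text."""
    row = {}
    for column, key in COUNT_FIELDS.items():
        # A capture group returns the digits directly, so no .replace() is needed.
        match = re.search(r'"%s":(\d+)' % key, text)
        if match is None:
            raise ValueError("field %r not found in page" % key)
        row[column] = int(match.group(1))
    # 1 when the "问题已关闭" (question closed) notice is absent, else 0.
    row["is_open"] = int("问题已关闭" not in text)
    return row


# Made-up sample text standing in for a downloaded page (not real Zhihu data).
sample = '{"visitCount":12345,"followerCount":67,"answerCount":8}'
rows = [parse_question_stats(sample)]
print(rows)
```

Collecting the rows in a plain list and calling `pd.DataFrame(rows)` once at the end is probably the easier route on current pandas, since `DataFrame.append` was removed in pandas 2.0.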