Comments:
@ma1achite I use eclipse classic. It's free and works with most every language
@0Allhell View the page source in your browser to find out which tags you need to target. You can scrape anything that shows on the screen.
@ma1achite He's using Eclipse. Google "Eclipse IDE".
@entrevu To scrape anything, you just need the basic concepts I covered here plus a better understanding of regular expressions. I did a tutorial in PHP that covers advanced website scraping called Web Design and Programming Pt 24. The regular expression explanation there is identical to regex in Python. I hope that helps.
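A rough sketch of the approach described above: look at the page source, pick the tags around the text you want, and pull the text out with a regular expression. The HTML snippet, tag names, and class names here are made up for illustration; a real page will use different ones.

```python
import re

# Hypothetical markup, standing in for what "view source" might show.
html = """
<div class="post"><h2 class="title">First post</h2><p class="snippet">Hello world</p></div>
<div class="post"><h2 class="title">Second post</h2><p class="snippet">More text</p></div>
"""

# (.*?) is a non-greedy capture, so each match stops at the first closing tag.
titles = re.findall(r'<h2 class="title">(.*?)</h2>', html)
snippets = re.findall(r'<p class="snippet">(.*?)</p>', html)

for title, snippet in zip(titles, snippets):
    print(title, "-", snippet)
```

Patterns like these break as soon as the site changes its markup, so it is worth rechecking the page source whenever the output goes empty.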
My only question is how to make Eclipse recognize the BeautifulSoup download. I used 'python setup.py install' in the terminal, so where do these files have to go? Where do I have to put beautifulsoup.py or the other files that came with the install? As you would expect, in Eclipse I am getting the error "Unresolved import: BeautifulSoup".
Are you on a Mac or a PC?
Mac
Figured it out. Now I'm just getting errors with re.findall giving "TypeError: expected string or buffer".
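The comment above doesn't show the code, so this is only a guess at the cause: that TypeError usually means re.findall was handed something that is not a string, such as a parsed tag object or raw bytes. A minimal sketch, written for Python 3, where `find_links` and `FakeTag` are hypothetical names used only for this example:

```python
import re

def find_links(page):
    """Return href values found in page, coercing non-string input to str first."""
    if isinstance(page, bytes):
        # Raw response bodies are bytes; decode before using a str pattern.
        page = page.decode("utf-8", errors="replace")
    elif not isinstance(page, str):
        # Parsed tag objects usually render their markup through str().
        page = str(page)
    return re.findall(r'href="(.*?)"', page)

class FakeTag:
    """Hypothetical stand-in for a parsed tag object."""
    def __str__(self):
        return '<a href="http://example.com/a">A</a>'

# Passing the object directly reproduces the kind of error reported.
try:
    re.findall(r'href="(.*?)"', FakeTag())
except TypeError as err:
    print("TypeError:", err)

links_from_tag = find_links(FakeTag())
links_from_bytes = find_links(b'<a href="http://example.com/b">B</a>')
print(links_from_tag, links_from_bytes)
```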
Hi Derek. I need your help. Do you have an email? I will write a lot. Hope you answer.
Send me an email and I'll see if I can help: [email protected]
My network is behind a proxy, so when I open a webpage it asks me for a username and password. Is there any way I can store the username and password in the program itself so that it doesn't ask me? I searched and tried urllib2 proxy handlers but got an error.
Sorry, but I'd have to know more about how that information is checked.
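For the proxy question, one possible approach, assuming the proxy accepts HTTP Basic authentication (this will not work for NTLM or other schemes). The sketch uses Python 3's urllib.request, the successor to urllib2; the host, port, and credentials are placeholders, and `build_proxy_opener` is a hypothetical helper name.

```python
import urllib.request
from urllib.parse import quote

def build_proxy_opener(host, port, username, password):
    """Build an opener that routes requests through an authenticating proxy."""
    # Percent-encode the credentials so characters like '@' or ':' survive in the URL.
    proxy_url = "http://{}:{}@{}:{}".format(
        quote(username, safe=""), quote(password, safe=""), host, port
    )
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)

# Placeholder values; replace with your own proxy details.
opener = build_proxy_opener("proxy.example.com", 8080, "user", "p@ss")

# No request is made here. With a real proxy you would call, for example:
# page = opener.open("http://example.com/").read()
```

Storing a password in the script in plain text is convenient but risky; reading it from an environment variable is a common alternative.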
Hello! I am wondering whether you have or know of a tutorial on scraping pages that are auto-generated with JavaScript.
What did you do to fix this error importing BS?
I use your exact code but I only get the links and the titles. The code fails to output the snippet of the article. Any help? Has the feed for Huffington Post changed?
They may have changed the tags a bit. Check whether the tag around the snippet has changed.
from bs4 import BeautifulSoup
Hi Derek, I have a question: how do I pass credentials when scraping a website?
Hello again, it's been a while. I was wondering which is the best method to use for web scraping: curl? Beautiful Soup? get_html? For example, I can block curl to my site through the config.ini, so I want to start scraping but I don't know which is the right or best method to use.
I actually use PHP most of the time, but Beautiful Soup for Python has improved lately and is quite good.