Projects tagged ‘webscraper’

Scrapy

Analyzed 21 minutes ago

Scrapy is a fast, high-level scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

61.6K lines of code

50 current contributors

1 day since last commit

20 users on Open Hub

High Activity

0 Reviews


Mostly written in Python

Licenses: BSD-3-Clause

Weboob

Analyzed 1 day ago

Web Outside of Browsers. Weboob is a collection of applications able to interact with websites, without requiring the user to open them in a browser. It also provides well-defined APIs to talk to websites lacking one.

286K lines of code

50 current contributors

2 months since last commit

12 users on Open Hub

Moderate Activity

0 Reviews


Mostly written in Python

Licenses: lgpv3_or_...

Tags extraction python scraping webscraper xpath

QuickCode (formerly ScraperWiki)

No analysis available

QuickCode is the new name for the original ScraperWiki product. We renamed it, as it isn’t a wiki or just for scraping any more. It’s a Python and R data analysis environment, ideal for economists, statisticians and data managers who are new to coding.

0 lines of code

1 current contributor

0 since last commit

1 user on Open Hub

Activity Not Available

0 Reviews


Mostly written in language not available

Licenses: No declared licenses

Tags crawling extraction scraper scraping screenscraper screen_scraping webscraper webscraping wiki

Corpusexplorer.SDK.Extern


Analyzed about 14 hours ago

This project is part of the CorpusExplorer Software Development Kit (CorpusExplorer SDK). The SDK and all of its parts can be used free of charge for research and education projects. This part of the SDK is licensed under GPL-3.0. You can ...

33.8K lines of code

0 current contributors

over 9 years since last commit

1 user on Open Hub

Inactive

0 Reviews


Mostly written in C#

Licenses: GPL2

Tags crawler importer linguistic linguistics nlp scraper webscraper xmlparser

Preferred Stock Search Application

Analyzed about 2 hours ago

A Preferred Stock search and evaluation application which utilizes data from both QuantumOnline.com and Yahoo.com to present accurate, current preferred stock information.

39.3K lines of code

0 current contributors

over 9 years since last commit

0 users on Open Hub

Inactive

0 Reviews


Mostly written in Python

Licenses: No declared licenses

Tags finance financial html parser python scrap scraper scraping semantic sitescraper stock stock_assessment 10 more...

Tweeper


Analyzed about 10 hours ago

tweeper is a web scraper which can be used to conveniently follow the public activity of social network users without the need to log in or even be subscribed to the social network; tweeper converts the public information to RSS so that it can be accessed and collected by a feed reader. tweeper ...

1.09K lines of code

1 current contributor

over 4 years since last commit

0 users on Open Hub

Inactive

0 Reviews


Mostly written in XSL Transformation

Licenses: No declared licenses

Tags instagram rss twitter webscraper

htmlSQL (PHP Library)


Analyzed about 17 hours ago

htmlSQL is an experimental PHP library which allows you to access HTML values using an SQL-like syntax. This means that you don't have to write complex functions or regular expressions to extract specific values.

1.5K lines of code

0 current contributors

over 5 years since last commit

0 users on Open Hub

Inactive

0 Reviews


Mostly written in PHP

Licenses: bsd

Tags extraction html library parsing php scraping sql web webscraper webscraping xml

Xidel


No analysis available

Xidel is a command-line tool to download web pages and extract data from them. It can download files over HTTP/HTTPS connections, follow redirections, links, or extracted values, and also process local files. The data can be extracted using XPath 2.0, XQuery 1.0 expressions, JSONiq, CSS 3 selectors ...

0 lines of code

1 current contributor

0 since last commit

0 users on Open Hub

Activity Not Available

0 Reviews


Mostly written in language not available

Licenses: gpl

Tags cli commandlinetool crawling css3 downloader extraction html http https internet json scraper 8 more...

infoqscraper


Analyzed about 7 hours ago

A web scraper for InfoQ. InfoQ hosts a lot of great presentations; unfortunately, it is not possible to watch them outside of the browser or if you do not have Flash installed. The video cannot simply be downloaded because the audio stream and the slide stream are not in the same media. By ...

1.36K lines of code

0 current contributors

about 9 years since last commit

0 users on Open Hub

Inactive

0 Reviews


Mostly written in Python

Licenses: bsd_2clau...

Tags infoq python webscraper
