Tags : Browse Projects


Scrapy


  Analyzed about 4 hours ago

Scrapy is a fast high-level scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

49.3K lines of code

50 current contributors

5 days since last commit

20 users on Open Hub

High Activity
5.0
 

Weboob


  Analyzed about 2 hours ago

Web Outside of Browsers. Weboob is a collection of applications able to interact with websites, without requiring the user to open them in a browser. It also provides well-defined APIs to talk to websites lacking one.

279K lines of code

50 current contributors

3 months since last commit

12 users on Open Hub

Moderate Activity
5.0
 

QuickCode (formerly ScraperWiki)


  No analysis available

QuickCode is the new name for the original ScraperWiki product. We renamed it, as it isn’t a wiki or just for scraping any more. It’s a Python and R data analysis environment, ideal for economists, statisticians and data managers who are new to coding.

0 lines of code

1 current contributor

0 since last commit

1 user on Open Hub

Activity Not Available
0.0
 
Primary language: not available
Licenses: No declared licenses

Corpusexplorer.SDK.Extern


  Analyzed about 22 hours ago

This project is part of the CorpusExplorer Software Development Kit (CorpusExplorer SDK) [further information is available here]. The SDK and all of its parts can be used free of charge for research and educational projects. This part of the SDK is licensed under GPL-3.0. You can use this project to: - Extend the CorpusExplorer. - Connect your own program to the CorpusExplorer (API interface). - Or develop or extend your own program independently of the CorpusExplorer, building on proven solutions.

33.8K lines of code

0 current contributors

about 7 years since last commit

1 user on Open Hub

Inactive
5.0
 

Preferred Stock Search Application


  Analyzed about 6 hours ago

A Preferred Stock search and evaluation application which utilizes data from both QuantumOnline.com and Yahoo.com to present accurate, current preferred stock information.

39.3K lines of code

0 current contributors

about 7 years since last commit

0 users on Open Hub

Inactive
0.0
 
Licenses: No declared licenses

Tweeper


  Analyzed 2 months ago

tweeper is a web scraper which can be used to conveniently follow the public activity of social network users without the need to log in or even be subscribed to the social network; tweeper converts the public information to RSS so that it can be accessed and collected by a feed reader. tweeper started as the TWitter fEEd scraPER, but support for other websites has been added.

1.09K lines of code

1 current contributor

over 2 years since last commit

0 users on Open Hub

Activity Not Available
0.0
 

htmlSQL (PHP Library)


  Analyzed about 22 hours ago

htmlSQL is an experimental PHP library which allows you to access HTML values with an SQL-like syntax. This means that you don't have to write complex functions or regular expressions to extract specific values.

1.5K lines of code

0 current contributors

about 3 years since last commit

0 users on Open Hub

Inactive
0.0
 

Xidel


  No analysis available

Xidel is a command line tool to download web pages and extract data from them. It can download files over http/s connections, follow redirections, links, or extracted values, and also process local files. The data can be extracted using XPath 2.0, XQuery 1.0 expressions, JSONiq, CSS 3 selectors, and custom, pattern-matching templates that are like an annotated version of the processed page. The extracted values can then be exported as plain text/xml/html/json or assigned to variables to be used in other extract expressions or to be exported to the shell. There is also an online cgi service for testing.

0 lines of code

1 current contributor

0 since last commit

0 users on Open Hub

Activity Not Available
0.0
 
Primary language: not available
Licenses: GPL

infoqscraper


  Analyzed about 3 hours ago

A web scraper for InfoQ. InfoQ hosts a lot of great presentations; unfortunately, it is not possible to watch them outside of the browser or if you do not have Flash installed. The video cannot simply be downloaded because the audio stream and the slide stream are not in the same media. By downloading the video you only get the audio track and a video of the presenter, but you don't get the slides. infoqscraper allows you to list and search for presentations, download the resources (video, audio track, slides) and build a movie including the slides and the audio track from the resources. The project is split into two parts: a reusable library and a command line interface.

1.36K lines of code

0 current contributors

about 7 years since last commit

0 users on Open Hub

Inactive
0.0
 