Tags : Browse Projects

Select a tag to browse associated projects and drill deeper into the tag cloud.

ChaSen

  No analysis available

ChaSen is a morphological analysis system. It segments Japanese text strings into morphemes and tags those morphemes with their parts of speech and pronunciations. It also tags conjugative morphemes with their base forms and conjugation types/forms.

0 lines of code

0 current contributors

0 since last commit

0 users on Open Hub

Activity Not Available
0.0
 
Mostly written in language not available
Licenses: No declared licenses

clic/LM_PRU

  Analyzed 1 day ago

The LiverMemories/Portale della Ricerca Umanistica (LM/PRU) pipeline is a multilingual-aware web service for processing archaeology texts. Currently, the input consists of documents in Portable Document Format (PDF) with machine-encoded text, for example, encoded via optical character recognition (OCR) or directly from the source application, and the output consists of tabular, multi-column files.

9.12K lines of code

0 current contributors

almost 12 years since last commit

0 users on Open Hub

Inactive
0.0
 
Licenses: No declared licenses

Zeitcrawler

  Analyzed about 13 hours ago

A specialized crawler for the German newspaper 'Die Zeit'. Starting from the front page or from a given list of links, the crawler retrieves newspaper articles and gathers new links to explore as it goes, stripping the text of each article out of the HTML formatting and saving it into a raw text file. The project includes scripts to convert it into the XML format for further use with natural language processing tools.

1.64K lines of code

0 current contributors

about 10 years since last commit

0 users on Open Hub

Inactive
0.0
 

Équipe Crawler

  Analyzed about 2 hours ago

A specialized crawler for the French sports newspaper L'Équipe. Starting from the front page or from a given list of links, the crawler retrieves newspaper articles and gathers new links to explore as it goes, stripping the text of each article out of the HTML formatting and saving it into a raw text file. The project includes scripts to convert it into the XML format for further use with natural language processing tools.

401 lines of code

0 current contributors

over 11 years since last commit

0 users on Open Hub

Inactive
0.0
 

German Political Speeches Corpus-Builder

  Analyzed about 1 hour ago

Tools to crawl repositories of official German speeches in order to gather a corpus. More information to come. A complete version of the corpus, including a visualization tool, is available here: http://purl.org/corpus/german-speeches

1.08K lines of code

0 current contributors

over 10 years since last commit

0 users on Open Hub

Inactive
0.0
 

Microblog Explorer

  Analyzed about 22 hours ago

Performs crawls of social networks (identi.ca, reddit) to gather internal and external links and identify their language.

1.04K lines of code

0 current contributors

almost 11 years since last commit

0 users on Open Hub

Inactive
0.0
 

purepos

  Analyzed about 7 hours ago

PurePos morphological disambiguator.

6.84K lines of code

1 current contributor

about 4 years since last commit

0 users on Open Hub

Inactive
0.0
 

TextBlob

  Analyzed about 12 hours ago

Simple, Pythonic text processing: sentiment analysis, POS tagging, noun phrase extraction, translation, and more.

8K lines of code

6 current contributors

almost 4 years since last commit

0 users on Open Hub

Inactive
5.0
 

php-nlp-tools

  Analyzed about 21 hours ago

NlpTools is a library for natural language processing written in PHP. Its development is driven by the author's needs for text classification, clustering, tokenizing, stemming, etc.

5.57K lines of code

0 current contributors

over 6 years since last commit

0 users on Open Hub

Inactive
0.0
 
Licenses: No declared licenses

Semantic Czech

  Analyzed about 10 hours ago

The task of the project is the semantic annotation of texts using NLP tools. Czsem Mining Suite is mainly a GATE plugin that allows Treex and TectoMT tools to be used inside GATE. Besides that, it is also an information extraction tool based on dependency linguistics. It is capable of learning tree queries (dependency-based extraction rules) using Inductive Logic Programming.

165K lines of code

0 current contributors

over 8 years since last commit

0 users on Open Hub

Inactive
0.0
 
Licenses: No declared licenses