Tags : Browse Projects

Natural Language Toolkit (NLTK)

  Analyzed 4 minutes ago

NLTK, the Natural Language Toolkit, is a suite of open source Python modules, linguistic data, and documentation for research and development in natural language processing. It supports dozens of NLP tasks, with distributions for Windows, Mac OS X, and Linux.

234K lines of code

42 current contributors

14 days since last commit

45 users on Open Hub

Low Activity
5.0

Apertium

  Analyzed about 10 hours ago

Apertium is an open-source machine translation platform, initially aimed at related-language pairs and later expanded to deal with more divergent language pairs. The platform provides (1) a language-independent machine translation engine, (2) tools to manage the linguistic data necessary to build a machine translation system for a given language pair, and (3) linguistic data for a growing number of language pairs. Apertium uses a shallow-transfer machine translation engine that processes the input text in stages, as in an assembly line: de-formatting, morphological analysis, part-of-speech disambiguation, shallow structural transfer, lexical transfer, morphological generation, and re-formatting.

96.8K lines of code

0 current contributors

29 days since last commit

13 users on Open Hub

Moderate Activity
4.9
Licenses: GNU_Free_..., gpl, gpl3_or_l...
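
The assembly-line processing described for Apertium can be pictured as a chain of text-to-text stages applied in a fixed order. The following is only a minimal sketch of that idea: the stage names come from the description above, while `make_stage`, `run_pipeline`, and the pass-through stage bodies are hypothetical placeholders and are not Apertium's actual API or behavior.

```python
from functools import reduce
from typing import Callable, List

# A stage is any function that takes text and returns text.
Stage = Callable[[str], str]

# Stage names as listed in the Apertium description; order matters.
STAGE_NAMES = [
    "de-formatting",
    "morphological analysis",
    "part-of-speech disambiguation",
    "shallow structural transfer",
    "lexical transfer",
    "morphological generation",
    "re-formatting",
]


def make_stage(name: str, trace: List[str]) -> Stage:
    """Build a hypothetical placeholder stage.

    A real stage would transform the text; this one only records that it
    ran and passes the text through unchanged.
    """
    def stage(text: str) -> str:
        trace.append(name)
        return text
    return stage


def run_pipeline(text: str, stages: List[Stage]) -> str:
    """Feed the output of each stage into the next, left to right."""
    return reduce(lambda acc, stage: stage(acc), stages, text)


trace: List[str] = []
stages = [make_stage(name, trace) for name in STAGE_NAMES]
result = run_pipeline("<p>hello</p>", stages)
print(" -> ".join(trace))
```

Because every stage shares the same text-in, text-out shape, a stage can be swapped or inspected in isolation without touching the others, which is the main appeal of the assembly-line layout.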

Apache OpenNLP

Claimed by Apache Software Foundation. Analyzed about 3 hours ago

Apache OpenNLP is a Java machine learning toolkit for natural language processing (NLP).

157K lines of code

8 current contributors

1 day since last commit

12 users on Open Hub

Moderate Activity
5.0

LanguageTool

  Analyzed 1 day ago

LanguageTool is an open source language checker for English, German, Polish, Dutch, and other languages. It is rule-based, i.e. it finds errors for which a rule is defined in its XML configuration files. Rules for more complicated errors can be written in Java.

1.26M lines of code

37 current contributors

2 days since last commit

11 users on Open Hub

Very High Activity
4.66667

Treex - NLP Framework

  Analyzed about 4 hours ago

Treex (formerly TectoMT) is a highly modular NLP software system implemented in the Perl programming language under Linux. It is primarily aimed at machine translation, making use of the ideas and technology created during the Prague Dependency Treebank project. It is also intended to facilitate and accelerate the development of software for many other NLP tasks, mainly through the reusability of its numerous integrated processing modules (called blocks), which have uniform object-oriented interfaces.

242K lines of code

4 current contributors

about 1 month since last commit

4 users on Open Hub

Moderate Activity
5.0

RelEx Semantic Relationship Extractor

  Analyzed about 22 hours ago

RelEx is an English-language semantic relationship extractor, built on the Carnegie Mellon Link Grammar parser. It can identify dependency-grammar relations such as subject, object, indirect object, and many other relationships between words in a sentence. It can also provide part-of-speech tagging, noun-number tagging, verb tense tagging, gender tagging, and so on. RelEx includes a basic implementation of the Hobbs anaphora (pronoun) resolution algorithm. RelEx also provides semantic relationship framing, similar to that of FrameNet.

11.8K lines of code

4 current contributors

4 months since last commit

2 users on Open Hub

Very Low Activity
0.0

Link Grammar

  Analyzed about 5 hours ago

The Link Grammar Parser is a syntactic parser of English, based on link grammar, an original theory of English syntax. Given a sentence, the system assigns to it a syntactic structure, which consists of a set of labeled links connecting pairs of words. The parser also produces a "constituent" representation of a sentence (a Penn Treebank-style phrase tree showing noun phrases, verb phrases, etc.).

77.4K lines of code

4 current contributors

2 days since last commit

1 user on Open Hub

Moderate Activity
0.0
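
The "set of labeled links connecting pairs of words" in the Link Grammar description can be modeled as a small data structure. The sketch below is illustrative only: the `Link` class and `links_of` helper are hypothetical names, the linkage is written by hand rather than produced by the parser, and the labels `D`, `S`, and `O` are simplified stand-ins for determiner, subject, and object links.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Link:
    """One labeled link between two word positions in a sentence."""
    left: int    # index of the left word
    right: int   # index of the right word
    label: str   # link type (simplified, illustrative labels)


words = ["the", "cat", "chased", "a", "mouse"]

# Hand-written example linkage (an assumption, not real parser output).
linkage: Set[Link] = {
    Link(0, 1, "D"),  # the -- cat
    Link(1, 2, "S"),  # cat -- chased
    Link(2, 4, "O"),  # chased -- mouse
    Link(3, 4, "D"),  # a -- mouse
}


def links_of(word_index: int, links: Set[Link]) -> List[Link]:
    """Return every link touching the given word, ordered by position."""
    touching = [l for l in links if word_index in (l.left, l.right)]
    return sorted(touching, key=lambda l: (l.left, l.right))


# Show the links attached to the verb "chased".
for link in links_of(2, linkage):
    print(f"{words[link.left]} --{link.label}-- {words[link.right]}")
```

Storing links as an unordered set of word-pair records mirrors the description: the structure is a set of pairwise connections over the words, not a nested tree.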
 

opencorpora

  Analyzed about 1 hour ago

An engine for creating and annotating textual corpora

38.6K lines of code

3 current contributors

8 months since last commit

1 user on Open Hub

Very Low Activity
0.0

spaCy

  Analyzed about 2 hours ago

spaCy is a library for advanced natural language processing in Python and Cython. spaCy is built on the very latest research, but it isn't researchware. It was designed from day one to be used in real products. spaCy currently supports English, German, French and Spanish, as well as tokenization for Italian, Portuguese, Dutch, Swedish, Finnish, Norwegian, Hungarian, Bengali, Hebrew, Chinese and Japanese. It's commercial open-source software, released under the MIT license.

123K lines of code

0 current contributors

3 days since last commit

1 user on Open Hub

Moderate Activity
0.0

WebAnno

  Analyzed about 2 hours ago

WebAnno: A Flexible, Web-based and Visually Supported System for Distributed Annotations

121K lines of code

10 current contributors

over 1 year since last commit

1 user on Open Hub

Very Low Activity
5.0