Tags : Browse Projects

Natural Language Toolkit (NLTK)

  Analyzed about 22 hours ago

NLTK, the Natural Language Toolkit, is a suite of open source Python modules, linguistic data, and documentation for research and development in natural language processing. It supports dozens of NLP tasks and is distributed for Windows, Mac OS X, and Linux.

234K lines of code

42 current contributors

14 days since last commit

45 users on Open Hub

Low Activity
5.0
 

Apache UIMA Java SDK

Claimed by Apache Software Foundation Analyzed about 21 hours ago

Apache UIMA is an Apache-licensed open source implementation of the UIMA specification (that specification is, in turn, being developed concurrently by a technical committee within OASIS, a standards organization). We invite and encourage you to participate in both the implementation and specification efforts. UIMA is a component framework for analysing unstructured content such as text, audio and video. It comprises an SDK and tooling for composing and running analytic components written in Java and C++, with some support for Perl, Python and TCL.

371K lines of code

3 current contributors

5 months since last commit

19 users on Open Hub

Low Activity
5.0
 

Apache OpenNLP

Claimed by Apache Software Foundation Analyzed 1 day ago

Apache OpenNLP is a Java machine learning toolkit for natural language processing (NLP).

157K lines of code

8 current contributors

4 days since last commit

12 users on Open Hub

Moderate Activity
5.0
 

TreeTagger for Java

  Analyzed about 1 month ago

TreeTagger for Java is a Java wrapper around the popular TreeTagger package by Helmut Schmid. It was written with a focus on platform-independence and easy integration into applications. It is written in Java 5 and has been tested on OS X, Ubuntu Linux, and Windows.

2.67K lines of code

0 current contributors

almost 2 years since last commit

12 users on Open Hub

Activity Not Available
5.0
 

LanguageTool

  Analyzed 1 day ago

LanguageTool is an open source language checker for English, German, Polish, Dutch, and other languages. It is rule-based: it finds errors for which a rule is defined in its XML configuration files. Rules for more complicated errors can be written in Java.

1.26M lines of code

37 current contributors

1 day since last commit

11 users on Open Hub

Very High Activity
4.66667
   

CMU Sphinx

  Analyzed about 23 hours ago

CMUSphinx represents Carnegie Mellon University's development of open source, large-vocabulary, speaker-independent continuous speech recognition engines. The distribution contains a library (libsphinx5) and some small examples that link against it.

486K lines of code

9 current contributors

2 months since last commit

7 users on Open Hub

Low Activity
4.33333
   
Licenses: No declared licenses

DKPro Core

  Analyzed about 23 hours ago

DKPro Core is a collection of software components for natural language processing (NLP) based on the Apache UIMA framework. Many powerful and state-of-the-art NLP components are already freely available in the NLP research community, and new and improved components are being developed and released continuously. The components cover the whole range of NLP-related processing tasks. DKPro Core provides wrappers for such third-party tools as well as original NLP components. DKPro Core builds heavily on uimaFIT, which allows for rapid and easy development of NLP processing pipelines.

158K lines of code

8 current contributors

6 months since last commit

6 users on Open Hub

Very Low Activity
4.75
   

Treex - NLP Framework

  Analyzed 1 day ago

Treex (formerly TectoMT) is a highly modular NLP software system implemented in the Perl programming language under Linux. It is primarily aimed at machine translation, making use of the ideas and technology created during the Prague Dependency Treebank project. It is also intended to significantly facilitate and accelerate the development of software solutions for many other NLP tasks, especially through the reusability of its numerous integrated processing modules (called blocks), which are equipped with uniform object-oriented interfaces.

242K lines of code

4 current contributors

about 1 month since last commit

4 users on Open Hub

Moderate Activity
5.0
 

MeCab

  Analyzed 3 days ago

MeCab is a fast and customizable Japanese morphological analyzer. It is designed for general-purpose use and has been applied to a variety of NLP tasks, such as Kana-Kanji conversion. MeCab provides parameter estimation functionality based on CRFs and HMMs.

291K lines of code

0 current contributors

11 months since last commit

3 users on Open Hub

Very Low Activity
0.0
 
Licenses: No declared licenses

matxin

  Analyzed about 22 hours ago

Machine translation engine based on a dependency grammar and an XML interchange format. The Spanish-Basque (es-eu) translation direction is currently supported.

3.41M lines of code

0 current contributors

almost 7 years since last commit

3 users on Open Hub

Inactive
5.0
 
Licenses: No declared licenses