Tags : Browse Projects

DKPro Core

  Analyzed about 4 hours ago

DKPro Core is a collection of software components for natural language processing (NLP) based on the Apache UIMA framework. Many powerful and state-of-the-art NLP components are already freely available in the NLP research community. New and improved components are being developed and released continuously. The components cover the whole range of NLP-related processing tasks. DKPro Core provides wrappers for such third-party tools as well as original NLP components. DKPro Core builds heavily on uimaFIT, which allows for rapid and easy development of NLP processing pipelines.

166K lines of code

8 current contributors

4 months since last commit

6 users on Open Hub

Very Low Activity
4.75
   

Lexical Analyzer Generator Quex

  Analyzed about 2 hours ago

Generator of extremely fast lexical analysers. Sophisticated input/buffer management. Many character encodings (incl. ASCII, UTF8, UTF16, RUSCII, ...) are directly supported. Regular expressions are specified in the lex/flex style. Features:
* Support for Unicode and many other character encodings.
* Modes with inheritance relationships and transition rules.
* Sophisticated buffer management.
* Include stacks.
* Customized token classes.
* Template compression for code size reduction.
* Path compression for code size reduction.
* Possibility of indentation based lexical analysis (INDENT, DEDENT, NODENT).
* Produces direct coded lexical analyzers.
* Adjustable implicit line and column number counting.

76.2K lines of code

2 current contributors

almost 2 years since last commit

2 users on Open Hub

Very Low Activity
5.0
 

CORSIS

  Analyzed over 2 years ago

CORSIS (formerly Tenka Text) is a performance‐oriented, open‐source library for corpus analysis. It utilizes typed assembly, task‐specific compilers and parallelization to deliver the best performance with elegant design. The project's demonstration GUI comes with Wordlister, an advanced, extremely fast graphical wordlist tool, and a regex concordance tool. CORSIS is the open-source answer to WordSmith Tools.

0 lines of code

0 current contributors

0 since last commit

1 user on Open Hub

Activity Not Available
0.0
 
Mostly written in language not available
Licenses: gpl3_or_l...

SourceEditor

  Analyzed 2 days ago

Library to tokenize, edit and override PHP classes

1.01K lines of code

0 current contributors

about 8 years since last commit

1 user on Open Hub

Inactive
0.0
 
Licenses: No declared licenses
Tags: tokenizer

pyTokenizer

  Analyzed 20 days ago

A streaming tokenizer.

362 lines of code

1 current contributor

almost 2 years since last commit

1 user on Open Hub

Very Low Activity
0.0
 

trie keys - values

  Analyzed about 4 hours ago

Creates a compressed trie that maps keys to values and values to keys. Compression is applied to the front end of keys. Useful for creating lightweight reserved-word tables where memory or processor power is constrained. Written in C.

1.18K lines of code

0 current contributors

over 12 years since last commit

0 users on Open Hub

Inactive
0.0
 
Licenses: No declared licenses
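
The two-way key/value lookup described for "trie keys - values" can be illustrated with a small sketch. This is not the project's C code: the class name `BiTrie` and its methods are hypothetical, the example is in Python for brevity, and the reverse direction uses a plain dictionary, which the project may well handle differently. It only shows how keys that share a front part share trie nodes.

```python
_END = ""  # sentinel child key marking "a key ends here"; never a real character


class BiTrie:
    """Hypothetical bidirectional trie: key -> value and value -> key."""

    def __init__(self):
        self._root = {}          # nested dicts, one level per character
        self._key_by_value = {}  # reverse index for value -> key lookups

    def insert(self, key, value):
        node = self._root
        for ch in key:
            # Keys with a common front part reuse the same nodes here.
            node = node.setdefault(ch, {})
        if _END in node:
            # Key is being overwritten: drop its stale reverse entry.
            self._key_by_value.pop(node[_END], None)
        node[_END] = value
        self._key_by_value[value] = key

    def value_of(self, key):
        """Return the value stored for key, or None if key is absent."""
        node = self._root
        for ch in key:
            node = node.get(ch)
            if node is None:
                return None
        return node.get(_END)

    def key_of(self, value):
        """Return the key stored for value, or None if value is absent."""
        return self._key_by_value.get(value)


# Assumed usage: a tiny reserved-word table numbered in insertion order.
reserved = BiTrie()
for number, word in enumerate(["while", "with", "when"]):
    reserved.insert(word, number)
```

Here `reserved.value_of("with")` and `reserved.key_of(2)` look the table up in either direction, and a prefix such as `"wh"` that is not itself a stored key yields `None`.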

tokkens-ruby

  Analyzed 3 days ago

Basic text-to-numbers tokenizer for machine learning. Tokkens makes it easy to apply a vector space model to text documents, targeted towards use with machine learning. It provides a mapping between numbers and tokens (strings).

474 lines of code

0 current contributors

over 5 years since last commit

0 users on Open Hub

Inactive
0.0
 
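
The mapping between numbers and tokens described for tokkens-ruby can be sketched as follows. This is not the library's API: `TokenIndex`, `tokenize` and `vectorize` are hypothetical names, the whitespace tokenizer is an assumption, and the example is in Python rather than Ruby. It shows one simple way a token/number mapping supports a bag-of-words vector space model.

```python
class TokenIndex:
    """Hypothetical two-way mapping between tokens (strings) and integer ids."""

    def __init__(self):
        self._id_by_token = {}
        self._token_by_id = []

    def get(self, token):
        """Return the id for token, assigning the next free id if it is new."""
        if token not in self._id_by_token:
            self._id_by_token[token] = len(self._token_by_id)
            self._token_by_id.append(token)
        return self._id_by_token[token]

    def find(self, number):
        """Return the token that was assigned the given id."""
        return self._token_by_id[number]


def tokenize(text):
    """Naive lowercase whitespace tokenizer (an illustrative assumption)."""
    return text.lower().split()


def vectorize(text, index):
    """Map a document to a sparse vector of the form {token id: count}."""
    counts = {}
    for token in tokenize(text):
        token_id = index.get(token)
        counts[token_id] = counts.get(token_id, 0) + 1
    return counts


index = TokenIndex()
vector = vectorize("the cat saw the dog", index)
```

Ids are handed out in order of first appearance, so the sparse vector can be fed to a learner and later translated back to readable tokens with `index.find`.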