Tags : Browse Projects

Text Encoding Initiative

  Analyzed 1 day ago

The TEI is an international and interdisciplinary community-based open standard used by research projects, libraries, museums, publishers, and academics to represent all kinds of literary and linguistic texts, using an encoding scheme that is maximally expressive and minimally obsolescent.

553K lines of code

14 current contributors

12 days since last commit

3 users on Open Hub

Moderate Activity
5.0
 
RelEx Semantic Relationship Extractor

  Analyzed about 6 hours ago

RelEx is an English-language semantic relationship extractor, built on the Carnegie-Mellon Link Grammar parser. It can identify dependency-grammar dependencies, such as subject, object, indirect object and many other relationships between words in a sentence. It can also provide part-of-speech tagging, noun-number tagging, verb tense tagging, gender tagging, and so on. RelEx includes a basic implementation of the Hobbs anaphora (pronoun) resolution algorithm. RelEx also provides semantic relationship framing, similar to that of FrameNet.

11.8K lines of code

4 current contributors

4 months since last commit

2 users on Open Hub

Very Low Activity
0.0
 
LexAt Lexical/Corpus Statistics

  No analysis available

The LexAt "lexical attraction" package, also known as the RelEx Statistical Linguistics package, adds statistical algorithms to RelEx. Corpus statistics, including mutual information, are maintained in an SQL database and drawn on to enhance various RelEx functions, such as parse ranking, chunk ranking, and word-sense disambiguation (Mihalcea algorithm).
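
As a rough illustration of the kind of corpus statistic mentioned above, the following sketch computes pointwise mutual information for adjacent word pairs. It is only an assumption-laden example: the function name `pmi_table`, the whitespace tokenization, and the in-memory counters are hypothetical and are not taken from LexAt, which the description says stores its statistics in an SQL database.

```python
import math
from collections import Counter

def pmi_table(sentences):
    """Pointwise mutual information (in bits) for adjacent word pairs.

    Hypothetical helper for illustration; not LexAt's actual API or schema.
    """
    words, pairs = Counter(), Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()  # naive whitespace tokenization
        words.update(tokens)
        pairs.update(zip(tokens, tokens[1:]))  # adjacent pairs only
    n_words, n_pairs = sum(words.values()), sum(pairs.values())
    return {
        (x, y): math.log2(
            (count / n_pairs) / ((words[x] / n_words) * (words[y] / n_words))
        )
        for (x, y), count in pairs.items()
    }
```

A pair that co-occurs more often than its individual word frequencies would predict gets a higher score, which is the sort of signal that could feed a ranking step.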

0 lines of code

0 current contributors

0 since last commit

1 user on Open Hub

Activity Not Available
0.0
 
Mostly written in language not available
Licenses: apache_2

opencorpora

  Analyzed about 13 hours ago

An engine for creating and annotating textual corpora

38.6K lines of code

3 current contributors

8 months since last commit

1 user on Open Hub

Very Low Activity
0.0
 
porter-stem.vim

  Analyzed about 17 hours ago

Implementation of Porter stemming algorithm in vim script. See https://www.ohloh.net/p/stem-search-vim for a script that makes use of this.

205 lines of code

0 current contributors

over 7 years since last commit

0 users on Open Hub

Inactive
0.0
 
stem-search.vim

  Analyzed about 7 hours ago

StmSrch is a reverse-stem searching script. It implements the Porter stemming algorithm, by Martin Porter, and also handles irregular verbs and noun pluralizations. This script can be useful for searching or scanning through corpus files. Each word given to the :StmSrch command is stemmed and then rewritten as a pattern that matches its possible conjugations or pluralizations. If no word is given, the script attempts to stem the word under the cursor. Matching uses word boundaries, so arbitrary substrings do not match. For example, ":StmSrch searcher" will match any of: search, searching, searches, searchers, searched, ... A string of words works as well, matching in order: ":StmSrch thieves are running from bunnies" will match strings of word
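
The reverse-stem idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the script's implementation: `crude_stem` strips a single suffix from a hypothetical `SUFFIXES` list instead of running the real multi-step Porter algorithm, and irregular verbs and plurals are not handled.

```python
import re

# Hypothetical, simplified suffix list; the real Porter algorithm is
# rule-based and multi-step, and this sketch does not reproduce it.
SUFFIXES = ("ers", "ing", "er", "es", "ed", "s")

def crude_stem(word):
    """Strip one common suffix; a stand-in for a real Porter stemmer."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def stem_pattern(word):
    """Regex matching the stem plus an optional suffix, on word boundaries."""
    stem = re.escape(crude_stem(word))
    endings = "|".join(SUFFIXES)
    return rf"\b{stem}(?:{endings})?\b"

def stem_search(query, text):
    """Return matches where the query words appear in order, in any inflection."""
    pattern = r"\s+".join(stem_pattern(w) for w in query.split())
    return [m.group(0) for m in re.finditer(pattern, text, flags=re.IGNORECASE)]
```

Because each pattern is anchored with `\b`, a query for "searcher" would match "searched" or "searching" but not the "search" inside "research".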

308 lines of code

0 current contributors

almost 14 years since last commit

0 users on Open Hub

Inactive
0.0
 
He Kupu Tawhito

  Analyzed about 7 hours ago

979 lines of code

1 current contributor

almost 5 years since last commit

0 users on Open Hub

Inactive
5.0
 
Zeitcrawler

  Analyzed about 19 hours ago

A specialized crawler for the German newspaper 'Die Zeit'. Starting from the front page or from a given list of links, the crawler retrieves newspaper articles and gathers new links to explore as it goes, stripping the text of each article out of the HTML formatting and saving it into a raw text file. The project includes scripts to convert the text into an XML format for further use with natural language processing tools.
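
The crawl loop described above (start from seed links, fetch each page, strip the HTML, queue newly found links) can be sketched roughly as follows. This is an illustrative assumption, not Zeitcrawler's code: `PageParser`, `crawl`, and the injected `fetch` callable are hypothetical names, and the sketch keeps text in memory rather than writing files.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

class PageParser(HTMLParser):
    """Collects visible text and href targets from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []
        self._skip = 0  # depth inside <script>/<style> elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.text_parts.append(data.strip())

def crawl(seeds, fetch, max_pages=10):
    """Breadth-first crawl; `fetch(url)` returns HTML text or None."""
    queue, seen, texts = deque(seeds), set(seeds), {}
    while queue and len(texts) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        texts[url] = " ".join(parser.text_parts)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return texts
```

Passing `fetch` in as a parameter keeps the sketch testable offline; a real crawler would plug in an HTTP client and would also need politeness delays and site-specific article extraction.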

1.64K lines of code

0 current contributors

about 10 years since last commit

0 users on Open Hub

Inactive
0.0
 
Équipe Crawler

  Analyzed about 7 hours ago

A specialized crawler for the French sport newspaper L'Équipe. Starting from the front page or from a given list of links, the crawler retrieves newspaper articles and gathers new links to explore as it goes, stripping the text of each article out of the HTML formatting and saving it into a raw text file. The project includes scripts to convert the text into an XML format for further use with natural language processing tools.

401 lines of code

0 current contributors

over 11 years since last commit

0 users on Open Hub

Inactive
0.0
 
German Political Speeches Corpus-Builder

  Analyzed about 7 hours ago

Tools for crawling official German speech repositories in order to gather a corpus. More information to come. A complete version of the corpus, including a visualization tool, is available here: http://purl.org/corpus/german-speeches

1.08K lines of code

0 current contributors

over 10 years since last commit

0 users on Open Hub

Inactive
0.0
 