Projects tagged ‘natural_language_processing’

ChaSen

No analysis available

ChaSen is a morphological analysis system. It segments a Japanese text string into morphemes and tags those morphemes with their parts of speech and pronunciations. It also tags conjugative morphemes with their base forms and conjugation types/forms.

0 lines of code

0 current contributors

0 since last commit

0 users on Open Hub

Activity Not Available

0 Reviews

Mostly written in language not available

Licenses: No declared licenses

Tags c grammar japanese kanji morphological_analysis natural_language_processing nlp

clic/LM_PRU

Analyzed 1 day ago

The LiverMemories/Portale della Ricerca Umanistica (LM/PRU) pipeline is a multilingual-aware web service for processing archaeology texts. Currently, the input consists of documents in Portable Document Format (PDF) with machine-encoded text, for example, encoded via optical character recognition (OCR) or ...

9.12K lines of code

0 current contributors

almost 12 years since last commit

0 users on Open Hub

Inactive

0 Reviews

Mostly written in Python

Licenses: No declared licenses

Tags english hlt italian langid namedentityrecognition natural-language-processing natural_language_processing nlp

Zeitcrawler

Analyzed about 13 hours ago

A specialized crawler for the German newspaper 'Die Zeit'. Starting from the front page or from a given list of links, the crawler retrieves newspaper articles and gathers new links to explore as it goes, stripping the text of each article out of the HTML formatting and saving it into a raw text ...

1.64K lines of code

0 current contributors

about 10 years since last commit

0 users on Open Hub

Inactive

0 Reviews

Mostly written in Perl

Licenses: gpl3

Tags academic computational_linguistics corpus corpus_linguistics crawler digital_humanities natural_language_processing nlp perl unix webcrawler xml
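
The crawl loop described for Zeitcrawler (start from seed links, fetch each page, collect new links, strip the HTML down to text) can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the project's actual Perl code. The names `crawl`, `_PageParser`, `fetch`, and the in-memory `SITE` dictionary are all assumptions made for the example; a real crawler would replace `fetch` with an HTTP request and restrict links to the newspaper's article URLs.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class _PageParser(HTMLParser):
    """Collects visible text and href targets from one HTML page."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def crawl(seeds, fetch, max_pages=100):
    """Breadth-first crawl returning {url: plain_text} for each fetched page.

    `fetch` is a hypothetical callable mapping a URL to its HTML, or None
    when the page is unavailable.
    """
    queue = deque(seeds)
    seen = set(seeds)
    texts = {}
    while queue and len(texts) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = _PageParser()
        parser.feed(html)
        texts[url] = " ".join(parser.text_parts)
        # Queue links not seen before, resolved against the current URL.
        for href in parser.links:
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return texts


# Usage with an in-memory "site" standing in for real HTTP requests.
SITE = {
    "http://example.org/": '<a href="/a1">one</a><p>Front page</p>',
    "http://example.org/a1": "<h1>Title</h1><p>Body text</p><script>x=1</script>",
}
pages = crawl(["http://example.org/"], SITE.get)
```

The `seen` set keeps the crawler from revisiting a link, and `max_pages` bounds the run; both are design choices of this sketch rather than documented features of the listed project.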

Équipe Crawler

Analyzed about 2 hours ago

A specialized crawler for the French sport newspaper L'Équipe. Starting from the front page or from a given list of links, the crawler retrieves newspaper articles and gathers new links to explore as it goes, stripping the text of each article out of the HTML formatting and saving it into a raw ...

401 lines of code

0 current contributors

over 11 years since last commit

0 users on Open Hub

Inactive

0 Reviews

Mostly written in Perl

Licenses: gpl3

Tags academic computational_linguistics corpus corpus_linguistics crawler digital_humanities natural_language_processing nlp perl unix webcrawler xml

German Political Speeches Corpus-Builder

Analyzed about 1 hour ago

Tools to crawl repositories of official German speeches in order to gather a corpus. More information to come. A complete version of the corpus, including a visualization tool, is available here: http://purl.org/corpus/german-speeches

1.08K lines of code

0 current contributors

over 10 years since last commit

0 users on Open Hub

Inactive

0 Reviews

Mostly written in Perl

Licenses: gpl3

Tags academic computational_linguistics corpus corpus_linguistics crawler digital_humanities natural_language_processing nlp perl unix webcrawler xml

Microblog Explorer

Analyzed about 22 hours ago

Perform crawls of social networks (identi.ca, reddit) to gather internal and external links and identify their language.

1.04K lines of code

0 current contributors

almost 11 years since last commit

0 users on Open Hub

Inactive

0 Reviews

Mostly written in Python

Licenses: gpl3_or_l...

Tags academic computational_linguistics crawler identica language_identification natural_language_processing nlp perl python reddit seedminer unix 1 more...

purepos

Analyzed about 7 hours ago

PurePos morphological disambiguator.

6.84K lines of code

1 current contributor

about 4 years since last commit

0 users on Open Hub

Inactive

0 Reviews

Mostly written in Java

Licenses: lgpv3_or_...

Tags analyzer command_line computational_linguistics corpus hmm java language languages morphological_analysis natural natural_language natural_language_processing 8 more...

TextBlob

Analyzed about 12 hours ago

Simple, Pythonic, text processing--Sentiment analysis, POS tagging, noun phrase extraction, translation, and more.

8K lines of code

6 current contributors

almost 4 years since last commit

0 users on Open Hub

Inactive

0 Reviews

Mostly written in Python

Licenses: mit

Tags computational_linguistics lemmatization natural_language_processing nlp nltk noun_phrase_extraction part_of_speech python sentiment_analysis spelling_correction tokenization translation 2 more...

php-nlp-tools

Analyzed about 21 hours ago

NlpTools is a library for natural language processing written in PHP. Its development is driven by the author's needs for text classification, clustering, tokenizing, stemming, etc.

5.57K lines of code

0 current contributors

over 6 years since last commit

0 users on Open Hub

Inactive

0 Reviews

Mostly written in PHP

Licenses: No declared licenses

Tags ai artificial_intelligence artificialintelligence bayesian classification_system computational_linguistics corpus_linguistics information_retrieval informationretrieval languagedetection machine_learning machinelearning 7 more...

Semantic Czech

Analyzed about 10 hours ago

The task of the project is the semantic annotation of texts using NLP tools. Czsem Mining Suite is mainly a GATE plugin that allows Treex and TectoMT tools to be used inside GATE. Besides that, it is also an information extraction tool based on dependency linguistics. It is capable of learning tree queries ...

165K lines of code

0 current contributors

over 8 years since last commit

0 users on Open Hub

Inactive

0 Reviews

Mostly written in Java

Licenses: No declared licenses

Tags classifier czech dependency_grammar fuzzylogic ie ilp inductivelogicprogramming information_extraction java natural_language_processing nlp ontologies 8 more...
