Search results for "Lingua::DE::Sentence"

Lingua::DE::Sentence - Perl extension for tokenizing German texts into their sentences. River stage zero No dependents

The "Lingua::DE::Sentence" module contains the function get_sentences, which splits text into its constituent sentences. The result can be either the list of sentences in the text or the list of sentences plus a list of their absolute positions i...

HOLSTEN/Lingua-DE-Sentence-0.07 - 25 Apr 2003 07:46:43 UTC - Search in distribution

FL3 - A shortcut module for Lingua::FreeLing3. River stage one • 1 direct dependent • 1 total dependent

Implements a set of utility functions to access "Lingua::FreeLing3" objects. Every time one of the accessors is used with just the language code/language data file (or using the default language), the cached processor is returned if it exists. If any ...

AMBS/Lingua-FreeLing3-0.09 - 12 Jan 2014 16:21:27 UTC - Search in distribution

Text::Capitalize - capitalize strings ("to WORK AS titles" becomes "To Work as Titles") River stage one • 2 direct dependents • 3 total dependents

Text::Capitalize provides some routines for title-like formatting of strings. The simple capitalize function just makes the initial character of each word uppercase, and forces the rest to lowercase. The capitalize_title function applies English title...

DOOM/Text-Capitalize-1.5 - 27 Sep 2019 02:25:45 UTC - Search in distribution

Text::Shingle - Pure Perl implementation of shingles for pieces of text River stage zero No dependents

The module provides a way to extract shingles from a piece of text. Shingles can then be used for other operations such as clustering, deduplication, etc. Given a document, the w-shingles represent a set of sorted groups of *w* adjacent words in the ...

NIDS/Text-Shingle-0.07 - 08 Dec 2020 14:22:06 UTC - Search in distribution

Lingua::DE::ASCII - Perl extension to convert German umlauts to and from ASCII River stage one • 1 direct dependent • 1 total dependent

This module converts German texts to and from ASCII format. It has two methods, "to_ascii" and "to_latin1", which do exactly what they say. Please note that both methods take only one scalar as argument, not a whole list. to_as...

BIGJ/Lingua-DE-ASCII-0.14 - 02 May 2020 07:38:52 UTC - Search in distribution

Text::NGrammer - Pure Perl extraction of n-grams and skip-grams River stage one • 1 direct dependent • 1 total dependent

The module provides a way to extract both n-grams and skip-grams from a text, a sentence or from an array of tokens. An n-gram is defined as an ordered sequence of tokens in a piece of text. Some frequent n-grams, such as 2-grams, are also called bigra...

NIDS/Text-NGrammer-0.06 - 07 Dec 2020 17:48:01 UTC - Search in distribution

Lingua::tlhInganHol::yIghun - "The Klingon Language: hey you, program in it!" River stage zero No dependents

The Lingua::tlhInganHol::yIghun module allows you to write Perl in the original Klingon. Introduction: The Klingon language was first explained to Terrans in 1984 by Earth-born linguist Dr Marc Okrand. Those who dare can learn more about it at the Kli...

MSCHWERN/Lingua-tlhInganHol-yIghun-20090601 - 01 Jun 2009 19:55:11 UTC - Search in distribution

Text::GaleChurch - Perl extension for aligning translated sentences River stage zero No dependents

This module aligns the sentences of paragraphs in two languages in a way that the aligned sentences are likely translations of each other. This is useful for applications in machine translation and other applications where sentence-aligned parallel c...

ACHIMRU/Text-GaleChurch-1.00 - 13 Mar 2010 15:57:12 UTC - Search in distribution

Lingua::Sentence - Perl extension for breaking text paragraphs into sentences River stage one • 5 direct dependents • 5 total dependents

This module allows splitting of text paragraphs into sentences. It is based on scripts developed by Philipp Koehn and Josh Schroeder for processing the Europarl corpus (<http://www.statmt.org/europarl/>). The module uses punctuation and capitalizatio...

CAPOEIRAB/Lingua-Sentence-1.100 - 26 Feb 2017 23:06:04 UTC - Search in distribution

Lingua::EO::Orthography - An orthography/substitute converter for Esperanto characters River stage zero No dependents

Six letters of the Esperanto alphabet do not exist in ASCII. These letters, which carry supersigns (eo: supersignoj), are often spelled in substitute notations (eo: surogataj skribosistemoj) for historical reasons, namely, for the ages of typography and typew...

MORIYA/Lingua-EO-Orthography-0.04 - 24 Dec 2013 17:11:43 UTC - Search in distribution

Lingua::Translate::Babelfish - Translation back-end for Altavista's Babelfish, version 0.01 River stage one • 7 direct dependents • 8 total dependents

Lingua::Translate::Babelfish is a translation back-end for Lingua::Translate that contacts babelfish.altavista.com to do the real work. It is normally invoked by Lingua::Translate; there should be no need to call it directly. If you do call it directl...

SAMV/Lingua-Translate-0.09 - 23 May 2008 09:02:47 UTC - Search in distribution

Uplug::PreProcess::SentDetect - Moses/Europarl sentence boundary detector River stage two • 10 direct dependents • 10 total dependents

This module is basically a copy of Lingua::Sentence by Achim Ruopp, adapted to Uplug; Lingua::Sentence is based on tools developed for Moses and the Europarl corpus. All credits go to the original authors. This version includes some additional non-breaking prefix...

TIEDEMANN/uplug-main-0.3.8 - 16 Mar 2013 20:19:32 UTC - Search in distribution

POE::Component::Lingua::Translate - A non-blocking wrapper around Lingua::Translate River stage one • 1 direct dependent • 1 total dependent

POE::Component::Lingua::Translate is a POE component that provides a non-blocking wrapper around Lingua::Translate. It accepts "translate" events and emits "translated" events back....

HINRIK/POE-Component-Lingua-Translate-0.06 - 01 Sep 2010 16:52:30 UTC - Search in distribution
13 results (0.039 seconds)