Search results for "Lingua::EN::StopWords"
Lingua::EN::StopWords - Typical stop words for an English corpus
See synopsis....
SPLICE/Lingua-EN-Segmenter-0.1 - 03 Mar 2005 03:20:54 UTC - Search in distribution
- Lingua::EN::Splitter - Split text into words, paragraphs, segments, and tiles
lib/Lingua/StopWords/EN.pm
WOLLMERS/Lingua-StopWords-0.12 - 18 Apr 2021 08:32:07 UTC - Search in distribution
- Lingua::StopWords - Stop words for several languages.
Pod::Spell - a formatter for spellchecking Pod
Pod::Spell is a Pod formatter whose output is good for spellchecking. Pod::Spell is rather like Pod::Text, except that it doesn't put much effort into actual formatting, and it suppresses things that look like Perl symbols or Perl jargon (so that you...
HAARG/Pod-Spell-1.26 - 13 Mar 2023 20:22:56 UTC - Search in distribution
Text::Compare - Language sensitive text comparison
Text::Compare is an attempt to write a high-speed text compare tool based on vector comparison which uses language-dependent stopwords. Text::Compare uses Lingua::Identify to find the language of the given texts, then uses Lingua::StopWords to get t...
STRO/Text-Compare-1.03 - 23 Jun 2007 05:44:31 UTC - Search in distribution
Task::BeLike::PERLANCAR::Used - All my modules which I currently use and install on a new perl installation
PERLANCAR/Task-BeLike-PERLANCAR-Used-20231201.1 - 01 Dec 2023 09:34:30 UTC - Search in distribution
Lingua::EN::Bigram - Extract n-grams from a text and list them according to frequency and/or T-Score
This module is designed to: 1) pull out all of the ngrams (multi-word phrases) in a given text, and 2) list these phrases according to their frequency. Using this module it is possible to create lists of the most common phrases in a text as well as o...
EMORGAN/Lingua-EN-Bigram-0.03 - 24 Aug 2010 02:01:46 UTC - Search in distribution
Search::Tokenizer - Decompose a string into tokens (words)
This module builds an iterator function that will progressively extract terms from a given input string. Terms are defined by a regular expression (for example "\w+"). Extraction of terms relies on the builtin "global match" operator of Perl (the 'g'...
DAMI/Search-Tokenizer-1.03 - 18 May 2021 06:52:09 UTC - Search in distribution
Text::TFIDF::Ngram - Compute the TF-IDF measure for ngram phrases
This module computes the TF-IDF ("term frequency - inverse document frequency") measure for a corpus of text documents. This module will only work when given more than one document. Because the idf method is computed based on all documents, a single ...
GENE/Text-TFIDF-Ngram-0.0509 - 27 Oct 2022 00:50:15 UTC - Search in distribution
Lingua::EN::Ngram - Extract n-grams from texts and list them according to frequency and/or T-Score
This module is designed to extract n-grams from texts and list them according to frequency and/or T-Score. To elaborate, the purpose of Lingua::EN::Ngram is to: 1) pull out all of the ngrams (multi-word phrases) in a given text, and 2) list these phr...
EMORGAN/Lingua-EN-Ngram-0.03 - 29 Mar 2018 03:28:09 UTC - Search in distribution
NNexus::StopWordList - A stop word list for mathematical texts
This class provides an example stopword list for the specific domain of mathematical texts. It builds on the excellent list from Lingua::EN::StopWordList with a number of modifications particular to mathematical discourse. The modifications have been...
DGINEV/NNexus-2.0.3 - 13 Apr 2015 23:17:27 UTC - Search in distribution
Text::Language::Guess - Trained module to guess a document's language
Text::Language::Guess guesses a document's language. Its implementation is simple: Using "Text::ExtractWords" and "Lingua::StopWords" from CPAN, it determines how many of the known stopwords the document contains for each language supported by "Lingu...
MSCHILLI/Text-Language-Guess-0.02 - 20 Nov 2005 04:08:56 UTC - Search in distribution
Lingua::ZH::Keywords - Extract keywords from Chinese text
This is a very simple algorithm which removes stopwords from the text, and then counts up what it considers to be the most important keywords. The "keywords" subroutine returns a list of keywords in order of relevance. The stopwords list is accessibl...
AUTRIJUS/Lingua-ZH-Keywords-0.04 - 20 Jan 2003 22:42:35 UTC - Search in distribution
Lingua::EN::Keywords - Automatically extracts keywords from text
This is a very simple algorithm which removes stopwords from a summarized version of a text (generated with Lingua::EN::Summarize) and then counts up what it considers to be the most important "keywords". The "keywords" subroutine returns a list of f...
SIMON/Lingua-EN-Keywords-2.0 - 28 Apr 2003 10:23:29 UTC - Search in distribution
WordList::EN::StopWords - English stop words
This wordlist contains English stopwords from Lingua::EN::StopWordList. You can also retrieve the list directly from that module....
PERLANCAR/WordList-EN-StopWords-0.001 - 25 Jul 2021 00:05:33 UTC - Search in distribution
Lingua::EN::StopWordList - A sorted list of English stop words
"Lingua::EN::StopWordList" is a pure Perl module. It returns a sorted arrayref of 659 English stop words....
RSAVAGE/Lingua-EN-StopWordList-1.02 - 16 Aug 2015 04:55:38 UTC - Search in distribution
Plucene::Plugin::Analyzer::SnowballAnalyzer - Stemmed analyzer with Lingua::Stem::Snowball and Lingua::StopWords
Filters StandardTokenizer with SnowballAnalyzer. Change $Plucene::Plugin::Analysis::SnowballAnalyzer::LANG to the language of your choice. (see Lingua::Stem::Snowball documentation for all available languages)....
FABPOT/Plucene-Plugin-Analyzer-SnowballAnalyzer-1.1 - 01 May 2004 09:12:49 UTC - Search in distribution
Acme::CPANModules::Import::RSAVAGE::StopWordLists - CPAN modules which offer stopword lists (2015)
CPAN modules which offer stopword lists (2015). This list is generated by extracting module names mentioned in the article [http://savage.net.au/Perl-modules/html/stopwordlists.report.html] (retrieved on 2016-02-21). For the full article, visit the U...
PERLANCAR/Acme-CPANModulesBundle-Import-RSAVAGE-0.001 - 22 Sep 2018 01:18:00 UTC - Search in distribution