brian d foy > Unicode-Tussle > word

NAME

word - display words starting or matching a string or pattern

SYNOPSIS

word [options] [string | pattern]

Given a string, show all words starting with that string (look mode). Given a pattern, show all lines matching that pattern (grep mode).

An argument containing non-alphabetic characters is always treated as a pattern. To force grep mode for an alphabetic argument, use --grep=pattern or start the pattern with a slash; the leading slash itself is ignored.

Use --man to get the full manpage.

DESCRIPTION

Search a large list of words in one of two modes. In look mode, only words starting with the given string are displayed. This mode runs very quickly. Only purely alphabetic strings are allowed. The system look(1) program is co-opted into helping.

In grep mode, all entries matching the pattern are shown. This takes much longer to run, because the entire 26-megabyte file must be searched. The pattern is not a grep(1) pattern, but rather a perl(1) pattern. You may use Unicode named characters, plus several custom aliases, in your pattern.
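
The choice between the two modes can be sketched as follows. This is a minimal Python illustration of the rules stated above, not the tool's own code: the function name choose_mode and the force_grep parameter are hypothetical, and "purely alphabetic" is approximated with str.isalpha(), which may differ from the test the real program applies.

```python
def choose_mode(arg: str, force_grep: bool = False) -> tuple[str, str]:
    """Return (mode, text) for one command-line argument.

    Assumption: "alphabetic" is approximated with str.isalpha(); the
    real tool may use a different (for example ASCII-only) test.
    """
    if force_grep:                 # models --grep=pattern
        return ("grep", arg)
    if arg.startswith("/"):        # leading slash forces grep mode and is dropped
        return ("grep", arg[1:])
    if arg.isalpha():              # purely alphabetic: fast prefix lookup
        return ("look", arg)
    return ("grep", arg)           # anything else is treated as a pattern


print(choose_mode("cat"))
print(choose_mode("/cat"))
print(choose_mode(r"\bcats?\b"))
```

Under these assumptions, "cat" selects look mode, while "/cat" and '\bcats?\b' both select grep mode.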

EXAMPLES

Look up terms starting with "cat":

    % word cat

The same, but bump verbose display level to see parts of speech:

    % word -v cat

Look at only verbs starting with "cat":

    % word -pv cat

Look at all "cat" entries, with verbose set high:

    % word -A cat

Look for all (irregular) plurals that start with "ex":

    % word -ppl ex

Look for obsolete prefixes that start with "s":

    % word -o -ppref s

Grep terms with "cat" anywhere at all:

    % word --grep cat
    % word /cat

Grep terms containing "cat" or "cats" surrounded by word boundaries:

    % word '\bcats?\b'
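
The difference between the unanchored pattern and the word-boundary pattern can be checked on a handful of terms. The sketch below uses Python's re module, which treats \b and ? the same way Perl does for these simple patterns; the sample terms are made up for illustration and are not taken from the word list.

```python
import re

# Hypothetical sample terms, not drawn from words.utf8.
terms = ["cat", "cats", "concatenate", "cat-o'-nine-tails", "scat"]

anywhere = re.compile(r"cat")        # corresponds to: word /cat
bounded = re.compile(r"\bcats?\b")   # corresponds to: word '\bcats?\b'

# Every sample term contains "cat" somewhere.
print([t for t in terms if anywhere.search(t)])

# Only terms where "cat" or "cats" stands alone between word boundaries.
print([t for t in terms if bounded.search(t)])
```

The second list keeps "cat", "cats", and "cat-o'-nine-tails" (the hyphen counts as a boundary) and drops "concatenate" and "scat".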

Grep terms with the Unicode "Mark" property:

    % word '\pM'

Grep all plurals ending in "-ata":

    % word -A -ppl 'ata\b'

Grep terms with the Unicode "Dash" property:

    % word '\p{Dash}'

Grep for an "e" with an acute accent:

    % word '\N{eacute}'

Grep for any acute accents no matter the letter:

    % word '\N{acute}'

Grep for terms containing an "a", "o", or "u" in any case, followed by a diaeresis:

    % word '(?i)[oau]\N{dier}'

OPTIONS

Display options are:

    --verbose / -v      use up to three times for more verbosity

        level 0 is just the word, like look
        level 1 includes parts of speech
        level 2 also includes assorted markings
        level 3 is the entire original entry 

    --nopager           never call the pager

Part of speech filtering options are:

    --pos /   -p POS    show only entries matching all of the given POS
    --nopos / -P POS    show no   entries matching any of the given POS

    POS is a comma-separated list of parts of speech like
    n/noun, v/verb, a/adjective, adv/adverb, pro/pronoun, 
    and pl/plural.
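
The "all" versus "any" wording matters when a POS list has more than one item. A minimal Python sketch of that logic follows; the function pos_filter and the representation of an entry's parts of speech as a set of tags are assumptions made for illustration, not the tool's internals.

```python
def pos_filter(entry_pos, require=(), exclude=()):
    """Decide whether an entry passes the part-of-speech filters.

    entry_pos: POS tags carried by the entry (hypothetical representation)
    require:   tags given to --pos; the entry must carry every one of them
    exclude:   tags given to --nopos; the entry must carry none of them
    """
    tags = set(entry_pos)
    return set(require) <= tags and tags.isdisjoint(exclude)


# An entry tagged as both noun and plural passes a "n,pl" requirement.
print(pos_filter({"n", "pl"}, require=["n", "pl"]))

# An entry tagged only as a noun does not.
print(pos_filter({"n"}, require=["n", "pl"]))

# One excluded tag is enough to drop an entry.
print(pos_filter({"n", "v"}, exclude=["v", "adv"]))
```

So a required list narrows the results with each tag added, while an excluded list drops more entries with each tag added.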

Type of entry filtering options are:

    --headwords      -h  show headwords only
    --everything     -a  include all types of entry
    --all-verbose    -A  all entries, plus sets verbose to 2

Some entries contain markings telling what kind of entry they are. Include or exclude such entries using:

    --normal         -n  normal entries (on by default)
    --foreign        -f  unassimilated entries (on by default)

    --obsolete       -o  obsolete entries (off by default)
    --catachrestic   -e  catachrestic entries (off by default)
    --illustrations  -i  illustrative examples (off by default)
    --crossref       -x  crossrefs w/old spellings (off by default)

The previous six entry types can be excluded using the corresponding --noXXX long option or the capitalized short option; e.g., --noforeign is equivalent to -F.
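
The defaults and the lowercase/uppercase convention can be summarized in a small table-driven sketch. This is Python written for illustration only: the dictionaries and the apply_short_flags function are hypothetical, and passing the letters as one string is a convenience of the sketch, not a statement about how the real option parser bundles flags.

```python
# Defaults as listed in the option table above; the layout is an
# illustrative assumption, not the tool's internal representation.
DEFAULTS = {
    "normal": True, "foreign": True,
    "obsolete": False, "catachrestic": False,
    "illustrations": False, "crossref": False,
}
SHORT = {"n": "normal", "f": "foreign", "o": "obsolete",
         "e": "catachrestic", "i": "illustrations", "x": "crossref"}


def apply_short_flags(flags: str) -> dict:
    """Apply single-letter entry-type flags to a copy of the defaults.

    A lowercase letter includes the entry type; the capitalized letter
    excludes it. Unknown letters raise KeyError in this sketch.
    """
    state = dict(DEFAULTS)
    for ch in flags:
        state[SHORT[ch.lower()]] = ch.islower()
    return state


# Include obsolete entries and exclude foreign ones (like -o and -F).
print(apply_short_flags("oF"))
```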

Other options:

    --version           print version info and exit
    --help              this help page
    --man               the full manpage
    --debug             internal debugging

    --fuzzy          -z  use agrep(1) fuzzy matching in "best mode"
    --all-fuzzy      -Z  like -zavv

PATTERN SHORTCUTS

Besides all normal Perl pattern syntax, an extensive set of named characters is provided for mnemonic convenience, so you don't have to write numeric code points like \x{3b2} for non-ASCII characters.

ERRORS

TO BE WRITTEN: ERRORS

ENVIRONMENT

PAGER

FILES

words.utf8

PROGRAMS

look, agrep

BUGS

TO BE WRITTEN: BUGS

SEE ALSO

perlre(1), perlunicode(1)

AUTHOR

TO BE WRITTEN: AUTHOR

COPYRIGHT AND LICENCE

TO BE WRITTEN: COPYRIGHT AND LICENCE
