
Unicode::Properties - find out what properties a character has

use Unicode::Properties 'uniprops';
my @prop_list = uniprops ('☺'); # Unicode smiley face
print "@prop_list\n";
prints
InMiscellaneousSymbols Any Assigned Common
You can then use, for example, \p{InMiscellaneousSymbols} to match this character in a regular expression.
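As a minimal sketch of that last point, the following uses only Perl's built-in regular expression engine and does not need this module at all. The variable names are made up for the illustration.

```perl
use strict;
use warnings;

# U+263A WHITE SMILING FACE, written as an escape so the source stays ASCII.
my $smiley = "\x{263A}";
my $letter = "A";

# \p{InMiscellaneousSymbols} matches any character in the
# Miscellaneous Symbols block (U+2600 to U+26FF).
my $smiley_matches = $smiley =~ /\p{InMiscellaneousSymbols}/ ? 1 : 0;
my $letter_matches = $letter =~ /\p{InMiscellaneousSymbols}/ ? 1 : 0;

print "smiley: $smiley_matches, letter: $letter_matches\n";
```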

my @prop_list = uniprops ($character);
Given a character, returns a list of the properties which that character has.
my @matching = matchchars ($property);
Returns a list of all the characters which match a particular property. If $property is not found in the list of possible Unicode properties, matchchars treats it as a regular expression and returns the characters which match that expression.
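The behaviour described above can be approximated in plain Perl. This is only a sketch under stated assumptions, not the module's actual code: the function name chars_matching, the explicit code point range, and the four-entry %known_property table are all hypothetical stand-ins for the real property list.

```perl
use strict;
use warnings;

# Hypothetical, heavily shortened stand-in for the module's property list.
my %known_property = map { $_ => 1 }
    qw(InMiscellaneousSymbols Any Assigned Common);

# Return the characters between code points $first and $last which match
# $property. A known name is wrapped in \p{...}; anything else is used
# as a regular expression as it stands.
sub chars_matching {
    my ($property, $first, $last) = @_;
    my $pattern = $known_property{$property}
        ? '\p{' . $property . '}'
        : $property;
    my $re = qr/$pattern/;
    my @matching;
    for my $cp ($first .. $last) {
        # Skip the surrogate range, which does not hold real characters.
        next if $cp >= 0xD800 && $cp <= 0xDFFF;
        my $char = chr $cp;
        push @matching, $char if $char =~ $re;
    }
    return @matching;
}

my @symbols = chars_matching ('InMiscellaneousSymbols', 0x2500, 0x27FF);
my @vowels = chars_matching ('[aeiou]', 0x61, 0x7A);
printf "%d symbols, first is U+%04X\n", scalar @symbols, ord $symbols[0];
print "vowels: @vowels\n";
```

Scanning a caller-supplied range keeps the example fast; covering every character would mean looping up to 0x10FFFF.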

This module uses a list taken from the "perlunicode" documentation. It would be better to use Perl's internals to get the list, but I don't know how to do that.
The results depend on your Perl and Unicode versions. For example, "Balinese" was added in Unicode version 5.0.0. An unpatched Perl 5.8.8 uses Unicode version 4.1.0, so with that Perl "Balinese" will not appear in the results list.
Also, I don't know the behaviour of Unicode versions other than 4.1.0 and 5.0.0, so this module only covers those two. I couldn't get Perl 5.8.5 to install on my computer, so I've set the minimum version to 5.8.8 for this module.
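To see which Unicode version a given Perl uses, the core module Unicode::UCD provides UnicodeVersion. The sketch below is independent of this module; the helper name unicode_at_least is made up for the illustration.

```perl
use strict;
use warnings;
use Unicode::UCD ();

# True if Perl's Unicode version is at least $want_major.$want_minor.
sub unicode_at_least {
    my ($want_major, $want_minor) = @_;
    # UnicodeVersion returns a string such as "5.0.0".
    my ($major, $minor) = split /\./, Unicode::UCD::UnicodeVersion ();
    if ($major > $want_major
        || ($major == $want_major && $minor >= $want_minor)) {
        return 1;
    }
    return 0;
}

my $version = Unicode::UCD::UnicodeVersion ();
# Balinese was added in Unicode 5.0.0.
my $expect_balinese = unicode_at_least (5, 0);
print "Unicode $version; Balinese expected: ",
    ($expect_balinese ? "yes" : "no"), "\n";
```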

This script was written because its author (Tom Christiansen) was dissatisfied with Unicode::Properties. Unfortunately, it uses the same method as this module: parsing the Perl documentation to get the information. It only works with Perl versions 5.12 and 5.14.

Copyright © 2011 Ben Bullock, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

Ben Bullock, <bkb@cpan.org>