03.01.2011 - 08:30
Consistency of the definition of a word
In my Tcl application, on the one hand, I have defined tcl_wordchars to be:
set tcl_wordchars {[\w%#!?$]}
because my "words" can consist of any characters in the above class, and
I need a double click to select entire "words" in a text widget.
On the other hand, I'm using regular expressions to match certain
patterns in the text. For instance, floating point numbers are matched
by the following pattern:
set floatingpointnumberREpat_rep {((\.\d+)|(\m\d+(\.\d*)?))([deDE][+\-]?\d{1,3})?\M}
In this regexp I'm using \m and \M as constraints to match at the
beginning and end of a "word". The problem is that "word" in this
regexp context has a different meaning than above. From the Tcl
re_syntax man page: "A word is defined as a sequence of word
characters that is neither preceded nor followed by word characters. A
word character is an alnum character or an underscore (_)."
Therefore, a word is (or at least can be) a subtly different thing in
different areas of Tcl. Any thoughts about consistency? Don't you
think \m and \M should match at the beginning/end of words as defined
by $tcl_wordchars?
Can we call this a bug? Or a feature request?
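To make the discrepancy concrete, here is a minimal sketch. The variable
names (sample, byEngine, byClass) are only illustrative, and the second
regexp uses a bracket expression that mirrors my tcl_wordchars setting
rather than tcl_wordchars itself:

```tcl
set sample {abc#75}

# \m and \M rely on the regexp engine's fixed notion of a word character
# (alnum or underscore), so a word boundary is seen right after the "#".
set byEngine [lindex [regexp -inline {\m\d+\M} $sample] 0]

# A bracket expression with the same characters as the tcl_wordchars
# setting treats the whole string as one "word".
set byClass [lindex [regexp -inline {[\w%#!?$]+} $sample] 0]

puts $byEngine   ;# 75
puts $byClass    ;# abc#75
```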
One last question: I will have to modify my regexp above so that it still
matches floating point numbers while not matching inside my "words", such
as abc#75 (currently, "75" is matched by the above floating point
pattern, although it should not be). Advice is appreciated here too, thanks.
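One possible direction, as a sketch only and not a tested solution for
every case: replace \m with "start of string or a character outside my
word class" and \M with a negative lookahead on that class. As far as I
know, Tcl regular expressions offer lookahead but no lookbehind, so the
leading character has to be consumed and the number taken from a capture
group. The names wc and floatRE are hypothetical, and this version applies
the leading constraint to the ".5" form as well, which differs slightly
from my pattern above:

```tcl
# Assumption: same characters as in the tcl_wordchars setting, without
# the surrounding brackets.
set wc {\w%#!?$}

# The WC placeholders are replaced by the character set above.
# Group 1 holds the number itself, without the leading separator.
set floatRE [string map [list WC $wc] \
    {(?:^|[^WC])((?:\.\d+|\d+(?:\.\d*)?)(?:[deDE][+\-]?\d{1,3})?)(?![WC])}]

# No match inside a custom "word":
puts [regexp -- $floatRE {abc#75}]              ;# 0

# Still matches a free-standing number:
puts [regexp -- $floatRE {x = 1.5e3} -> num]    ;# 1
puts $num                                       ;# 1.5e3
```

Because the separator in front of the number is part of the overall match,
code that uses match indices (for example to tag text in the widget) would
need the indices of group 1 rather than those of the whole match.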
Francois
03.01.2011 - 11:56
On 3 jan, 08:30, Francois Vogel <fsvogelnew5NOS...@free.fr> wrote:
