This page documents the contrib/rebox/
subdirectory of the Pymacs distribution. First
install Pymacs from the top level of the
distribution; this has the side effect of
adjusting a few files in this directory. Once
this is done, return to this directory, then run
python setup.py install. Also read Emacs
usage below.
For comments held within boxes, it is
painful to fill paragraphs, while stretching or
shrinking the surrounding box "by hand", as
needed. This piece of Python code eases my life
on this. It may be used interactively from
within Emacs through the Pymacs interface, or
in batch as a script which filters a single
region to be reformatted. I find it only fair,
while giving all sources for a package using
such boxed comments, to also give the means I
use for nicely modifying comments. So here they
are!
Each supported box style has a number
associated with it. This number is arbitrary,
yet by convention, it holds three
non-zero digits such that the hundreds digit
roughly represents the programming language,
the tens digit roughly represents a box
quality (or weight) and the units digit
roughly a box type (or figure). An unboxed
comment is merely one of the box styles.
Language, quality and type are collectively
referred to as style attributes.
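The digit convention above can be sketched in a few lines of Python. This is only an illustration: the names split_style and join_style are made up for this page and are not claimed to exist in rebox.py.

```python
def split_style(style):
    """Split a three-digit style number into its attributes.

    Returns (language, quality, type), each kept at its own decimal
    weight, so that style 223 yields (200, 20, 3).
    Hypothetical helper, not taken from rebox.py.
    """
    hundreds, rest = divmod(style, 100)
    tens, units = divmod(rest, 10)
    return hundreds * 100, tens * 10, units


def join_style(language, quality, box_type):
    """Rebuild a style number from its three attributes."""
    return language + quality + box_type


print(split_style(223))
```

Keeping each attribute at its decimal weight means the three values can simply be added back together to recover the style number.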
When rebuilding a boxed comment,
attributes are selected independently of each
other. They may be specified by the digits of
the value given as the prefix argument to the
Emacs commands, or as the -s argument to the
rebox script
when called from the shell. If there is no
such prefix, or if the corresponding digit is
zero, the attribute is taken from the value
of the default style instead. If the
corresponding digit of the default style is
also zero, then the attribute is recognised
and taken from the actual boxed comment, as
it existed prior to the command. The
value 1, which is the simplest attribute, is
ultimately taken if the parsing fails.
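As a rough sketch of this fallback chain, assuming the prefix, the default style and the recognised style are each given as a number (or None when absent or when parsing failed), each digit may be resolved independently as follows. The function name resolve_style is hypothetical, and the special meaning of negative prefixes is deliberately left out.

```python
def resolve_style(prefix, default, detected):
    """Pick each attribute digit from the first non-zero source.

    Sources are tried in order: the prefix (or -s) value, the default
    style, then the style recognised from the comment itself. A value
    of None or 0 means "not specified". Hypothetical helper.
    """
    resolved = 0
    for weight in (100, 10, 1):
        for source in (prefix, default, detected):
            digit = (source or 0) // weight % 10
            if digit:
                break
        else:
            # Nothing specified and nothing recognised: simplest attribute.
            digit = 1
        resolved += digit * weight
    return resolved


print(resolve_style(20, 0, 413))
```

In the call above, the prefix only fixes the quality digit, so the language and the type come from the style recognised in the existing comment.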
A programming language is associated with
comment delimiters. Values are 100 for none
or unknown, 200 for /* and */ as in plain C, 300
for // as
in C++, 400 for # as in most
scripting languages, 500 for ; as in Lisp, Scheme
or assembler, and 600 for % as in TeX,
PostScript or Erlang.
Box quality differs according to language.
For unknown languages (100) or for the C
language (200), values are 10 for simple, 20
for rounded, and 30 or 40 for starred. Simple
quality boxes (10) use comment delimiters to
the left and right of each comment line, and also
for the top or bottom line when applicable.
Rounded quality boxes (20) try to suggest
rounded corners in boxes. Starred quality
boxes (40) mostly use a left margin of
asterisks or X'es, and use them also in box
surroundings. For all other languages, box
quality indicates the thickness in characters
of the left and right sides of the box:
values are 10, 20, 30 or 40 for 1, 2, 3 or 4
characters wide. With C++, quality 10 is not
useful, so it is not allowed.
Box type values are 1 for fully opened
boxes for which boxing is done only for the
left and right but not for top or bottom, 2
for half single lined boxes for which boxing
is done on all sides except top, 3 for fully
single lined boxes for which boxing is done
on all sides, 4 for half double lined boxes
which are like type 2 but bolder, or 5 for
fully double lined boxes which are like type 3
but bolder.
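The three attribute tables just described can be gathered into a small lookup, for readers who prefer to see them side by side. The tables below merely restate the text; the names LANGUAGES, BOX_TYPES and describe_style are invented for this illustration and do not come from rebox.py.

```python
LANGUAGES = {
    100: "none or unknown",
    200: "/* and */ (plain C)",
    300: "// (C++)",
    400: "# (most scripting languages)",
    500: "; (Lisp, Scheme, assembler)",
    600: "% (TeX, PostScript, Erlang)",
}

BOX_TYPES = {
    1: "fully opened",
    2: "half single lined",
    3: "fully single lined",
    4: "half double lined",
    5: "fully double lined",
}


def describe_quality(language, quality):
    """Quality names depend on the language, as explained above."""
    if language in (100, 200):
        names = {10: "simple", 20: "rounded", 30: "starred", 40: "starred"}
        return names.get(quality, "unknown")
    return "sides %d wide" % (quality // 10)


def describe_style(style):
    """Spell out a style number (hypothetical helper)."""
    language = style // 100 * 100
    quality = style // 10 % 10 * 10
    box_type = style % 10
    return "%s, %s, %s" % (
        LANGUAGES.get(language, "unknown"),
        describe_quality(language, quality),
        BOX_TYPES.get(box_type, "unknown"),
    )


print(describe_style(223))
```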
The special style 221 is for C comments
between a single opening /* and a single
closing */.
The special style 111 deletes a box.
Usage is rebox [OPTION]... [FILE]. By
default, FILE is reformatted to standard
output by refilling the comment up to column
79, while preserving existing boxed comment
style. If FILE is not given, standard input
is read. Options may be:
-n
Do not refill the comment inside
its box, and ignore -w.
-s STYLE
Replace box style according to
STYLE, as explained above.
-t
Replace initial sequence of
spaces by TABs on each line.
-v
Echo both the old and the new box
styles on standard error.
-w WIDTH
Try to avoid going over WIDTH
columns per line.
So, a single boxed comment is reformatted
per invocation. vi users, for example, would
need to delimit the boxed comment first,
before executing the !}rebox command (is
this correct? my vi recollection is far
away).
Batch usage is also slow, as internal
structures have to be reinitialised at every
call. Producing a box in a single style is
fast, but recognising the previous style
requires setting up for all possible
styles.
For most Emacs language editing modes,
refilling does not make sense outside
comments, so one may redefine the M-q command and link it to
this Pymacs module. For example, I use this
in my .emacs file:
The Emacs function rebox-comment automatically
discovers the extent of the boxed comment
near the cursor, possibly refills the text,
then adjusts the box style. When this command
is executed, the cursor should be within a
comment, or else it should be between two
comments, in which case the command applies
to the next comment. The function
rebox-region
does the same, except that it takes the
current region as a boxed comment. Both
commands obey numeric prefixes to add or
remove a box, force a particular box style,
or to prevent refilling of text. Without such
prefixes, the commands may deduce the current
box style from the comment itself so the
style is preserved.
The default style initial value is nil or
0. It may be preset to another value through
calling rebox-set-default-style from
Emacs Lisp, or changed to anything else
by using a negative value for a prefix,
in which case the default style is set to the
absolute value of the prefix.
A C-u
prefix avoids refilling the text, but forces
using the default box style. C-u - lets the user
interact to select one attribute at a
time.
Let's suppose you want to add your own
boxed comment style, say:
//--------------------------------------------+
// This is the style mandated in our company.
//--------------------------------------------+
You might modify rebox.py, but
then you will have to edit it again whenever you
get a new release of rebox.py. Emacs
users might instead modify their .emacs file or
their rebox.el
bootstrap, if they use one. In either case,
after the (pymacs-load
"Pymacs.rebox") line, merely add:
If you use the rebox script rather than
Emacs, the simplest is to make your own. This
is easy, as it is very small. For example,
the above style could be implemented by using
this script instead of rebox:
In all cases, NNN is the style three-digit
number, with no zero digit. Pick any free
style number; you are safe with 911 and up.
MMM is the recognition priority, only used to
disambiguate the style of a given boxed
comment when it matches many styles at
once. Try something like 400. Raise or lower
that number as needed if you observe false
matches.
On average, the template uses three lines
of equal length. Do not worry if this implies
a few trailing spaces; they will be cleaned
up automatically at box generation time. The
first line or the third line may be omitted
to create vertically opened boxes. The
middle line may not be omitted, however: it must
include the word box, which will get
replaced by your actual comment. If the first
line is shorter than the middle one, it gets
merged at the start of the comment. If the
last line is shorter than the middle one, it
gets merged at the end of the comment and is
refilled with it.
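To make the role of the word box more concrete, here is a simplified sketch of how a three-line template could be stretched around some text. It is not the algorithm used by rebox.py: make_box is a hypothetical helper, it assumes that all three lines are present, and it assumes that the border character found in the column where box starts is the one to repeat. Shorter first or last lines, and omitted lines, are not handled.

```python
def make_box(template, text_lines):
    """Draw a box around text_lines, following a three-line template.

    Simplified, hypothetical sketch: the middle template line must
    contain the word "box"; the character found in that same column
    on the top and bottom lines is repeated to span the text width.
    """
    top, middle, bottom = template
    column = middle.index("box")
    width = max(len(line) for line in text_lines)

    def stretch(border):
        # Pad short borders, then repeat the character sitting in the
        # column where "box" starts, once per column of text.
        border = border.ljust(len(middle))
        return border[:column] + border[column] * width + border[column + 3:]

    result = [stretch(top)]
    for line in text_lines:
        result.append(middle[:column] + line.ljust(width) + middle[column + 3:])
    result.append(stretch(bottom))
    # Trailing spaces are cleaned up once the box is generated.
    return [line.rstrip() for line in result]


template = ("//-----+",
            "// box  ",
            "//-----+")
for line in make_box(template, ["Hello", "big world"]):
    print(line)
```

With this template, the left part of each line is copied as is, the text replaces box, and the top and bottom borders grow to the width of the longest text line.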
This example tool comes in two parts: a
batch script rebox and a
Pymacs.rebox
module. Go to the contrib/rebox/
directory of the distribution and use
python setup.py
install there. To check that both are
properly installed, type rebox </dev/null in
a shell; you should neither receive any output nor
see any error.
In batch, the reconstruction of boxes is
driven by command options and arguments and
expects a complete, self-contained boxed
comment from a file. The Emacs function
rebox-region
also presumes that the region encloses a
single boxed comment. Emacs rebox-comment is different,
as it has to find by itself the extent of the
surrounding boxed comment.
The Python code is too big to be inserted
in this documentation: see file Pymacs/rebox.py
in the Pymacs distribution. You will observe
in the code that Pymacs-specific features are
used exclusively from within the
pymacs_load_hook function and
the Emacs_Rebox
class. In batch mode, Pymacs is not even imported.
Here, we mean to discuss some of the design
choices in the context of Pymacs.
In batch mode, as well as with
rebox-region,
the text to handle is turned over to Python,
and fully processed in Python, with
practically no Pymacs interaction while the
work gets done. On the other hand,
rebox-comment
is rather Pymacs intensive: the comment
boundaries are chased right from the Emacs
buffer, as directed by the function
Emacs_Rebox.find_comment.
Once the boundaries are found, the remainder
of the work is essentially done on the Python
side.
Once the boxed comment has been
reformatted in Python, the old comment is
removed in a single delete operation and the new
comment is inserted in a second operation;
this occurs in Emacs_Rebox.process_emacs_region.
But by doing so, if point was within the
boxed comment before the reformatting, its
precise position is lost. To preserve
point exactly, Python might have driven all
reformatting details directly in the Emacs
buffer. We really preferred doing it all on
the Python side: we gain legibility by
expressing the algorithms in pure Python, the
same Python code may be used in batch or
interactively, and we avoid the slowdown that
would result from heavy use of Emacs
services.
To avoid completely losing point, I
kludged a Marker class, whose goal is
to estimate the new value of point from the
old. Reformatting may change the amount of
white space, and either delete or insert an
arbitrary number of characters meant to draw the
box. The idea is to initially count the
number of characters between the beginning of
the region and point, while ignoring any
problematic character. Once the comment has
been put back in a box, point is advanced
from the beginning of the region until we get
the same count of characters, skipping all
problematic characters. This Marker class works fully on
the Python side; it does not involve Pymacs
at all, but it does solve a problem that
resulted from my choice of keeping the data
on the Python side instead of handling it
directly in the Emacs buffer.
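The counting idea may be sketched as below. This is not the Marker class itself: the function names and the IGNORED set are assumptions made for the example, and the real code may consider a different set of characters to be problematic.

```python
# Hypothetical choice of "problematic" characters: white space plus
# characters commonly used as comment delimiters or for drawing boxes.
IGNORED = frozenset(" \t\n/*#;%|+-=")


def count_before(text, point, ignored=IGNORED):
    """Count the significant characters between the start and point."""
    return sum(1 for char in text[:point] if char not in ignored)


def advance_to(text, count, ignored=IGNORED):
    """Return the offset of the significant character number count.

    This is where point should land in the reformatted text.
    """
    seen = 0
    for offset, char in enumerate(text):
        if char in ignored:
            continue
        if seen == count:
            return offset
        seen += 1
    return len(text)


old = "/* hello world */"
new = ("/*-------------*/\n"
       "/* hello world */\n"
       "/*-------------*/\n")
saved = count_before(old, old.index("world"))
point = advance_to(new, saved)
print(new[point:point + 5])
```

The result is only an estimate: if the comment text itself contains some of the ignored characters, point may land a little off, which is acceptable for this purpose.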
We want a comment reformatting to appear
as a single operation, in the context of
Emacs Undo. The method Emacs_Rebox.clean_undo_after
handles the general case for this. Not that
we do so much in practice: a reformatting
implies one delete-region and one
insert, and
maybe some other little adjustments at
Emacs_Rebox.find_comment
time. Even if this method scans and modifies
an Emacs Lisp list directly in the Emacs
memory, the code doing this stays neat and
legible. However, I found out that the undo
list may grow quickly when the Emacs buffer
uses markers, with the consequence of making
this routine so Pymacs intensive that most of
the CPU is spent there. I rewrote that
routine in Emacs Lisp so it executes in a
single Pymacs interaction.
Function Emacs_Rebox.remainder_of_line
could have been written in Python, but it was
probably not worth going away from this
one-liner in Emacs Lisp. Also, given this
routine is often called by find_comment, a few Pymacs
protocol interactions are spared this way.
This function is useful when there is a need
to apply a regular expression already
compiled on the Python side: it is probably
better to fetch the line from Emacs and do
the pattern match on the Python side than to
transmit the source of the regular
expression to Emacs for it to compile and
apply it.
For refilling, I could have either used
the refill algorithm built into Emacs,
programmed a new one in Python, or relied on
Ross Paterson's fmt, distributed by GNU and
available on most Linuxes. In fact,
refill_lines
prefers the latter. My own Emacs setup is
such that the built-in refill algorithm is
already overridden by GNU
fmt, and it
really does a much better job. Experience
taught me that calling an external program is
fast enough to be very bearable, even
interactively. If Python called Emacs to do
the refilling, Emacs would itself call GNU
fmt in my case, so
I preferred that Python call GNU
fmt directly. I
could have reprogrammed GNU fmt in Python. Although
interesting, this is not an easy project:
fmt implements
the Knuth refilling algorithm, which depends
on dynamic programming techniques; Ross did
carefully fine-tune them, and took care of
many details. If GNU fmt fails, for not being
available, say, refill_lines falls back on a
dumb refilling algorithm, which is better
than none.
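The general shape of such a fallback can be sketched as follows. This is not the code of refill_lines: the function name refill is hypothetical, the sketch handles a single paragraph only, and the standard textwrap module stands in here for the dumb algorithm, which the real code implements in its own way.

```python
import subprocess
import textwrap


def refill(text, width=79):
    """Refill one paragraph, preferring the external fmt program.

    Hypothetical sketch: falls back on textwrap when fmt is not
    available or exits with an error.
    """
    try:
        done = subprocess.run(
            ["fmt", "-w", str(width)],
            input=text, capture_output=True, text=True, check=True,
        )
        return done.stdout
    except (OSError, subprocess.CalledProcessError):
        # fmt is missing or failed: use a simple greedy refill instead.
        return textwrap.fill(" ".join(text.split()), width) + "\n"


print(refill("For comments held within boxes, it is painful "
             "to fill paragraphs by hand.", 30))
```

Either way, the caller gets refilled text back and does not need to know which of the two algorithms produced it.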
I first observed rounded corners, as in
style 223 boxes, in code from Warren Tucker, a
previous maintainer of the shar package, circa 1980.
Except for very special files, I carefully
avoided boxed comments for real work, as I
found them much too hard to maintain. My friend
Paul Provost was working at Taarna, a computer
graphics place, which had boxes as part of
their coding standards. He asked that we try
something to get him out of his misery, and
this is how rebox.el was
originally written. I did not plan to use it
for myself, but Paul was so enthusiastic that I
timidly started to use boxes in my things, very
little at first, but more and more as time
passed, still in doubt that it was a good move.
Later, many friends spontaneously started to
use this tool for real, some being very serious
workers. This convinced me that boxes are
acceptable, after all.
I do not use boxes much with Python code. It
is so legible that boxing is not that useful.
Vertical white space is less necessary, too. I
even often avoid white lines within functions.
Comments appear prominent enough when using
highlighting editors like Emacs or nice printer
tools like enscript.
After Emacs could be extended with Python,
in 2001, I translated rebox.el into
rebox.py, and
added the facility to use it as a batch script.
The most recent copy I could find of rebox.el is also
provided here, to ease pondering and
comparisons with the Python translation and
adaptation.