TIP 457: [eatargs] vs [proc] extensions; introspection; order of parameters; ambiguity

TIP 457: [eatargs] vs [proc] extensions; introspection; order of parameters; ambiguity

Kevin Kenny-6
Several people, notably Christian Gollwitzer, have asked for some more
of my notes about TIP #457.

In particular, he and I both remember that at one point I had
discussed the handling of optional args at some length, but I
certainly can't find that discussion now, and I'm forced to
reconstruct it as best I can.

I realize that this discussion is coming quite late to the party, but
my personal time is limited, and an analysis like this one is quite
time-consuming.

I hope that it serves to clarify my position on the controversy of
extending [proc] versus providing an additional [eatargs] command, on
introspection, and on what the required order of parameters must be.

In many cases, I'm sure that I'm coming across as quite dogmatic in
this posting. I hope that I'm providing a coherent rationale when
I explicitly reject what at first seem like reasonable alternatives.

1. Command, or extension to [proc]/[method]/[apply] ?
-----------------------------------------------------

Let me lead off by saying that I've certainly considered Alex's
counterproposal for an [eatargs] command, and at this point I'm
beginning to incline toward it, particularly if a less silly name
could be chosen - [parseargs] or [getopts] perhaps? (But don't take my
advice here: some of the greatest controversies that I've stirred on
tcl-core have been those related to the Naming of Names.)

Colin makes a valid argument that the functionality is useful beyond
[proc], [method], and [apply] of λ-forms. In particular, invocations
of coroutines would conceivably want to parse keyword parameters. (I'm
not quite following the argument about [interp alias], but that really
doesn't matter. A single coherent example is enough to demonstrate
that the set is incomplete.) That militates in favor of doing argument
processing with [eatargs]. 

I initially disfavored this approach for three reasons. The first is
that it's some amount of additional work for program analysis. As far
as I can grasp from Colin's verbose screeds, he seems to be labouring
under the misconception that I was referring to human program
understanding. Instead, I was referring to interprocedural data flow
analysis. I would fully expect that if this functionality is released,
the task of analysis will fall upon the quadcode compiler and similar
tools. The advantage of being able to match '-flag 1' in a caller 
with '$flag' in a called procedure, and be able to use that fact to
prove '$flag always exists, is always a boolean, and in fact is always
true - and in turn possibly even generate code specialized to the
individual call site - is one of the key things that lets the quadcode
engine do its work. 

That said, quadcode already needs to reach fairly deeply into called
procedures, to try to constrain what they might [upvar], to analyze
what namespace variables they might alter, and to extract relevant
facts about return values. Dissecting argument specifications, and
reasoning about bindings from the fact that $args reaches [eatargs]
unmodified from procedure entry, are the sort of thing that the
compiler 'middle end' does, so I really have no grounds for objection
there.

The second reason for preferring the specification in the [proc],
[method] or [apply] itself is that it makes the specification easily
accessible by [info] for introspection.  I would imagine that tools
like Nagelfar or TDK would need this sort of information to guide both
editing assistance and analysis. 

The third reason is that whatever syntax we choose is likely to come
to enjoy a uniquely privileged status in any case. One reason is that
it is "in the Core" - to this day, despite our best efforts with ideas
like bundled packages, a substantial fraction of the community feels
that anything that is not "in the Core" is relegated to second-class
status. Alas, I don't have a good solution to this problem, which is
more social than technical. I've tried to make a worked example of
what I see as "out of Core but still first-class" with the tdbc
package, and Don, Richard, Arnulf and Jan have made really significant
strides with bundling SQLite, the [incr] suite, tclconfig, and so
on. But the stigma remains.

A more significant reason that the chosen implementation is likely to
be privileged is simply resource constraints. In the short term, I
expect that the chosen parser will be rendered more efficient by
intimacy with the internals of Tcl - intimacy that we are not prepared
to provide to extensions simply because nobody has time to document
and refactor the interfaces in such a way as to make them stable and
useful to the public API. We don't want the internals to wind up set
in stone because some ill-considered public interface codifies a bad
implementation. In the intermediate term, I do assume that quadcode,
at least, will be able to cope with one [eatargs] implementation. If
they proliferate, Donal and I are not going to have the bandwidth to
do them all, and disfavoured ones will have a really significant
performance penalty in that world. 

An aside: Sorry about that!  We are both more technicians than
evangelists, and have struggled to recruit more quadcode
developers. It's a difficult process - many people are intimidated by
the complexity of what we work on day to day, and it's been
difficult to communicate that there are indeed tasks that don't
require "research grade" programming skills.

In the end, the balance of factors seems as if it favors a separate
command rather than integration with [proc]/[method]/[apply]. I do,
however, mourn the loss of introspection that comes with that
approach.

2. Introspection without an extended [info]
-------------------------------------------

It occurs to me that we could get most of our introspection back if
we adopt Tk's way of doing business - ask a command itself what its
syntax is. This would require, in the best of all possible worlds,
two things of the command:

 (1) An option, probably something like '-?' or '-help' or '--help',
     that would return a human-readable description of the available
     options.

 (2) An option, probably something like '-argspec', that provides
     a machine-readable description of the available options. This
     option could simply echo the description that was provided by
     '-eatargs'.

For this combination really to be workable, it needs to report on
all command-line arguments. That implies two things. First, it needs
to know what the command prefix was - what were the items that
preceded the arguments on the command line? (This question can be
complex in a world of ensembles, methods, and interpreter aliases. I
don't really have the time right now to source-dive to see how it's
done, but since the existing 'wrong # args' from these things figures
it out, I presume that we can in the implementation.) Second, it
needs to report on all the args to a command.
[eatargs] should, except in special circumstances, be coded so
that the only parameter to the [proc] is 'args', with [eatargs]
responsible for deconstructing the entire argument list. For proper
error messages, [eatargs] may need to have special processing in the
internals to get the command prefix correct in the case of methods and
ensembles. 
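A minimal sketch of that convention, with Python standing in for Tcl,
and with '-help' and '-argspec' taken as the hypothetical option names
floated above (nothing here is a committed interface):

```python
def make_command(argspec, body):
    """Wrap 'body' so the resulting command can answer questions about
    its own syntax, Tk-style.  'argspec' is a list of (option, meaning)
    pairs; '-help' and '-argspec' are the hypothetical query options
    discussed above."""
    def command(*argv):
        if argv == ('-help',):
            # Human-readable description of the available options.
            return ' '.join('?%s %s?' % (opt, desc) for opt, desc in argspec)
        if argv == ('-argspec',):
            # Machine-readable echo of the specification itself.
            return argspec
        return body(*argv)
    return command
```

The hard part glossed over here is, of course, the command-prefix
reporting for ensembles and methods discussed above; this sketch shows
only the query side of the convention.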

3. Combinations of Syntax
-------------------------

There is a lot of discussion about combinations of keyword parameters,
optional positional parameters, and 'args', and about the cost of deep
parameter inspection and the risks of injection attacks. When proposed
limitations are raised, though, people keep presenting counterexamples
of particular Core commands that can't be modeled under these
restrictions. Unfortunately, I don't believe that there's any single
system that would model the complete variety of what the Core commands
do, and truly their syntax isn't entirely coherent. Many of them would
not be designed the same way if we had them to do over.

So, let's try to reason from first principles what we can and cannot
allow, and what the choices to make are.

Required positional parameters by themselves are trivial - that's a
tiny subset of even what today's [proc] supports.

A combination of required positional parameters and optional positional
parameters, in any sequence, can be resolved by the rule that optional
positional parameters are filled from left to right. Thus, as long as
'args' and keyword parameters aren't about, we can mix required and
optional parameters in any order, and the parameter count alone will
be enough to resolve them.

We can continue in this vein to add 'args', and still keep the
parameters in any order. Optional parameters can be filled from left
to right, and then anything left over can be assigned to 'args'.
There is a fairly simple loop that can do this assignment in one pass.
I show it in pseudo-Tcl as Figure 1. This addition brings us a little
bit beyond what TIP 288 can parse, although the authors of TIP 288
promised that a following TIP would address optional positional
parameters.

    -------------------------------------------------------------------------
    set i 0;            # index into the formal parameters
    set j 0;            # index into the actual arguments
    set m [number of mandatory args]
    ;                   # count of unfilled mandatory args
    set n [number of optional args]
    ;                   # count of unfilled optional args
    set p [llength $args]
    ;                   # count of unpaired params

    if ($p < $m
        || $p > $m + $n && !([args present])) {
        wrong-#-args
    }

    while ($p > 0) {
        if [is-mandatory $i] {
            pair actual param $j with formal param $i
            incr j
            incr p -1
            incr m -1
        } elseif [is-optional $i] {
            if {$p > $m} {
                pair actual param $j with formal param $i
                incr j
                incr p -1
            } else {
                supply default for formal param $i
            }
            incr n -1
        } else {     # 'args'
            put actual params $j to $j + $p - $m - $n - 1 into args
            set j [expr {$j + $p - $m - $n}]
        }
        incr i
    }

    Figure 1. Pseudo-code for a loop to unpack optional parameters,
              mandatory parameters, and 'args'
    -------------------------------------------------------------------------
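To make the counting rule concrete, here is a runnable model of the
Figure 1 algorithm, with Python standing in for the pseudo-Tcl; it is
a sketch, not a committed implementation. Two details that the figure
leaves implicit are spelled out: the loop runs over the formal
parameters (so trailing defaults still get supplied), and the surplus
handed to 'args' is clamped at zero.

```python
def bind_args(formals, actual):
    """One-pass, left-to-right binding of 'actual' to 'formals'.

    'formals' is a list in which a bare name is a mandatory positional
    parameter, a (name, default) pair is an optional one, and the
    literal string 'args' is the catch-all parameter."""
    spec = []
    for f in formals:
        if f == 'args':
            spec.append(('args', 'args', None))
        elif isinstance(f, tuple):
            spec.append(('optional',) + f)
        else:
            spec.append(('mandatory', f, None))
    m = sum(1 for kind, _, _ in spec if kind == 'mandatory')
    n = sum(1 for kind, _, _ in spec if kind == 'optional')
    has_args = any(kind == 'args' for kind, _, _ in spec)
    p = len(actual)
    if p < m or (p > m + n and not has_args):
        raise TypeError('wrong # args')
    out, j = {}, 0
    for kind, name, default in spec:
        if kind == 'mandatory':
            out[name] = actual[j]
            j += 1; p -= 1; m -= 1
        elif kind == 'optional':
            if p > m:                 # enough actuals remain to spend one here
                out[name] = actual[j]
                j += 1; p -= 1
            else:
                out[name] = default   # otherwise fall back to the default
            n -= 1
        else:                         # 'args' takes whatever surplus remains
            take = max(0, p - m - n)  # clamp at zero when there is no surplus
            out['args'] = actual[j:j + take]
            j += take; p -= take
    return out
```

Run against the formal parameter list {a {b B} args {c C} d {e E}}
from the worked example in section 5, this model reproduces the
binding table given there.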

4. Keyword parameters
---------------------

With the portion of the proposal that's duplicative of TIP 288 out of
the way, we come to the most controversial part: keyword parameters.
If we see a Tcl command as consisting of a verb that applies to a
collection of nouns:

    do-something this that the-other-thing

then keyword parameters are adverbs '-quickly' and prepositional phrases
'-with alacrity'. For clarity, I shall use 'noun', 'adverb',
'preposition' and 'object' to describe these components.

Let's consider a few desired features that we want command syntax to
have.

Parsing should ideally not require deep inspection of the parameter
data to resolve the syntax.

Starting the parameter list
---------------------------

The parser should know when an adverb or preposition is expected and
simply be able to examine what word it is, without any worries about
constructing the string representation of large data items to find
out.

As a corollary to this rule, we cannot have a varying number of
arguments before the keyword parameters begin. There is simply no way
in that case to indicate the start of keyword parameters that is not
data dependent. Mandatory positional parameters are therefore the only
things that may come before the keyword parameters.

Clearly, the keyword parameters must follow next, in a block.  The
Unix command-line practice of allowing non-keyword arguments
interspersed among the keyword arguments has nothing to recommend it
to Tcl. In fact, I believe that it has nothing to recommend it to
Unix, and a great many Unix commands fail horribly if there is a file
name with a leading hyphen, and require special command-line syntax to
deal with the case. That sort of nonsense will be impossible to
compile - how can we tell whether a parameter in a variable '$a' that
has just been read from an external medium is a keyword or not?

It is error-prone in other ways as well. If the proposal as finally
voted allows for intermixed keywords and values based on string
inspection, I shall vote NO.

Ending the parameter list
-------------------------

Now, we need to resolve where the keyword parameters end. One method
that is guaranteed free of ambiguity is to require the caller to mark
the end of the keyword parameters with an explicit '--' token. This is
probably the simplest approach that can be taken.

There are those that consider the requirement of '--' to be poor
style, so let's visit a couple of special cases where it could be
dispensed with.

First, if all the parameters that follow are required parameters, we
must be at the end of the keyword parameter list. This rule allows for
syntax like the Core [lsort] command:

    lsort -integer -stride 2 -index 1 -increasing $list

where $list is a required parameter. The last parameter to the command
must be the list to sort.  This is a common and clean enough case that
we should almost always allow it.

The other special case that I would like to see is that, if all the
allowed keyword parameters are prepositional phrases (that is, each
consumes two actual arguments: -keyword value), and there is a single
optional positional parameter to follow (along with any number of
mandatory ones), without 'args', then we can end the keyword
parameters when there are just enough mandatory arguments left to fill
the parameter list. Consuming a prepositional phrase would leave too
few arguments to satisfy the mandatory parameters, so that cannot be a
valid parse.

I mention this case because a number of tdbc methods (and, I suspect,
a few other commands) depend on it to allow '--' to be omitted.
The 'allrows' method, for instance, on a tdbc connection looks like

    db allrows ?-as lists|dicts? ?-columnsvariable name? ?--?
        sql-statement ?dictionary?

The '--' can be omitted in all cases, because if only one argument
remains, it must be the sql-statement; if two remain, they must be
sql-statement and dictionary because if they were a keyword-value
pair, the mandatory sql-statement would be omitted.
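As a sketch of how that counting rule works without any data
inspection (again with Python standing in for Tcl; the function name
and signature are illustrative only): assuming every keyword is a
prepositional '-key value' pair and at most one trailing positional is
optional, the end of the keyword block is decided purely by counting.

```python
def split_keywords(argv, trailing_mandatory):
    """Split argv into (options, positionals) without inspecting data.

    Precondition, as argued above: every keyword consumes two words
    ('-key value'), and at most one trailing positional is optional.
    The keyword block ends at an explicit '--', or as soon as consuming
    one more '-key value' pair would leave fewer words than the
    mandatory trailing positionals require."""
    opts, i = {}, 0
    while i < len(argv):
        if argv[i] == '--':
            i += 1                    # explicit end-of-options marker
            break
        if len(argv) - i - 2 < trailing_mandatory:
            break                     # a further pair would starve a mandatory positional
        opts[argv[i]] = argv[i + 1]
        i += 2
    return opts, list(argv[i:])
```

With the 'allrows' signature (one mandatory sql-statement, one
optional dictionary), passing trailing_mandatory=1 reproduces the
behaviour described above: '-as lists stmt' parses as a keyword pair
plus the statement, while 'stmt dict' parses as two positionals.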

Deprecated alternative: non-hyphen args
---------------------------------------

We could also adopt the rule that if an argument without a leading
hyphen is encountered where a keyword is expected, that ends the
keyword arguments. Core commands such as [puts] follow this rule.
Note, however, that adopting this rule means that we are requiring
deep inspection of the parameter data. [puts $very_large_list xyz]
will therefore need to stringify the list - to determine that it
is not '-nonewline'.

In the case of the Core [switch] command, the rule that 'an argument
without a hyphen ends the keyword arguments' is particularly noxious.
For instance, if there is exactly one switch case, and the case list
is unbraced,

    switch -exact $arg thing1 {script1}

the command cannot be compiled. The reason is that we don't know
until run time whether the content of $arg might be one of the adverbs
'-exact', '-glob', '-regexp', '-nocase' or the marker '--'. In all
five of these cases, 'thing1' is the value to be switched on, and
instead of a script, the string in braces is the list of switch cases.

In fact, compilation of the case,

    switch -exact $arg {list of cases}

is fairly recent. For several releases, [switch] was compiled to
bytecode only if the compiler could statically locate the '--'.

I would argue that if this current analysis had been performed at the
time that the relevant Core commands were designed, they would have
been designed NOT to require parameter inspection. These cases have
certainly been a source of bugs in the past!

Rejected alternative: multiple tranches of keyword parameters
-------------------------------------------------------------

Once we've reached the end of the keyword parameters by encountering a
'--' (or, in the deprecated alternative, an argument without a hyphen)
we're essentially in the same state as we were in at the start of the
parse. It is conceivable that a specification could include a further
group of (zero or more) mandatory parameters and a further keyword
parameter list.

I can't see an obvious use for this syntax, and it appears to me to
produce API's that are error-prone and difficult to document. I would
require a very convincing argument before I could bring myself to vote
YES on an interface that supports this case.

After the parameter list
------------------------

Once we've determined in the argument parse that we've reached the
end of the keyword parameters, we are left with a group of optional
positional parameters, mandatory positional parameters, and 'args'.
They may be freely intermixed, and the algorithm of Figure 1 will
still sort them out.

5. Additional notes
-------------------

One fairly nice feature of all this is that it doesn't require
steps of 'consume args from the end' or 'scan the argument list
for an occurrence of ...'.  It can function in one trip through the
actual arguments, from left to right, binding variables as it goes.

The syntax winds up being something like:

    [command] [mandatory arg]* ([adverb] | [prepositional phrase])* \
        '--' ([mandatory arg] | [optional arg] | 'args')*

with the additional semantic constraints that only one instance of
'args' is permitted, and that '--' may be omitted if the material to
the right has no instance of 'args' and at most one instance of
[optional arg]. The grammar, even with the semantic constraints,
is regular, and a simple state machine can therefore handle all the
argument binding.

I argue that the set of rules I present here is the least restrictive
set that we can adopt and still guarantee sane behaviour. It shouldn't
be impossible to document clearly:

    The keyword parameters must all be adjacent in the parameter list.
    Only mandatory positional parameters may precede them. Following
    them may be mandatory positional parameters, optional positional
    parameters, and the special keyword 'args'. These parameters may
    be intermixed in any order. The rule at run time is that the
    number of actual arguments is counted. Optional parameters
    are marked to be filled, from left to right, as long as there
    are enough arguments to supply them. If there are more arguments
    than needed for all optional parameters, the excess count is
    given to 'args'. Once it is known which parameters are to be
    filled, the argument list is bound to the actual parameters,
    proceeding from left to right.

    Example: If the formal parameter list is

        {a {b B} args {c C} d {e E}},

    then the result of the binding will be as follows for various
    argument lists:

    p -> wrong # args
    p q  ->  a=p b=B args={} c=C d=q e=E
    p q r -> a=p b=q args={} c=C d=r e=E
    p q r s -> a=p b=q args={} c=r d=s e=E
    p q r s t -> a=p b=q args={} c=r d=s e=t
    p q r s t u -> a=p b=q args=r c=s d=t e=u
    p q r s t u v -> a=p b=q args={r s} c=t d=u e=v

From the discussions that I've seen go by on the mailing list, I
suspect that what I describe in this post is fairly close to what has
actually been implemented so far. That said, I've not got to the point
of inspecting the code and test suite to find out. I first needed to
make sure that I'm thinking clearly about the requirements. Only then
will inspecting the implementation be informative.

Anyway, I hope this post helps others clarify their thinking about
these controversies. I welcome corrections, of course. There is a lot
of reasoning here, and I may well have strayed into error.


Kevin

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Tcl-Core mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/tcl-core
Re: TIP 457: [eatargs] vs [proc] extensions; introspection; order of parameters; ambiguity

Colin McCormack-3


On Thu, 25 May 2017 at 12:27 Kevin Kenny <[hidden email]> wrote:

Colin makes a valid argument that the functionality is useful beyond
[proc], [method], and [apply] of λ-forms. In particular, invocations
of coroutines would conceivably want to parse keyword parameters. (I'm
not quite following the argument about [interp alias], but that really
doesn't matter. A single coherent example is enough to demonstrate
that the set is incomplete.) That militates in favor of doing argument
processing with [eatargs].

My argument is that to the extent the functionality is useful it is useful in any script (or C function) which has the effect of defining a Tcl command.  I don't even understand why that's a contentious assertion.  Apparently it is.

Secondarily, and more importantly: whether the functionality is useful or not, introducing it in some command forms and not others doesn't serve Tcl well, as a language.
 
I initially disfavored this approach for three reasons. The first is
that it's some amount of additional work for program analysis. As far
as I can grasp from Colin's verbose screeds, he seems to be labouring
under the misconception that I was referring to human program
understanding.

I'm sorry my screeds are so verbose.  I'm happy one of my valid points rose above the din.

I came late to this party; I don't think I'm even aware of program analysis in the punch.

I was more worried about the kool aid, and it's possible I was talking to some of the claims that this TIP will bring Great Convenience to all.

I read the rest of your post with interest.  You directly addressed the impact of selecting a sophisticated arg-processing form to be imposed on the language, which I'm glad to hear from someone who sees merit in the proposal.  There's also much goodness in the text, and while it too may be verbose, I would say it's not prolix. 

Colin


Re: TIP 457: [eatargs] vs [proc] extensions; introspection; order of parameters; ambiguity

Peter da Silva-2
In reply to this post by Kevin Kenny-6

The restrictions on the semantics of named args that Kevin discusses in this post seem reasonable to me. This is the right time to implement such restrictions, before any legacy code exists.


My performance-related objection to eatargs is that it would seem to preclude optimizations by the compiler guided by the arg syntax. After reading Kevin's discussion of what the compiler is currently doing, it seems that's somewhat covered: a transition to the new proc syntax could eventually allow for simplifications of the compiler as upvar transitions over time to -upvar.


The existence of a standalone entry point for the parameter parsing is a good idea, but it’s really a separate issue from whether proc and other tools that use the underlying API should include this parameter parsing. A separate standalone entry point for handling things like script level command line parsing could still be created even if this code were always used by proc.


Given that it’s been carefully designed to never conflict with existing proc syntax, and there’s no overhead at runtime for code that doesn’t use the extended syntax, I’m still having trouble understanding the objections to including it in proc.



Re: TIP 457: [eatargs] vs [proc] extensions;

Andreas Leitgeb
In reply to this post by Kevin Kenny-6
Kevin Kenny <[hidden email]> wrote:
> 1. Command, or extension to [proc]/[method]/[apply] ?
> -----------------------------------------------------
> In the end, the balance of factors seems as if it favors a separate
> command rather than integration with [proc]/[method]/[apply]. I do,
> however, mourn the loss of introspection that comes with that
> approach.

The loss of introspection isn't the only thing to mourn. There's also:
 - loss of the ability to inject code at the start of a body.
 - loss of the simple pattern "proc-name proc-params proc-body", toward
     a (bungle-like) "args" as the proc-params with the real params
     specified inside the body.  That grossly violates ergonomics.
     This is most likely what FA meant by their wording "lame".

> 2. Introspection without an extended [info]
> -------------------------------------------
> It occurs to me that we could get most of our introspection back if
> we adopt Tk's way of doing business - ask a command itself what its
> syntax is.

Calling a proc to obtain a usage message is convenient in specific
interactive cases, but far from a generally available mechanism.

Of all potential future procedures taking an "args" on the first line,
a non-negligible fraction would actually take args as a list of
objects to deal with.  And debugging/instrumentation tools just cannot
rely on it.  (This reminds me of my recent attempt at getting a usage
message from 'namespace ensemble create' - I still had to look at the
man page.)

That problem of a correct usage message even for ensemble members
appears to me to be dwarfed by the problem of general availability.

PS: auto-checking for certain options in some "eatargs" implementation
  would induce even worse data-triggered bugs than those (meanwhile
  solved!) in a previous version of TIP 457.

> 3. Combinations of Syntax
> -------------------------
> There is a lot of discussion about combinations of keyword parameters,
> optional positional parameters, and 'args', and about the cost of deep
> parameter inspection and the risks of injection attacks. When proposed
> limitations are raised, though, people keep presenting counterexamples
> of particular Core commands that can't be modeled under these
> restrictions.

Nobody here raised the commands that really cannot be modelled in a
param-spec ("if" being the most prominent example).  All the commands
raised so far do make sense as describable with a param-spec.

My dream of tip-457-and-beyond would have been that even all C-coded
commands would eventually get a means to describe their interface as a
param-spec, retrievable by [info args ...], and only very few of them
(like "if") would fall back to "args".

There is of course a chasm between param-specs that are merely unambiguous
for binding, and those that are even unambiguous at compile-time before the
values are known. Even with a perfectly restricted system of compile-time
predetermined param-specs there are still cases (around {*}-expansion, or
with a "--" potentially introduced by a subst), that cannot be compiled,
anyway, but can still be legally called.

Trying to tie it down to prevent not-compileable invocations seems like
a "wag-the-dog" case to me.

> We can continue in this vein to add 'args', and still keep the
> parameters in any order. Optional parameters can be filled from left
> to right, and then anything left over can be assigned to 'args'.

I certainly prefer this approach to branch aspect-tip288.

> There is a fairly simple loop that can do this assignment in one pass.
> I show it in pseudo-Tcl as Figure 1.

Apart from the omission of the necessary "sub-zero" guard for args,
it's essentially what I've done in my procx.


> 4. Keyword parameters
> ---------------------
> For clarity, I shall use 'noun', 'adverb', 'preposition' and 'object'
> to describe these components.

Splendid!  Not sure about the "nouns", though. Are these the required
positionals, versus "object" the defaulted ones?

> The parser should know when an adverb or preposition is expected and
> simply be able to examine what word it is, ...
> As a corollary to this rule, we cannot have a varying number of
> arguments before the keyword parameters begin.

This corollary is not entirely true: given a contrived proc like:
  proc log {message {qualifier {}} {qualopts -name {...}}} { ... }
then qualifier options could only be passed after a qualifier.
This is *almost* the same thing as having a series of defaulted
params in current proc: to provide a value for a latter one, one
must provide values to all former ones.

There is, however, an important difference between *required* objects
and *required* adverbs/prepositions - something I tried to solve in
procx and consider a failure: the point being that one doesn't know
how much to "reserve" for required named params that follow later on.

While optional adverbs and prepositions (which they usually are,
unless explicitly "-required 1") can unambiguously follow optional
positional params, they shouldn't follow "args" (that's part 2 of
procx's problem).  The reason, here again, is that "args" just cannot
know how many of the arguments to leave back for those.


> Clearly, the keyword parameters must follow next, in a block.  The
> Unix command-line practice of allowing non-keyword arguments
> interspersed among the keyword arguments has nothing to recommend it
> to Tcl.

Most Unix commands stop accepting options once the first non-option
or "--" is encountered. (IIRC that's a feature of the usual getopt()
implementations.) Only rare commands accept further options
interspersed with objects.

I feel "guilty" for having brought in the possibility of multiple
blocks of named params, and I did it only in the light that it would
be a dead giveaway. The "cases" I had in mind involved separating
multiple blocks of named params by literally given subcommands,
which even a compiler could have got right, but it's not worth
jumping through loops if that would actually be necessary.

> Deprecated alternative: non-hyphen args
> ---------------------------------------
> We could also adopt the rule that if an argument without a leading
> hyphen is encountered where a keyword is expected, that ends the
> keyword arguments. Core commands such as [puts] follow this rule.
> Note, however, that adopting this rule means that we are requiring
> deep inspection of the parameter data.

This deep inspection of the parameter data isn't an issue if the data
is constrained by the proc's semantics.  Who cares if a channel name
or a widget path name is inspected for a leading dash?

The compiler cannot know about these, but in cases where the values are
given literally it shouldn't have a problem at all, and in other cases
like that of puts with a channel, the compiler just has to sigh and emit
an "invokeStk #" (or whatever equivalent in quadcode).

An even worse paradigm, which new param-specs don't (and really
shouldn't) follow, is that of abbreviating option names down to the
shortest unique prefix.  Instead, a proc/command should specify some
abbreviations that make sense, and those will be preserved even if it
later grows a new option with the same prefix. TIP 457 does this correctly,
and I send a "no!" to those who request this utterly broken unique-prefix
bungle for TIP 457 named params.
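For illustration: Tcl 8.6's [tcl::prefix match] implements exactly that unique-prefix matching, and shows how a once-valid abbreviation breaks as soon as an option with the same prefix is added later (the option names here are made up for the example):

```tcl
set options {-force -verbose}
puts [tcl::prefix match $options -f]   ;# prints -force

lappend options -format
# "-f" is no longer a unique prefix, so the same call now fails
# with an "ambiguous option" error:
catch {tcl::prefix match $options -f} msg
puts $msg
```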

> Rejected alternative: multiple tranches of keyword parameters
> -------------------------------------------------------------
Ok.

> 5. Additional notes
> -------------------
> One fairly nice feature of all this is that it doesn't require
> steps of 'consume args from the end' or 'scan the argument list
> for an occurrence of ...'.  It can function in one trip through the
> actual arguments, from left to right, binding variables as it goes.
Yes!

> The syntax winds up being something like:
>     [command] [mandatory arg]* ([adverb] | [prepositional phrase])* \
>         '--' ([mandatory arg] | [optional arg] | 'args')*

Rather:
     [command] [mandatory arg]* [optional arg]*  \
         ( ( [adverb] | [prepositional phrase] )+ '--' )? \
         ([mandatory arg] | [optional arg] | 'args')*

I'm not going to "fight" for the extra [optional arg]*, but it is
technically unambiguous *even if used with non-literals*.

The '--' obviously shouldn't even be allowed *unless* there exist
named params.

The addendum about there being only one 'args', and about '--' being
optional in some cases depending on the last part, of course still applies.

> From the discussions that I've seen go by on the mailing list, I
> suspect that what I describe in this post is fairly close to what has
> actually been implemented so far.

Not quite my understanding of it.  Mathieu's implementation is a
tad stricter than that, in that it does not go into the TIP 288 domain
beyond those fixed required params after options.  It doesn't allow
'args' to be followed by other params while still being treated specially.

> I welcome corrections, of course.
well...

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Tcl-Core mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/tcl-core

Re: TIP 457: [eatargs] vs [proc] extensions;

Kevin Kenny-6
On Thu, May 25, 2017 at 9:31 AM, Andreas Leitgeb <[hidden email]> wrote:
> The loss of introspection isn't the only thing to mourn about. There's also:
>  - loss of being able to inject code at start of a body.
>  - loss of the simple pattern "proc-name proc-params proc-body", towards
>      (bungle-like) "args" as proc-params and specify the real params
>      inside the body.  That grossly violates ergonomics. This is most
>      likely what FA meant behind their wording "lame".

I'm willing to entertain the idea of having extended versions of
[proc], [method] and [apply] that have the extended parameter
specifications built in. Nevertheless, I've been convinced that
coroutine invocations, at least, require that the [eatargs]
functionality be available stand-alone. There just isn't any place to
hang the information about what [yieldto] is expecting for its next
set of arguments.

Given that [eatargs] has to be available for that function, my natural
inclination is to avoid the syntactic sugar of decorating [proc],
[method] and [apply] with the arg specs, in favor of leading off with
[eatargs]. I started off favoring the approach of having the arg specs
on the procedure-defining commands, but once I realized that [eatargs]
is needed in other contexts, I reconsidered. It's not that strong a
preference, and putting decorated params back in wouldn't be a
showstopper for me. Not having a stand-alone [eatargs] available would
be a showstopper at this point.

As far as the injection of code goes, anything that today has similar
functionality (keyword parameters, upvar and uplevel, etc.) puts the
requirement on the instrumentor that the injected code be able to deal
with 'args', with variable names that have not yet been passed to
'upvar', and so on. If I understand you correctly, you're complaining
about new functionality that [eatargs] would not provide, rather than
existing functionality that would be lost.

>> 2. Introspection without an extended [info]
>> -------------------------------------------
>> It occurs to me that we could get most of our introspection back if
>> we adopt Tk's way of doing business - ask a command itself what its
>> syntax is.
>
> Calling a proc to obtain a usage message is convenient in specific
> interactive cases, but far away from a generally available mechanism.
>
> Of all potential future procedures taking an "args" on first line, a
> non-negligible fraction would actually take args as a list of objects
> to deal with.  And debugging/instrumentation tools just cannot rely
> on it.  (Reminds me of my recent attempt at getting a usage from/for
> 'namespace ensemble create' - still had to look for the man page.)
>
> That problem about a correct usage message even for ensemble-members
> appears to me to be dwarfed by the problem of general availability.
>
> PS: auto-checking for certain options in some "eatargs" implementation
>   would induce even worse data-triggered bugs, than those (meanwhile
>   solved!) ones in a previous version of the tip457.

I'll concede this to be the weakest plank in the platform. I really
don't have a good general answer. Perhaps I'm putting too much weight
on introspectability.

>> 3. Combinations of Syntax
>> -------------------------
>> There is a lot of discussion about combinations of keyword parameters,
>> optional positional parameters, and 'args', and about the cost of deep
>> parameter inspection and the risks of injection attacks. When proposed
>> limitations are raised, though, people keep presenting counterexamples
>> of particular Core commands that can't be modeled under these
>> restrictions.
>
> Nobody here raised those commands that really cannot be modelled in a
> param-spec ("if" as the most prominent example).  All the commands raised
> so far do make sense as being describable with a param spec.
>
> My dream of tip-457-and-beyond would have been that even all C-coded
> commands would eventually get a means to describe their interface as a
> param-spec, retrievable by [info args ...], and only very few of them
> (like "if") would fall back to "args".
>
> There is of course a chasm between param-specs that are merely unambiguous
> for binding, and those that are even unambiguous at compile-time before the
> values are known. Even with a perfectly restricted system of compile-time
> predetermined param-specs there are still cases (around {*}-expansion, or
> with a "--" potentially introduced by a subst), that cannot be compiled,
> anyway, but can still be legally called.
>
> Trying to tie it down to prevent not-compileable invocations seems like
> a "wag-the-dog" case to me.

You misunderstand me. I'm not trying to prevent non-compilable
invocations, and some form of 'eval' will always be with us. What I am
trying to do is to make sure that the common static cases will be
compilable. The nastiness that surrounded [switch] is an example,
where code could specify all the -keywords as constants, and still not
be compilable because it was not provable that variable args were not
-keywords. I don't want the latter to become the common case. Falling
back on interpreted code should be the exception rather than the rule.
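For the record, the [switch] case I mean is the familiar one, where the explicit -- is the only thing that keeps a variable match-string from being misread as an option:

```tcl
set value -exact   ;# data that happens to look like an option
# Without the --, [switch $value ...] would try to parse $value as
# an option; with it, the parse is unambiguous (and compilable).
switch -- $value {
    -exact  { puts "matched the literal string -exact" }
    default { puts "no match" }
}
```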

> Apart from the fact that you omitted the necessary "sub-zero"-guard for
> args, it's essentially what I've done in my procx.

OK, it was a quick hack. Feel free to correct the details. At least the
examples of the behaviour are correct, aren't they?

>> 4. Keyword parameters
>> ---------------------
>> For clarity, I shall use 'noun', 'adverb', 'preposition' and 'object'
>> to describe these components.
>
> Splendid!  Not sure about the "nouns", though. Are these the required
> positionals, versus "object" the defaulted ones?

I had intended 'nouns' to encompass required, optional, and 'args';
'objects' are associated with the prepositions. (I'm sufficiently
rusty on German grammar not to be able to translate the
technical term, and the corresponding concept in German
confuses me in any case: I can manage to distinguish
'ich lege das Buch auf deN Tisch' from 'das Buch liegt
auf deM TischE', but phrases like 'außerhalb VOM Garten'
versus 'außerhalb DES GartenS' or 'trotz deM Wetter'/
'trotz deS WetterS' strain both my memory and my comprehension.)

>> The parser should know when an adverb or preposition is expected and
>> simply be able to examine what word it is, ...
>> As a corollary to this rule, we cannot have a varying number of
>> arguments before the keyword parameters begin.
>
> This corollary is not entirely true: given a contrived proc like:
>   proc log {message {qualifier {}} {qualopts -name {...}}} { ... }
> then qualifier options could only be passed after a qualifier.
> This is *almost* the same thing as having a series of defaulted
> params in current proc: to provide a value for a latter one, one
> must provide values to all former ones.

Yes, we could contrive such a thing. I'd be hard-put, though, to
come up with a general set of rules that would make it unambiguous.
Would the qualifier be interpreted as such only if it doesn't begin
with a hyphen? Only if it doesn't match one of the qualopts?
Can a qualifier have text that matches one of the qualopts, and
how would you specify that?

> There is, however, an important difference between *required* objects
> and *required* adverbs/prepositions - something I tried to solve in
> procx and consider failed: the point being that one doesn't know
> how much to "reserve" for required named params following later on.

Exactly. Stu has also pointed out on the Chat, at least, that
it is useful to allow multiple instances of the same preposition,
with later instances overriding earlier ones. That allows for
defaults to be supplied at the start of a command line with
a later interpolation of {*}$args to override them. So even if
all your keyword parameters are -required 1, you still don't
know how many you have.
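A sketch of that idiom with today's Tcl (the paint command and its options are invented for the example; [array set] supplies the later-instances-override semantics):

```tcl
proc paint {args} {
    array set opt {-color black -width 1}   ;# defaults come first
    array set opt $args                     ;# later instances override
    return "color=$opt(-color) width=$opt(-width)"
}
puts [paint -color red]                ;# prints color=red width=1
set userargs {-color blue -width 3}
puts [paint -color red {*}$userargs]   ;# prints color=blue width=3
```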

>> Clearly, the keyword parameters must follow next, in a block.  The
>> Unix command-line practice of allowing non-keyword arguments
>> interspersed among the keyword arguments has nothing to recommend it
>> to Tcl.
>
> Most Unix commands stop accepting options once the first non-option
> argument or "--" is encountered (IIRC that's a feature of the usual
> getopt() implementations). Only rare commands accept further options
> interspersed with objects.

The toolchain commands are the chief offenders here. 'gcc', 'ld'
and similar commands allow -options and file names to be
interspersed helter-skelter, and even are sensitive to the ordering
of the -options. I want that sort of thing to be Out Of Scope
for a standard args-parsing procedure.

> I feel "guilty" for having brought in the possibility of multiple
> blocks of named params, and I did it only in the light that it would
> be a dead giveaway. The "cases" I had in mind involved separating
> multiple blocks of named params by literally given subcommands,
> which even a compiler could have got right, but it's not worth
> jumping through hoops if that would actually be necessary.

Don't feel guilty at all. In a proposal like this, it's important to
consider all the cases. We can decide that certain cases are
out of scope, but that should be a conscious decision, and
we should have a clear specification of what the eventual
implementation will and will not address.

>> Deprecated alternative: non-hyphen args
>> ---------------------------------------
>> We could also adopt the rule that if an argument without a leading
>> hyphen is encountered where a keyword is expected, that ends the
>> keyword arguments. Core commands such as [puts] follow this rule.
>> Note, however, that adopting this rule means that we are requiring
>> deep inspection of the parameter data.
>
> This deep inspection of the parameter data isn't an issue if the data
> is constrained by the proc's semantics.  Who cares, if a channel name
> or a widget path name is inspected for a leading dash?

I should perhaps have said 'disfavoured' rather than 'deprecated'.
I'm willing to permit it, but with the clear warning that it lacks full
generality and invites errors.

Inspecting a widget name or a channel name for a
leading hyphen is fine. But the temptation to use similar syntax
with file names, command and variable names, and data from
external sources will be almost irresistible if this becomes
a popular approach. We're all aware that those things can indeed
contain leading hyphens, legitimately. We do not want the
correct interpretation of a command to be jeopardized by
an unusual but legal name for something. The extra cost
of having to write something like '-fromfile $filename'
in place of '$filename' will sometimes be the price of
getting an unambiguous and safe parse.
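To make the hazard concrete (the -fromfile option above is of course hypothetical): a perfectly legal file name can defeat any "first non-hyphen ends the options" rule, since a naive keyword scanner classifies purely by the leading character:

```tcl
set filename "-old.txt"   ;# legal, if unusual, file name
# The scanner swallows the file name as an (unknown) option:
foreach arg [list -verbose $filename] {
    if {[string match -* $arg]} {
        puts "option: $arg"
    } else {
        puts "value:  $arg"
    }
}
# Writing "-fromfile $filename" instead leaves no room for doubt.
```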

> The compiler cannot know about these, but in cases where the values are
> given literally it shouldn't have a problem at all, and in other cases
> like that of puts with a channel, the compiler just has to sigh and emit
> an "invokeStk #" (or whatever equivalent in quadcode).

See above. I'm less concerned with the ability to compile than I am
with the ability to interpret the command syntax correctly.

> An even worse paradigm, which new param-specs don't (and really
> shouldn't) follow, is that of abbreviating option names down to the
> shortest unique prefix.  Instead, a proc/command should specify some
> abbreviations that make sense, and those will be preserved even if it
> later grows a new option with the same prefix. TIP 457 does this correctly,
> and I send a "no!" to those who request this utterly broken unique-prefix
> bungle for TIP 457 named params.

Amen.

>> The syntax winds up being something like:
>>     [command] [mandatory arg]* ([adverb] | [prepositional phrase])* \
>>         '--' ([mandatory arg] | [optional arg] | 'args')*
>
> Rather:
>      [command] [mandatory arg]* [optional arg]*  \
>          ( ( [adverb] | [prepositional phrase] )+ '--' )? \
>          ([mandatory arg] | [optional arg] | 'args')*
>
> I'm not going to "fight" for the extra [optional arg]*, but it is
> technically unambiguous *even if used with non-literals*.

I already asked above. I'm not certain that I follow your argument
about the ambiguity. It strikes me that you haven't ruled out all the
evil cases.

In any case, anywhere that mandatory and optional
args can appear together, they can be intermixed freely.
As long as we know how many optional args we need to
fill, we can come up with the mapping unambiguously.
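A sketch of that counting rule (this is not the TIP 457 implementation, just an illustration of the one-pass mapping): count the defaulted params, work out from the actual argument count how many of them get values, then bind left to right.

```tcl
# spec: list of {name} (required) or {name default} (optional) entries
proc bindPositionals {spec actual} {
    set nOpt 0
    foreach p $spec { if {[llength $p] == 2} { incr nOpt } }
    set nReq [expr {[llength $spec] - $nOpt}]
    set spare [expr {[llength $actual] - $nReq}]
    if {$spare < 0 || $spare > $nOpt} { error "wrong # args" }
    set bound {}
    foreach p $spec {
        if {[llength $p] == 1 || $spare > 0} {
            # consume an actual argument
            lappend bound [lindex $p 0] [lindex $actual 0]
            set actual [lrange $actual 1 end]
            if {[llength $p] == 2} { incr spare -1 }
        } else {
            # fall back to the declared default
            lappend bound [lindex $p 0] [lindex $p 1]
        }
    }
    return $bound
}
puts [bindPositionals {a {b 9} c} {1 2}]    ;# prints a 1 b 9 c 2
puts [bindPositionals {a {b 9} c} {1 2 3}]  ;# prints a 1 b 2 c 3
```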

> The '--' obviously shouldn't even be allowed *unless* there exist
> named params.

Yeah.

> The addendum about only one 'args' and about '--' being optional in
> some cases depending on last part of course still applies.

Right. It's important to support at least the syntax of

    verb -adverb -preposition object ... noun

without needing the -- to introduce the nouns.
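Several core commands already have this shape; [lsort], for instance, takes its adverbs and prepositional phrases first and the noun last, with no -- required:

```tcl
set pairs {{b 2} {a 10} {c 1}}
# -integer is an adverb, "-index 1" a prepositional phrase,
# and $pairs the trailing noun.
puts [lsort -integer -index 1 $pairs]   ;# prints {c 1} {b 2} {a 10}
```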


Re: TIP 457: [eatargs] vs [proc] extensions;

Colin McCormack-3
As a general principle, I prefer not to replace things but to add new things.

As I understand it (being unable to oil-paint or watercolour), in watercolour you have very few opportunities to modify the image once you commit to it, whereas in oils you start with a general/abstract background and overpaint it with more details - oils are necessarily layered.  I like being able to start with a broad and general thing, then to add more specificity to it.

In this sense, I do not think redefining ::proc is advisable.  There is value in being able to say something simple, knowing that it is simple.

I, personally, don't have the hubris I think it requires to say of an arg protocol not only "I think this arg protocol is the best possible (or achievable)" but also "I think it is the best for *everyone*".  Saying that requires considerable hubris, or (as an alternative) a deep desire to prevent people from doing things I didn't expect them to do.

When creating a parallel set of command-generating-commands with precisely the same semantics (except for a variant arg protocol) poses *no* cost to the proponents who truly believe this variant protocol is a general good, and who wish to use institutional power to impose it on those who have contracted to be dictated to (employees of FA, e.g.), and when no technical justification has been presented for not leaving others undisturbed, even if that be in their world of a degenerate ::proc arg protocol, then I can neither understand, respect, nor accept the will to redefine ::proc.

I have tried to understand it, by asking for any kind of real justification, but all I have gleaned is "avoidance of lame" (where "lame" is undefined), "it's justified because that's what we decided to do", (tacitly) "because that's what the bounty or some private conversation specified", and "it's up for voting, and voting justifies anything."

I have failed to understand it.
Because I can't find any way to understand it, I cannot respect it.
Because I can't respect it, I can't accept it.  FWIW.

Colin (trying to be only as prolix as necessary, but not sesquipedalian.)


Re: TIP 457: [eatargs] vs [proc] extensions; introspection; order of parameters; ambiguity

Donal K. Fellows-2
In reply to this post by Colin McCormack-3
On 25/05/2017 09:33, Colin McCormack wrote:
> I'm sorry my screeds are so verbose.  I'm happy one of my valid points
> rose above the din.

I think there are a few key points:

0. The general need for improvements in this area is reasonably
established (I believe this was covered by presentations or discussion
at Tcl 2016; it's unfortunate that not everyone was there, but that's
how it goes). While I'm personally of the opinion that code with large
numbers of formal parameters to a procedure is likely to be a problem
anyway (*excluding* "here's a list of things/coordinates to work on" and
"here's a list of options that modify things", both of which seem to be
manageable idioms), That's Not My Call To Make.

1. Whatever we do, we should not force people's code to be ambiguous.
(This prevents all sorts of subtle bugs, and is something that Tcl has
got wrong in the past in the [switch] command among other places. XOTcl
code has this sort of thing too with its option processing. No more
landmines! No forcing people to introduce landmines!)

2. We shouldn't force people to bear the cost of the complex argument
processing if they don't need the capabilities. (Argument processing is
a very hot part of Tcl's code.)

3. Any functionality for [proc] and procedures should also be available
to other command-creating-commands where meaningful. (I don't want to
have to explain more rules to users than required. ;-))

4. Providing mechanisms for people to easily both make *and* use
commands with these sorts of things is a good thing. The need for an
easy mechanism for making them is obvious, but the ease of use (i.e.,
for discovery by both sensible error messages and introspection) is even
more important.

I'm not expressing any particular preference for any particular
mechanism for achieving a solution. This is mostly because I don't have
a horse in this race. :-) Also, I'm taking ages to respond to email. :-(

Donal.


Re: TIP 457: [eatargs] vs [proc] extensions;

Andreas Leitgeb
In reply to this post by Colin McCormack-3
On Fri, May 26, 2017 at 02:18:35AM +0000, Colin McCormack wrote:
> As a general principle, I prefer not to replace things but to add new
> things.

Apart from the shortcomings of that painting metaphor, you'll probably
forever stick to your "no justifications seen" claim, even if
justifications for the specific target of the TIP have been delivered:

* Having two proc commands (differing only in that the new one can do
   more) creates confusion and raises the question: why not let the
   original one do the extra stuff and avoid a second command?

* Having some "eatargs" command creates confusion about having to
   provide a boilerplate dummy arg-spec and providing the real arg spec
   somewhere else.

In both cases, the outcome is confusion about obviously incoherent design.

So much for why the TIP strives to improve the existing commands, which
to me seems the main point of your resistance.

The justification for why we (FA, mlafon & me) even see a need for a change
at all is that we want Tcl procedures to be as expressive as most of the
standard commands, but a scripting language depends much more on powerful
building blocks than a language like C, where such building blocks
(helper functions) are mere convenience or avoidance of code duplication.

Btw., if it were about some special purpose argument parser for a specific
domain, then an eatargs-like approach or an alternative proc would be
fine: it wouldn't be a single design then and thus "incoherency" just
wouldn't apply.

This TIP just isn't about adding "yet another ..."; it is about *seeking*
a consensus on what could be the one "sanctifiable" enhancement that is
worthy of the status "that's how Tcl is", rather than merely "that's one
of the zillion things one can do in Tcl".

The search is still in progress. Current "construction sites" are whether
args should be special also in non-last position (tip-288), and whether
the param spec shall be expressive enough to describe even most of the
existing historical sins among the standard commands, or whether it
should rather be pure to the point of sacrificing even some safe uses.
(That's a slightly more formal description of my previously sloppily
used term "nanniness".)


If such a consensus isn't eventually achieved, or if the consensus isn't
accepted, then nothing happens. Tcl would lose a chance for improvement
or avert impending destruction... depending on perspective.
Parts of the work could survive as "something that one can do in Tcl".


> In this sense, I do not think redefining ::proc is advisable.  There is
> value in being able to say something simple, knowing that it is simple.

And you want to imply that with an enhanced proc you *couldn't* say
something simple anymore, or that you'd be deprived of the knowledge
that the said simple thing is still simple?


> I, personally, don't have the hubris I think it requires to say, of an arg
> protocol (say), not only do I think this arg protocol is the best possible
> (or achievable), but I think it is the best for *everyone*.

I (and apparently mlafon) do have that slightly different hubris: we
believe that Tcl can be improved for everyone, and we are working on
arriving at such an improvement, finally leaving the decision about it
to the TCT. Nothing special here. It has happened a lot of times already.
For example, a -justify option was recently added to (Tk's) listbox,
rather than adding a new "justifyablelistbox" widget whose only
difference is the added -justify option...


TIP 457: [eatargs] vs [proc] extensions; A third way?

fredericbonnet
In reply to this post by Kevin Kenny-6
Howdy Tcl'ers,

I've followed the latest controversy around TIP #457 and came to the conclusion that there is a third way between the two main approaches:

- extending [proc] (TIP's approach)
- delegating advanced args parsing to [eatargs] (Colin and others' approach)

I don't want to summarize all the pros and cons of these approaches, but here is a short, non-exhaustive list of the main arguments **from my own point of view**:


## [proc] extension ##

Pros:

- consistency for all procs
- performance (for the argument parsing proper)
- introspectable

Cons:

- added complexity to the [proc] man page
- rigid argument style, lack of freedom
- monolithic, not reusable by other commands
- concerns about the overhead to a critical language building block
- bytecode compilation

## [eatargs] ##

Pros:

- extensible
- reusable
- leaves [proc] alone
- freedom and agnosticism wrt. argument styles 

Cons:

- not introspectable
- conflicts with other approaches (e.g. code injection)
- performance (needs an extra call)

 

This morning I had the following illumination about the current [proc] and its special argument *args*: it happens that *args* is the only proc argument for which one can't provide a default value. When given, it is simply ignored. Consider this:

% proc foo {{a 1} {b 2} {args 3}} {
puts [list $a $b $args]
}
% foo
1 2 {}
% foo bar baz sprong fred
bar baz {sprong fred}
% info args foo
a b args
Nevertheless, introspection works as expected!
% info default foo a v; set v
1
% info default foo b v; set v
2
% info default foo args v; set v
3

This behavior is not explicitly stated in the [proc] man page, so I'm not sure whether it's by design or a bug.

So, what if we made *args* a bit more special? Let's say that instead of simply ignoring the *args* specifier's second field, [proc] would expect an arguments processing command (let's call it *protocol*)? So [proc] syntax would become something like this:

proc name {arg1 arg2 {arg3 default3} ... {args protocol}} body
Today that extra *protocol* is silently ignored. When empty or unspecified, [proc] behavior would be unchanged. When non-empty, *protocol* would give the name of a command (or a prefix script) that would pre-process $args before executing the proc body. Here are the pros:

- unlikely to break existing code
- negligible overhead for regular procs (just a test at creation/call time)
- positional arguments are explicit and unambiguous
- extensible
- introspectable
- reusable: the same protocol command could be shared by several procs
- generic and agnostic: one could provide several protocol commands to accommodate different args-passing styles
- available at both Tcl and C level: protocol commands could be written in Tcl or C, and C procs could call the same protocol command as Tcl procs.
- compatible with code injection

And the cons:

- Looser integration between [proc] and its argument parsing will make [proc] definitions more verbose compared to the super-[proc] approach (but on par with explicit [eatargs])
- Feel free to elaborate


Back to TIP #457: instead of having to choose between super-[proc] and [eatargs] approaches, let's split the TIP:

- One TIP adding {args *protocol*} support to the existing [proc]
- One TIP adding [eatargs] (please find a better name!) to the core as the 'standard' argument protocol for Tcl procs.

You can then reproduce the current TIP approach like so:

% proc name {{args eatargs}} body

Last but not least, anybody is free to implement whatever style they prefer: for example, write a generic GNU-style protocol proc, provide it as a package, share it across several procs/methods/etc... You could even switch a proc's protocol at runtime without affecting its body.
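[Editorial sketch] For illustration, the {args protocol} idea can be roughly emulated in present-day Tcl. The names procp and kvProtocol below are hypothetical, and a real implementation would dispatch to the protocol inside [proc] itself rather than via a generated prologue.

```tcl
# Rough present-day emulation of the {args protocol} idea: procp creates
# a proc whose body first hands $args to the protocol command, which
# returns name/value pairs that are installed as local variables.
proc procp {name spec protocol body} {
    set prologue "foreach {__k __v} \[$protocol \$args\] { set \$__k \$__v }\n"
    proc $name [concat $spec args] $prologue$body
}

# A trivial protocol: interpret args as -key value pairs, stripping the
# leading dash to obtain the local variable name.
proc kvProtocol {argv} {
    set out {}
    foreach {k v} $argv { lappend out [string range $k 1 end] $v }
    return $out
}

procp greet {who} kvProtocol { return "$greeting, $who" }
# greet World -greeting Hello  => "Hello, World"
```

Swapping kvProtocol for another parser changes the proc's argument style without touching its body, which is the crux of the proposal.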

Comments welcome.

Fred

Re: TIP 457: [eatargs] vs [proc] extensions; A third way?

Peter da Silva-2

What’s the advantage of this over plan C:

 

* extend proc per tip #457

* also add a standalone parser function that uses the same engine

 

In both cases it’s extending proc.

 

The only downside is that if someone comes up with a better way of extending proc, they can’t easily do it: even if someone in the future manages to come up with something better than TIP#457 (which is, by the way, designed for extensions), it will have to swim upstream against the TIP#457 syntax in the parser that everyone is already using.

 

From: "[hidden email]" <[hidden email]>
Date: Friday, May 26, 2017 at 7:56 AM
To: Tcl Core Mailing List <[hidden email]>
Subject: [TCLCORE] TIP 457: [eatargs] vs [proc] extensions; A third way?

 

Re: TIP 457: [eatargs] vs [proc] extensions;

Andreas Leitgeb
In reply to this post by Kevin Kenny-6
On Thu, May 25, 2017 at 12:35:36PM -0400, Kevin Kenny wrote:
> As far as the injection of code goes, anything that today has similar
> functionality (keyword parameters, upvar and uplevel, etc.) puts the
> requirement on the instrumentor that the injected code be able to deal
> with 'args', with variable names that have not yet been passed to
> 'upvar', and so on. If I understand you correctly, you're complaining
> about new functionality that [eatargs] would not provide, rather than
> existing functionality that would be lost.

So far:
  proc foo {x y args} {
     # injected code at beginning sees $x, $y, $args
     # actual body (scanning $args for options, then doing whatever)
     # ...
  }

With eatargs:
  proc foo args {
     # injected code at beginning sees only $args
     # actual body (declaring the arg spec, then doing whatever)
     eatargs {x y {o1 -name o1} {o2 -name o2} ...}
     # ...
  }
In both cases, the injected code doesn't see the options, but
in eatargs case it typically wouldn't anymore see $x and $y.

Splitting up the params seems like a way out:
  proc foo {x y args} {
     # injected code at beginning sees $x, $y, $args
     # actual body (declaring only the options, then doing whatever)
     eatargs {{o1 -name o1} {o2 -name o2} ...}
     # ...
  }
but I have no idea how well that would work with compilation. I seem to
remember your suggestion that the whole param spec had better be served
to eatargs.

So far, my primary use of instrumentation was to turn a &x into an upvar,
which would no longer be necessary with tip 457 either way, but I'd expect
that there are similar (but more domain specific) uses of injecting code
for particular parameter name patterns.
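[Editorial sketch] That &x-to-upvar instrumentation can be illustrated with a small proc-wrapper (the name procup and the exact rewriting are my assumptions, not Andreas's actual procx code):

```tcl
# Toy instrumentor: any formal parameter written as &name is rewritten
# to a hidden positional parameter, and an upvar prologue is injected
# so that $name aliases the caller's variable.
proc procup {name spec body} {
    set prologue ""
    set cleanSpec {}
    foreach p $spec {
        if {[string match &* $p]} {
            set v [string range $p 1 end]
            # the hidden param _$v receives the caller's variable name
            append prologue "upvar 1 \$_$v $v\n"
            lappend cleanSpec _$v
        } else {
            lappend cleanSpec $p
        }
    }
    proc $name $cleanSpec $prologue$body
}

procup double {&x} { set x [expr {$x * 2}] }
# set n 5; double n   ;# n becomes 10
```

This is exactly the kind of prologue injection that sees $x and $y today but would not see past an [eatargs] call.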


> I'll concede this to be the weakest plank in the platform. I really
> don't have a good general answer. Perhaps I'm putting too much weight
> on introspectability.

Well, Colin seemed to claim that introspection was solvable for the
eatargs approach, but both Alex and you are now resorting to doubting
the sweetness of these out-of-reach grapes instead ;-)


>> My dream of tip-457-and-beyond would have been, that even all C-coded
>> commands would eventually get a means to describe their interface as a
>> param-spec, retrieveable by [info args ...] and only very few of them
>> (like "if") would fall back to "args".
(answering myself)
Btw., even "if" could at least specify its API as {condition args}


> What I am trying to do is to make sure that the common static cases
> will be compilable. The nastiness that surrounded [switch] is an example,
> where code could specify all the -keywords as constants, and still not
> be compilable because it was not provable that variable args were not
> -keywords.  I don't want the latter to become the common case.

I do understand the preference. I merely estimate that, in practice, if
users can't specify their syntax with options while omitting a '--' they
perceive as redundant, then they'd just fall back to doing it their old
way, which is likely even less safe and less compiler-friendly.

>> Apart from [...], it's essentially what I've done in my procx.

I've got to clarify that: the part about reserving arguments for
optional positional params after args is actually not yet in the
published version of my procx (but only in my head). I hope/expect to
get to it this weekend.


> I had intended 'nouns' to encompass required, optional, and 'args';
> 'objects' are associated with the prepositions.

Ok, thanks for clarification.

> (I'm sufficiently rusty on German grammar ...

Your being not-rusty on Tcl more than compensates for this ;-)

If you're actually curious about the German grammar part, ask me
in a mail off-list.


> > This corollary is not entirely true: given a contrived proc like:
> >   proc log {message {qualifier {}} {qualopts -name {...}}} { ... }
> > then qualifier options could only be passed after a qualifier.
> > This is *almost* the same thing as having a series of defaulted
> > params in current proc: to provide a value for a latter one, one
> > must provide values to all former ones.
> Yes, we could contrive such a thing. I'd be hard-put, though, to
> come up with a general set of rules that would make it unambiguous.

The rules are imho simple, as I quick-answered in the chat:
If there is an argument for the qualifier, then it *is* the qualifier.
If there are further arguments, they might be an option.

The more interesting case comes up when the option is explicitly tagged
as required. Then it still doesn't get any of the arguments reserved like
the required positionals do. The semantics for required+named params is
merely that it will complain if - after all is said and bound - the
required param is left without a value.

There is a "hard reservation" (for required positionals), which has the
power to overrule even the interpretation of an arg as an option, and a
"soft reservation" (only for optional positionals after 'args'), which
has the power of cutting back the greediness of 'args' -- [:edited:] and
maybe, for the sake of binding predictability, even of options.

Named params never get reservations, but will complain about
omission if required and not given.


> Don't feel guilty at all. In a proposal like this, it's important to
> consider all the cases.

Like the  [optional arg] [ Adverb | Preposition ]+  pattern...

> We can decide that certain cases are out of scope, but that should
> be a conscious decision, ...

Yes. It seemed to me that you misjudged the above as potentially ambiguous.

If (merely for sake of discussion) we dropped named args, and resorted
to plain tip-288, leaving it to user code to extract options
from $args, then they'd still potentially have optional arguments
before 'args' (and thus before options) and coincidentally with exactly
the same resulting semantics w.r.t preceding optional positionals that
I tried to explain.

If we modeled named arguments as being constrained to a tip-288 'args'
part, then the optional positionals would even be predetermined - though
at the "semantic cost" that even optional positionals to the right of
the named params would get binding priority over the named params.

> > One even much worse paradigm, that new param-specs don't (and really
> > shouldn't) follow is that of abbreviating option names down to the
> > shortest unique prefix. [...]
> Amen..

I wonder if the helper function that handles options on the C side could
be changed to solve it retroactively for commands... at least by Tcl 9+


> I already asked above. I'm not certain that I follow your argument
> about the ambiguity. It strikes me that you haven't ruled out all the
> evil cases.

The only "evil case" I cannot rule out is that of the user *expecting*
his literal argument to be used as a flag and the optional arg skipped.


Re: TIP 457: [eatargs] vs [proc] extensions;

Alexandre Ferrieux
On Fri, May 26, 2017 at 5:20 PM, Andreas Leitgeb <[hidden email]> wrote:
>> I'll concede this to be the weakest plank in the platform. I really
>> don't have a good general answer. Perhaps I'm putting too much weight
>> on introspectability.
>
> Well, Colin seemed to claim that introspection was solveable for the
> eatargs approach, but both Alex and you are now resorting to doubting
> the sweetness of these out-of-reach grapes, instead ;-)

Well, it is a question of priorities, and varied return on investment.
For those who absolutely depend on Nagelfar or similar to statically
isolate call sites where the signature is violated, then indeed with
[eatargs] an extra investment is needed from Nagelfar, to reach out
into the proc's body to find [eatargs]. Note this is doable, since
Kevin's and Donal's quadcode is ready to do the same. And Nagelfar
will also need an update to cope with TIP457...

Then there's the harder question of "injecting code at the beginning".
Can you clarify the use case ?
Indeed if you inject [if {![llength $args]} {error "At least one
option please"}], then the semantics of the [eatargs]-calling proc
will be wildly different from that of the TIP457-based proc.
Then it will be completely sound for Nagelfar or Quadcode to decide
that "no, the signature is not simply ARGSPEC", hence their failure to
find the [eatargs ARGSPEC] at the beginning is no felony.

See how the taste of those grapes depends on how much of the stem and
branch and leaves you swallow with them ;-) ?

> If (merely for sake of discussion) we dropped named args, and resorted
> to plain tip-288, leaving it to user code to extract options
> from $args, then they'd still potentially have optional arguments
> before 'args' (and thus before options) and coincidentally with exactly
> the same resulting semantics w.r.t preceding optional positionals that
> I tried to explain.
>
> If we modeled named arguments as being constrained to a tip-288 'args'
> part, then the optional positionals would even be predetermined - though
> at the "semantic cost" that even optional positionals to the right of
> the named params would get binding priority over the named params.

Yep, that's the idea of a refinement that aspect and I simultaneously
came up with yesterday:

 - let proc just allow for TIP288, or even better, Kevin's improved variant:

      {a {b B} args {c C} d {e E}} ;# Kevin's message details the
priorities: mandatory, then default, then args

 - let [eatargs] chew on args only, spitting back the new value of
args (which may be non-empty if "args" appears in ARGSPEC):

      set args [eatargs $args ARGSPEC]

(
 Note that the latter can be decided to be abbreviated as

      eatargs args ARGSPEC ;# since args is a r/w variable here

 or even more concisely as

      eatargs ARGSPEC ;# since args is the only var we'll need to
reach. Yep, special case. As in proc.

 I'm completely agnostic about which of the three variants should make
it. Kevin prefers the first.
)

So, with this refinement, we'd write:

       proc f {a {b B} args {c C} d {e E}} {
           set args [eatargs $args ARGSPEC-FOR-NAMED-ARGS] ;# Kevin's notation
           ...
       }

This way:

       - the arg protocol for proc remains on the safe side, with only
decisions based on arg counting and no string comparison (hence no
string rep generation). This can simply be an update of TIP288
matching Kevin's spec.

       - named-args aficionados keep the right to use them in their
string processing glory

       - direct introspection of mandatory and default args remains as
usual ; indirect introspection of named args is still doable as just
described.

       - a slight "energy barrier" remains so that people don't
carelessly indulge into named-args ignoring the perf hit

       - [eatargs] candidates can compete for ages, company per
company, group per group. Then if/when a local optimum is found,
shared, and consensual, it can be promoted either to TIP457 form or to
Frédéric's idea of {args ARGSPEC}:

            proc f {a {b B} {args ARGSPEC-FOR-NAMED-ARGS} {c C} d {e E}} {..}
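[Editorial sketch] The first [eatargs] variant above can be mocked up in plain Tcl today. The semantics below are my own toy assumptions, not a proposed implementation: each ARGSPEC entry is {varName -name optionName ?-default value?}, matched -option value pairs are bound in the caller's scope, and the unconsumed arguments are returned.

```tcl
# Minimal mock of:  set args [eatargs $args ARGSPEC]
proc eatargs {argv argspec} {
    set byOpt {}
    foreach entry $argspec {
        set var  [lindex $entry 0]
        set meta [lrange $entry 1 end]   ;# -name / -default pairs
        dict set byOpt -[dict get $meta -name] $var
        if {[dict exists $meta -default]} {
            uplevel 1 [list set $var [dict get $meta -default]]
        }
    }
    set rest {}
    for {set i 0} {$i < [llength $argv]} {incr i} {
        set a [lindex $argv $i]
        if {[dict exists $byOpt $a]} {
            # recognized option: bind its value in the caller's scope
            uplevel 1 [list set [dict get $byOpt $a] [lindex $argv [incr i]]]
        } else {
            lappend rest $a
        }
    }
    return $rest
}

proc f {a args} {
    set args [eatargs $args {{start -name start -default 0} {step -name step -default 1}}]
    list $a $start $step $args
}
# f x -step 2 y  => x 0 2 y
```

Note that this mock decides by string comparison on $args only, leaving the count-based positional protocol of [proc] untouched, which is the point of the refinement.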


-Alex

Re: TIP 457: [eatargs] vs [proc] extensions; A third way?

fredericbonnet
In reply to this post by Peter da Silva-2

> De: "Peter da Silva" <[hidden email]>
> À: [hidden email], "Tcl Core Mailing List"

> What’s the advantage of this over plan C:

> * extend proc per tip #457
> * also add a standalone parser function that uses the same engine

> Both cases it’s extending proc.

From a documentation standpoint it keeps [proc] man page simple (this was one of the major objections).

It's also more easily transposed to other contexts, because [proc] is not privileged compared to other command-creating commands. Applying the {args ?parser?} reform to methods, lambdas, etc. is very natural. Ditto for C-level commands.

The parser function is also freed from any compatibility constraints with the old proc syntax.

Last, it adds a layer of self-description if the parser function names are chosen adequately. E.g.:

    proc p1 {{args gnu-args}} body
    proc p2 {{args tk-args}} body

With [gnu-args] and [tk-args] being generic argument parsers that follow the GNU getopt and Tk configure/cget conventions.

> The only downside is that if someone comes up with a better way of
> extending proc they can’t do it. Even if someone in the future
> manages to come up with something better than TIP#457 (which is, by
> the way, designed for extensions), it will have to swim upstream
> against the TIP#457 syntax in the parser that everyone is using
> already.

Indeed. That's the point behind having a standalone parser instead of something that is closely tied to [proc].

Re: TIP 457: [eatargs] vs [proc] extensions;

Andreas Leitgeb
In reply to this post by Alexandre Ferrieux
On Fri, May 26, 2017 at 09:10:13PM +0200, Alexandre Ferrieux wrote:
> On Fri, May 26, 2017 at 5:20 PM, Andreas Leitgeb <[hidden email]> wrote:
>> Well, Colin seemed to claim that introspection was solveable for the
>> eatargs approach, but both Alex and you are now resorting to doubting
>> the sweetness of these out-of-reach grapes, instead ;-)
> Well, it is a question of priorities, and varied return on investment.

Compared to having the extended argspec in the 2nd proc argument, it
seems pretty much unreachable.

> Then there's the harder question of "injecting code at the beginning".
> Can you clarify the use case ?

My current need for code injection is for injecting "upvar" for each
arg name that starts with a '&'.  That one might no longer be necessary
with some variants of tip-457, but I'd expect that others might use
similar approaches, inserting actions for certain patterns of arg names
that are more domain-specific and less likely to ever make it into the
proposed arg spec options.

> See how the taste of those grapes depend on how much of the stem and
> branch and leaves you swallow with them ;-) ?

The sweet grapes only go back out of reach, as eatargs is being
favoured over proposed 457.

Btw., dkf mentioned introspectability as being important. (Item 4)

> > If we modeled named arguments as being constrained to a tip-288 'args'
> > part, then the optional positionals would even be predetermined - though
> > at the "semantic cost" that even optional positionals to the right of
> > the named params would get binding priority over the named params.
> Yep, that's the idea of a refinement that aspect and I simultaneously
> came up with yesterday:
>  - let proc just allow for TIP288, or even better, Kevin's improved variant:
>  - let [eatargs] chew on args only, spitting back the new value of
>       args (which may be non-empty if "args" appears in ARGSPEC):
>     * eatargs args ARGSPEC ;# since args is a r/w variable here
>     * eatargs ARGSPEC ;# since args is the only var we'll need to reach.

Well, these are not the suggestions I'd favour.

My thought was constraining 457 such that, if named options and args were
both present in a given arg spec, they'd need to be adjacent, with args
on the right side, together with the discussed binding priority, but at
the cost of spoiling some use cases (see the last paragraph of my mail).

This would then boil down to the same binding semantics as your approach,
and it would not exclude positional args from the enhanced arg-specs with
-upvar and later maybe -assume typespec ...

> So, with this refinement, we'd write:
>        proc f {a {b B} args {c C} d {e E}} {
>            set args [eatargs $args ARGSPEC-FOR-NAMED-ARGS] ;# Kevin's notation
>            ...
>        }
> This way:
>        - the arg protocol for proc remains on the safe side,

With some rough edges at the binding priority between options and
trailing optional parameters, which would e.g. spoil it for interfaces
resembling [return] or [regsub].
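To make the spoiled case concrete, here is the shape of a [regsub]-like
interface — leading switches plus an optional *trailing* positional —
parsed by hand in plain Tcl today. mysub is a made-up stand-in, not part
of any TIP; it merely returns what it parsed.

```tcl
proc mysub {args} {
    set all 0
    # Leading switches, terminated by the first non-switch or by "--"
    while {[string match -* [lindex $args 0]]} {
        set args [lassign $args opt]
        switch -- $opt {
            -all    { set all 1 }
            --      { break }
            default { error "unknown switch \"$opt\"" }
        }
    }
    # Mandatory exp/string/subSpec, then one optional trailing varName:
    # exactly the pattern that args-on-the-right cannot express.
    if {[llength $args] < 3 || [llength $args] > 4} {
        error {wrong # args: should be "mysub ?-all? ?--? exp string subSpec ?varName?"}
    }
    lassign $args exp string subSpec varName
    list $all $exp $string $subSpec $varName
}

mysub -all {a+} aaa b result   ;# → 1 a+ aaa b result
```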


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Tcl-Core mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/tcl-core

Re: TIP 457: [eatargs] vs [proc] extensions; introspection; order of parameters; ambiguity

Mathieu Lafon
In reply to this post by Kevin Kenny-6
Hello Kevin,

> Colin makes a valid argument that the functionality is useful beyond
> [proc], [method], and [apply] of λ-forms. In particular, invocations
> of coroutines would conceivably want to parse keyword parameters.

Unless I'm misunderstanding what you mean by 'invocations of
coroutines', the new extended specifiers can be used when calling
[coroutine] to create a coroutine, either with a proc name or a
lambda.

% proc allNumbers {{st -name start -default 0} {inc -name incr -default 1}} {
  yield
  set i $st
  while 1 {
    yield $i
    incr i $inc
  }
}
% coroutine oddValues allNumbers -start 1 -incr 2

> (I'm not quite following the argument about [interp alias], but that really
> doesn't matter.)

Same for [interp alias]: the call can use keyword arguments if the
aliased command supports them.

% proc log {{level -switch {debug}} msg} { ... }
% interp alias {} debug {} log -debug --

-- Mathieu


Fwd: TIP 457: [eatargs] vs [proc] extensions; introspection; order of parameters; ambiguity

Kevin Kenny-6
Oops, accidentally replied privately:

On Mon, May 29, 2017 at 5:26 PM, Mathieu Lafon <[hidden email]> wrote:
> Unless I'm misunderstanding what you mean by 'invocations of
> coroutines', the new extended specifiers can be used when calling
> [coroutine] to create a coroutine, either with a proc name or a
> lambda.

After the coroutine has given control back with [yieldto], it may be
called again with multiple arguments. Colin argues strongly that the
subsequent calls must also be able to parse argument lists.




Re: Fwd: TIP 457: [eatargs] vs [proc] extensions; introspection; order of parameters; ambiguity

Mathieu Lafon
On Mon, May 29, 2017 at 11:29 PM, Kevin Kenny <[hidden email]> wrote:
> After the coroutine has given control back with [yieldto], it may be
> called again with multiple arguments. Colin argues strongly that the
> subsequent calls must also be able to parse argument lists.

Resuming a coroutine through [yieldto] does not permit use of the
basic features of proc arguments (default values, the special 'args',
...). [yieldto] only returns a list; it does not set any variables.

Why would the support of extended specifiers be a requirement in that case?

-- Mathieu


Re: Fwd: TIP 457: [eatargs] vs [proc] extensions; introspection; order of parameters; ambiguity

Kevin Kenny-6
On Mon, May 29, 2017 at 5:42 PM, Mathieu Lafon <[hidden email]> wrote:
> Resuming a coroutine through [yieldto] does not permit use of the
> basic features of proc arguments (default values, the special 'args',
> ...). [yieldto] only returns a list; it does not set any variables.
>
> Why would the support of extended specifiers be a requirement in that case?

Typically, there is a considerable burden on the coroutine writer if he
wants the syntax to include optional arguments, even with Tcl's usual
syntax. (If all that is included are mandatory arguments, then [lassign]
can do the job.) Since, if 457 is accepted, we will have an argument
parser at hand, it would be reasonable for the coroutine writer to want
access to it.
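As a sketch of that burden in plain Tcl 8.6 (no TIP 457): a coroutine
resumed via [yieldto] gets the resumption's arguments back as a raw list
and must parse any optional ones by hand. The -scale option and the
accumulator example are invented for illustration.

```tcl
proc accumulator {} {
    set total 0
    while 1 {
        # Suspend; [yieldto] hands $total back to the caller and, on the
        # next resumption, returns the full argument list of that call.
        set argv [yieldto return -level 0 $total]
        # Hand-rolled optional argument: ?-scale n? value
        set scale 1
        if {[lindex $argv 0] eq "-scale"} {
            set argv [lassign $argv - scale]
        }
        set total [expr {$total + $scale * [lindex $argv 0]}]
    }
}

coroutine acc accumulator   ;# runs up to the first [yieldto], yields 0
acc 5                       ;# → 5
acc -scale 3 4              ;# → 17
```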



Re: Fwd: TIP 457: [eatargs] vs [proc] extensions; introspection; order of parameters; ambiguity

Joe English-2

Kevin Kenny wrote:

> Typically, there is a considerable burden on the coroutine writer if he
> wants to have the syntax include optional arguments, even with Tcl's
> usual syntax. (If all that is included are mandatory arguments, then
> [lassign] can do the job.) Since, if 457 is accepted, we will have an
> argument parser at hand, it would be a reasonable thing for the
> coroutine writer to want access to it.

Anonymous procedures and [apply] should suffice, no?


--JE


Re: TIP 457: [eatargs] vs [proc] extensions; A third way?

Florent Merlet
In reply to this post by fredericbonnet

For now, a proc argument is at most a list of 2 elements: the variable name and, optionally, a default value.


The solution given by Mathieu is to consider a list of more than 2 elements a "TIP #457-defined" argument.

 

You found another way: since args takes no default value, its specifier is at most a list of 1 element, so the argspec could be added there too, as the second element of a list beginning with args.

 

Maybe there is another solution: [proc] takes 3 arguments: its name, its argument list, and its body.

Let's say Tcl would accept defining a proc with only 2 arguments:

proc procname {
   body
}

Let's say that, by default, with this kind of definition, the argument of this proc would be "args", i.e. the variadic list of arguments.

We could also define the proc's arguments somewhere else, for instance with a dedicated command, which would fill in the argument specification of the proc structure:
 
argset procname {
    var -name -v -default 0
    *upwindow -path ::tk
    ...
}

Now, let's say this argset command returns the proc name when it is called with two arguments.
 
We could write :
 
proc [argset procname {
    var -name -v -default 0
    *upwindow -path ::tk
    ...
}] {
   body
}

We could also put the argument declarations of every proc in a separate file to be sourced.

Now, let's say this argset command applies its protocol to the args variable when it is used with a single argument.

proc procname {
   argset {
      var -name -v -default 0
      *upwin -path ::tk
   }
   ...
   end of body
}

This means we could dynamically change a proc's argument handling.
But then: how would [info args] track those changes?

It would also work for coroutines, via [yieldto]:

proc coro {
   while 1 {
      yieldto argset {
          var -name -v -default 0;
          *upwin -path ::tk
      }
      puts $args
   }
}

apply {
   args {
      argset {
          var -name -v -default 0;
          *upwin -path ::tk
      }
      lambda_body
   }
}

Summary:
- Accept a 2-element form to define a proc, where the variadic args is the default argument:
    1. proc <proc name> <arg list> <proc body>
    2. proc <proc name> <proc body>
       which is equivalent to:
           proc <proc name> args <proc body>
    3. proc <proc name> [list args <arg protocol>] <proc body>
- Create a new argset command whose definition is:
    1. argset <argument protocol>
       Applies the argument protocol "in place" to the variadic argument list named args.
    2. argset <proc name> <arg protocol>
       Defines the argument protocol of a proc and returns the proc name.

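A rough emulation of the single-argument form of the proposed [argset] is
possible in plain Tcl 8.6 today: it rewrites the caller's args variable
and sets the parsed variables in the caller's frame. This sketch supports
only a simplified {name -default value} protocol; the command name and
option vocabulary are inventions, not part of any TIP.

```tcl
proc argset {protocol} {
    upvar 1 args argv
    foreach spec $protocol {
        set name [lindex $spec 0]
        set opts [lrange $spec 1 end]
        set value ""
        if {[dict exists $opts -default]} {
            set value [dict get $opts -default]
        }
        # Consume "-name value" from the caller's args, if present.
        set i [lsearch -exact $argv -$name]
        if {$i >= 0} {
            set value [lindex $argv $i+1]
            set argv [lreplace $argv $i $i+1]
        }
        # Set the parsed variable in the caller's frame.
        uplevel 1 [list set $name $value]
    }
}

proc demo args {
    argset {
        {var -default 0}
    }
    list $var $args
}
demo -var 7 extra   ;# → 7 extra
```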


On 26/05/2017 at 14:56, [hidden email] wrote:
Howdy Tcl'ers,

I've followed the latest controversy around TIP #457 and came to the conclusion that there is a third way between the two main approaches:

- extending [proc] (TIP's approach)
- delegating advanced args parsing to [eatargs] (Colin and others' approach)

I don't want to summarize all the pros and cons of these approaches, but here is a short, non-exhaustive list of the main arguments **from my own point of view**:


## [proc] extension ##

Pros:

- consistency for all procs
- performance (for the argument parsing proper)
- introspectable

Cons:

- added complexity to the [proc] man page
- rigid argument style, lack of freedom
- monolithic, not reusable by other commands
- concerns about the overhead to a critical language building block
- bytecode compilation

## [eatargs] ##

Pros:

- extensible
- reusable
- leaves [proc] alone
- freedom and agnosticism wrt. argument styles 

Cons:

- not introspectable
- conflicts with other approaches (e.g. code injection)
- performance (needs an extra call)

 

This morning I had the following illumination about the current [proc] and its special argument *args*: it happens that *args* is the only proc argument for which one can't provide a default value. When given, it is simply ignored. Consider this:

% proc foo {{a 1} {b 2} {args 3}} {
puts [list $a $b $args]
}
% foo
1 2 {}
% foo bar baz sprong fred
bar baz {sprong fred}
% info args foo
a b args
Nevertheless, introspection works as expected!
% info default foo a v; set v
1
% info default foo b v; set v
2
% info default foo args v; set v
3

This behavior is not explicitly stated in the [proc] man page, so I'm not sure whether it's by design or a bug.

So, what if we made *args* a bit more special? Let's say that instead of simply ignoring the *args* specifier's second field, [proc] would expect an arguments processing command (let's call it *protocol*)? So [proc] syntax would become something like this:

proc name {arg1 arg2 {arg3 default3} ... {args protocol}} body
Today that extra *protocol* is silently ignored. When empty or unspecified, [proc] behavior would be unchanged. When non-empty, *protocol* would give the name of a command (or a prefix script) that would pre-process $args before executing the proc body. Here are the pros:

- unlikely to break existing code
- negligible overhead for regular procs (just a test at creation/call time)
- positional arguments are explicit and unambiguous
- extensible
- introspectable
- reusable: the same protocol command could be shared by several procs
- generic and agnostic: one could provide several protocol commands to accommodate different args-passing styles
- available at both Tcl and C level: protocol commands could be written in Tcl or C, and C procs could call the same protocol command as Tcl procs.
- compatible with code injection

And the cons:

- Looser integration between [proc] and its argument parsing will make [proc] definitions more verbose compared to the super-[proc] approach (but on par with explicit [eatargs])
- Feel free to elaborate


Back to TIP #457: instead of having to choose between super-[proc] and [eatargs] approaches, let's split the TIP:

- One TIP adding {args *protocol*} support to the existing [proc]
- One TIP adding [eatargs] (please find a better name!) to the core as the 'standard' argument protocol for Tcl procs.

You can then reproduce the current TIP approach like so:

% proc name {{args eatargs}} body

Last but not least, anybody is free to implement whatever style they prefer: for example, write a generic GNU-style protocol proc, provide it as a package, share it across several procs/methods/etc... You could even switch a proc's protocol at runtime without affecting its body.
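As a rough illustration of what one such protocol command could look like
in pure Tcl, here is a minimal options parser. getopts, its {name default}
argspec format, and demoNumbers are all invented for this sketch and
appear in no TIP; a real protocol command would be richer.

```tcl
proc getopts {argspec argv} {
    # argspec: list of {name default} pairs; argv: the caller's $args.
    # Returns a dict mapping each name to its (possibly defaulted) value.
    set out [dict create]
    foreach spec $argspec {
        lassign $spec name default
        dict set out $name $default
    }
    foreach {opt val} $argv {
        set name [string range $opt 1 end]
        if {![dict exists $out $name]} {
            error "unknown option \"$opt\""
        }
        dict set out $name $val
    }
    return $out
}

proc demoNumbers args {
    set opts [getopts {{start 0} {incr 1}} $args]
    dict with opts {}
    # $start and $incr now hold the parsed values
    list $start $incr
}
demoNumbers -incr 2   ;# → 0 2
```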

Comments welcome.

Fred

