named arguments - unambiguous forms - questions

Andreas Leitgeb
I want to raise some questions about named arguments in general
(regardless of the target: tip-457, eatargs or procx).

Each of them may affect whether '--' is required, optional or disallowed.

 1.) If a proc mimics the API of "return", then it doesn't accept
    adverbs, and all prepositional phrases are of a length that is
    (generalized) a multiple of (or equal to) a number N > 1.

    In that case, the procedure can also define a number op < N of
    optional positional parameters, without requiring '--'. Any
    '--' would then already be consumed as one of the positionals.

    A procedure offering such a param-spec would also need an
    explicit commitment to such an API. It cannot later on add
    adverbs or add more optional positional parameters without
    breaking its specifically '--'-less API.

    Now for the real question: How to express such a commitment
    on a procedure?
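
A minimal sketch of the parity idea for N = 2, assuming options always come as name/value pairs and at most one optional positional follows. The proc name parsePairs and its result keys are hypothetical; nothing here is taken from tip-457, eatargs or procx:

```tcl
# Hypothetical helper: split a return-style argument list (N = 2)
# into name/value pairs plus at most one trailing positional.
proc parsePairs {argv} {
    if {[llength $argv] % 2} {
        # Odd word count: the last word can only be the positional,
        # whatever it looks like (even "--" or "-code").
        return [dict create options [lrange $argv 0 end-1] \
                    hasPositional 1 positional [lindex $argv end]]
    }
    # Even word count: pairs only, the positional was omitted.
    return [dict create options $argv hasPositional 0 positional {}]
}

# Usage: the trailing "--" is taken as the positional value itself.
puts [parsePairs {-code error -level 1 --}]
```

Adding a single adverb or a second optional positional would make the word count ambiguous, which is why such an API would need the commitment asked about above.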

 2.) Commands often have arguments that are "usually" given
    literal values (e.g. the subcommands of an ensemble) and
    other arguments that are almost never given literally
    (e.g. switch's value). If such an expectation could be
    declared explicitly, then it could not only help
    static checker tools (like nagelfar, ...) warn about unmet
    expectations, but might even be a factor in deciding
    whether or not to require a '--'.

 3.) Allowing an arg-spec to declare its preferred handling of
    being specified multiple times: -multi <last|all|uniq>
    with "last" being the default.
    "uniq" could be a means to allow omission of an otherwise
    necessary '--' if all declared options are provided. That's
    rather thin ice, though: that mechanism slams shut and locks
    the door against growing more options later.
    "all" is inspired by Unix "sort" and its -k options: it allows
    an option to be repeated and captures all the values (wrapped
    in a list).  It might eventually go into a future version of procx.
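
To make the three modes concrete, here is a rough sketch of how a parser might treat repeated options. The proc collectOptions and the shape of its "modes" dict are assumptions made for illustration only; this is not how procx does or will do it:

```tcl
# Hypothetical sketch of the proposed -multi modes.
# modes: dict mapping option name -> last | all | uniq
# argv:  flat list of option/value pairs
proc collectOptions {modes argv} {
    set result [dict create]
    foreach {opt val} $argv {
        if {![dict exists $modes $opt]} {
            error "unknown option \"$opt\""
        }
        switch -- [dict get $modes $opt] {
            last { dict set result $opt $val }
            all  { dict lappend result $opt $val }
            uniq {
                if {[dict exists $result $opt]} {
                    error "option \"$opt\" given more than once"
                }
                dict set result $opt $val
            }
        }
    }
    return $result
}

# Usage: -k accumulates like sort's -k, -o keeps only the last value.
puts [collectOptions {-k all -o last} {-k 1,1 -k 2,2 -o a -o b}]
```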


Finally, some brainstorming about a Tcl perspective on "compilation"
of such an enhanced procedure. Given a Tcl command line:
   regexp -all -- $RE $string _ info
then the Tcl parser turns it into bytecode, but the principle is
the same as if it turned it into a "prepared statement" like those
known from SQL: {regexp -all -- ? ? _ info} (forget about possible
literal question marks for now). The challenge is, given some sort
of arg-spec for regexp:
  {{al -bool all} {ab -bool about} ... {st -name start} exp str args}
and thus the bare list of params {al ab ... st exp str args}, to
"assign" each of such a regexp-proc's internal params either a constant
or an index into a later supplied list of actual values for the "?"s.
The result could then be:
  regexp-internal 1 0 ... "" ?0 ?1 {_ info}
This is likely equivalent to some fragment of quadcode's "todo" (or "done").

Without the literal '--' this step wouldn't succeed, and the compiler
would just fold and leave the binding to a later, value-aware
step.
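
A simplified sketch of that binding step, assuming only boolean flags, no value-taking options such as -start, and no grouping of trailing words into 'args'. The proc bindStatic and the {lit text} / {var index} word encoding are made up for this illustration:

```tcl
# Hypothetical static binder.
# flagNames: boolean flags in parameter order, e.g. {-all -about}
# words:     call words after the command name, each either
#            {lit <text>} (literal) or {var <index>} (the index-th "?")
proc bindStatic {flagNames words} {
    foreach f $flagNames { set seen($f) 0 }
    set i 0
    set n [llength $words]
    while {$i < $n} {
        lassign [lindex $words $i] kind text
        if {$kind ne "lit"} {
            # A substituted word in option position might be a flag at
            # run time, so the binding cannot be decided statically.
            return -code error "cannot bind statically: word $i is substituted"
        }
        if {$text eq "--"} { incr i; break }
        if {[string index $text 0] ne "-"} break
        if {![info exists seen($text)]} {
            return -code error "unknown flag \"$text\""
        }
        set seen($text) 1
        incr i
    }
    # One constant per flag, then the remaining words in order.
    set bound {}
    foreach f $flagNames { lappend bound $seen($f) }
    foreach w [lrange $words $i end] {
        lassign $w kind text
        if {$kind eq "lit"} {
            lappend bound $text
        } else {
            lappend bound "?$text"
        }
    }
    return $bound
}

# Usage: regexp -all -- $RE $string _ info
puts [bindStatic {-all -about} \
    {{lit -all} {lit --} {var 0} {var 1} {lit _} {lit info}}]
```

Dropping the {lit --} word from the usage line makes the binder hit {var 0} while still in option position, and it gives up with an error.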


And post-Finally: yesterday I updated my "procx" on
http://wiki.tcl.tk/48569  (TclImplForNamedArguments) to adapt to recent
discussion: -upvar <level>, but even more importantly it no longer allows
multiple groups of named arguments.  I also restricted the purely
positional cases to exactly tip-288, so optional positional parameters
are no longer allowed after 'args'. Even though it would be unambiguous,
I just fail to see a valid use case or precedent for {... args {foo ""}}.


_______________________________________________
Tcl-Core mailing list
https://lists.sourceforge.net/lists/listinfo/tcl-core

Re: named arguments - unambiguous forms - questions

Kevin Kenny-6
On Mon, May 29, 2017 at 9:53 AM, Andreas Leitgeb <[hidden email]> wrote:
> I want to raise some questions about named arguments in general
> (regardless of the target: tip-457, eatargs or procx).
>
> Each of them may affect whether '--' is required, optional or disallowed.
>
>  1.) If a proc mimics the API of "return", then it doesn't accept
>     adverbs, and all prepositional phrases are of a length that is
>     (generalized) a multiple of (or equal to) a number N > 1.
>
>     In that case, the procedure can also define a number op < N of
>     optional positional parameters, without requiring '--'. Any
>     '--' would then already be consumed as one of the positionals.
>
>     A procedure offering such a param-spec would also need an
>     explicit commitment to such an API. It cannot later on add
>     adverbs or add more optional positional parameters without
>     breaking its specifically '--'-less API.
>
>     Now for the real question: How to express such a commitment
>     on a procedure?

I don't know whether we need to express whether the
API 'commits' to such an interface; I know that in my
earlier remarks I allowed for the possibility of such an
API, with N restricted to 2. N>2 is moving off into the weeds.
 
>  2.) Commands often have arguments that are "usually" given
>     literal values (e.g. the subcommands of an ensemble) and
>     other arguments that are almost never given literally
>     (e.g. switch's value). If such an expectation could be
>     declared explicitly, then it could not only help
>     static checker tools (like nagelfar, ...) warn about unmet
>     expectations, but might even be a factor in deciding
>     whether or not to require a '--'.

For every place where a literal 'usually' appears, someone
will want to substitute it programmatically.
 
>  3.) Allowing an arg-spec to declare its preferred handling of
>     being specified multiple times: -multi <last|all|uniq>
>     with "last" being the default.
>     "uniq" could be a means to allow omission of an otherwise
>     necessary '--' if all declared options are provided. That's
>     rather thin ice, though: that mechanism slams shut and locks
>     the door against growing more options later.
>     "all" is inspired by Unix "sort" and its -k options: it allows
>     an option to be repeated and captures all the values (wrapped
>     in a list).  It might eventually go into a future version of procx.

In many cases, when wrapping a Tk-style command, it's useful
to specify defaults and then include user-supplied args with
{*}$args. If the user repeats arguments that are defaulted,
the second will override the first.
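
A small illustration of that idiom. To keep it runnable without Tk, [dict create] stands in for the widget command here, since it also lets a later occurrence of a key win; the proc name styledOptions and the option names are just examples:

```tcl
# Defaults first, caller-supplied options last: when an option is
# repeated, the later (caller's) value overrides the default.
# [dict create] is only a stand-in for a Tk-style command.
proc styledOptions {args} {
    return [dict create -relief flat -padx 4 {*}$args]
}

# Usage: the caller overrides -padx and adds -text.
puts [styledOptions -padx 10 -text Hello]
```

A "uniq"-style restriction would turn this common wrapper pattern into an error.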

Aside from that, this idea seems to be making two more
additions to what begins to resemble an endless
series of Ptolemaic epicycles. Can we try to keep the proposal
simple and minimal?
 
> Finally, some brainstorming about a Tcl perspective on "compilation"
> of such an enhanced procedure. Given a Tcl command line:
>    regexp -all -- $RE $string _ info
> then the Tcl parser turns it into bytecode, but the principle is
> the same as if it turned it into a "prepared statement" like those
> known from SQL: {regexp -all -- ? ? _ info} (forget about possible
> literal question marks for now). The challenge is, given some sort
> of arg-spec for regexp:
>   {{al -bool all} {ab -bool about} ... {st -name start} exp str args}
> and thus the bare list of params {al ab ... st exp str args}, to
> "assign" each of such a regexp-proc's internal params either a constant
> or an index into a later supplied list of actual values for the "?"s.
> The result could then be:
>   regexp-internal 1 0 ... "" ?0 ?1 {_ info}
> This is likely equivalent to some fragment of quadcode's "todo" (or "done").
>
> Without the literal '--' this step wouldn't succeed, and the compiler
> would just fold and leave the binding to a later, value-aware
> step.

There are some special cases here, too. In order to facilitate
compilation of a common idiom, some time ago,
[switch -nocase {-case {puts 1} -nocase {puts 2}}] was
modified NOT to throw an error, just to allow for compilation
of a switch with only 2 args.

Over the weekend, I realized that [regsub] could benefit from
the same change: [regsub -all free-for-all -none] is unambiguous,
except that the current parser chooses to resolve the ambiguity
in favour of an error, effectively causing -- to be required
whenever there's variable substitution about.

Since [regsub] isn't bytecompiled (except for the cases where
it degenerates into [string map]), the focus of quadcode compilation
in that case is to assess by static examination what variables
in the program it might possibly modify. If the command cannot
be analyzed, then the effect will be that every variable in the
procedure will be retrieved from the callframe after the command
returns, and information about the variable's type and existence
will be lost. Loading a Tcl_Obj* from the callframe is cheap, but
repeating all the existence checking and shimmering is not, and
the result of mispredicting the effect of a call can easily be a
procedure that's 5x slower.

For out-of-line commands, what the engine has today is only
a handful of cases:

(0) The called command has no effect on the callframe.

Most of the 'utility' commands, such as [after], [glob],
[seek], [puts] don't care where they're called from.
All the information they need is in their arguments.

(1) The called command has arbitrary or unknown effect
on the callframe.

This is the default if nothing has described the command.

(2) Argument #N or arguments #N and beyond are variable names.

This is the case for commands such as [gets], [scan], [file stat],
and [info default].

(3) A procedure is supplied that will accept the arg list
from the quadcode in a stylized format, and give either
a (possibly empty) list of named variables or an indication
that it cannot analyze the command.

This was the case for some of the Core ensembles,
and still is for some of the cases of [clock], [interp],
[lsort], [regexp] and [regsub].
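
Cases (0) through (2) could be pictured as a lookup table like the one below. This is only an illustration: the table format, the proc affectedVars and its {known vars} result pair are invented here and say nothing about how quadcode actually stores these descriptions. Word indices count the command name as word 0:

```tcl
# Hypothetical description table:
#   none     -> case (0), no effect on the callframe
#   {only N} -> case (2), word #N is a variable name
#   {from N} -> case (2), words #N and beyond are variable names
# Anything not listed falls under case (1), unknown effect.
set effects {
    puts none
    gets {only 2}
    scan {from 3}
}

# Returns {1 <list of variable names>} when the effect is known,
# or {0 {}} when the command has not been described.
proc affectedVars {effects words} {
    set cmd [lindex $words 0]
    if {![dict exists $effects $cmd]} {
        return [list 0 {}]
    }
    lassign [dict get $effects $cmd] kind n
    switch -- $kind {
        none { return [list 1 {}] }
        only { return [list 1 [lrange $words $n $n]] }
        from { return [list 1 [lrange $words $n end]] }
    }
    return [list 0 {}]
}

# Usage
puts [affectedVars $effects {gets stdin line}]
puts [affectedVars $effects {scan $s %d%d a b}]
```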

It is safe for the analysis procedure to overestimate the
variables affected. For the case of [regsub $a b c d],
we know that $a is either:
  - not a hyphenated option, in which case 'd' is a
    variable name
  - a hyphenated option, in which case:
      + the command will continue to have correct syntax,
         and affects no variables.
      + the command will not have correct syntax, and
         throws an error without affecting variables.
In any case, detecting that 'd' is a literal lets us return
what is possibly an overestimate, but a fairly tight one.
We can prove that no variable other than 'd' is affected.
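
As a rough sketch, an analysis procedure for exactly this four-word case might look as follows. The name regsubAffects, the {lit text} / {var index} word encoding and the result dict are assumptions for illustration; the real quadcode interface is not shown in this thread:

```tcl
# Hypothetical analysis procedure for [regsub w1 w2 w3 w4].
# words: the four words after the command name, each either
#        {lit <text>} or {var <index>} for a substituted word.
# Result: dict with ok 1 and a (possibly overestimated) list of
#         variable names, or ok 0 if the call cannot be analyzed.
proc regsubAffects {words} {
    if {[llength $words] != 4} {
        # Only the case discussed above is handled in this sketch.
        return [dict create ok 0]
    }
    lassign [lindex $words end] kind text
    if {$kind ne "lit"} {
        # The possible variable name is itself substituted.
        return [dict create ok 0]
    }
    # Whatever the earlier words turn out to be at run time, only
    # the last word can name a variable that regsub writes.
    return [dict create ok 1 vars [list $text]]
}

# Usage: regsub $a b c d
puts [regsubAffects {{var 0} {lit b} {lit c} {lit d}}]
```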

What I want to see is that this sort of analysis, for
the common use cases, should be possible given
the argspecs and any available literals in the calling
command.

> And post-Finally: yesterday I updated my "procx" on
> http://wiki.tcl.tk/48569  (TclImplForNamedArguments) to adapt to recent
> discussion: -upvar <level>, but even more importantly it no longer allows
> multiple groups of named arguments.  I also restricted the purely
> positional cases to exactly tip-288, so optional positional parameters
> are no longer allowed after 'args'. Even though it would be unambiguous,
> I just fail to see a valid use case or precedent for {... args {foo ""}}.

I don't really see one, either, but if there's a theory for how to
do something that is both complete and sound, I tend to want
to follow it for the sake of regularity.
