Commands

Commands need to be implemented so that they:

  1. read inputs from the current input port
  2. write output to the current output port, and
  3. write errors to the current error port.

Furthermore, they should return a value upon completion, so that other functions can process that value as structured data, instead of having to read the command's output from a pipe (an input port corresponding to an output port).

This should give shell commands defined as procedures additional power over traditional shell commands, whose outputs are merely strings rather than structured data.
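
A minimal sketch of this shape, using a made-up command ~count-lines~ for illustration: it reads from ~current-input-port~, writes a human-readable summary to ~current-output-port~, and additionally returns its result as a value.

#+begin_src scheme
(use-modules (ice-9 rdelim))  ; read-line

;; Hypothetical command: counts the lines on the current input
;; port, prints a summary to the current output port, and returns
;; the count as a value other procedures can use directly.
(define (count-lines)
  (let loop ((count 0))
    (let ((line (read-line (current-input-port))))
      (if (eof-object? line)
          (begin
            (format (current-output-port) "~a lines~%" count)
            count)  ; the returned structured value
          (loop (+ count 1))))))
#+end_src

Another procedure can then call ~(with-input-from-string "a\nb\n" count-lines)~ and work with the returned 2 directly, without parsing the printed text.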

State of a shell

Some commands work on the state that the shell is in. How should this state be managed?

If it were mutable global state, it would become complicated to ever use multiple cores or run commands concurrently later. To avoid mutating global state, the state should be passed in as an argument.
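
A sketch of such state passing, assuming a hypothetical ~<shell-state>~ record that only tracks the working directory; a command like ~cd~ returns an updated state instead of mutating a global:

#+begin_src scheme
(use-modules (srfi srfi-9))  ; define-record-type

;; Hypothetical immutable shell state; just the working directory here.
(define-record-type <shell-state>
  (make-shell-state working-directory)
  shell-state?
  (working-directory shell-state-working-directory))

;; cd mutates nothing: it takes the current state and returns a new
;; one, which the shell loop threads into the next command.
(define (cd state target)
  (make-shell-state target))
#+end_src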

Structure of a command

INPROGRESS Input

A command can have 3 input sources. In descending order of priority:

  1. an explicitly passed argument (an optional argument of the procedure)
  2. the results of a previous command in a pipeline, as structured data
  3. the current input port

The reasoning is as follows:

  1. If an argument is explicitly passed to the procedure implementing the command, then it seems to be the caller's intention to make the command use that argument.
  2. Using a previous command's result allows for using structured data, avoiding the overhead of serializing a result and then deserializing it again.
  3. If no other input source remains, read from the ~current-input-port~. This requires deserializing the data, if it is meant to be more than a simple string.
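
A sketch of a command skeleton following this priority order, with the hypothetical names ~example-command~, ~process~, and ~#:previous-results~:

#+begin_src scheme
(use-modules (ice-9 rdelim))

;; `process' stands in for the command's actual per-value logic.
(define (process value) value)

(define* (example-command #:optional (arg #f)
                          #:key (previous-results '()))
  (cond
   ;; 1. An explicitly passed argument wins.
   (arg (process arg))
   ;; 2. Otherwise use structured data from the pipeline.
   ((not (null? previous-results)) (map process previous-results))
   ;; 3. Otherwise fall back to reading the current input port.
   (else
    (let loop ((line (read-line)) (acc '()))
      (if (eof-object? line)
          (reverse acc)
          (loop (read-line) (cons (process line) acc)))))))
#+end_src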

Another option is to pass in flags choosing where to read input from. This way a pipeline-constructing function could call commands specifying the input source and thus govern the behavior.

However, naively reading the previous result requires that result to be completely computed before the current command can start processing it. It would be great to have a way of passing structured data the way strings are passed via ~current-output-port~ and ~current-input-port~, to avoid serialization and deserialization costs and to enable a command pipeline to run concurrently. This seems to be the classical problem of passing structured data between separate processes.
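
Within a single Guile process, lazy streams are one known technique for this kind of incremental hand-off (a suggestion, not something settled above); a sketch using SRFI-41, where the consumer pulls structured values on demand while the producer is still running:

#+begin_src scheme
(use-modules (srfi srfi-41))  ; lazy streams

;; Producer: an infinite stream of structured values (pairs here),
;; computed one element at a time, on demand.
(define (numbers-from n)
  (stream-cons (cons 'value n) (numbers-from (+ n 1))))

;; The consumer takes values as it needs them; nothing is serialized,
;; and the producer never has to finish first.
(stream->list (stream-take 3 (stream-map cdr (numbers-from 1))))
;; => (1 2 3)
#+end_src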

INPROGRESS Output

One option is to specify via flags (arguments to the procedure implementing a command) where to write output to.

Another idea is to always write to both ~current-output-port~ and the list of results, so that the next command can choose where to read input from. But does a command have any knowledge on which to base such a decision?
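
A sketch of the always-write-both variant, with a hypothetical helper ~emit~:

#+begin_src scheme
;; Hypothetical "always both": print a rendering of each result to
;; the current output port AND return the structured results, so the
;; next command can pick whichever representation suits it.
(define (emit results)
  (for-each (lambda (r) (format (current-output-port) "~a~%" r))
            results)
  results)
#+end_src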

Ideas

current input port for structured data

Is there anything that one can use like ~current-input-port~, but for structured data? Maybe Stis-Engine has this?
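
Guile's parameters allow building such a thing by hand; a sketch of a hypothetical ~current-data-source~ parameter, used analogously to ~current-input-port~:

#+begin_src scheme
(use-modules (ice-9 binary-ports))  ; eof-object

;; Hypothetical analogue of current-input-port for structured data:
;; a parameter holding a thunk that yields the next value, or eof.
(define current-data-source
  (make-parameter (lambda () (eof-object))))

;; Rebind it for the extent of a thunk, like with-input-from-string:
(define (with-data-from-list lst thunk)
  (parameterize ((current-data-source
                  (lambda ()
                    (if (null? lst)
                        (eof-object)
                        (let ((v (car lst)))
                          (set! lst (cdr lst))
                          v)))))
    (thunk)))
#+end_src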

INPROGRESS Pipelines of commands

When pipelining commands, it is necessary to tell the procedure that is the command whether to make use of structured data passed in as an argument, or to read from an input port and use that as data. Reading from an input port would be the normal shell behavior, but it is also inefficient, as the command would need to parse that input again, or work on plain strings instead of structured data. It would be good to have a way to make the command use structured data, while still allowing a fallback to reading from an input port.

However, wouldn't the amount of logic inside the command blow up if it had to handle both scenarios? Maybe commands should by default always read from an input port, and a second implementation should be made which only works by processing a structured data argument. Then a general version could be made which decides which version to use: the input-port-reading one or the argument-processing one.
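
A sketch of that split, using hypothetical names: a port-reading variant, an argument-processing variant, and a general version dispatching between them:

#+begin_src scheme
(use-modules (ice-9 rdelim))

;; Variant 1: read plain lines from the current input port.
(define (upcase-from-port)
  (let loop ((line (read-line)) (acc '()))
    (if (eof-object? line)
        (reverse acc)
        (loop (read-line) (cons (string-upcase line) acc)))))

;; Variant 2: process structured data passed as an argument.
(define (upcase-from-data lines)
  (map string-upcase lines))

;; General version: dispatch to whichever variant applies.
(define* (upcase-command #:key (previous-results #f))
  (if previous-results
      (upcase-from-data previous-results)
      (upcase-from-port)))
#+end_src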

  • Is it really necessary to have plain string inputs from an input port, for any step of a pipeline, except for the first one?
  • Is it really necessary to have plain string outputs to an output port, for any step of a pipeline, except for the last one?
  • Yes: A shell might redirect output or error to a file, but also copy it to stdout at the same time.
  • Maybe not: Maybe this guile-shell does not need to be 100% like the original shell, or even close. Maybe it only needs to be very useful and readable.
  • No: Commands can still write to the output port and error port, but if there is a structured data representation available to work with, there should be no harm in working with it instead of reading from an input port.
  • Idea 1: Every command needs to implement decision logic and processing logic for both reading from an input port and processing an argument as input.
  • Idea 2: A command consists of 2 parts: a part that gets the next input value and a part that processes that value. Getting the next value can happen from an input port, for example by reading a line, or from an argument.
  • Idea 3: A command takes an additional keyword argument, which is the results of the previous command, if there are any results, otherwise the empty list. If there are results from the previous command, use them instead of reading from an input port.

The way a command takes values from the input stream it reads should be implemented in a separate procedure, passed in by default as a keyword argument, making it easy to change.
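
A sketch, with a hypothetical ~#:reader~ keyword argument defaulting to line-based reading; passing ~read~ instead would pull s-expressions off the port:

#+begin_src scheme
(use-modules (ice-9 rdelim))

;; Default reader: one line per value.
(define (read-next-line port)
  (read-line port))

(define* (collect-input #:key (reader read-next-line))
  (let loop ((value (reader (current-input-port))) (acc '()))
    (if (eof-object? value)
        (reverse acc)
        (loop (reader (current-input-port)) (cons value acc)))))
#+end_src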

IDEA: Since commands may be used from other commands as functions (pushd might use cd, for example) and should potentially not pollute the output port of the actual command that uses them as helpers, there should be a ~#:silent~ keyword argument that prevents them from writing to an output or error port.
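
A sketch of ~#:silent~, reusing the hypothetical ~<shell-state>~ record from above:

#+begin_src scheme
;; cd only writes to the output port when not silenced.
(define* (cd state target #:key (silent #f))
  (unless silent
    (format (current-output-port) "~a~%" target))
  (make-shell-state target))

;; pushd uses cd internally without polluting its own output.
(define (pushd state target)
  (format (current-output-port) "pushd: ~a~%" target)
  (cd state target #:silent #t))
#+end_src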

IDEA: Maybe there should also be a keyword argument which prevents commands from changing shell state.