Title

SRFI 166: Monadic Formatting

Author

Alex Shinn

Status

This SRFI is currently in draft status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-166@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Received: 2019/3/27
60-day deadline: 2019/5/26
Draft #1 published: 2019/3/27

Abstract

A library of procedures for formatting Scheme objects to text in various ways, and for easily concatenating, composing and extending these formatters efficiently without resorting to capturing and manipulating intermediate strings.

This SRFI is an updated version of SRFI 159, primarily with the difference that state variables are hygienic.

Summary of differences from SRFI 159:

State variables are first class and hygienic
Added written-shared, pretty-shared
Added as-color, pretty-colored
Added ambiguous-is-wide? state variable
Restored non-uniform comma rules as needed in India
Restored upcased and downcased
Several clarifications and more examples

Rationale

There are several approaches to text formatting. Concatenating strings to display is not acceptable, since it doesn't scale to very large output. The simplest realistic idea, and what people resort to in typical portable Scheme, is to interleave calls to display and write with manual loops, but this is both extremely verbose and doesn't compose well. A simple concept such as padding space can't be achieved directly without somehow capturing intermediate output.

The traditional approach in other languages is to use templates - typically strings, though in theory any object could be used and indeed Emacs's mode-line format templates allow arbitrary sexps. Templates can use either escape sequences (as in C's printf and Common Lisp's format) or pattern matching (as in Visual Basic's Format, Perl 6's form, and SQL date formats). The primary disadvantages of templates are the relative difficulty (usually impossibility) of extending them, their opaqueness, and the unreadability that arises with complex formats. Templates are not without their advantages, but those use cases are already addressed by other libraries such as SRFI 28 and SRFI 48.

Another important aspect of formatting is state. Common Lisp format provides a "fresh-line" format spec which outputs a newline only if the output stream is not already at the beginning of a line. C++ iostreams allow changing the radix and floating-point precision for numeric output, not just for a single value but as a persistent setting for all future output. Custom formatters which could manipulate their own state would allow for many new possibilities.

This SRFI takes a combinator approach to solving both problems. It defines formatters, which are called to produce their output as needed, can be composed with other formatters, and can refer to and update arbitrary state. The primary goal of this SRFI is to have a maximally expressive and extensible formatting library. The next most important goal is scalability - to be able to handle arbitrarily large output and not build intermediate results except where necessary. The third goal is brevity and ease of use.

Index

Base
show each each-in-list
displayed written written-shared written-simply
escaped maybe-escaped
numeric numeric/comma numeric/si numeric/fitted
nl fl space-to tab-to nothing
joined joined/prefix joined/suffix
joined/last joined/dot joined/range
padded padded/right padded/both
trimmed trimmed/right trimmed/both
trimmed/lazy fitted fitted/right fitted/both
fn with with! forked call-with-output
port row col width output writer
string-width pad-char ellipsis
radix precision decimal-sep decimal-align
Pretty
pretty pretty-shared pretty-simple pretty-color
Columnar
columnar tabular wrapped wrapped/list wrapped/char
justified from-file line-numbers
Unicode
as-unicode unicode-terminal-width
upcased downcased
Color
as-red as-blue as-green as-cyan as-yellow
as-magenta as-white as-black as-bold as-underline
as-color

Types and Naming Conventions

We introduce two new types, formatters, which are disjoint from any type except possibly procedures, and state variables, which are distinct from any type except possibly SRFI 39 parameters. These may optionally be identical to the SRFI 165 computations and computation environment variables, respectively.

In the prototypes below the following naming conventions imply type restrictions:

fmt: either a formatter, or a string or char coerced to a formatter with displayed
formatter: a formatter
mapper: a procedure of one argument which returns a formatter
num: a number
state-var: a state variable

The naming of formatters and mappers is generally chosen such that they read as adjectives or adverbs describing how the objects they act on are formatted. This provides a natural reading of the code, and allows for a simple mapping between standard operations and their formatting counterparts:

write: written
display: displayed
string-pad: padded
string-trim: trimmed
string-join: joined
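
As a rough illustration of this mapping, the sketch below applies a few of the formatters named above. It assumes an implementation that provides the composite (srfi 166) library, and it relies on the signatures (padded width fmt ...) and (joined mapper list sep) from later sections of this SRFI; the variable names w, d, p and j are hypothetical and exist only for the example.

```scheme
(import (scheme base) (srfi 166))

;; write -> written: the string keeps its quotes.
(define w (show #f (written "a b")))      ; "\"a b\""

;; display -> displayed: the string is output without quotes.
(define d (show #f (displayed "a b")))    ; "a b"

;; string-pad -> padded: pad on the left to a width of 5.
(define p (show #f (padded 5 "abc")))     ; "  abc"

;; string-join -> joined: format each element, separated by ", ".
(define j (show #f (joined displayed '("a" "b" "c") ", ")))  ; "a, b, c"
```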

Specification

The SRFI is divided into a core implementation and four utility libraries, which could be defined portably in terms of the core but are provided as convenience extensions. The libraries are as follows:

  (srfi 166)           ; composite of all of the following
  (srfi 166 base)      ; all bindings not in one of the following
  (srfi 166 pretty)    ; all bindings in Pretty Printing
  (srfi 166 columnar)  ; all bindings in Columnar Formatting
  (srfi 166 unicode)   ; all bindings in Unicode
  (srfi 166 color)     ; all bindings in Formatting with Color

Usage

(show output-dest fmt ...)

The entry point for all formatting. Applies the fmt formatters in sequence, accumulating the output to output-dest. As with SRFI 48 format, output-dest can be an output port, #t to indicate the current output port, or #f to accumulate the output into a string and return that as the result of show.

Each fmt should be a formatter as discussed below. As a convenience, non-formatter arguments are also allowed and are formatted as if wrapped with displayed, described below, so that

    (show #f "π = " (with ((precision 2)) (acos -1)) nl)

would return the string "π = 3.14\n".

As mentioned, formatters are an opaque type and cannot directly be applied outside of show. Custom formatters are built on the existing formatters, and as first class objects may be named or computed dynamically, so that:

  (let ((~.2f (lambda (x) (with ((precision 2)) x))))
    (show #f "π = " (~.2f (acos -1)) nl))

produces the same result. For typical uses you only need to combine the existing high level formatters described in the succeeding sections, but see the section Higher Order Formatters and State for control flow and state manipulation primitives.

The return value of show is the accumulated string if output-dest is #f and unspecified otherwise.
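
The three kinds of output-dest described above can be sketched as follows. This is a minimal example assuming the composite (srfi 166) library is available; the names as-string, string-port and via-port are hypothetical, and are chosen so that they do not clash with the port and output state variables exported by the library.

```scheme
(import (scheme base) (srfi 166))

;; #f: accumulate the output and return it as a string.
(define as-string (show #f "x = " 42))            ; "x = 42"

;; An output port: the text is written there, and the
;; return value of show is unspecified.
(define string-port (open-output-string))
(show string-port "x = " 42 nl)
(define via-port (get-output-string string-port)) ; "x = 42\n"

;; #t: write to the current output port.
(show #t "x = " 42 nl)
```

Since the result is unspecified unless output-dest is #f, code that needs the formatted text as a value should pass #f rather than rely on the return value for a port.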

Formatting Objects

(displayed obj): If obj is a formatter, returns obj as is. Otherwise, outputs obj using display semantics. Specifically, strings are output as if by write-string and characters are written as if by write-char. Other objects are output as with written (including nested strings and chars inside obj). This is the default behavior for top-level formats in show, each and most other high-level formatters.
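
The distinction between top-level and nested strings and characters can be seen in a small sketch, again assuming the composite (srfi 166) library; the names ch, nested and same are hypothetical.

```scheme
(import (scheme base) (srfi 166))

;; A top-level character is output as if by write-char.
(define ch (show #f (displayed #\a)))                 ; "a"

;; Strings and chars nested inside another object are output
;; as with written, so they keep their external syntax.
(define nested (show #f (displayed '("hi" #\a))))     ; "(\"hi\" #\\a)"

;; A formatter is returned as is, so wrapping one in displayed
;; a second time does not change the output.
(define same (show #f (displayed (displayed "hi"))))  ; "hi"
```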
