NAME

perliol - C API for Perl's implementation of IO in Layers.

SYNOPSIS

/* Defining a layer ... */
#include <perliol.h>

DESCRIPTION

This document describes the behavior and implementation of the PerlIO abstraction described in perlapio when USE_PERLIO is defined.

History and Background

The PerlIO abstraction was introduced in perl5.003_02 but languished as just an abstraction until perl5.7.0. During that time, however, a number of perl extensions switched to using it, so the API is mostly fixed to maintain (source) compatibility.

The aim of the implementation is to provide the PerlIO API in a flexible and platform neutral manner. It is also a trial of an "Object Oriented C, with vtables" approach which may be applied to Perl 6.

Basic Structure

PerlIO is a stack of layers.

The low levels of the stack work with the low-level operating system calls (file descriptors in C), getting bytes in and out. The higher layers of the stack buffer, filter, and otherwise manipulate the I/O, and return characters (or bytes) to Perl. The terms "above" and "below" are used to refer to the relative positioning of the stack layers.

A layer contains a "vtable", the table of I/O operations (at C level a table of function pointers), and status flags. The functions in the vtable implement operations like "open", "read", and "write".

When I/O, for example "read", is requested, the request goes from Perl first down the stack using "read" functions of each layer, then at the bottom the input is requested from the operating system services, then the result is returned up the stack, finally being interpreted as Perl data.

The requests do not necessarily always go all the way down to the operating system: that's where PerlIO buffering comes into play.
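The way a read request travels down the stack, and how a buffering layer can answer it without reaching the operating system, can be sketched in a few lines of C. This is only an illustration under simplifying assumptions: the types Layer, OsLayer and BufLayer and all function names are hypothetical stand-ins, not the real PerlIO structures, and the "operating system" is simulated by an in-memory string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified layer; NOT the real PerlIOl. */
typedef struct Layer Layer;
struct Layer {
    Layer *next;                                      /* lower layer, NULL at the bottom */
    size_t (*read)(Layer *self, char *dst, size_t n); /* this layer's "read" method */
};

/* Bottom layer: stands in for the operating system. */
typedef struct {
    Layer base;
    const char *data;   /* pretend file contents */
    size_t len, pos;
    int calls;          /* number of simulated "system calls" */
} OsLayer;

static size_t os_read(Layer *self, char *dst, size_t n)
{
    OsLayer *os = (OsLayer *)self;
    size_t left = os->len - os->pos;
    if (n > left)
        n = left;
    memcpy(dst, os->data + os->pos, n);
    os->pos += n;
    os->calls++;
    return n;
}

/* Buffering layer: refills from the layer below only when empty. */
typedef struct {
    Layer base;
    char buf[8];
    size_t have, used;  /* bytes in buf, bytes already handed out */
} BufLayer;

static size_t buf_read(Layer *self, char *dst, size_t n)
{
    BufLayer *b = (BufLayer *)self;
    size_t done = 0;
    assert(self->next != NULL);          /* a buffer needs something below it */
    while (done < n) {
        if (b->used == b->have) {        /* buffer empty: go down the stack */
            b->have = self->next->read(self->next, b->buf, sizeof b->buf);
            b->used = 0;
            if (b->have == 0)
                break;                   /* end of input */
        }
        dst[done++] = b->buf[b->used++];
    }
    return done;
}

/* Reads ten bytes one at a time through the buffer layer and returns
 * how many times the bottom layer was actually asked (-1 on mismatch). */
static int demo_os_calls(void)
{
    static const char text[] = "abcdefghij";
    OsLayer os = { { NULL, os_read }, text, 10, 0, 0 };
    BufLayer buf = { { &os.base, buf_read }, {0}, 0, 0 };
    int i;
    for (i = 0; i < 10; i++) {
        char c = 0;
        if (buf.base.read(&buf.base, &c, 1) != 1 || c != text[i])
            return -1;
    }
    return os.calls;
}
```

In this sketch the ten one-byte requests reach the bottom layer only twice, because the 8-byte buffer satisfies the others.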

When you do an open() and specify extra PerlIO layers to be deployed, the layers you specify are "pushed" on top of the already existing default stack. One way to see it is that "operating system is on the left" and "Perl is on the right".

What exact layers are in this default stack depends on a lot of things: your operating system, Perl version, Perl compile time configuration, and Perl runtime configuration. See PerlIO, "PERLIO" in perlrun, and open for more information.

binmode() operates similarly to open(): by default the specified layers are pushed on top of the existing stack.

However, note that even as the specified layers are "pushed on top" for open() and binmode(), this doesn't mean that the effects are limited to the "top": PerlIO layers can be very 'active' and inspect and affect layers also deeper in the stack. As an example there is a layer called "raw" which repeatedly "pops" layers until it reaches the first layer that has declared itself capable of handling binary data. The "pushed" layers are processed in left-to-right order.

sysopen() operates (unsurprisingly) at a lower level in the stack than open(). For example in Unix or Unix-like systems sysopen() operates directly at the level of file descriptors: in the terms of PerlIO layers, it uses only the "unix" layer, which is a rather thin wrapper on top of the Unix file descriptors.

Layers vs Disciplines

Initial discussion of the ability to modify IO stream behaviour used the term "discipline" for the entities which were added. This came (I believe) from the use of the term in "sfio", which in turn borrowed it from "line disciplines" on Unix terminals. However, this document (and the C code) uses the term "layer".

This is, I hope, a natural term given the implementation, and should avoid connotations that are inherent in earlier uses of "discipline" for things which are rather different.

Data Structures

The basic data structure is a PerlIOl:

	typedef struct _PerlIO PerlIOl;
	typedef struct _PerlIO_funcs PerlIO_funcs;
	typedef PerlIOl *PerlIO;
	struct _PerlIO
	{
	 PerlIOl *	next;       /* Lower layer */
	 PerlIO_funcs *	tab;        /* Functions for this layer */
	 U32		flags;      /* Various flags for state */
	};

A PerlIOl * is a pointer to the struct, and the application level PerlIO * is a pointer to a PerlIOl * - i.e. a pointer to a pointer to the struct. This allows the application level PerlIO * to remain constant while the actual PerlIOl * underneath changes. (Compare perl's SV * which remains constant while its sv_any field changes as the scalar's type changes.) An IO stream is then in general represented as a pointer to this linked-list of "layers".

It should be noted that because of the double indirection in a PerlIO *, a &(perlio->next) "is" a PerlIO *, and so to some degree at least one layer can use the "standard" API on the next layer down.
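The effect of the double indirection can be illustrated with a small stand-alone model. The names MockLayer, MockIO, mock_push, mock_pop and mock_depth are hypothetical and exist only for this sketch; MockLayer plays the role of PerlIOl and MockIO the role of PerlIO. The real structures carry a function table and flags as well.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct MockLayer MockLayer;   /* plays the role of PerlIOl */
typedef MockLayer *MockIO;            /* plays the role of PerlIO  */

struct MockLayer {
    MockLayer  *next;   /* lower layer */
    const char *name;   /* label for illustration only */
};

/* Push a new layer on top of handle f. Returns 0 on success, -1 on failure. */
static int mock_push(MockIO *f, const char *name)
{
    MockLayer *l;
    assert(f != NULL);
    l = malloc(sizeof *l);
    if (l == NULL)
        return -1;
    l->name = name;
    l->next = *f;       /* the old top becomes the layer below */
    *f = l;             /* the slot now points at the new top  */
    return 0;
}

/* Pop the top layer of handle f, if there is one. */
static void mock_pop(MockIO *f)
{
    MockLayer *top = *f;
    if (top != NULL) {
        *f = top->next;
        free(top);
    }
}

/* Count layers from f downwards. Note that &(*f)->next is itself a
 * MockIO *, so the same walk works starting from any layer. */
static int mock_depth(MockIO *f)
{
    int n = 0;
    while (f != NULL && *f != NULL) {
        n++;
        f = &(*f)->next;
    }
    return n;
}

/* Returns 1 if the handle stayed constant while layers changed under it. */
static int mock_stack_demo(void)
{
    MockIO slot = NULL;      /* the table entry */
    MockIO *f = &slot;       /* the application's handle */
    int ok;

    if (mock_push(f, "unix") != 0)
        return 0;
    if (mock_push(f, "perlio") != 0) {
        mock_pop(f);
        return 0;
    }
    ok = (f == &slot)                       /* the handle never moved */
      && mock_depth(f) == 2
      && mock_depth(&(*f)->next) == 1;      /* view from one layer down */
    mock_pop(f);
    mock_pop(f);
    return ok && slot == NULL;
}
```

The application keeps holding f throughout; only the contents of the slot it points to change when layers are pushed or popped.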

A "layer" is composed of two parts:

  1. The functions and attributes of the "layer class".

  2. The per-instance data for a particular handle.

Functions and Attributes

The functions and attributes are accessed via the "tab" (for table) member of PerlIOl. The functions (methods of the layer "class") are fixed, and are defined by the PerlIO_funcs type. They are broadly the same as the public PerlIO_xxxxx functions:

struct _PerlIO_funcs
{
 Size_t     fsize;
 char *     name;
 Size_t     size;
 IV         kind;
 IV         (*Pushed)(pTHX_ PerlIO *f,
                            const char *mode,
                            SV *arg,
                            PerlIO_funcs *tab);
 IV         (*Popped)(pTHX_ PerlIO *f);
 PerlIO *   (*Open)(pTHX_ PerlIO_funcs *tab,
                          PerlIO_list_t *layers, IV n,
                          const char *mode,
                          int fd, int imode, int perm,
                          PerlIO *old,
                          int narg, SV **args);
 IV         (*Binmode)(pTHX_ PerlIO *f);
 SV *       (*Getarg)(pTHX_ PerlIO *f, CLONE_PARAMS *param, int flags);
 IV         (*Fileno)(pTHX_ PerlIO *f);
 PerlIO *   (*Dup)(pTHX_ PerlIO *f,
                         PerlIO *o,
                         CLONE_PARAMS *param,
                         int flags);
 /* Unix-like functions - cf sfio line disciplines */
 SSize_t    (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);
 SSize_t    (*Unread)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
 SSize_t    (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
 IV         (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);
 Off_t      (*Tell)(pTHX_ PerlIO *f);
 IV         (*Close)(pTHX_ PerlIO *f);
 /* Stdio-like buffered IO functions */
 IV         (*Flush)(pTHX_ PerlIO *f);
 IV         (*Fill)(pTHX_ PerlIO *f);
 IV         (*Eof)(pTHX_ PerlIO *f);
 IV         (*Error)(pTHX_ PerlIO *f);
 void       (*Clearerr)(pTHX_ PerlIO *f);
 void       (*Setlinebuf)(pTHX_ PerlIO *f);
 /* Perl's snooping functions */
 STDCHAR *  (*Get_base)(pTHX_ PerlIO *f);
 Size_t     (*Get_bufsiz)(pTHX_ PerlIO *f);
 STDCHAR *  (*Get_ptr)(pTHX_ PerlIO *f);
 SSize_t    (*Get_cnt)(pTHX_ PerlIO *f);
 void       (*Set_ptrcnt)(pTHX_ PerlIO *f,STDCHAR *ptr,SSize_t cnt);
};

The first few members of the struct give a function table size for compatibility checks, a "name" for the layer, the size to malloc for the per-instance data, and some flags which are attributes of the class as a whole (such as whether it is a buffering layer). Then follow the functions, which fall into four basic groups:

  1. Opening and setup functions

  2. Basic IO operations

  3. Stdio class buffering options.

  4. Functions to support Perl's traditional "fast" access to the buffer.

A layer does not have to implement all the functions, but the whole table has to be present. Unimplemented slots can be NULL (which will result in an error when called) or can be filled in with stubs to "inherit" behaviour from a "base class". This "inheritance" is fixed for all instances of the layer, but as the layer chooses which stubs to populate the table, limited "multiple inheritance" is possible.

Per-instance Data

The per-instance data are held in memory beyond the basic PerlIOl struct, by making a PerlIOl the first member of the layer's struct thus:

	typedef struct
	{
	 struct _PerlIO base;       /* Base "class" info */
	 STDCHAR *	buf;        /* Start of buffer */
	 STDCHAR *	end;        /* End of valid part of buffer */
	 STDCHAR *	ptr;        /* Current position in buffer */
	 Off_t		posn;       /* Offset of buf into the file */
	 Size_t		bufsiz;     /* Real size of buffer */
	 IV		oneword;    /* Emergency buffer */
	} PerlIOBuf;

In this way (as for perl's scalars) a pointer to a PerlIOBuf can be treated as a pointer to a PerlIOl.
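A reduced model of this embedding is sketched below. The names mock_base, MockBuf and MOCK_SELF are hypothetical; MOCK_SELF is written to mirror the idea of casting the base pointer to the layer's own type, and the struct is cut down to two fields for brevity.

```c
#include <assert.h>
#include <stddef.h>

struct mock_base {            /* stands in for struct _PerlIO */
    struct mock_base *next;   /* lower layer */
    unsigned long     flags;
};

typedef struct {
    struct mock_base base;    /* must be the first member */
    char  *buf;               /* start of buffer */
    size_t bufsiz;            /* real size of buffer */
} MockBuf;

/* Dereference the handle, then cast the base pointer to the layer type. */
#define MOCK_SELF(f, type) ((type *)(*(f)))

/* A "method" of the buffer layer: receives the generic handle and
 * recovers its own per-instance data from it. */
static size_t mock_bufsiz(struct mock_base **f)
{
    return MOCK_SELF(f, MockBuf)->bufsiz;
}

/* Builds one buffer layer and reads a field back through the generic
 * handle; returns the buffer size seen that way. */
static int mock_embed_demo(void)
{
    static char storage[16];
    MockBuf b = { { NULL, 0 }, storage, sizeof storage };
    struct mock_base *top = &b.base;   /* what the table slot would hold */
    struct mock_base **f = &top;       /* the handle */

    /* The first member shares its address with the whole struct,
     * which is what makes the cast in MOCK_SELF valid. */
    assert(offsetof(MockBuf, base) == 0);
    assert((void *)top == (void *)&b);
    return (int)mock_bufsiz(f);
}
```

Because C guarantees that a pointer to a struct and a pointer to its first member are interconvertible, generic code can treat every layer as the base type while each layer's own methods see the full struct.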

Layers in action.

             table           perlio          unix
         |           |
         +-----------+    +----------+    +--------+
PerlIO ->|           |--->|  next    |--->|  NULL  |
         +-----------+    +----------+    +--------+
         |           |    |  buffer  |    |   fd   |
         +-----------+    |          |    +--------+
         |           |    +----------+

The above attempts to show how the layer scheme works in a simple case. The application's PerlIO * points to an entry in the table(s) representing open (allocated) handles. For example, the first three slots in the table correspond to stdin, stdout and stderr. The table in turn points to the current "top" layer for the handle - in this case an instance of the generic buffering layer "perlio". That layer in turn points to the next layer down - in this case the low-level "unix" layer.

The above is roughly equivalent to a "stdio" buffered stream, but with much more flexibility:

Per-instance flag bits

The generic flag bits are a hybrid of O_XXXXX style flags deduced from the mode string passed to PerlIO_open(), and state bits for typical buffer layers.

Methods in Detail

Utilities

To ask for the next layer down use PerlIONext(PerlIO *f).

To check that a PerlIO* is valid use PerlIOValid(PerlIO *f). (All this really does is check that the pointer is non-NULL and that the pointer behind that is non-NULL.)

PerlIOBase(PerlIO *f) returns the "Base" pointer, or in other words, the PerlIOl* pointer.

PerlIOSelf(PerlIO* f, type) returns the PerlIOBase cast to a type.

Perl_PerlIO_or_Base(PerlIO* f, callback, base, failure, args) either calls the callback from the functions of the layer f (just by the name of the IO function, like "Read") with the args, or if there is no such callback, calls the base version of the callback with the same args, or if f is invalid, sets errno to EBADF and returns failure.

Perl_PerlIO_or_fail(PerlIO* f, callback, failure, args) either calls the callback of the functions of the layer f with the args, or if there is no such callback, sets errno to EINVAL. Or if f is invalid, sets errno to EBADF and returns failure.

Perl_PerlIO_or_Base_void(PerlIO* f, callback, base, args) either calls the callback of the functions of the layer f with the args, or if there is no such callback, calls the base version of the callback with the same args, or if f is invalid, sets errno to EBADF.

Perl_PerlIO_or_fail_void(PerlIO* f, callback, args) either calls the callback of the functions of the layer f with the args, or if there is no such callback, sets errno to EINVAL. Or if f is invalid, sets errno to EBADF.
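The two dispatch patterns described above ("use the layer's callback or fall back to a base version" and "use the layer's callback or fail") can be sketched as ordinary functions. This is a stand-alone approximation, not the real macros: MLayer, MFuncs, M_VALID, m_eof, m_tell and mbase_eof are hypothetical names, only two table slots are modelled, and -1 is assumed as the failure value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef struct MLayer MLayer;
typedef MLayer *MIO;                  /* plays the role of PerlIO */

typedef struct {
    const char *name;
    int  (*Eof)(MIO *f);              /* may be NULL */
    long (*Tell)(MIO *f);             /* may be NULL */
} MFuncs;

struct MLayer {
    MLayer       *next;
    const MFuncs *tab;
    unsigned      flags;
};

#define M_F_EOF 0x1u
/* Valid means: the handle and the pointer behind it are both non-NULL. */
#define M_VALID(f) ((f) != NULL && *(f) != NULL)

/* "Base class" version of Eof: just looks at the flag bits. */
static int mbase_eof(MIO *f)
{
    return ((*f)->flags & M_F_EOF) != 0;
}

/* "or Base" pattern: layer callback, else base version, else EBADF. */
static int m_eof(MIO *f)
{
    if (M_VALID(f)) {
        const MFuncs *tab = (*f)->tab;
        if (tab != NULL && tab->Eof != NULL)
            return tab->Eof(f);
        return mbase_eof(f);
    }
    errno = EBADF;
    return -1;
}

/* "or fail" pattern: layer callback, else EINVAL, else EBADF. */
static long m_tell(MIO *f)
{
    if (M_VALID(f)) {
        const MFuncs *tab = (*f)->tab;
        if (tab != NULL && tab->Tell != NULL)
            return tab->Tell(f);
        errno = EINVAL;
    } else {
        errno = EBADF;
    }
    return -1;
}

/* Exercises all three outcomes; returns 1 if each behaved as described. */
static int m_dispatch_demo(void)
{
    static const MFuncs lazy = { "lazy", NULL, NULL };  /* implements nothing */
    MLayer layer = { NULL, &lazy, M_F_EOF };
    MIO top = &layer;
    MIO none = NULL;
    int ok = 1;

    ok = ok && m_eof(&top) == 1;                        /* base version used */
    errno = 0;
    ok = ok && m_tell(&top) == -1 && errno == EINVAL;   /* no callback */
    errno = 0;
    ok = ok && m_eof(&none) == -1 && errno == EBADF;    /* invalid handle */
    errno = 0;
    ok = ok && m_tell(NULL) == -1 && errno == EBADF;    /* invalid handle */
    return ok;
}
```

The layer in the demo leaves both slots NULL, so Eof is answered by the base version while Tell fails with EINVAL.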

Implementing PerlIO Layers

If you find the implementation document unclear or not sufficient, look at the existing PerlIO layer implementations, which include:

If you are creating a PerlIO layer, you may want to be lazy, in other words, implement only the methods that interest you. The other methods you can either replace with the "blank" methods

PerlIOBase_noop_ok
PerlIOBase_noop_fail

(which do nothing, and return zero and -1, respectively) or for certain methods you may assume a default behaviour by using a NULL method. The Open method looks for help in the 'parent' layer. The following table summarizes the behaviour:

method      behaviour with NULL
Clearerr    PerlIOBase_clearerr
Close       PerlIOBase_close
Dup         PerlIOBase_dup
Eof         PerlIOBase_eof
Error       PerlIOBase_error
Fileno      PerlIOBase_fileno
Fill        FAILURE
Flush       SUCCESS
Getarg      SUCCESS
Get_base    FAILURE
Get_bufsiz  FAILURE
Get_cnt     FAILURE
Get_ptr     FAILURE
Open        INHERITED
Popped      SUCCESS
Pushed      SUCCESS
Read        PerlIOBase_read
Seek        FAILURE
Set_cnt     FAILURE
Set_ptrcnt  FAILURE
Setlinebuf  PerlIOBase_setlinebuf
Tell        FAILURE
Unread      PerlIOBase_unread
Write       FAILURE

FAILURE        Set errno (to EINVAL in Unixish, to LIB$_INVARG in VMS)
               and return -1 (for numeric return values) or NULL (for
               pointers)
INHERITED      Inherited from the layer below
SUCCESS        Return 0 (for numeric return values) or a pointer

Core Layers

The file perlio.c provides the following layers:

In addition perlio.c also provides a number of PerlIOBase_xxxx() functions which are intended to be used in the table slots of classes which do not need to do anything special for a particular method.

Extension Layers

Layers can be made available by extension modules. When an unknown layer is encountered, the PerlIO code will perform the equivalent of:

use PerlIO 'layer';

Where layer is the unknown layer. PerlIO.pm will then attempt to:

require PerlIO::layer;

If after that process the layer is still not defined then the open will fail.

The following extension layers are bundled with perl:

TODO

Things that need to be done to improve this document.