Wednesday, March 19, 2008

Print services

Computer science — or informatics, as it is called more appropriately in Europe — has a less linear progress history than other technologies. Indeed, many a breakthrough technology was forgotten only to be reinvented several decades later. I had already posted on concurrent programming (in the comments) and color encoding.

For example, the idea of punching the octal codes of a program on a paper tape instead of toggling it in every time on the console switches was so straightforward it got quickly adopted. But already the idea of using an assembler or compiler to generate the octal codes from a formal language took a bit longer to sink in.

Back in the Sixties and Seventies, when computer users were debating whether 96-column punch cards were better than 80-column punch cards, computer scientists were busy inventing tools to make their professional life easier by using computer technology. However, the semantic gap between what they were doing and the reality of punch cards was so big that many, if not most, of their ideas did not make it into the real world, only to be reinvented thirty years later.

One hot topic at that time was distributed computing. Long before protocols like TCP/IP, HTTP, etc. were invented, things were harder to do and had to happen a step at a time. An example was Grapevine, a multicomputer system on the Xerox research internet. It provided facilities for the delivery of digital messages such as computer mail; for naming people, machines, and services; for authenticating people and machines; and for locating services on the internet. You can read about it in Andrew D. Birrell, Roy Levin, Roger M. Needham, and Michael D. Schroeder, Grapevine: an exercise in distributed computing, Communications of the ACM, Volume 25, Issue 4 (April 1982), Pages: 260-274.

Once we can exchange digital messages and name entities, we can call procedures or invoke methods on a different machine. As we can read in the first paragraph of Andrew D. Birrell and Bruce Jay Nelson, Implementing remote procedure calls, ACM Transactions on Computer Systems, Volume 2, Issue 1 (February 1984), Pages: 39-59,

The idea of remote procedure calls (hereinafter called RPC) is quite simple. It is based on the observation that procedure calls are a well-known and well-understood mechanism for transfer of control and data within a program running on a single computer. Therefore, it is proposed that this same mechanism be extended to provide for transfer of control and data across a communication network. When a remote procedure is invoked, the calling environment is suspended, the parameters are passed across the network to the environment where the procedure is to execute (which we will refer to as the callee), and the desired procedure is executed there. When the procedure finishes and produces its results, the results are passed back to the calling environment, where execution resumes as if returning from a simple single-machine call.

The components of the RPC system, and their interactions for a simple call

What was powerful in the Cedar implementation of RPC described in this paper was that it came with a program called Lupine, which automatically generated the user and server stubs to marshal and unmarshal the procedure parameters into messages. Lupine was so powerful that even a dummy like me could implement a distributed service in an afternoon.
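To make the role of the stubs concrete, here is a minimal sketch in Python. It is not Lupine's output and not the Cedar wire format: JSON stands in for the message encoding, an ordinary function call stands in for the network, and every name (`add`, `PROCEDURES`, `transport`, `make_user_stub`) is made up for illustration.

```python
import json

# Hypothetical procedure living on the "server"; the name is illustrative only.
def add(a, b):
    return a + b

# Table the server stub uses to find the procedure named in a call message.
PROCEDURES = {"add": add}

def server_stub(request_bytes):
    """Unmarshal a call message, run the procedure, marshal the result."""
    message = json.loads(request_bytes.decode("utf-8"))
    procedure = PROCEDURES[message["proc"]]
    result = procedure(*message["args"])
    return json.dumps({"result": result}).encode("utf-8")

def transport(request_bytes):
    """Stand-in for the network: hands the request bytes to the server stub."""
    return server_stub(request_bytes)

def make_user_stub(proc_name):
    """Build a user stub, the kind of code a stub generator would emit."""
    def stub(*args):
        # Marshal the procedure name and parameters into a message.
        request = json.dumps({"proc": proc_name, "args": list(args)}).encode("utf-8")
        # The caller is suspended here until the reply comes back.
        reply = transport(request)
        # Unmarshal the result and return it like a local call would.
        return json.loads(reply.decode("utf-8"))["result"]
    return stub

remote_add = make_user_stub("add")
print(remote_add(2, 3))
```

The caller only ever sees `remote_add`, which behaves like a local procedure. Swapping `transport` for a real network connection would not change the calling code, which is the point of generating the stubs automatically.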

It is not that Xerox did not try to productize this technology. Indeed, it created a product version of RPC called Courier and built a whole network systems architecture on this foundation. As an example, let us look at how the first print service product evolved from a research effort.

In the early days of personal computers (PC), printing was very cumbersome. It entailed powering down the PC, carrying the disk to the printer room and inserting it into the PC controlling the printer and booting it up, and finally printing. At the end the printer controller had to be powered down, the disk transferred to the original PC, which could then be booted up again.

In the mid-1970s this led to the invention of the Ethernet for connecting a PC to a printer's controller and the development of protocols to transfer data and control over the Ethernet. The basic concept underlying these protocols was RPC. The main protocol was the PARC Universal Packet (PUP).

In the late 1970s Xerox released a commercial version of this architecture, meeting the most stringent Federal requirements, under the name Xerox Network Systems (XNS). XNS supported a large number of services, including name, authentication, gateway, and time services, as well as filing, mailing, printing, and scanning. Like PUP, all XNS protocols were based on RPC, specifically Courier. The experience with PUP and XNS later helped TCP/IP to be developed rapidly.

XNS clients use the Printing Protocol to cause documents to be printed on a Print Service. The Printing Protocol model assumes an abstract printer service which has three distinct processing phases: spooling, formatting, and marking.

A client requests service and, if the Print Service is able to grant the request, the client is given a print request identifier. The Print Service provides status of the job, which the client can request via the identifier, as well as a capabilities ticket (Properties).

The print request consists of a list of links to the documents to be printed, as well as a request ticket (Options). The Printing Protocol includes all security requirements of the Government and a priority. The documents are transferred with the Bulk Data Transfer Protocol. The Authentication Protocol is used for security and the Time Protocol is used to manage time.
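The request flow described above can be sketched as a small in-memory model. This is only an illustration of the three phases and the request identifier, not the actual Printing Protocol: the class, the method names, the option keys, and the contents of the properties ticket are all assumptions made for the example, and the real Courier procedures, authentication, and bulk data transfer are left out.

```python
from enum import Enum
from itertools import count

class Phase(Enum):
    """The three processing phases of the abstract printer, plus a final state."""
    SPOOLING = 1
    FORMATTING = 2
    MARKING = 3
    DONE = 4

class PrintService:
    """Toy model of a print service; all names here are hypothetical."""

    def __init__(self, accepting=True):
        self._accepting = accepting
        self._ids = count(1)   # source of print request identifiers
        self._jobs = {}        # request identifier -> job record

    def properties(self):
        """Capabilities ticket (Properties); the contents are made up."""
        return {"media": ["letter"], "two_sided": False}

    def print_request(self, document_links, options):
        """Accept a request if possible and return its identifier."""
        if not self._accepting:
            raise RuntimeError("print service cannot grant the request")
        request_id = next(self._ids)
        self._jobs[request_id] = {
            "documents": list(document_links),
            "options": dict(options),   # request ticket (Options)
            "phase": Phase.SPOOLING,
        }
        return request_id

    def status(self, request_id):
        """Status of a job, looked up by its request identifier."""
        return self._jobs[request_id]["phase"]

    def advance(self, request_id):
        """Move a job to its next phase (simulates the service doing work)."""
        job = self._jobs[request_id]
        if job["phase"] is not Phase.DONE:
            job["phase"] = Phase(job["phase"].value + 1)
        return job["phase"]

service = PrintService()
rid = service.print_request(["doc-1"], {"priority": "normal"})
print(rid, service.status(rid))
```

The client holds on to nothing but the identifier. Everything else (the queue, the phase a job is in) stays on the service side and is queried through it.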

In XNS the documents have to be in the Interpress page description language, which can be regarded as a precursor of PDF. An important feature of Interpress is that all pages are independent and can be processed independently in any order, as most suitable for the printer.

XNS Print Service architecture

An important feature of XNS Print Services is to assure that a document printed by different printers will look the same and have the same consistent high quality. To achieve this, the XNS architecture specifies the Print Service Integration Standard (PSIS). The PSIS defines the base case to which all XNS Print Services must adhere to assure compatibility. The principal areas addressed by PSIS are: Interpress level, character encoding, naming syntax, font usage, file usage, minimal service provisions, color encoding, and Printing Protocol usage (exception handling).