This is the Qpy package.

Qpy provides a convenient mechanism for generating safely-quoted XML
text from Python code.  It does this by implementing a quoted-no-more
string data type and a slight modification of the Python compiler.
(The main idea comes from Quixote's htmltext/PTL.)

Quoting
-------

XML reserves 5 characters ('<', '>', '&', quote and apostrophe) so
that they can be used as markup delimiters. When a document needs to
use these characters for some other purpose, they must be escaped,
that is, replaced by an equivalent entity or character
reference. This package defines an xml_quote() function that, for a string
argument, returns a string with these 5 characters replaced by equivalents:
for example, '<' becomes '&lt;'.
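
As a rough illustration of the escaping described above, here is a
minimal sketch in plain Python.  The name sketch_quote and the
particular replacement strings are assumptions made for this example.
They are not taken from Qpy's source, and the real xml_quote() returns
an xml instance rather than a plain str.

```python
# Illustrative stand-in only; this is not Qpy's implementation.
# The entity and character references chosen here are assumptions.
_REPLACEMENTS = str.maketrans({
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;',
})


def sketch_quote(text):
    """Return text with the 5 reserved XML characters escaped."""
    # translate() makes a single pass, so the '&' inside a replacement
    # is never escaped a second time.
    return text.translate(_REPLACEMENTS)


print(sketch_quote('<a href="x">Tom & Jerry\'s</a>'))
```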

When assembling an XML (or similar markup such as HTML) document, it
is important to remember to quote everything that should be quoted,
such as text that comes from a database or some (untrusted) outside
source.  In the case of web pages, underquoting is dangerous, as it
leaves the door open for cross-site scripting and other attacks.

It would be nice if you could assemble your document as a string and
then call xml_quote() on it at the end, just to make sure that everything
was quoted, but this generally results in over-quoting, where you lose
the intended markup structure.  For web pages, over-quoting produces a
result that is ugly, but much safer than the underquoted alternative.
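
The difference between underquoting and over-quoting is easy to
reproduce.  The sketch below uses html.escape() from the Python
standard library as a stand-in for xml_quote() (an assumption for
illustration; it is not part of Qpy), and user_input is a made-up
value.

```python
from html import escape  # standard-library stand-in for xml_quote()

user_input = '<script>alert(1)</script>'  # hypothetical untrusted text

# Underquoting: the untrusted text is pasted in as live markup.
underquoted = '<p>' + user_input + '</p>'

# Quoting the finished document: safe, but the intended <p> markup is lost.
overquoted = escape(underquoted)

# Quoting only the untrusted part keeps the markup and neutralizes the input.
correct = '<p>' + escape(user_input) + '</p>'

print(overquoted)
print(correct)
```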

Programs that produce XML documents must keep track of what has
already been quoted and what has not, and mistakes are common.  Our
objective is to make quoting errors rare, especially underquoting
errors.

The Quoted-No-More Class: xml
-----------------------------

Our xml_quote() function always returns an xml instance.  The class named
"xml" is a subclass of Python's unicode string class.  An instance of
xml is a string that is known to need no more XML quoting.  When the
xml_quote() function gets an xml instance as an argument, it just returns
the instance immediately, without any changes.  When the xml_quote()
function gets None as an argument, it always returns an empty xml
instance.  All other arguments to xml_quote() are converted to unicode
strings and then the reserved characters are escaped to produce the
resulting xml instance.
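
The three rules above can be sketched with a small stand-in.  The names
SketchXml and sketch_xml_quote are hypothetical, and html.escape() is
used for the escaping step as an assumption, so the exact replacement
text may differ from what Qpy produces.

```python
from html import escape  # stand-in escaping; Qpy's exact output may differ


class SketchXml(str):
    """Hypothetical stand-in for Qpy's xml class: a str that is
    known to need no more quoting."""


def sketch_xml_quote(value):
    """Apply the three rules described for xml_quote()."""
    if isinstance(value, SketchXml):
        return value                      # already quoted: returned unchanged
    if value is None:
        return SketchXml('')              # None becomes an empty instance
    return SketchXml(escape(str(value)))  # otherwise: convert, then escape


once = sketch_xml_quote('a < b')
twice = sketch_xml_quote(once)  # quoting a second time changes nothing
print(once, twice is once)
```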

The xml class defines some methods that make it easy to build
quoted documents.

When an xml instance is combined with another object using the '+'
operator, the result is the xml instance formed by concatenating the
quoted operands.  The value of the expression

    xml('<x>') + '<'

is equal to the value of 

    xml('<x>&lt;')

When an xml instance is used as a format string with the '%' operator,
the (non-number) arguments to the format string are quoted as they are
used.  

The xml class includes a join() method that quotes the items
in the sequence before joining them.  The common case of using an
empty xml instance to join a sequence is implemented in the join_xml()
function.  The join_str() function acts the same way, except that 
it does not escape any characters.  
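
To make the '+', '%' and join() behavior concrete, here is a sketch
built on a stand-in class.  SketchXml, sketch_quote and sketch_join_xml
are hypothetical names, html.escape() supplies the escaping, and the
reflected-add rule in __radd__ is an assumption.  This is not how Qpy
itself is implemented.

```python
from html import escape  # stand-in escaping; Qpy's exact output may differ


def sketch_quote(value):
    """Stand-in for xml_quote()."""
    if isinstance(value, SketchXml):
        return value
    if value is None:
        return SketchXml('')
    return SketchXml(escape(str(value)))


class SketchXml(str):
    """Hypothetical stand-in for Qpy's xml class, not the real one."""

    def __add__(self, other):
        # Quote the other operand, then concatenate.
        return SketchXml(str.__add__(self, sketch_quote(other)))

    def __radd__(self, other):
        # Assumption: the same rule applies when the xml value is on the right.
        return SketchXml(str.__add__(sketch_quote(other), self))

    def __mod__(self, args):
        # Only single values and tuples are handled in this sketch.
        if not isinstance(args, tuple):
            args = (args,)
        quoted = tuple(
            arg if isinstance(arg, (int, float)) else sketch_quote(arg)
            for arg in args)
        return SketchXml(str.__mod__(self, quoted))

    def join(self, items):
        # Quote every item before joining.
        return SketchXml(str.join(self, [sketch_quote(i) for i in items]))


def sketch_join_xml(items):
    """Join with an empty separator, as join_xml() is described to do."""
    return SketchXml('').join(items)


print(SketchXml('<x>') + '<')
print(SketchXml('<b>%s</b> %d') % ('a&b', 3))
print(sketch_join_xml(['<', SketchXml('<br />'), None]))
```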

The Qpy Compiler
----------------

The Qpy compiler is a Python compiler with an added preprocessor that
can best be understood as a source-code transformation.  The
transformation is limited to the definitions of certain functions we
call "templates".  An xml template is designated in qpy source code
by ":xml" just after the function name in the function's definition.
For example, this is an xml template:

    def f:xml(x):
        "<div>"
        x
        "</div>"

The Qpy preprocessor essentially replaces this by:

    from qpy import xml as _qpy_xml, join_xml as _qpy_join_xml
    def f(x):
        qpy_accumulation = []
        qpy_append = qpy_accumulation.append
        qpy_append(_qpy_xml("<div>"))
        qpy_append(x)
        qpy_append(_qpy_xml("</div>"))
        return _qpy_join_xml(qpy_accumulation)

There are two main things going on here.  One is that every
string-literal in the body of the function is wrapped by the xml
constructor.  The assumption is that a literal string, provided by the
programmer, does not need any more quoting.  The other part of the
conversion is that expression values are accumulated on a local list,
and the default return value is the xml instance formed by
concatenating these values, after quoting them.

The values returned by f are xml instances, and here are some samples:

    f(None)          -> "<div></div>"              # None becomes ''.
    f("<hr />")      -> "<div>&lt;hr /&gt;</div>"  # Quoting happens.
    f(1)             -> "<div>1</div>"             # Converted.
    f(xml("<hr />")) -> "<div><hr /></div>"        # Already quoted.
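
The samples can be reproduced without Qpy by writing the expanded
function by hand against a stand-in.  SketchXml, sketch_quote and
sketch_join_xml are hypothetical names, and html.escape() is assumed
for the escaping step.  The sketch only mirrors the behavior described
here.

```python
from html import escape  # stand-in escaping; Qpy's exact output may differ


class SketchXml(str):
    """Hypothetical stand-in for Qpy's xml class."""


def sketch_quote(value):
    """Stand-in for xml_quote()."""
    if isinstance(value, SketchXml):
        return value
    if value is None:
        return SketchXml('')
    return SketchXml(escape(str(value)))


def sketch_join_xml(items):
    """Quote each item, then concatenate into one SketchXml value."""
    return SketchXml(''.join(sketch_quote(item) for item in items))


def f(x):
    # Hand-written equivalent of the expanded template.
    accumulation = []
    append = accumulation.append
    append(SketchXml("<div>"))   # literal: trusted, wrapped as-is
    append(x)                    # expression: quoted when joined
    append(SketchXml("</div>"))
    return sketch_join_xml(accumulation)


for sample in (None, "<hr />", 1, SketchXml("<hr />")):
    print(repr(sample), '->', f(sample))
```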

The nice thing about this is that the expressions appearing in a
template, possibly including values provided from outside sources,
will always be quoted unless they are already instances of the xml
class.  If the programmer makes a mistake with respect to quoting,
it will very likely appear as over-quoting instead of lurking as
a security problem.

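Templates can't have normal Python docstrings after the arguments,
because a string literal in that position is accumulated as output like
any other expression statement: we just use comments.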

A template may also be designated by ":str" instead of ":xml"
after the function name.  The difference is that a str
template accumulates the values of expression statements and
returns the join_str() of the list, with no XML-quoting.
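
For comparison, a str template g with the same body as f above would
behave roughly like the hand-expanded sketch below.  sketch_join_str is
a hypothetical stand-in that converts each item with str().  How Qpy's
join_str() treats None or other special values is not covered here.

```python
def sketch_join_str(items):
    """Hypothetical stand-in for join_str(): concatenate without escaping."""
    # Assumption: each item is simply converted with str().
    return ''.join(str(item) for item in items)


def g(x):
    # Hand-written equivalent of an expanded str template.
    accumulation = []
    append = accumulation.append
    append("<div>")
    append(x)           # no quoting is applied to this value
    append("</div>")
    return sketch_join_str(accumulation)


print(g("<hr />"))
print(g(1))
```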

Templates can be nested arbitrarily along with other functions.  A
template's code transformation does not apply inside ordinary functions
that are defined inside the template body.


Using Qpy
---------

Source code files that include templates should be named with a ".qpy"
suffix and placed in a python package directory.  The package
__init__.py should contain the following lines to make sure that the
compiled versions of the qpy modules are up-to-date:

     from qpy.compile import compile_qpy_files
     compile_qpy_files(__path__[0])


The qpcheck.py Utility
----------------------

This package also includes qpcheck.py, a script that looks for unknown
names and unused imports in directories containing Python and Qpy source code.


Installation
------------

If you have been using a previous version of QPY, remove all
".pyc" files that have been compiled from ".qpy" files.

To install:

    python setup.py install

or, to build the extension in place:

    python setup.py build_ext -i

and then put this directory on your Python path.


Example
-------

An "example" package is included.  To try it, install as described
above, start a Python interpreter, and import the
"qpy.example.example1" module.  The real purpose of the example is to
show a sample package, its __init__.py, and a Qpy module.


Content-in-Code Instead of Code-in-Content
------------------------------------------

Most "template" systems are designed to embed program-like
value-substitution and control flow into what would otherwise be
static content.  Qpy (like Quixote's PTL templates) uses the opposite
pattern, embedding static content in what would otherwise be an
ordinary program.  This program-centric pattern is especially
attractive when the content maintenance team is the same as the
programming team.


Notes for Quixote Users
-----------------------

The basic idea for Qpy comes from Quixote.

In Qpy, quoting also quotes the apostrophe character.

The xml class is like Quixote's htmltext class.

Unlike htmltext instances, xml instances can be pickled.

Most .ptl files work without changes with Qpy.  Qpy treats
PTL's html templates as xml templates, and PTL's plain templates
as str templates.

Qpy doesn't use ihooks or any other kind of import hook.

Notes for Users of Previous Versions of QPY
-------------------------------------------

Use xml in place of h8.  The name "h8" is deprecated.

The older syntax for templates still works, but it is deprecated.
The html_escape_string function is not present in this release.

The u8 class is not really present in this release because Python 3
makes it unnecessary.   The name "u8" is deprecated.

There is no class method for quoting.  Use xml_quote() instead.

The stringify() function is still present, but deprecated.

Copyright
---------

Copyright (c) Corporation for National Research Initiatives 2009.
