Do not use entities: Part II

Answered by: Eliot Kimber Last Updated: 2006-09-14

=Question by N.N.=

I don't totally disagree with what you have said. But I do have one question. Your statement:

"But I would say that as a matter of preferred practice, you should not use entities at all."

Are you referring to character entities or all entities? With the requirement in XML that all entities be declared in the local instance, I can somewhat understand not using text entities. That is a 'somewhat'. I can also see the need to use text entities for often repeated or common text.

=Eliot Kimber answered=

All entities. See my earlier posts on this subject for my feelings about external parsed entities (not objects, syntactic not semantic, require DTDs, etc.). Use XInclude or a functional equivalent (e.g., DITA conref) instead.

For character entities, my previous statements stand (not needed with Unicode, require DTDs).

For text replacement I prefer a generic element-based mechanism, which can be described generically as "variable/variable-ref". I've put something like this in all the document types I've designed for the last five years or so.

The basic idea is simple: you have elements that have a name and take a value and other elements that reference those names. Implementing the resolution is trivial in XSLT and pretty easy in ACL.
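As a minimal sketch of the idea above (not the implementation described in this post, which would be XSLT or ACL), the resolution step could look like this in Python. The element names `variable` and `variable-ref`, the `name` attribute, and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: one definition, one reference.
DOC = """<doc>
  <variables>
    <variable name="product-name">WidgetPro</variable>
  </variables>
  <p>Install <variable-ref name="product-name"/> before continuing.</p>
</doc>"""


def resolve_variables(root):
    """Replace each variable-ref element with the text of the variable it names."""
    values = {v.get("name"): (v.text or "") for v in root.iter("variable")}
    for parent in list(root.iter()):
        previous = None  # last child of this parent that was kept
        for child in list(parent):
            if child.tag != "variable-ref":
                previous = child
                continue
            name = child.get("name")
            if name not in values:
                raise KeyError(f"undefined variable: {name}")
            # The value plus whatever text followed the reference element.
            text = values[name] + (child.tail or "")
            if previous is None:
                parent.text = (parent.text or "") + text
            else:
                previous.tail = (previous.tail or "") + text
            parent.remove(child)
    return root


root = resolve_variables(ET.fromstring(DOC))
paragraph_text = "".join(root.find("p").itertext())
print(paragraph_text)
```

Nothing here depends on a DTD or on parser-level substitution: the parser only sees ordinary elements, and the lookup is an application-level step.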

This mechanism has the advantage that it does not depend on the parser or the schema mechanism, so it can be implemented however you like. You can use XSD's key/keyref feature to define more sophisticated reference constraints and scoping, although it usually makes sense to make the variables global over a compound document.

Another feature is that this approach will work for documents that are composed of multiple individual documents (e.g., a compound document constructed using XInclude).

Also, you can mix explicit variables with references to other well-named and well-defined things, such as metadata items. For example, in one document type I had both the generic variable mechanism as well as a "metadata-ref" element that could create references to metadata values stored in the document, such as the product name, product number, or whatever.

In a reuse scenario this approach is the only way to get the following effect: within a reusable module (managed as an independent XML document and referenced with an XInclude-like mechanism) you refer to a field by name (e.g., "product-name"), and the value that shows up in the rendition is a function of the properties of the top-level document in the compound document (i.e., the document that contains the metadata values or defines specific values for the named variables). This mechanism is not syntactic but semantic and will work regardless of whether you have a DTD or not.
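A rough sketch of that reuse effect, under stated assumptions: the module and the two top-level documents below are invented, the element names `variable` and `variable-ref` are hypothetical, and an in-memory loader stands in for a real document management system. The inclusion itself uses Python's standard `xml.etree.ElementInclude`.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical reusable module: it names a field but defines no value for it.
MODULE = (
    "<topic><p>Thank you for buying "
    '<variable-ref name="product-name"/>.</p></topic>'
)


def top_level(product_name):
    """Build a hypothetical top-level document that defines the value."""
    return (
        '<manual xmlns:xi="http://www.w3.org/2001/XInclude">'
        f'<variable name="product-name">{product_name}</variable>'
        '<xi:include href="module.xml"/>'
        "</manual>"
    )


def load_module(href, parse, encoding=None):
    """In-memory loader standing in for a repository lookup."""
    if href == "module.xml" and parse == "xml":
        return ET.fromstring(MODULE)
    raise OSError(f"cannot load {href!r}")


def render(elem, values):
    """Return the text of elem with variable-ref elements resolved."""
    if elem.tag == "variable-ref":
        return values[elem.get("name")]
    parts = [elem.text or ""]
    for child in elem:
        parts.append(render(child, values))
        parts.append(child.tail or "")
    return "".join(parts)


def publish(top_level_xml):
    root = ET.fromstring(top_level_xml)
    ElementInclude.include(root, loader=load_module)  # build the compound document
    # Values come from the top-level document, not from the module.
    values = {v.get("name"): (v.text or "") for v in root.iter("variable")}
    return render(root.find("topic"), values)


first = publish(top_level("WidgetPro"))
second = publish(top_level("GadgetMax"))
print(first)
print(second)
```

The same module text yields a different rendition depending on which top-level document includes it, which an entity declared inside the module could not do.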

One test for any XML use is whether or not it will work in the absence of any DTD or schema. Obviously the use of entities, and of fixed or defaulted attributes, fails this test, since both depend on declarations that only a DTD or schema can supply.
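The test is easy to try with any non-validating parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the two sample paragraphs and the `variable-ref` element are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def parses_without_dtd(xml_text):
    """Return True if the text parses with no DTD or schema available."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Entity references with no declarations anywhere in sight.
with_entities = "<p>&copy; 2006 &product-name;</p>"

# The same content as a literal Unicode character plus a reference element.
with_elements = '<p>\u00a9 2006 <variable-ref name="product-name"/></p>'

entities_ok = parses_without_dtd(with_entities)
elements_ok = parses_without_dtd(with_elements)
print(entities_ok, elements_ok)
```

The entity version is rejected by the parser before any application code runs, while the element version parses and leaves resolution to the application.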

(Editor's comment by Karl Johan Kleist: remaining part added in another posting)

Another feature of this approach is that you can define your document type so that you can put sets of variable definitions in a separate document and then manage that as a normal XML document in your normal XML document management system. By using XInclude-type references to pull in the variables you can quickly swap in one set or another. Or, since it's just a link (and not a syntactic inclusion), your application can be more sophisticated and, for example, use resolution-time parameters to determine the correct set of values to use.
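To illustrate the resolution-time parameter idea, here is a hedged sketch in which the application, not the parser, decides which variable set a link resolves to. The `variables-ref` element, its `set` attribute, the `target` parameter, and the two variable-set documents are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical variable-set documents, each managed as an ordinary XML document
# and keyed here by (set name, resolution-time parameter).
REPOSITORY = {
    ("platform-values", "windows"):
        '<variables><variable name="platform">MS Windows</variable></variables>',
    ("platform-values", "linux"):
        '<variables><variable name="platform">Linux</variable></variables>',
}

DOC = (
    '<doc><variables-ref set="platform-values"/>'
    '<p>This release runs on <variable-ref name="platform"/>.</p></doc>'
)


def load_values(root, target):
    """Follow each variables-ref link, choosing the set that matches target."""
    values = {}
    for link in root.iter("variables-ref"):
        set_doc = ET.fromstring(REPOSITORY[(link.get("set"), target)])
        for v in set_doc.iter("variable"):
            values[v.get("name")] = v.text or ""
    return values


def render(elem, values):
    """Return the text of elem with variable-ref elements resolved."""
    if elem.tag == "variable-ref":
        return values[elem.get("name")]
    parts = [elem.text or ""]
    for child in elem:
        parts.append(render(child, values))
        parts.append(child.tail or "")
    return "".join(parts)


def publish(doc_xml, target):
    root = ET.fromstring(doc_xml)
    return render(root.find("p"), load_values(root, target))


windows_text = publish(DOC, "windows")
linux_text = publish(DOC, "linux")
print(windows_text)
print(linux_text)
```

Because the link is only resolved at publishing time, the source document never changes when a different set of values is swapped in.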

For example, I might allow my variables to be organized like so:  platform MS Windows ... 