Record details
Document title
A formal language model for parsing SGML
Author(s)
MATZEN R. W. ; GEORGE K. M. ; HEDRICK G. E.
Abstract
The Standard Generalized Markup Language (SGML) is an international standard for document definition (ISO 8879) that was adopted in 1986 and is rapidly gaining acceptance in industry and government. It is a meta-language system for document design rather than a specific scheme for document processing; almost any kind of document can be described using SGML. Productions called element declarations are used to define arbitrary elements of documents and the context in which they can occur. A finite set of element declarations called a document type definition (DTD) defines the high-level syntax of a set of documents. DTDs are similar to context-free grammars, but the productions are more complex. The standard does not describe a formal language model for SGML, and there is little work in the literature on this topic. This article defines a formal language model for SGML: systems of finite automata from systems of regular expressions. This model is applied in two ways: a parser is constructed for DTDs, and methods are shown for automatically constructing parsers for the documents defined by a DTD. These methods for parsing SGML are new, and they include features of DTDs that have not previously been included in a static language model. The model applies directly to the syntactic constructs of SGML, and thus, the methods shown in this article have distinct advantages for parsing SGML over traditional context-free parsing methods.
Publisher
Elsevier Science
Identifier
ISSN: 0164-1212; CODEN: JSSODM
Source
The Journal of Systems and Software, 1997, vol. 36, no. 2, pp. 147-166 [bibliography: 17 refs.]
Language
English
For members of the CNRS community, this document may be reproduced free of charge.
For members of communities outside the CNRS, paid reproduction of this document will be provided subject to authorization from the Centre Français d’exploitation du droit de Copie.
