19 SGML reference information for HTML

The following sections contain the formal SGML definition of HTML 4.0. It includes the SGML declaration, the Document Type Definition (DTD), and the character entity references, as well as a sample SGML catalog.

These files are also available in ASCII format:

Default DTD:
http://www.w3.org/TR/REC-html40/strict.dtd
Transitional DTD:
http://www.w3.org/TR/REC-html40/loose.dtd
Frameset DTD:
http://www.w3.org/TR/REC-html40/frameset.dtd
SGML Declaration:
http://www.w3.org/TR/REC-html40/HTML4.decl
Entity definition files:
http://www.w3.org/TR/REC-html40/HTMLspecial.ent
http://www.w3.org/TR/REC-html40/HTMLsymbol.ent
http://www.w3.org/TR/REC-html40/HTMLlat1.ent
Sample catalog:
http://www.w3.org/TR/REC-html40/HTML4.cat

19.1 Document Validation

Many authors rely on a limited set of browsers to check their documents, assuming that if the browsers can render a document, it is valid. Unfortunately, this is a very ineffective way to verify a document's validity, precisely because browsers are designed to cope with invalid documents by rendering them as well as they can.

For more reliable validation, check your documents with an SGML parser such as nsgmls (see [SP]) to verify that they conform to the HTML 4.0 DTD. If the document type declaration of your document includes a URI and your SGML parser supports this type of system identifier, the parser will retrieve the DTD directly. Otherwise, you can use the sample SGML catalog in section 19.2. That catalog assumes that the DTD has been saved as the file "strict.dtd" and that the entity definitions are in the files "HTMLlat1.ent", "HTMLsymbol.ent" and "HTMLspecial.ent". In any case, make sure that your SGML parser supports Unicode. See the documentation of your validation tool for further details.
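The choice between fetching the DTD directly and falling back to a catalog depends on whether the document type declaration carries a URI. The following Python sketch illustrates that decision; it is not part of any validator, the function names read_doctype and needs_catalog are hypothetical, and the regular expression only covers the simple quoted form of the declaration.

```python
import re

# Matches declarations of the form
#   <!DOCTYPE HTML PUBLIC "public identifier" "optional system identifier">
# Assumption: both identifiers are written in double quotes.
DOCTYPE = re.compile(
    r'<!DOCTYPE\s+HTML\s+PUBLIC\s+"([^"]*)"(?:\s+"([^"]*)")?\s*>',
    re.IGNORECASE,
)


def read_doctype(html_text):
    """Return (public_id, system_id), or None if no declaration is found.

    system_id is None when the declaration gives no URI.
    """
    match = DOCTYPE.search(html_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


def needs_catalog(html_text):
    """True when a declaration exists but names no URI for the DTD."""
    doctype = read_doctype(html_text)
    return doctype is not None and doctype[1] is None
```

A parser that cannot resolve URI system identifiers needs the catalog even when needs_catalog returns False. With the SP tools installed, a typical invocation would be "nsgmls -s -c HTML4.cat document.html", where -s suppresses the normal output so that only error messages appear and -c names the catalog file; document.html is a placeholder, and the options should be confirmed against the documentation of your SP version.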

Be aware that such validation, although useful and highly recommended, does not guarantee that a document fully conforms to the HTML 4.0 specification. This is because an SGML parser relies solely on the given SGML DTD, which does not express all aspects of a valid HTML 4.0 document. Specifically, an SGML parser ensures that the syntax, the structure, the list of elements, and their attributes are valid. It cannot, however, catch errors such as setting the width attribute of an IMG element to an invalid value (for example, "foo" or "12.5"). Although the specification restricts the value of this attribute to an "integer representing a length in pixels", the DTD defines it only as CDATA, which allows any value. Only a specialized program could check full conformance to HTML 4.0.
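As an illustration of the kind of check a DTD cannot express, the following Python sketch scans a document for IMG elements whose width attribute is not a plain integer. It uses the standard html.parser module; the names ImgWidthChecker and check_img_widths are hypothetical, and the rule "digits only" is an assumption that mirrors the restriction quoted above.

```python
import re
from html.parser import HTMLParser

# Assumption: a valid width is a non-empty string of decimal digits.
PIXELS = re.compile(r"[0-9]+")


class ImgWidthChecker(HTMLParser):
    """Collects (line, value) pairs for IMG width values that are not integers."""

    def __init__(self):
        super().__init__()
        self.errors = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag != "img":
            return
        for name, value in attrs:
            if name == "width" and (value is None or not PIXELS.fullmatch(value)):
                line, _column = self.getpos()
                self.errors.append((line, value))


def check_img_widths(html_text):
    """Return the list of invalid IMG width values found in html_text."""
    checker = ImgWidthChecker()
    checker.feed(html_text)
    checker.close()
    return checker.errors
```

Run against a document containing width="120", width="foo" and width="12.5", the checker reports the last two, both of which an SGML parser would accept as CDATA. A looser or stricter rule only requires changing the PIXELS pattern.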

Nevertheless, this type of validation is still strongly recommended, since it detects a wide range of errors that make documents invalid.

19.2 Sample SGML catalog

This catalog includes the override directive so that processing software such as nsgmls gives public identifiers priority over system identifiers. This means that users do not have to be connected to the Web when processing documents whose system identifiers are URIs.

  OVERRIDE YES
  PUBLIC "-//W3C//DTD HTML 4.0//EN" strict.dtd
  PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" loose.dtd
  PUBLIC "-//W3C//DTD HTML 4.0 Frameset//EN" frameset.dtd
  PUBLIC "-//W3C//ENTITIES Latin1//EN//HTML" HTMLlat1.ent
  PUBLIC "-//W3C//ENTITIES Special//EN//HTML" HTMLspecial.ent
  PUBLIC "-//W3C//ENTITIES Symbols//EN//HTML" HTMLsymbol.ent
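
The effect of the override directive can be sketched in Python as follows. This is a deliberately simplified model, not how nsgmls is implemented: it treats OVERRIDE as a single global flag, handles only OVERRIDE and PUBLIC entries, and ignores catalog comments. The names parse_catalog and resolve are hypothetical.

```python
import shlex

SAMPLE_CATALOG = '''\
OVERRIDE YES
PUBLIC "-//W3C//DTD HTML 4.0//EN" strict.dtd
PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" loose.dtd
PUBLIC "-//W3C//DTD HTML 4.0 Frameset//EN" frameset.dtd
PUBLIC "-//W3C//ENTITIES Latin1//EN//HTML" HTMLlat1.ent
PUBLIC "-//W3C//ENTITIES Special//EN//HTML" HTMLspecial.ent
PUBLIC "-//W3C//ENTITIES Symbols//EN//HTML" HTMLsymbol.ent
'''


def parse_catalog(text):
    """Return (override, public_map) for a catalog of OVERRIDE and PUBLIC entries."""
    override = False
    public_map = {}
    for line in text.splitlines():
        # shlex keeps a double-quoted public identifier together as one token.
        tokens = shlex.split(line)
        if not tokens:
            continue
        keyword = tokens[0].upper()
        if keyword == "OVERRIDE" and len(tokens) == 2:
            override = tokens[1].upper() == "YES"
        elif keyword == "PUBLIC" and len(tokens) == 3:
            public_map[tokens[1]] = tokens[2]
    return override, public_map


def resolve(public_id, system_id, override, public_map):
    """Pick the file or URI to read for an external identifier.

    Without override, an explicit system identifier wins. With override,
    a matching PUBLIC entry wins and the URI is never fetched.
    """
    if system_id is not None and not override:
        return system_id
    return public_map.get(public_id, system_id)
```

With the sample catalog, resolving "-//W3C//DTD HTML 4.0//EN" yields the local file strict.dtd even when the declaration also gives the URI http://www.w3.org/TR/REC-html40/strict.dtd, which is why no network connection is needed.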