Goals
After today's session, the successful learner will be able to:
- make an external DTD
- use the DTD to learn more about elements and attributes
- validate an XML document against a DTD
- define and use a namespace
- understand the role of an XML schema
Preliminaries
- discussion of freeware and commercial editors
- xmlSpy - screenshot - mindshare, $200/user, email registration
- Xed - screenshot - unix port, primitive, free
- XML Notepad - class input?
- other class experiences
- XML document design - 'loser pet shop' example
- cats
- dogs
- wetPets
- herpPets
- smallPets
Some common elements: breed -> name, price, numberOfLegs, suggestedFood?
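A minimal sketch of how the pet shop document might be laid out. The root element name petShop and all values are invented for illustration, and reading the "?" after suggestedFood as "optional" is an assumption rather than something the outline states.

```xml
<?xml version="1.0"?>
<!-- Hypothetical pet shop document; names and values are illustrative only -->
<petShop>
  <dogs>
    <breed>
      <name>Beagle</name>
      <price>250.00</price>
      <numberOfLegs>4</numberOfLegs>
      <suggestedFood>kibble</suggestedFood>
    </breed>
  </dogs>
  <wetPets>
    <breed>
      <name>Goldfish</name>
      <price>2.50</price>
      <numberOfLegs>0</numberOfLegs>
      <!-- suggestedFood left out here, assuming it is optional -->
    </breed>
  </wetPets>
</petShop>
```

The same breed structure repeats under every category, which is what makes it a good candidate for a single shared declaration.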
External DTDs nn|332+
- external DTDs are reusable
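- a simple example of an XML document with an external DTD
- an external DTD is referenced with a SYSTEM identifier (a file or URI you name yourself) or a PUBLIC identifier (a published, well-known DTD name, normally followed by a fallback URI).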
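A small sketch of the two files involved, assuming the file name petshop.dtd and a cut-down petShop vocabulary (both hypothetical). The first block is the external DTD, the second is a document that points at it with a SYSTEM identifier.

```xml
<!-- petshop.dtd: hypothetical external DTD -->
<!ELEMENT petShop (breed+)>
<!ELEMENT breed (name, price)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT price (#PCDATA)>
```

```xml
<?xml version="1.0"?>
<!-- petshop.xml: the DOCTYPE names the root element and locates the DTD -->
<!DOCTYPE petShop SYSTEM "petshop.dtd">
<petShop>
  <breed>
    <name>Beagle</name>
    <price>250.00</price>
  </breed>
</petShop>
```

A PUBLIC identifier adds a well-known name in front of the location. XHTML 1.0 Strict is a familiar case: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">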
- Declaring elements (nn|339):
- #PCDATA (parsed character data): text not contained in a child element. nn|340-41
- general format: <!ELEMENT name content> (p.57|338)
- content = content-model | EMPTY | ANY.
- Child elements that appear under ANY must still be declared... (handy for works in progress)
- a content model is like a "regex" (regular expression)
- Order (aka "Sequences" nn|341):
(foo, bar, baz) element foo THEN element bar THEN element baz
- Alternation (aka "choice" nn|344):
(foo | bar | baz) element foo OR element bar OR element baz
- Grouping
( (foo|bar),(dag|nabbit) ) nn|341
- Repetition (aka "Occurrence indicators","cardinality" nn|345):
- ? = 0 or 1 instance of the element
- + = 1 or more instances of the element
- * = 0 or more instances of the element
(foo+, bar?, baz*) at least one instance of element foo THEN one or no instances of element bar, THEN any number (or no) instances of element baz.
- mixed-content-models nn|340,342:
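(#PCDATA | foo | bar | baz)* parsed text and elements foo, bar and baz, in any order and any number of times (including none).
Note that elements mixing text with child elements are restricted to this form: #PCDATA first, alternatives separated by |, and a trailing * on the group.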
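The content models above can be written out as real declarations. This is a sketch only: the wrapper element names (seqExample and so on) are made up here, while foo, bar, baz, dag and nabbit are the placeholder names from the outline.

```xml
<!-- content-models.dtd: hypothetical wrapper names around the outline's placeholders -->

<!-- Sequence: foo THEN bar THEN baz, each exactly once -->
<!ELEMENT seqExample (foo, bar, baz)>

<!-- Choice: exactly one of foo, bar or baz -->
<!ELEMENT choiceExample (foo | bar | baz)>

<!-- Grouping: (foo or bar) THEN (dag or nabbit) -->
<!ELEMENT groupExample ((foo | bar), (dag | nabbit))>

<!-- Occurrence: one or more foo, an optional bar, any number of baz -->
<!ELEMENT repeatExample (foo+, bar?, baz*)>

<!-- Mixed content: text and foo/bar/baz in any order, any number of times -->
<!ELEMENT mixedExample (#PCDATA | foo | bar | baz)*>

<!-- Leaf elements used above -->
<!ELEMENT foo (#PCDATA)>
<!ELEMENT bar EMPTY>
<!ELEMENT baz (#PCDATA)>
<!ELEMENT dag EMPTY>
<!ELEMENT nabbit EMPTY>
```

Against these declarations, <repeatExample><foo>a</foo><foo>b</foo><baz>c</baz></repeatExample> is valid, while <repeatExample><bar/><foo>a</foo></repeatExample> is not, because bar appears before the required foo.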
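- can combine internal and external subsets -- the internal subset is read first, so its entity and attribute declarations override the external ones (an element may still be declared only once)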
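A sketch of an internal subset taking precedence over an external one, assuming a hypothetical file shop.dtd and an entity named shopName. The first block is the external DTD; the second is the document, whose internal subset sits in square brackets inside the DOCTYPE.

```xml
<!-- shop.dtd: hypothetical external subset -->
<!ELEMENT shop (#PCDATA)>
<!ENTITY shopName "Generic Pet Shop">
```

```xml
<?xml version="1.0"?>
<!DOCTYPE shop SYSTEM "shop.dtd" [
  <!-- Internal subset: read before shop.dtd, so this declaration wins -->
  <!ENTITY shopName "Loser Pet Shop">
]>
<shop>Welcome to &shopName;</shop>
```

The first declaration of an entity is the binding one, so the text expands to "Welcome to Loser Pet Shop". Repeating the <!ELEMENT shop ...> declaration in the internal subset would not override anything; it would make the document invalid.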
- Ambiguity example: pp60-61|??
- difficult to specify exact counts (e.g. "exactly 3" or "2 to 4") with only the ?, + and * indicators
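A sketch of both problems, using the placeholder elements foo, bar and baz. The ambiguous model shown is a standard textbook illustration and is not necessarily the example on the cited pages; the wrapper names (good, triple, twoToFour) are hypothetical.

```xml
<!-- counts.dtd: hypothetical declarations -->

<!-- Ambiguous: on seeing foo, the parser cannot tell which branch it is in.
     XML 1.0 asks for deterministic content models, and some parsers
     report an error for a declaration like this one:
     <!ELEMENT bad ((foo, bar) | (foo, baz))>
-->

<!-- Same set of documents, rewritten so the leading foo is unambiguous -->
<!ELEMENT good (foo, (bar | baz))>

<!-- Exactly three foo: there is no count shorthand, so spell it out -->
<!ELEMENT triple (foo, foo, foo)>

<!-- Two to four foo: nest the optional tail to keep the model deterministic -->
<!ELEMENT twoToFour (foo, foo, (foo, foo?)?)>

<!ELEMENT foo EMPTY>
<!ELEMENT bar EMPTY>
<!ELEMENT baz EMPTY>
```

The flatter (foo, foo, foo?, foo?) looks equivalent for "two to four", but a third foo could match either optional slot, which makes it ambiguous again.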
- Hands-on:
- make an external DTD for the previous .xml document
- validate it - add a constraint - break it - change the DTD or the XML document
- here are some constraints to work with
- elements in a given order
- elements from alternation
- element using +
- element using ?
- element using *
- element declaration with mixed content (tricky!)
- a working example with copy of the DTD
- Declaring attributes nn|349-51:
- Hands-on:
- add an ID to one of your elements. Validate.
- add an IDREF to another element. Validate. Verify it breaks if the IDREF does not reference an existing ID.
- a working example with DTD
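A minimal sketch for the ID/IDREF exercise. It uses an internal DTD subset only so that the whole example fits in one file; the element and attribute names (special, appliesTo, legs, kind) are hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE petShop [
  <!ELEMENT petShop (breed+, special*)>
  <!ELEMENT breed (#PCDATA)>
  <!ELEMENT special (#PCDATA)>

  <!-- Each breed must carry an id that is unique within the document -->
  <!ATTLIST breed
            id    ID                #REQUIRED
            legs  CDATA             "4"
            kind  (cat | dog | wet) #IMPLIED>

  <!-- Each special must point at the id of some existing element -->
  <!ATTLIST special
            appliesTo IDREF #REQUIRED>
]>
<petShop>
  <breed id="b1" kind="dog">Beagle</breed>
  <breed id="b2" kind="wet" legs="0">Goldfish</breed>
  <special appliesTo="b2">Half price this week</special>
</petShop>
```

Changing appliesTo="b2" to a value such as "b9" should make validation fail, since no element carries that ID. ID values must be XML names, so "b1" is fine but a bare "1" is not.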
- Parameter entities - DTD only
- DTD conditionals - pp 136-7.
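A sketch of parameter entities and conditional sections working together, assuming a hypothetical external DTD file. The entity names (draft, final, breedInfo) are invented. Both features are shown in an external DTD because conditional sections, and parameter-entity references inside a declaration, are not allowed in the internal subset.

```xml
<!-- petshop.dtd: hypothetical external DTD -->

<!-- Switches for the conditional sections; swap the two values to change mode -->
<!ENTITY % draft "INCLUDE">
<!ENTITY % final "IGNORE">

<!-- Reusable content-model fragment shared by every pet category -->
<!ENTITY % breedInfo "name, price, numberOfLegs, suggestedFood?">

<!ELEMENT breed (%breedInfo;)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT numberOfLegs (#PCDATA)>
<!ELEMENT suggestedFood (#PCDATA)>

<![%draft;[
  <!-- Draft mode: a category may still be empty -->
  <!ELEMENT dogs (breed*)>
]]>

<![%final;[
  <!-- Final mode: every category needs at least one breed -->
  <!ELEMENT dogs (breed+)>
]]>
```

With the values as written, only the draft declaration of dogs is read; the section marked IGNORE is skipped entirely.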
Namespaces in XML p.139|293-301
"Namespace collisions" occur when two objects have the same name. The XML author can control this with the use of name prefixes.
- likelihood of collision in an extensible language
- re-use of subdocuments
- namespace as pure abstraction
- namespace:name format (cf. packages or classes)
- problem of unique prefixes: solution is to rely on a URI
- URL: uniform resource locator, familiar to you
- URN: uniform resource name, a persistent name that need not point to a location
- declare the namespace in your root element (inheritance):
<mySpace:myRootElement xmlns:mySpace="http://www.myDomain.com/unique"> (explicit, prefixed namespace declaration; allows multiple namespaces and mixing default with explicit) example nn|298-299
or just <myRootElement xmlns="http://www.myDomain.com/unique"> (single default namespace, no prefix) example nn|300
cancel the default namespace with <myChildElement xmlns=""> local to that element and its children (under Namespaces in XML 1.0 a prefixed declaration cannot be cancelled this way)
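A short sketch pulling the three cases together in one document. The URI http://www.myDomain.com/unique comes from the notes above; the second URI and all element names are hypothetical. Neither URI has to resolve to anything, it only has to be unique.

```xml
<?xml version="1.0"?>
<!-- Hypothetical document mixing a prefixed and a default namespace -->
<shop:petShop xmlns:shop="http://www.myDomain.com/unique"
              xmlns="http://www.myDomain.com/defaults">
  <!-- Prefixed: belongs to the namespace bound to shop -->
  <shop:breed>Beagle</shop:breed>
  <!-- Unprefixed: picks up the default namespace -->
  <note>Back in stock Friday</note>
  <!-- xmlns="" cancels the default for this element and its children -->
  <memo xmlns="">
    <line>internal use only</line>
  </memo>
</shop:petShop>
```

Here memo and line are in no namespace at all, while note, a sibling of memo, remains in the default namespace. Note that unprefixed attributes never pick up the default namespace; it applies to elements only.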
Schemas
- DTD is an SGML legacy tool (EBNF grammar) - XML Schema is XML: "schema valid"
- written in XML
- element content validation; nn|385-386.
- able to reuse sections: ElementTypes
- derived datatypes
- associating a document with a schema nn|395
- a partial example
- a simple XML document with its schema, written in the Microsoft schema format. Note the backwards tree construction!
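For comparison, a sketch of the same ideas in W3C XML Schema (XSD). This is an assumption about dialect: the linked class example uses the Microsoft schema format with ElementType declarations, which looks different. The file name petshop.xsd and the type names priceType and breedType are hypothetical. The first block is the schema, the second a document associated with it.

```xml
<?xml version="1.0"?>
<!-- petshop.xsd: hypothetical W3C XML Schema sketch -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Derived datatype: a decimal restricted to non-negative prices -->
  <xs:simpleType name="priceType">
    <xs:restriction base="xs:decimal">
      <xs:minInclusive value="0"/>
      <xs:fractionDigits value="2"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Reusable section: a named type that any element can reference -->
  <xs:complexType name="breedType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="price" type="priceType"/>
      <xs:element name="numberOfLegs" type="xs:nonNegativeInteger"/>
      <xs:element name="suggestedFood" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <xs:element name="petShop">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="breed" type="breedType" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

```xml
<?xml version="1.0"?>
<!-- petshop.xml: the xsi attribute associates the document with the schema -->
<petShop xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="petshop.xsd">
  <breed>
    <name>Beagle</name>
    <price>250.00</price>
    <numberOfLegs>4</numberOfLegs>
  </breed>
</petShop>
```

Unlike a DTD, the schema can check element content itself: a price of "cheap" or a numberOfLegs of "-1" would be rejected, which a (#PCDATA) declaration cannot do.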
Reading
- CSS - p377-394|61-75
- XSL - 453-477|133
- XSLT nn|133-140
http://www.mousetrap.net/syllabus/xml/day2.html
$Id: day2.orb,v 1.11 2002/04/28 17:51:33 mouse Exp $
Remember, your login is based on your machine's hostname, not on any other number.
~/[initials] refers to the subdirectory under your homedir, named after your initials. Everything except for .dotfiles will be stored in your ~/[initials] directory.