XML question: writing a DTD

I’ve got what looks like an interesting question about writing a DTD for XML documents. Say I’ve got three elements, imaginatively named A, B, and C. I’d like for <A><B><C></C></B></A> to be legal as a complete document, or <A><C></C></A>, but not <B><C></C></B>. Is there a way to do this? The few tutorials I’ve looked at so far don’t say anything either way.

You’re pointing out something I find really annoying about XML DTDs.

Which element is the document (root) element is stated in the DOCTYPE declaration of the XML document itself, not in the DTD.

For instance, an XML document might start like this:


<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE settings PUBLIC
          "-//NetBeans//DTD Session settings 1.0//EN"
          "http://www.netbeans.org/dtds/sessionsettings-1_0.dtd">

...


<settings> is the intended document element of this document, and the DTD at that URL simply defines the “settings” element and its content. This design does let you validate fragments as entire documents, or define documents with several related roots using one DTD, but it also means you can't look at a DTD alone and tell what a complete, coherent document is supposed to look like unless the DTD is commented.

So to do the example in my OP, I’d have to have two types of documents, one with a root of <A>, and the other with a root of <A>'s parent? Or am I misunderstanding you?

Perhaps I’m misunderstanding your question, as well.

No, you can have one DTD, which says that an <A> contains either a <B> or a <C>, and a <B> contains a <C>. The information that <B><C></C></B> is not a complete document is reflected in the DOCTYPE declaration of the actual XML document, which declares A as the document element:


<?xml version="1.0" encoding="UTF-8" ?>

<!DOCTYPE A [
    <!ELEMENT A (B|C)>
    <!ELEMENT B (C)>
    <!ELEMENT C (#PCDATA)>
]>

<A>
<C>stuff</C>
</A>

The goop between the square brackets is the DTD, written inline in this document. I could have pointed to a URL instead, but the information that <A> is my document element would stay in the XML document, not the DTD. The body of my document could have had a <B> around the <C> and would still have validated.
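As a rough sketch of the external version: assume the three declarations were saved in a file called abc.dtd (a made-up name, nothing standard). The document then only names its root and points at that file, and here the body uses the <B>-wrapped form:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- abc.dtd is a hypothetical file assumed to hold the same declarations:
     <!ELEMENT A (B|C)>
     <!ELEMENT B (C)>
     <!ELEMENT C (#PCDATA)>
-->
<!DOCTYPE A SYSTEM "abc.dtd">

<A>
  <B><C>stuff</C></B>
</A>
```

Note that abc.dtd itself says nothing about A being the root; only the DOCTYPE line in the document does.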

I should probably also mention that in that example, <B><C></C></B> by itself would not be a legal document body because of the DOCTYPE declaration, but it would be if I replaced the A in that declaration with B.
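To make that concrete, here is the same inline DTD with only the name in the DOCTYPE declaration changed from A to B. With that one change, a validating parser should accept <B> as the document element:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE B [
    <!ELEMENT A (B|C)>
    <!ELEMENT B (C)>
    <!ELEMENT C (#PCDATA)>
]>

<B>
  <C>stuff</C>
</B>
```

The element declarations are untouched; the only thing that decides whether <B> may stand alone is the root name the document itself declares.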

Suppose that I had a tag <DOC>. Could I use an <A> as part of a document declared with a <DOC> and maintain the restrictions on <A>, <B>, and <C>?

Thanks for your help so far.

The A,B,C hierarchies defined in the DTD will still hold no matter where in the document they live, provided that they are allowed within the ‘DOC’ element… or am I misunderstanding your question?

As simply as I can put it, I want to write a DTD which will allow <DOC><A><B><C></C></B></A></DOC> but not <DOC><B><C></C></B></DOC>. Is that possible?

Yes, define your A,B,C hierarchies in the DTD and allow only the ‘B’ one inside of ‘DOC’.

It would look something like this:

<!ELEMENT DOC (B)>

<!ELEMENT A (B)>

<!ELEMENT B (C)>

<!ELEMENT C (#PCDATA)>

etc…

Forgive me if my exact syntax is off. Despite my constant exposure to DTDs, I can never seem to rattle one off from the top of my head without error :)

You meant DOC (A). If I understand the OP, I would answer “Yes, just add an <!ELEMENT DOC (A)> to your DTD and leave all the other rules alone”. The DOCTYPE declaration in your XML documents will then use “DOC” as the root element. You usually wind up creating some document element that serves no purpose other than to be an outer envelope, since every XML document must have exactly one element that encloses the entire document. You would NEED the <DOC>, for instance, if the thing you are representing could be a bunch of <A>'s instead of just one, in which case your DTD would contain:

<!ELEMENT DOC (A*)> or <!ELEMENT DOC (A+)>
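Putting it together, a complete document might look something like the sketch below. It assumes the A+ variant and reuses the A, B, and C rules from earlier in the thread; the text inside the <C> elements is just filler:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE DOC [
    <!ELEMENT DOC (A+)>
    <!ELEMENT A (B|C)>
    <!ELEMENT B (C)>
    <!ELEMENT C (#PCDATA)>
]>

<DOC>
  <A><B><C>first</C></B></A>
  <A><C>second</C></A>
</DOC>
```

Since the content model for DOC lists only A, a body like <DOC><B><C></C></B></DOC> should fail validation, which is the restriction the OP asked for.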

Oops. I guess I’m a little dyslexic when I’m tired and haven’t eaten dinner. :smack: