







Combining user tags and tr/td tags - how to?
Richard Lionheart

2005-09-22, 11:24 pm

Hi All,



I've got an HTML document and its DTD shown below.



The output currently displays as a single line. I want the content
displayed as a table, but I don't want to manually install td & tr tags in
the midst of the content.



I'd prefer to do this by adding td and tr tags to my tags in the DTD,
CSS-style. Is this feasible, and if so can you point me to an example
online?



If not, is XSLT the simplest alternative?



Thanks in advance,

Richard



======= TestGlossary.html ============

<?xml version="1.0" encoding="UTF-16"?>
<!DOCTYPE Glossary SYSTEM
  "file:/K:/_Projects/XML/German/German_Glossary.dtd">
<Glossary>
  <Greetings>
    <Entry>
      <German>Guten Tag! </German>
      <English>Good Day </English>
    </Entry>
    <Entry>
      <German>Grüß Gott! </German>
      <English>Hello! </English>
      <Info>southern Germany &amp; Austria</Info>
    </Entry>
  </Greetings>
</Glossary>



======== German_Glossary.dtd =============

<!ELEMENT English (#PCDATA)>
<!ELEMENT German (#PCDATA)>
<!ELEMENT Info (#PCDATA)>
<!ELEMENT Entry (German | English | Info)*>
<!ELEMENT Greetings (Entry)*>
<!ELEMENT Glossary (Greetings)>


Peter Flynn

2005-09-23, 7:40 pm

Richard Lionheart wrote:

> I've got an HTML document and its DTD shown below.


That's not HTML, it's XML.
Is there a particular reason why you used UTF-16 instead of UTF-8
or ISO-8859-1? (nothing wrong with it, just curious).

> The output currently displays as a single line.


How the output displays depends on your stylesheet, and you haven't
shown us that.

> I want the content
> displayed as a table, but I don't want to manually install td & tr tags
> in the midst of the content.


So code your stylesheet to do it.

> I'd prefer to do this by adding td and tr tags to my tags in the DTD,


No need if your markup is already accurate.

> CSS-style.


What has CSS got to do with it?

> Is this feasible, and if so can you point me to an example
> online?
>
> If not, is XSLT the simplest alternative?


Yes. Making some huge assumptions about what you want:

<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <head>
        <title>Untitled document</title>
        <link rel="stylesheet" href="glossary.css" type="text/css"/>
      </head>
      <xsl:apply-templates/>
    </html>
  </xsl:template>

  <xsl:template match="Glossary">
    <body>
      <xsl:apply-templates/>
    </body>
  </xsl:template>

  <xsl:template match="Greetings">
    <table border="1">
      <xsl:apply-templates/>
    </table>
  </xsl:template>

  <xsl:template match="Entry">
    <tr>
      <xsl:apply-templates/>
    </tr>
  </xsl:template>

  <xsl:template match="German|English|Info">
    <td>
      <xsl:apply-templates/>
    </td>
  </xsl:template>

</xsl:stylesheet>

produces

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Untitled document</title>
<link rel="stylesheet" href="glossary.css" type="text/css">
</head>
<body>
<table border="1">
<tr>
<td>Guten Tag! </td>
<td>Good Day </td>
</tr>
<tr>
<td>Grüß Gott! </td>
<td>Hello! </td>
<td>southern Germany &amp; Austria</td>
</tr>
</table>
</body>
</html>

///Peter
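
Editorial note: for readers without an XSLT processor to hand, here is a minimal Python sketch of the mapping the templates above perform. Each Entry becomes a tr, and each German, English, or Info child becomes a td. It uses only the standard library and is an illustration, not a replacement for the stylesheet. The function name glossary_to_table and the inline SOURCE string are invented for this example.

```python
import xml.etree.ElementTree as ET

# A well-formed copy of the sample glossary (note the escaped ampersand).
SOURCE = """<Glossary>
  <Greetings>
    <Entry>
      <German>Guten Tag! </German>
      <English>Good Day </English>
    </Entry>
    <Entry>
      <German>Grüß Gott! </German>
      <English>Hello! </English>
      <Info>southern Germany &amp; Austria</Info>
    </Entry>
  </Greetings>
</Glossary>"""


def glossary_to_table(xml_text):
    """Build an HTML table element: one row per Entry, one cell per child."""
    root = ET.fromstring(xml_text)
    table = ET.Element("table", border="1")
    for entry in root.iter("Entry"):
        row = ET.SubElement(table, "tr")
        for field in entry:
            # Mirrors the match="German|English|Info" template.
            if field.tag in ("German", "English", "Info"):
                cell = ET.SubElement(row, "td")
                cell.text = field.text
    return table


table = glossary_to_table(SOURCE)
html = ET.tostring(table, encoding="unicode", method="html")
print(html)
```

As with the stylesheet, an Entry without an Info child simply produces a shorter row.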

Richard Lionheart

2005-09-23, 11:19 pm

Hi Peter,

Thanks for your response, which ranks among the top of all initial responses
I've gotten to the dozens (perhaps hundreds) of NG-posts I've made over the
years.

>
> That's not HTML, it's XML.


Well, I guess that's my first problem.

I've got the same content stored both as an .xml file and an .html file.
When I double-click the .xml file, an IE window opens with the content:
"Action canceled
Internet Explorer was unable to link to the Web page you requested.
The page might be temporarily unavailable.
"
(I'm running WinXP-Pro/SP2 with Office Pro 2003)

> Is there a particular reason why you used UTF-16 instead of UTF-8
> or ISO-8859-1? (nothing wrong with it, just curious).


Yes, I've got German content which includes characters not displayable
without something beyond the 127- or 255-element character set of UTF-8.

I had one, which I provide below, though it wouldn't have addressed the
formatting that yours, thankfully, does. (Incidentally, both the .dtd and
.xsl were generated by the OxygenXML editor, ver. 6, which I love but still
don't know much about. I edited both of them to more exactly reflect my
intended structure.)

Thank you very much for the style-sheet I wanted but didn't know how to
code. I'll try to get it working on my system, though there may be a few
glitches still to be overcome.

Again, with sincere thanks,
Richard

============= German_Glossary.xsd ============
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified">
<xs:element name="English" type="xs:string"/>
<xs:element name="German" type="xs:string"/>
<xs:element name="Info" type="xs:string"/>
<xs:element name="Entry">
<xs:complexType>
<xs:sequence>
<xs:element ref="German"/>
<xs:element ref="English"/>
<xs:element ref="Info" minOccurs="0" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:complexType name="Greetings">
<xs:sequence>
<xs:element ref="Greetings"/>
</xs:sequence>
</xs:complexType>
<xs:element name="Greetings">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="1" maxOccurs="unbounded" ref="Entry"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="Glossary" type="Greetings"/>
</xs:schema>
============== End of German_Glossary.xsd ============


Richard Lionheart

2005-09-24, 4:21 am

Hi Peter,

I finally got this stuff working.

I made FireFox my default browser and the XML doc displayed beautifully with
color-coded tags and indented structure.

I had trouble using your XML Style Sheet, so I've got to study that further
and perhaps post some new questions. I did switch from UTF-16 to iso-8859-1
in my XML document as you suggested. The document is still encoded as
Unicode-bigendian and the 8859-1 declaration seems compatible with that.

I added a reference in my XML file to the XML Schema I built and validated
the document against it, using OxygenXML.

I added a reference in my XML doc. to the XML Stylesheet I built and ran it
from within Oxygen. It generated an HTML file and displayed the table I
wanted in FireFox.

I ran the XML document directly from Windows Explorer and got the same
result.

Thanks for getting me moving on this. Now I have to get this into a
database and build it in .NET.

I doubt you have any interest in the details of this trivial exercise, but
just in case:

Best wishes,
Richard

======== XML Doc ========
<?xml version="1.0" encoding="iso-8859-1"?>
<?xml-stylesheet type="text/xsl"
href="German_Glossary-ANSI_StyleSheet.xsl"?>
<Glossary xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="German_Glossary-ANSI_Schema.xsd">
<Greetings>
<Entry>
<German>Guten Tag! </German>
<English>Good Day </English>
</Entry>
<Entry>
<German>Grüß Gott! </German>
<English>Hello! </English>
<Info>southern Germany &amp; Austria</Info>
</Entry>
<!-- ********* Snip ********* -->
</Greetings>
</Glossary>

========= XML Schema =========
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified">
<xs:element name="English" type="xs:string"/>
<xs:element name="German" type="xs:string"/>
<xs:element name="Info" type="xs:string"/>
<xs:element name="Entry">
<xs:complexType>
<xs:sequence>
<xs:element ref="German"/>
<xs:element ref="English"/>
<xs:element ref="Info" minOccurs="0" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:complexType name="Greetings">
<xs:sequence>
<xs:element ref="Greetings"/>
</xs:sequence>
</xs:complexType>
<xs:element name="Greetings">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="1" maxOccurs="unbounded" ref="Entry"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="Glossary" type="Greetings"/>
</xs:schema>

======== XML Stylesheet ==============
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>German Glossary</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th align="left">German</th>
<th align="left">English</th>
<th align="left">Info</th>
</tr>

<xsl:for-each select="Glossary/Greetings/Entry">
<tr>
<td><xsl:value-of select="German"/></td>
<td><xsl:value-of select="English"/></td>
<td><xsl:value-of select="Info"/></td>
</tr>
</xsl:for-each>

</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>


Peter Flynn

2005-09-24, 10:20 pm

Richard Lionheart wrote:

> Hi Peter,
>
> I finally got this stuff working.
>
> I made FireFox my default browser and the XML doc displayed beautifully
> with color-coded tags and indented structure.


Good news.

> I had trouble using your XML Style Sheet, so I've got to study that
> further
> and perhaps post some new questions. I did switch from UTF-16 to
> iso-8859-1
> in my XML document as you suggested. The document is still encoded as
> Unicode-bigendian and the 8859-1 declaration seems compatible with that.


UTF-16 is a 16-bit (2-byte) encoding of Unicode; UTF-8 is an 8-bit
(potentially) multi-byte encoding of the same thing, so there shouldn't be
anything you can encode in UTF-16 that you can't encode in UTF-8 (as far as
I know, which may not be far enough :-) But for modern German, using the
Latin alphabet, all the diacriticals are in ISO-8859-1 (umlauted vowels and
the ess-zet ligature). Middle High German and Old German, however, would
indeed require additional glyphs for which you would need UTF-8 or UTF-16.

Glad to know it's now working.

///Peter
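
Editorial note: a quick way to check the point about encodings is to encode the same greeting three ways and count the bytes. The sketch below assumes Python 3.8 or later (for the separator argument to bytes.hex). The variable names are illustrative only.

```python
greeting = "Grüß Gott!"
ENCODINGS = ("iso-8859-1", "utf-8", "utf-16-be")

# Byte length of the same ten characters under each encoding.
sizes = {name: len(greeting.encode(name)) for name in ENCODINGS}

for name in ENCODINGS:
    data = greeting.encode(name)
    # Every encoding round-trips to the identical text.
    assert data.decode(name) == greeting
    print(f"{name:11} {len(data):2} bytes  {data.hex(' ')}")
```

In ISO-8859-1 the ü and ß are the single bytes fc and df. In UTF-8 they become the two-byte sequences c3 bc and c3 9f. In UTF-16 every character here takes two bytes.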

Richard Lionheart

2005-09-25, 3:21 am

Hi Peter,

Thanks a lot for that additional clarification.

> UTF-16 is a 16-bit (2-byte) encoding of Unicode; UTF-8 is an 8-bit
> (potentially) multi-byte encoding of the same thing, so there shouldn't be
> anything you can encode in UTF-16 that you can't encode in UTF-8 (as far
> as
> I know, which may not be far enough :-) But for modern German, using the
> Latin alphabet, all the diacriticals are in ISO-8859-1 (umlauted vowels
> and
> the ess-zet ligature). Middle High German and Old German, however, would
> indeed require additional glyphs for which you would need UTF-8 or UTF-16.


What would be the consequence if I
- wrote an XML doc in NotePad, say, specifying ISO-8859-1
- included some Old German character in it
- saved it as Unicode-BigEndian
- fed it to XSLT with my XSL stylesheet?

Would XSLT barf or would the O-G character be omitted from the resulting
.html doc?

Thanks in advance for any additional detail you may be able to provide. I'm
grateful for all the time you've already devoted to my education.


Regards,
Richard


Peter Flynn

2005-09-25, 6:24 pm

Richard Lionheart wrote:

> Hi Peter,
>
> Thanks a lot for that additional clarification.
>
>
> What would be the consequence if I
> - wrote an XML doc in NotePad, say, specifying ISO-8859-1


Provided your document only used ISO-8859-1 character codes, it would
be fine. If you used MS-1252 or some other character set, then you
would get some illegal characters and some garbled characters, because
you would be feeding your XML processor one thing but labelling it as
being another.

It's a bit like those users who think all they have to do to convert a
GIF image to JPG is change the filetype. That doesn't work: the data
still has to be correct :-)
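
Editorial note: the mislabelling problem is easy to reproduce at the byte level. In this sketch (Python, standard library only; the sample string is invented for illustration) text is written as Windows-1252 and then read back as if it were ISO-8859-1.

```python
text = "“Grüß Gott!” costs 5 €"

# What the editor actually wrote to disk: Windows-1252 bytes.
raw = text.encode("cp1252")

# What a processor sees if the file is labelled ISO-8859-1.
mislabelled = raw.decode("iso-8859-1")

# Characters that changed identity because of the wrong label.
damaged = [(a, b) for a, b in zip(text, mislabelled) if a != b]
print(repr(mislabelled))
print(damaged)
```

The umlaut and ess-zet survive because both character sets put them at the same byte values. The curly quotes and the euro sign do not: Windows-1252 stores them in the 0x80 to 0x9F range, which ISO-8859-1 treats as control codes.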

In any event, Notepad is too limited for anything other than trivial use.
There are plenty of usable XML editors available for download.

> - included some Old German character in it


Then the file might be unprocessable. You'd be telling the processor that
it was an ISO-8859-1 file, but effectively lying about it by including a
non-8859-1 character -- which would in any case be a multi-byte character
which your processor would interpret as two separate characters...unless
you are working on an operating system which can label the type of data
at the filesystem level, in which case you have a third axis to worry
about : what the OS believes your file to consist of. I don't know if
Microsoft Windows can do this, though.

> - saved it as Unicode-BigEndian


You'd have to do that manually in NotePad, as it has no way of adding a
BOM. And as I said, I don't know if WinXP can distinguish between the
character encodings at the filesystem level, so "saving as Unicode" may
be a non-existent concept on XP (someone correct me).

> - fed it to XSLT with my XSL stylesheet?


You'd get an error message unless the combination of bytes used for the
OHG character happened also to be valid ISO-8859-1 codepoints.

> Would XSLT barf or would the O-G character be omitted from the resulting
> .html doc?


If it barfed, the output would be null. If it didn't, the output would
contain spurious characters.
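
Editorial note: both outcomes can be seen with a small experiment. The sketch below uses Python's standard-library XML parser rather than an XSLT processor, and it mismatches UTF-8 against ISO-8859-1 rather than UTF-16, so treat it as an analogy to the scenario in the question. The helper name parse_labelled is hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_labelled(label, actual):
    """Parse a document whose bytes use `actual` but whose declaration says `label`."""
    doc = f'<?xml version="1.0" encoding="{label}"?><German>Grüß Gott!</German>'
    try:
        return ET.fromstring(doc.encode(actual)).text
    except ET.ParseError as err:
        return f"ParseError: {err}"


# Label and bytes agree: the text comes through intact.
honest = parse_labelled("iso-8859-1", "iso-8859-1")

# UTF-8 bytes labelled ISO-8859-1: no error, but each multi-byte
# character is read as two separate characters.
garbled = parse_labelled("iso-8859-1", "utf-8")

# ISO-8859-1 bytes labelled UTF-8: the bytes are not valid UTF-8,
# so the parser refuses the document.
refused = parse_labelled("utf-8", "iso-8859-1")

for outcome in (honest, garbled, refused):
    print(repr(outcome))
```

Which of the two failures you get depends on whether the wrongly labelled bytes happen to be legal in the declared encoding.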

All three axes have to synchronize:

1. the encoding you declare in the XML Declaration (or the default, UTF-8)

2. the set of characters you use in the document.

3. the mode used by your filesystem to store the file (which may or may not
be relevant, according to your OS, and which may or may not be passed on
behind the scenes to your XSLT processor when you run it).

If possible, prefer an operating system and application software that
give you absolute control over all three; then you won't have any
problems.
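
Editorial note: one way to keep the first two axes in step is to generate the XML declaration and the bytes from the same encoding name, so a character outside the declared set fails loudly at write time. This is a minimal Python sketch under that assumption. The serialize helper is hypothetical, and "ē" (U+0113) is just a stand-in for any character outside ISO-8859-1.

```python
import xml.etree.ElementTree as ET


def serialize(body: str, encoding: str) -> bytes:
    """Write the declaration and encode the bytes from one encoding name."""
    declaration = f'<?xml version="1.0" encoding="{encoding}"?>\n'
    # str.encode raises UnicodeEncodeError if a character is not in the set.
    return (declaration + body).encode(encoding)


modern = "<German>Grüß Gott!</German>"
# "ē" (U+0113) is not in ISO-8859-1; chosen purely for illustration.
outside_latin1 = "<Info>ē</Info>"

# Declaration and bytes agree, so the parser recovers the original text.
round_trip = ET.fromstring(serialize(modern, "iso-8859-1")).text

# The out-of-range character is rejected before any file is written.
try:
    serialize(outside_latin1, "iso-8859-1")
    rejected = False
except UnicodeEncodeError:
    rejected = True

# The same content is fine once the declared encoding can represent it.
utf8_text = ET.fromstring(serialize(outside_latin1, "utf-8")).text
print(round_trip, rejected, utf8_text)
```

The third axis, how the operating system stores or labels the file, is outside what this sketch can show.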

> Thanks in advance for any additional detail you may be able to provide.
> I'm grateful for all the time you've already devoted to my education.


Not a problem. Writing this has already shown up some holes in the way the
FAQ (xml.silmaril.ie) explains this, so I need to do some repairs there.

///Peter

Richard Lionheart

2005-09-25, 6:24 pm

Hi Peter,

Once again, thanks for the additional tutoring.

Somewhere along the line I wound up saving my original Unicode XML as an
ANSI file. I just looked at it with XVI32, a great (free) hex editor from
http://www.chmaas.handshake.de/delp...vi32/xvi32.htm, and
confirmed just what you said. The umlaut chars and the ess-zet
ligature [nice name :-) ... it brings to mind my modest study of fonts in
Petzold's book decades ago] are indeed stored in single-byte characters.

> In any event, Notepad is too limited for anything other than trivial use.

I agree. I use it when I want to be pretty sure about exactly what I'm
creating. If it didn't exist and I could find something without frills on
the Net, I'd write my own.

> There are plenty of usable XML editors available for download.

Indeed. Actually, I use OxygenXML casually (as well as Visual Studio 6
professionally and VS.Net a little), though in this exercise I:
- fed it my XML doc.
- had it produce a DTD by "learning" the XML doc's structure
- had it convert the DTD to an XSD
- edited the xsd to meet my precise requirements for the XML doc
- had it associate first the DTD and later the XSD with my XML doc, then
validated my XML doc
- had it associate my XSL with my XML doc and had it apply style to my xml
to produce an HTML and display the latter in Firefox

I've still got high on my list the task of learning how to build XSDs in
your style, versus the simpler one I built based on W3School's tutorial.

I'm saving this thread in the form of a Q & A in MS Word for further study
at my leisure. Now that I at least know the contours of this issue I can
press on combining the new technologies I'm picking up (XML, WSH, SQL Server
and .NET) to produce a couple of apps that friends want me to help on, one
of which I may even get around to posting in the form of a tutorial.

Kindest regards,
Richard


Peter Flynn

2005-09-25, 6:25 pm

Richard Lionheart wrote:

> Somewhere along the line I wound up saving my original Unicode XML as an
> ANSI file. I just looked at it with XVI32, a great (free) hex editor from
> http://www.chmaas.handshake.de/delp...vi32/xvi32.htm, and
> confirmed just what you said. The umlaut chars and the ess-zet
> ligature [nice name :-) ... it brings to mind my modest study of fonts in
> Petzold's book decades ago] are indeed stored in single-byte characters.


Good, that rules out one source of errors.

> I agree. I use it when I want to be pretty sure about exactly what I'm
> creating. If it didn't exist and I could find something without frills on
> the Net, I'd write my own.


Emacs. No question about it.

> Indeed. Actually, I use OxygenXML casually (as well as Visual Studio 6
> professionally and VS.Net a little), though in this exercise I:
> - fed it my XML doc.
> - had it produce a DTD by "learning" the XML doc's structure
> - had it convert the DTD to an XSD
> - edited the xsd to meet my precise requirements for the XML doc
> - had it associate first the DTD and later the XSD with my XML doc, then
> validated my XML doc
> - had it associate my XSL with my XML doc and had it apply style to my xml
> to produce an HTML and display the latter in Firefox
>
> I've still got high on my list the task of learning how to build XSDs in
> your style, versus the simpler one I built based on W3School's tutorial.


I don't use XSDs much at all, as 99% of my work is with text documents,
where DTDs work just fine. Schemas are useful where you need to control
data types at validation time rather than in a processor, and there are
structures they can represent which DTDs can't, but for normal text docs
DTDs are probably all that is needed.

> I'm saving this thread in the form of a Q & A in MS Word for further study
> at my leisure. Now that I at least know the contours of this issue I can
> press on combining the new technologies I'm picking up (XML, WSH, SQL
> Server
> and .NET) to produce a couple of apps that friends want me to help on,
> one of which I may even get around to posting in the form of a tutorial.


That would be a very useful resource.

///Peter

Richard Lionheart

2005-09-28, 6:34 pm

Hi Peter,

> Emacs. No question about it.


I agree that Emacs is powerful. I used it briefly when I had to work on
Unix boxes, but now it would be "work" to get it (and me) running on my
WinXP box.

> I don't use XSDs much at all ...
> but for normal text docs
> DTDs are probably all that is needed.


One app I'm planning to build is a small invoice
generation/storage/retrieval/updating system, so I want to be alerted asap
if anything is corrupted in the system: hence my desire to have a schema
that tightens down anything that might be added. Anyway, it's fun to be able
to express that degree of control over data using patterns rather than
procedural code.

> That would be a very useful resource.


Well, I am keeping notes as I go along learning this stuff. I hope I get
it in the form that W3Schools.com, TopXML.com etc. will find worthy of
publishing.

Again, thanks for your input. I'll still have plenty of questions, so
we'll run across one another on the NGs.

Best wishes,
Richard

