Author MS XML Strange Error
Gadrin@gmail.com

2005-10-20, 3:23 am

I have a page that was converted to XHTML via XMLStarlet.

I tried to load the page into a MS XML 3.0 Document but get the
following error:

Error Code: -1072896758
Reason: The character '>' was expected.
Line: 31 Column: 3
Text: -- Typical usage:

Now that text isn't even in the page! However there are numerous
comments with the double -- so I went and removed each by hand.

I still get the error.

So I completely removed line 31 (it was one of the comments that I
removed above)...and I still get the error.

I've noticed MS XML is a bit flaky with errors in the past (I've seen
it die before it actually reaches the Column of text that caused the
error).

Anyone feel like eyeballing the XHTML file via email? It's 52K.

Or better yet here's the original page's URL (I'm retrieving it via
XMLHTTP)

url =
"http://www.acinet.org/acinet/oview1.asp?from=National&next=oview1&level=Overall&op1=&op2=&op3=&op4=&op5=&op6=&op7=&op8=&id=1,,11&nodeid=3&showintro=&soccode=&stfips=00&group=2"

this is line 31:
<!--td width="157" valign="middle" align="RIGHT"
background="./images/acinetbottom.jpg"
style="background-repeat:no-repeat;"><a
href="http://www.careeronestop.org/sitemap.asp" class="help">Site
Map</a> | <a
href="http://www.careeronestop.org/usersupport/UserSupport.asp"
class="help">Help</a>&nbsp;</td-->

....which I've completely removed...

Martin Honnen

2005-10-20, 6:42 pm



Gadrin@XXXXXXXXXX wrote:


> Anyone feel like eyeballing the XHTML file via email? It's 52K.


Well post the URL to that XHTML but consider checking it with various
validators around yourself:
<http://schneegans.de/sv/>
<http://valet.webthing.com/page/>


--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/

Gadrin@gmail.com

2005-10-20, 6:42 pm

Unfortunately I don't have a web-site I can post the page to.

XMLStarlet's validation says it's Valid when using the default option.

However when I validate it versus the dtd, XMLStarlet returns
"invalid".

Using the W3C validator gives:
Result: Failed validation,
File: J:\Temp Folders\CodingTemp\Winbatch BBS\xh1.html
Encoding: us-ascii
Doctype:

Sorry, I am unable to validate this document because on line 25, 37,
42, 44, 166, 177, 179, 229, 235-237, 250-252, 254-255, 258-260,
262-263, 266-268, 270-271, 274-276, 278-279, 282-284, 286-287, 290-292,
294-295, 298-300, 302-303, 306-308, 310-311, 314-316, 318-319, 322-324,
326-327, 330-332, 334-335, 338-340, 342-343, 346-348, 350-351, 354-356,
358-359, 362-364, 366-367, 370-372, 374-375, 378-380, 382-383, 386-388,
390-391, 394-396, 398-399, 402-404, 406-407, 410-412, 414-415, 418-420,
422-423, 426-428, 430-431, 434-436, 438-439, 442-444, 446-447, 450-452,
463, 466-470, 474 it contained one or more bytes that I cannot
interpret as us-ascii (in other words, the bytes found are not valid
values in the specified Character Encoding). Please check both the
content of the file and the character encoding indication.

Which doesn't help me a whole helluva lot. Guess I need to scan
character-by-character for non-Ascii.
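
A character-by-character scan like this can be sketched in a few lines of Python. This is only an illustration, not what was actually run in the thread: the function name `find_non_ascii` and the sample bytes are made up for the example. It reports, per line, every byte value above 127.

```python
from collections import defaultdict


def find_non_ascii(data):
    """Map 1-based line numbers to the non-ASCII byte values found on them."""
    hits = defaultdict(list)
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for byte in line:  # iterating over bytes yields ints
            if byte > 0x7F:  # anything above 127 is not us-ascii
                hits[line_no].append(byte)
    return dict(hits)


# Hypothetical sample standing in for the converted XHTML file.
sample = b"<p>ok</p>\n<td>Help\xc2\xa0</td>\n<p>\xe2\x80\x94</p>\n"
report = find_non_ascii(sample)
for line_no, values in sorted(report.items()):
    print(line_no, values)
```

For a real file, the bytes would come from `open(path, "rb").read()` instead of the inline sample.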

Gadrin@gmail.com

2005-10-20, 6:42 pm

Alright, thanks for your help Martin.

I went char by char on the file and replaced chr(194) and chr(160) with
their corresponding numeric entities (&#194;, &#160;), etc.

Unfortunately that didn't work as a subsequent validation revealed two
more errors.

So another char by char search determined all of these are problem
chars.

NAList = "160|194|226|128|148|169"

So I replaced all of those and still get an error.

Missing "charset" attribute for "text/xml" document. The HTTP
Content-Type header (text/xml) sent by your web browser (Mozilla/4.0
(compatible; MSIE 6.0; Windows NT 5.1)) did not contain a "charset"
parameter, but the Content-Type was one of the XML text/* sub-types.

The relevant specification (RFC 3023) specifies a strong default of
"us-ascii" for such documents so we will use this value regardless of
any encoding you may have indicated elsewhere.

If you would like to use a different encoding, you should arrange to
have your browser send this new encoding information.



So I think I'm getting close.

Thanks for your help.

Peter Flynn

2005-10-20, 10:20 pm

Gadrin@XXXXXXXXXX wrote:

> I have a page that was converted to XHTML via XMLStarlet.
>
> I tried to load the page into a MS XML 3.0 Document but get the
> following error:
>
> Error Code: -1072896758
> Reason: The character '>' was expected.
> Line: 31 Column: 3
> Text: -- Typical usage:
>
> Now that text isn't even in the page! However there are numerous
> comments with the double -- so I went and removed each by hand.
>
> I still get the error.
>
> So I completely removed line 31 (it was one of the comments that I
> removed above)...and I still get the error.
>
> I've noticed MS XML is a bit flaky with errors in the past (I've seen
> it die before it actually reaches the Column of text that caused the
> error).
>
> Anyone feel like eyeballing the XHTML file via email? It's 52K.
>
> Or better yet here's the original page's URL (I'm retrieving it via
> XMLHTTP)
>
> url =
> "http://www.acinet.org/acinet/oview1.asp?from=National&next=oview1&level=Overall&op1=&op2=&op3=&op4=&op5=&op6=&op7=&op8=&id=1,,11&nodeid=3&showintro=&soccode=&stfips=00&group=2"

The original HTML is hopelessly invalid, and even Tidy refuses to convert it
("588 warnings, 26 errors were found!") so I have no idea what kind of fist
XMLStarlet is making of it. Can you post the whole file so we can see it?

> this is line 31:
> <!--td width="157" valign="middle" align="RIGHT"
> background="./images/acinetbottom.jpg"
> style="background-repeat:no-repeat;"><a
> href="http://www.careeronestop.org/sitemap.asp" class="help">Site
> Map</a> | <a
> href="http://www.careeronestop.org/usersupport/UserSupport.asp"
> class="help">Help</a>&nbsp;</td-->


In itself the comment is well-formed. In these cases, always save the file
to disk and validate it using one of the well-known, robust standalone
parser/validators, e.g. rxp, onsgmls, etc. These are known to give accurate
and explicit error messages.
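
As a rough illustration of checking a saved file with a standalone parser, here is a sketch using Python's bundled expat bindings rather than rxp or onsgmls (a swapped-in tool, chosen only because it ships with Python). It checks well-formedness only, not validity against a DTD, and the helper name `check_well_formed` and the sample documents are assumptions for the example.

```python
import xml.parsers.expat


def check_well_formed(data):
    """Return None if data parses, else (line, column, message)."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return (err.lineno, err.offset, message)
    return None


# A commented-out element like the one on line 31 is fine on its own.
good = b"<root>\n<!--td class=\"help\">Help</td-->\n</root>"
# A double hyphen inside a comment is not allowed in XML.
bad = b"<root>\n<!-- a -- b -->\n</root>"

print(check_well_formed(good))
print(check_well_formed(bad))
```

The line and column come straight from the parser, so they point at the place where parsing actually stopped.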

///Peter
XML FAQ: http://xml.silmaril.ie/

Gadrin@gmail.com

2005-10-20, 10:20 pm

well, actually I got it to work.

XMLStarlet seems to do a better job than Tidy (I used both Tidy and
TidyCOM).

If you use XMLStarlet's -o (omit XML Declaration) and -D (omit
DOCTYPE) it becomes quasi-XML.

I then opened the XML file as a text file, removed all NULs, and then
went through and changed each problem chr() into its
corresponding numeric entity code, so

chr(194) gets changed to &#194;, and then resaved the file.

that done, it opens in MS XML 3.0 just fine.
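
The cleanup described above can be approximated in Python. This sketch makes two assumptions that the thread does not confirm: that the NUL bytes are stray and can simply be dropped, and that the file is UTF-8. The second is at least consistent with the byte list, since 194 160 is the UTF-8 encoding of a non-breaking space and 226 128 148 that of an em dash. Unlike the per-byte replacement described in the post, it emits one numeric character reference per character; `to_ascii_xml` is a hypothetical helper name.

```python
def to_ascii_xml(data):
    """Drop NUL bytes and turn every non-ASCII character into &#NNN;."""
    cleaned = data.replace(b"\x00", b"")  # assumption: NULs are stray bytes
    text = cleaned.decode("utf-8")        # assumption: the file is UTF-8
    # xmlcharrefreplace writes each unencodable character as a decimal
    # numeric character reference, leaving pure ASCII behind.
    return text.encode("ascii", errors="xmlcharrefreplace")


# Hypothetical sample containing the byte values listed in the thread.
sample = b"<td>Help\xc2\xa0</td>\x00<p>\xe2\x80\x94 \xc2\xa9</p>"
result = to_ascii_xml(sample)
print(result)
```

Because the output is pure ASCII, it no longer matters which encoding the parser guesses for the document.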

Gadrin@gmail.com

2005-10-20, 10:20 pm

BTW Peter just in case you're looking...you can get XMLStarlet at

http://xmlstar.sourceforge.net/

I think it does a better job than Tidy. You can use it on http:// pages
or local files. I like it too because it has XSLT and EXSLT built into
it. I just wish it had a COM interface.

And another neat tool is XStandard's XHTTP component, which is similar
to XMLHTTP but has Tidy built into it and returns XHTML or text as you
need. It only works on http:// pages, though.

http://xstandard.com/page.asp?p=C8A...4A-B6679AA949E6
