Author reading 500 Mb xml file without opening
MG

2005-01-22, 7:17 pm

Hi all,
I have a large 500 MB RDF/XML file from dmoz and am trying to open it, but
my programs run out of memory when they try. Is there a way of searching
the file without opening it? I would like to search it and find just a
part of the file.

Any thoughts?

--
--------------------
Michael Guthrie
ArtFusion, Ltd.
www.artfusion.com


Bjoern Hoehrmann

2005-01-22, 7:17 pm

* MG wrote in microsoft.public.xml:
>I have a large 500 MB RDF/XML file from dmoz and am trying to open it, but
>my programs run out of memory when they try. Is there a way of searching
>the file without opening it? I would like to search it and find just a
>part of the file.


The problem is more about how you process the document than about opening
it (which you cannot possibly avoid). It seems you are trying to read the
whole document into some kind of in-memory tree representation, such as a
DOM, which does indeed take a lot of memory. You should probably switch to
a streaming API: either an event-based API such as SAX (which MSXML
supports) or a pull API (similar to an event-based API, and supported by
.NET's System.Xml). These APIs hand you the document one piece at a time
instead of building a tree, so you should not run into such memory
problems.
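
To make the event-based idea concrete, here is a minimal sketch. It uses
Python's standard xml.sax module rather than MSXML, purely for
illustration. The handler class, the helper function, and the tiny SAMPLE
document are all hypothetical; the real dmoz dump uses namespace-prefixed
names, so the element names would need adjusting.

```python
import io
import xml.sax


class ElementFinder(xml.sax.ContentHandler):
    """Remember the attributes of every element named `target`."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.matches = []

    def startElement(self, name, attrs):
        # The parser calls this once per opening tag. Nothing from the
        # document stays in memory unless we store it ourselves.
        if name == self.target:
            self.matches.append(dict(attrs.items()))


def find_elements(source, target):
    """Stream through `source` (a file name or binary stream)."""
    handler = ElementFinder(target)
    xml.sax.parse(source, handler)
    return handler.matches


# Hypothetical stand-in for the large file, kept small for the example.
SAMPLE = b"""<RDF>
  <Topic id="Top/Arts"><catid>2</catid></Topic>
  <ExternalPage about="http://example.com/"/>
  <Topic id="Top/Computers"><catid>4</catid></Topic>
</RDF>"""

print(find_elements(io.BytesIO(SAMPLE), "Topic"))
```

For a real file, a file name could be passed in place of the in-memory
stream. Memory use then depends on what the handler chooses to keep, not
on the size of the file.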
--
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
MG

2005-01-22, 7:17 pm

Well, I wouldn't know where to start with that answer.
I do thank you for your response,
but I am pretty new to this.






Bjoern Hoehrmann

2005-01-22, 7:17 pm

* MG wrote in microsoft.public.xml:
>
>Well, I wouldn't know where to start with that answer.
>I do thank you for your response,
>but I am pretty new to this.


When using MSXML,

http://msdn.microsoft.com/library/e...een_sax_dom.asp

might be a good starting point for choosing the right interface.

http://www.google.com/search?q=msxm...3Amicrosoft.com

should take you to more information on using SAX with MSXML.
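
For someone new to streaming parsers, a pull-style loop can be easier to
start with than handler callbacks. The sketch below uses iterparse from
Python's standard xml.etree.ElementTree module, not MSXML or .NET, and is
only an illustration. The function name, the "Topic" tag, the "id"
attribute, and the SAMPLE data are assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET


def iter_topic_ids(source, tag="Topic"):
    """Yield the "id" attribute of each `tag` element, one at a time."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield elem.get("id")
            # Discard the children read so far so the tree does not
            # keep growing as the loop moves through the file.
            root.clear()


# Hypothetical stand-in for the large file, kept small for the example.
SAMPLE = b"""<RDF>
  <Topic id="Top/Arts"><catid>2</catid></Topic>
  <ExternalPage about="http://example.com/"/>
  <Topic id="Top/Computers"><catid>4</catid></Topic>
</RDF>"""

for topic_id in iter_topic_ids(io.BytesIO(SAMPLE)):
    print(topic_id)
```

The loop asks the parser for the next event when it is ready for one,
which is the main difference from SAX, where the parser calls your code.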
--
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/