Tuesday, November 30, 2010

LINQ to XML for Large XML Files

There is a problem when we create an XDocument object like this:
XDocument mydoc=XDocument.Load("myfile.xml");

When the XML file is large, this can throw an OutOfMemoryException, because XDocument.Load builds the entire document tree in memory.

This is a well-known limitation of the XDocument class:

See: http://mtaulty.com/CommunityServer/blogs/mike_taultys_blog/archive/2007/09/08/9803.aspx
http://james.newtonking.com/archive/2007/12/11/linq-to-xml-over-large-documents.aspx

Following those articles, the idea is to use an XmlReader instead:

XmlReader xreader=XmlReader.Create("myfile.xml");

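Then we write a method that uses the XmlReader to return an IEnumerable of XElement objects, and use LINQ to query that sequence. Each element is read only when the query asks for it, so the whole file is never held in memory at once.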

The code is:

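using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

class Program
{
    static void Main(string[] args)
    {
        XNamespace myNS = XNamespace.Get("http://myns.xsd");
        IEnumerable<XElement> myElements = GetElements(@"myfile.xml", "product");

        // Keep only products whose merchantProduct has a "mid" attribute.
        // This assumes every product has a merchantListing/merchantProduct child;
        // if one is missing, Element() returns null and the chained call throws.
        var q = from c in myElements
                where (null != c.Element(myNS + "merchantListing").Element(myNS + "merchantProduct").Attribute("mid"))
                select new { WebID = c.Attribute("id").Value };

        foreach (var item in q)
        {
            Console.WriteLine(item.WebID);
        }
    }

    // Streams the matching elements one at a time instead of loading the whole file.
    public static IEnumerable<XElement> GetElements(string fileuri, string name)
    {
        using (XmlReader xreader = XmlReader.Create(fileuri))
        {
            xreader.MoveToContent();

            while (!xreader.EOF)
            {
                // LocalName ignores any namespace prefix on the element.
                if (xreader.NodeType == XmlNodeType.Element && xreader.LocalName == name)
                {
                    // ReadFrom leaves the reader on the node after the element,
                    // so calling Read() again here could skip an adjacent sibling.
                    yield return (XElement)XNode.ReadFrom(xreader);
                }
                else
                {
                    xreader.Read();
                }
            }
        }
    }
}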

Here we use LINQ to XML.
Of course, we are not required to use LINQ. XDocument.Descendants("...") returns an IEnumerable of XElement, so we can simply iterate over it with foreach as well.

XDocument myDoc = XDocument.Load("myfile.xml");
// for large XML files, please use XmlReader as in my previous post
IEnumerable<XElement> products = myDoc.Descendants("product");
foreach (XElement item in products)
{
    Console.WriteLine(item.Element("id").Value);
}
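
As a rough sketch of that last point, the XmlReader approach and a plain foreach can be combined, so neither LINQ nor XDocument.Load is needed. The class name StreamDemo, the helpers StreamElements and ReadIds, and the sample XML are all made up for illustration. The XML is read from a string only to keep the sketch self-contained, and it assumes "id" is an attribute of product.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

static class StreamDemo
{
    // Hypothetical helper: yields each element with the given local name,
    // one at a time, without building the whole document in memory.
    public static IEnumerable<XElement> StreamElements(TextReader input, string name)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == name)
                {
                    // ReadFrom consumes the element and leaves the reader on the
                    // next node, so no extra Read() is needed in this branch.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    // Collects the "id" attribute of every streamed <product> element.
    public static List<string> ReadIds(string xml)
    {
        var ids = new List<string>();
        foreach (XElement product in StreamElements(new StringReader(xml), "product"))
        {
            ids.Add((string)product.Attribute("id"));
        }
        return ids;
    }

    static void Main()
    {
        // Made-up sample data: two adjacent elements with no whitespace between them.
        string xml = "<products><product id=\"1\"/><product id=\"2\"/></products>";
        foreach (string id in ReadIds(xml))
        {
            Console.WriteLine(id);
        }
    }
}
```

With the sample string above this prints 1 and then 2. For a real file, XmlReader.Create also accepts a file path in place of the TextReader.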
