Monday, 30 January 2012

Removing White Spaces (New Lines) in the XML Document


Removing XmlDocument white space in C#

I've recently been matching certain API calls against XML data pulled from an XML file for testing purposes. I noticed that a large amount of white space was left in the XML when it was read from the XML resource file, which is something I didn't want.
I thought setting the XmlDocument.PreserveWhitespace property to false would remove this for me, but it only seemed to remove the leading and trailing white space, making it similar to the string.Trim() method. I needed something akin to string.Replace() (which replaces one specific substring with another), but more powerful. Regex.Replace() to the rescue: it is a bit like string.Replace() on steroids! It replaces text matched by a regular expression, which can make complex replacements a piece of cake.
Here is the code to remove the white space between tags in XML or a related markup language (such as HTML or XHTML):

  using System.Text.RegularExpressions;

  // Remove white space between adjacent XML tags
  Regex regex = new Regex(@">\s*<");
  string cleanedXml = regex.Replace(dirtyXml, "><");
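
To see what the pattern does and doesn't touch, here is a small self-contained sketch. The class name XmlWhitespace, the method name Collapse and the sample XML are made up for illustration; only Regex.Replace and the pattern come from the snippet above. Note that white space inside an element's text is left alone, because the pattern only matches when a closing ">" is followed by nothing but white space and then an opening "<".

```csharp
using System;
using System.Text.RegularExpressions;

public static class XmlWhitespace
{
    // ">" followed by any run of white space (including new lines) and then "<".
    private static readonly Regex BetweenTags = new Regex(@">\s*<");

    // Hypothetical helper: collapses white space between adjacent tags only.
    public static string Collapse(string xml)
    {
        return BetweenTags.Replace(xml, "><");
    }
}

public static class Program
{
    public static void Main()
    {
        // Illustrative input, indented the way a hand-edited file might be.
        string dirtyXml =
            "<order>\r\n  <item id=\"1\">  two words  </item>\r\n  <item id=\"2\" />\r\n</order>";

        Console.WriteLine(XmlWhitespace.Collapse(dirtyXml));
        // prints: <order><item id="1">  two words  </item><item id="2" /></order>
    }
}
```

Because this works on the raw string rather than on parsed XML, it will probably also rewrite any "> <" sequence that happens to sit inside a comment or CDATA section, so it is best suited to test data where you control the content.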
