XML Encoding Text Really, Really Easily

by Jon Davis 18. January 2008 00:31

Frequently I need to pass some text along in an XML document that may have special characters like less-than (<), greater-than (>), or the really annoying nuisance, ampersand (&). Most people fix this lazily by using CDATA nodes. But I hate CDATA nodes with a passion!! I've used a trick for years and I dunno why I never blogged it. You don't need to use CDATA, nor do you need to manually scan and replace these special characters. Microsoft already did the dirty work for you in XmlDocument: set a node's InnerText to the raw string, then read its InnerXml back to get the encoded version.

So when I'm generating an XML file, such as with a StringBuilder or with repeaters on an .aspx template, this is what I sneak into the code-behind:


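    // XmlDocument is not thread-safe, so one lock guards both the lazy init and each use.
    private static readonly object _lock = new object();
    private static System.Xml.XmlDocument _staticDoc = null;

    public static string XmlEncode(string str)
    {
        if (str == null) return "";
        lock (_lock)
        {
            if (_staticDoc == null)
            {
                _staticDoc = new System.Xml.XmlDocument();
                _staticDoc.LoadXml("<text></text>");
            }
            // InnerText stores the raw string; InnerXml hands it back encoded.
            _staticDoc.DocumentElement.InnerText = str;
            return _staticDoc.DocumentElement.InnerXml;
        }
    }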

Then I can just use: 

<%# XmlEncode("Ed & Bob")%>

.. where "Ed & Bob" actually comes from a data object. :) This in turn outputs "Ed &amp; Bob".
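For a quick check outside of an .aspx page, here is a minimal console sketch of the same InnerText/InnerXml round trip. The class name EncodeDemo and the sample strings are made up for illustration, and this version creates a fresh XmlDocument per call to keep the sketch short, so treat it as a demonstration rather than the code I actually ship.

```csharp
using System;
using System.Xml;

public static class EncodeDemo
{
    // Same trick as above, minus the shared static document.
    public static string XmlEncode(string str)
    {
        if (str == null) return "";
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<text></text>");
        doc.DocumentElement.InnerText = str;   // stored as raw text
        return doc.DocumentElement.InnerXml;   // read back encoded
    }

    public static void Main()
    {
        Console.WriteLine(XmlEncode("Ed & Bob"));  // Ed &amp; Bob
        Console.WriteLine(XmlEncode("1 < 2"));     // 1 &lt; 2
        Console.WriteLine(XmlEncode("2 > 1"));     // 2 &gt; 1
    }
}
```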

There's also XmlTextWriter, which I haven't tried yet.

UPDATE: Alright, now I've tried XmlTextWriter. I needed ASCII enforcement in the XML encoding, so that non-ASCII Unicode characters are converted to their "&#??;" character entity replacements. The method isn't so simple anymore, but first tests seem to pass. I'll update this again if I find it to be flawed. Note that I'm putting this, among other things, into a shared XmlUtil class full of handy static methods. This update matters in particular because XmlDocument.Load() was failing on some Unicode characters that are better expressed as XML character entities.

using System.IO;
using System.Text;
using System.Xml;

public static class XmlUtil {

private static readonly object _lock = new object();
private static XmlDocument _staticDoc = null;
private static StringWriter _staticStringWriter = null;
private static XmlWriter _staticXmlWriter = null;
 
/// <summary>Converts Unicode text into ASCII-compliant XML encoded text</summary>
public static string EncodeText(string str)
{
    if (str == null) return "";
    lock (_lock) // one lock guards the lazy init and all shared state
    {
        if (_staticDoc == null)
        {
            _staticDoc = new XmlDocument();
            _staticDoc.LoadXml("<text></text>");
            _staticStringWriter = new StringWriter();
            XmlWriterSettings settings = new XmlWriterSettings();
            settings.ConformanceLevel = ConformanceLevel.Fragment;
            _staticXmlWriter = XmlWriter.Create(_staticStringWriter, settings);
        }
        _staticDoc.DocumentElement.InnerText = str;
        str = _staticDoc.DocumentElement.InnerXml;
 
        // ASCII enforcement
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < str.Length; i++)
        {
            char c = str[i];
            if (c <= 127) // plain ASCII passes through untouched
            {
                sb.Append(c);
                continue;
            }
            // Goes beyond the ASCII charset: write a character entity instead.
            if (char.IsHighSurrogate(c) && i + 1 < str.Length && char.IsLowSurrogate(str[i + 1]))
            {
                // WriteCharEntity rejects surrogates, so write the pair as one entity.
                _staticXmlWriter.WriteSurrogateCharEntity(str[++i], c);
            }
            else _staticXmlWriter.WriteCharEntity(c);
            _staticXmlWriter.Flush();
            StringBuilder _sb = _staticStringWriter.GetStringBuilder();
            sb.Append(_sb.ToString());
            _sb.Length = 0; // reset the shared buffer for the next entity
        }
        return sb.ToString();
    }
}

}
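If the shared XmlWriter feels like too much machinery, the ASCII enforcement step can also be sketched by formatting the character references by hand. This is an alternative I'm assuming is equivalent for this purpose, not the XmlWriter.WriteCharEntity approach above; the names AsciiEntityDemo and ToAsciiEntities are hypothetical, and the input is expected to be text that has already been XML encoded.

```csharp
using System;
using System.Text;

public static class AsciiEntityDemo
{
    // Hypothetical helper: replaces every non-ASCII character in already
    // XML-encoded text with a hexadecimal numeric character reference.
    public static string ToAsciiEntities(string xmlEncoded)
    {
        if (xmlEncoded == null) return "";
        StringBuilder sb = new StringBuilder(xmlEncoded.Length);
        for (int i = 0; i < xmlEncoded.Length; i++)
        {
            char c = xmlEncoded[i];
            if (c <= 127)
            {
                sb.Append(c); // plain ASCII passes through untouched
                continue;
            }
            // ConvertToUtf32 combines a surrogate pair into one code point
            // and throws ArgumentException on an unpaired surrogate.
            int codePoint = char.ConvertToUtf32(xmlEncoded, i);
            if (char.IsHighSurrogate(c)) i++; // skip the low surrogate
            sb.Append("&#x").Append(codePoint.ToString("X")).Append(';');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ToAsciiEntities("Caf\u00E9 &amp; cr\u00E8me")); // Caf&#xE9; &amp; cr&#xE8;me
        Console.WriteLine(ToAsciiEntities("\uD83D\uDE00"));               // &#x1F600;
    }
}
```

Because it keeps no shared state, this version needs no locking, at the cost of doing the entity formatting yourself.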



