HTML stripper (remove HTML tags)

A common problem arises when we display data in a GridView (ASP.NET), a table, or a Repeater: HTML tags inside the displayed data can break the page or its design.

The same problem occurs when we use CKEditor to insert data. A rich text editor wraps the entered text in <p></p>, so if we show only part of the string, the closing </p> is missing and the page layout breaks.
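To make the truncation problem concrete, here is a minimal sketch. The `Preview` helper and the sample string are hypothetical and only serve as an illustration; the idea is to strip the tags first and cut the text afterwards, so that no tag is ever left unclosed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PreviewDemo
{
    // Hypothetical helper: strip tags first, then truncate the plain text.
    public static string Preview(string html, int maxLength)
    {
        string text = Regex.Replace(html, "<.*?>", string.Empty);
        return text.Length <= maxLength ? text : text.Substring(0, maxLength);
    }

    public static void Main()
    {
        string html = "<p>Hello <b>world</b>, welcome!</p>";

        // Naive truncation of the raw markup leaves <p> and <b> unclosed.
        Console.WriteLine(html.Substring(0, 15)); // <p>Hello <b>wor

        // Stripping first gives a safe plain-text preview.
        Console.WriteLine(Preview(html, 11));     // Hello world
    }
}
```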

To avoid this problem, we need to strip the HTML tags from the string. We can do this in several ways, for example with a regular expression or with a character array.

Solution 1 (Good):

using System.Text.RegularExpressions;

// Remove HTML tags from a string with Regex.
public static string StripTagsRegex(string source)
{
    return Regex.Replace(source, "<.*?>", string.Empty);
}
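One caveat worth knowing: in .NET the `.` in "<.*?>" does not match a newline by default, so a tag that spans two lines is left in place. The sketch below shows one possible way around this with `RegexOptions.Singleline`; the `StripTagsSingleline` name and the sample markup are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MultilineTagDemo
{
    // Hypothetical variant: Singleline lets '.' match '\n' as well,
    // so tags broken across lines are removed too.
    public static string StripTagsSingleline(string source)
    {
        return Regex.Replace(source, "<.*?>", string.Empty, RegexOptions.Singleline);
    }

    public static void Main()
    {
        // The opening <a ...> tag contains a line break.
        string html = "<a\nhref=\"#\">link</a>";

        // Default options: only </a> is removed, the opening tag survives.
        string partial = Regex.Replace(html, "<.*?>", string.Empty);
        Console.WriteLine(partial.EndsWith("link") && partial.StartsWith("<a")); // True

        // With Singleline both tags are removed.
        Console.WriteLine(StripTagsSingleline(html)); // link
    }
}
```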


Solution 2 (Better):

using System.Text.RegularExpressions;

// Compiled regular expression for performance; created once and reused.
private static readonly Regex _htmlRegex = new Regex("<.*?>", RegexOptions.Compiled);

// Remove HTML tags from a string with the compiled Regex.
public static string StripTagsRegexCompiled(string source)
{
    return _htmlRegex.Replace(source, string.Empty);
}

Solution 3 (Best):

// Remove HTML tags from a string using a char array.
public static string StripTagsCharArray(string source)
{
    char[] array = new char[source.Length];
    int arrayIndex = 0;
    bool inside = false; // true while we are between '<' and '>'

    for (int i = 0; i < source.Length; i++)
    {
        char let = source[i];
        if (let == '<')
        {
            inside = true;
            continue;
        }
        if (let == '>')
        {
            inside = false;
            continue;
        }
        if (!inside)
        {
            // Copy only the characters that are outside of a tag.
            array[arrayIndex] = let;
            arrayIndex++;
        }
    }
    return new string(array, 0, arrayIndex);
}
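The Good/Better/Best labels presumably refer to speed. One rough way to check that on your own machine is the sketch below, which repeats the three methods in a single class and times them with `Stopwatch`. The `StripBenchmark` class, the `Time` helper, the sample string, and the iteration count are all assumptions; the timings depend on the runtime and the input, so no expected numbers are shown.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class StripBenchmark
{
    static readonly Regex HtmlRegex = new Regex("<.*?>", RegexOptions.Compiled);

    public static string StripTagsRegex(string source)
    {
        return Regex.Replace(source, "<.*?>", string.Empty);
    }

    public static string StripTagsRegexCompiled(string source)
    {
        return HtmlRegex.Replace(source, string.Empty);
    }

    public static string StripTagsCharArray(string source)
    {
        char[] array = new char[source.Length];
        int arrayIndex = 0;
        bool inside = false;
        foreach (char c in source)
        {
            if (c == '<') { inside = true; continue; }
            if (c == '>') { inside = false; continue; }
            if (!inside) { array[arrayIndex++] = c; }
        }
        return new string(array, 0, arrayIndex);
    }

    // Hypothetical helper: run one stripper many times and return elapsed milliseconds.
    static long Time(Func<string, string> strip, string input, int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            strip(input);
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        string html = "<p>Some <b>bold</b> and <i>italic</i> text.</p>";
        const int iterations = 100000;

        // All three methods return the same plain text for this input.
        Console.WriteLine(StripTagsCharArray(html)); // Some bold and italic text.

        Console.WriteLine("Regex:          {0} ms", Time(StripTagsRegex, html, iterations));
        Console.WriteLine("Compiled Regex: {0} ms", Time(StripTagsRegexCompiled, html, iterations));
        Console.WriteLine("Char array:     {0} ms", Time(StripTagsCharArray, html, iterations));
    }
}
```

Note that all three methods treat any '<' as the start of a tag, so plain text such as "1 < 2 and 3 > 1" loses everything between the two symbols.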

Comments

Simple and easy to follow. Regex can be daunting for some developers, but you explain this simple Strip Tags functionality well.
