Friday, April 22, 2011

JavaScript HTML tag remover

A small regular expression that removes all HTML tags from a string.

Input code
<h2>About</h2>
<p>Here you will find posts about news, 
trends and developing for internet, 
mainly focusing on browsers and 
web user interfaces.</p>
<p><a href="/about">Read more</a></p>

Output text
About
Here you will find posts about news, 
trends and developing for internet,
mainly focusing on browsers and 
web user interfaces.
Read more

The script

function removeHTMLTags(){
  if(document.getElementById &&
     document.getElementById("input-code")){
    // Read the markup from the element with id "input-code"
    var strInputCode = document.getElementById("input-code").innerHTML;
    // Turn escaped angle brackets (&lt; and &gt;) back into < and >
    strInputCode = strInputCode.replace(/&(lt|gt);/g,
      function (strMatch, p1){
        return (p1 == "lt")? "<" : ">";
    });
    // Remove every opening or closing tag
    var strTagStrippedText = strInputCode.replace(/<\/?[^>]+(>|$)/g, "");
    alert("Input code:\n" + strInputCode + "\n\nOutput text:\n" +
      strTagStrippedText);
  }
}
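
The script above depends on the page having an element with the id "input-code". As a rough sketch, the same two replacements can be wrapped in a function that takes the string directly, which makes it easy to try outside a browser. The name stripHtmlTags and the sample string are made up for illustration and are not part of the original script.

```javascript
// Hypothetical helper (not from the original script): applies the same
// two regular expressions to a plain string, with no DOM access.
function stripHtmlTags(strInputCode) {
  // Decode &lt; and &gt; so that escaped tags are stripped as well.
  var strDecoded = strInputCode.replace(/&(lt|gt);/g, function (strMatch, p1) {
    return (p1 === "lt") ? "<" : ">";
  });
  // Remove anything that looks like an opening or closing tag.
  return strDecoded.replace(/<\/?[^>]+(>|$)/g, "");
}

// Sample input, shortened from the example at the top of the post.
var strSample = '<h2>About</h2>\n<p><a href="/about">Read more</a></p>';
console.log(stripHtmlTags(strSample));
```

Run in a browser console or Node.js, this should log "About" and "Read more" on separate lines.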
