Prevent HTML Entities From Being Decoded

How to prevent HTML entities from being decoded when submitting a form.

Edited: 2013-07-18 16:06

There are those of us who like to build our own CMS, and there are those who just install a pre-built CMS like Drupal or WordPress. Those who build their own, however, are very likely to run into a problem with HTML entities in textareas.

Less-than, greater-than, and other signs

The problem is perhaps most apparent if you are working with source code examples, where signs such as greater-than and less-than get used a lot. When these signs are sent to the browser, they are sent as "&lt;" and "&gt;" respectively, and the browser converts them to "<" and ">". In the textarea, however, we want the real source, entities included, to show up!

The real problem only becomes visible the moment you hit submit: the browser decoded the entities when it loaded the textarea, so the form sends back literal characters instead of the entities that were stored. The easiest way to solve the problem is to deal with this behavior on the server side, before showing the data to be edited in the textarea.
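The round trip described above can be sketched in a few lines of PHP. This is only an illustration: the sample string is made up, and html_entity_decode() is used as a stand-in for the decoding that the browser performs on textarea content.

```php
<?php
// Made-up sample: real markup mixed with an entity-encoded code example.
$stored = '<p>Use &lt;em&gt; for emphasis.</p>';

// Naive edit form: the stored text is written into the textarea unchanged.
$textareaHtml = '<textarea name="content">' . $stored . '</textarea>';

// Stand-in for the browser, which decodes entities inside a textarea
// before the value is displayed and later submitted.
$submitted = html_entity_decode($stored, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $submitted, "\n";            // <p>Use <em> for emphasis.</p>
var_dump($submitted === $stored); // bool(false): the entities were lost
```

After one edit and save, the code example "&lt;em&gt;" has turned into a real, unclosed em tag.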

Replacing HTML Entities for Textarea use

Some will likely suggest using the PHP function htmlentities to convert the characters back. That could work in some cases, but it gets difficult when you have real HTML saved together with the data, because htmlentities encodes the "<" and ">" of the real tags as well. So we came up with a simple solution that replaces these special characters before outputting the data to the textarea, i.e.

$content = preg_replace('/&([a-zA-Z]+);/', '&amp;$1;', $raw_content_from_db);

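The above is a simple regular expression that replaces the ampersand of every named HTML entity (such as "&lt;") with "&amp;", which is itself an HTML entity. The browser decodes "&amp;" back to "&", so the textarea shows the entity exactly as it was stored, and the same text is sent back on submit. Note that the pattern only matches names made up of letters; numeric references such as "&#60;" are not matched.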
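As a minimal sketch, the replacement can be wrapped in a small helper and checked against the round trip. The function name escape_entities_for_textarea and the sample string are hypothetical, and the pattern is an extended variant of the one above that also matches decimal and hexadecimal references. html_entity_decode() again stands in for the browser.

```php
<?php
// Hypothetical helper name; the pattern extends the article's regex
// so that numeric references (&#60;, &#x3C;) are covered as well.
function escape_entities_for_textarea(string $raw): string
{
    return preg_replace(
        '/&(#[0-9]+|#[xX][0-9a-fA-F]+|[a-zA-Z][a-zA-Z0-9]*);/',
        '&amp;$1;',
        $raw
    );
}

// Made-up sample with named and numeric references next to real markup.
$stored = '<p>Use &lt;em&gt; or &#60;i&#62; for emphasis.</p>';

$forTextarea = escape_entities_for_textarea($stored);
echo $forTextarea, "\n";
// <p>Use &amp;lt;em&amp;gt; or &amp;#60;i&amp;#62; for emphasis.</p>

// Stand-in for the browser decoding the textarea content once.
$submitted = html_entity_decode($forTextarea, ENT_QUOTES | ENT_HTML5, 'UTF-8');
var_dump($submitted === $stored); // bool(true): the entities survive
```

Only the ampersands that start an entity are touched, so the real tags in the stored data pass through unchanged.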

See also

  1. Special characters in HTML