Ignore Brackets in Regular Expressions

How to group with parentheses without creating a backreference, so the content of the parentheses is matched but not remembered.

Edited: 2013-07-04 13:00

Sometimes you do not really need to create backreferences in your regular expressions. A backreference is created whenever you use parentheses, but parentheses are also needed for other purposes, such as grouping alternatives or applying a quantifier to several characters at once. In those situations you will likely want to optimize your regular expression by not creating a backreference that you do not really need.

The trick is to begin the content of the parentheses with a question mark and a colon, as in (?:...)!
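As a minimal sketch of the difference, the Python snippet below compares an ordinary group with one that starts with a question mark and a colon. Python's re module and the variable names are assumptions chosen for illustration; the article itself does not name a language.

```python
import re

# An ordinary group: the engine remembers which alternative matched.
capturing = re.compile(r"<(strong|em)>")

# The same group starting with "?:" only groups; nothing is remembered.
non_capturing = re.compile(r"<(?:strong|em)>")

# Number of backreferences each pattern creates.
capturing_groups = capturing.groups
non_capturing_groups = non_capturing.groups

# Both patterns accept the same input; only what is remembered differs.
sample = "<em>"
captured = capturing.match(sample).groups()
not_captured = non_capturing.match(sample).groups()

print(capturing_groups, captured)          # 1 ('em',)
print(non_capturing_groups, not_captured)  # 0 ()
```

Both patterns match exactly the same text, so switching to the non-capturing form changes the numbering of the remaining groups but not what the pattern matches.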

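The following would match the strong and em elements on a page, as long as they contain only text. The # characters at each end are pattern delimiters, as used by PHP's preg functions, and are not part of the pattern itself.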

 #<(?:strong|em)>([^<]+)</(?:strong|em)>#

This pattern will only remember the text content of the em and strong HTML elements, which becomes backreference 1. The tag names in the opening and closing tags are still matched, but they are not remembered.
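To see which part is remembered, here is a small sketch that runs the pattern in Python. The # delimiters are dropped because Python's re module does not use them, and the sample HTML string is made up for the example.

```python
import re

# The article's pattern, without the "#" delimiters.
pattern = re.compile(r"<(?:strong|em)>([^<]+)</(?:strong|em)>")

# Hypothetical input, invented for this example.
html = "<p>Some <strong>bold</strong> and <em>emphasised</em> text.</p>"

# The pattern has exactly one capturing group, so findall returns
# only that group's text for every match.
contents = pattern.findall(html)
print(contents)  # ['bold', 'emphasised']

# Only the middle pair of parentheses counts as a backreference.
group_count = pattern.groups
print(group_count)  # 1
```

Had the tag names been wrapped in ordinary parentheses instead, the text content would have been backreference 2 of 3 rather than the only one.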

Making your regex engine not remember a match can be very useful. Just imagine having to keep track of a bunch of backreferences that you don't really need: it can quickly get difficult!

The [^<]+ part in the middle is a negated character class: it matches one or more characters of any kind except the less-than sign. Since the less-than sign begins every HTML tag, this part matches everything up until the next HTML tag and then stops.
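The behaviour of the negated character class can be checked in isolation. The following is a sketch in Python with made-up input strings; it also shows one consequence of stopping at the next tag, namely that an element containing a nested tag is not matched as a whole.

```python
import re

# One or more characters that are not "<".
text_run = re.compile(r"[^<]+")

# Matching stops just before the first less-than sign.
first_run = text_run.match("bold text</strong> more").group()
print(first_run)  # bold text

# The article's pattern, without the "#" delimiters.
pattern = re.compile(r"<(?:strong|em)>([^<]+)</(?:strong|em)>")

# The outer strong element contains a tag, so only the inner em
# element, which holds plain text, is matched.
nested = pattern.findall("<strong>a <em>b</em></strong>")
print(nested)  # ['b']
```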