Word Boundary Regular Expression Unless Inside Html Tag

I have a regular expression using word boundaries that works exceedingly well... ~\b('.$value.')\b~i ...save for the fact that it matches text inside HTML tags (i.e. title='This is

Solution 1:

Davey, I'm resurrecting this question because, apart from the DOM solution, there is a better regex solution than the ones mentioned so far. It is simple and needs only a single step.

The general solution is

<[^>]*>(*SKIP)(*F)|blue

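The left side of the alternation matches a complete <...> tag, and the PCRE verbs (*SKIP)(*F) then force that match to fail and stop the engine from retrying at any position inside the tag, so tag content is simply skipped. Text between tags, such as blue, is matched by the right side, which sounds like it fits your needs.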

In the expression, replace "blue" with whatever you want to match, such as your \b('.$value.')\b group.
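
As a minimal sketch of how this could be combined with the word-boundary pattern from the question, the PHP below puts the skip-the-tags alternative in front of it. The sample $value, the $html string and the <b> replacement are assumptions made up for illustration; they are not from the original post.

```php
<?php
// Hypothetical inputs, chosen only for illustration.
$value = 'blue';
$html  = '<a href="blue.html" title="blue sky">Blue skies</a> and a blue moon';

// Left alternative: match a whole tag, then (*SKIP)(*F) discards it.
// Right alternative: the word-boundary match from the question.
// preg_quote() escapes regex metacharacters in $value, including the ~ delimiter.
$pattern = '~<[^>]*>(*SKIP)(*F)|\b(' . preg_quote($value, '~') . ')\b~i';

$result = preg_replace($pattern, '<b>$1</b>', $html);

echo $result, PHP_EOL;
// prints: <a href="blue.html" title="blue sky"><b>Blue</b> skies</a> and a <b>blue</b> moon
```

Note that <[^>]*> treats the first > as the end of a tag, so an attribute value that itself contains > would end the skipped region early.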

Reference

  1. How to match pattern except in situations s1, s2, s3
  2. How to match a pattern unless...

Solution 2:

Regex replacements often look like the solution, but they can have many unwanted side effects and still not accomplish what you want. Look into DOMDocument instead (as some commenters have suggested).
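
A rough sketch of what the DOMDocument route might look like: parse the markup, visit only the text nodes, and apply the word-boundary pattern there. The sample inputs, the wrapping <div> and the [..] replacement are assumptions for illustration only, not something the answer specifies.

```php
<?php
// Hypothetical inputs, chosen only for illustration.
$value = 'blue';
$html  = '<p title="blue sky">A blue moon</p>';

$doc = new DOMDocument();
// Wrap the fragment in a single root element; the flags ask libxml not to
// add implied <html>/<body> elements or a default doctype.
$doc->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$xpath   = new DOMXPath($doc);
$pattern = '~\b(' . preg_quote($value, '~') . ')\b~i';

// text() selects text nodes only, so attribute values are never touched.
foreach ($xpath->query('//text()') as $node) {
    $node->data = preg_replace($pattern, '[$1]', $node->data);
}

// Serialize the children of the wrapper to get the fragment back.
$result = '';
foreach ($doc->documentElement->childNodes as $child) {
    $result .= $doc->saveHTML($child);
}

echo $result, PHP_EOL;
```

The replacement here is plain text. Wrapping a match in a new element would require creating nodes (for example by splitting the text node), because markup assigned to a text node is escaped on output.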

But if you insist on using regex, here's a good post on SO. It uses two passes to accomplish what you want.
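
The linked post is not reproduced here, so the following is not necessarily its method. It is only one way a two-pass approach could look, under the assumption that the first pass separates tags from text and the second pass replaces in the text pieces only. The sample inputs and the <b> replacement are hypothetical.

```php
<?php
// Hypothetical inputs, chosen only for illustration.
$value = 'blue';
$html  = '<p title="blue sky">A blue moon</p>';

// Pass 1: split into text and tags. The capture group together with
// PREG_SPLIT_DELIM_CAPTURE keeps the tags in the result, so the array
// alternates: text at even indexes, tags at odd indexes.
$parts = preg_split('~(<[^>]*>)~', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

// Pass 2: apply the word-boundary replacement to the text pieces only.
$pattern = '~\b(' . preg_quote($value, '~') . ')\b~i';
foreach ($parts as $i => $part) {
    if ($i % 2 === 0) {
        $parts[$i] = preg_replace($pattern, '<b>$1</b>', $part);
    }
}

$result = implode('', $parts);

echo $result, PHP_EOL;
// prints: <p title="blue sky">A <b>blue</b> moon</p>
```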
