PHP Regex for Web Developers

Regular expressions are a very useful tool for developers. They let you find, identify, or replace a word, a character, or any other kind of string. This tutorial teaches you how to use PHP regular expressions and shows ready-to-use patterns that any web developer should have in their toolkit.

Getting Started With Regular Expressions

For many beginners, regular expressions seem hard to learn and use. In fact, they are much easier than you may think. Before we dive into useful, reusable code snippets, let’s quickly go over the basics of PCRE regex patterns.

Regular Expressions Syntax

A regular expression (regex or regexp for short) is a special text string that describes a search pattern. A regex pattern matches a target string. The following table lists the most common patterns:

Regular Expression Will match…
foo The string “foo”
^foo “foo” at the start of a string
foo$ “foo” at the end of a string
^foo$ “foo” when it is the entire string
[abc] a, b, or c
[a-z] Any lowercase letter
[^A-Z] Any character that is not an uppercase letter
(gif|jpg) Matches either “gif” or “jpg”
[a-z]+ One or more lowercase letters
[0-9.-] Any digit, dot, or minus sign
^[a-zA-Z0-9_]{1,}$ Any word of at least one letter, number or _
([wx])([yz]) wy, wz, xy, or xz
[^A-Za-z0-9] Any symbol (not a number or a letter)
([A-Z]{3}|[0-9]{4}) Three uppercase letters or four digits
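
To see a few of the patterns from the table in action, here is a small sketch that checks each one against a string it should match and a string it should not. The sample strings are made up for illustration, and the last two patterns are wrapped in ^ and $ so that they must match the whole string.

```php
<?php
// Each entry: pattern, a string that should match, a string that should not.
$cases = [
    ['/^foo/',                  'foobar',  'barfoo'],
    ['/foo$/',                  'barfoo',  'foobar'],
    ['/^foo$/',                 'foo',     'foo bar'],
    ['/^[a-z]+$/',              'hello',   'Hello'],
    ['/^[a-zA-Z0-9_]{1,}$/',    'user_42', 'user 42'],
    ['/^([wx])([yz])$/',        'xz',      'wx'],
    ['/^([A-Z]{3}|[0-9]{4})$/', '2024',    'ab12'],
];

$results = [];
foreach ($cases as [$pattern, $good, $bad]) {
    // preg_match() returns 1 on a match and 0 when there is none.
    $results[] = preg_match($pattern, $good) === 1
              && preg_match($pattern, $bad) === 0;
}

echo in_array(false, $results, true) ? "Some checks failed\n" : "All checks passed\n";
```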

PHP Regular Expression Functions

PHP has many useful functions for working with regular expressions. Here is a quick cheat sheet of the main ones. Remember that matching is case sensitive unless the pattern uses the i modifier.

For more information about the native functions for PHP regular expressions, have a look at the manual.

Function Description
preg_match() Searches string for pattern, returning 1 if the pattern matches, 0 if it does not, and false if an error occurs.
preg_match_all() Finds all occurrences of pattern in string and returns the number of matches found. Useful for collecting every match.
preg_replace() Searches string for matches of pattern and replaces them with replacement. Both pattern and replacement may also be arrays.
preg_split() Splits a string into an array, using a regular expression as the delimiter.
preg_grep() Searches all elements of input_array, returning the elements that match the regex pattern.
preg_quote() Escapes regular expression special characters in a string.
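
The short sketch below runs each function from the cheat sheet once on the same sample string. The sample data and variable names are illustrative only.

```php
<?php
$line = 'apple, banana;cherry  date';

// preg_match(): 1 if the pattern matches; $m receives the captured groups.
$found = preg_match('/(\w+), (\w+)/', $line, $m); // $m[1] = 'apple', $m[2] = 'banana'

// preg_match_all(): returns the number of full matches.
$count = preg_match_all('/\b\w+\b/', $line, $all); // 4 words

// preg_replace(): replace every run of separators with a dash.
$dashed = preg_replace('/[,;\s]+/', '-', $line); // 'apple-banana-cherry-date'

// preg_split(): split on the same separators.
$parts = preg_split('/[,;\s]+/', $line); // ['apple', 'banana', 'cherry', 'date']

// preg_grep(): keep only matching elements (original keys are preserved).
$withAn = preg_grep('/an/', $parts); // [1 => 'banana']

// preg_quote(): escape characters that are special in a regex.
$safe = preg_quote('1.5+2'); // '1\.5\+2'
```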

Validate a Domain Name

Case-insensitive regex to verify that a string is a URL with a valid domain name. This is very useful when validating web forms.

$url = "http://komunitasweb.com/";
if (preg_match('/^(http|https|ftp):\/\/([A-Z0-9][A-Z0-9_-]*(?:\.[A-Z0-9][A-Z0-9_-]*)+):?(\d+)?\/?/i', $url)) {
    echo "Your url is ok.";
} else {
    echo "Wrong url.";
}
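
The same pattern already contains capture groups, so it can also return the pieces of the URL. The sketch below assumes a hypothetical helper named parse_simple_url() and uses ~ as the delimiter so the slashes need no escaping. It is an illustration of reading the groups, not a full URL parser.

```php
<?php
// Hypothetical helper: returns scheme, host, and port captured by the
// validation pattern, or null when the URL does not match.
function parse_simple_url(string $url): ?array
{
    $pattern = '~^(https?|ftp)://([A-Z0-9][A-Z0-9_-]*(?:\.[A-Z0-9][A-Z0-9_-]*)+):?(\d+)?/?~i';
    if (preg_match($pattern, $url, $m) !== 1) {
        return null;
    }
    return [
        'scheme' => strtolower($m[1]),
        'host'   => $m[2],
        // The port group is optional, so it may be missing from $m.
        'port'   => isset($m[3]) && $m[3] !== '' ? (int) $m[3] : null,
    ];
}

$ok  = parse_simple_url('https://example.com:8080/path');
$bad = parse_simple_url('not a url');
```

Here $ok holds the scheme "https", the host "example.com", and the port 8080, while $bad is null.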


Highlight a Word in a Text

This very useful regular expression finds a specific word in a string and highlights it. Extremely useful for search results. Note that the i modifier makes the match case insensitive.

$text = "Sample sentence... regex has become popular in web programming. Now we learn regex. According to wikipedia, Regular expressions (abbreviated as regex or regexp, with plural forms regexes, regexps, or regexen) are written in a formal language that can be interpreted by a regular expression processor";
// Wrap each match in a tag so it can be styled; <strong> is an illustrative choice.
echo preg_replace('/\b(regex)\b/i', '<strong>$1</strong>', $text);
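
When the word to highlight comes from user input, it should be escaped before it goes into the pattern. The following sketch assumes a hypothetical helper named highlight_word() and uses the mark tag as an arbitrary wrapper. preg_quote() is given the delimiter as its second argument so that a slash in the word cannot end the pattern early.

```php
<?php
// Hypothetical helper: wraps every whole-word occurrence of $word in <mark>.
function highlight_word(string $text, string $word): string
{
    // Escape regex metacharacters and the "/" delimiter in the search word.
    $pattern = '/\b(' . preg_quote($word, '/') . ')\b/i';
    return preg_replace($pattern, '<mark>$1</mark>', $text);
}

$highlighted = highlight_word('Now we learn regex. Regex is fun.', 'regex');
echo $highlighted;
// Now we learn <mark>regex</mark>. <mark>Regex</mark> is fun.
```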


Highlight Search Results in Your WordPress Blog

The previous code snippet can be very handy when displaying search results. If your website is powered by WordPress, here is a more specific snippet that wraps each search term found in a post title in an HTML tag that you can later style with CSS.

Open your search.php file and find the the_title() function. Replace it with the following:

echo $title;

Now, just before the modified line, add this code:

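<?php
	// $s holds the search query in WordPress search templates.
	$title = get_the_title();
	$keys = explode(" ", $s);
	$title = preg_replace('/(' . implode('|', $keys) . ')/iu',
		'<strong class="search-excerpt">\0</strong>',
		$title);
?>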

Save the search.php file and open style.css. Append the following line to it:

strong.search-excerpt { background: yellow; }


Get All Images From an HTML Document

If you have ever wanted to get all the images from a webpage, this code is a must-have. Combined with cURL, it lets you easily build an image downloader.

$images = array();
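// Collect every img=... or src=... attribute up to its closing quote.
preg_match_all('/(img|src)=("|\')[^"\'>]+/i', $data, $media);
unset($data);
// Strip the attribute name and opening quote, keeping only the URL.
$data = preg_replace('/(img|src)("|\'|="|=\')(.*)/i', '$3', $media[0]);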
foreach($data as $url)
{
	$info = pathinfo($url);
	if (isset($info['extension']))
	{
		if (($info['extension'] == 'jpg') ||
		($info['extension'] == 'jpeg') ||
		($info['extension'] == 'gif') ||
		($info['extension'] == 'png'))
		array_push($images, $url);
	}
}
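
A more compact variation of the same idea is sketched below. It assumes a hypothetical helper named extract_image_urls(), reads the src value straight from a capture group, and only looks inside img tags, which is a slightly narrower approach than the snippet above. The sample HTML is made up for illustration.

```php
<?php
// Hypothetical helper: collect image URLs from <img> tags in an HTML string.
function extract_image_urls(string $html): array
{
    // Group 1 is the opening quote, group 2 the src value up to the same quote.
    preg_match_all('/<img\b[^>]*\bsrc=(["\'])(.*?)\1/i', $html, $matches);

    $allowed = ['jpg', 'jpeg', 'gif', 'png'];
    $images  = [];
    foreach ($matches[2] as $url) {
        // Compare extensions in lowercase so "PNG" and "png" are treated alike.
        $extension = strtolower(pathinfo($url, PATHINFO_EXTENSION));
        if (in_array($extension, $allowed, true)) {
            $images[] = $url;
        }
    }
    return $images;
}

$html = '<img src="a.jpg"><img class="x" src=\'img/b.PNG\'>'
      . '<script src="app.js"></script><img src="c.svg">';
$images = extract_image_urls($html); // ['a.jpg', 'img/b.PNG']
```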


Remove Repeated Words (Case Insensitive)

Do you often repeat words while typing? This handy case-insensitive PCRE regex will be very helpful.

// Keep the leading whitespace and one copy of the repeated word.
$text = preg_replace('/(\s)(\w+\s)\2/i', '$1$2', $text);
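
As a variation, the sketch below uses word boundaries instead of surrounding whitespace, so it also catches a repeated word at the very start or end of the string and collapses runs of three or more. The sample sentence is illustrative.

```php
<?php
$text = 'This is is a test test test.';

// \b(\w+) captures a whole word; (?:\s+\1\b)+ matches one or more repeats of it.
$clean = preg_replace('/\b(\w+)(?:\s+\1\b)+/i', '$1', $text);

echo $clean; // This is a test.
```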


Remove Repeated Punctuation

Same idea as above, but this one collapses repeated punctuation within a string. The pattern below targets periods; swap in a comma to say goodbye to multiple commas!

$text = preg_replace('/\.+/i', '.', $text);
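
To handle several punctuation marks in one pass, a backreference can be used so that each mark is only collapsed with copies of itself. This is a sketch with an arbitrary set of characters; note that it also turns an intentional ellipsis into a single period.

```php
<?php
$text = 'Wait,, what??? Really...';

// ([.,;:!?]) captures one punctuation mark; \1+ matches its immediate repeats.
$clean = preg_replace('/([.,;:!?])\1+/', '$1', $text);

echo $clean; // Wait, what? Really.
```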


Match an XML/HTML Tag

This simple function takes two arguments: the first is the tag you’d like to match, and the second is the variable containing the XML or HTML. Once again, this can be very powerful when used along with cURL.

function get_tag( $tag, $xml ) {
  $tag = preg_quote($tag);
  preg_match_all('{<'.$tag.'[^>]*>(.*?)</'.$tag.'>}',
                   $xml,
                   $matches,
                   PREG_PATTERN_ORDER);

  return $matches[1];
}
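
Here is a short usage sketch. The function is repeated so the example runs on its own, and the XML sample is made up. One caveat: because the pattern does not require a boundary after the tag name, searching for "item" would also start a match at a tag such as "items", so the sample avoids tag names that share a prefix.

```php
<?php
function get_tag($tag, $xml) {
    $tag = preg_quote($tag);
    // Capture everything between the opening and closing tag, non-greedily.
    preg_match_all('{<' . $tag . '[^>]*>(.*?)</' . $tag . '>}',
                   $xml,
                   $matches,
                   PREG_PATTERN_ORDER);

    return $matches[1];
}

$xml   = '<list><item id="1">First</item><item id="2">Second</item></list>';
$items = get_tag('item', $xml); // ['First', 'Second']
```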

Match an HTML/XML Tag With a Specific Attribute Value

This function is very similar to the previous one, but it allows you to match a tag having a specific attribute. For example, you could easily match