1.20.x link.module _link_escape_domain($url)

Escapes the domain of a URL that contains UTF-8 characters so that it uses only ASCII.

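Internationalized domain names (IDNs) are problematic because they use a special encoding, Punycode, rather than ordinary URL escaping. This function converts the domain part of a URL to its Punycode form so that the URL can be validated using only ASCII characters. If idn_to_ascii() is not available, or the URL contains no domain, the URL is returned unchanged.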
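Only the host portion is passed to the IDN conversion; the scheme, any user information, and the path are reattached as they were. The following sketch isolates that splitting step using the same two regular expressions as the function. The helper name example_split_url() and its array keys are hypothetical and are not part of link.module.

```php
<?php
// Hypothetical helper for illustration only; not part of link.module.
// Splits a URL into the text before the host, the host itself, and the
// text after the host, using the same patterns as _link_escape_domain().
function example_split_url($url) {
  $m = array();
  // External links such as http://user@host/path.
  if (strpos($url, '://') && preg_match('!^(.*://)([^@]*@)?([^/]+)(.*)$!', $url, $m)) {
    return array('prefix' => $m[1] . $m[2], 'host' => $m[3], 'suffix' => $m[4]);
  }
  // E-mail links such as mailto:user@host.
  if (strpos($url, 'mailto:') === 0 && preg_match('/^(mailto:)([^@]+@)(.*)$/', $url, $m)) {
    return array('prefix' => $m[1] . $m[2], 'host' => $m[3], 'suffix' => '');
  }
  // Anything else (for example an internal path) has no domain to escape.
  return NULL;
}

$parts = example_split_url('http://user@münchen.de/path?x=1');
// $parts['prefix'] is 'http://user@', $parts['host'] is 'münchen.de',
// and $parts['suffix'] is '/path?x=1'.
```

In the real function only the 'host' piece would be handed to idn_to_ascii(), which is why a path such as /node/1 passes through untouched.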

Parameters

string $url: The URL whose domain should be escaped.

Return value

string: The provided URL with its domain escaped to ASCII.

File

modules/link/link.module, line 708
Defines simple link field types.

Code

function _link_escape_domain($url) {
  $matches = array();
  // PHP 5.3 *sometimes* may have idn_to_ascii(), either through a PECL
  // extension or when compiled with --enable-intl. PHP 5.4 and higher should
  // always have idn_to_ascii().
  if (function_exists('idn_to_ascii')) {
    // External links.
    if (strpos($url, '://') && preg_match('!^(.*://)([^@]*@)?([^/]+)(.*)$!', $url, $matches)) {
      // INTL_IDNA_VARIANT_UTS46 is the default in PHP 7.4+ but does not exist
      // in PHP 5.3. Use it if it exists, otherwise use the default variant.
      if (defined('INTL_IDNA_VARIANT_UTS46')) {
        $escaped_url = $matches['1'] . $matches['2'] . idn_to_ascii($matches['3'], IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46) . $matches['4'];
      }
      else {
        $escaped_url = $matches['1'] . $matches['2'] . idn_to_ascii($matches['3']) . $matches['4'];
      }
    }
    // E-mail links.
    elseif (strpos($url, 'mailto:') === 0 && preg_match('/^(mailto:)([^@]+@)(.*)$/', $url, $matches)) {
      if (defined('INTL_IDNA_VARIANT_UTS46')) {
        $escaped_url = $matches['1'] . $matches['2'] . idn_to_ascii($matches['3'], IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
      }
      else {
        $escaped_url = $matches['1'] . $matches['2'] . idn_to_ascii($matches['3']);
      }
    }
    // Other links that do not have the domain at all.
    else {
      $escaped_url = $url;
    }
  }
  // No IDN support available.
  else {
    $escaped_url = $url;
  }

  return $escaped_url;
}
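
The function above concatenates the result of idn_to_ascii() directly into the returned URL. PHP documents idn_to_ascii() as returning FALSE on failure, so a caller that would rather keep the original host in that case could wrap the conversion. This is a minimal sketch under that assumption; example_host_to_ascii() is a hypothetical name and is not part of link.module.

```php
<?php
// Hypothetical wrapper for illustration only; not part of link.module.
// Converts a host name to ASCII (Punycode) and falls back to the original
// host when IDN support is missing or the conversion fails.
function example_host_to_ascii($host) {
  // No IDN support available: leave the host as it is.
  if (!function_exists('idn_to_ascii')) {
    return $host;
  }
  // Prefer the UTS #46 variant when the constant is defined, as the
  // function above does.
  if (defined('INTL_IDNA_VARIANT_UTS46')) {
    $ascii = idn_to_ascii($host, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
  }
  else {
    $ascii = idn_to_ascii($host);
  }
  // idn_to_ascii() returns FALSE on failure; keep the original host then.
  return $ascii === FALSE ? $host : $ascii;
}

// A host that is already ASCII comes back unchanged.
print example_host_to_ascii('example.com') . "\n";
```

With the intl extension enabled, a host such as münchen.de is converted to its xn-- form; without it, the host is returned as given.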