<html>
<head>
<title>pcre2api specification</title>
</head>
<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
<h1>pcre2api man page</h1>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.
</p>
<p>
This page is part of the PCRE2 HTML documentation. It was generated
automatically from the original man page. If there is any nonsense in it,
please consult the man page, in case the conversion went wrong.
<br>
<ul>
<li><a name="TOC1" href="#SEC1">PCRE2 NATIVE API BASIC FUNCTIONS</a>
<li><a name="TOC2" href="#SEC2">PCRE2 NATIVE API AUXILIARY MATCH FUNCTIONS</a>
<li><a name="TOC3" href="#SEC3">PCRE2 NATIVE API GENERAL CONTEXT FUNCTIONS</a>
<li><a name="TOC4" href="#SEC4">PCRE2 NATIVE API COMPILE CONTEXT FUNCTIONS</a>
<li><a name="TOC5" href="#SEC5">PCRE2 NATIVE API MATCH CONTEXT FUNCTIONS</a>
<li><a name="TOC6" href="#SEC6">PCRE2 NATIVE API STRING EXTRACTION FUNCTIONS</a>
<li><a name="TOC7" href="#SEC7">PCRE2 NATIVE API STRING SUBSTITUTION FUNCTION</a>
<li><a name="TOC8" href="#SEC8">PCRE2 NATIVE API JIT FUNCTIONS</a>
<li><a name="TOC9" href="#SEC9">PCRE2 NATIVE API SERIALIZATION FUNCTIONS</a>
<li><a name="TOC10" href="#SEC10">PCRE2 NATIVE API AUXILIARY FUNCTIONS</a>
<li><a name="TOC11" href="#SEC11">PCRE2 NATIVE API OBSOLETE FUNCTIONS</a>
<li><a name="TOC12" href="#SEC12">PCRE2 EXPERIMENTAL PATTERN CONVERSION FUNCTIONS</a>
<li><a name="TOC13" href="#SEC13">PCRE2 8-BIT, 16-BIT, AND 32-BIT LIBRARIES</a>
<li><a name="TOC14" href="#SEC14">PCRE2 API OVERVIEW</a>
<li><a name="TOC15" href="#SEC15">STRING LENGTHS AND OFFSETS</a>
<li><a name="TOC16" href="#SEC16">NEWLINES</a>
<li><a name="TOC17" href="#SEC17">MULTITHREADING</a>
<li><a name="TOC18" href="#SEC18">PCRE2 CONTEXTS</a>
<li><a name="TOC19" href="#SEC19">CHECKING BUILD-TIME OPTIONS</a>
<li><a name="TOC20" href="#SEC20">COMPILING A PATTERN</a>
<li><a name="TOC21" href="#SEC21">JUST-IN-TIME (JIT) COMPILATION</a>
<li><a name="TOC22" href="#SEC22">LOCALE SUPPORT</a>
<li><a name="TOC23" href="#SEC23">INFORMATION ABOUT A COMPILED PATTERN</a>
<li><a name="TOC24" href="#SEC24">INFORMATION ABOUT A PATTERN'S CALLOUTS</a>
<li><a name="TOC25" href="#SEC25">SERIALIZATION AND PRECOMPILING</a>
<li><a name="TOC26" href="#SEC26">THE MATCH DATA BLOCK</a>
<li><a name="TOC27" href="#SEC27">MATCHING A PATTERN: THE TRADITIONAL FUNCTION</a>
<li><a name="TOC28" href="#SEC28">NEWLINE HANDLING WHEN MATCHING</a>
<li><a name="TOC29" href="#SEC29">HOW PCRE2_MATCH() RETURNS A STRING AND CAPTURED SUBSTRINGS</a>
<li><a name="TOC30" href="#SEC30">OTHER INFORMATION ABOUT A MATCH</a>
<li><a name="TOC31" href="#SEC31">ERROR RETURNS FROM <b>pcre2_match()</b></a>
<li><a name="TOC32" href="#SEC32">OBTAINING A TEXTUAL ERROR MESSAGE</a>
<li><a name="TOC33" href="#SEC33">EXTRACTING CAPTURED SUBSTRINGS BY NUMBER</a>
<li><a name="TOC34" href="#SEC34">EXTRACTING A LIST OF ALL CAPTURED SUBSTRINGS</a>
<li><a name="TOC35" href="#SEC35">EXTRACTING CAPTURED SUBSTRINGS BY NAME</a>
<li><a name="TOC36" href="#SEC36">CREATING A NEW STRING WITH SUBSTITUTIONS</a>
<li><a name="TOC37" href="#SEC37">DUPLICATE CAPTURE GROUP NAMES</a>
<li><a name="TOC38" href="#SEC38">FINDING ALL POSSIBLE MATCHES AT ONE POSITION</a>
<li><a name="TOC39" href="#SEC39">MATCHING A PATTERN: THE ALTERNATIVE FUNCTION</a>
<li><a name="TOC40" href="#SEC40">SEE ALSO</a>
<li><a name="TOC41" href="#SEC41">AUTHOR</a>
<li><a name="TOC42" href="#SEC42">REVISION</a>
</ul>
<P>
<b>#include &lt;pcre2.h&gt;</b>
<br>
<br>
PCRE2 is a new API for PCRE, starting at release 10.0. This document contains a
description of all its native functions. See the
<a href="pcre2.html"><b>pcre2</b></a>
document for an overview of all the PCRE2 documentation.
</P>
<br><a name="SEC1" href="#TOC1">PCRE2 NATIVE API BASIC FUNCTIONS</a><br>
<P>
<b>pcre2_code *pcre2_compile(PCRE2_SPTR <i>pattern</i>, PCRE2_SIZE <i>length</i>,</b>
<b> uint32_t <i>options</i>, int *<i>errorcode</i>, PCRE2_SIZE *<i>erroroffset,</i></b>
<b> pcre2_compile_context *<i>ccontext</i>);</b>
<br>
<br>
<b>void pcre2_code_free(pcre2_code *<i>code</i>);</b>
<br>
<br>
<b>pcre2_match_data *pcre2_match_data_create(uint32_t <i>ovecsize</i>,</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>pcre2_match_data *pcre2_match_data_create_from_pattern(</b>
<b> const pcre2_code *<i>code</i>, pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>int pcre2_match(const pcre2_code *<i>code</i>, PCRE2_SPTR <i>subject</i>,</b>
<b> PCRE2_SIZE <i>length</i>, PCRE2_SIZE <i>startoffset</i>,</b>
<b> uint32_t <i>options</i>, pcre2_match_data *<i>match_data</i>,</b>
<b> pcre2_match_context *<i>mcontext</i>);</b>
<br>
<br>
<b>int pcre2_dfa_match(const pcre2_code *<i>code</i>, PCRE2_SPTR <i>subject</i>,</b>
<b> PCRE2_SIZE <i>length</i>, PCRE2_SIZE <i>startoffset</i>,</b>
<b> uint32_t <i>options</i>, pcre2_match_data *<i>match_data</i>,</b>
<b> pcre2_match_context *<i>mcontext</i>,</b>
<b> int *<i>workspace</i>, PCRE2_SIZE <i>wscount</i>);</b>
<br>
<br>
<b>void pcre2_match_data_free(pcre2_match_data *<i>match_data</i>);</b>
</P>
<br><a name="SEC2" href="#TOC1">PCRE2 NATIVE API AUXILIARY MATCH FUNCTIONS</a><br>
<P>
<b>PCRE2_SPTR pcre2_get_mark(pcre2_match_data *<i>match_data</i>);</b>
<br>
<br>
<b>uint32_t pcre2_get_ovector_count(pcre2_match_data *<i>match_data</i>);</b>
<br>
<br>
<b>PCRE2_SIZE *pcre2_get_ovector_pointer(pcre2_match_data *<i>match_data</i>);</b>
<br>
<br>
<b>PCRE2_SIZE pcre2_get_startchar(pcre2_match_data *<i>match_data</i>);</b>
</P>
<br><a name="SEC3" href="#TOC1">PCRE2 NATIVE API GENERAL CONTEXT FUNCTIONS</a><br>
<P>
<b>pcre2_general_context *pcre2_general_context_create(</b>
<b> void *(*<i>private_malloc</i>)(PCRE2_SIZE, void *),</b>
<b> void (*<i>private_free</i>)(void *, void *), void *<i>memory_data</i>);</b>
<br>
<br>
<b>pcre2_general_context *pcre2_general_context_copy(</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>void pcre2_general_context_free(pcre2_general_context *<i>gcontext</i>);</b>
</P>
<br><a name="SEC4" href="#TOC1">PCRE2 NATIVE API COMPILE CONTEXT FUNCTIONS</a><br>
<P>
<b>pcre2_compile_context *pcre2_compile_context_create(</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>pcre2_compile_context *pcre2_compile_context_copy(</b>
<b> pcre2_compile_context *<i>ccontext</i>);</b>
<br>
<br>
<b>void pcre2_compile_context_free(pcre2_compile_context *<i>ccontext</i>);</b>
<br>
<br>
<b>int pcre2_set_bsr(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_character_tables(pcre2_compile_context *<i>ccontext</i>,</b>
<b> const uint8_t *<i>tables</i>);</b>
<br>
<br>
<b>int pcre2_set_compile_extra_options(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>extra_options</i>);</b>
<br>
<br>
<b>int pcre2_set_max_pattern_length(pcre2_compile_context *<i>ccontext</i>,</b>
<b> PCRE2_SIZE <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_newline(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_parens_nest_limit(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_compile_recursion_guard(pcre2_compile_context *<i>ccontext</i>,</b>
<b> int (*<i>guard_function</i>)(uint32_t, void *), void *<i>user_data</i>);</b>
</P>
<br><a name="SEC5" href="#TOC1">PCRE2 NATIVE API MATCH CONTEXT FUNCTIONS</a><br>
<P>
<b>pcre2_match_context *pcre2_match_context_create(</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>pcre2_match_context *pcre2_match_context_copy(</b>
<b> pcre2_match_context *<i>mcontext</i>);</b>
<br>
<br>
<b>void pcre2_match_context_free(pcre2_match_context *<i>mcontext</i>);</b>
<br>
<br>
<b>int pcre2_set_callout(pcre2_match_context *<i>mcontext</i>,</b>
<b> int (*<i>callout_function</i>)(pcre2_callout_block *, void *),</b>
<b> void *<i>callout_data</i>);</b>
<br>
<br>
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
<b> int (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
<b> void *<i>callout_data</i>);</b>
<br>
<br>
<b>int pcre2_set_offset_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> PCRE2_SIZE <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_heap_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_match_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_depth_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
</P>
<br><a name="SEC6" href="#TOC1">PCRE2 NATIVE API STRING EXTRACTION FUNCTIONS</a><br>
<P>
<b>int pcre2_substring_copy_byname(pcre2_match_data *<i>match_data</i>,</b>
<b> PCRE2_SPTR <i>name</i>, PCRE2_UCHAR *<i>buffer</i>, PCRE2_SIZE *<i>bufflen</i>);</b>
<br>
<br>
<b>int pcre2_substring_copy_bynumber(pcre2_match_data *<i>match_data</i>,</b>
<b> uint32_t <i>number</i>, PCRE2_UCHAR *<i>buffer</i>,</b>
<b> PCRE2_SIZE *<i>bufflen</i>);</b>
<br>
<br>
<b>void pcre2_substring_free(PCRE2_UCHAR *<i>buffer</i>);</b>
<br>
<br>
<b>int pcre2_substring_get_byname(pcre2_match_data *<i>match_data</i>,</b>
<b> PCRE2_SPTR <i>name</i>, PCRE2_UCHAR **<i>bufferptr</i>, PCRE2_SIZE *<i>bufflen</i>);</b>
<br>
<br>
<b>int pcre2_substring_get_bynumber(pcre2_match_data *<i>match_data</i>,</b>
<b> uint32_t <i>number</i>, PCRE2_UCHAR **<i>bufferptr</i>,</b>
<b> PCRE2_SIZE *<i>bufflen</i>);</b>
<br>
<br>
<b>int pcre2_substring_length_byname(pcre2_match_data *<i>match_data</i>,</b>
<b> PCRE2_SPTR <i>name</i>, PCRE2_SIZE *<i>length</i>);</b>
<br>
<br>
<b>int pcre2_substring_length_bynumber(pcre2_match_data *<i>match_data</i>,</b>
<b> uint32_t <i>number</i>, PCRE2_SIZE *<i>length</i>);</b>
<br>
<br>
<b>int pcre2_substring_nametable_scan(const pcre2_code *<i>code</i>,</b>
<b> PCRE2_SPTR <i>name</i>, PCRE2_SPTR *<i>first</i>, PCRE2_SPTR *<i>last</i>);</b>
<br>
<br>
<b>int pcre2_substring_number_from_name(const pcre2_code *<i>code</i>,</b>
<b> PCRE2_SPTR <i>name</i>);</b>
<br>
<br>
<b>void pcre2_substring_list_free(PCRE2_SPTR *<i>list</i>);</b>
<br>
<br>
<b>int pcre2_substring_list_get(pcre2_match_data *<i>match_data</i>,</b>
<b> PCRE2_UCHAR ***<i>listptr</i>, PCRE2_SIZE **<i>lengthsptr</i>);</b>
</P>
<br><a name="SEC7" href="#TOC1">PCRE2 NATIVE API STRING SUBSTITUTION FUNCTION</a><br>
<P>
<b>int pcre2_substitute(const pcre2_code *<i>code</i>, PCRE2_SPTR <i>subject</i>,</b>
<b> PCRE2_SIZE <i>length</i>, PCRE2_SIZE <i>startoffset</i>,</b>
<b> uint32_t <i>options</i>, pcre2_match_data *<i>match_data</i>,</b>
<b> pcre2_match_context *<i>mcontext</i>, PCRE2_SPTR <i>replacement</i>,</b>
<b> PCRE2_SIZE <i>rlength</i>, PCRE2_UCHAR *<i>outputbuffer</i>,</b>
<b> PCRE2_SIZE *<i>outlengthptr</i>);</b>
</P>
<br><a name="SEC8" href="#TOC1">PCRE2 NATIVE API JIT FUNCTIONS</a><br>
<P>
<b>int pcre2_jit_compile(pcre2_code *<i>code</i>, uint32_t <i>options</i>);</b>
<br>
<br>
<b>int pcre2_jit_match(const pcre2_code *<i>code</i>, PCRE2_SPTR <i>subject</i>,</b>
<b> PCRE2_SIZE <i>length</i>, PCRE2_SIZE <i>startoffset</i>,</b>
<b> uint32_t <i>options</i>, pcre2_match_data *<i>match_data</i>,</b>
<b> pcre2_match_context *<i>mcontext</i>);</b>
<br>
<br>
<b>void pcre2_jit_free_unused_memory(pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>pcre2_jit_stack *pcre2_jit_stack_create(PCRE2_SIZE <i>startsize</i>,</b>
<b> PCRE2_SIZE <i>maxsize</i>, pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>void pcre2_jit_stack_assign(pcre2_match_context *<i>mcontext</i>,</b>
<b> pcre2_jit_callback <i>callback_function</i>, void *<i>callback_data</i>);</b>
<br>
<br>
<b>void pcre2_jit_stack_free(pcre2_jit_stack *<i>jit_stack</i>);</b>
</P>
<br><a name="SEC9" href="#TOC1">PCRE2 NATIVE API SERIALIZATION FUNCTIONS</a><br>
<P>
<b>int32_t pcre2_serialize_decode(pcre2_code **<i>codes</i>,</b>
<b> int32_t <i>number_of_codes</i>, const uint8_t *<i>bytes</i>,</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>int32_t pcre2_serialize_encode(const pcre2_code **<i>codes</i>,</b>
<b> int32_t <i>number_of_codes</i>, uint8_t **<i>serialized_bytes</i>,</b>
<b> PCRE2_SIZE *<i>serialized_size</i>, pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>void pcre2_serialize_free(uint8_t *<i>bytes</i>);</b>
<br>
<br>
<b>int32_t pcre2_serialize_get_number_of_codes(const uint8_t *<i>bytes</i>);</b>
</P>
<br><a name="SEC10" href="#TOC1">PCRE2 NATIVE API AUXILIARY FUNCTIONS</a><br>
<P>
<b>pcre2_code *pcre2_code_copy(const pcre2_code *<i>code</i>);</b>
<br>
<br>
<b>pcre2_code *pcre2_code_copy_with_tables(const pcre2_code *<i>code</i>);</b>
<br>
<br>
<b>int pcre2_get_error_message(int <i>errorcode</i>, PCRE2_UCHAR *<i>buffer</i>,</b>
<b> PCRE2_SIZE <i>bufflen</i>);</b>
<br>
<br>
<b>const uint8_t *pcre2_maketables(pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>void pcre2_maketables_free(pcre2_general_context *<i>gcontext</i>,</b>
<b> const uint8_t *<i>tables</i>);</b>
<br>
<br>
<b>int pcre2_pattern_info(const pcre2_code *<i>code</i>, uint32_t <i>what</i>,</b>
<b> void *<i>where</i>);</b>
<br>
<br>
<b>int pcre2_callout_enumerate(const pcre2_code *<i>code</i>,</b>
<b> int (*<i>callback</i>)(pcre2_callout_enumerate_block *, void *),</b>
<b> void *<i>user_data</i>);</b>
<br>
<br>
<b>int pcre2_config(uint32_t <i>what</i>, void *<i>where</i>);</b>
</P>
<br><a name="SEC11" href="#TOC1">PCRE2 NATIVE API OBSOLETE FUNCTIONS</a><br>
<P>
<b>int pcre2_set_recursion_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_recursion_memory_management(</b>
<b> pcre2_match_context *<i>mcontext</i>,</b>
<b> void *(*<i>private_malloc</i>)(PCRE2_SIZE, void *),</b>
<b> void (*<i>private_free</i>)(void *, void *), void *<i>memory_data</i>);</b>
<br>
<br>
These functions became obsolete at release 10.30 and are retained only for
backward compatibility. They should not be used in new code. The first is
replaced by <b>pcre2_set_depth_limit()</b>; the second is no longer needed and
has no effect (it always returns zero).
</P>
<br><a name="SEC12" href="#TOC1">PCRE2 EXPERIMENTAL PATTERN CONVERSION FUNCTIONS</a><br>
<P>
<b>pcre2_convert_context *pcre2_convert_context_create(</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>pcre2_convert_context *pcre2_convert_context_copy(</b>
<b> pcre2_convert_context *<i>cvcontext</i>);</b>
<br>
<br>
<b>void pcre2_convert_context_free(pcre2_convert_context *<i>cvcontext</i>);</b>
<br>
<br>
<b>int pcre2_set_glob_escape(pcre2_convert_context *<i>cvcontext</i>,</b>
<b> uint32_t <i>escape_char</i>);</b>
<br>
<br>
<b>int pcre2_set_glob_separator(pcre2_convert_context *<i>cvcontext</i>,</b>
<b> uint32_t <i>separator_char</i>);</b>
<br>
<br>
<b>int pcre2_pattern_convert(PCRE2_SPTR <i>pattern</i>, PCRE2_SIZE <i>length</i>,</b>
<b> uint32_t <i>options</i>, PCRE2_UCHAR **<i>buffer</i>,</b>
<b> PCRE2_SIZE *<i>blength</i>, pcre2_convert_context *<i>cvcontext</i>);</b>
<br>
<br>
<b>void pcre2_converted_pattern_free(PCRE2_UCHAR *<i>converted_pattern</i>);</b>
<br>
<br>
These functions provide a way of converting non-PCRE2 patterns into
patterns that can be processed by <b>pcre2_compile()</b>. This facility is
experimental and may be changed in future releases. At present, "globs" and
POSIX basic and extended patterns can be converted. Details are given in the
<a href="pcre2convert.html"><b>pcre2convert</b></a>
documentation.
</P>
<br><a name="SEC13" href="#TOC1">PCRE2 8-BIT, 16-BIT, AND 32-BIT LIBRARIES</a><br>
<P>
There are three PCRE2 libraries, supporting 8-bit, 16-bit, and 32-bit code
units, respectively. However, there is just one header file, <b>pcre2.h</b>.
This contains the function prototypes and other definitions for all three
libraries. One, two, or all three can be installed simultaneously. On Unix-like
systems the libraries are called <b>libpcre2-8</b>, <b>libpcre2-16</b>, and
<b>libpcre2-32</b>, and they can also co-exist with the original PCRE libraries.
</P>
<P>
Character strings are passed to and from a PCRE2 library as a sequence of
unsigned integers in code units of the appropriate width. Every PCRE2 function
comes in three different forms, one for each library, for example:
<pre>
<b>pcre2_compile_8()</b>
<b>pcre2_compile_16()</b>
<b>pcre2_compile_32()</b>
</pre>
There are also three different sets of data types:
<pre>
<b>PCRE2_UCHAR8, PCRE2_UCHAR16, PCRE2_UCHAR32</b>
<b>PCRE2_SPTR8, PCRE2_SPTR16, PCRE2_SPTR32</b>
</pre>
The UCHAR types define unsigned code units of the appropriate widths. For
example, PCRE2_UCHAR16 is usually defined as <i>uint16_t</i>. The SPTR types are
pointers to constant data of the equivalent UCHAR types, that is, they are
pointers to read-only vectors of unsigned code units.
</P>
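<P>
As a rough illustration of these types, the sketch below declares stand-ins for
a 16-bit code unit and its string pointer, and counts the code units in a
zero-terminated string. The names <b>demo_uchar16</b>, <b>demo_sptr16</b>, and
<b>demo_strlen16()</b> are hypothetical and are not part of PCRE2; real code
would use PCRE2_UCHAR16 and PCRE2_SPTR16 from <b>pcre2.h</b>.
</P>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: pcre2.h provides PCRE2_UCHAR16 and PCRE2_SPTR16. */
typedef uint16_t demo_uchar16;            /* one unsigned 16-bit code unit */
typedef const demo_uchar16 *demo_sptr16;  /* pointer to read-only code units */

/* Count the code units that precede the terminating zero unit. */
size_t demo_strlen16(demo_sptr16 s)
{
  size_t n = 0;
  while (s[n] != 0) n++;
  return n;
}
```

<P>
Note that the result is a count of code units, not of bytes or characters.
</P>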
<P>
Many applications use only one code unit width. For their convenience, macros
are defined whose names are the generic forms such as <b>pcre2_compile()</b> and
PCRE2_SPTR. These macros use the value of the macro PCRE2_CODE_UNIT_WIDTH to
generate the appropriate width-specific function and macro names.
PCRE2_CODE_UNIT_WIDTH is not defined by default. An application must define it
to be 8, 16, or 32 before including <b>pcre2.h</b> in order to make use of the
generic names.
</P>
<P>
Applications that use more than one code unit width can be linked with more
than one PCRE2 library, but must define PCRE2_CODE_UNIT_WIDTH to be 0 before
including <b>pcre2.h</b>, and then use the real function names. Any code that is
to be included in an environment where the value of PCRE2_CODE_UNIT_WIDTH is
unknown should also use the real function names. (Unfortunately, it is not
possible in C code to save and restore the value of a macro.)
</P>
<P>
If PCRE2_CODE_UNIT_WIDTH is not defined before including <b>pcre2.h</b>, a
compiler error occurs.
</P>
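<P>
The way a width macro can select width-specific names may be easier to see in a
small sketch of the underlying preprocessor technique (token pasting through
two macro levels). Every name beginning with <b>demo_</b> or <b>DEMO_</b> is
hypothetical; this illustrates the general approach and is not a copy of what
<b>pcre2.h</b> contains.
</P>

```c
/* Hypothetical stand-ins for three width-specific functions. */
int demo_compile_8(void)  { return 8; }
int demo_compile_16(void) { return 16; }
int demo_compile_32(void) { return 32; }

/* The application chooses the width before the generic names are defined. */
#define DEMO_CODE_UNIT_WIDTH 16

/* Two levels are needed so that DEMO_CODE_UNIT_WIDTH is expanded to its
   value before the ## operator pastes the tokens together. */
#define DEMO_JOIN(a, b) a ## b
#define DEMO_GLUE(a, b) DEMO_JOIN(a, b)
#define DEMO_SUFFIX(name) DEMO_GLUE(name, DEMO_CODE_UNIT_WIDTH)

/* The generic name now expands to demo_compile_16. */
#define demo_compile DEMO_SUFFIX(demo_compile_)

int demo_selected_width(void)
{
  return demo_compile();
}
```

<P>
With this arrangement, changing the single width definition redirects every
generic name at once, which is why the width must be fixed before the generic
names are introduced.
</P>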
<P>
When using multiple libraries in an application, you must take care when
processing any particular pattern to use only functions from a single library.
For example, if you want to run a match using a pattern that was compiled with
<b>pcre2_compile_16()</b>, you must do so with <b>pcre2_match_16()</b>, not
<b>pcre2_match_8()</b> or <b>pcre2_match_32()</b>.
</P>
<P>
In the function summaries above, and in the rest of this document and other
PCRE2 documents, functions and data types are described using their generic
names, without the _8, _16, or _32 suffix.
</P>
<br><a name="SEC14" href="#TOC1">PCRE2 API OVERVIEW</a><br>
<P>
PCRE2 has its own native API, which is described in this document. There are
also some wrapper functions for the 8-bit library that correspond to the
POSIX regular expression API, but they do not give access to all the
functionality of PCRE2. They are described in the
<a href="pcre2posix.html"><b>pcre2posix</b></a>
documentation. Both these APIs define a set of C function calls.
</P>
<P>
The native API C data types, function prototypes, option values, and error
codes are defined in the header file <b>pcre2.h</b>, which also contains
definitions of PCRE2_MAJOR and PCRE2_MINOR, the major and minor release numbers
for the library. Applications can use these to include support for different
releases of PCRE2.
</P>
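<P>
A minimal sketch of such a release check follows. It assumes that
<b>pcre2_set_depth_limit()</b> is available from release 10.30, the release at
which the function summary above says <b>pcre2_set_recursion_limit()</b> became
obsolete. The macro DEMO_PCRE2_AT_LEAST and the function
<b>demo_has_depth_limit()</b> are hypothetical, and the fallback definitions of
PCRE2_MAJOR and PCRE2_MINOR exist only so that the sketch compiles without the
header; a real program would take them from <b>pcre2.h</b>.
</P>

```c
/* Stand-in values, used only when pcre2.h has not been included. */
#ifndef PCRE2_MAJOR
#define PCRE2_MAJOR 10
#define PCRE2_MINOR 30
#endif

/* Hypothetical helper: true when the library release is at least major.minor. */
#define DEMO_PCRE2_AT_LEAST(major, minor) \
  (PCRE2_MAJOR > (major) || (PCRE2_MAJOR == (major) && PCRE2_MINOR >= (minor)))

/* Report which limit-setting function this build would call. */
int demo_has_depth_limit(void)
{
#if DEMO_PCRE2_AT_LEAST(10, 30)
  return 1;   /* would call pcre2_set_depth_limit() */
#else
  return 0;   /* would fall back to pcre2_set_recursion_limit() */
#endif
}
```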
<P>
In a Windows environment, if you want to statically link an application program
against a non-dll PCRE2 library, you must define PCRE2_STATIC before including
<b>pcre2.h</b>.
</P>
<P>
The functions <b>pcre2_compile()</b> and <b>pcre2_match()</b> are used for
compiling and matching regular expressions in a Perl-compatible manner. A
sample program that demonstrates the simplest way of using them is provided in
the file called <i>pcre2demo.c</i> in the PCRE2 source distribution. A listing
of this program is given in the
<a href="pcre2demo.html"><b>pcre2demo</b></a>
documentation, and the
<a href="pcre2sample.html"><b>pcre2sample</b></a>
documentation describes how to compile and run it.
</P>
<P>
The compiling and matching functions recognize various options that are passed
as bits in an options argument. There are also some more complicated parameters
such as custom memory management functions and resource limits that are passed
in "contexts" (which are just memory blocks, described below). Simple
applications do not need to make use of contexts.
</P>
<P>
Just-in-time (JIT) compiler support is an optional feature of PCRE2 that can be
built in appropriate hardware environments. It greatly speeds up the matching
performance of many patterns. Programs can request that it be used if
available by calling <b>pcre2_jit_compile()</b> after a pattern has been
successfully compiled by <b>pcre2_compile()</b>. This does nothing if JIT
support is not available.
</P>
<P>
More complicated programs might need to make use of the specialist functions
<b>pcre2_jit_stack_create()</b>, <b>pcre2_jit_stack_free()</b>, and
<b>pcre2_jit_stack_assign()</b> in order to control the JIT code's memory usage.
</P>
<P>
JIT matching is automatically used by <b>pcre2_match()</b> if it is available,
unless the PCRE2_NO_JIT option is set. There is also a direct interface for JIT
matching, which gives improved performance at the expense of less sanity
checking. The JIT-specific functions are discussed in the
<a href="pcre2jit.html"><b>pcre2jit</b></a>
documentation.
</P>
<P>
A second matching function, <b>pcre2_dfa_match()</b>, which is not
Perl-compatible, is also provided. This uses a different algorithm for the
matching. The alternative algorithm finds all possible matches (at a given
point in the subject), and scans the subject just once (unless there are
lookaround assertions). However, this algorithm does not return captured
substrings. A description of the two matching algorithms and their advantages
and disadvantages is given in the
<a href="pcre2matching.html"><b>pcre2matching</b></a>
documentation. There is no JIT support for <b>pcre2_dfa_match()</b>.
</P>
<P>
In addition to the main compiling and matching functions, there are convenience
functions for extracting captured substrings from a subject string that has
been matched by <b>pcre2_match()</b>. They are:
<pre>
<b>pcre2_substring_copy_byname()</b>
<b>pcre2_substring_copy_bynumber()</b>
<b>pcre2_substring_get_byname()</b>
<b>pcre2_substring_get_bynumber()</b>
<b>pcre2_substring_list_get()</b>
<b>pcre2_substring_length_byname()</b>
<b>pcre2_substring_length_bynumber()</b>
<b>pcre2_substring_nametable_scan()</b>
<b>pcre2_substring_number_from_name()</b>
</pre>
<b>pcre2_substring_free()</b> and <b>pcre2_substring_list_free()</b> are also
provided, to free memory used for extracted strings. If either of these
functions is called with a NULL argument, the function returns immediately
without doing anything.
</P>
<P>
The function <b>pcre2_substitute()</b> can be called to match a pattern and
return a copy of the subject string with substitutions for parts that were
matched.
</P>
<P>
Functions whose names begin with <b>pcre2_serialize_</b> are used for saving
compiled patterns on disc or elsewhere, and reloading them later.
</P>
<P>
Finally, there are functions for finding out information about a compiled
pattern (<b>pcre2_pattern_info()</b>) and about the configuration with which
PCRE2 was built (<b>pcre2_config()</b>).
</P>
<P>
Functions with names ending with <b>_free()</b> are used for freeing memory
blocks of various sorts. In all cases, if one of these functions is called with
a NULL argument, it does nothing.
</P>
<br><a name="SEC15" href="#TOC1">STRING LENGTHS AND OFFSETS</a><br>
<P>
The PCRE2 API uses string lengths and offsets into strings of code units in
several places. These values are always of type PCRE2_SIZE, which is an
unsigned integer type, currently always defined as <i>size_t</i>. The largest
value that can be stored in such a type (that is ~(PCRE2_SIZE)0) is reserved
as a special indicator for zero-terminated strings and unset offsets.
Therefore, the longest string that can be handled is one less than this
maximum.
<a name="newlines"></a></P>
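<P>
The reserved maximum value can be pictured with the following sketch, which
uses <i>size_t</i> in place of PCRE2_SIZE. DEMO_ZERO_TERMINATED and
<b>demo_effective_length()</b> are hypothetical names; in real code the header
<b>pcre2.h</b> supplies the macros PCRE2_ZERO_TERMINATED and PCRE2_UNSET for
this value.
</P>

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the reserved value ~(PCRE2_SIZE)0. */
#define DEMO_ZERO_TERMINATED (~(size_t)0)

/* Resolve a length argument: the reserved value means "measure the string
   up to its terminating zero"; any other value is taken as given. */
size_t demo_effective_length(const char *subject, size_t length)
{
  return (length == DEMO_ZERO_TERMINATED) ? strlen(subject) : length;
}
```

<P>
Because the largest value is taken by the indicator, an explicit length can
never be equal to it, which is why the longest usable string is one code unit
shorter than the maximum.
</P>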
<br><a name="SEC16" href="#TOC1">NEWLINES</a><br>
<P>
PCRE2 supports five different conventions for indicating line breaks in
strings: a single CR (carriage return) character, a single LF (linefeed)
character, the two-character sequence CRLF, any of the three preceding, or any
Unicode newline sequence. The Unicode newline sequences are the three just
mentioned, plus the single characters VT (vertical tab, U+000B), FF (form feed,
U+000C), NEL (next line, U+0085), LS (line separator, U+2028), and PS
(paragraph separator, U+2029).
</P>
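<P>
The "any Unicode newline sequence" convention listed above can be sketched as a
small recognizer. The function <b>demo_unicode_newline_length()</b> is
hypothetical and works on code points that have already been decoded; it is
meant only to make the list of sequences concrete and does not reflect how
PCRE2 implements the check.
</P>

```c
#include <stddef.h>
#include <stdint.h>

/* Return how many code points at position i form one newline under the
   "any Unicode newline sequence" convention, or 0 if there is none. */
size_t demo_unicode_newline_length(const uint32_t *text, size_t length, size_t i)
{
  if (i >= length) return 0;
  switch (text[i])
  {
    case 0x000D:  /* CR alone, or CRLF when LF follows */
      return (i + 1 < length && text[i + 1] == 0x000A) ? 2 : 1;
    case 0x000A:  /* LF */
    case 0x000B:  /* VT */
    case 0x000C:  /* FF */
    case 0x0085:  /* NEL */
    case 0x2028:  /* LS */
    case 0x2029:  /* PS */
      return 1;
    default:
      return 0;
  }
}
```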
<P>
Each of the first three conventions is used by at least one operating system as
its standard newline sequence. When PCRE2 is built, a default can be specified.
If it is not, the default is set to LF, which is the Unix standard. However,
the newline convention can be changed by an application when calling
<b>pcre2_compile()</b>, or it can be specified by special text at the start of
the pattern itself; this overrides any other settings. See the
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
page for details of the special character sequences.
</P>
<P>
In the PCRE2 documentation the word "newline" is used to mean "the character or
pair of characters that indicate a line break". The choice of newline
convention affects the handling of the dot, circumflex, and dollar
metacharacters, the handling of #-comments in /x mode, and, when CRLF is a
recognized line ending sequence, the match position advancement for a
non-anchored pattern. There is more detail about this in the
<a href="#matchoptions">section on <b>pcre2_match()</b> options</a>
below.
</P>
<P>
The choice of newline convention does not affect the interpretation of
the \n or \r escape sequences, nor does it affect what \R matches; this has
its own separate convention.
</P>
<br><a name="SEC17" href="#TOC1">MULTITHREADING</a><br>
<P>
In a multithreaded application it is important to keep thread-specific data
separate from data that can be shared between threads. The PCRE2 library code
itself is thread-safe: it contains no static or global variables. The API is
designed to be fairly simple for non-threaded applications while at the same
time ensuring that multithreaded applications can use it.
</P>
<P>
There are several different blocks of data that are used to pass information
between the application and the PCRE2 libraries.
</P>
<br><b>
The compiled pattern
</b><br>
<P>
A pointer to the compiled form of a pattern is returned to the user when
<b>pcre2_compile()</b> is successful. The data in the compiled pattern is fixed,
and does not change when the pattern is matched. Therefore, it is thread-safe,
that is, the same compiled pattern can be used by more than one thread
simultaneously. For example, an application can compile all its patterns at the
start, before forking off multiple threads that use them. However, if the
just-in-time (JIT) optimization feature is being used, it needs separate memory
stack areas for each thread. See the
<a href="pcre2jit.html"><b>pcre2jit</b></a>
documentation for more details.
</P>
<P>
In a more complicated situation, where patterns are compiled only when they are
first needed, but are still shared between threads, pointers to compiled
patterns must be protected from simultaneous writing by multiple threads. This
is somewhat tricky to do correctly. If you know that writing to a pointer is
atomic in your environment, you can use logic like this:
<pre>
Get a read-only (shared) lock (mutex) for pointer
if (pointer == NULL)
{
Get a write (unique) lock for pointer
if (pointer == NULL) pointer = pcre2_compile(...
}
Release the lock
Use pointer in pcre2_match()
</pre>
Of course, testing for compilation errors should also be included in the code.
</P>
<P>
The reason for checking the pointer a second time is as follows: Several
threads may have acquired the shared lock and tested the pointer for being
NULL, but only one of them will be given the write lock, with the rest kept
waiting. The winning thread will compile the pattern and store the result.
After this thread releases the write lock, another thread will get it; if that
thread does not retest the pointer for being NULL, it will recompile the
pattern and overwrite the pointer, creating a memory leak and possibly causing
other issues.
</P>
<P>
In an environment where writing to a pointer may not be atomic, the above logic
is not sufficient. The thread that is doing the compiling may be descheduled
after writing only part of the pointer, which could cause other threads to use
an invalid value. Instead of checking the pointer itself, a separate "pointer
is valid" flag (that can be updated atomically) must be used:
<pre>
Get a read-only (shared) lock (mutex) for pointer
if (!pointer_is_valid)
{
Get a write (unique) lock for pointer
if (!pointer_is_valid)
{
pointer = pcre2_compile(...
pointer_is_valid = TRUE
}
}
Release the lock
Use pointer in pcre2_match()
</pre>
If JIT is being used, but the JIT compilation is not being done immediately
(perhaps waiting to see if the pattern is used often enough), similar logic is
required. JIT compilation updates a value within the compiled code block, so a
thread must gain unique write access to the pointer before calling
<b>pcre2_jit_compile()</b>. Alternatively, <b>pcre2_code_copy()</b> or
<b>pcre2_code_copy_with_tables()</b> can be used to obtain a private copy of the
compiled code before calling the JIT compiler.
</P>
<br><b>
Context blocks
</b><br>
<P>
The next main section below introduces the idea of "contexts" in which PCRE2
functions are called. A context is nothing more than a collection of parameters
that control the way PCRE2 operates. Grouping a number of parameters together
in a context is a convenient way of passing them to a PCRE2 function without
using lots of arguments. The parameters that are stored in contexts are in some
sense "advanced features" of the API. Many straightforward applications will
not need to use contexts.
</P>
<P>
In a multithreaded application, if the parameters in a context are values that
are never changed, the same context can be used by all the threads. However, if
any thread needs to change any value in a context, it must make its own
thread-specific copy.
</P>
<br><b>
Match blocks
</b><br>
<P>
The matching functions need a block of memory for storing the results of a
match. This includes details of what was matched, as well as additional
information such as the name of a (*MARK) setting. Each thread must provide its
own copy of this memory.
</P>
<br><a name="SEC18" href="#TOC1">PCRE2 CONTEXTS</a><br>
<P>
Some PCRE2 functions have a lot of parameters, many of which are used only by
specialist applications, for example, those that use custom memory management
or non-standard character tables. To keep function argument lists at a
reasonable size, and at the same time to keep the API extensible, "uncommon"
parameters are passed to certain functions in a <b>context</b> instead of
directly. A context is just a block of memory that holds the parameter values.
Applications that do not need to adjust any of the context parameters can pass
NULL when a context pointer is required.
</P>
<P>
There are three different types of context: a general context that is relevant
for several PCRE2 operations, a compile-time context, and a match-time context.
</P>
<br><b>
The general context
</b><br>
<P>
At present, this context just contains pointers to (and data for) external
memory management functions that are called from several places in the PCRE2
library. The context is named "general" rather than specifically "memory"
because in future other fields may be added. If you do not want to supply your
own custom memory management functions, you do not need to bother with a
general context. A general context is created by:
<br>
<br>
<b>pcre2_general_context *pcre2_general_context_create(</b>
<b> void *(*<i>private_malloc</i>)(PCRE2_SIZE, void *),</b>
<b> void (*<i>private_free</i>)(void *, void *), void *<i>memory_data</i>);</b>
<br>
<br>
The two function pointers specify custom memory management functions, whose
prototypes are:
<pre>
<b>void *private_malloc(PCRE2_SIZE, void *);</b>
<b>void private_free(void *, void *);</b>
</pre>
Whenever code in PCRE2 calls these functions, the final argument is the value
of <i>memory_data</i>. Either of the first two arguments of the creation
function may be NULL, in which case the system memory management functions
<i>malloc()</i> and <i>free()</i> are used. (This is not currently useful, as
there are no other fields in a general context, but in future there might be.)
The <i>private_malloc()</i> function is used (if supplied) to obtain memory for
storing the context, and all three values are saved as part of the context.
</P>
<P>
Whenever PCRE2 creates a data block of any kind, the block contains a pointer
to the <i>free()</i> function that matches the <i>malloc()</i> function that was
used. When the time comes to free the block, this function is called.
</P>
<P>
A general context can be copied by calling:
<br>
<br>
<b>pcre2_general_context *pcre2_general_context_copy(</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
The memory used for a general context should be freed by calling:
<br>
<br>
<b>void pcre2_general_context_free(pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
If this function is passed a NULL argument, it returns immediately without
doing anything.
<a name="compilecontext"></a></P>
<br><b>
The compile context
</b><br>
<P>
A compile context is required if you want to provide an external function for
stack checking during compilation or to change the default values of any of the
following compile-time parameters:
<pre>
What \R matches (Unicode newlines or CR, LF, CRLF only)
PCRE2's character tables
The newline character sequence
The compile time nested parentheses limit
The maximum length of the pattern string
The extra options bits (none set by default)
</pre>
A compile context is also required if you are using custom memory management.
If none of these apply, just pass NULL as the context argument of
<i>pcre2_compile()</i>.
</P>
<P>
A compile context is created, copied, and freed by the following functions:
<br>
<br>
<b>pcre2_compile_context *pcre2_compile_context_create(</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>pcre2_compile_context *pcre2_compile_context_copy(</b>
<b> pcre2_compile_context *<i>ccontext</i>);</b>
<br>
<br>
<b>void pcre2_compile_context_free(pcre2_compile_context *<i>ccontext</i>);</b>
<br>
<br>
A compile context is created with default values for its parameters. These can
be changed by calling the following functions, which return 0 on success, or
PCRE2_ERROR_BADDATA if invalid data is detected.
<br>
<br>
<b>int pcre2_set_bsr(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
The value must be PCRE2_BSR_ANYCRLF, to specify that \R matches only CR, LF,
or CRLF, or PCRE2_BSR_UNICODE, to specify that \R matches any Unicode line
ending sequence. The value is used by the JIT compiler and by the two
interpreted matching functions, <i>pcre2_match()</i> and
<i>pcre2_dfa_match()</i>.
<br>
<br>
<b>int pcre2_set_character_tables(pcre2_compile_context *<i>ccontext</i>,</b>
<b> const uint8_t *<i>tables</i>);</b>
<br>
<br>
The value must be the result of a call to <b>pcre2_maketables()</b>, whose only
argument is a general context. This function builds a set of character tables
in the current locale.
<br>
<br>
<b>int pcre2_set_compile_extra_options(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>extra_options</i>);</b>
<br>
<br>
As PCRE2 has developed, almost all the 32 option bits that are available in
the <i>options</i> argument of <b>pcre2_compile()</b> have been used up. To avoid
running out, the compile context contains a set of extra option bits which are
used for some newer, assumed rarer, options. This function sets those bits. It
always sets all the bits (either on or off): the value passed replaces the
previous setting as a whole rather than being merged with it. The available
options are defined in the section entitled "Extra
compile options"
<a href="#extracompileoptions">below.</a>
<br>
<br>
<b>int pcre2_set_max_pattern_length(pcre2_compile_context *<i>ccontext</i>,</b>
<b> PCRE2_SIZE <i>value</i>);</b>
<br>
<br>
This sets a maximum length, in code units, for any pattern string that is
compiled with this context. If the pattern is longer, an error is generated.
This facility is provided so that applications that accept patterns from
external sources can limit their size. The default is the largest number that a
PCRE2_SIZE variable can hold, which is effectively unlimited.
<br>
<br>
<b>int pcre2_set_newline(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
This specifies which characters or character sequences are to be recognized as
newlines. The value must be one of PCRE2_NEWLINE_CR (carriage return only),
PCRE2_NEWLINE_LF (linefeed only), PCRE2_NEWLINE_CRLF (the two-character
sequence CR followed by LF), PCRE2_NEWLINE_ANYCRLF (any of the above),
PCRE2_NEWLINE_ANY (any Unicode newline sequence), or PCRE2_NEWLINE_NUL (the
NUL character, that is a binary zero).
</P>
<P>
A pattern can override the value set in the compile context by starting with a
sequence such as (*CRLF). See the
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
page for details.
</P>
<P>
When a pattern is compiled with the PCRE2_EXTENDED or PCRE2_EXTENDED_MORE
option, the newline convention affects the recognition of the end of internal
comments starting with #. The value is saved with the compiled pattern for
subsequent use by the JIT compiler and by the two interpreted matching
functions, <i>pcre2_match()</i> and <i>pcre2_dfa_match()</i>.
<br>
<br>
<b>int pcre2_set_parens_nest_limit(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
This parameter adjusts the limit, set when PCRE2 is built (default 250), on the
depth of parenthesis nesting in a pattern. This limit stops rogue patterns
using up too much system stack when being compiled. The limit applies to
parentheses of all kinds, not just capturing parentheses.
<br>
<br>
<b>int pcre2_set_compile_recursion_guard(pcre2_compile_context *<i>ccontext</i>,</b>
<b> int (*<i>guard_function</i>)(uint32_t, void *), void *<i>user_data</i>);</b>
<br>
<br>
There is at least one application that runs PCRE2 in threads with very limited
system stack, where running out of stack is to be avoided at all costs. The
parenthesis limit above cannot take account of how much stack is actually
available during compilation. For a finer control, you can supply a function
that is called whenever <b>pcre2_compile()</b> starts to compile a parenthesized
part of a pattern. This function can check the actual stack size (or anything
else that it wants to, of course).
</P>
<P>
The first argument to the guard function gives the current depth of
nesting, and the second is user data that is set up by the last argument of
<b>pcre2_set_compile_recursion_guard()</b>. The guard function should return
zero if all is well, or non-zero to force an error.
<a name="matchcontext"></a></P>
<br><b>
The match context
</b><br>
<P>
A match context is required if you want to:
<pre>
Set up a callout function
Set an offset limit for matching an unanchored pattern
Change the limit on the amount of heap used when matching
Change the backtracking match limit
Change the backtracking depth limit
Set custom memory management specifically for the match
</pre>
If none of these apply, just pass NULL as the context argument of
<b>pcre2_match()</b>, <b>pcre2_dfa_match()</b>, or <b>pcre2_jit_match()</b>.
</P>
<P>
A match context is created, copied, and freed by the following functions:
<br>
<br>
<b>pcre2_match_context *pcre2_match_context_create(</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>pcre2_match_context *pcre2_match_context_copy(</b>
<b> pcre2_match_context *<i>mcontext</i>);</b>
<br>
<br>
<b>void pcre2_match_context_free(pcre2_match_context *<i>mcontext</i>);</b>
<br>
<br>
A match context is created with default values for its parameters. These can
be changed by calling the following functions, which return 0 on success, or
PCRE2_ERROR_BADDATA if invalid data is detected.
<br>
<br>
<b>int pcre2_set_callout(pcre2_match_context *<i>mcontext</i>,</b>
<b> int (*<i>callout_function</i>)(pcre2_callout_block *, void *),</b>
<b> void *<i>callout_data</i>);</b>
<br>
<br>
This sets up a callout function for PCRE2 to call at specified points
during a matching operation. Details are given in the
<a href="pcre2callout.html"><b>pcre2callout</b></a>
documentation.
<br>
<br>
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
<b> int (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
<b> void *<i>callout_data</i>);</b>
<br>
<br>
This sets up a callout function for PCRE2 to call after each substitution
made by <b>pcre2_substitute()</b>. Details are given in the section entitled
"Creating a new string with substitutions"
<a href="#substitutions">below.</a>
<br>
<br>
<b>int pcre2_set_offset_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> PCRE2_SIZE <i>value</i>);</b>
<br>
<br>
The <i>offset_limit</i> parameter limits how far an unanchored search can
advance in the subject string. The default value is PCRE2_UNSET. The
<b>pcre2_match()</b> and <b>pcre2_dfa_match()</b> functions return
PCRE2_ERROR_NOMATCH if a match with a starting point before or at the given
offset is not found. In the same situation, the <b>pcre2_substitute()</b>
function makes no more substitutions.
</P>
<P>
For example, if the pattern /abc/ is matched against "123abc" with an offset
limit less than 3, the result is PCRE2_ERROR_NOMATCH. A match can never be
found if the <i>startoffset</i> argument of <b>pcre2_match()</b>,
<b>pcre2_dfa_match()</b>, or <b>pcre2_substitute()</b> is greater than the offset
limit set in the match context.
</P>
<P>
When using this facility, you must set the PCRE2_USE_OFFSET_LIMIT option when
calling <b>pcre2_compile()</b> so that when JIT is in use, different code can be
compiled. If a match is started with a non-default offset limit when
PCRE2_USE_OFFSET_LIMIT is not set, an error is generated.
</P>
<P>
The offset limit facility can be used to track progress when searching large
subject strings or to limit the extent of global substitutions. See also the
PCRE2_FIRSTLINE option, which requires a match to start before or at the first
newline that follows the start of matching in the subject. If this is set with
an offset limit, a match must occur in the first line and also within the
offset limit. In other words, whichever limit comes first is used.
<br>
<br>
<b>int pcre2_set_heap_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
The <i>heap_limit</i> parameter specifies, in units of kibibytes (1024 bytes),
the maximum amount of heap memory that <b>pcre2_match()</b> may use to hold
backtracking information when running an interpretive match. This limit also
applies to <b>pcre2_dfa_match()</b>, which may use the heap when processing
patterns with a lot of nested pattern recursion or lookarounds or atomic
groups. This limit does not apply to matching with the JIT optimization, which
has its own memory control arrangements (see the
<a href="pcre2jit.html"><b>pcre2jit</b></a>
documentation for more details). If the limit is reached, the negative error
code PCRE2_ERROR_HEAPLIMIT is returned. The default limit can be set when PCRE2
is built; if it is not, the default is set very large and is essentially
"unlimited".
</P>
<P>
A value for the heap limit may also be supplied by an item at the start of a
pattern of the form
<pre>
(*LIMIT_HEAP=ddd)
</pre>
where ddd is a decimal number. However, such a setting is ignored unless ddd is
less than the limit set by the caller of <b>pcre2_match()</b> or, if no such
limit is set, less than the default.
</P>
<P>
The <b>pcre2_match()</b> function starts out using a 20KiB vector on the system
stack for recording backtracking points. The more nested backtracking points
there are (that is, the deeper the search tree), the more memory is needed.
Heap memory is used only if the initial vector is too small. If the heap limit
is set to a value less than 21 (in particular, zero) no heap memory will be
used. In this case, only patterns that do not have a lot of nested backtracking
can be successfully processed.
</P>
<P>
Similarly, for <b>pcre2_dfa_match()</b>, a vector on the system stack is used
when processing pattern recursions, lookarounds, or atomic groups, and only if
this is not big enough is heap memory used. In this case, too, setting a value
of zero disables the use of the heap.
<br>
<br>
<b>int pcre2_set_match_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
The <i>match_limit</i> parameter provides a means of preventing PCRE2 from using
up too many computing resources when processing patterns that are not going to
match, but which have a very large number of possibilities in their search
trees. The classic example is a pattern that uses nested unlimited repeats.
</P>
<P>
There is an internal counter in <b>pcre2_match()</b> that is incremented each
time round its main matching loop. If this value reaches the match limit,
<b>pcre2_match()</b> returns the negative value PCRE2_ERROR_MATCHLIMIT. This has
the effect of limiting the amount of backtracking that can take place. For
patterns that are not anchored, the count restarts from zero for each position
in the subject string. This limit also applies to <b>pcre2_dfa_match()</b>,
though the counting is done in a different way.
</P>
<P>
When <b>pcre2_match()</b> is called with a pattern that was successfully
processed by <b>pcre2_jit_compile()</b>, the way in which matching is executed
is entirely different. However, there is still the possibility of runaway
matching that goes on for a very long time, and so the <i>match_limit</i> value
is also used in this case (but in a different way) to limit how long the
matching can continue.
</P>
<P>
The default value for the limit can be set when PCRE2 is built; if it is not,
the default is 10 million, which handles all but the most extreme cases. A value
for the match limit may also be supplied by an item at the start of a pattern
of the form
<pre>
(*LIMIT_MATCH=ddd)
</pre>
where ddd is a decimal number. However, such a setting is ignored unless ddd is
less than the limit set by the caller of <b>pcre2_match()</b> or
<b>pcre2_dfa_match()</b> or, if no such limit is set, less than the default.
<br>
<br>
<b>int pcre2_set_depth_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
This parameter limits the depth of nested backtracking in <b>pcre2_match()</b>.
Each time a nested backtracking point is passed, a new memory "frame" is used
to remember the state of matching at that point. Thus, this parameter
indirectly limits the amount of memory that is used in a match. However,
because the size of each memory "frame" depends on the number of capturing
parentheses, the actual memory limit varies from pattern to pattern. This limit
was more useful in versions before 10.30, where function recursion was used for
backtracking.
</P>
<P>
The depth limit is not relevant, and is ignored, when matching is done using
JIT compiled code. However, it is supported by <b>pcre2_dfa_match()</b>, which
uses it to limit the depth of nested internal recursive function calls that
implement atomic groups, lookaround assertions, and pattern recursions. This
limits, indirectly, the amount of system stack that is used. It was more useful
in versions before 10.32, when stack memory was used for local workspace
vectors for recursive function calls. From version 10.32, only local variables
are allocated on the stack and as each call uses only a few hundred bytes, even
a small stack can support quite a lot of recursion.
</P>
<P>
If the depth of internal recursive function calls is great enough, local
workspace vectors are allocated on the heap from version 10.32 onwards, so the
depth limit also indirectly limits the amount of heap memory that is used. A
recursive pattern such as /(.(?2))((?1)|)/, when matched to a very long string
using <b>pcre2_dfa_match()</b>, can use a great deal of memory. However, it is
probably better to limit heap usage directly by calling
<b>pcre2_set_heap_limit()</b>.
</P>
<P>
The default value for the depth limit can be set when PCRE2 is built; if it is
not, the default is set to the same value as the default for the match limit.
If the limit is exceeded, <b>pcre2_match()</b> or <b>pcre2_dfa_match()</b>
returns PCRE2_ERROR_DEPTHLIMIT. A value for the depth limit may also be
supplied by an item at the start of a pattern of the form
<pre>
(*LIMIT_DEPTH=ddd)
</pre>
where ddd is a decimal number. However, such a setting is ignored unless ddd is
less than the limit set by the caller of <b>pcre2_match()</b> or
<b>pcre2_dfa_match()</b> or, if no such limit is set, less than the default.
</P>
<br><a name="SEC19" href="#TOC1">CHECKING BUILD-TIME OPTIONS</a><br>
<P>
<b>int pcre2_config(uint32_t <i>what</i>, void *<i>where</i>);</b>
</P>
<P>
The function <b>pcre2_config()</b> makes it possible for a PCRE2 client to find
the value of certain configuration parameters and to discover which optional
features have been compiled into the PCRE2 library. The
<a href="pcre2build.html"><b>pcre2build</b></a>
documentation has more details about these features.
</P>
<P>
The first argument for <b>pcre2_config()</b> specifies which information is
required. The second argument is a pointer to memory into which the information
is placed. If NULL is passed, the function returns the amount of memory that is
needed for the requested information. For calls that return numerical values,
the value is in bytes; when requesting these values,