README.md

Overview

The PCRE2 library is a set of C functions that implement regular expression pattern matching.

It is self-contained and portable, and designed to be easy to embed into existing projects and build systems, on almost any platform or build target.

The PCRE2 library is free and open-source (BSD licence), and permitted in proprietary software.

It supports Unicode matching and a very wide range of regular expression features. It accepts input in various character encodings, and optionally includes a highly performant JIT matching engine.

PCRE2 is mature and highly trusted. It is bundled in many open-source and commercial products, such as Excel, Safari, Apache, and Git, and is the basis for regular expressions in several programming languages, including PHP and R.

https://pcre2project.github.io/pcre2/

Quickstart

# Fetch PCRE2 with 'git clone', or use curl/wget to download a release.
# Here, let's use git to check out a release tag:
git clone https://github.com/PCRE2Project/pcre2.git ./pcre2 \
    --branch pcre2-$PCRE2_VERSION \
    -c advice.detachedHead=false --depth 1

# If using the JIT, remember to fetch the Git submodule:
(cd ./pcre2; git submodule update --init)

# Now let's build PCRE2:
(cd ./pcre2; \
    cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug \
        -DPCRE2_SUPPORT_JIT=ON -B build; \
    cmake --build build/)

# Great, PCRE2 is built.

# Here's a quick little demo to show how we can make use of PCRE2.
# For a fuller example, see './pcre2/src/pcre2demo.c'.
# See below for the demo code.

# Compile the demo:
gcc -g -I./pcre2/build -L./pcre2/build demo.c -o demo -lpcre2-8

# Finally, run our demo:
./demo 'c.t' 'dogs and cats'

# We fetched, built, and called PCRE2 successfully! :)

File demo.c:

/* Set PCRE2_CODE_UNIT_WIDTH to indicate we will use 8-bit input. */
#define PCRE2_CODE_UNIT_WIDTH 8
#include <pcre2.h>

#include <string.h> /* for strlen */
#include <stdio.h>  /* for printf */

int main(int argc, char* argv[]) {
    if (argc != 3) {
        fprintf(stderr, "Usage: %s <pattern> <subject>\n", argv[0]);
        return 1;
    }

    const char *pattern = argv[1];
    const char *subject = argv[2];

    /* Compile the pattern. */
    int error_number;
    PCRE2_SIZE error_offset;
    pcre2_code *re = pcre2_compile(
        (PCRE2_SPTR)pattern,   /* the pattern */
        PCRE2_ZERO_TERMINATED, /* indicates pattern is zero-terminated */
        0,                     /* default options */
        &error_number,         /* for error number */
        &error_offset,         /* for error offset */
        NULL);                 /* use default compile context */
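    if (re == NULL) {
        /* Report what went wrong and where in the pattern. */
        PCRE2_UCHAR message[256];
        pcre2_get_error_message(error_number, message, sizeof(message));
        fprintf(stderr, "Invalid pattern at offset %zu: %s\n",
                (size_t)error_offset, (const char *)message);
        return 1;
    }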

    /* Match the pattern against the subject text. */
    pcre2_match_data *match_data =
        pcre2_match_data_create_from_pattern(re, NULL);
    int rc = pcre2_match(
        re,                   /* the compiled pattern */
        (PCRE2_SPTR)subject,  /* the subject text */
        strlen(subject),      /* the length of the subject */
        0,                    /* start at offset 0 in the subject */
        0,                    /* default options */
        match_data,           /* block for storing the result */
        NULL);                /* use default match context */

    /* Print the match result. */
    if (rc == PCRE2_ERROR_NOMATCH) {
        printf("No match\n");
    } else if (rc < 0) {
        fprintf(stderr, "Matching error\n");
    } else {
        PCRE2_SIZE *ovector = pcre2_get_ovector_pointer(match_data);
        printf("Found match: '%.*s'\n", (int)(ovector[1] - ovector[0]),
               subject + ovector[0]);
    }

    pcre2_match_data_free(match_data);   /* Free resources */
    pcre2_code_free(re);
    return 0;
}

The main ways of obtaining PCRE2 are:

  1. Via Git clone:

    git clone https://github.com/PCRE2Project/pcre2.git
    

    Please use a release tag in production, not the development branch!

    Because PCRE2's JIT uses code from a Git submodule, you must check this out after a fresh clone:

    git submodule update --init
    
  2. Via download of the release tarball.

  3. Finally, PCRE2 is also bundled by various downstream package managers (such as Linux distributions, or vcpkg). These are provided by third parties, not the PCRE2 project.

The main ways of building PCRE2 are:

  1. Via CMake (Linux/Windows/macOS, and others)

    cd pcre2/
    cmake -B build .
    cmake --build build/
    
  2. Via Autoconf (Linux/Unix)

    cd pcre2/
    ./configure
    make
    

See “Platforms” below for links to more detailed build documentation.

API Overview

The PCRE2 API supports strings in 8-bit, 16-bit, and 32-bit encodings, with or without UTF encoding. There is also EBCDIC support.

The default regular expression dialect closely matches the syntax and behaviour of Perl 5, with PCRE2-specific extensions. A wide variety of granular flags can be passed to the PCRE2 API to customise this to more closely follow other dialects such as JavaScript or Python.

The default matching engine uses a depth-first tree search with backtracking. It is highly feature-rich but has worst-case exponential time, so PCRE2 allows a match to be aborted once a limit is exceeded, expressed as a maximum number of steps in the tree search. The second matching engine uses a JIT for greatly improved performance, compiling the regular expression to a block of equivalent native machine code.

PCRE2 has a third matching engine, a DFA-based engine that is generally slower, but has worst-case polynomial matching time and is able to find the POSIX-style “leftmost-longest” match.

There are accompanying utility functions for converting glob patterns and POSIX BRE/ERE patterns to PCRE2 regular expressions; and also for performing high-level regular expression operations such as search-and-replace with a powerful replacement string syntax.

As well as the PCRE2 API, the library also offers a POSIX-compatible <regex.h> header and regexec() function. However, this interface cannot pass PCRE2 flags, so we recommend that users call the native PCRE2 API where possible.

See the full library and API documentation for further details.

For third-party documentation, see:

  • A curated summary of changes for each PCRE release, and some excellent tutorials on PCRE2 on the RexEgg website.
  • Jan Goyvaerts' popular Regular-Expressions.info site includes information about PCRE2 as well as tutorials and highly detailed comparisons of PCRE2 to other regular expression dialects.
  • Jeffrey Friedl’s book Mastering Regular Expressions includes chapters on Perl and PCRE, and is available in print and online via O’Reilly Media.

Platforms

PCRE2 is portable C code, and is likely to work on any system with a C99 compiler.

Other systems are likely to work (including mobile, embedded platforms, and commercial UNIX systems), but these are not tested continuously by the PCRE2 maintainers. Users are encouraged to run the full PCRE2 test suite when compiling for any new platform. We are aware of working ports to VMS and z/OS (PCRE2 supports EBCDIC).

PCRE2 releases support CMake for building, and for UNIX platforms include a ./configure script built by Autoconf. Build files for the Bazel build system and zig build are also included. Integrating PCRE2 with other systems can be done by including the .c files in an existing project.

Please see the files README and NON-AUTOTOOLS-BUILD for full build documentation, as well as the man pages, including man pcre2/doc/pcre2build.3.

Licence

PCRE2 is released under the BSD 3-clause licence with a PCRE2 Exception. It is open-source and also corporate-friendly.

  • See LICENCE for legal text.
  • See AUTHORS for details of the current maintainers of PCRE2 and acknowledgements of its contributors, including Philip Hazel, the original author.

Contributing & support

Join the community by reporting issues or asking questions via GitHub issues. We welcome feedback and proposals.

Contributions ranging from bug fixes to feature requests are welcome, and can be made via GitHub pull requests.

Please review our SECURITY policy for information on reporting security issues.

Release announcements will be made via the pcre2-dev@googlegroups.com mailing list, where you can also start discussions about PCRE2 issues and development. You can browse the list archives.