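NotePad++ Regular Expression Replace: Advanced Usage

When working with files we often need to find and replace text. What if we want to replace one part of a file with another part of the same file? Regular expressions give us a way to do that.

Regular expressions provide sophisticated, flexible search and replace.

Note: multi-line expressions (involving \n, \r, etc.) are not supported.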

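1 Basic Expressions

| Pattern | Meaning |
| --- | --- |
| `.` | Matches any character except a newline (`\n`). This means `.` also matches `\r`, which can be confusing in a file that contains both `\r` and `\n`. To match every character, including newlines, use `[\s\S]`. |
| `(…)` | Marks a tagged region. Tags are accessed with `\1` for the first tag, `\2` for the second, and so on up to `\9`. They can be used within the current regular expression or in the replacement string of a search/replace. |
| `\1`, `\2`, etc. | In a replacement, refers to tagged regions 1 through 9 (`\1` to `\9`). For example, searching for `Fred([1-9])XXX` and replacing with `Sam\1YYY` turns `Fred2XXX` into `Sam2YYY`. Because only 9 regions are available, `\10\2` safely means region 1, the literal text "0", then region 2. |
| `[…]` | A set of characters. For example, `[abc]` means any of the characters a, b or c. Ranges are allowed: `[a-z]` means any lower case letter. |
| `[^…]` | The complement of the set. For example, `[^A-Za-z]` means any character except a letter. |
| `^` | Matches the start of a line (unless used inside a set, see above). |
| `$` | Matches the end of a line. |
| `*` | Matches 0 or more times. For example, `Sa*m` matches Sm, Sam, Saam, Saaam and so on. |
| `+` | Matches 1 or more times. For example, `Sa+m` matches Sam, Saam, Saaam and so on. |
| `?` | Matches 0 or 1 times. For example, `Sa?m` matches Sm and Sam. |
| `{n}` | Matches exactly n times. For example, `Sa{2}m` matches Saam. |
| `{m,n}` | Matches at least m and at most n times (if n is omitted, any number of times). For example, `Sa{2,3}m` matches Saam or Saaam, and `Sa{2,}m` is the same as `Saa+m`. |
| `*?`, `+?`, `??`, `{n,m}?` | Non-greedy matches: the first valid match is taken. Normally `<.+>` matches the whole of a string such as `<b>content</b>`, but `<.+?>` matches `<b>` and `</b>` separately. |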
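The quantifier rows above can be checked with a short script. This is only a sketch using Python's `re` module, which is a different engine from the one in PN2, although these particular quantifiers behave the same way in both. The names `candidates`, `patterns` and `results` are illustrative.

```python
import re

# Candidate strings taken from the examples in the table.
candidates = ["Sm", "Sam", "Saam", "Saaam"]

# Quantifier patterns from the table.
patterns = ["Sa*m", "Sa+m", "Sa?m", "Sa{2}m", "Sa{2,3}m", "Sa{2,}m", "Saa+m"]

# re.fullmatch succeeds only if the whole string matches the pattern.
results = {
    p: [s for s in candidates if re.fullmatch(p, s)]
    for p in patterns
}

for pattern, matched in results.items():
    print(f"{pattern:10} -> {matched}")
```

The last two patterns accept the same strings, which is the equivalence the table states for `Sa{2,}m` and `Saa+m`.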

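2 Tagging and Groups

| Pattern | Meaning |
| --- | --- |
| `(…)` | A capture group. Accessible through `\1` for the first group, `\2` for the second, and so on. |
| `(?:…)` | A non-capture group. |
| `(?=…)` | Non-capture group: look-ahead assertion. `(.*)(?=ton)` given `Appleton` matches `Apple`. |
| `(?<=…)` | Non-capture group: look-behind assertion. `(?<=sir) (.*)` given `sir William` matches ` William`. |
| `(?!…)` | Non-capture group: negative look-ahead assertion. `.(?!e)` given `Apple` finds every letter except `l`, because `l` is followed by `e`. |
| `(?<!…)` | Non-capture group: negative look-behind assertion. `(?<!sir) (.*)` finds nothing in `sir William`, but given `mr William` it matches ` William`. |
| `(?P<name>…)` | Named capture group. Assigns a name to a group for later use: `(?P<first>A[^\s]+)\s(?P=first)` finds `Apple Apple`. Similar to `(A[^\s]+)\s\1`, but uses a name rather than a group number. |
| `(?P=name)` | Matches the text of the named group. See `(?P<name>…)` for an example. |
| `(?#comment)` | Comment: the contents of the parentheses are ignored during matching. |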
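The four assertion forms can be tried out as follows. This is a minimal sketch in Python's `re`, not in the PN2 engine itself. The input `mr William` is an illustrative string that is not in the original text, chosen to show a case where the negative look-behind succeeds.

```python
import re

# Look-ahead: match everything that is followed by "ton".
ahead = re.search(r"(.*)(?=ton)", "Appleton").group(0)

# Look-behind: match a space plus the rest, only right after "sir".
behind = re.search(r"(?<=sir) (.*)", "sir William").group(0)

# Negative look-ahead: every character that is NOT followed by "e".
not_before_e = re.findall(r".(?!e)", "Apple")

# Negative look-behind: a space plus the rest, only when NOT right after "sir".
blocked = re.search(r"(?<!sir) (.*)", "sir William")
allowed = re.search(r"(?<!sir) (.*)", "mr William").group(0)

print(repr(ahead), repr(behind), not_before_e, blocked, repr(allowed))
```

The assertions consume no text, which is why `ahead` is `Apple` rather than `Appleton`.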

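3 Special Symbols

| Pattern | Meaning |
| --- | --- |
| `\s` | Matches whitespace. Note that this also matches the end-of-line marker. Use `[[:blank:]]` to avoid matching a newline. |
| `\S` | Matches non-whitespace. |
| `\w` | Matches a word character. |
| `\W` | Matches a non-word character. |
| `\d` | Matches a digit. |
| `\D` | Matches a non-digit character. |
| `\b` | Matches a word boundary. `\bW\w+` finds words that begin with W. |
| `\B` | Matches a non-word boundary. `\Be\B` finds the letter e only when it is in the middle of a word. |
| `\<` | Matches the start of a word using Scintilla's definition of words. |
| `\>` | Matches the end of a word using Scintilla's definition of words. |
| `\x` | Lets you use a character x that would otherwise have a special meaning. For example, `\[` is interpreted as a literal `[` and not as the start of a character set. |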
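A few of these symbols are shown below in a sketch using Python's `re`. The sample sentences are made up for illustration. `\<` and `\>` are Scintilla-specific and have no direct counterpart in `re`, so they are left out.

```python
import re

text = "Where was William when we went West?"

# \b anchors at a word boundary: words that start with a capital W.
w_words = re.findall(r"\bW\w+", text)

# \B on both sides: an "e" that is strictly inside a word.
inner_e_positions = [m.start() for m in re.finditer(r"\Be\B", "eve level")]

# \s includes the newline, while an explicit [ \t] set does not.
sample = "a b\tc\n"
any_ws = re.findall(r"\s", sample)
blanks_only = re.findall(r"[ \t]", sample)

print(w_words, inner_e_positions, any_ws, blanks_only)
```

In `eve level` only the two e's of `level` qualify, because each e of `eve` touches a word boundary.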

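4 Character Classes

| Pattern | Meaning |
| --- | --- |
| `[[:alpha:]]` | Matches a letter: `[A-Za-z]` |
| `[[:digit:]]` | Matches a digit: `[0-9]` |
| `[[:xdigit:]]` | Matches a hexadecimal digit: `[0-9A-Fa-f]` |
| `[[:alnum:]]` | Matches an alphanumeric character: `[0-9A-Za-z]` |
| `[[:lower:]]` | Matches a lower case character: `[a-z]` |
| `[[:upper:]]` | Matches an upper case character: `[A-Z]` |
| `[[:blank:]]` | Matches a blank (space or tab): `[ \t]` |
| `[[:space:]]` | Matches a whitespace character: `[ \t\r\n\v\f]` |
| `[[:punct:]]` | Matches a punctuation character: `` [-!"#$%&'()*+,./:;<=>?@[]_`{ `` |
| `[[:graph:]]` | Matches a graphical character: `[\x21-\x7E]` |
| `[[:print:]]` | Matches a printable character (graphical characters and spaces) |
| `[[:cntrl:]]` | Matches a control character |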
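Python's `re` module does not understand the `[[:name:]]` notation, so one way to experiment with these classes there is to expand them into the explicit sets listed in the table. The sketch below does that. `POSIX_CLASSES` and `expand_posix` are hypothetical helper names rather than part of any library, and only the classes whose expansion is fully spelled out in the table are included.

```python
import re

# Explicit equivalents copied from the table (ASCII only).
POSIX_CLASSES = {
    "alpha": "A-Za-z",
    "digit": "0-9",
    "xdigit": "0-9A-Fa-f",
    "alnum": "0-9A-Za-z",
    "lower": "a-z",
    "upper": "A-Z",
    "blank": r" \t",
    "space": r" \t\r\n\v\f",
    "graph": r"\x21-\x7E",
}


def expand_posix(pattern: str) -> str:
    """Replace each [:name:] inside a bracket set with its explicit range."""
    return re.sub(r"\[:(\w+):\]", lambda m: POSIX_CLASSES[m.group(1)], pattern)


expanded = expand_posix("[[:upper:]][[:lower:]]+")
capitalized = re.findall(expanded, "Hello big World")
is_hex = re.fullmatch(expand_posix("[[:xdigit:]]+"), "FF8800") is not None

print(expanded, capitalized, is_hex)
```

Because the replacement happens inside the existing brackets, combined sets such as `[[:alpha:][:digit:]_]` expand to `[A-Za-z0-9_]` as well.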

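5 Replacing

Regular expressions support tagged expressions: surround the text you want to tag with ( and ), then use \1 in the replace string to substitute the first matched text, \2 for the second, and so on.

For example:

| Text body | Search string | Replace string | Result |
| --- | --- | --- | --- |
| Hi my name is Fred | `my name is (.+)` | `my name is not \1` | Hi my name is not Fred |
| The quick brown fox jumped over the fat lazy dog | `brown (.+) jumped over the (.+)` | `brown \2 jumped over the \1` | The quick brown fat lazy dog jumped over the fox |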
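The same replacements can be reproduced with Python's `re.sub`, shown here as a sketch rather than as the PN2 implementation. In this engine the second `(.+)` is greedy and captures everything up to the end of the line (`fat lazy dog`). The third call is an illustrative variant, not taken from the table, that uses `(\w+)` to swap single words only.

```python
import re

# One tagged region, reused in the replacement as \1.
first = re.sub(r"my name is (.+)", r"my name is not \1", "Hi my name is Fred")

sentence = "The quick brown fox jumped over the fat lazy dog"

# Two tagged regions, swapped in the replacement; (.+) is greedy.
greedy_swap = re.sub(
    r"brown (.+) jumped over the (.+)",
    r"brown \2 jumped over the \1",
    sentence,
)

# Variant: (\w+) stops at the end of a word, so only "fox" and "fat" swap.
word_swap = re.sub(
    r"brown (\w+) jumped over the (\w+)",
    r"brown \2 jumped over the \1",
    sentence,
)

print(first)
print(greedy_swap)
print(word_swap)
```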

6 Restrictions

Support for regular expressions in PN2 is currently limited: the supported patterns and syntax are a very small subset of the powerful expressions supported by Perl. The biggest restriction is that a regular expression matches only within a single line, so multi-line expressions cannot be used. Backslash Expressions can be used instead.

The plan is to use the PCRE library (used elsewhere in PN2) to support document searching.

from http://www.pnotepad.org/docs/search/regular_expressions/

Author: Evan_Gu. Source: CSDN

Original: https://blog.csdn.net/gdp12315_gu/article/details/51730584



Regular Expressions

Search Patterns

_NOTE: For older versions of PN, tagged expressions start with \( and end with \), and there are neither non-capture groups nor backslash groups._

Regular Expressions allow complicated and flexible search/replace using a specific syntax.

Note: Multi-line expressions (involving \n, \r, etc) are not yet supported. See Restrictions below.

Basic Expressions

| Pattern | Meaning |
| --- | --- |
| `.` | Matches any character except new line (`\n`). Note that this means `.` will also match `\r`, which might cause some confusion when you are editing a file with both `\r` and `\n`. To match all characters including new lines you can use `[\s\S]`. |
| `(...)` | This marks a region for tagging a match. These tags can be accessed using the syntax `\1` for the first tag, `\2` for the second, and `\3`, `\4` ... `\9`. These tags can be used within the current regular expression or in the replacement string in a search/replace. |
| `\1`, `\2`, etc. | This refers to the first through ninth (`\1` to `\9`) tagged region when replacing. For example, if the search string was `Fred([1-9])XXX` and the replace string was `Sam\1YYY`, when applied to `Fred2XXX` this would generate `Sam2YYY`. Note: as only 9 regions can be used, you can safely use the replace string `\10\2` to produce "text from region 1", then "0", then "text from region 2". |
| `[...]` | This indicates a set of characters. For example, `[abc]` means any of the characters a, b or c. You can also use ranges, for example `[a-z]` for any lower case character. |
| `[^...]` | The complement of the characters in the set. For example, `[^A-Za-z]` means any character except an alphabetic character. |
| `^` | This matches the start of a line (unless used inside a set, see above). |
| `$` | This matches the end of a line. |
| `*` | This matches 0 or more times. For example, `Sa*m` matches Sm, Sam, Saam, Saaam and so on. |
| `+` | This matches 1 or more times. For example, `Sa+m` matches Sam, Saam, Saaam and so on. |
| `?` | This matches 0 or 1 occurrences. For example, `Sa?m` matches Sm, Sam. |
| `{n}` | This matches exactly n times. For example, `Sa{2}m` matches Saam. |
| `{m,n}` | This matches at least m times and at most n times (if n is excluded then any number of times). For example, `Sa{2,3}m` matches Saam or Saaam. `Sa{2,}m` is the same as `Saa+m`. |
| `*?`, `+?`, `??`, `{n,m}?` | Non-greedy matches: matches the first valid match. Normally `<.+>` will match the whole of a string such as `<b>content</b>`, but `<.+?>` will match `<b>` and `</b>` separately. |
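As a rough illustration of tagged regions, greedy versus non-greedy matching, and the behaviour of `.` around `\r` and `\n`, here is a sketch in Python's `re`. One difference is worth flagging: Python reads `\10` in a replacement as group 10, so the unambiguous form `\g<1>0` is used where the table writes `\10`. The sample strings other than `Fred2XXX` are illustrative.

```python
import re

# Tagged region reused in the replacement.
renamed = re.sub(r"Fred([1-9])XXX", r"Sam\1YYY", "Fred2XXX")

# "Region 1, a literal 0, region 2": written \g<1>0\2 in Python.
with_zero = re.sub(r"(a)(b)", r"\g<1>0\2", "ab")

# Greedy .+ runs to the last ">", while lazy .+? stops at the first one.
html = "<b>content</b>"
greedy = re.findall(r"<.+>", html)
lazy = re.findall(r"<.+?>", html)

# "." matches \r but not \n, whereas [\s\S] matches every character.
crlf = "a\r\nb"
dot_chars = re.findall(r".", crlf)
all_chars = re.findall(r"[\s\S]", crlf)

print(renamed, with_zero, greedy, lazy, dot_chars, all_chars)
```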

Tagging and Groups

| Pattern | Meaning |
| --- | --- |
| `(...)` | A capture group. Accessible through `\1` for the first group, `\2` for the second and so on. |
| `(?:...)` | A non-capture group. |
| `(?=...)` | Non-capture group: look-ahead assertion. `(.*)(?=ton)` given `Appleton` will match `Apple`. |
| `(?<=...)` | Non-capture group: look-behind assertion. `(?<=sir) (.*)` given `sir William` will find ` William`. |
| `(?!...)` | Non-capture group: negative look-ahead assertion. `.(?!e)` given `Apple` will find each letter with the exception of `l`, because it is followed by an `e`. |
| `(?<!...)` | Non-capture group: negative look-behind assertion. `(?<!sir) (.*)` finds nothing in `sir William`, but given `mr William` it will find ` William`. |
| `(?P<name>...)` | Named capture group. Assign a name to a group for later use: `(?P<first>A[^\s]+)\s(?P=first)` will find `Apple Apple`. Similar to `(A[^\s]+)\s\1` but uses names rather than group numbers. |
| `(?P=name)` | Match to named group. See `(?P<name>...)` for an example. |
| `(?#comment)` | Comment: contents of the parentheses are ignored during matching. |
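Named groups and inline comments can be exercised as below. This is a sketch in Python's `re`, which happens to share the `(?P<name>...)`, `(?P=name)` and `(?#...)` syntax. The sample sentence and the `\g<first>` replacement are illustrative additions.

```python
import re

text = "An Apple Apple a day"

# Named group "first", then a back-reference to it by name.
named = re.search(r"(?P<first>A[^\s]+)\s(?P=first)", text)

# The same search using a numbered group and \1.
numbered = re.search(r"(A[^\s]+)\s\1", text)

# Collapse the doubled word by replacing the match with the named group.
deduplicated = re.sub(r"(?P<first>A[^\s]+)\s(?P=first)", r"\g<first>", text)

# A (?#...) comment is skipped during matching.
commented = re.search(r"App(?#the fruit, not the software)le", text)

print(named.group(0), named.group("first"), deduplicated, commented.group(0))
```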

Special Symbols

| Pattern | Meaning |
| --- | --- |
| `\s` | Match whitespace. Note: this will match the end-of-line marker. Use `[[:blank:]]` when you need to avoid matching a newline character. |
| `\S` | Match non-whitespace |
| `\w` | Match word character |
| `\W` | Match non-word character |
| `\d` | Match numeric digit |
| `\D` | Match non-digit character |
| `\b` | Match word boundary. `\bW\w+` finds words that begin with a 'W' |
| `\B` | Match non-word boundary. `\Be\B` finds the letter 'e' only when it is in the middle of a word |
| `\<` | This matches the start of a word using Scintilla's definition of words. |
| `\>` | This matches the end of a word using Scintilla's definition of words. |
| `\x` | This allows you to use a character x that would otherwise have a special meaning. For example, `\[` would be interpreted as `[` and not as the start of a character set. |
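Escaping special characters is easy to get wrong, so here is a small sketch in Python's `re`. `re.escape` is a Python convenience function and not something PN2 provides. The sample strings are illustrative.

```python
import re

text = "items[0] and items[key]"

# \[ and \] are literal brackets, not the start and end of a character set.
subscripts = re.findall(r"\[\w+\]", text)

literal = "a.b[c]"

# Unescaped, "." means any character and "[c]" is a one-character set.
unescaped_hits = re.fullmatch(literal, "axbc") is not None

# re.escape backslash-escapes the special characters for us.
escaped = re.escape(literal)
matches_itself = re.fullmatch(escaped, literal) is not None
matches_other = re.fullmatch(escaped, "axbc") is not None

print(subscripts, unescaped_hits, matches_itself, matches_other)
```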

Character Classes

| Pattern | Meaning |
| --- | --- |
| `[[:alpha:]]` | Match a letter character: `[A-Za-z]` |
| `[[:digit:]]` | Match a digit character: `[0-9]` |
| `[[:xdigit:]]` | Match a hexadecimal digit character: `[0-9A-Fa-f]` |
| `[[:alnum:]]` | Match an alphanumeric character: `[0-9A-Za-z]` |
| `[[:lower:]]` | Match a lower case character: `[a-z]` |
| `[[:upper:]]` | Match an upper case character: `[A-Z]` |
| `[[:blank:]]` | Match a blank (space or tab): `[ \t]` |
| `[[:space:]]` | Match a whitespace character: `[ \t\r\n\v\f]` |
| `[[:punct:]]` | Match a punctuation character: `` [-!"#$%&'()*+,./:;<=>?@[\]_`{ `` |
| `[[:graph:]]` | Match a graphical character: `[\x21-\x7E]` |
| `[[:print:]]` | Match a printable character (graphical characters and spaces) |
| `[[:cntrl:]]` | Match a control character |

Replacing

Regular Expressions supports tagged expressions. This is accomplished using ( and ) to surround the text you want tagged, and then using \1 in the replace string to substitute the first matched text, \2 for the second, etc.

For example:

| Text body | Search string | Replace string | Result |
| --- | --- | --- | --- |
| Hi my name is Fred | `my name is (.+)` | `my name is not \1` | Hi my name is not Fred |
| The quick brown fox jumped over the fat lazy dog | `brown (.+) jumped over the (.+)` | `brown \2 jumped over the \1` | The quick brown fat lazy dog jumped over the fox |

Restrictions

Support for regular expressions in PN2 is currently limited: the supported patterns and syntax are a very small subset of the powerful expressions supported by Perl. The biggest restriction is that regular expressions match only within a single line, so you cannot use multi-line regular expressions. As a workaround for the lack of multi-line search, you can instead use BackslashExpressions.

There are plans to improve this support by using the PCRE library (used elsewhere in PN2) to provide document searching. If you're interested in helping please make yourself known to the pn-discuss mailing list: PN Mailing Lists.

Examples

| Description | Search | Replace |
| --- | --- | --- |
| Remove leading whitespace on each line | `^[ \t]*` | (empty) |
| Change getVariable() to setVariable() | `get(\w+)\(\)` | `set\1()` |
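Both rows can be reproduced in Python's `re` as a sketch. The `re.MULTILINE` flag is needed there so that `^` anchors at every line start, which the editor does implicitly. The sample `code` string is made up for illustration.

```python
import re

code = "  int x = getVariable();\n\tgetCount();"

# Remove leading spaces and tabs on each line (replace with nothing).
stripped = re.sub(r"^[ \t]*", "", code, flags=re.MULTILINE)

# Rename getXxx() to setXxx(), keeping the captured name via \1.
renamed = re.sub(r"get(\w+)\(\)", r"set\1()", stripped)

print(stripped)
print(renamed)
```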

Breaking up a URL to display arguments:
Given a URL such as http://www.google.com/search?q=Programmers+Notepad&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a

| Description | Search | Replace |
| --- | --- | --- |
| Break up a URL's arguments | `.*?[?&](\w+)=([\w-+%.:]+)(.*)$` | `\1: \2\n\3` |

Place the cursor at the beginning of the line. Hit the FindNext button, and then repeatedly hit the Replace button. Note that hitting "replace all" will not work correctly, since each search has to run through the result of the previous replace.

Result is a list of the URL parameters:

q: Programmers+Notepad
ie: utf-8
oe: utf-8
aq: t
rls: org.mozilla:en-US:official
client: firefox-a
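The repeated find-and-replace procedure can be simulated in a loop. The sketch below uses Python's `re` instead of the editor, applying one replacement per pass until nothing matches, which mirrors pressing Replace repeatedly. The hyphen in the value set is written as `\-` here as a precaution, since Python's `re` would otherwise read `\w-+` as a malformed range. The variable names are illustrative.

```python
import re

url = (
    "http://www.google.com/search?q=Programmers+Notepad&ie=utf-8&oe=utf-8"
    "&aq=t&rls=org.mozilla:en-US:official&client=firefox-a"
)

# Same search as in the table, with the hyphen escaped for Python.
pattern = re.compile(r".*?[?&](\w+)=([\w\-+%.:]+)(.*)$")
replacement = r"\1: \2\n\3"

text = url
while True:
    # count=1 performs a single replacement, like one press of Replace.
    text, count = pattern.subn(replacement, text, count=1)
    if count == 0:
        break

print(text)
```

Each pass peels one `name=value` pair off the front of the remaining query string and pushes the rest onto a new line, which is why a single "replace all" pass cannot produce the same result.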

