Re: Question: How to use EditPlus as a Chinese "sentence extractor" tool
EditPlus supports the following regular expressions in the Find, Replace, and Find in Files commands.
Expression   Description
\t           Tab character.
\n           New line.
.            Matches any single character.
|            Matches if the expression on either its left or its right side matches the target string. For example, "a|b" matches "a" or "b".
[]           Any one of the enclosed characters may match the target character. For example, "[ab]" matches "a" or "b". "[0-9]" matches any digit.
[^]          None of the enclosed characters may match the target character. For example, "[^ab]" matches any character EXCEPT "a" and "b". "[^0-9]" matches any non-digit character.
*            The character to the left of the asterisk must match 0 or more times. For example, "be*" matches "b", "be" and "bee".
+            The character to the left of the plus sign must match 1 or more times. For example, "be+" matches "be" and "bee" but not "b".
?            The character to the left of the question mark must match 0 or 1 time. For example, "be?" matches "b" and "be" but not "bee".
^            The expression to the right of ^ matches only at the beginning of a line. For example, "^A" matches an "A" only at the beginning of a line.
$            The expression to the left of $ matches only at the end of a line. For example, "e$" matches an "e" only at the end of a line.
()           Affects the evaluation order of the expression and is also used for tagged expressions.
\            Escape character. To match the character "\" itself, use "\\".
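To try these rules outside EditPlus, here is a minimal sketch in Python. It assumes Python's re module behaves the same way for these basic operators. That holds for the examples listed above, but the two engines are not identical in general. The helper name matches_whole and the sample strings are made up for illustration.

```python
import re

def matches_whole(pattern, text):
    """Return True if the pattern matches the entire text (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# Quantifiers: *, + and ? applied to the character on their left.
for pattern in ("be*", "be+", "be?"):
    results = {text: matches_whole(pattern, text) for text in ("b", "be", "bee")}
    print(pattern, results)

# Character classes: [0-9] is any digit, [^0-9] is any non-digit.
digits = re.findall(r"[0-9]", "a1b22")
non_digits = re.findall(r"[^0-9]", "a1b22")

# Anchors: ^ and $ work per line when re.MULTILINE is set.
sample = "An Apple\nA pear"
line_start = re.findall(r"^A", sample, flags=re.MULTILINE)
line_end = re.findall(r"e$", sample, flags=re.MULTILINE)

print(digits, non_digits, line_start, line_end)
```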
A tagged expression is enclosed in (). Tagged expressions can be referenced by \0, \1, \2, \3, etc. \0 refers to the entire substring that was matched, \1 to the first tagged expression, \2 to the second, and so on. See the following examples.
Original   Search    Replace     Result
abc        (ab)(c)   \0-\1-\2    abc-ab-c
abc        a(b)(c)   \0-\1-\2    abc-b-c
abc        (a)b(c)   \0-\1-\2    abc-a-c
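The same three replacements can be reproduced in Python as a rough cross-check. This is only a sketch: Python's re module writes the whole match as \g<0> rather than \0, so the hypothetical helper editplus_style_replace below translates that one token and nothing else. It does not handle an escaped backslash in front of a 0.

```python
import re

def editplus_style_replace(search, replace, text):
    """Apply an EditPlus-style replacement string using Python's re module.

    Hypothetical helper: only the \\0 token is translated to \\g<0>;
    \\1, \\2, ... already mean the same thing in Python.
    """
    python_replace = replace.replace("\\0", "\\g<0>")
    return re.sub(search, python_replace, text)

for search in (r"(ab)(c)", r"a(b)(c)", r"(a)b(c)"):
    print(search, "->", editplus_style_replace(search, r"\0-\1-\2", "abc"))
```

Running it should print abc-ab-c, abc-b-c and abc-a-c, the same results as in the table.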
I could not find any regular expression syntax in EditPlus that supports searching for full-width characters. You may want to try other tools.
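As one possible alternative outside EditPlus, a short Python 3 script can split Chinese text into sentences, since its re module handles full-width characters directly. This is only a sketch under assumptions: the set of sentence-ending marks (。！？) and the function name extract_sentences are my own choices, and real text may need more marks (for example ；or ……) or handling of closing quotation marks.

```python
import re

# Assumed set of full-width sentence-ending marks; extend as needed.
SENTENCE_END = "。！？"

def extract_sentences(text):
    """Split text into sentences, keeping the trailing full-width punctuation."""
    # One or more non-terminator characters, then any run of terminators.
    pattern = f"[^{SENTENCE_END}]+[{SENTENCE_END}]*"
    return [s.strip() for s in re.findall(pattern, text) if s.strip()]

for sentence in extract_sentences("今天天气很好。你去哪儿？我去图书馆！"):
    print(sentence)
```

Each sentence is printed on its own line, which is roughly what a "sentence extractor" needs as output.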