These are regular expressions for AWB that are useful for wrangling text in CJK languages and for fixing typos and style issues particular to Chinese, Japanese, Korean, and related topics.
Please add to or improve this page!
New additions
(?<cjktext>([一-鿿]|\ |[0-9])+)
(?<koreantext>([가-힣]|\ |[0-9])+)
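These two named groups are the building blocks reused by the rules on this page. As a rough way to try them outside AWB, here is a minimal Python sketch. It assumes Python's `re` module, which writes named groups as `(?P<name>...)` rather than the `(?<name>...)` form used above. The helper `first_run` and the sample strings are made up for illustration.

```python
import re

# Python ports of the two building blocks (assumption: Python's `re`
# needs (?P<name>...) where the AWB pattern uses (?<name>...)).
CJK_TEXT = re.compile(r"(?P<cjktext>([\u4E00-\u9FFF]|\ |[0-9])+)")
KOREAN_TEXT = re.compile(r"(?P<koreantext>([가-힣]|\ |[0-9])+)")


def first_run(pattern, group, text):
    """Return the first run matched by the named group, or None."""
    match = pattern.search(text)
    return match.group(group) if match else None


# Hypothetical sample strings, not taken from any article.
cjk_run = first_run(CJK_TEXT, "cjktext", "abc漢字 123def")
korean_run = first_run(KOREAN_TEXT, "koreantext", "name제프 2def")
```

Because spaces and digits are part of each character class, a captured run can begin or end with a space, so trimming the captured text may be worthwhile before reusing it.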
Korean
Wrapping labelled, unwrapped Hangul + Hanja in first sentence in bios
Disable case sensitivity. This rule targets unwrapped Hangul and Hanja in the parenthetical of the lead's first sentence that someone has labelled by hand. It is designed to work only on articles about people, because it expects a comma or semicolon after the Hanja. E.g. Jeff ([[Hangul]]:제프, Hanja: 爸爸, 1902–1998 etc)
-> Jeff ({{Korean|hangul=제프|hanja=爸爸}}; 1902–1998 etc)
Search:
\((\[\[Korean language\|Korean\]\]|\[\[Hangul\]\]|Hangul|Korean)[ ]*:[ ]*(?<koreantext>([가-힣]|\ |[0-9])+)(,|;)[ ]*(Hanja|\[\[Hanja\]\])[ ]*:[ ]*(?<chinesetext>[\u4E00-\u9FFF]+)(;|,)
Replace:
({{Korean|hangul=${koreantext}|hanja=${chinesetext}}};
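To check the rule against the example above without opening AWB, the following is a minimal Python sketch. It is a port under stated assumptions, not AWB's own behaviour: named groups are rewritten as `(?P<name>...)`, `${name}` in the replacement becomes `\g<name>`, and `re.IGNORECASE` stands in for disabling case sensitivity. The function name `wrap_bio_lead` is hypothetical.

```python
import re

# Python port of the Search pattern above (assumption: named groups are
# rewritten as (?P<name>...) and the rule runs case-insensitively).
BIO_SEARCH = re.compile(
    r"\((\[\[Korean language\|Korean\]\]|\[\[Hangul\]\]|Hangul|Korean)[ ]*:[ ]*"
    r"(?P<koreantext>([가-힣]|\ |[0-9])+)(,|;)[ ]*"
    r"(Hanja|\[\[Hanja\]\])[ ]*:[ ]*(?P<chinesetext>[\u4E00-\u9FFF]+)(;|,)",
    re.IGNORECASE,
)

# Python writes group references as \g<name> instead of ${name}.
BIO_REPLACE = r"({{Korean|hangul=\g<koreantext>|hanja=\g<chinesetext>}};"


def wrap_bio_lead(wikitext):
    """Apply the rule to a string of wikitext and return the result."""
    return BIO_SEARCH.sub(BIO_REPLACE, wikitext)


before = "Jeff ([[Hangul]]:제프, Hanja: 爸爸, 1902–1998 etc)"
after = wrap_bio_lead(before)
```

Text that lacks the trailing comma or semicolon after the Hanja is left unchanged, which is how the rule stays limited to biography-style leads.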
Wrapping manually-labeled Hangul
Disable case sensitivity. This identifies cases where someone has written a language label for Hangul by hand, and wraps the text with Template:Korean instead. E.g. [[Korean language|Korean]]: 안녕
-> {{Korean|hangul=안녕}}
Search:
(\[\[Korean language\|Korean\]\]|\[\[Hangul\]\]|Hangul|Korean)[ ]*:[ ]*(?<koreantext>([가-힣]|\ |[0-9])+)
Replace:
{{Korean|hangul=${koreantext}}}
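The same kind of check can be sketched for this rule. As before, this is a Python port under assumptions (`(?P<name>...)` groups, `\g<name>` references, `re.IGNORECASE` in place of disabling case sensitivity), and `wrap_labeled_hangul` is a hypothetical name.

```python
import re

# Python port of the Search pattern above, compiled case-insensitively.
LABEL_SEARCH = re.compile(
    r"(\[\[Korean language\|Korean\]\]|\[\[Hangul\]\]|Hangul|Korean)[ ]*:[ ]*"
    r"(?P<koreantext>([가-힣]|\ |[0-9])+)",
    re.IGNORECASE,
)

# Replacement with the named group written the Python way.
LABEL_REPLACE = r"{{Korean|hangul=\g<koreantext>}}"


def wrap_labeled_hangul(wikitext):
    """Replace a hand-written Hangul label with Template:Korean."""
    return LABEL_SEARCH.sub(LABEL_REPLACE, wikitext)


wrapped = wrap_labeled_hangul("[[Korean language|Korean]]: 안녕")
lowercase_label = wrap_labeled_hangul("hangul:안녕")
```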
Merging tacked-on Hanja
Disable case sensitivity. This fixes cases where someone has tacked Hanja on by hand after properly formatted Hangul in Template:Korean. E.g. {{korean|hangul=안녕}}, [[Hanja]]:你好
-> {{Korean|hangul=안녕|hanja=你好}}
Search:
{{(Korean|ko-hhrm)[ ]*\|[ ]*(hangul[ ]*=[ ]*)?(?<koreantext>([가-힣]|\ |[0-9])+)[ ]*}}(,|;)?[ ]*(\[\[)?(Hanja|Chinese)(\]\])?:?[ ]*(?<chinesetext>[\u4E00-\u9FFF]+)
Replace:
{{Korean|hangul=${koreantext}|hanja=${chinesetext}}}
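A minimal Python sketch of this merge, again as a port under assumptions rather than AWB's own engine: named groups use `(?P<name>...)`, the replacement uses `\g<name>`, `re.IGNORECASE` stands in for disabling case sensitivity, and the literal braces are escaped to be safe. The name `merge_hanja` is hypothetical.

```python
import re

# Python port of the Search pattern above; literal braces are escaped.
MERGE_SEARCH = re.compile(
    r"\{\{(Korean|ko-hhrm)[ ]*\|[ ]*(hangul[ ]*=[ ]*)?"
    r"(?P<koreantext>([가-힣]|\ |[0-9])+)[ ]*\}\}(,|;)?[ ]*"
    r"(\[\[)?(Hanja|Chinese)(\]\])?:?[ ]*(?P<chinesetext>[\u4E00-\u9FFF]+)",
    re.IGNORECASE,
)

# Both captured runs are folded into a single template call.
MERGE_REPLACE = r"{{Korean|hangul=\g<koreantext>|hanja=\g<chinesetext>}}"


def merge_hanja(wikitext):
    """Fold trailing hand-labelled Hanja into the preceding template."""
    return MERGE_SEARCH.sub(MERGE_REPLACE, wikitext)


merged = merge_hanja("{{korean|hangul=안녕}}, [[Hanja]]:你好")
merged_positional = merge_hanja("{{ko-hhrm|안녕}} Hanja 你好")
```

The second call shows that the optional `hangul=` prefix, the separator, and the colon can all be absent and the pattern still matches.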
See also
{{CJKV}}
{{transliteration}}