Glossary of Terms

Definitions for terms used within the USFM/USX schema and diagrams.

hs

Pattern: /[\u0009\u0020]/ A single horizontal (non-newline) whitespace character: tab or space

HS

Pattern: /${hs}+/ A sequence of one or more horizontal whitespace characters

Hs

Pattern: /${hs}*/ A sequence of zero or more horizontal whitespace characters
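The ${hs} notation in these patterns stands for substitution of a previously defined term. As a rough illustration of how hs, HS and Hs build on each other, here is a minimal Python sketch. The use of f-strings for the substitution and the helper name is_horizontal_ws are assumptions made for this example and are not part of the schema.

```python
import re

# Base fragment copied from the glossary entry for hs (tab or space).
hs = r"[\u0009\u0020]"

# f-strings stand in for the ${hs} substitution used in the glossary.
HS = f"{hs}+"  # one or more horizontal whitespace characters
Hs = f"{hs}*"  # zero or more horizontal whitespace characters


def is_horizontal_ws(text: str, allow_empty: bool = False) -> bool:
    """Hypothetical helper: True if text consists only of tabs and spaces."""
    pattern = Hs if allow_empty else HS
    return re.fullmatch(pattern, text) is not None
```

The only difference between HS and Hs is whether the empty string is accepted, which the allow_empty flag makes explicit.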

nl

Pattern: /(?:\u000D?\u000A|\u000D)/ A single newline sequence: LF, CRLF, or a lone CR, covering the conventions of the common operating systems

NL

Pattern: /${nl}+/ A sequence of one or more newlines

ws

Pattern: /(?:${hs}|${nl}|$)/ A reducible whitespace character, including a single newline sequence

anyws

Pattern: /[\u0009\u000A\u000D\u0020]/ Matches any single reducible whitespace character; may split a two-character newline (CRLF) into separate matches
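The difference between nl and anyws is easiest to see on a CRLF line ending. The following Python sketch counts matches of each pattern. The helper name count_matches is hypothetical, and the patterns are copied from the entries above.

```python
import re

# Patterns copied from the glossary entries for nl and anyws.
nl = r"(?:\u000D?\u000A|\u000D)"
anyws = r"[\u0009\u000A\u000D\u0020]"


def count_matches(pattern: str, text: str) -> int:
    """Hypothetical helper: number of non-overlapping matches in text."""
    return len(re.findall(pattern, text))


crlf_text = "a\r\nb"
nl_count = count_matches(nl, crlf_text)        # CRLF is consumed as one newline
anyws_count = count_matches(anyws, crlf_text)  # CR and LF are matched separately
```

Here nl treats the CR LF pair as a single unit, while anyws matches each of the two characters on its own.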

WS

Pattern: /${anyws}+/ A sequence of one or more reducible whitespace characters

Ws

Pattern: /${anyws}*/ A sequence of zero or more reducible whitespace characters

allws

Pattern: /[\u0009-\u000D\u0020\u00A0\u1680\u2000-\u200B\u2028\u2029\u202F\u205F\u3000]/ All possible whitespace characters including content whitespace. Matches a single character

WSNL

Pattern: /${HS}${NL}/ A sequence of non-newline whitespace up to and including a newline

TAGEND

Pattern: /(?:${ws}+|(?=[\\|]|$))/ Delimits a marker

TEXTEND

Pattern: /(?=${anyws}*\\|//|$)/ Delimits simple text

ATTRIBTEXTEND

Pattern: /${Hs}(?=[\\|])/ Reducible characters following an attribute value

ATTRIBTEXT

Pattern: /(?:\\["\\=~/|]|[^\\"])+/ Matches text inside an attribute value (not including the quotes)
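To illustrate which strings ATTRIBTEXT accepts, the sketch below applies the pattern to a whole candidate value in Python. Whole-string matching via re.fullmatch and the helper name is_attrib_text are assumptions for this example. The pattern itself is copied from the entry above.

```python
import re

# Pattern copied from the glossary entry for ATTRIBTEXT.
# Either a backslash escape of one of " \ = ~ / |, or any character
# that is neither a backslash nor a double quote.
ATTRIBTEXT = r'(?:\\["\\=~/|]|[^\\"])+'


def is_attrib_text(value: str) -> bool:
    """Hypothetical helper: True if value is valid text inside the quotes."""
    return re.fullmatch(ATTRIBTEXT, value) is not None
```

Under these assumptions, a plain value and a value with backslash-escaped quotes are accepted, while a bare double quote or a backslash followed by an unlisted character is rejected.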

ATTRIBALL

Pattern: /(?:[^\\=|]|\\[\\=|~/])+(?=\\)/ Matches a default attribute value string

ATTRIBNAME

Pattern: /[a-zA-Z_][a-zA-Z0-9\-_]*?/ Matches an attribute name

TEXT

Pattern: /([^\\/]|/(?!/)|\\[/~\\|])+?/ Matches simple text up to the next marker

TEXTNWS

Pattern: /.+(?=${Ws}\\[^~/\\])/ Matches simple text without trailing whitespace

TEXTNOTATTRIB

Pattern: /(?:[^\\|]|\\[\\~/|])+/ Matches simple text up to the start of a sequence of attributes delimited by |

TEXTNOTATTRIBOPT

Pattern: /(?:[^\\|]|\\[\\~/|])*/ Matches simple text up to the start of a sequence of attributes delimited by | if present

IGNORELINE

Pattern: /[^\r\n]*/ Matches everything up to the end of the line, so that the remainder of a line can be ignored

PIPE

Pattern: /${hs}*(?<!\\)\|${hs}*/ Matches the attributes list delimiter of |
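The negative lookbehind in PIPE means that a backslash-escaped \| is not treated as a delimiter. A minimal Python sketch of splitting on PIPE follows. The helper name split_attributes is hypothetical, and string concatenation stands in for the ${hs} substitution.

```python
import re
from typing import List

# hs copied from its glossary entry; PIPE assembled from it.
hs = r"[\u0009\u0020]"
PIPE = hs + r"*(?<!\\)\|" + hs + r"*"


def split_attributes(text: str) -> List[str]:
    """Hypothetical helper: split text at each unescaped | delimiter."""
    return re.split(PIPE, text)
```

Horizontal whitespace on either side of the delimiter is consumed by the match, so the resulting pieces need no further trimming at the split points.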

TLC

Pattern: /[0-9A-Z]{3}/ Three-character code made up of uppercase letters and digits
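As a quick check of what TLC accepts, the Python sketch below tests a few candidate codes. It assumes the pattern is applied to the whole string, which is how schema pattern constraints are commonly applied. The helper name is_tlc is hypothetical.

```python
import re

# Pattern copied from the glossary entry for TLC.
TLC = r"[0-9A-Z]{3}"


def is_tlc(code: str) -> bool:
    """Hypothetical helper: True if code is exactly three uppercase letters or digits."""
    return re.fullmatch(TLC, code) is not None
```

With whole-string matching, codes such as GEN and 1CO pass, while lowercase or four-character strings do not.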

VERSE

Pattern: /[1-9][0-9]*[\p{L}\p{Mn}]*(\u200F?[\-,][0-9]+[\p{L}\p{Mn}]*)*/ Verse number, including ranges and sequences

VID

Pattern: /[A-Z1-4]{3} ?[\u200Fa-z0-9,\-:\p{L}\p{Mn}]*/ USX eid, sid, vid references

MID

Pattern: /[\p{L}\d_\-\.:]+/ Milestone sid or eid; in effect, any identifier

HREF

Pattern: /(.*\/\/\/?(.*\/?)+)|((prj:[A-Za-z\-0-9]{3,8} )?[A-Z1-4]{3} \d+:\d+(\-\d+)?)|(#[^\s]+)/ href value: a URL-like string containing //, a Bible reference with an optional prj: project prefix, or a # anchor
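The HREF pattern has three top-level alternatives, and which one matched can be read from the capture groups. The Python sketch below shows one way to do that, assuming whole-string matching. The function name classify_href and the labels "url", "reference" and "anchor" are made up for this example and are not part of the schema.

```python
import re
from typing import Optional

# Pattern copied from the glossary entry for HREF, split across
# lines at the two top-level | alternations for readability.
HREF = re.compile(
    r"(.*\/\/\/?(.*\/?)+)"
    r"|((prj:[A-Za-z\-0-9]{3,8} )?[A-Z1-4]{3} \d+:\d+(\-\d+)?)"
    r"|(#[^\s]+)"
)


def classify_href(value: str) -> Optional[str]:
    """Hypothetical helper: report which HREF alternative matched, if any."""
    m = HREF.fullmatch(value)
    if m is None:
        return None
    if m.group(1) is not None:   # first alternative: contains //
        return "url"
    if m.group(3) is not None:   # second alternative: book chapter:verse
        return "reference"
    return "anchor"              # third alternative: #identifier
```

For instance, a value such as GEN 1:1 falls under the second alternative, while GEN 1 matches none of them because the chapter and verse separator is required.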