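The text fragment you posted can be parsed by an SGML parser given a suitable `DOCTYPE`. Assuming `<tab>` in your example stands for actual `tab` tags, save your data in a file named `data.ent`, then create the following SGML file, `doc.sgm`: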
<!DOCTYPE doc [
<!ELEMENT doc O O (tab)+>
<!ELEMENT tab - O (((b,c?)|c),text)>
<!ELEMENT text O O (#PCDATA|b)+>
<!ELEMENT b - - (#PCDATA)>
<!ELEMENT c - - (#PCDATA)>
<!ENTITY data SYSTEM "data.ent">
<!ENTITY startc "<c>">
<!ENTITY endc "</c>">
<!SHORTREF intab "(" startc ")" endc>
<!USEMAP intab tab>
<!USEMAP #EMPTY text>
]>
&data
The result of parsing your data with these DTD rules (using `osgmlnorm doc.sgm`) is:
<DOC>
<TAB>
<B>SECTION 5.</B>
<TEXT>In Colorado Revised Statutes, 13-5-142, <B>amend</B> (1)
introductory portion, (1)(b), and (3)(b)(II) as follows:
</TEXT>
</TAB>
<TAB>
<B>13-5-142. National instant criminal background check system
reporting.</B>
<C>1</C>
<TEXT>On and after March 20, 2013, the state court administrator
shall send electronically the following information to the
Colorado bureau of investigation created pursuant to section
24-33.5-401, referred to in this section as the "bureau":
</TEXT>
</TAB>
<TAB>
<C>b</C>
<TEXT>The name of each person who has been committed by order
of the court to the custody of the office of behavioral health
in the department of human services pursuant to section 27-81-112
or 27-82-108; and
</TEXT>
</TAB>
<TAB>
<C>3</C>
<TEXT>The state court administrator shall take all necessary steps
to cancel a record made by the state court administrator in the
national instant criminal background check system if:
</TEXT>
</TAB>
<TAB>
<C>b</C>
<TEXT>No less than three years before the date of the written
request:
</TEXT>
</TAB>
<TAB>
<C>II</C>
<TEXT>The period of commitment of the most recent order of
commitment or recommitment expired, or a court entered an order
terminating the person's incapacity or discharging the person
from commitment in the nature of habeas corpus, if the record in
the national instant criminal background check system is based on
an order of commitment to the custody of the office of behavioral
health in the department of human services; except that the state
court administrator shall not cancel any record pertaining to
a person with respect to whom two recommitment orders have been
entered pursuant to section 27-81-112 (7) and (8), or who was
discharged from treatment pursuant to section 27-81-112 (11) on
the grounds that further treatment is not likely to bring about
significant improvement in the person's condition; or
</TEXT>
</TAB>
</DOC>
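Explanation:
- The SGML DTD I created uses SGML tag inference to infer a synthetic `DOC` element as the document element, as well as the artificial `TEXT` and `C` elements. The main purpose is to capture the document structure as a sequence of `TAB` elements, each containing a section identifier (such as `<b>SECTION 5.</b>` or `(c)`) followed by body text.
- I also made up a special element, `C`, for text placed in parentheses (the `(` and `)` characters). The start and end tags for `C` are inserted automatically by the SGML processor because of the DTD's `SHORTREF` map rules: these tell SGML that, within `tab` elements, every `(` character should be replaced by the value of the `startc` entity (which expands to `<C>`) and every `)` character by the value of the `endc` entity (`</C>`).
- The `<!USEMAP #EMPTY text>` declaration turns off the expansion of parentheses in the `text` child elements that form the body part of a `tab`, so that `(7)` and `(8)` in the body text are left unchanged (although they could be turned into links, as in HTML, using SGML as well).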
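If you are using `<tab>` to represent a tab character (ASCII 9), SGML can handle that as well, for example by converting tab characters to `<TAB>` tags with `SHORTREF` rules similar to the ones shown.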
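To reproduce this you need the `osgmlnorm` program installed; on Ubuntu it can be installed with `sudo apt-get install opensp`, and similarly on other Linux distributions and macOS. For your application, you may want to use `osx`.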
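
If you want to consume the normalized output from a script, the following is a minimal Python sketch, not a definitive implementation. It assumes `osgmlnorm` is on your `PATH` and that its output for this document is also well-formed XML (this holds for the output shown above, because every element has an explicit end tag, but it is not guaranteed for SGML in general). The names `run_osgmlnorm`, `sections`, `squash`, and `sample` are illustrative and not part of OpenSP.

```python
import shutil
import subprocess
import xml.etree.ElementTree as ET


def run_osgmlnorm(path, tool="osgmlnorm"):
    """Run the OpenSP normalizer on `path` and return its standard output."""
    if shutil.which(tool) is None:
        raise FileNotFoundError(f"{tool} not found; install OpenSP first")
    # Deliberately no check=True, so any output is returned even if the
    # tool reports problems with the input.
    done = subprocess.run([tool, path], capture_output=True, text=True)
    return done.stdout


def squash(value):
    """Collapse runs of whitespace (including line breaks) to single spaces."""
    return None if value is None else " ".join(value.split())


def sections(normalized):
    """Return one (heading, marker, body) tuple per TAB element.

    Assumption: `normalized` can be parsed as XML, as is the case for the
    normalized output shown in this answer.
    """
    root = ET.fromstring(normalized)
    rows = []
    for tab in root.iter("TAB"):
        heading = squash(tab.findtext("B"))  # direct child B, or None
        marker = squash(tab.findtext("C"))   # direct child C, or None
        text = tab.find("TEXT")
        # itertext() also picks up text inside nested <B> elements.
        body = squash("".join(text.itertext())) if text is not None else ""
        rows.append((heading, marker, body))
    return rows


# Excerpt copied from the normalized output above, used here as sample input.
sample = """<DOC>
<TAB>
<B>SECTION 5.</B>
<TEXT>In Colorado Revised Statutes, 13-5-142, <B>amend</B> (1)
introductory portion, (1)(b), and (3)(b)(II) as follows:
</TEXT>
</TAB>
<TAB>
<C>b</C>
<TEXT>No less than three years before the date of the written
request:
</TEXT>
</TAB>
</DOC>"""
```

With OpenSP installed, `sections(run_osgmlnorm("doc.sgm"))` would give one tuple per `TAB` element; `sections(sample)` works without it.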