<head>
) inside the current div.
Example TEI file:
<div type="lib" n="1"><head>LIBER I</head>...
<div type="pr">...</div>
<div type="cap" n="1"><head>CAP EX</head><p><milestone unit="par" n="1" />...<milestone unit="par" n="2" />...</div>
<div type="cap" n="2"><head>CAP EX</head><milestone unit="par" n="1" />...<milestone unit="par" n="2" />...</div>
</div>
I tried the following, but it did not work:
//source file:
$fulltext = '<div type="lib" n="1"><head>LIBER I</head>...<div type="pr">...</div><div type="cap" n="1"><head>CAP EX</head><p><milestone unit="par" n="1" />...<milestone unit="par" n="2" />...</div><div type="cap" n="2"><head>CAP EX</head><milestone unit="par" n="1" />...<milestone unit="par" n="2" />...</div></div>';
$dom = new DOMDocument();
@$dom->loadHTML($fulltext);
$domx = new DOMXPath($dom);
$entries = $domx->evaluate("//div");
echo '<ul>';
foreach ($entries as $entry) {
    $title = '';
    $type = $entry->getAttribute('type');
    $n = $entry->getAttribute('n');
    // text of the first <head> child of the current <div>
    $head = $domx->evaluate("string(./head[1])", $entry);
    if ($head != '') $title = $head; else $title = $n;
    echo '<li><a href="#'.$type.'-'.$n.'">'.$title.'</a></li>';
}
echo '</ul>';
This is the line that does not work:
$head = $domx->evaluate("string(./head[1])",$entry);
DOMDocument::loadHTML(): htmlParseStartTag: misplaced <head> tag in Entity, line: 3
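The purpose of this line is to get the text of the child <head> tag of the current <div> inside the loop (in this example, "LIBER I").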
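
The warning suggests that the HTML parser treats <head> as the HTML document head rather than as an ordinary TEI element, so the XPath expression has nothing to match. One possible approach, sketched below, is to parse the file with loadXML() instead of loadHTML(). This is only a sketch under some assumptions: the real TEI file is well-formed XML (the sample here is a reduced version in which the unclosed <p> is closed), the elements are not in a namespace, and buildToc() is a hypothetical helper name that is not part of the original code.

```php
<?php
// Reduced, well-formed version of the sample (assumption: the real file is valid XML).
$fulltext = '<div type="lib" n="1"><head>LIBER I</head>'
    . '<div type="pr"><p>...</p></div>'
    . '<div type="cap" n="1"><head>CAP EX</head><p><milestone unit="par" n="1" />...'
    . '<milestone unit="par" n="2" />...</p></div>'
    . '<div type="cap" n="2"><head>CAP EX</head><milestone unit="par" n="1" />...'
    . '<milestone unit="par" n="2" />...</div>'
    . '</div>';

// buildToc() is a hypothetical helper: it returns the table of contents as an HTML list.
function buildToc(string $xml): string
{
    $dom = new DOMDocument();
    // loadXML() keeps <head> as a normal element; it returns false on malformed input.
    if (!$dom->loadXML($xml)) {
        throw new RuntimeException('Input is not well-formed XML');
    }
    $domx = new DOMXPath($dom);

    $html = "<ul>\n";
    foreach ($domx->query('//div') as $entry) {
        $type = $entry->getAttribute('type');
        $n    = $entry->getAttribute('n');
        // Text of the first <head> child of the current <div>, or '' if there is none.
        $head  = $domx->evaluate('string(head[1])', $entry);
        $title = $head !== '' ? $head : $n;
        $html .= '<li><a href="#' . htmlspecialchars($type . '-' . $n) . '">'
               . htmlspecialchars($title) . "</a></li>\n";
    }
    return $html . "</ul>\n";
}

echo buildToc($fulltext);
```

With this input, the first list item should link to #lib-1 with the title "LIBER I". If the real file declares the TEI default namespace, the element names would probably need a prefix registered with DOMXPath::registerNamespace() before the queries match.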