
How do I validate an XML file against an XSD that has an include?

  •  26
  • Melanie  · Tech community  · 14 years ago

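    I am using Java 5's javax.xml.validation.Validator to validate XML files. I have done this for a schema that only uses imports, and everything works fine. Now I am trying to validate against another schema that uses both an import and an include. My problem is that the elements in the main schema are ignored: validation reports that it cannot find their declarations.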

    Here is how I build the Schema:

    InputStream includeInputStream = getClass().getClassLoader().getResource("include.xsd").openStream();
    InputStream importInputStream = getClass().getClassLoader().getResource("import.xsd").openStream();
    InputStream mainInputStream = getClass().getClassLoader().getResource("main.xsd").openStream();
    Source[] sourceSchema = new SAXSource[]{includeInputStream , importInputStream, 
    mainInputStream };
    Schema schema = factory.newSchema(sourceSchema);
    

    Here is an excerpt of the declarations in main.xsd:

    <xsd:schema xmlns="http://schema.omg.org/spec/BPMN/2.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:import="http://www.foo.com/import" targetNamespace="http://main/namespace" elementFormDefault="qualified" attributeFormDefault="unqualified">
        <xsd:import namespace="http://www.foo.com/import" schemaLocation="import.xsd"/>
        <xsd:include schemaLocation="include.xsd"/>
        <xsd:element name="element" type="tElement"/>
        <...>
    </xsd:schema>
    

    If I copy the contents of the included XSD into main.xsd, it works fine. If I don't, validation cannot find the declaration of "element".

    8 answers  |  last active 5 years ago
        1
  •  57
  •   Stefan De Boey    14 years ago

    You need to use an LSResourceResolver for this to work. Please see the sample code below.

    The validate method:

    // note that if your XML already declares the XSD to which it has to conform, then there's no need to declare the schemaName here
    void validate(String xml, String schemaName) throws Exception {
    
        DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
        builderFactory.setNamespaceAware(true);
    
        DocumentBuilder parser = builderFactory
                .newDocumentBuilder();
    
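        // parse the XML into a document object; InputSource (org.xml.sax) and
        // StringReader (java.io) are standard classes, unlike StringInputStream
        Document document = parser.parse(new InputSource(new StringReader(xml)));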
    
        SchemaFactory factory = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    
        // associate the schema factory with the resource resolver, which is responsible for resolving the imported XSD's
        factory.setResourceResolver(new ResourceResolver());
    
        // note that if your XML already declares the XSD to which it has to conform, then there's no need to create a validator from a Schema object
        Source schemaFile = new StreamSource(getClass().getClassLoader()
                .getResourceAsStream(schemaName));
        Schema schema = factory.newSchema(schemaFile);
    
        Validator validator = schema.newValidator();
        validator.validate(new DOMSource(document));
    }
    

    The resource resolver implementation:

    public class ResourceResolver  implements LSResourceResolver {
    
    public LSInput resolveResource(String type, String namespaceURI,
            String publicId, String systemId, String baseURI) {
    
         // note: in this sample, the XSD's are expected to be in the root of the classpath
        InputStream resourceAsStream = this.getClass().getClassLoader()
                .getResourceAsStream(systemId);
        return new Input(publicId, systemId, resourceAsStream);
    }
    
     }
    

    The Input implementation returned by the resource resolver:

    public class Input implements LSInput {
    
    private String publicId;
    
    private String systemId;
    
    public String getPublicId() {
        return publicId;
    }
    
    public void setPublicId(String publicId) {
        this.publicId = publicId;
    }
    
    public String getBaseURI() {
        return null;
    }
    
    public InputStream getByteStream() {
        return null;
    }
    
    public boolean getCertifiedText() {
        return false;
    }
    
    public Reader getCharacterStream() {
        return null;
    }
    
    public String getEncoding() {
        return null;
    }
    
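    public String getStringData() {
        synchronized (inputStream) {
            try {
                // read until end of stream: available() is only an estimate and a
                // single read() call is not guaranteed to fill the buffer
                java.io.ByteArrayOutputStream buffer = new java.io.ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int count;
                while ((count = inputStream.read(chunk)) != -1) {
                    buffer.write(chunk, 0, count);
                }
                // decoded with the platform default charset, as in the original code
                return new String(buffer.toByteArray());
            } catch (IOException e) {
                e.printStackTrace();
                System.out.println("Exception " + e);
                return null;
            }
        }
    }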
    
    public void setBaseURI(String baseURI) {
    }
    
    public void setByteStream(InputStream byteStream) {
    }
    
    public void setCertifiedText(boolean certifiedText) {
    }
    
    public void setCharacterStream(Reader characterStream) {
    }
    
    public void setEncoding(String encoding) {
    }
    
    public void setStringData(String stringData) {
    }
    
    public String getSystemId() {
        return systemId;
    }
    
    public void setSystemId(String systemId) {
        this.systemId = systemId;
    }
    
    public BufferedInputStream getInputStream() {
        return inputStream;
    }
    
    public void setInputStream(BufferedInputStream inputStream) {
        this.inputStream = inputStream;
    }
    
    private BufferedInputStream inputStream;
    
    public Input(String publicId, String sysId, InputStream input) {
        this.publicId = publicId;
        this.systemId = sysId;
        this.inputStream = new BufferedInputStream(input);
    }
    }
    
        2
  •  4
  •   gil.fernandes    7 years ago

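    The accepted answer is perfectly fine, but it does not work with Java 8 without some modifications. It would also be nice to be able to specify a base path from which the imported schemas are read.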

    I have used the following code with Java 8. It allows you to specify an embedded schema path other than the root path:

    import com.sun.org.apache.xerces.internal.dom.DOMInputImpl;
    import org.w3c.dom.ls.LSInput;
    import org.w3c.dom.ls.LSResourceResolver;
    
    import java.io.InputStream;
    import java.util.Objects;
    
    public class ResourceResolver implements LSResourceResolver {
    
        private String basePath;
    
        public ResourceResolver(String basePath) {
            this.basePath = basePath;
        }
    
        @Override
        public LSInput resolveResource(String type, String namespaceURI, String publicId, String systemId, String baseURI) {
            // note: the XSDs are expected under basePath on the classpath (or in the classpath root when basePath is null)
            InputStream resourceAsStream = this.getClass().getClassLoader()
                    .getResourceAsStream(buildPath(systemId));
            Objects.requireNonNull(resourceAsStream, String.format("Could not find the specified xsd file: %s", systemId));
            return new DOMInputImpl(publicId, systemId, baseURI, resourceAsStream, "UTF-8");
        }
    
        private String buildPath(String systemId) {
            return basePath == null ? systemId : String.format("%s/%s", basePath, systemId);
        }
    }
    

    This implementation also gives the user a meaningful message in case the schema cannot be read.
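
    The resolver above depends on DOMInputImpl, which lives in an internal com.sun package. As a hedged alternative, the sketch below obtains an LSInput through the standard DOM Load/Save bootstrap (DOMImplementationRegistry and DOMImplementationLS.createLSInput). The class name LsInputResolverDemo, the in-memory map that stands in for the classpath, and the two tiny schemas are assumptions made up for illustration; they are not taken from the answers.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;
import org.xml.sax.SAXException;

/** Hypothetical demo class; schema texts and names are invented for illustration. */
final class LsInputResolverDemo {

    static final String NS = "http://main/namespace";

    static final String INCLUDE_XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema' xmlns='" + NS + "'"
        + " targetNamespace='" + NS + "' elementFormDefault='qualified'>"
        + "<xsd:complexType name='tElement'><xsd:sequence>"
        + "<xsd:element name='name' type='xsd:string'/>"
        + "</xsd:sequence></xsd:complexType></xsd:schema>";

    static final String MAIN_XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema' xmlns='" + NS + "'"
        + " targetNamespace='" + NS + "' elementFormDefault='qualified'>"
        + "<xsd:include schemaLocation='include.xsd'/>"
        + "<xsd:element name='element' type='tElement'/></xsd:schema>";

    /** Looks up the standard DOM Load/Save implementation shipped with the JDK. */
    static DOMImplementationLS loadSave() {
        try {
            return (DOMImplementationLS) DOMImplementationRegistry.newInstance()
                    .getDOMImplementation("LS");
        } catch (Exception e) {
            throw new IllegalStateException("No DOM Load/Save implementation available", e);
        }
    }

    /** Resolves schemaLocation values against an in-memory map (a stand-in for the classpath). */
    static LSResourceResolver resolverFor(final Map<String, String> schemas) {
        final DOMImplementationLS ls = loadSave();
        return new LSResourceResolver() {
            public LSInput resolveResource(String type, String namespaceURI,
                    String publicId, String systemId, String baseURI) {
                String text = schemas.get(systemId);
                if (text == null) {
                    return null; // unknown resource: the parser falls back to its default lookup
                }
                LSInput input = ls.createLSInput();
                input.setPublicId(publicId);
                input.setSystemId(systemId);
                input.setBaseURI(baseURI);
                input.setStringData(text);
                return input;
            }
        };
    }

    /** Compiles MAIN_XSD (which includes include.xsd) and validates the given XML text. */
    static boolean isValid(String xml) {
        Map<String, String> schemas = new HashMap<>();
        schemas.put("include.xsd", INCLUDE_XSD);
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setResourceResolver(resolverFor(schemas));
        try {
            Schema schema = factory.newSchema(new StreamSource(new StringReader(MAIN_XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // schema or document problem
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<element xmlns='" + NS + "'><name>x</name></element>"));
        System.out.println(isValid("<element xmlns='" + NS + "'><other/></element>"));
    }
}
```

    In a real project the map lookup would presumably be replaced by a classpath lookup such as getResourceAsStream plus setByteStream; the map only keeps the sketch self-contained.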

        3
  •  3
  •   Community dbr    7 years ago

    I had to make some modifications to this post by AMegmondoEmber.

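    My main schema file has some includes from sibling folders, and the included files also have some includes from their own local folders. I also had to keep track of the base resource path and the relative path of the current resource. This code works for me, but keep in mind that it assumes every XSD file has a unique name. If you have XSD files with the same name but different content at different paths, it will probably cause problems.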

    import java.io.ByteArrayInputStream;
    import java.io.InputStream;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Scanner;
    
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.w3c.dom.ls.LSInput;
    import org.w3c.dom.ls.LSResourceResolver;
    
    /**
     * The Class ResourceResolver.
     */
    public class ResourceResolver implements LSResourceResolver {
    
        /** The logger. */
        private final Logger logger = LoggerFactory.getLogger(this.getClass());
    
        /** The schema base path. */
        private final String schemaBasePath;
    
        /** The path map. */
        private Map<String, String> pathMap = new HashMap<String, String>();
    
        /**
         * Instantiates a new resource resolver.
         *
         * @param schemaBasePath the schema base path
         */
        public ResourceResolver(String schemaBasePath) {
            this.schemaBasePath = schemaBasePath;
            logger.warn("This LSResourceResolver implementation assumes that all XSD files have a unique name. "
                    + "If you have some XSD files with same name but different content (at different paths) in your schema structure, "
                    + "this resolver will fail to include the other XSD files except the first one found.");
        }
    
        /* (non-Javadoc)
         * @see org.w3c.dom.ls.LSResourceResolver#resolveResource(java.lang.String, java.lang.String, java.lang.String, java.lang.String, java.lang.String)
         */
        @Override
        public LSInput resolveResource(String type, String namespaceURI,
                String publicId, String systemId, String baseURI) {
            // The base resource that includes this current resource
            String baseResourceName = null;
            String baseResourcePath = null;
            // Extract the current resource name
            String currentResourceName = systemId.substring(systemId
                    .lastIndexOf("/") + 1);
    
            // If this resource hasn't been added yet
            if (!pathMap.containsKey(currentResourceName)) {
                if (baseURI != null) {
                    baseResourceName = baseURI
                            .substring(baseURI.lastIndexOf("/") + 1);
                }
    
                // we dont need "./" since getResourceAsStream cannot understand it
                if (systemId.startsWith("./")) {
                    systemId = systemId.substring(2, systemId.length());
                }
    
                // If the baseResourcePath has already been discovered, get that
                // from pathMap
                if (pathMap.containsKey(baseResourceName)) {
                    baseResourcePath = pathMap.get(baseResourceName);
                } else {
                    // The baseResourcePath should be the schemaBasePath
                    baseResourcePath = schemaBasePath;
                }
    
                // Read the resource as input stream
                String normalizedPath = getNormalizedPath(baseResourcePath, systemId);
                InputStream resourceAsStream = this.getClass().getClassLoader()
                        .getResourceAsStream(normalizedPath);
    
                // if the current resource is not in the same path with base
                // resource, add current resource's path to pathMap
                if (systemId.contains("/")) {
                    pathMap.put(currentResourceName, normalizedPath.substring(0,normalizedPath.lastIndexOf("/")+1));
                } else {
                    // The current resource should be at the same path as the base
                    // resource
                    pathMap.put(systemId, baseResourcePath);
                }
                Scanner s = new Scanner(resourceAsStream).useDelimiter("\\A");
                String s1 = s.next().replaceAll("\\n", " ") // the parser cannot understand elements broken down multiple lines e.g. (<xs:element \n name="buxing">)
                        .replace("\t", " ") // tab to space ("\\t" would match a literal backslash-t); this call and the next are cosmetic
                        .replaceAll("\\s+", " ").replaceAll("[^\\x20-\\x7e]", ""); // some files has a special character as a first character indicating utf-8 file
                InputStream is = new ByteArrayInputStream(s1.getBytes());
    
                return new LSInputImpl(publicId, systemId, is); // same as Input class
            }
    
            // If this resource has already been added, do not add the same resource again. It throws
            // "org.xml.sax.SAXParseException: sch-props-correct.2: A schema cannot contain two global components with the same name; this schema contains two occurrences of ..."
            // return null instead.
            return null;
        }
    
        /**
         * Gets the normalized path.
         *
         * @param basePath the base path
         * @param relativePath the relative path
         * @return the normalized path
         */
        private String getNormalizedPath(String basePath, String relativePath){
            if(!relativePath.startsWith("../")){
                return basePath + relativePath;
            }
            else{
                while(relativePath.startsWith("../")){
                    basePath = basePath.substring(0,basePath.substring(0, basePath.length()-1).lastIndexOf("/")+1);
                    relativePath = relativePath.substring(3);
                }
                return basePath+relativePath;
            }
        }
    }
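
    The hand-written "../" handling in getNormalizedPath can probably be delegated to java.net.URI, whose resolve method applies the standard relative-reference rules. The sketch below is only an illustration of that idea: the class name SchemaPaths and the sample paths are made up. It assumes the paths contain no characters that are illegal in a URI (such as spaces); URI.create would throw for those.

```java
import java.net.URI;

/** Hypothetical helper, not part of the answer's code. */
final class SchemaPaths {

    /**
     * Resolves a schemaLocation against the classpath location that referenced it.
     * The base may be a folder ending in "/" or the path of the including file.
     */
    static String resolve(String base, String schemaLocation) {
        return URI.create(base).resolve(schemaLocation).normalize().toString();
    }

    public static void main(String[] args) {
        // include from a sibling folder
        System.out.println(resolve("schemas/main/", "../common/types.xsd")); // schemas/common/types.xsd
        // "./" prefix relative to the including file
        System.out.println(resolve("includes/subPart.xsd", "./subSubPart.xsd")); // includes/subSubPart.xsd
        // plain file name in the same folder
        System.out.println(resolve("schemas/main/", "types.xsd")); // schemas/main/types.xsd
    }
}
```

    The result can be passed to getResourceAsStream as it is. Note that a location climbing above the base (more "../" segments than folders) keeps its leading "../" rather than failing.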
    
        4
  •  1
  •   Gordon Daugherty    5 years ago

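    As user "ulab" points out in a comment on another answer, the solution described in this answer (to a separate Stack Overflow question) will work for many people. Here is a rough outline of that approach: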

    SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    URL xsdURL = getClass().getResource("/xsd/my-schema.xsd"); // getResource is a method of Class, not of the object itself
    Schema schema = schemaFactory.newSchema(xsdURL);
    

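    The key to this approach is to avoid handing the schema factory a stream and to give it a URL instead. That way it gets information about the location of the XSD file.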

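    One thing to keep in mind: when you use simple file paths of the form "my-common.xsd" or "common/some-concept.xsd", the "schemaLocation" attribute on include and/or import elements is treated as relative to the classpath location of the XSD file you handed to the validator.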

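    Notes:
    - In the example above, I placed the schema file in a JAR file under an "xsd" folder.
    - The leading slash in the "getResource" argument tells Java to start at the root of the classloader instead of at the package name of the "this" object.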
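
    If a stream has to be used anyway, the same location information can be supplied through the two-argument StreamSource constructor, which takes a system ID alongside the stream. The sketch below illustrates this under stated assumptions: the class name SystemIdDemo, the temp-folder layout and the two small schemas are invented for the demo, and a temp folder stands in for the JAR or classpath location.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

/** Hypothetical demo class; schema texts and file names are invented for illustration. */
final class SystemIdDemo {

    static final String NS = "http://main/namespace";

    static final String INCLUDE_XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema' xmlns='" + NS + "'"
        + " targetNamespace='" + NS + "' elementFormDefault='qualified'>"
        + "<xsd:complexType name='tElement'><xsd:sequence>"
        + "<xsd:element name='name' type='xsd:string'/>"
        + "</xsd:sequence></xsd:complexType></xsd:schema>";

    static final String MAIN_XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema' xmlns='" + NS + "'"
        + " targetNamespace='" + NS + "' elementFormDefault='qualified'>"
        + "<xsd:include schemaLocation='include.xsd'/>"
        + "<xsd:element name='element' type='tElement'/></xsd:schema>";

    /** Writes both schemas into a fresh temp folder and returns the URL of main.xsd. */
    static URL writeSchemas() throws IOException {
        Path dir = Files.createTempDirectory("xsd-demo");
        Files.write(dir.resolve("include.xsd"), INCLUDE_XSD.getBytes(StandardCharsets.UTF_8));
        Path main = dir.resolve("main.xsd");
        Files.write(main, MAIN_XSD.getBytes(StandardCharsets.UTF_8));
        return main.toUri().toURL();
    }

    /** Tries to compile main.xsd from a stream, with or without a system ID attached. */
    static boolean compiles(boolean withSystemId) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            URL mainUrl = writeSchemas();
            try (InputStream in = mainUrl.openStream()) {
                StreamSource source = withSystemId
                        ? new StreamSource(in, mainUrl.toExternalForm()) // relative includes resolve against this
                        : new StreamSource(in);                          // no location information at all
                factory.newSchema(source);
                return true;
            }
        } catch (SAXException e) {
            return false; // e.g. the included type could not be found
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("with system ID:    " + compiles(true));
        System.out.println("without system ID: " + compiles(false));
    }
}
```

    The demo leaves its temp files behind; that is acceptable for a sketch but would need cleanup in real code.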

        5
  •  0
  •   AMegmondoEmber    11 years ago

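    For us the ResourceResolver looks like this. We got there after some prolog exceptions and strange errors such as: Element type "xs:schema" must be followed by either attribute specifications, ">" or "/>". Element type "xs:element" must be followed by either attribute specifications, ">" or "/>". (These were caused by elements being broken across multiple lines.)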

    The path history is needed because of the structure of the includes:

    main.xsd (this has include "includes/subPart.xsd")
    /includes/subPart.xsd (this has include "./subSubPart.xsd")
    /includes/subSubPart.xsd
    

    So the code looks like:

    String pathHistory = "";
    
    @Override
    public LSInput resolveResource(String type, String namespaceURI, String publicId, String systemId, String baseURI) {
        systemId = systemId.replace("./", "");// we dont need this since getResourceAsStream cannot understand it
        InputStream resourceAsStream = Message.class.getClassLoader().getResourceAsStream(systemId);
        if (resourceAsStream == null) {
            resourceAsStream = Message.class.getClassLoader().getResourceAsStream(pathHistory + systemId);
        } else {
            pathHistory = getNormalizedPath(systemId);
        }
        Scanner s = new Scanner(resourceAsStream).useDelimiter("\\A");
        String s1 = s.next()
                .replaceAll("\\n"," ") //the parser cannot understand elements broken down multiple lines e.g. (<xs:element \n name="buxing">) 
                .replace("\t", " ") // tab to space ("\\t" would match a literal backslash-t); this call and the next are cosmetic
                .replaceAll("\\s+", " ") 
                .replaceAll("[^\\x20-\\x7e]", ""); //some files has a special character as a first character indicating utf-8 file
        InputStream is = new ByteArrayInputStream(s1.getBytes());
    
        return new LSInputImpl(publicId, systemId, is);
    }
    
    private String getNormalizedPath(String baseURI) {
        return baseURI.substring(0, baseURI.lastIndexOf(System.getProperty("file.separator"))+ 1) ;
    }
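
    The regular expression [^\x20-\x7e] in the code above removes every non-ASCII character, including legitimate ones in annotations or default values. If the only goal is to drop the marker character at the start of some UTF-8 files (the byte-order mark, U+FEFF), a narrower approach may be enough. The sketch below shows that idea; the class name BomStripper is made up, and it assumes the schema files are UTF-8 encoded.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of the answer's code. */
final class BomStripper {

    /** Decodes the bytes as UTF-8 (an assumption about the files) and drops one leading U+FEFF. */
    static String decodeWithoutBom(byte[] bytes) {
        String text = new String(bytes, StandardCharsets.UTF_8);
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }

    /** Re-encodes the cleaned text so it can be handed to an LSInput as a byte stream. */
    static InputStream asStream(byte[] bytes) {
        return new ByteArrayInputStream(decodeWithoutBom(bytes).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        System.out.println(decodeWithoutBom(withBom)); // <a/>
    }
}
```

    Everything after the first character is left untouched, so line breaks and non-ASCII text inside the schema survive.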
    
        6
  •  0
  •   teknopaul    8 years ago

    The accepted answer is very verbose and builds a DOM in memory first. For me, includes seem to work out of the box, including relative references.

        SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = schemaFactory.newSchema(new File("../foo.xsd"));
        Validator validator = schema.newValidator();
        validator.validate(new StreamSource(new File("./foo.xml")));
    
        7
  •  -1
  •   Ramakrishna    10 years ago

    If an element cannot be found in the XML, you will get an xml:lang exception. Element names are case sensitive.
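
    The case-sensitivity point can be checked with a very small experiment. The sketch below is an illustration only: the class name CaseSensitivityDemo and the one-element schema are made up. It validates two documents that differ only in the capitalisation of the root element name.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

/** Hypothetical demo class; the schema is invented for illustration. */
final class CaseSensitivityDemo {

    // Declares a single global element named "element" (lower case).
    static final String XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        + "<xsd:element name='element' type='xsd:string'/></xsd:schema>";

    /** Returns true when the XML text validates against XSD. */
    static boolean isValid(String xml) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // no matching declaration, or another validation error
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<element>x</element>")); // true
        System.out.println(isValid("<Element>x</Element>")); // false
    }
}
```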

        8
  •  -4
  •   ncenerar    11 years ago
    SchemaFactory schemaFactory = SchemaFactory
                                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    Source schemaFile = new StreamSource(getClass().getClassLoader()
                                    .getResourceAsStream("cars-fleet.xsd"));
    Schema schema = schemaFactory.newSchema(schemaFile);
    Validator validator = schema.newValidator();
    StreamSource source = new StreamSource(xml);
    validator.validate(source);