代码之家  ›  专栏  ›  技术社区  ›  Björn Pollex

在哪里用JDBC创建一个准备好的语句?

  •  1
  • Björn Pollex  · 技术社区  · 14 年前

    考虑使用以下方法从某些数据结构中读取数据( InteractionNetwork )并使用 SQLite-JDBC dirver :

    private void loadAnnotations(InteractionNetwork network) throws SQLException {
        PreparedStatement insertAnnotationsQuery = 
            connection.prepareStatement(
            "INSERT INTO Annotations(GOId, ProteinId, OnthologyId) VALUES(?, ?, ?)");
        PreparedStatement getProteinIdQuery = 
            connection.prepareStatement(
            "SELECT Id FROM Proteins WHERE PrimaryUniProtKBAccessionNumber = ?");
        connection.setAutoCommit(false);
        for(common.Protein protein : network.get_protein_vector()) {
            /* Get ProteinId for the current protein from another table and
               insert the value into the prepared statement. */
            getProteinIdQuery.setString(1, protein.get_primary_id());
            ResultSet result = getProteinIdQuery.executeQuery();
            result.next();
            insertAnnotationsQuery.setLong(2, result.getLong(1));
            /* Extract all the other data and add all the tuples to the batch. */
        }
        insertAnnotationsQuery.executeBatch();
        connection.commit();
        connection.setAutoCommit(true);
    }
    

    此代码工作正常,程序运行大约30秒,平均占用80米堆空间。因为代码看起来很难看,所以我想重构它。我做的第一件事就是把 getProteinIdQuery 进入循环:

    private void loadAnnotations(InteractionNetwork network) throws SQLException {
        PreparedStatement insertAnnotationsQuery = 
            connection.prepareStatement(
            "INSERT INTO Annotations(GOId, ProteinId, OnthologyId) VALUES(?, ?, ?)");
        connection.setAutoCommit(false);
        for(common.Protein protein : network.get_protein_vector()) {
            /* Get ProteinId for the current protein from another table and
               insert the value into the prepared statement. */
            PreparedStatement getProteinIdQuery = // <--- moved declaration of statement here
                connection.prepareStatement(
                "SELECT Id FROM Proteins WHERE PrimaryUniProtKBAccessionNumber = ?");
            getProteinIdQuery.setString(1, protein.get_primary_id());
            ResultSet result = getProteinIdQuery.executeQuery();
            result.next();
            insertAnnotationsQuery.setLong(2, result.getLong(1));
            /* Extract all the other data and add all the tuples to the batch. */
        }
        insertAnnotationsQuery.executeBatch();
        connection.commit();
        connection.setAutoCommit(true);
    }
    

    当我现在运行代码时,会发生的事情是它需要大约130m的堆空间,并且需要永恒的运行时间。有人能解释这种奇怪的行为吗?

    2 回复  |  直到 14 年前
        1
  •  2
  •   Donal Fellows    14 年前

    正如你所发现的,准备一份声明需要时间。无论代码是否丑陋,速度的下降也是非常丑陋的,所以您需要使用更快的形式。

    但是您可以使用一个内部类来保存细节并提供一个更好的接口:

    private class DatabaseInterface {
        private PreparedStatement insertAnnotation, getProteinId;
        public DatabaseInterface() {
            // This is an inner class; 'connection' is variable in outer class
            insertAnnotation = connection.prepareStatement(
                "INSERT INTO Annotations(GOId, ProteinId, OnthologyId) VALUES(?, ?, ?)");
            getProteinId = connection.prepareStatement(
                "SELECT Id FROM Proteins WHERE PrimaryUniProtKBAccessionNumber = ?");
        }
        public long getId(Protein protein) { // Exceptions omitted...
            getProteinId.setString(1, protein.get_primary_id());
            ResultSet result = getProteinId.executeQuery();
            try {
                result.next();
                return result.getLong(1);
            } finally {
                result.close();
            }
        }
        public void insertAnnotation(int GOId, long proteinId, String ontologyId) {
            insertAnnotation.setInt(1, GOId);          // type may be wrong
            insertAnnotation.setLong(2, proteinId);
            insertAnnotation.setString(3, ontologyId); // type may be wrong
            insertAnnotation.executeUpdate();
        }
    }
    private void loadAnnotations(InteractionNetwork network) throws SQLException {
        connection.setAutoCommit(false);
        DatabaseInterface dbi = new DatabaseInterface();
        for(common.Protein protein : network.get_protein_vector()) {
            dbi.insertAnnotation(..., dbi.getId(protein), ...);
        }
        connection.commit();
        connection.setAutoCommit(true);
    }
    

    其目的是,您有一段代码知道如何将事情分为SQL(如果您转到不同的数据库,这段代码很容易适应),还有一段代码知道如何将这些事情协调在一起。

        2
  •  2
  •   Jan    14 年前

    我想,如果第一个片段看起来很丑,那是一个品味问题;-)…

    但是,第二个代码片段需要更长时间(imho)的原因是,现在对于for循环的每次迭代,都会创建一个新的PreparedStatement实例(getProteinIdQuery),而在第一个片段中,您会重新使用准备好的语句,并按照其预期的方式使用它:实例化,然后提供适当的值。

    至少,这是我的观点… 简