我正在尝试学习sqlite并寻找加速查询的技术。我看到这里有一些人想挤出MS,当我很容易进入
兆欧
秒。我有一个带四个表的sqliteDB,尽管我只查询三个表。下面是查询(我使用r来调用查询):
SELECT a.date, a.symbol, SUM (a.oi*a.contract_close) AS oi, c.ret, c.prc
FROM (SELECT date, symbol, oi, contract_close FROM ann
UNION
SELECT date, symbol AS sym, oi, contract_close FROM qtr
WHERE oi > 100 AND contract_close > 0 AND date > 20090600) a
INNER JOIN
(SELECT date, symbol || '1C' AS sym, ret, prc FROM crsp
WHERE prc > 5 AND date>20090600) c
ON a.date = c.date AND a.symbol = c.sym
GROUP BY a.date, a.symbol
我在每个表上都有一个按日期和符号排列的索引,只是被清空了,但它仍然很慢,就像一小时加上一个小时(注意我在寻找一个六个月的子集…我真的想回到2003年)。
这只是缓存大小问题吗?我有一台相对较新的笔记本电脑(带有4GB内存的MacBook Pro)。谢谢!
这里是
.schema
:
CREATE TABLE ann
( "date" INTEGER,
symbol TEXT,
contract_type_1 TEXT,
contract_type_2 TEXT,
product_type TEXT,
block_volume INTEGER,
oi_change INTEGER,
oi INTEGER,
efp_volume INTEGER,
total_volume INTEGER,
name TEXT,
contract_change INTEGER,
contract_open INTEGER,
contract_high INTEGER,
contract_low INTEGER,
contract_close INTEGER,
contract_settle INTEGER
);
CREATE TABLE crsp
( "date" INTEGER,
symbol TEXT,
permno INTEGER,
prc REAL,
ret REAL,
vwretd REAL,
ewretd REAL,
sprtrn REAL
);
CREATE TABLE dly
( "date" INTEGER,
symbol TEXT,
expiration INTEGER,
product_type TEXT,
shares_per_contract INTEGER,
"open" REAL,
high REAL,
low REAL,
"last" REAL,
settle REAL,
change REAL,
total_volume INTEGER,
efp_volume INTEGER,
block_volume INTEGER,
oi INTEGER
);
CREATE TABLE qtr
( "date" INTEGER,
symbol TEXT,
total_volume INTEGER,
block_volume INTEGER,
efp_volume INTEGER,
contract_high INTEGER,
contract_low INTEGER,
contract_open INTEGER,
contract_close INTEGER,
contract_settle INTEGER,
oi INTEGER,
oi_change INTEGER,
shares_per_contract INTEGER,
expiration INTEGER,
product_type TEXT,
unk TEXT,
name TEXT
);
CREATE INDEX idx_ann_date_sym ON ann (date, symbol);
CREATE INDEX idx_crsp_date_sym ON ann (date, symbol);
CREATE INDEX idx_dly_date_sym ON ann (date, symbol);
CREATE INDEX idx_qtr_date_sym ON ann (date, symbol);