查找最近活动
来自 PostgreSQL Wiki
跳转到导航跳转到搜索
偶尔会看到一个请求:给定一个表示更改的大型表(无论是错误评论、股票行情价格,还是任何其他内容),返回最近活动的一些受影响对象及其最新的更改。
显然,如果您正在跟踪每个对象的最后更改时间,那么有一些方法可以快速完成此操作,但代价是每次在更改表上插入数据时都要对对象行进行非 HOT 更新。本页介绍了如何从更改表本身高效地(以毫秒为单位)获取最近的更改。
这个具体的例子是为以下需求而写的:“给定一个(大型)表 prices(stock,price,updated),找出最近更新的 10 个股票中每个股票的 3 个最新价格”。假设存在适当的索引(特别是 (updated) 和 (stock,updated))。
此版本适用于 9.3 及更高版本
with recursive
t1 as ( (select stock, array[stock] as seen
from prices
order by updated desc limit 1)
union all
(select p.stock, t1.seen || p.stock
from t1,
LATERAL (select p1.stock from prices p1
where p1.stock <> all (t1.seen)
order by p1.updated desc limit 1) as p
where array_upper(t1.seen,1) < 10) )
select p.*
from t1,
LATERAL (select * from prices p2
where p2.stock=t1.stock
order by updated desc limit 3) p;
这是为 8.4 及更高版本编写的原始版本
with recursive
t1 as ( (select *, array[stock] as seen
from prices
order by updated desc limit 1)
union all
(select (p).*, s.seen || (p).stock
from (select (select p from prices p
where p.stock <> all (t.seen)
order by p.updated desc limit 1) as p,
t1.seen
from t1
where array_upper(t1.seen,1) < 10 offset 0) s
)
),
t2 as ( select stock, price, updated, 1 as n
from t1
union all
(select (p).stock, (p).price, (p).updated, s.n+1
from (select (select p from prices p
where p.stock=t2.stock and p.updated < t2.updated
order by p.updated desc limit 1) as p,
t2.n
from t2
where t2.n < 3 offset 0) s
)
)
select * from t2;
这里的想法是利用递归以最有效的方式获取单行,然后在达到所需数量时停止。假设源表比获取的行数大得多,而且不同股票的数量也比获取的数量大得多,或者至少有 10 种不同股票在表最小的部分有更新。
相当丑陋的子查询用法被证明是必要的,以防止在每个递归分支的最后迭代中完全扫描 prices 表。
以下变体适用于“返回每个主题的 5 个最新帖子”查询(但在 9.3+ 中,有一个更简单的 LATERAL 解决方案,无需递归)
with recursive
rp as (select (p).*, 1 as rcount
from (select (select p from posts p
where p.topic_id=t.topic_id
order by p.post_ts desc, p.post_id desc limit 1) as p
from topics t offset 0) s
where (p).post_id is not null
union all
select (p).*, s.rcount + 1
from (select (select p from posts p
where p.topic_id=rp.topic_id
and (p.post_ts,p.post_id) < (rp.post_ts,rp.post_id)
order by p.post_ts desc, p.post_id desc limit 1) as p,
rp.rcount
from rp
where rp.rcount < 5 offset 0) s
where (p).post_id is not null)
select * from rp;