代码之家 › 专栏 › 技术社区 › sastanin

固定空间线性时间下随机算法的迭代

profiling random haskell performance

sastanin · 技术社区 · 14 年前

我以前也问过类似的问题 once . 现在我要说得更具体些。目的是学习Haskell习语,用一元结果编写迭代算法。特别地,这对于实现各种随机化算法可能是有用的,例如遗传算法等。

关键点是随机更新一个元素(因此结果是 State StdGen 或其他单子):

type RMonad = State StdGen

-- An example of random iteration step: one-dimensional random walk.
randStep :: (Num a) => a -> RMonad a
randStep x = do
  rnd <- get
  let (goRight,rnd') = random rnd :: (Bool, StdGen)
  put rnd'
  if goRight
     then return (x+1)
     else return (x-1)

然后你需要更新很多元素,重复这个过程很多次。这里有个问题。因为每一步都是单子动作( :: a -> m a (Composing monad actions with folds) , seq 和 deepseq 帮助很多组成一元的行动。所以我做了:

-- Strict (?) iteration.
iterateM' :: (NFData a, Monad m) => Int -> (a -> m a) -> a -> m a
iterateM' 0 _ x = return $!! x
iterateM' n f x = (f $!! x) >>= iterateM' (n-1) f 

-- Deeply stict function application.
($!!) :: (NFData a) => (a -> b) -> a -> b
f $!! x = x `deepseq` f x

-- main seems to run in O(size*iters^2) time...
main :: IO ()
main = do
  (size:iters:_) <- liftM (map read) getArgs
  let start = take size $ repeat 0
  rnd <- getStdGen
  let end = flip evalState rnd $ iterateM' iters (mapM randStep) start
  putStr . unlines $ histogram "%.2g" end 13

当我测量完成这个程序所需的时间时,就迭代次数而言,它似乎与O(N^2)相似(内存分配似乎是可以接受的)。对于线性渐近线,此轮廓应平坦且恒定:

我假设这样一个程序应该以非常适度的内存需求运行,并且它所花费的时间应该与迭代次数成比例。我怎样才能在哈斯克尔做到这一点?

示例的完整可运行源代码是 here .

3 回复 | 直到 5 年前

Don Stewart 14 年前

需要考虑的事项:

使用 mersenne-random 发电机,通常是>比标准快100倍

import System.Random.Mersenne.Pure64

data R a = R !a {-# UNPACK #-}!PureMT

-- | The RMonad is just a specific instance of the State monad where the
--   state is just the PureMT PRNG state.
--
-- * Specialized to a known state type
--
newtype RMonad a = S { runState :: PureMT -> R a }

instance Monad RMonad where
    {-# INLINE return #-}
    return a = S $ \s -> R a s

    {-# INLINE (>>=) #-}
    m >>= k  = S $ \s -> case runState m s of
                                R a s' -> runState (k a) s'

    {-# INLINE (>>) #-}
    m >>  k  = S $ \s -> case runState m s of
                                R _ s' -> runState k s'

-- | Run function for the Rmonad.
runRmonad :: RMonad a -> PureMT -> R a
runRmonad (S m) s = m s

evalRmonad :: RMonad a -> PureMT -> a
evalRmonad r s = case runRmonad r s of R x _ -> x

-- An example of random iteration step: one-dimensional random walk.
randStep :: (Num a) => a -> RMonad a
randStep x = S $ \s -> case randomInt s of
                    (n, s') | n < 0     -> R (x+1) s'
                            | otherwise -> R (x-1) s'

http://hpaste.org/fastcgi/hpaste.fcgi/view?id=27414#a27414

在恒定空间中运行(与 [Double]

使用具有局部定义的专用状态单子也明显优于Control.monad.Strict。

alt text

请注意,它大约快了10倍,并且占用了1/5的空间。最大的问题是你的双打名单被分配。

monad-mersenne-random ,现在您的程序变成:

Using monad-mersenne-random

 {-# INLINE iterateM #-}
 iterateM n f x = go n x
     where
         go 0 !x = return x
         go n !x = f x >>= go (n-1)

总的来说,这使您的代码来自,K=500,N=30k

那就是, 快220倍 .

堆也更好了一点,现在iterateM取消了绑定。 alt text

sclv 14 年前

导入Control.Monad.State.Strict而不是Control.Monad.State可以显著提高性能。不知道你在寻找什么,但这可能会让你达到。

此外,通过交换iterateM和mapM,您可以提高性能,这样您就不必一直遍历列表,也不必保留列表的头,也不需要对列表进行deepseq,只需强制执行单个结果。即。:

let end = flip evalState rnd $ mapM (iterateM iters randStep) start

如果您这样做了,那么您也可以将iterateM更改为更加习惯用法:

iterateM 0 _ x = return x
iterateM n f !x = f x >>= iterateM (n-1) f

这当然需要bang模式语言扩展。

John L 14 年前

你定义

($!!) :: (NFData a) => (a -> b) -> a -> b
f $!! x = x `deepseq` f x

这将完全评估参数,但是函数结果根本不必评估。如果你想要 $!!

($!!) :: (NFData b) => (a -> b) -> a -> b
f $!! x = let y = f x in y `deepseq` y