efficient uniform shuffling
Spotify had a blog post about how to shuffle songs, which included a link to earlier work on the art of shuffling music. The original algorithm uses a lot of both memory and CPU (in particular, a playlist containing a lot of loosies will be extremely memory hungry as each song is expanded). I think I understand how to implement the Spotify “dithering” algorithm efficiently, but there’s no pudding.
Here’s working code for a another variant combining some ideas from both of the above. It shuffles in O(N) Space & Time and fits in an Ethernet frame.
code
local shuffler = { }
-- create a new shuffler
-- size should be the upper bound of the list size
-- overestimation will merely waste a little space
-- underestimation will result in poor results
function shuffler.new(size, spacing)
spacing = spacing or 8 -- more than seven, less than nine
local spl = { }
setmetatable(spl, { __index = shuffler })
spl.size = size * spacing
spl.items = { }
return spl
end
local function fyshuffle(items)
local j, t
for i = #items, 2, -1 do
j = math.random(1, i)
t = items[j]
items[j] = items[i]
items[i] = t
end
end
local function insert(spl, item, pos)
while true do
while pos > spl.size do
pos = pos - spl.size
end
if not spl.items[pos] then
spl.items[pos] = item
break
end
pos = pos + 1
end
end
-- add some items to the playlist
function shuffler.add(spl, items)
local gap = spl.size / #items
local start = math.random(1, spl.size)
local pos
fyshuffle(items)
for i = 1, #items do
pos = math.floor(start + i * gap +
math.random(math.floor(gap * 2 / 3)))
insert(spl, items[i], pos)
end
end
-- return the shuffled playlist
function shuffler.finish(spl)
local pl = { }
for i = 1, spl.size do
if spl.items[i] then
table.insert(pl, spl.items[i])
end
end
spl.items = { }
return pl
end
return shuffler
results
Test code:
local sh = require "shuffler"
local function split(s)
local r = { }
for w in string.gmatch(s, "%S+") do
table.insert(r, w)
end
return r
end
local function testgenres()
local spl = sh.new(31)
spl:add(split("A A A A A A A A A A"))
spl:add(split("B B B B B B B B B B B"))
spl:add(split("C C C C C C C C C C C"))
print(table.concat(spl:finish()))
end
local function testartists()
local spl = sh.new(11)
spl:add(split("W W W W"))
spl:add(split("X X"))
spl:add(split("B B B"))
spl:add(split("T"))
spl:add(split("J"))
print(table.concat(spl:finish()))
end
local function testrandom()
local spl = sh.new(200000)
for i = 1, 20000 do
spl:add({ string.format("s%d", i) })
end
local a
for i = 1, 8000 do
a = { }
for j = 1, math.random(8, 14) do
table.insert(a, string.format("a%dt%d", i, j))
end
spl:add(a)
end
print(#spl:finish())
end
for i = 1, 4 do
testgenres()
end
for i = 1, 4 do
testartists()
end
testrandom()
$ time lua51 test.lua
BACBACBACBCABABCCABCABCABACBCABC
CABACBCABCBACABCABCABCACBABCBCAB
BACACBABCCBABCABCABCACBABCABCABC
ABCCABBCACBABCACBACBACBACBCABBAC
WXWBWBTXWJB
WJWXBTWBXWB
XWBJTWBWXWB
JWXBWBXWWBT
107915
0m0.86s real 0m0.76s user 0m0.10s system
The results are generally aesthetically pleasing. It creates and shuffles a synthetic playlist of 20000 one hit wonders and 8000 albums (which would be approximately 325 GB) in less than one second.