This is where itertools.groupby may come in handy:
from itertools import groupby
a = ["a", "a", "b", "b", "a", "a", "c", "c"]
res = [key for key, _group in groupby(a)]
print(res) # ['a', 'b', 'a', 'c']
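(Worth noting: groupby only merges runs of adjacent equal elements, which is why the second run of 'a' survives.) If you need this in more than one place, it could be wrapped in a small helper; the name dedupe_consecutive below is just illustrative:

from itertools import groupby

def dedupe_consecutive(items):
    # keep one representative per run of adjacent equal elements
    return [key for key, _group in groupby(items)]

print(dedupe_consecutive(["a", "a", "b", "b", "a", "a", "c", "c"]))
# ['a', 'b', 'a', 'c']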
Here is a version where you can 'scale' down the size of each group of duplicates (but are guaranteed to have at least one element per group in the result):
from itertools import groupby, repeat, chain
a = ['a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'c', 'c', 'a', 'a',
     'd', 'd', 'a', 'a']
scale = 0.4
key_count = tuple((key, sum(1 for _item in group)) for key, group in groupby(a))
# (('a', 4), ('b', 2), ('c', 5), ('a', 2), ('d', 2), ('a', 2))
res = tuple(
    chain.from_iterable(
        repeat(key, round(scale * count) or 1) for key, count in key_count
    )
)
# ('a', 'a', 'b', 'c', 'c', 'a', 'd', 'a')
There may be smarter ways to determine the scale (probably based on the length of the input list a and the average group length).
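For example, one such heuristic (purely a sketch, not part of the original answer) is to pick the scale so that the average run shrinks toward some target length; auto_scale and target_group_len below are hypothetical names:

from itertools import chain, groupby, repeat

def auto_scale(items, target_group_len=2):
    # hypothetical heuristic: choose scale so the average run is shrunk
    # toward target_group_len elements (assumes a non-empty input list)
    key_count = [(key, sum(1 for _ in group)) for key, group in groupby(items)]
    avg_group_len = sum(count for _, count in key_count) / len(key_count)
    scale = target_group_len / avg_group_len
    return list(
        chain.from_iterable(
            repeat(key, round(scale * count) or 1) for key, count in key_count
        )
    )

a = ['a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'c', 'c', 'a', 'a',
     'd', 'd', 'a', 'a']
print(auto_scale(a))
# ['a', 'a', 'a', 'b', 'c', 'c', 'c', 'c', 'a', 'd', 'a']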