The collections module is part of Python's standard library and provides alternatives to the general-purpose built-in containers dict, list, set, and tuple. Its specialized container datatypes add functionality and convenience for specific tasks. This guide covers the primary classes and functions available in the collections module, along with practical examples to illustrate their usage.
Overview of the collections Module
The collections module includes the following key classes and functions:
namedtuple(): Factory function for creating tuple subclasses with named fields.
deque: List-like container with fast appends and pops on either end.
ChainMap: Dictionary-like class for creating a single view of multiple mappings.
Counter: Dictionary subclass for counting hashable objects.
OrderedDict: Dictionary subclass that remembers the order entries were added.
defaultdict: Dictionary subclass that calls a factory function to supply missing values.
UserDict, UserList, UserString: Wrapper classes that make it easier to create custom dictionary, list, and string subclasses.
Importing the Module
Before using the collections module, you need to import it:
import collections
Using namedtuple()
The namedtuple() function returns a new tuple subclass with named fields. It is a convenient way to create simple, lightweight, immutable record types whose fields can be read by name as well as by position.
Example
import collections
# Create a Point namedtuple
Point = collections.namedtuple('Point', ['x', 'y'])
# Instantiate a Point object
p = Point(10, 20)
print(p) # Output: Point(x=10, y=20)
# Accessing fields
print(p.x) # Output: 10
print(p.y) # Output: 20
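Because a namedtuple is still a tuple, it supports unpacking and indexing, and its fields cannot be reassigned. The following is a minimal sketch of a few helper methods, reusing the Point type from the example above; the variable names are illustrative only.

```python
import collections

Point = collections.namedtuple('Point', ['x', 'y'])
p = Point(10, 20)

# Ordinary tuple behaviour still works: unpacking and indexing
x, y = p
first = p[0]

# _replace() returns a new instance; the original is left unchanged
moved = p._replace(x=15)

# _asdict() converts the fields to a mapping (a regular dict on Python 3.8+)
as_dict = p._asdict()

# Fields are read-only: assigning to one raises AttributeError
try:
    p.x = 99
    mutated = True
except AttributeError:
    mutated = False

print(moved)    # Point(x=15, y=20)
print(mutated)  # False
```

If you need mutable fields, a regular class or a dataclass is likely a better fit than a namedtuple.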
Using deque
The deque (double-ended queue) is a list-like container with fast appends and pops from both ends. It is useful for implementing queues and stacks.
Example
import collections
# Create a deque
d = collections.deque([1, 2, 3])
# Append to the right
d.append(4)
print(d) # Output: deque([1, 2, 3, 4])
# Append to the left
d.appendleft(0)
print(d) # Output: deque([0, 1, 2, 3, 4])
# Pop from the right
d.pop()
print(d) # Output: deque([0, 1, 2, 3])
# Pop from the left
d.popleft()
print(d) # Output: deque([1, 2, 3])
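Two further deque features are often useful in queue-style code: the optional maxlen argument, which bounds the size, and rotate(), which shifts items in place. The sketch below uses illustrative variable names and small made-up inputs.

```python
import collections

# A bounded deque keeps only the most recent `maxlen` items
recent = collections.deque(maxlen=3)
for n in range(1, 6):
    recent.append(n)  # once full, the oldest item is dropped from the left
print(recent)  # deque([3, 4, 5], maxlen=3)

# rotate(n) shifts items n steps to the right; a negative n shifts left
d = collections.deque([1, 2, 3, 4])
d.rotate(1)
print(d)  # deque([4, 1, 2, 3])
d.rotate(-2)
print(d)  # deque([2, 3, 4, 1])
```

A bounded deque is a simple way to keep a "last N items" history without trimming a list by hand.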
Using ChainMap
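The ChainMap class groups multiple dictionaries into a single view. Lookups search the mappings in order and return the first match, while writes and deletions affect only the first mapping. This can be useful for managing contexts or scopes.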
Example
import collections
# Create two dictionaries
dict1 = {'a': 1, 'b': 2}
dict2 = {'b': 3, 'c': 4}
# Create a ChainMap
chain = collections.ChainMap(dict1, dict2)
print(chain) # Output: ChainMap({'a': 1, 'b': 2}, {'b': 3, 'c': 4})
# Accessing values
print(chain['a']) # Output: 1
print(chain['b']) # Output: 2 (from the first dictionary)
# Modifying values
chain['b'] = 5
print(dict1) # Output: {'a': 1, 'b': 5}
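The scope-management use case can be sketched as layered settings, where an override mapping sits in front of a defaults mapping. The dictionary names and keys below are hypothetical and chosen only for illustration.

```python
import collections

defaults = {'color': 'blue', 'debug': False}  # hypothetical default settings
overrides = {'debug': True}                   # hypothetical user overrides

settings = collections.ChainMap(overrides, defaults)
print(settings['debug'])  # True (found in overrides first)
print(settings['color'])  # blue (falls through to defaults)

# new_child() puts a fresh mapping at the front, like entering a nested scope
local = settings.new_child({'color': 'red'})
print(local['color'])  # red

# Writes go to the first mapping only, so the outer mappings stay untouched
local['debug'] = False
print(overrides['debug'])  # True

# parents skips the first mapping, like leaving the nested scope
print(local.parents['color'])  # blue
```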
Using Counter
The Counter class is a dictionary subclass that counts hashable objects. It is useful for tallying occurrences of items.
Example
import collections
# Create a Counter
c = collections.Counter(['apple', 'banana', 'apple', 'orange', 'banana', 'banana'])
print(c) # Output: Counter({'banana': 3, 'apple': 2, 'orange': 1})
# Accessing counts
print(c['banana']) # Output: 3
print(c['apple']) # Output: 2
# Updating counts
c.update(['apple', 'apple', 'banana'])
print(c) # Output: Counter({'apple': 4, 'banana': 4, 'orange': 1})
# Getting the most common elements (items with equal counts keep first-seen order)
print(c.most_common(2)) # Output: [('apple', 4), ('banana', 4)]
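Counter also behaves like a multiset. The sketch below shows zero defaults for missing keys, arithmetic between counters, and elements(); the stock and sold counters are made-up data for illustration.

```python
import collections

stock = collections.Counter(apple=3, banana=1)  # made-up inventory
sold = collections.Counter(apple=1, banana=2)   # made-up sales

# Missing keys count as zero instead of raising KeyError
print(stock['cherry'])  # 0

# + sums the counts; - subtracts and keeps only positive results
print(stock + sold)  # Counter({'apple': 4, 'banana': 3})
print(stock - sold)  # Counter({'apple': 2})

# subtract() works in place and keeps zero and negative counts
remaining = stock.copy()
remaining.subtract(sold)
print(remaining['banana'])  # -1

# elements() repeats each key by its count
print(sorted(stock.elements()))  # ['apple', 'apple', 'apple', 'banana']
```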
Using OrderedDict
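The OrderedDict class is a dictionary subclass that remembers the order in which entries were added. Since Python 3.7 the built-in dict also preserves insertion order, so OrderedDict is mainly useful for its reordering methods, move_to_end() and popitem(last=False), and for its order-sensitive equality comparison.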
Example
import collections
# Create an OrderedDict
od = collections.OrderedDict()
od['a'] = 1
od['b'] = 2
od['c'] = 3
print(od) # Output: OrderedDict([('a', 1), ('b', 2), ('c', 3)])
# (Python 3.12+ prints the dict form instead: OrderedDict({'a': 1, 'b': 2, 'c': 3}))
# Accessing values
print(od['b']) # Output: 2
# Adding a new entry
od['d'] = 4
print(od) # Output: OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
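Beyond remembering insertion order, OrderedDict can reorder its entries and compares order when tested for equality. The following sketch illustrates both points with small made-up mappings.

```python
import collections

od = collections.OrderedDict(a=1, b=2, c=3)

# move_to_end() moves a key to the last position by default...
od.move_to_end('a')
print(list(od))  # ['b', 'c', 'a']

# ...or to the first position with last=False
od.move_to_end('c', last=False)
print(list(od))  # ['c', 'b', 'a']

# popitem(last=False) removes and returns the first entry
oldest = od.popitem(last=False)
print(oldest)  # ('c', 3)

# Two OrderedDicts are equal only if their order matches; plain dicts ignore order
x = collections.OrderedDict([('k1', 1), ('k2', 2)])
y = collections.OrderedDict([('k2', 2), ('k1', 1)])
print(x == y)              # False
print(dict(x) == dict(y))  # True
```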
Using defaultdict
The defaultdict class is a dictionary subclass that calls a factory function to supply missing values. This is useful when you want to avoid KeyError exceptions and have a default value created the first time a key is accessed.
Example
import collections
# Create a defaultdict with a default factory function
dd = collections.defaultdict(int)
dd['a'] += 1
dd['b'] += 2
print(dd) # Output: defaultdict(<class 'int'>, {'a': 1, 'b': 2})
# Create a defaultdict with a list as the default factory
dd_list = collections.defaultdict(list)
dd_list['a'].append(1)
dd_list['b'].append(2)
print(dd_list) # Output: defaultdict(<class 'list'>, {'a': [1], 'b': [2]})
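A common use of defaultdict(list) is grouping items by a key. The sketch below groups a made-up word list by first letter and also shows one caveat: indexing a missing key inserts it, while get() does not.

```python
import collections

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']  # made-up data

# Group words by first letter without checking whether the key exists
by_letter = collections.defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)
print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

# Caveat: indexing a missing key calls the factory and stores the result
print('z' in by_letter)  # False
by_letter['z']
print('z' in by_letter)  # True

# get() does not call the factory and does not insert anything
print(by_letter.get('q'))  # None
print('q' in by_letter)    # False
```

Use membership tests or get() when you only want to check for a key, so that the dictionary does not fill up with empty defaults.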
Using UserDict, UserList, UserString
These classes act as wrappers around dictionary, list, and string objects, making it easier to create custom container types by subclassing them.
Example
import collections
# Custom dictionary subclass
class MyDict(collections.UserDict):
    def __setitem__(self, key, value):
        print(f'Setting {key} to {value}')
        super().__setitem__(key, value)
md = MyDict()
md['a'] = 1 # Output: Setting a to 1
print(md) # Output: {'a': 1}
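One reason to subclass UserDict rather than dict directly is that UserDict routes its constructor and update() through __setitem__, whereas the built-in dict does not call an overridden __setitem__ from those methods. The sketch below compares the two; both class names are hypothetical.

```python
import collections

class LowerKeyDict(collections.UserDict):
    """Hypothetical example: stores every key in lowercase."""
    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)

class LowerKeyPlainDict(dict):
    """Same override, but on the built-in dict."""
    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)

# UserDict: the constructor and update() both go through __setitem__
user = LowerKeyDict({'Alpha': 1})
user.update({'BETA': 2})
print(user)  # {'alpha': 1, 'beta': 2}

# dict: the constructor and update() bypass the override
plain = LowerKeyPlainDict({'Alpha': 1})
plain.update({'BETA': 2})
print(plain)  # {'Alpha': 1, 'BETA': 2}

# Direct item assignment uses the override in both classes
plain['GAMMA'] = 3
print('gamma' in plain)  # True
```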
Practical Examples
Example 1: Counting Words in a Text
This example uses Counter to count the occurrences of each word in a text.
import collections
text = "the quick brown fox jumps over the lazy dog the quick brown fox"
words = text.split()
word_count = collections.Counter(words)
print(word_count)
# Output: Counter({'the': 3, 'quick': 2, 'brown': 2, 'fox': 2, 'jumps': 1, 'over': 1, 'lazy': 1, 'dog': 1})
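Real text usually contains mixed case and punctuation, which a plain split() treats as part of the word. A minimal sketch of one way to normalize the input first, assuming words consist of ASCII letters and apostrophes; the sample sentence and the regular expression are illustrative choices.

```python
import collections
import re

text = "The quick brown fox. The lazy dog! the end, the END."  # made-up sample

# Lowercase the text, then extract runs of letters and apostrophes
words = re.findall(r"[a-z']+", text.lower())
word_count = collections.Counter(words)

print(word_count['the'])          # 4
print(word_count['end'])          # 2
print(word_count.most_common(1))  # [('the', 4)]
```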
Example 2: Maintaining an Access Order
This example uses OrderedDict to keep track of the order in which keys are accessed, which is the basis of a least-recently-used (LRU) cache.
import collections
class LRUCache:
    def __init__(self, capacity: int):
        self.cache = collections.OrderedDict()
        self.capacity = capacity

    def get(self, key: int) -> int:
        if key not in self.cache:
            return -1
        self.cache.move_to_end(key)
        return self.cache[key]

    def put(self, key: int, value: int) -> None:
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)
# Example usage
lru_cache = LRUCache(2)
lru_cache.put(1, 1)
lru_cache.put(2, 2)
print(lru_cache.get(1)) # Output: 1
lru_cache.put(3, 3)
print(lru_cache.get(2)) # Output: -1 (removed due to capacity)
The collections module in Python provides a variety of specialized container datatypes that offer more functionality and ease of use compared to general-purpose built-in containers. By leveraging these classes and functions, you can write more efficient, readable, and maintainable code for a wide range of applications. Whether you need a dictionary that maintains insertion order, a counter for tallying occurrences, or a custom container type, the collections module has you covered.