NumPy, short for Numerical Python, is an open-source library fundamental to scientific computing in Python. With its powerful n-dimensional array object, it allows users to perform a variety of mathematical operations in an efficient and straightforward manner. This guide aims to provide a thorough understanding of NumPy, from its basics to advanced features.
Key Features
- n-dimensional array object: NumPy’s array object, ndarray, is more efficient and convenient than Python’s built-in lists for numerical work.
- Broadcasting: A powerful mechanism for performing arithmetic operations on arrays of different shapes.
- Tools for integrating C/C++ and Fortran code.
- Useful linear algebra, random number generation, and Fourier transform functions.
Installation
Installing NumPy is straightforward using package managers such as pip or conda. Using pip:
pip install numpy
Or, using conda:
conda install numpy
NumPy Arrays
Creating Arrays
NumPy arrays can be created from lists, tuples, or using built-in NumPy functions.
import numpy as np
# From a list
arr = np.array([1, 2, 3, 4])
print(arr)
# Using built-in functions
zeros_array = np.zeros((2, 3))
ones_array = np.ones((2, 3))
arange_array = np.arange(10)
linspace_array = np.linspace(0, 1, 5)
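As a quick check of what these constructors return, the sketch below inspects a few shapes and values. The name tuple_array is illustrative and not part of the original example.

```python
import numpy as np

tuple_array = np.array((1, 2, 3))       # tuples work the same way as lists
zeros_array = np.zeros((2, 3))          # 2 rows, 3 columns, float64 by default
arange_array = np.arange(10)            # integers 0..9 (the stop value is excluded)
linspace_array = np.linspace(0, 1, 5)   # 5 evenly spaced points, endpoint included

print(tuple_array.tolist())     # [1, 2, 3]
print(zeros_array.shape)        # (2, 3)
print(zeros_array.dtype)        # float64
print(arange_array.tolist())    # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(linspace_array.tolist())  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Note the asymmetry: np.arange leaves out its stop value, while np.linspace includes the endpoint by default.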
Array Attributes
NumPy arrays come with a variety of useful attributes.
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.ndim) # Number of dimensions
print(arr.shape) # Shape of the array
print(arr.size) # Total number of elements
print(arr.dtype) # Data type of the elements
print(arr.itemsize) # Size in bytes of each element
print(arr.nbytes) # Total size in bytes
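The values these attributes report depend on the element type, and the default integer type can vary by platform and NumPy version. The sketch below pins the dtype to int64 (an assumption made only so the numbers are predictable) and shows how the attributes relate to one another.

```python
import numpy as np

# dtype is set explicitly so itemsize and nbytes do not depend on the platform
arr = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

print(arr.ndim)      # 2
print(arr.shape)     # (2, 3)
print(arr.size)      # 6
print(arr.dtype)     # int64
print(arr.itemsize)  # 8
print(arr.nbytes)    # 48

# nbytes is always the element count times the per-element size
assert arr.nbytes == arr.size * arr.itemsize
```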
Indexing and Slicing
Just like Python lists, NumPy arrays can be indexed and sliced.
arr = np.array([1, 2, 3, 4, 5])
print(arr[0]) # First element
print(arr[-1]) # Last element
print(arr[1:4]) # Elements at indices 1 through 3 (the stop index 4 is excluded)
For multi-dimensional arrays, indexing and slicing work similarly but with multiple indices.
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr[0, 0]) # Element at first row, first column
print(arr[:, 1]) # All rows, second column
print(arr[1, :]) # Second row, all columns
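One place where arrays differ from lists is that a basic slice is a view onto the same data rather than a copy. The sketch below illustrates that behavior, along with selection by a boolean mask; the variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

view = arr[1:4]          # basic slicing returns a view, not a copy
view[0] = 99             # writing through the view changes arr as well
print(arr.tolist())      # [1, 99, 3, 4, 5]

independent = arr[1:4].copy()   # an explicit copy is detached from arr
independent[0] = -1
print(arr[1])            # 99

mask = arr > 3           # boolean array, True where the condition holds
print(arr[mask].tolist())  # [99, 4, 5]
```

If a slice needs to be modified without touching the original array, calling .copy() on it is the usual approach.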
Operations on Arrays
NumPy supports element-wise operations and broadcasting.
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
print(arr1 + arr2) # Element-wise addition
print(arr1 - arr2) # Element-wise subtraction
print(arr1 * arr2) # Element-wise multiplication
print(arr1 / arr2) # Element-wise division
print(arr1 ** 2) # Element-wise exponentiation
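To make the contrast with built-in lists concrete, here is a small sketch using the same values. The names list1 and list2 are introduced only for the comparison.

```python
import numpy as np

list1, list2 = [1, 2, 3], [4, 5, 6]
arr1, arr2 = np.array(list1), np.array(list2)

# For lists, + concatenates; for arrays, it adds element by element
print(list1 + list2)           # [1, 2, 3, 4, 5, 6]
print((arr1 + arr2).tolist())  # [5, 7, 9]
print((arr1 * arr2).tolist())  # [4, 10, 18]

# The / operator is true division, so integer inputs produce a float result
print((arr1 / arr2).dtype)     # float64
```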
Universal Functions (ufuncs)
Universal functions, or ufuncs, are functions that operate element-wise on arrays. They are a key feature of NumPy for vectorized operations.
arr = np.array([1, 2, 3, 4, 5])
print(np.sqrt(arr)) # Square root
print(np.exp(arr)) # Exponential
print(np.sin(arr)) # Sine
print(np.log(arr)) # Natural logarithm
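Beyond being called directly, binary ufuncs such as np.add also carry methods like reduce and accumulate, and ufuncs accept an out argument for writing into an existing array. The following is a minimal sketch of those features; the buffer name is illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# reduce collapses the array with the ufunc; accumulate keeps running results
print(np.add.reduce(arr))               # 15
print(np.add.accumulate(arr).tolist())  # [1, 3, 6, 10, 15]

# out= writes the result into a preallocated array instead of creating a new one
buffer = np.empty(arr.shape, dtype=np.float64)
np.sqrt(arr, out=buffer)

# log undoes exp up to floating-point rounding
roundtrip_ok = np.allclose(np.log(np.exp(arr)), arr)
print(roundtrip_ok)                     # True
```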
Array Manipulation
NumPy provides functions for reshaping, stacking, and splitting arrays.
Reshaping Arrays
arr = np.arange(12)
reshaped_arr = arr.reshape((3, 4)) # Reshape to 3x4 array
print(reshaped_arr)
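Two details of reshape are worth a sketch: one dimension can be given as -1 so NumPy infers it, and the total number of elements must stay the same. The variable names below are illustrative.

```python
import numpy as np

arr = np.arange(12)

inferred = arr.reshape(3, -1)   # -1 is inferred as 12 / 3 = 4
print(inferred.shape)           # (3, 4)
print(inferred.T.shape)         # (4, 3), the transpose swaps the axes

flat = inferred.ravel()         # back to one dimension
print(flat.shape)               # (12,)

# 12 elements cannot be arranged into 5 rows, so NumPy raises ValueError
try:
    arr.reshape(5, -1)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)           # True
```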
Stacking and Splitting Arrays
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
# Stacking arrays
vstacked = np.vstack((arr1, arr2)) # Vertical stack
hstacked = np.hstack((arr1, arr2)) # Horizontal stack
print(vstacked)
print(hstacked)
# Splitting arrays
split_arr = np.hsplit(arr1, 2) # Horizontal split
print(split_arr)
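For two-dimensional inputs, vstack and hstack behave like np.concatenate along axis 0 and axis 1 respectively, and hsplit returns a list of column blocks. The sketch below checks those relationships on the same arrays.

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

vstacked = np.vstack((arr1, arr2))   # shape (4, 2)
hstacked = np.hstack((arr1, arr2))   # shape (2, 4)

# For 2-D arrays these match concatenation along rows and columns
same_rows = np.array_equal(vstacked, np.concatenate((arr1, arr2), axis=0))
same_cols = np.array_equal(hstacked, np.concatenate((arr1, arr2), axis=1))
print(same_rows, same_cols)          # True True

# hsplit into 2 pieces yields two (2, 1) column arrays
left, right = np.hsplit(arr1, 2)
print(left.tolist())                 # [[1], [3]]
print(right.tolist())                # [[2], [4]]

# Splitting a stacked array recovers the original pieces
top, bottom = np.vsplit(vstacked, 2)
```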
Linear Algebra with NumPy
NumPy includes functions for linear algebra operations.
arr = np.array([[1, 2], [3, 4]])
# Matrix multiplication
print(np.dot(arr, arr))
# Determinant
print(np.linalg.det(arr))
# Inverse
print(np.linalg.inv(arr))
# Eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(arr)
print(eigenvalues)
print(eigenvectors)
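Results like these are easiest to trust when checked against their defining identities. The sketch below does that for the same matrix, using the @ operator (equivalent to np.dot for 2-D arrays) and np.allclose to tolerate floating-point rounding. The right-hand side b is an arbitrary value chosen for illustration.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])

# A times its inverse should be the identity matrix
A_inv = np.linalg.inv(A)
inverse_ok = np.allclose(A @ A_inv, np.eye(2))

# Each eigenvector column v_j satisfies A v_j = lambda_j v_j
eigenvalues, eigenvectors = np.linalg.eig(A)
eig_ok = np.allclose(A @ eigenvectors, eigenvectors * eigenvalues)

# The determinant equals the product of the eigenvalues
det_ok = np.isclose(np.prod(eigenvalues), np.linalg.det(A))

# Solving A x = b directly is usually preferable to forming the inverse
b = np.array([5.0, 6.0])
x = np.linalg.solve(A, b)
solve_ok = np.allclose(A @ x, b)

print(inverse_ok, eig_ok, det_ok, solve_ok)  # True True True True
```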
Random Number Generation
NumPy’s random module provides functions for generating random numbers.
# Generate random numbers
rand_arr = np.random.rand(5) # Uniformly distributed over [0, 1)
randn_arr = np.random.randn(5) # Standard normal distribution
randint_arr = np.random.randint(1, 10, 5) # 5 random integers from 1 to 9 (the upper bound 10 is excluded)
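The functions above belong to NumPy's legacy random interface. Newer code often uses the Generator interface obtained from np.random.default_rng, which the following sketch illustrates. The seed value 42 is arbitrary and chosen only to make the run reproducible.

```python
import numpy as np

rng = np.random.default_rng(seed=42)     # seeded generator for reproducible draws

uniform = rng.random(5)                  # uniform over [0, 1)
normal = rng.standard_normal(5)          # standard normal distribution
integers = rng.integers(1, 10, size=5)   # integers from 1 to 9 (10 is excluded)

# A second generator with the same seed produces the same sequence
rng_again = np.random.default_rng(seed=42)
reproducible = np.array_equal(uniform, rng_again.random(5))
print(reproducible)                      # True
```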
Broadcasting
Broadcasting allows operations on arrays of different shapes.
arr = np.array([1, 2, 3])
scalar = 2
# Broadcasting a scalar
print(arr + scalar) # [3, 4, 5]
# Broadcasting with arrays
arr2 = np.array([[1], [2], [3]])
print(arr + arr2) # [[2, 3, 4], [3, 4, 5], [4, 5, 6]]
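Broadcasting compares shapes from the trailing dimension backward: two dimensions are compatible when they are equal or one of them is 1. The sketch below shows a compatible pair, a common use (subtracting per-column means), and an incompatible pair. The data array is made up for illustration.

```python
import numpy as np

row = np.array([1, 2, 3])          # shape (3,)
col = np.array([[1], [2], [3]])    # shape (3, 1)
print((row + col).shape)           # (3, 3)

# Centering columns: shape (3, 2) minus shape (2,) broadcasts across rows
data = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
centered = data - data.mean(axis=0)
print(centered.tolist())           # [[-1.0, -10.0], [0.0, 0.0], [1.0, 10.0]]

# Shapes (3,) and (2,) are incompatible: neither trailing dimension is 1
try:
    row + np.array([1, 2])
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
print(broadcast_failed)            # True
```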
File I/O with NumPy
NumPy can read from and write to files.
# Saving and loading arrays
arr = np.array([1, 2, 3, 4, 5])
np.save('array.npy', arr)
loaded_arr = np.load('array.npy')
print(loaded_arr)
# Saving and loading text files
np.savetxt('array.txt', arr)
loaded_txt_arr = np.loadtxt('array.txt')
print(loaded_txt_arr)
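Two details are easy to miss here: np.loadtxt returns floats unless told otherwise, and np.savez can store several arrays in one file. The sketch below shows both, writing into a temporary directory so nothing is left behind. The file and key names are illustrative.

```python
import os
import tempfile

import numpy as np

ints = np.array([1, 2, 3, 4, 5])
floats = np.linspace(0.0, 1.0, 3)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Text round trip: fmt="%d" writes integers, dtype=int reads them back as integers
    txt_path = os.path.join(tmp_dir, "array.txt")
    np.savetxt(txt_path, ints, fmt="%d")
    default_load = np.loadtxt(txt_path)          # float64 by default
    int_load = np.loadtxt(txt_path, dtype=int)

    # Several arrays in one .npz archive, retrieved by keyword name
    npz_path = os.path.join(tmp_dir, "arrays.npz")
    np.savez(npz_path, ints=ints, floats=floats)
    with np.load(npz_path) as archive:
        restored_ints = archive["ints"]
        restored_floats = archive["floats"]

print(default_load.dtype)      # float64
print(int_load.tolist())       # [1, 2, 3, 4, 5]
print(restored_ints.tolist())  # [1, 2, 3, 4, 5]
```

The binary .npy and .npz formats preserve dtype and shape exactly, whereas text files are human-readable but need the dtype restated on load.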
Practical Applications
NumPy is used in a wide range of applications in science and industry.
Data Analysis
NumPy forms the foundation for data analysis libraries like pandas.
import pandas as pd
data = np.random.rand(100, 3)
df = pd.DataFrame(data, columns=['A', 'B', 'C'])
print(df.describe())
Machine Learning
NumPy is integral to machine learning libraries like scikit-learn and TensorFlow.
from sklearn.linear_model import LinearRegression
# Example dataset
X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
y = np.dot(X, np.array([1, 2])) + 3
# Fit model
model = LinearRegression().fit(X, y)
print(model.coef_)
print(model.intercept_)
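To see how such a fit rests on NumPy, the same coefficients can be recovered with np.linalg.lstsq alone. This is a minimal sketch of ordinary least squares on the dataset above, not a description of how scikit-learn is implemented internally. The names X_aug, params, coef, and intercept are illustrative.

```python
import numpy as np

X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]], dtype=float)
y = X @ np.array([1.0, 2.0]) + 3.0     # y = 1*x0 + 2*x1 + 3

# Append a column of ones so the intercept is fitted as an extra coefficient
X_aug = np.column_stack([X, np.ones(len(X))])

# Least-squares solution of X_aug @ params = y
params, residuals, rank, singular_values = np.linalg.lstsq(X_aug, y, rcond=None)
coef, intercept = params[:2], params[2]

print(np.round(coef, 6))        # approximately [1, 2]
print(round(float(intercept), 6))  # approximately 3
```

Because the data were generated from an exact linear rule, the fit recovers the coefficients 1 and 2 and the intercept 3 up to rounding.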
Scientific Computing
NumPy is used extensively in scientific computing for simulations and modeling.
from scipy.integrate import odeint
# Example: Simple harmonic oscillator
def model(y, t):
    # y[0] is position, y[1] is velocity; returns their time derivatives
    return [y[1], -y[0]]
y0 = [1.0, 0.0]
t = np.linspace(0, 10, 100)
sol = odeint(model, y0, t)
import matplotlib.pyplot as plt
plt.plot(t, sol[:, 0])
plt.xlabel('time')
plt.ylabel('y(t)')
plt.show()
NumPy is a powerful library that is essential for numerical and scientific computing in Python. Its efficient handling of large arrays, comprehensive mathematical functions, and ease of integration with other scientific libraries make it a cornerstone of the Python scientific ecosystem.