NumPy is the core library for numerical computing in Python. It provides an N-dimensional array type, matrix operations, and a large collection of mathematical functions. This guide moves from the basics to more advanced topics, with examples, best practices, and common coding patterns.

Basics of NumPy

1. Installation

NumPy is not part of the Python standard library, so you need to install it first, for example with pip:

pip install numpy
Importing NumPy

By convention, NumPy is imported under the alias np:

import numpy as np

2. NumPy Arrays

Creating Arrays

You can create a NumPy array from a Python list (or nested lists) with the np.array function:

import numpy as np

# Creating a 1D array
arr1 = np.array([1, 2, 3, 4, 5])

# Creating a 2D array
arr2 = np.array([[1, 2, 3], [4, 5, 6]])

print("1D Array:", arr1)
print("2D Array:", arr2)
Output:
1D Array: [1 2 3 4 5]
2D Array: [[1 2 3]
 [4 5 6]]
Array Attributes

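NumPy arrays expose several attributes that describe them: shape, number of dimensions, element type, and total number of elements. The default integer type is platform dependent, so dtype may print int32 rather than int64 on some systems (for example, Windows with NumPy versions before 2.0).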

print("Shape of arr1:", arr1.shape)
print("Shape of arr2:", arr2.shape)
print("Number of dimensions of arr2:", arr2.ndim)
print("Data type of arr2:", arr2.dtype)
print("Size of arr2:", arr2.size)
Output:
Shape of arr1: (5,)
Shape of arr2: (2, 3)
Number of dimensions of arr2: 2
Data type of arr2: int64
Size of arr2: 6
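
Two further attributes, itemsize and nbytes, help when reasoning about memory. The following is a minimal sketch, assuming an explicit dtype is passed so that the result does not depend on the platform default; the array name small is only illustrative.

```python
import numpy as np

# Request a specific dtype instead of relying on the platform default
small = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int16)

print("Data type:", small.dtype)             # int16
print("Bytes per element:", small.itemsize)  # 2
print("Total bytes:", small.nbytes)          # 12 (6 elements * 2 bytes each)
```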

3. Array Initialization

NumPy provides several functions to create arrays:

# Array of zeros
zeros = np.zeros((2, 3))

# Array of ones
ones = np.ones((2, 3))

# Array with random values drawn uniformly from [0, 1)
rand = np.random.rand(2, 3)

# Array with integers from 0 up to (but not including) 10
range_arr = np.arange(10)

# Array with 10 values spaced evenly on a log scale from 10**1 to 10**2
logspace_arr = np.logspace(1, 2, 10)

print("Zeros:\n", zeros)
print("Ones:\n", ones)
print("Random values:\n", rand)
print("Range array:\n", range_arr)
print("Logspace array:\n", logspace_arr)
Output (the random values will differ on each run):
Zeros:
 [[0. 0. 0.]
 [0. 0. 0.]]
Ones:
 [[1. 1. 1.]
 [1. 1. 1.]]
Random values:
 [[0.14279797 0.1727033  0.73141483]
 [0.58674656 0.38317635 0.20128951]]
Range array:
 [0 1 2 3 4 5 6 7 8 9]
Logspace array:
 [ 10.          12.91549665  16.68100537  21.5443469   27.82559402
  35.93813664  46.41588834  59.94842503  77.42636827 100.        ]
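
A few other constructors are often used alongside these. The sketch below is an illustration rather than a complete list: it shows np.linspace, np.full, and np.eye, plus np.random.default_rng, which can be seeded for reproducible results. The variable names and the seed value 42 are arbitrary choices made for this example.

```python
import numpy as np

# Five evenly spaced values from 0 to 1, both endpoints included
lin = np.linspace(0.0, 1.0, 5)

# 2x3 array filled with a constant value
sevens = np.full((2, 3), 7)

# 3x3 identity matrix
identity = np.eye(3)

# Seeded generator: the same seed reproduces the same values
rng = np.random.default_rng(seed=42)
samples = rng.random((2, 3))  # uniform values in [0, 1)

print("Linspace:", lin)
print("Full:\n", sevens)
print("Identity:\n", identity)
print("Samples shape:", samples.shape)
```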

4. Basic Operations

Arithmetic Operations

Arithmetic operators act element-wise on NumPy arrays of the same shape. Note that / always performs true division and returns floats, even for integer inputs.

arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([10, 20, 30, 40, 50])

# Addition
add = arr1 + arr2

# Subtraction
sub = arr1 - arr2

# Multiplication
mul = arr1 * arr2

# Division
div = arr1 / arr2

print("Addition:", add)
print("Subtraction:", sub)
print("Multiplication:", mul)
print("Division:", div)
Output:
Addition: [11 22 33 44 55]
Subtraction: [ -9 -18 -27 -36 -45]
Multiplication: [ 10  40  90 160 250]
Division: [0.1 0.1 0.1 0.1 0.1]
Statistical Operations

NumPy provides functions for common summary statistics. Note that np.std computes the population standard deviation by default (ddof=0).

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Minimum and Maximum
min_val = np.min(arr)
max_val = np.max(arr)

# Mean
mean_val = np.mean(arr)

# Standard Deviation
std_val = np.std(arr)

# Sum
sum_val = np.sum(arr)

print("Min:", min_val)
print("Max:", max_val)
print("Mean:", mean_val)
print("Standard Deviation:", std_val)
print("Sum:", sum_val)
Output:
Min: 1
Max: 10
Mean: 5.5
Standard Deviation: 2.8722813232690143
Sum: 55
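
The same functions accept an axis argument for 2D data, and np.std accepts ddof to switch from the population to the sample standard deviation. A minimal sketch, using a small illustrative array named table:

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

# axis=0 collapses the rows, giving one value per column
col_means = table.mean(axis=0)   # [2.5 3.5 4.5]

# axis=1 collapses the columns, giving one value per row
row_sums = table.sum(axis=1)     # [ 6 15]

arr = np.arange(1, 11)
pop_std = np.std(arr)              # divides by n (ddof=0, the default)
sample_std = np.std(arr, ddof=1)   # divides by n - 1

print("Column means:", col_means)
print("Row sums:", row_sums)
print("Population std:", pop_std)
print("Sample std:", sample_std)
```

The sample standard deviation is always slightly larger than the population value for the same data, because it divides by n - 1 instead of n.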

Advanced NumPy

1. Broadcasting

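Broadcasting lets NumPy perform element-wise operations on arrays of different but compatible shapes: dimensions of size 1 are stretched to match the other array, without copying data. In the example below, an array of shape (3,) and an array of shape (3, 1) combine into a (3, 3) result.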

arr1 = np.array([1, 2, 3])
arr2 = np.array([[1], [2], [3]])

# Broadcasting example
result = arr1 + arr2

print("Broadcasting Result:\n", result)
Output:
Broadcasting Result:
 [[2 3 4]
 [3 4 5]
 [4 5 6]]
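
The compatibility rule can be checked directly. The sketch below assumes NumPy 1.20 or later, where np.broadcast_shapes is available; the data array is illustrative. It also shows a common use (centering each column) and what happens when shapes are incompatible.

```python
import numpy as np

# Shapes are compared from the trailing dimension backwards; two
# dimensions are compatible if they are equal or one of them is 1.
print(np.broadcast_shapes((3,), (3, 1)))   # (3, 3)

data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])
col_means = data.mean(axis=0)    # shape (3,)
centered = data - col_means      # (2, 3) - (3,) -> (2, 3)
print("Centered:\n", centered)

# Incompatible shapes raise ValueError instead of silently misaligning
try:
    np.ones((2, 3)) + np.ones(2)
except ValueError as exc:
    print("Broadcast error:", exc)
```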

2. Reshaping Arrays

You can change the shape of an array with the reshape method, provided the total number of elements stays the same.

arr = np.arange(1, 13)

# Reshape to 3x4 array
reshaped_arr = arr.reshape((3, 4))

print("Original Array:", arr)
print("Reshaped Array:\n", reshaped_arr)
Output:
Original Array: [ 1  2  3  4  5  6  7  8  9 10 11 12]
Reshaped Array:
 [[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
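
Two details of reshape are worth knowing: a dimension given as -1 is inferred from the array size, and reshaping a contiguous array returns a view that shares memory with the original. A minimal sketch (the names inferred and view are illustrative):

```python
import numpy as np

arr = np.arange(1, 13)

# -1 asks NumPy to infer that dimension from the total size
inferred = arr.reshape(2, -1)
print(inferred.shape)            # (2, 6)

# Reshaping a contiguous array returns a view, so writes are shared
view = arr.reshape(3, 4)
view[0, 0] = 100
print(arr[0])                    # 100

# A shape with the wrong number of elements raises ValueError
try:
    arr.reshape(5, 3)
except ValueError as exc:
    print("Reshape error:", exc)
```

If you need an independent array, call .copy() on the reshaped result.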

3. Stacking and Splitting Arrays

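np.hstack and np.vstack join arrays horizontally (column-wise) and vertically (row-wise); np.hsplit and np.vsplit divide an array into equal parts along the same directions.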

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# Horizontal Stack
hstacked = np.hstack((arr1, arr2))

# Vertical Stack
vstacked = np.vstack((arr1, arr2))

# Split arrays
hsplit_arr = np.hsplit(hstacked, 2)
vsplit_arr = np.vsplit(vstacked, 2)

print("Horizontally Stacked:\n", hstacked)
print("Vertically Stacked:\n", vstacked)
print("Horizontally Split:", hsplit_arr)
print("Vertically Split:", vsplit_arr)
Output:
Horizontally Stacked:
 [[1 2 5 6]
 [3 4 7 8]]
Vertically Stacked:
 [[1 2]
 [3 4]
 [5 6]
 [7 8]]
Horizontally Split: [array([[1, 2],
       [3, 4]]), array([[5, 6],
       [7, 8]])]
Vertically Split: [array([[1, 2],
       [3, 4]]), array([[5, 6],
       [7, 8]])]
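
The more general functions behind these helpers take an explicit axis. The following sketch shows np.concatenate, np.stack, and np.array_split; the names rows, cols, stacked, and parts are illustrative.

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# For 2D inputs, axis=0 matches vstack and axis=1 matches hstack
rows = np.concatenate((arr1, arr2), axis=0)   # shape (4, 2)
cols = np.concatenate((arr1, arr2), axis=1)   # shape (2, 4)

# np.stack joins the inputs along a new axis
stacked = np.stack((arr1, arr2))              # shape (2, 2, 2)

# array_split tolerates uneven divisions, whereas np.split raises an error
parts = np.array_split(np.arange(7), 3)       # sizes 3, 2, 2

print(rows.shape, cols.shape, stacked.shape)
print([len(p) for p in parts])
```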

4. Linear Algebra Operations

The numpy.linalg module provides linear algebra routines such as matrix inversion and eigendecomposition.

from numpy.linalg import inv, eig

matrix = np.array([[1, 2], [3, 4]])

# Matrix Inverse
inv_matrix = inv(matrix)

# Eigenvalues and eigenvectors (column i of eigenvectors pairs with eigenvalues[i])
eigenvalues, eigenvectors = eig(matrix)

print("Original Matrix:\n", matrix)
print("Inverse Matrix:\n", inv_matrix)
print("Eigenvalues:", eigenvalues)
print("Eigenvectors:\n", eigenvectors)
Output:
Original Matrix:
 [[1 2]
 [3 4]]
Inverse Matrix:
 [[-2.   1. ]
 [ 1.5 -0.5]]
Eigenvalues: [-0.37228132  5.37228132]
Eigenvectors:
 [[-0.82456484 -0.41597356]
 [ 0.56576746 -0.90937671]]
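
These results can be verified numerically. The sketch below checks the inverse and the eigenpairs with np.allclose, and uses np.linalg.solve for a linear system, which is generally preferable to forming the inverse explicitly. The right-hand side b is an arbitrary vector chosen for illustration.

```python
import numpy as np

matrix = np.array([[1, 2], [3, 4]])
b = np.array([5, 6])

# Solve matrix @ x = b directly instead of computing inv(matrix) @ b
x = np.linalg.solve(matrix, b)   # approximately [-4.0, 4.5]

# The @ operator performs matrix multiplication
inverse_ok = np.allclose(matrix @ np.linalg.inv(matrix), np.eye(2))

# Each eigenpair satisfies matrix @ v = lambda * v (v is a column)
eigenvalues, eigenvectors = np.linalg.eig(matrix)
eig_ok = np.allclose(matrix @ eigenvectors, eigenvectors * eigenvalues)

print("Solution:", x)
print("Inverse check:", inverse_ok)
print("Eigenpair check:", eig_ok)
```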

Best Practices

  • Use Vectorized Operations: Prefer NumPy’s built-in functions and array expressions over explicit Python loops; they run in compiled code and are usually much faster.
  • Memory Management: Be mindful of array sizes and data types to optimize memory usage; for example, float32 uses half the memory of float64.
  • Avoid Unnecessary Copies: Use views (such as slices) instead of copies where possible to save memory and time, keeping in mind that writing to a view also modifies the original array.
  • Use Broadcasting: Leverage broadcasting to perform operations on arrays of different shapes without needing to replicate data.
  • Follow PEP 8: Write clean and readable code by following the Python PEP 8 style guide.
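
Several of these practices can be illustrated in a few lines. This is a sketch with illustrative names (values, view, copy, as32), not a benchmark:

```python
import numpy as np

values = np.arange(1_000, dtype=np.float64)

# Vectorized: one array expression instead of a Python-level loop
squares_vec = values ** 2
squares_loop = np.array([v ** 2 for v in values])
print(np.array_equal(squares_vec, squares_loop))   # True

# Basic slicing returns a view; .copy() makes an independent array
view = values[:10]
copy = values[:10].copy()
print(np.shares_memory(values, view))   # True
print(np.shares_memory(values, copy))   # False

# A narrower dtype reduces memory use, at the cost of precision
as32 = values.astype(np.float32)
print(values.nbytes, as32.nbytes)       # 8000 4000
```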

Complete Example

Let’s walk through a complete example that combines several NumPy features to solve a problem:

Problem: Calculate the pairwise Euclidean distances between points in 2D space.

import numpy as np

# Define points in 2D space
points = np.array([[1, 2], [3, 4], [5, 6]])

# Calculate pairwise distances
# distance = sqrt((x2 - x1)^2 + (y2 - y1)^2)
diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]
squared_diff = diff ** 2
distances = np.sqrt(squared_diff.sum(axis=2))

print("Points:\n", points)
print("Pairwise Distances:\n", distances)
Output:
Points:
 [[1 2]
 [3 4]
 [5 6]]
Pairwise Distances:
 [[0.         2.82842712 5.65685425]
 [2.82842712 0.         2.82842712]
 [5.65685425 2.82842712 0.        ]]

Explanation

  • Define Points: We define a 2D array points of shape (3, 2), where each row is a point in 2D space.
  • Calculate Differences: Inserting new axes gives arrays of shape (3, 1, 2) and (1, 3, 2); broadcasting their difference yields a (3, 3, 2) array holding the coordinate differences for every pair of points.
  • Square Differences: We square the differences element-wise to prepare for the distance calculation.
  • Sum and Square Root: We sum the squared differences along the last axis (axis=2) and take the square root, which gives a symmetric (3, 3) distance matrix with zeros on the diagonal.
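
The intermediate shapes in these steps can be printed to confirm how broadcasting lines up the axes. The sketch below also cross-checks the result against np.linalg.norm applied along the last axis; the names a, b, and alt are introduced only for this illustration.

```python
import numpy as np

points = np.array([[1, 2], [3, 4], [5, 6]])

a = points[:, np.newaxis, :]   # shape (3, 1, 2)
b = points[np.newaxis, :, :]   # shape (1, 3, 2)
diff = a - b                   # broadcasts to shape (3, 3, 2)
distances = np.sqrt((diff ** 2).sum(axis=2))   # shape (3, 3)

print(a.shape, b.shape, diff.shape, distances.shape)

# Equivalent computation: Euclidean norm of each difference vector
alt = np.linalg.norm(diff, axis=2)
print(np.allclose(distances, alt))   # True
```

Note that the intermediate diff array grows with the square of the number of points, so this approach is best suited to moderately sized inputs.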

This example computes all pairwise distances without a single explicit Python loop, which is where much of NumPy’s efficiency in numerical code comes from.