NumPy

Foundation library for Python scientific computing. It provides multi-dimensional arrays, linear algebra, Fourier transforms, and random number generation. Nearly every Python scientific computing library depends on it, and its compiled C core gives array operations near-C speed.

Python, Scientific Computing, Numerical Computing, Multi-dimensional Arrays, Mathematics

GitHub Overview

numpy/numpy

The fundamental package for scientific computing with Python.

Stars: 29,929
Watchers: 597
Forks: 11,085
Created: September 13, 2010
Language: Python
License: Other

Topics

numpy, python

Data as of: 7/16/2025, 11:10 AM

Framework

NumPy

Overview

NumPy is the foundational library for scientific computing in Python.

Details

NumPy (Numerical Python) was created by Travis Oliphant in 2005, building on the earlier Numeric and Numarray packages, and version 1.0 was released in 2006. It serves as the core library for scientific computing in Python, providing multi-dimensional array manipulation and numerical computation, and it underpins virtually all Python data science libraries. Centered on a fast N-dimensional array object (ndarray), it offers efficient numerical operations, linear algebra, Fourier transforms, random number generation, and other features implemented in C.

Broadcasting allows arithmetic between arrays of different but compatible shapes without explicit loops, and vectorized operations are often tens of times faster than equivalent pure Python loops. Major data science and machine learning libraries such as Pandas and scikit-learn are built on NumPy, and frameworks such as TensorFlow and PyTorch interoperate closely with NumPy arrays, which makes it a cornerstone of Python's data science ecosystem. It is used across a wide range of fields, including numerical analysis, scientific simulation, image processing, and machine learning preprocessing.
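
As a small illustration of the broadcasting and vectorization described above, the sketch below centers each column of a matrix without writing a loop. The `samples` data is made up for the example.

```python
import numpy as np

# Hypothetical data: 4 samples (rows) with 3 features (columns)
samples = np.array([[1.0, 10.0, 100.0],
                    [2.0, 20.0, 200.0],
                    [3.0, 30.0, 300.0],
                    [4.0, 40.0, 400.0]])

# Column means have shape (3,); broadcasting stretches them across all 4 rows
col_means = samples.mean(axis=0)
centered = samples - col_means

print(f"Column means: {col_means}")
print(f"Centered data:\n{centered}")
print(f"Column means after centering: {centered.mean(axis=0)}")
```

The subtraction works because the trailing dimension of `samples` (3) matches the length of `col_means`, so no explicit loop or tiling is needed.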

Pros and Cons

Pros

  • High Performance: Array operations run in compiled C code, often tens of times faster than equivalent pure Python loops
  • Memory Efficiency: Elements are stored in a contiguous, typed buffer, which keeps memory use compact and cache-friendly
  • Broadcasting: Natural mathematical operations between arrays of different shapes
  • Rich Functionality: Linear algebra, Fourier transforms, statistical functions, and more
  • Ecosystem Core: High compatibility with other libraries
  • Stability: High reliability through long-term development
  • Open Source: Free to use under an open source (BSD-style) license
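
The memory efficiency point can be checked roughly with the sketch below, which compares the footprint of a Python list of integers with an equivalent int64 array. The exact list size depends on the Python implementation and version, so treat the printed numbers as indicative only.

```python
import sys
import numpy as np

n = 100_000
py_list = list(range(n))
arr = np.arange(n, dtype=np.int64)

# A list stores pointers to separate int objects, so count both parts
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An ndarray stores the raw values back to back: n elements * 8 bytes each
array_bytes = arr.nbytes

print(f"List (container + int objects): {list_bytes:,} bytes")
print(f"ndarray data buffer:            {array_bytes:,} bytes")
```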

Cons

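  • Memory Constraints: Arrays are normally held entirely in memory; data larger than RAM needs memory mapping or another tool
  • Single Thread: Most operations run on a single thread; only routines backed by a multi-threaded BLAS/LAPACK build (such as matrix multiplication) may use several cores
  • Data Type Constraints: An array can only hold elements of a single data type
  • API Complexity: Steep learning curve for beginners because of the large number of features
  • No GPU Support: No native GPU acceleration support
  • String Processing: Limited support for strings and other non-numerical data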
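
The data type constraint has practical consequences that are easy to miss. The following sketch shows how NumPy coerces mixed input to a single dtype and how assigning a float into an integer array drops the fractional part.

```python
import numpy as np

# Mixing ints and floats upcasts every element to a common float dtype
mixed = np.array([1, 2.5, 3])
print(f"int + float input -> dtype {mixed.dtype}")

# Mixing numbers and strings converts everything to a string dtype
labels = np.array([1, "a"])
print(f"int + str input -> dtype kind {labels.dtype.kind!r}")

# Assigning a float into an integer array silently truncates it
ints = np.array([1, 2, 3])
ints[0] = 9.7
print(f"After assigning 9.7 to an int array: {ints}")
```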

Code Examples

Hello World

import numpy as np

# Check NumPy version
print(f"NumPy version: {np.__version__}")

# Create arrays
arr1 = np.array([1, 2, 3, 4, 5])
print(f"1D array: {arr1}")

# 2D array
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
print(f"2D array:\n{arr2}")

# Array information
print(f"Shape: {arr2.shape}")
print(f"Dimensions: {arr2.ndim}")
print(f"Data type: {arr2.dtype}")
print(f"Number of elements: {arr2.size}")

# Basic mathematical operations
result = arr1 * 2
print(f"Scalar multiplication: {result}")

# Array-to-array operations
arr3 = np.array([2, 2, 2, 2, 2])
print(f"Array addition: {arr1 + arr3}")
print(f"Array multiplication: {arr1 * arr3}")
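
A common source of confusion for newcomers is that the same operators behave differently on Python lists and on arrays. The short comparison below makes the difference explicit.

```python
import numpy as np

values = [1, 2, 3]
arr = np.array(values)

# On a list, * repeats and + concatenates
print(f"list * 2:     {values * 2}")
print(f"list + list:  {values + values}")

# On an ndarray, both operators work element by element
print(f"array * 2:     {arr * 2}")
print(f"array + array: {arr + arr}")
```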

Array Creation and Manipulation

import numpy as np

# Various ways to create arrays

# Initialize with zeros
zeros = np.zeros((3, 4))
print(f"Zero array:\n{zeros}")

# Initialize with ones
ones = np.ones((2, 3, 2))
print(f"\nOnes array shape: {ones.shape}")

# Initialize with specific value
filled = np.full((3, 3), 7)
print(f"\nArray filled with 7:\n{filled}")

# Array of consecutive values
range_arr = np.arange(0, 10, 2)
print(f"\nArithmetic sequence: {range_arr}")

# Evenly spaced array
linspace_arr = np.linspace(0, 1, 5)
print(f"Evenly spaced: {linspace_arr}")

# Identity matrix
identity = np.eye(4)
print(f"\nIdentity matrix:\n{identity}")

# Random array
np.random.seed(42)  # For reproducibility
random_arr = np.random.random((3, 3))
print(f"\nRandom array:\n{random_arr}")

# Normal distribution random numbers
normal_arr = np.random.normal(0, 1, (2, 3))
print(f"\nNormal distribution random:\n{normal_arr}")

# Array reshaping (the new shape must hold exactly the same number of elements)
reshaped = np.arange(6).reshape(2, 3)
print(f"\nAfter reshaping:\n{reshaped}")

# Array concatenation
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
concatenated = np.concatenate([arr1, arr2])
print(f"\nConcatenated: {concatenated}")

# Stack operations
stacked_v = np.vstack([arr1, arr2])  # Vertical
stacked_h = np.hstack([arr1, arr2])  # Horizontal
print(f"\nVertical stack:\n{stacked_v}")
print(f"Horizontal stack: {stacked_h}")
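
The random examples above seed the global legacy generator with np.random.seed. For new code, the NumPy documentation recommends the Generator API instead; a minimal sketch of the equivalent calls follows. The seed value 42 is arbitrary.

```python
import numpy as np

# A Generator keeps its own state instead of relying on a global seed
rng = np.random.default_rng(42)

uniform = rng.random((3, 3))                           # floats in [0, 1)
normal = rng.normal(loc=0.0, scale=1.0, size=(2, 3))   # mean 0, std 1
ints = rng.integers(0, 10, size=5)                     # upper bound is exclusive

print(f"Uniform:\n{uniform}")
print(f"Normal:\n{normal}")
print(f"Integers: {ints}")

# A second generator with the same seed reproduces the same stream
rng_again = np.random.default_rng(42)
same_uniform = rng_again.random((3, 3))
print(f"Reproducible: {np.array_equal(uniform, same_uniform)}")
```

Passing the generator object around explicitly keeps random state local to the code that uses it, which tends to make results easier to reproduce.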

Mathematical Operations and Statistics

import numpy as np

# Sample data
np.random.seed(42)
data = np.random.normal(50, 15, (5, 4))
print(f"Sample data:\n{data}")

# Basic statistics
print(f"\nMean: {np.mean(data):.2f}")
print(f"Median: {np.median(data):.2f}")
print(f"Standard deviation: {np.std(data):.2f}")
print(f"Variance: {np.var(data):.2f}")
print(f"Minimum: {np.min(data):.2f}")
print(f"Maximum: {np.max(data):.2f}")

# Axis-wise statistics
print(f"\nColumn-wise mean: {np.mean(data, axis=0)}")
print(f"Row-wise mean: {np.mean(data, axis=1)}")

# Mathematical functions
angles = np.array([0, np.pi/4, np.pi/2, np.pi])
print(f"\nAngles: {angles}")
print(f"Sin values: {np.sin(angles)}")
print(f"Cos values: {np.cos(angles)}")

# Exponential and logarithmic functions
x = np.array([1, 2, 3, 4, 5])
print(f"\nExponential: {np.exp(x)}")
print(f"Natural log: {np.log(x)}")
print(f"Base-10 log: {np.log10(x)}")

# Cumulative sum and product
print(f"\nCumulative sum: {np.cumsum(x)}")
print(f"Cumulative product: {np.cumprod(x)}")

# Rounding operations
float_data = np.array([1.23, 4.56, 7.89])
print(f"\nOriginal data: {float_data}")
print(f"Rounded: {np.round(float_data, 1)}")
print(f"Ceiling: {np.ceil(float_data)}")
print(f"Floor: {np.floor(float_data)}")
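
Two details of the statistics functions above are worth a sketch: np.std and np.var compute the population statistic by default (ddof=0), and keepdims=True keeps the reduced axis so the result broadcasts back against the input. The sample values are made up for illustration.

```python
import numpy as np

sample = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Default ddof=0 divides by N (population standard deviation)
pop_std = np.std(sample)
# ddof=1 divides by N - 1 (sample standard deviation)
sample_std = np.std(sample, ddof=1)
print(f"Population std: {pop_std:.4f}")
print(f"Sample std:     {sample_std:.4f}")

# keepdims=True keeps the reduced axis with length 1
table = np.arange(12.0).reshape(3, 4)
row_means = table.mean(axis=1, keepdims=True)   # shape (3, 1)
row_centered = table - row_means                # broadcasts over columns
print(f"Row means shape: {row_means.shape}")
print(f"Row-centered data:\n{row_centered}")
```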

Array Indexing and Slicing

import numpy as np

# Create 2D array
arr = np.arange(24).reshape(4, 6)
print(f"2D array:\n{arr}")

# Basic indexing
print(f"\nElement [0,0]: {arr[0, 0]}")
print(f"Element [2,3]: {arr[2, 3]}")
print(f"Last element: {arr[-1, -1]}")

# Row and column extraction
print(f"\nFirst row: {arr[0, :]}")
print(f"First column: {arr[:, 0]}")
print(f"Last row: {arr[-1, :]}")

# Slicing
print(f"\nFirst 2 rows: \n{arr[:2, :]}")
print(f"First 3 columns: \n{arr[:, :3]}")
print(f"Center portion: \n{arr[1:3, 2:5]}")

# Slicing with steps
print(f"\nEvery other row: \n{arr[::2, :]}")
print(f"Every other column: \n{arr[:, ::2]}")
print(f"Reversed: \n{arr[::-1, ::-1]}")

# Conditional indexing
mask = arr > 10
print(f"\nElements > 10: {arr[mask]}")

# Multiple conditions
complex_mask = (arr > 5) & (arr < 15)
print(f"Elements between 5 and 15: {arr[complex_mask]}")

# Using where function
result = np.where(arr > 10, arr, 0)  # Keep elements > 10, otherwise 0
print(f"\nWhere result:\n{result}")

# Fancy indexing
rows = np.array([0, 2, 3])
cols = np.array([1, 3, 5])
print(f"\nElements at specified positions: {arr[rows, cols]}")
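
One behavior the indexing examples above do not show is that basic slicing returns a view of the original data, while boolean and fancy indexing return copies. The sketch below demonstrates the difference.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# Basic slicing returns a view: writing to it changes the original
view = arr[:2, :2]
view[0, 0] = -1
print(f"Original after editing the slice:\n{arr}")
print(f"Slice shares memory: {np.shares_memory(arr, view)}")

# Boolean (and fancy) indexing returns a copy: the original is untouched
picked = arr[arr > 5]
picked[0] = 999
print(f"Original after editing the masked result:\n{arr}")
print(f"Masked result shares memory: {np.shares_memory(arr, picked)}")

# Use .copy() when an independent slice is needed
independent = arr[:2, :2].copy()
independent[0, 0] = 100
```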

Linear Algebra and Matrix Operations

import numpy as np

# Create matrices
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
B = np.array([[9, 8, 7],
              [6, 5, 4],
              [3, 2, 1]])

print(f"Matrix A:\n{A}")
print(f"\nMatrix B:\n{B}")

# Matrix addition and subtraction
print(f"\nA + B:\n{A + B}")
print(f"\nA - B:\n{A - B}")

# Element-wise multiplication
print(f"\nA * B (element-wise):\n{A * B}")

# Matrix multiplication (mathematical)
C = np.dot(A, B)
print(f"\nnp.dot(A, B):\n{C}")

# Matrix multiplication using @ operator
D = A @ B
print(f"\nA @ B:\n{D}")

# Transpose
print(f"\nTranspose of A:\n{A.T}")

# Determinant (A is singular, so the result is 0 up to floating-point error)
det_A = np.linalg.det(A)
print(f"\nDeterminant of A: {det_A:.6f}")

# Inverse matrix (only defined for invertible, i.e. non-singular, matrices)
regular_matrix = np.array([[1, 2], [3, 4]])
inv_matrix = np.linalg.inv(regular_matrix)
print(f"\nInvertible matrix:\n{regular_matrix}")
print(f"Inverse matrix:\n{inv_matrix}")

# Eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(regular_matrix)
print(f"\nEigenvalues: {eigenvalues}")
print(f"Eigenvectors:\n{eigenvectors}")

# Solving linear equations
# Ax = b form
coeff_matrix = np.array([[2, 1], [1, 3]])
constants = np.array([3, 4])
solution = np.linalg.solve(coeff_matrix, constants)
print(f"\nLinear equation solution: {solution}")

# Norms
vector = np.array([3, 4])
print(f"\nVector: {vector}")
print(f"L2 norm: {np.linalg.norm(vector)}")
print(f"L1 norm: {np.linalg.norm(vector, 1)}")
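
Because floating-point determinants are rarely exactly zero, it can help to check rank or condition number before inverting, and to verify a solution by substituting it back. A minimal sketch using the same matrices as above:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Rank below the matrix size means A is singular and has no inverse
rank = np.linalg.matrix_rank(A)
print(f"Rank of A: {rank} (size {A.shape[0]})")

# A very large condition number also signals a (near-)singular matrix
print(f"Condition number of A: {np.linalg.cond(A):.3e}")

# Solve 2x + y = 3, x + 3y = 4 and verify by substitution
coeff_matrix = np.array([[2.0, 1.0], [1.0, 3.0]])
constants = np.array([3.0, 4.0])
solution = np.linalg.solve(coeff_matrix, constants)
residual_ok = np.allclose(coeff_matrix @ solution, constants)
print(f"Solution: {solution}, verified: {residual_ok}")
```

Comparing with np.allclose rather than == allows for the small rounding errors that floating-point arithmetic introduces.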

Broadcasting and Vectorization

import numpy as np

# Broadcasting examples
print("Broadcasting examples:")

# Scalar and array
scalar = 5
array_1d = np.array([1, 2, 3, 4])
result1 = scalar * array_1d
print(f"Scalar * 1D array: {result1}")

# 1D array and 2D array
array_2d = np.array([[1, 2, 3, 4],
                     [5, 6, 7, 8]])
result2 = array_1d + array_2d
print(f"\n1D + 2D array:\n{result2}")

# Arrays of different shapes
array_col = np.array([[1], [2]])  # (2, 1)
array_row = np.array([10, 20, 30])  # (3,)
result3 = array_col + array_row
print(f"\nColumn vector + row vector:\n{result3}")

# Vectorization examples
print("\n\nVectorization examples:")

# Non-vectorized version (slow)
def slow_square(arr):
    result = []
    for x in arr:
        result.append(x**2)
    return np.array(result)

# Vectorized version (fast)
def fast_square(arr):
    return arr**2

# Larger input for a performance comparison (time both functions on it, e.g. with timeit)
large_array = np.arange(1000000)

# Check results
small_test = np.array([1, 2, 3, 4, 5])
print(f"Original array: {small_test}")
print(f"Non-vectorized: {slow_square(small_test)}")
print(f"Vectorized: {fast_square(small_test)}")

# Vectorizing custom functions
def custom_function(x):
    return x**2 + 2*x + 1

# np.vectorize is a convenience wrapper: it still loops in Python, so it adds no speed
vectorized_func = np.vectorize(custom_function)

test_array = np.array([1, 2, 3, 4])
print(f"\nCustom function vectorized: {vectorized_func(test_array)}")

# Direct vectorization (faster)
direct_result = test_array**2 + 2*test_array + 1
print(f"Direct vectorization: {direct_result}")
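
The speed difference between the loop and the vectorized version can be measured with the standard library's timeit module. The sketch below is one way to do it; the array size and repeat count are arbitrary choices, and the actual ratio depends on hardware and NumPy version.

```python
import timeit
import numpy as np

def slow_square(arr):
    # Python-level loop: one interpreter step per element
    return np.array([x**2 for x in arr])

def fast_square(arr):
    # Single vectorized call executed in compiled code
    return arr**2

large_array = np.arange(100_000, dtype=np.int64)

# Both versions must agree before timing means anything
results_match = np.array_equal(slow_square(large_array), fast_square(large_array))

loop_time = timeit.timeit(lambda: slow_square(large_array), number=3)
vec_time = timeit.timeit(lambda: fast_square(large_array), number=3)

print(f"Results match: {results_match}")
print(f"Loop:       {loop_time:.4f} s for 3 runs")
print(f"Vectorized: {vec_time:.4f} s for 3 runs")
print(f"Speedup:    {loop_time / vec_time:.1f}x")
```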

File I/O and Data Saving

import numpy as np

# Create sample data
data = np.random.rand(5, 3)
print(f"Sample data:\n{data}")

# Save in NumPy binary format
# np.save('data.npy', data)
# loaded_data = np.load('data.npy')
# print(f"\nLoaded data:\n{loaded_data}")

# Save multiple arrays
data2 = np.arange(10)
data3 = np.ones((2, 2))
# np.savez('multiple_arrays.npz', first=data, second=data2, third=data3)
# loaded_multiple = np.load('multiple_arrays.npz')
# print(f"\nMultiple arrays keys: {list(loaded_multiple.keys())}")

# Save in text format
# np.savetxt('data.txt', data, delimiter=',')
# loaded_text = np.loadtxt('data.txt', delimiter=',')
# print(f"\nLoaded from text:\n{loaded_text}")

# Save as CSV (with header)
header = 'col1,col2,col3'
# np.savetxt('data.csv', data, delimiter=',', header=header, comments='')

# Memory-mapped files (for large files)
# large_array = np.arange(1000000)
# np.save('large_file.npy', large_array)
# memory_mapped = np.load('large_file.npy', mmap_mode='r')  # Read-only
# print(f"\nMemory-mapped file shape: {memory_mapped.shape}")

print("File operations are commented out")

# Explicit data type control
array_int32 = np.array([1, 2, 3], dtype=np.int32)
array_float64 = np.array([1.0, 2.0, 3.0], dtype=np.float64)
print(f"\nint32 array: {array_int32}, dtype: {array_int32.dtype}")
print(f"float64 array: {array_float64}, dtype: {array_float64.dtype}")

# Data type conversion
converted = array_int32.astype(np.float32)
print(f"After conversion: {converted}, dtype: {converted.dtype}")

# Direct binary data manipulation
bytes_data = array_int32.tobytes()
print(f"\nByte data length: {len(bytes_data)}")
restored = np.frombuffer(bytes_data, dtype=np.int32)
print(f"Restored array: {restored}")
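
Two points from this section can be tried without touching the disk: byte order is part of the dtype, and np.save / np.savez accept any file-like object. The sketch below uses io.BytesIO as a stand-in for a real file; the array values are arbitrary.

```python
import io
import numpy as np

values = np.array([1, 2, 3], dtype=np.int32)

# "<i4" is little-endian int32, ">i4" is big-endian int32
little = values.astype("<i4")
big = values.astype(">i4")
print(f"Little-endian bytes of 1: {little.tobytes()[:4]}")
print(f"Big-endian bytes of 1:    {big.tobytes()[:4]}")

# Raw bytes carry no metadata, so the reader must supply the right dtype
restored = np.frombuffer(big.tobytes(), dtype=">i4")
print(f"Restored from big-endian bytes: {restored}")

# .npy round trip through an in-memory buffer (dtype and shape are stored)
buffer = io.BytesIO()
np.save(buffer, values)
buffer.seek(0)
loaded = np.load(buffer)
print(f"Loaded from .npy buffer: {loaded}, dtype: {loaded.dtype}")

# .npz round trip with several named arrays
archive = io.BytesIO()
np.savez(archive, first=values, second=np.ones((2, 2)))
archive.seek(0)
with np.load(archive) as npz:
    keys = sorted(npz.files)
    second = npz["second"]
print(f"Arrays in archive: {keys}")
```

Unlike tobytes, the .npy and .npz formats record dtype, shape, and byte order, which is why they are usually the safer choice for exchanging arrays between machines.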