Unit-VI: Data Analysis with Python

Lecture 2: NumPy Operations, Broadcasting, and Functions

Estimated time: 40-50 minutes

1. Array Operations

NumPy supports element-wise operations on arrays of the same shape:

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Element-wise operations
print("Addition:", a + b)        # [5 7 9]
print("Subtraction:", b - a)     # [3 3 3]
print("Multiplication:", a * b)  # [ 4 10 18]
print("Division:", b / a)        # [4.  2.5 2. ]
print("Exponentiation:", a ** 2) # [1 4 9]

Note: These operations are performed element-wise. Matrix multiplication is a separate operation, written a @ b or np.matmul(a, b).

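To make the note above concrete, the following minimal sketch contrasts the element-wise product with the matrix product. The names m, elementwise_2d, and matrix_product are illustrative and do not come from the lecture.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Element-wise: multiplies values at matching positions
elementwise = a * b          # [ 4 10 18]

# Dot product of two 1-D arrays: 1*4 + 2*5 + 3*6
dot_product = a @ b          # 32

# The same contrast for 2-D arrays (m is an illustrative matrix)
m = np.array([[1, 2], [3, 4]])
elementwise_2d = m * m       # [[ 1  4] [ 9 16]]
matrix_product = m @ m       # [[ 7 10] [15 22]]

print("Element-wise:", elementwise)
print("Dot product:", dot_product)
print("Element-wise 2-D:\n", elementwise_2d)
print("Matrix product:\n", matrix_product)
```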
2. Broadcasting

NumPy can perform operations on arrays of different shapes through broadcasting:

# Array and scalar
arr = np.array([[1, 2, 3], [4, 5, 6]])
print("Array + 2:\n", arr + 2)

# Arrays with different dimensions
a = np.array([[1, 2, 3]])
b = np.array([[1], [2], [3]])
print("\nBroadcasted addition:\n", a + b)

Broadcasting Rules:
  1. Shapes are compared dimension by dimension, from right to left
  2. Two dimensions are compatible when they are equal or when one of them is 1
  3. An array with fewer dimensions is treated as if its shape were padded with 1's on the left
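
The rules above can be checked by looking at result shapes. This is a small sketch with illustrative arrays (row, col, vec, and grid are not from the lecture). The last case shows what happens when the rules are not satisfied.

```python
import numpy as np

row = np.ones((1, 3))    # shape (1, 3)
col = np.ones((3, 1))    # shape (3, 1)
vec = np.ones(3)         # shape (3,), treated as (1, 3) by left-padding
grid = np.ones((2, 3))   # shape (2, 3)

# (1, 3) with (3, 1): each size-1 dimension stretches to match the other
shape_row_col = (row + col).shape     # (3, 3)

# (2, 3) with (3,): trailing dimensions match, missing one is padded
shape_grid_vec = (grid + vec).shape   # (2, 3)

# (2, 3) with (2,): trailing dimensions are 3 and 2, neither is 1
try:
    grid + np.ones(2)
    incompatible = False
except ValueError as err:
    incompatible = True
    print("Broadcast failed:", err)

print(shape_row_col, shape_grid_vec, incompatible)
```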

3. Universal Functions (ufuncs)

NumPy provides fast element-wise operations through universal functions:

# Math functions
arr = np.array([1.2, 2.3, 3.4])

print("Absolute:", np.abs([-1, -2, -3]))
print("Square root:", np.sqrt(arr))
print("Ceiling:", np.ceil(arr))    # [2. 3. 4.]
print("Floor:", np.floor(arr))     # [1. 2. 3.]
print("Round:", np.round(arr))     # [1. 2. 3.]
print("Log (base e):", np.log(arr))
print("Log (base 10):", np.log10(arr))

# Trigonometric functions
angles = np.array([0, 30, 45, 60, 90])
radians = np.deg2rad(angles)  # Convert to radians

print("Sin:", np.sin(radians))
print("Cos:", np.cos(radians))
print("Tan:", np.tan(radians))
print("Arcsin:", np.arcsin([0, 0.5, 1]))
print("Arccos:", np.arccos([0, 0.5, 1]))
print("Arctan:", np.arctan([0, 1, np.sqrt(3)]))

# Exponents and logarithms
arr = np.array([1, 2, 3])

print("e^x:", np.exp(arr))
print("2^x:", np.exp2(arr))
print("3^x:", np.power(3, arr))
print("Natural log:", np.log(arr))
print("Log base 2:", np.log2(arr))
print("Log base 10:", np.log10(arr))
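
As a rough illustration of what "element-wise" means for a ufunc, the sketch below compares np.sqrt with an explicit Python loop over the same values. It also shows one detail worth knowing about np.round: values exactly halfway between two integers are rounded to the nearest even integer. The variable names are illustrative.

```python
import math
import numpy as np

values = np.array([1.0, 4.0, 9.0])

# Explicit loop: calls math.sqrt once per element
looped = np.array([math.sqrt(v) for v in values])

# ufunc: one call applies the operation to every element
vectorized = np.sqrt(values)

print("Same result:", np.allclose(looped, vectorized))

# np.round sends exact halves to the nearest even integer
halves = np.array([0.5, 1.5, 2.5, 3.5])
rounded = np.round(halves)   # [0. 2. 2. 4.]
print("Rounded halves:", rounded)
```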

4. Statistical Functions

NumPy provides aggregate functions that work on a whole array or along a chosen axis:

data = np.array([[3, 1, 4], [1, 5, 9], [2, 6, 5]])

print("Sum:", np.sum(data))
print("Column sums (axis=0):", np.sum(data, axis=0))  # Collapses rows: one sum per column
print("Row sums (axis=1):", np.sum(data, axis=1))  # Collapses columns: one sum per row
print("\nMean:", np.mean(data))
print("Median:", np.median(data))
print("Standard deviation:", np.std(data))
print("Variance:", np.var(data))
print("Min:", np.min(data))
print("Max:", np.max(data))
print("Index of max:", np.argmax(data))  # Index of flattened array
print("Indices of max in each row:", np.argmax(data, axis=1))
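
The axis argument is a common source of confusion, so here is a short sketch using the same data array. It also converts the flattened argmax index back to a (row, column) position with np.unravel_index, and shows the ddof parameter of np.std. The default, ddof=0, gives the population standard deviation.

```python
import numpy as np

data = np.array([[3, 1, 4], [1, 5, 9], [2, 6, 5]])

col_sums = data.sum(axis=0)   # [ 6 12 18], one value per column
row_sums = data.sum(axis=1)   # [ 8 15 13], one value per row

# argmax without an axis indexes into the flattened array
flat_index = np.argmax(data)                         # 5
row, col = np.unravel_index(flat_index, data.shape)  # (1, 2)

# ddof=0 (default) divides by N; ddof=1 divides by N - 1
population_std = np.std(data)
sample_std = np.std(data, ddof=1)

print("Column sums:", col_sums)
print("Row sums:", row_sums)
print("Max is at row", row, "column", col)
print("Population std:", population_std, "Sample std:", sample_std)
```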

5. Sorting and Searching

NumPy provides functions for sorting arrays and for locating elements that meet a condition:

arr = np.array([3, 1, 4, 1, 5, 9, 2, 6, 5])

# Sorting
print("Sorted array:", np.sort(arr))  # Returns new array
arr.sort()  # In-place sort
print("After in-place sort:", arr)

# Searching
print("Indices of elements > 3:", np.where(arr > 3)[0])  # np.where returns a tuple; [0] selects the index array
print("First index of 5:", np.where(arr == 5)[0][0])
print("Count of 5s:", np.count_nonzero(arr == 5))
print("Check if any > 8:", np.any(arr > 8))
print("Check if all > 0:", np.all(arr > 0))
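
Two related functions that often come up alongside the ones above are np.argsort, which returns the indices that would sort an array, and np.searchsorted, which finds insertion positions in an already sorted array. The sketch below is an optional supplement, not part of the lecture's own examples.

```python
import numpy as np

arr = np.array([3, 1, 4, 1, 5, 9, 2, 6, 5])

# Indices that sort the array; "stable" keeps equal values in original order
order = np.argsort(arr, kind="stable")   # [1 3 6 0 2 4 8 7 5]
sorted_arr = arr[order]                  # [1 1 2 3 4 5 5 6 9]

# Leftmost position where 5 could be inserted while keeping the order
pos = np.searchsorted(sorted_arr, 5)     # 5

# np.where with one argument returns a tuple of index arrays
indices_gt3 = np.where(arr > 3)[0]       # [2 4 5 7 8]

print("Sort order:", order)
print("Sorted:", sorted_arr)
print("Insertion position for 5:", pos)
print("Indices of elements > 3:", indices_gt3)
```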

6. Special Functions

NumPy also provides the sinc function and a linear algebra module (numpy.linalg):

import math

x = np.linspace(0, 2, 5)  # 5 points from 0 to 2

# Normalized sinc function: sin(pi*x) / (pi*x), with sinc(0) = 1
print("Sinc:", np.sinc(x))

# np.math was removed in NumPy 2.0; use the standard-library math module
print("Factorial:", math.factorial(5))  # 120

# Linear algebra
from numpy.linalg import inv, det, eig

a = np.array([[1, 2], [3, 4]])
print("\nMatrix inverse:\n", inv(a))
print("Determinant:", det(a))
print("Eigenvalues and eigenvectors:\n", eig(a))
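
The linear algebra results above can be sanity-checked against their definitions. This sketch assumes the same 2x2 matrix. The check variables (identity_ok, eigen_ok) are illustrative names.

```python
import numpy as np
from numpy.linalg import inv, det, eig

a = np.array([[1.0, 2.0], [3.0, 4.0]])

# A matrix times its inverse should give the identity (up to rounding)
identity_ok = np.allclose(a @ inv(a), np.eye(2))

# For a 2x2 matrix: det = 1*4 - 2*3 = -2
determinant = det(a)

# eig returns eigenvalues and a matrix whose columns are the eigenvectors
eigenvalues, eigenvectors = eig(a)

# Definition check: A v = lambda v for each eigenvalue/eigenvector pair
eigen_ok = np.allclose(a @ eigenvectors, eigenvectors * eigenvalues)

print("Inverse check:", identity_ok)
print("Determinant:", determinant)
print("Eigen check:", eigen_ok)
```

Floating-point results such as the determinant are best compared with np.isclose or np.allclose rather than ==.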

7. Practice Exercise

Given the following array of student scores (out of 100) for 5 subjects:

scores = np.array([
    [85, 90, 78, 92, 88],  # Student 1
    [72, 85, 90, 75, 80],  # Student 2
    [90, 92, 88, 85, 95],  # Student 3
    [65, 70, 75, 80, 85],  # Student 4
    [95, 85, 90, 92, 98]   # Student 5
])

Calculate the following:

  1. Average score for each student
  2. Average score for each subject
  3. Highest score in each subject
  4. Number of students who scored above 85 in each subject
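
One possible approach is sketched below; try the exercise first before reading it. It assumes rows are students and columns are subjects, as the comments in the array suggest. The result names (student_avg, subject_avg, subject_max, above_85) are illustrative.

```python
import numpy as np

scores = np.array([
    [85, 90, 78, 92, 88],  # Student 1
    [72, 85, 90, 75, 80],  # Student 2
    [90, 92, 88, 85, 95],  # Student 3
    [65, 70, 75, 80, 85],  # Student 4
    [95, 85, 90, 92, 98]   # Student 5
])

# 1. Average per student: collapse the subject axis (axis=1)
student_avg = scores.mean(axis=1)

# 2. Average per subject: collapse the student axis (axis=0)
subject_avg = scores.mean(axis=0)

# 3. Highest score in each subject
subject_max = scores.max(axis=0)

# 4. Students above 85 per subject: sum a boolean mask column by column
above_85 = (scores > 85).sum(axis=0)

print("Average per student:", student_avg)
print("Average per subject:", subject_avg)
print("Highest per subject:", subject_max)
print("Above 85 per subject:", above_85)
```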