Unit-VI: Data Analysis with Python

Lecture 6: Data Visualization with Matplotlib

Estimated time: 60-75 minutes

1. Introduction to Matplotlib

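Matplotlib is a plotting library for Python. It is primarily a 2D library, with basic 3D support through its mplot3d toolkit. The pyplot module, conventionally imported as plt, provides the plotting functions used throughout this lecture.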

import matplotlib.pyplot as plt
import numpy as np

# Create data
x = np.linspace(0, 10, 100)
y = np.sin(x)

# Create a basic plot
plt.figure(figsize=(8, 4))
plt.plot(x, y, 'b-', label='sin(x)')
plt.title('Sine Wave')
plt.xlabel('X')
plt.ylabel('sin(X)')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
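
The same sine plot can also be written with Matplotlib's object-oriented interface, which Section 3 relies on for subplots. The following is a minimal sketch, not the only way to structure it: it creates the figure and axes explicitly and calls methods on the axes object instead of on plt. The variable names fig and ax are a common convention, not a requirement.

```python
import matplotlib.pyplot as plt
import numpy as np

# Same data as the pyplot example
x = np.linspace(0, 10, 100)
y = np.sin(x)

# Create the figure and a single axes object explicitly
fig, ax = plt.subplots(figsize=(8, 4))

# Each plt.xxx() call has an axes-level counterpart
ax.plot(x, y, 'b-', label='sin(x)')
ax.set_title('Sine Wave')
ax.set_xlabel('X')
ax.set_ylabel('sin(X)')
ax.legend()
ax.grid(True, alpha=0.3)
fig.tight_layout()
```

Call plt.show() afterwards to display the figure. Holding explicit references to fig and ax tends to be easier to manage once a figure contains more than one plot.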

2. Basic Plot Types

This section covers three plot types: line plots, scatter plots, and bar charts.

# Line plot with multiple lines
x = np.linspace(0, 10, 100)
plt.figure(figsize=(8, 4))
plt.plot(x, np.sin(x), 'b-', label='sin(x)')
plt.plot(x, np.cos(x), 'r--', label='cos(x)')
plt.title('Trigonometric Functions')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

# Scatter plot with color mapping
np.random.seed(42)
x = np.random.rand(50)
y = np.random.rand(50)
colors = np.random.rand(50)
sizes = 1000 * np.random.rand(50)

plt.figure(figsize=(8, 6))
plt.scatter(x, y, c=colors, s=sizes, alpha=0.5, cmap='viridis')
plt.colorbar(label='Color intensity')
plt.title('Scatter Plot')
plt.grid(True, alpha=0.3)
plt.show()

# Bar plot
categories = ['A', 'B', 'C', 'D']
values = [3, 7, 4, 2]

plt.figure(figsize=(8, 4))
plt.bar(categories, values, color=['#3498db', '#2ecc71', '#e74c3c', '#f39c12'])
plt.title('Bar Chart')
plt.xlabel('Categories')
plt.ylabel('Values')
plt.show()
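
Bar charts are often easier to read when each bar carries its value. The sketch below reuses the categories and values from the bar chart above and labels the bars with Axes.bar_label, which assumes Matplotlib 3.4 or newer. The padding value is an arbitrary choice for illustration.

```python
import matplotlib.pyplot as plt

categories = ['A', 'B', 'C', 'D']
values = [3, 7, 4, 2]

fig, ax = plt.subplots(figsize=(8, 4))

# ax.bar returns a container holding one rectangle per bar
bars = ax.bar(categories, values, color='#3498db')

# Write each bar's height just above it (requires Matplotlib >= 3.4)
labels = ax.bar_label(bars, padding=3)

ax.set_title('Bar Chart with Value Labels')
ax.set_xlabel('Categories')
ax.set_ylabel('Values')
fig.tight_layout()
```

Call plt.show() to display the result.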

3. Working with Subplots

# Create a 2x2 grid of subplots
fig, axs = plt.subplots(2, 2, figsize=(10, 8))
fig.suptitle('Subplot Example', fontsize=16)

# Plot 1: Line plot
x = np.linspace(0, 10, 100)
axs[0, 0].plot(x, np.sin(x), 'b-')
axs[0, 0].set_title('Sine Wave')

# Plot 2: Scatter plot
x = np.random.rand(50)
y = np.random.rand(50)
axs[0, 1].scatter(x, y, c='r')
axs[0, 1].set_title('Random Points')

# Plot 3: Bar plot
categories = ['A', 'B', 'C']
values = [3, 7, 4]
axs[1, 0].bar(categories, values, color='g')
axs[1, 0].set_title('Bar Chart')

# Plot 4: Histogram
data = np.random.normal(0, 1, 1000)
axs[1, 1].hist(data, bins=30, color='purple', alpha=0.7)
axs[1, 1].set_title('Histogram')

# Adjust layout
plt.tight_layout()
plt.show()
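
When every panel shows the same kind of plot, indexing axs[row, col] by hand becomes repetitive. The sketch below shows one possible alternative: iterate over axs.flat and share both axes across panels with sharex and sharey. The functions dictionary is a hypothetical example introduced here for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

# Hypothetical set of curves, one per panel
functions = {
    'sin(x)': np.sin,
    'cos(x)': np.cos,
    'sin(2x)': lambda t: np.sin(2 * t),
    'cos(2x)': lambda t: np.cos(2 * t),
}

# sharex/sharey give all panels the same axis limits
fig, axs = plt.subplots(2, 2, figsize=(10, 8), sharex=True, sharey=True)

# axs is a 2x2 NumPy array; axs.flat visits the panels row by row
for ax, (name, func) in zip(axs.flat, functions.items()):
    ax.plot(x, func(x))
    ax.set_title(name)
    ax.grid(True, alpha=0.3)

fig.suptitle('Shared Axes Example', fontsize=16)
fig.tight_layout()
```

Call plt.show() to display the grid. With shared axes, Matplotlib hides the tick labels on interior panels, which leaves more room for the plots themselves.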

4. Saving Figures

# Create a simple plot
plt.figure(figsize=(8, 4))
plt.plot([1, 2, 3, 4], [1, 4, 9, 16], 'b-')
plt.title('Simple Plot')

# Save in different formats
plt.savefig('plot.png', dpi=300, bbox_inches='tight')
plt.savefig('plot.pdf')
plt.savefig('plot.jpg', pil_kwargs={'quality': 95})  # JPEG options are passed through to Pillow

print("Plots saved as 'plot.png', 'plot.pdf', and 'plot.jpg'")
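
In scripts that produce many figures, it can help to save through the figure object, confirm the files were written, and then close the figure to free memory. The sketch below is one way to do this. It writes into a temporary directory, which is an assumption made so the example does not clutter the working directory. In practice you would choose your own output path.

```python
import os
import tempfile

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot([1, 2, 3, 4], [1, 4, 9, 16], 'b-')
ax.set_title('Simple Plot')

# Temporary output directory (assumption for this example)
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, 'plot.png')
pdf_path = os.path.join(out_dir, 'plot.pdf')

# The output format is inferred from the file extension
fig.savefig(png_path, dpi=300, bbox_inches='tight')
fig.savefig(pdf_path)

# Release the figure once it has been written to disk
plt.close(fig)

for path in (png_path, pdf_path):
    print(f"{os.path.basename(path)}: {os.path.getsize(path)} bytes")
```

Save before calling plt.show() when using both, since some backends leave an empty figure behind once the window is closed.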

5. Practice Exercise

The starter code below plots monthly temperature and rainfall against a shared x-axis, using a separate y-axis for each series:

import matplotlib.pyplot as plt

# Sample data
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
temperature = [12, 15, 18, 22, 25, 28]
rainfall = [80, 70, 60, 45, 40, 35]

# Create a figure with two y-axes
fig, ax1 = plt.subplots(figsize=(10, 5))

# Plot temperature on primary y-axis
color = 'tab:red'
ax1.set_xlabel('Month')
ax1.set_ylabel('Temperature (°C)', color=color)
ax1.plot(months, temperature, color=color, marker='o')
ax1.tick_params(axis='y', labelcolor=color)

# Create a second y-axis for rainfall
ax2 = ax1.twinx()
color = 'tab:blue'
ax2.set_ylabel('Rainfall (mm)', color=color)
ax2.bar(months, rainfall, color=color, alpha=0.3, width=0.5)
ax2.tick_params(axis='y', labelcolor=color)

plt.title('Monthly Temperature and Rainfall')
plt.tight_layout()
plt.show()

Your tasks:

  1. Customize the plot with better colors and styling
  2. Add a legend
  3. Add grid lines
  4. Save the figure as 'weather_plot.png'
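
A hint for task 2: with twin axes, calling legend() on one axes lists only that axes' own artists. One possible approach, sketched below under the assumption that each series is given a label argument, is to collect the handles and labels from both axes and pass them to a single legend call. The legend location is an arbitrary choice.

```python
import matplotlib.pyplot as plt

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
temperature = [12, 15, 18, 22, 25, 28]
rainfall = [80, 70, 60, 45, 40, 35]

fig, ax1 = plt.subplots(figsize=(10, 5))
ax1.plot(months, temperature, color='tab:red', marker='o',
         label='Temperature (°C)')

ax2 = ax1.twinx()
ax2.bar(months, rainfall, color='tab:blue', alpha=0.3, width=0.5,
        label='Rainfall (mm)')

# Gather the labelled artists from each axes
handles1, labels1 = ax1.get_legend_handles_labels()
handles2, labels2 = ax2.get_legend_handles_labels()

# Draw one combined legend on the top-most axes so the bars do not cover it
legend = ax2.legend(handles1 + handles2, labels1 + labels2, loc='upper center')

fig.tight_layout()
```

The remaining tasks (styling, grid lines, and saving) can build on this figure.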