How to plot

Authors:

Abdullah A. Fahad (a.fahad@nasa.gov) \ Tahmidul Azom Sany (tsany@gmu.edu) \ Torikul Islam Sanjid (torikul.sanjid@gmail.com) \ Md Tashin Ahammad (tashinahammad.03@gmail.com) \

Basic Plotting with Matplotlib (Scatterplot, Barplot, Heatmap, Boxplot, etc.) Cartopy and Map plotting Plotting Contourplot using Precipitation data Choosing colorbars Wind direction (u and v) Making a Skew-T with wind barbs using MetPy Color maps in Matplotlib and creating customized color bar Interactive plotting

Suggestions 23 june

  • global plot -- Any data (SST better) with cartopy
  • upadate omi

1. Basic Plotting with Matplotlib (Scatterplot, Barplot, Heatmap, Boxplot, etc.)

What is Matplotlib?

Matplotlib is a Python library for creating high-quality visualizations and plots of data.It is the most well-known of all the python visualization packages and provides a variety of capabilities such as:

  • Matplotlib can gernerate high-quality graphics in a number of file types including PDF, SVF, JPG, PNG, BMP and GIF.

  • Matplotlib include line plots, scatter plots, histograms, bar charts, pie charts, and box plots.

  • Additionally, it enables 3D charting, and many additional libraries, such as pandas and Seaborn, have been built on top of it to provide access to Matplotlib's capabilities with less code.

Some basic functions

Let's have a look at some of the basic function that are often used in matplotlib.

Function Description
plt.plot() Creates line plots to represent data.
plt.scatter() Generates scatter plots to display individual data points.
plt.bar() Creates bar plots to visualize categorical data.
plt.hist() Generates histograms to display the distribution of numerical data.
plt.boxplot() Creates boxplots to show the distribution and outliers of a dataset.
plt.pie() Generates pie charts to display proportions or percentages of categorical data.
plt.imshow() Displays images or heatmaps using arrays of data.
plt.show() it displays the created plots
plt.xlabel() This function is used to add a label to the x-axis of the plot.
plt.ylabel() This function is used to add a label to the y-axis of the plot.
plt.title() This function is used to add a title to the plot.
plt.legend() Adds a legend to the plot to identify plotted elements.
plt.xticks() This function is used to set the tick locations and labels for the x-axis.
plt.yticks() This function is used to set the tick locations and labels for the y-axis.
plt.grid() This function is used to add a grid to the plot.
plt.xlim() Sets the limits for the x-axis.
plt.ylim() Sets the limits for the y-axis.
plt.annotate() it is use to write comments on the graph at the specified position
plt.figure(figsize = (x, y)) whenever we want the result to be displayed in a separate window we use this command, and figsize argument decides what will be the initial size of the window that will be displayed after the run
plt.subplot(r, c, i) it is used to create multiple plots in the same figure with r signifies the no of rows in the figure, c signifies no of columns in a figure and i specifies the positioning of the particular plot

These are some common functions used in matplotlib plotting.

Importing Matplotlib

To use matplotlib we need to import it.We will use standard shorthand for matplotlib imports to make things simpler.

In [ ]:
import matplotlib.pyplot as plt

Here we have imported the pyplot module from the matplotlib library as plt.This is a common convention to simplify the usage of pyplot functions throughout the code.

Basically, we'll be using **`plt`** instead of **`matplotlib.pyplot`**

Scatterplot

Step by step guide to create a scatterplot:

**Step 1:**

Firstly, we'll import the necessary packages.We'll use matplotlib for plotting and numpy to produce data.

**Step 2:** Then we'll gernerate random data using numpy to plot.Here,we'll use Numpy's random.normal() function to generate 100 random numbers for two variables:x and y

**Step 3:** In this step we'll plot the data using matplotlib's scatter() function to create a scatter plot of the x and y data points.

Syntax of scatterplot :

matplotlib.pyplot.scatter(x, y, s=None, c=None, marker=None, cmap=None, norm=None, vmin=None, vmax=None, alpha=None, linewidths=None, *, edgecolors=None, plotnonfinite=False, data=None, **kwargs)

**Step 4:** Finally, we use the show() function to display the plot.

In [ ]:
# import packages
import matplotlib.pyplot as plt
import numpy as np

# generate random data
x = np.random.normal(size=100)
y = np.random.normal(size=100)

# plot the data
plt.scatter(x, y)

# show the plot
plt.show()

Barplot

Step by step guide to create a scatterplot:

**Step 1:** Firstly, we'll import the necessary packages.We'll use matplotlib for plotting and numpy to produce data.

**Step 2:**Then we'll gernerate random data using numpy to plot.Here,we'll use Numpy's NumPy's random.randint() function to generate a list of 8 random integers between 1 and 150.

**Step 3:** In this step we'll use the bar() function from pyplot to create a bar graph. The range(len(data)) is used to generate a sequence of indices corresponding to each data point, and data represents the height of each bar. The color parameter is set to '#900C3F' to specify the color of the bars.

Syntax of bar plot:

matplotlib.pyplot.bar(x, height, width=0.8, bottom=None, *, align='center', data=None, **kwargs)

**Step 4:**Finally, we use the show() function to display the plot.

In [ ]:
#Importing packages
import matplotlib.pyplot as plt
import numpy as np
#Creating random list of 8 numbers.
data = list(np.random.randint(1, 150, 8))
#Plotting the horizontal bar.
plt.bar(range(len(data)), data, color='#900C3F')
#showing the plot
plt.show()

we can also create a horizontal bar chart.

In [ ]:
#Importing packages
import matplotlib.pyplot as plt
import numpy as np
#Creating random list of 8 numbers.
data = list(np.random.randint(1, 150, 8))
#Plotting the horizontal bar.
plt.barh(range(len(data)), data, color='#900C3F')
#showing the plot
plt.show()

Heatmap

A heatmap is a graphical representation of data where values are represented as colors in a two-dimensional grid. Heatmaps are particularly useful for visualizing patterns, correlations, or distributions in data matrices.

Let's see a heatmap

**Explanation:**

Here we see a heatmap representing the harvest data of local farmers. Each cell in the heatmap corresponds to a combination of a vegetable and a farmer, with the color intensity representing the harvest amount. The color intensity of each cell gives us the values of harvest with help of color bar included.

Let's make a simple Heatmap:

Step by step guide to create heatmap:

**Step 1:** Firstly, we'll import the necessary packages.We'll use matplotlib for plotting and numpy to produce data.

**Step 2:**Then we'll gernerate random data using numpy to plot.Here,we'll use Numpy's NumPy's np.random.rand() function to generate a 10x10 random matrix.

**Step 3:** In this step we'll use the imshow() function from pyplot to create a heatmap.The data matrix is used as the input data. The cmap parameter is set to autumn to use the autumn color map.

Syntax of Heat map:

matplotlib.pyplot.imshow(X, cmap=None, alpha=None)
X :- this is input data matrix which is to be displayed
cmap :- Colormap we use t0 dispay the heatmap
alpha :- it specifies the opacity or transpiracy of the heatmap

**Step 4:** a color bar is then added.

**Step 5:**we set the title and labels.

**Step 6:**Finally, we use the show() function to display the plot.

In [ ]:
import matplotlib.pyplot as plt
import numpy as np

# Generate random data
data = np.random.rand(10, 10)  # Generate a 10x10 random matrix

# Create a heatmap
plt.imshow(data, cmap='autumn')

# Add a color bar
plt.colorbar()

# Set title and labels
plt.title('Heatmap Example')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')

# Show the plot
plt.show()

Boxplot

A boxplot also known as a box and whisker plot is a graphical representation of the distribution of a dataset.

  • It can summarizes key statistical measures such as the median,quartiles, and potential outliers. *The boxplot provides a compact way to visualize the spread and skewness of the data.

It has the following component:

**Box:** The box represents the interquartile range (IQR), which contains the middle 50% of the data. The bottom edge of the box represents the 25th percentile (lower quartile), while the top edge represents the 75th percentile (upper quartile). The line inside the box represents the Median.

**Whiskers:** The whiskers extend from the box to the minimum and maximum non-outlier data points. They give an idea of the range of the dataset, excluding any outliers.

**Outliers:** Individual data points that are considered outliers are plotted as individual points outside the whiskers. They are typically defined as points that fall outside a certain multiple of the IQR from the edges of the box.By default, the range is set to 1.5 times the IQR.

Step by step guide to create a boxplot:

**Step 1:** Firstly, we'll import the necessary packages.We'll use matplotlib for plotting and numpy to produce data.

**Step 2:**Then we'll gernerate random dataset using numpy to plot.Here,we'll use Numpy's NumPy's random.normal() function where the loc parameter specifies the mean of the distribution, which is set to 0 in this case. The scale parameter specifies the standard deviation of the distribution, which is set to 1. The size parameter determines the number of data points in the dataset, which is set to 100.Then we added some outliers to data and created the final data.

**Step 3:** In this step we'll use the boxplot() function from pyplot. The data array is passed as the argument, which contains the dataset to be visualized.

Syntax of boxplot:

matplotlib.pyplot.boxplot(x, notch=None, sym=None, vert=None, whis=None, positions=None, widths=None, patch_artist=None, bootstrap=None, usermedians=None, conf_intervals=None, meanline=None, showmeans=None, showcaps=None, showbox=None, showfliers=None, boxprops=None, labels=None, flierprops=None, medianprops=None, meanprops=None, capprops=None, whiskerprops=None, manage_ticks=True, autorange=False, zorder=None, capwidths=None, *, data=None)

**Step 4:**We add title and labels.

**Step 5:**Finally, we use the show() function to display the plot.

In [ ]:
import matplotlib.pyplot as plt
import numpy as np

# Generating random data
data = np.random.normal(loc=0, scale=1, size=100)

# Adding outliers
outliers = np.array([-3, 3])  # Example outliers
# Final data
data_with_outliers = np.concatenate((data, outliers))

# Create a boxplot
plt.boxplot(data_with_outliers)

# Set title and labels
plt.title('Boxplot Example')
plt.xlabel('Data')

# Show the plot
plt.show()
In [ ]:

2. Cartopy and Map plotting

Cartopy is a Python package designed for geospatial data processing in order to make it easier to work with and analyze data that is related to the Earth's surface. It is built on top of several other popular scientific Python packages such as NumPy, Matplotlib, and Shapely, and provides a way to create maps, plot data on maps, and perform various geographical data analysis tasks.

Please note that when using cartopy in colab, it may sometimes crash. To avoid errors, please run the following cell. You can ignore the contents of the cell, and simply copy and paste the code whenever you need to use cartopy. After successfully installing all necessary components, please restart the runtime.

Runtime --> Restart Runtime --> do not run again after restart

In [ ]:
# Run this cell when you are in colab and then restart runtime
# Do not worry about this cell, you can always copy paste

!apt-get install libproj-dev proj-data proj-bin
!apt-get install libgeos-dev
!pip install cython
!pip install cartopy

!apt-get -qq install python-cartopy python3-cartopy
!pip uninstall -y shapely    # cartopy and shapely aren't friends (early 2020)
!pip install shapely --no-binary shapely
Reading package lists... Done
Building dependency tree       
Reading state information... Done
libproj-dev is already the newest version (7.2.1-1~focal0).
libproj-dev set to manually installed.
proj-data is already the newest version (7.2.1-1~focal0).
proj-data set to manually installed.
The following NEW packages will be installed:
  proj-bin
0 upgraded, 1 newly installed, 0 to remove and 38 not upgraded.
Need to get 170 kB of archives.
After this operation, 485 kB of additional disk space will be used.
Get:1 http://ppa.launchpad.net/ubuntugis/ppa/ubuntu focal/main amd64 proj-bin amd64 7.2.1-1~focal0 [170 kB]
Fetched 170 kB in 0s (425 kB/s)
Selecting previously unselected package proj-bin.
(Reading database ... 122541 files and directories currently installed.)
Preparing to unpack .../proj-bin_7.2.1-1~focal0_amd64.deb ...
Unpacking proj-bin (7.2.1-1~focal0) ...
Setting up proj-bin (7.2.1-1~focal0) ...
Processing triggers for man-db (2.9.1-1) ...
Reading package lists... Done
Building dependency tree       
Reading state information... Done
libgeos-dev is already the newest version (3.9.1-1~focal0).
libgeos-dev set to manually installed.
0 upgraded, 0 newly installed, 0 to remove and 38 not upgraded.
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Requirement already satisfied: cython in /usr/local/lib/python3.10/dist-packages (0.29.34)
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting cartopy
  Downloading Cartopy-0.21.1.tar.gz (10.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.9/10.9 MB 86.4 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: numpy>=1.18 in /usr/local/lib/python3.10/dist-packages (from cartopy) (1.22.4)
Requirement already satisfied: matplotlib>=3.1 in /usr/local/lib/python3.10/dist-packages (from cartopy) (3.7.1)
Requirement already satisfied: shapely>=1.6.4 in /usr/local/lib/python3.10/dist-packages (from cartopy) (2.0.1)
Collecting pyshp>=2.1 (from cartopy)
  Downloading pyshp-2.3.1-py2.py3-none-any.whl (46 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 46.5/46.5 kB 5.3 MB/s eta 0:00:00
Collecting pyproj>=3.0.0 (from cartopy)
  Downloading pyproj-3.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.7/7.7 MB 55.0 MB/s eta 0:00:00
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.1->cartopy) (1.0.7)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.1->cartopy) (0.11.0)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.1->cartopy) (4.39.3)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.1->cartopy) (1.4.4)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.1->cartopy) (23.1)
Requirement already satisfied: pillow>=6.2.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.1->cartopy) (8.4.0)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.1->cartopy) (3.0.9)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.1->cartopy) (2.8.2)
Requirement already satisfied: certifi in /usr/local/lib/python3.10/dist-packages (from pyproj>=3.0.0->cartopy) (2022.12.7)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.7->matplotlib>=3.1->cartopy) (1.16.0)
Building wheels for collected packages: cartopy
  Building wheel for cartopy (pyproject.toml) ... done
  Created wheel for cartopy: filename=Cartopy-0.21.1-cp310-cp310-linux_x86_64.whl size=11102773 sha256=179a617d0278069c1ed854f1f61cebc93e56f87dade36cdce47639a96b73b842
  Stored in directory: /root/.cache/pip/wheels/30/b0/1a/1c1909e00c76653dc4e2ff48555257c0eb2d1698280c8d9955
Successfully built cartopy
Installing collected packages: pyshp, pyproj, cartopy
Successfully installed cartopy-0.21.1 pyproj-3.5.0 pyshp-2.3.1
E: Unable to locate package python-cartopy
Found existing installation: shapely 2.0.1
Uninstalling shapely-2.0.1:
  Successfully uninstalled shapely-2.0.1
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting shapely
  Downloading shapely-2.0.1.tar.gz (275 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 275.5/275.5 kB 17.4 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: numpy>=1.14 in /usr/local/lib/python3.10/dist-packages (from shapely) (1.22.4)
Building wheels for collected packages: shapely
  Building wheel for shapely (pyproject.toml) ... done
  Created wheel for shapely: filename=shapely-2.0.1-cp310-cp310-linux_x86_64.whl size=969692 sha256=35bddb8214194f4870c91d517c12419d610404ae1742225bc75791ef2390beda
  Stored in directory: /root/.cache/pip/wheels/07/bd/06/4e979fa263bca266484ee65f5aab8e6b1c9b20f8caa6f2d7da
Successfully built shapely
Installing collected packages: shapely
Successfully installed shapely-2.0.1
In [ ]:
# importing libraries

import warnings
warnings. filterwarnings("ignore")

import matplotlib.pyplot as plt

# importing cartopy
import cartopy.feature as cf
import cartopy.crs as ccrs
import cartopy.mpl.ticker as cticker
from cartopy.util import add_cyclic_point

Let's start by creating a basic map using cartopy. To do this, we first need to define an axes and a projection. Once the projection is defined, we can add coastlines to the map, specifying the linewidth and color. We'll delve deeper into projections later on.

In [ ]:
map = plt.axes(projection=ccrs.PlateCarree()) # defining axes and projection* and assigned it to `map`
map.coastlines() # add coastlines
Out[ ]:
<cartopy.mpl.feature_artist.FeatureArtist at 0x7ff02c14e3b0>

Add Features to the Map

We add features by calling the add_feature() method on-axis object and passing a feature object available from the cartopy.feature module.

We can add the following features (part of cartopy.feature) to our map:

  • COASTLINE: Adds coastline around all continents.
  • LAND: Adds land on top of world map.
  • LAKES: Adds big lakes of the world.
  • BORDERS: Adds country borders for the whole world.
  • OCEAN: Adds ocean with a color.
  • RIVERS: Adds big rivers of the world.
In [ ]:
# Adding more features and add stock images
plt.figure(figsize=(12,12))
map4 = plt.axes(projection=ccrs.PlateCarree())

map4.add_feature(cf.LAND)
map4.add_feature(cf.OCEAN)
map4.add_feature(cf.COASTLINE)
map4.add_feature(cf.BORDERS, linestyle=':')
map4.add_feature(cf.LAKES, alpha=0.5)
map4.add_feature(cf.RIVERS)

map4.stock_img() # try comment this
Out[ ]:
<matplotlib.image.AxesImage at 0x7ff02c1acaf0>

Regional Map

Need to provide the longitude and latitude ranges as arguments of the set_extent method on the axes object:

ax.set_extent([min_lon, max_lon, min_lat, max_lat])
In [ ]:
# Cropping the map with set_extent
plt.figure(figsize=(12,12))
map5 = plt.axes(projection=ccrs.PlateCarree())

map5.add_feature(cf.LAND)
map5.add_feature(cf.OCEAN)
map5.add_feature(cf.COASTLINE)
map5.add_feature(cf.BORDERS, color='k') # you can also change color, linewidth etc
map5.add_feature(cf.LAKES, alpha=0.5)
map5.add_feature(cf.RIVERS)

map5.set_extent([87.5, 93, 20, 27]) # define the boundary
In [ ]:
# Do not worry about this cell, you will almost never need it
fig = plt.figure(figsize=(13, 14))
fig.suptitle('Different Types of Projections', fontsize=20, y=0.92)

projections = {'PlateCarree': ccrs.PlateCarree(), 'AlbersEqualArea': ccrs.AlbersEqualArea(),
               'AzimuthalEquidistant': ccrs.AzimuthalEquidistant(), 'EquidistantConic': ccrs.EquidistantConic(),
               'LambertConformal': ccrs.LambertConformal(), 'LambertCylindrical': ccrs.LambertCylindrical(),
               'Mercator': ccrs.Mercator(), 'Miller': ccrs.Miller(), 'Mollweide': ccrs.Mollweide(),
               'Orthographic': ccrs.Orthographic(), 'Robinson': ccrs.Robinson(), 'Sinusoidal': ccrs.Sinusoidal(),
               'Stereographic': ccrs.Stereographic(), 'TransverseMercator': ccrs.TransverseMercator(),
               'InterruptedGoodeHomolosine': ccrs.InterruptedGoodeHomolosine(),
               'RotatedPole': ccrs.RotatedPole(), 'OSGB': ccrs.OSGB(), 'EuroPP': ccrs.EuroPP(),
               'Geostationary': ccrs.Geostationary(), 'NearsidePerspective': ccrs.NearsidePerspective(),
               'EckertI': ccrs.EckertI(), 'EckertII': ccrs.EckertII(), 'EckertIII': ccrs.EckertIII(),
               'EckertIV': ccrs.EckertIV(), 'EckertV': ccrs.EckertV(), 'EckertVI': ccrs.EckertVI(),
               'Gnomonic': ccrs.Gnomonic(),
               'LambertAzimuthalEqualArea': ccrs.LambertAzimuthalEqualArea(),
               'NorthPolarStereo': ccrs.NorthPolarStereo(), 'OSNI': ccrs.OSNI(),
               'SouthPolarStereo': ccrs.SouthPolarStereo()}

for index, projection in enumerate(projections.items()):
    ax = fig.add_subplot(7, 5, index+1, projection=projection[1])
    ax.coastlines()
    ax.set_title(projection[0])
In [4]:
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
In [5]:
# Load the precipitation and temperature data from NetCDF files
data = xr.open_dataset("/content/drive/MyDrive/DataSets/ERA5_monthly_global_precipitation_temp.nc").load()
data
Out[5]:
<xarray.Dataset>
Dimensions:    (longitude: 1440, latitude: 721, time: 12)
Coordinates:
  * longitude  (longitude) float32 0.0 0.25 0.5 0.75 ... 359.0 359.2 359.5 359.8
  * latitude   (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
  * time       (time) datetime64[ns] 2022-01-01 2022-02-01 ... 2022-12-01
Data variables:
    t2m        (time, latitude, longitude) float32 247.1 247.1 ... 244.5 244.5
    tp         (time, latitude, longitude) float32 0.0004254 ... 0.0002506
Attributes:
    Conventions:  CF-1.6
    history:      2023-07-04 13:11:35 GMT by grib_to_netcdf-2.25.1: /opt/ecmw...
In [10]:
temperature =  data['t2m']
yearly_temp = temperature.mean(dim='time')

# Create a Cartopy Mercator projection
projection = ccrs.Robinson()

# Create the figure and axis
plt.figure(figsize=(16,8))
ax = plt.axes(projection=ccrs.Robinson())

# Add map features
ax.add_feature(cf.COASTLINE, linewidth=0.2)
ax.add_feature(cf.BORDERS, linewidth=0.2)
ax.add_feature(cf.RIVERS)
ax.gridlines(draw_labels=True, linewidth=0.3)

# Plot the contour map of annual precipitation
cs = ax.contourf(data['longitude'], data['latitude'], (yearly_temp - 273.15), np.linspace(-55, 55, 23),cmap='coolwarm' , transform=ccrs.PlateCarree())

# Add a colorbar
cbar = plt.colorbar(cs, ax=ax, orientation='vertical', pad=0.05)
cbar.set_label('$Temperature {^\circ}C$')

# Set the plot title
plt.title('Mean temperature of year 2022')

# Show the plot
plt.show()

3. Plotting Contourplot using Precipitation data

Now, its time to plot a real climate data with everything you learned earlier. For this, we use 2022 Precipitation data over Bangladesh and its sorrounding region.

importing libraries.

The following cell will first uninstall some libraries and then reinstall the specified version.

last two line will clear the output screen.

In [1]:
!pip uninstall -y shapely
!pip uninstall -y importlib-metadata
!pip install shapely==1.7.1 --no-binary shapely
!pip install importlib-metadata==4.13.0
!pip install cartopy==0.21
from google.colab import output
output.clear()

After installing new python libraries, runtime needs to be restared. You can do it maually or run the following cell to kill this runtime. Then run the next cell, that will automatically connect the runtime and run.

In [ ]:
import os
os.kill(os.getpid(), 9)
In [1]:
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cf
import calendar
import requests

Download and load data

We specify the URL in the file_url variable.

Next, we make a request to the URL using requests.get(). The response object contains the content of the file.

We then open a file in binary write mode ("wb") at the desired location ("/content/data.nc") and write the content of the response into it.

Finally, we use xr.open_dataset() from xarray to open the downloaded file and load it into the data variable. You can replace "/content/data.nc" with your desired file path.

After executing this code, we have the data loaded into the data variable, and proceed with further analysis or operations using xarray.

In [2]:
file_url = "https://drive.google.com/uc?id=1Q5mzA7aDhRA2baMLZ71eio-ZFHQ3mGNN&export=download"
file_path = "/content/data.nc"

response = requests.get(file_url)
with open(file_path, "wb") as f:
    f.write(response.content)

data = xr.open_dataset(file_path).load()
data
Out[2]:
<xarray.Dataset>
Dimensions:    (longitude: 29, latitude: 33, time: 8760)
Coordinates:
  * longitude  (longitude) float32 87.0 87.25 87.5 87.75 ... 93.5 93.75 94.0
  * latitude   (latitude) float32 28.0 27.75 27.5 27.25 ... 20.5 20.25 20.0
  * time       (time) datetime64[ns] 2022-01-01 ... 2022-12-31T23:00:00
Data variables:
    u10        (time, latitude, longitude) float32 0.8939 1.087 ... -1.246
    v10        (time, latitude, longitude) float32 -0.4852 -1.151 ... -0.8678
    tp         (time, latitude, longitude) float32 1.02e-05 2.856e-05 ... 0.0
Attributes:
    Conventions:  CF-1.6
    history:      2023-06-02 13:06:47 GMT by grib_to_netcdf-2.25.1: /opt/ecmw...

Seasonal spatial distribution.

  1. Extracting precipitation data for a whole year: In this step, the code extracts the precipitation data from the loaded dataset. It multiplies the 'tp' variable by 1000 to convert the unit from meters to millimeters. The longitude and latitude values are also extracted from the dataset.

  2. Converting hourly data to daily data: Here, the code resamples the precipitation data from hourly to daily frequency using the .resample() method with a time frequency of '1D'. It then calculates the sum of precipitation for each day using the .sum() method.

  3. Calculating the seasonal spatial distribution: Here, the code defines a list of season names (seasons) and a corresponding list of lists (season_months) containing the month indices for each season. The months are grouped as Winter (DJF: December, January, February), Pre-monsoon (MAM: March, April, May), Monsoon (JJA: June, July, August), and Post-monsoon (SON: September, October, November).

  4. Calculating the seasonal mean precipitation: This code snippet calculates the mean precipitation for each season. It iterates over the season_months list, selects the corresponding data from daily_precipitation using .sel() and the isin() method to match the months, and then calculates the mean along the time dimension (dim='time') using .mean(). The seasonal mean precipitation values are stored in the seasonal_means list.

These operations help extract the precipitation data for a whole year, convert it to daily data, and calculate the seasonal mean precipitation for each predefined season.

In [ ]:
# Extract the precipitation data for a whole year
precipitation = data['tp'] * 1000
lon = data['longitude']
lat = data['latitude']

# Convert hourly data to daily data
daily_precipitation = precipitation.resample(time='1D').sum()

# Seasonal Spatial Distribution
seasons = ['Winter (DJF)', 'Pre-monsoon (MAM)', 'Monsoon (JJA)', 'Post-monsoon (SON)']  # Winter, Spring, Summer, Autumn
season_months = [[12, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]

# Calculate the seasonal mean precipitation
seasonal_means = []
for months in season_months:
    seasonal_data = daily_precipitation.sel(time=daily_precipitation['time.month'].isin(months))
    seasonal_mean = seasonal_data.mean(dim='time')
    seasonal_means.append(seasonal_mean)

This code generates a 2x2 grid of subplots, where each subplot represents the seasonal spatial distribution of precipitation. Let's break it down step by step:

  1. Create the subplots: The plt.subplots() function is used to create a figure with a grid of subplots. The nrows=2, ncols=2 arguments specify a 2x2 grid, and figsize=(12, 12) sets the size of the figure. The subplot_kw parameter sets the projection to ccrs.PlateCarree(), which is a map projection for visualizing global data.

  2. Plot each seasonal mean precipitation: Using a loop, each subplot is accessed individually (ax). The contourf() function is used to plot the seasonal mean precipitation data (seasonal_means[i]) on each subplot. Additional features such as coastlines, borders, ocean, lakes, land, and rivers are added using add_feature(). The map extent is set using set_extent(min_lon, max_lon, min_lat, max_lat), and gridlines are drawn with gridlines().

  3. Add colorbar and subplot titles: A vertical colorbar is added to each subplot using colorbar(). Subplot titles (seasons[i]) are set using set_title(). The colorbar ticks and label font sizes are customized.

  4. Set the figure title and adjust layout: The overall figure title is set using suptitle(). The layout is adjusted using tight_layout() to optimize spacing between subplots.

  5. Display the plot: Finally, plt.show() is called to display the plot.

Executing this code will generate a figure with four subplots, each showing the seasonal spatial distribution of precipitation. The color of the contourf plot represents the precipitation amount, and the colorbar provides a reference for the precipitation values in millimeters per day.

In [ ]:
# Create the subplots
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(12, 12),
                        subplot_kw={'projection': ccrs.PlateCarree()})

# Plot each seasonal mean precipitation on the subplots
for i, ax in enumerate(axs.flat):
    cs = ax.contourf(lon, lat, seasonal_means[i], 30, extend='both', transform=ccrs.PlateCarree())
    ax.add_feature(cf.COASTLINE)
    ax.add_feature(cf.BORDERS, linewidth=1)
    ax.add_feature(cf.OCEAN)
    ax.add_feature(cf.LAKES)
    ax.add_feature(cf.LAND)
    ax.add_feature(cf.RIVERS)
    ax.set_extent([87, 94, 20, 28])
    bx = ax.gridlines(draw_labels=True, linewidth=0.2)
    ax.set_title(seasons[i], fontsize=12, pad=10, y=1.04, loc="left")

    # Add vertical colorbar to each subplot with a custom size
    cbar = plt.colorbar(cs, ax=ax, orientation='vertical', pad=0.1, shrink=0.8)
    cbar.ax.tick_params(labelsize=10)
    cbar.ax.set_title('mm/day', fontsize=10)

# Add a title to the figure
fig.suptitle('Seasonal Spatial Distribution of Precipitation', fontsize=20, fontweight='bold')

# Adjust the spacing between subplots and display the plot
plt.tight_layout()

plt.show()
/usr/local/lib/python3.10/dist-packages/cartopy/io/__init__.py:241: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/10m_physical/ne_10m_ocean.zip
  warnings.warn(f'Downloading: {url}', DownloadWarning)
/usr/local/lib/python3.10/dist-packages/cartopy/io/__init__.py:241: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/10m_physical/ne_10m_land.zip
  warnings.warn(f'Downloading: {url}', DownloadWarning)
/usr/local/lib/python3.10/dist-packages/cartopy/io/__init__.py:241: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/10m_physical/ne_10m_coastline.zip
  warnings.warn(f'Downloading: {url}', DownloadWarning)
/usr/local/lib/python3.10/dist-packages/cartopy/io/__init__.py:241: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/10m_cultural/ne_10m_admin_0_boundary_lines_land.zip
  warnings.warn(f'Downloading: {url}', DownloadWarning)
/usr/local/lib/python3.10/dist-packages/cartopy/io/__init__.py:241: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/10m_physical/ne_10m_lakes.zip
  warnings.warn(f'Downloading: {url}', DownloadWarning)
/usr/local/lib/python3.10/dist-packages/cartopy/io/__init__.py:241: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/10m_physical/ne_10m_rivers_lake_centerlines.zip
  warnings.warn(f'Downloading: {url}', DownloadWarning)

Temporal distribution plot

This code calculates the monthly total precipitation, calculates the mean precipitation for each month, and plots the temporal distribution of precipitation. Let's break it down step by step:

  1. Calculate the monthly total precipitation: The resample() method with time='1M' is used to resample the daily precipitation data (daily_precipitation) to monthly frequency. The sum(dim='time') method then calculates the sum of precipitation for each month. The result is stored in the monthly_precipitation variable.

  2. Calculate the mean precipitation for each month: Using NumPy, the np.mean() function is applied twice to calculate the mean precipitation. The first mean() call computes the mean along the latitude axis (axis=2), and the second mean() call computes the mean along the longitude axis (axis=1). The result is stored in the mean_monthly_precipitation variable.

  3. Get the month labels: A list comprehension is used to extract the month labels from the monthly_precipitation['time'] values. Each value is converted to a string, split at the hyphen ('-'), and the second part (the month) is selected. The result is stored in the months variable.

  4. Plot the temporal distribution of precipitation: A figure with a size of (10, 6) is created using plt.figure(figsize=(10, 6)). The plot() function is then used to plot the mean monthly precipitation (mean_monthly_precipitation) against the month labels (months). The marker style is set to 'o', the linestyle is set to '-', and the color is set to 'blue'. The x-axis label is set to 'Month', the y-axis label is set to 'Precipitation (mm)', and the title is set to 'Monthly Temporal Distribution of Precipitation'. Gridlines are enabled using plt.grid(True), and the x-axis tick labels are rotated by 45 degrees using plt.xticks(rotation=45). The layout is adjusted using plt.tight_layout().

Executing this code will calculate the monthly total precipitation, compute the mean precipitation for each month, and plot the temporal distribution of precipitation. The resulting plot will show the variation of precipitation over the months, providing insights into the seasonal patterns of precipitation.

In [ ]:
# Calculate the monthly total precipitation
monthly_precipitation = daily_precipitation.resample(time='1M').sum(dim='time')

#Calculate the mean precipitation for each month
mean_monthly_precipitation = np.mean(np.mean(monthly_precipitation, axis=2), axis=1)

# Get the month labels
months = [str(month).split('-')[1] for month in monthly_precipitation['time'].values]

# Plot the temporal distribution of precipitation
plt.figure(figsize=(10, 6))
plt.plot(months, mean_monthly_precipitation, marker='o', linestyle='-', color='blue')
plt.xlabel('Month')
plt.ylabel('Precipitation (mm)')
plt.title('Monthly Temporal Distribution of Precipitation')
plt.grid(True)
plt.xticks(rotation=45)
plt.show()

4. Choosing colorbars

Colormaps are extremely useful tool for plotting scientic data.They are used to make visualization look more appealing as well as self explanatory.

Classes of colormaps

Sequential

  • Sequential color maps are used to represent data that varies continuously from low to high values.
  • They are suitable for visualizing data such as temperature, elevation, or concentration, where there is a clear ordering and progression from one end of the color map to the other.
  • Examples of sequential color maps include "viridis", "plasma", "inferno", and "cividis".
  • These color maps typically have a perceptually uniform progression of colors, ensuring that the visual changes are consistent and proportional to the data values. *Sequential color maps are often used in heatmaps, contour plots, and line plots to represent the magnitude or intensity of a variable.

Example
In [ ]:
import matplotlib.pyplot as plt
import numpy as np

# Generate random data
x = np.linspace(0, 10, 100)
y = np.sin(x)

# Create a sequential color map
cmap = 'viridis'

# Plot the data using the color map
plt.scatter(x, y, c=y, cmap=cmap)

# Add a color bar
plt.colorbar()

# Set title and labels
plt.title('Sequential Color Map Example')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')

# Show the plot
plt.show()

Diverging Color Maps

  • Diverging color maps are used to represent data that has a critical midpoint or a point of interest from which values diverge in both positive and negative directions.
  • They are suitable for visualizing data such as temperature anomalies, differences from a reference value, or any data where positive and negative deviations are meaningful.
  • Diverging color maps have distinct colors for positive and negative values, with a neutral color representing the midpoint or reference value.
  • Examples of diverging color maps include "RdBu", "coolwarm", and "seismic".
  • Diverging color maps are commonly used in heatmaps, contour plots, and scatter plots to highlight deviations from a baseline or central value.

Example
In [ ]:
import matplotlib.pyplot as plt
import numpy as np

# Generate random data
x = np.linspace(-5, 5, 100)
y = np.linspace(-5, 5, 100)
X, Y = np.meshgrid(x, y)
Z = np.sin(X) * np.cos(Y)

# Create a diverging color map
cmap = 'RdBu'

# Plot the data using the color map
plt.contourf(X, Y, Z, cmap=cmap)

# Add a color bar
plt.colorbar()

# Set title and labels
plt.title('Diverging Color Map Example')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')

# Show the plot
plt.show()

Qualitative Color Maps:

  • Qualitative color maps are used to represent categorical or discrete data without an inherent order or progression.
  • They are suitable for visualizing data such as different classes or categories, where the color assignment does not convey any quantitative information.
  • Qualitative color maps have a set of distinct colors with no specific ordering. Examples of qualitative color maps include "tab10", "tab20", and "Set3".
  • Qualitative color maps are often used in bar charts, pie charts, and scatter plots to differentiate different groups or categories.

Example
In [ ]:
import matplotlib.pyplot as plt
import numpy as np

# Generate random data
x = np.linspace(0, 10, 100)
y = np.sin(x)

# Create a qualitative color map
cmap = 'Set3'

# Plot the data using the color map
plt.scatter(x, y, c=y, cmap=cmap)

# Add a color bar
plt.colorbar()

# Set title and labels
plt.title('Qualitative Color Map Example')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')

# Show the plot
plt.show()

5. Wind direction (u and v)

In [ ]:
file_url = "https://drive.google.com/uc?id=1Q5mzA7aDhRA2baMLZ71eio-ZFHQ3mGNN&export=download"
file_path = "/content/data.nc"

response = requests.get(file_url)
with open(file_path, "wb") as f:
    f.write(response.content)

data = xr.open_dataset(file_path).load()
data
Out[ ]:
<xarray.Dataset>
Dimensions:    (longitude: 29, latitude: 33, time: 8760)
Coordinates:
  * longitude  (longitude) float32 87.0 87.25 87.5 87.75 ... 93.5 93.75 94.0
  * latitude   (latitude) float32 28.0 27.75 27.5 27.25 ... 20.5 20.25 20.0
  * time       (time) datetime64[ns] 2022-01-01 ... 2022-12-31T23:00:00
Data variables:
    u10        (time, latitude, longitude) float32 0.8939 1.087 ... -1.246
    v10        (time, latitude, longitude) float32 -0.4852 -1.151 ... -0.8678
    tp         (time, latitude, longitude) float32 1.02e-05 2.856e-05 ... 0.0
Attributes:
    Conventions:  CF-1.6
    history:      2023-06-02 13:06:47 GMT by grib_to_netcdf-2.25.1: /opt/ecmw...

This following cell calculates the seasonal mean U and V wind components by converting the original hourly data to daily resolution and then averaging them over the specified seasons. The resulting seasonal means are stored in seasonal_means_u and seasonal_means_v lists.

  1. Extracting U and V wind components for a whole year:

    • The code assumes that you have a netCDF dataset (data) containing the U and V wind components.
    • data['u10'] and data['v10'] access the U and V wind component variables, respectively, assuming they are named 'u10' and 'v10' in the dataset.
    • lon and lat variables represent the longitude and latitude coordinates, respectively.
  2. Converting hourly data to daily data:

    • The U and V wind component data are originally in hourly resolution.
    • u_component.resample(time='1D').mean() aggregates the U wind component data into daily resolution by resampling and taking the mean value for each day.
    • v_component.resample(time='1D').mean() does the same for the V wind component data.
    • The resulting u_daily and v_daily variables hold the daily averaged U and V wind component data, respectively.
  3. Defining seasons and corresponding months:

    • The code defines a list of seasons: Winter (DJF), Pre-monsoon (MAM), Monsoon (JJA), and Post-monsoon (SON).
    • season_months is a list of lists, where each inner list contains the month indices corresponding to a specific season.
  4. Calculating the seasonal mean wind components:

    • The code iterates over each season defined in season_months.
    • For each season, it selects the relevant months' data from u_daily and v_daily using u_daily['time.month'].isin(months) and v_daily['time.month'].isin(months), respectively.
    • The selected data is stored in seasonal_data_u and seasonal_data_v variables.
    • seasonal_mean_u and seasonal_mean_v are computed by taking the mean along the time dimension (dim='time') for the seasonal data.
    • The seasonal mean U and V wind components are appended to the seasonal_means_u and seasonal_means_v lists, respectively.
In [ ]:
# Extract the U and V wind components for a whole year
u_component = data['u10']
v_component = data['v10']
lon = data['longitude']
lat = data['latitude']

# Convert hourly data to daily data
u_daily = u_component.resample(time='1D').mean()
v_daily = v_component.resample(time='1D').mean()

# Seasonal Spatial Distribution
seasons = ['Winter (DJF)', 'Pre-monsoon (MAM)', 'Monsoon (JJA)', 'Post-monsoon (SON)']  # Winter, Spring, Summer, Autumn
season_months = [[12, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]

# Calculate the seasonal mean wind components
seasonal_means_u = []
seasonal_means_v = []
for months in season_months:
    seasonal_data_u = u_daily.sel(time=u_daily['time.month'].isin(months))
    seasonal_data_v = v_daily.sel(time=v_daily['time.month'].isin(months))
    seasonal_mean_u = seasonal_data_u.mean(dim='time')
    seasonal_mean_v = seasonal_data_v.mean(dim='time')
    seasonal_means_u.append(seasonal_mean_u)
    seasonal_means_v.append(seasonal_mean_v)

This code creates a 2x2 grid of subplots, where each subplot represents the seasonal spatial distribution of the wind components. Wind vectors are plotted using quiver plots, and a quiver key is added to indicate the wind speed scale. Various cartopy features are added to enhance the visual representation.

  1. Creating subplots:

    • The code creates a figure with 2 rows and 2 columns using plt.subplots(nrows=2, ncols=2, figsize=(12, 12), subplot_kw={'projection': ccrs.PlateCarree()}).
    • The figsize parameter sets the size of the figure, and subplot_kw sets the projection to ccrs.PlateCarree(), which indicates a Plate Carrée projection for the subplots.
  2. Plotting each seasonal mean wind component:

    • The code iterates over the subplots using for i, ax in enumerate(axs.flat).
    • u_wind and v_wind retrieve the seasonal mean U and V wind components for the current subplot index (i).
    • The wind speed (speed) is computed by taking the square root of the squared U and V wind components.
    • The wind vectors are plotted using ax.quiver(lon, lat, u_wind, v_wind, speed, transform=ccrs.PlateCarree(), cmap='cool', width=0.003), creating a quiver plot of wind vectors on the current subplot.
    • Various cartopy features such as coastlines, borders, ocean, lakes, land, and rivers are added to the plot using ax.add_feature().
    • ax.set_extent([87, 94, 20, 28]) sets the geographic extent of the plot.
    • Gridlines are added to the plot using ax.gridlines().
    • The title of the subplot is set using ax.set_title() with the corresponding season name.
    • A quiver key is added to the plot using ax.quiverkey() to indicate the scale of the wind vectors.
  3. Adding a title and adjusting layout:

    • The overall title of the figure is set using fig.suptitle('Seasonal Spatial Distribution of Wind Components', fontsize=20, fontweight='bold').
    • The spacing between subplots is adjusted using plt.tight_layout() to improve the overall layout.
  4. Displaying the plot:

    • Finally, the plot is displayed using plt.show().
In [ ]:
# Create the subplots
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(12, 12),
                        subplot_kw={'projection': ccrs.PlateCarree()})

# Plot each seasonal mean wind component on the subplots
for i, ax in enumerate(axs.flat):
    u_wind = seasonal_means_u[i]
    v_wind = seasonal_means_v[i]
    speed = np.sqrt(u_wind**2 + v_wind**2)

    # Quiver plot of wind vectors
    q = ax.quiver(lon, lat, u_wind, v_wind, speed, transform=ccrs.PlateCarree(), cmap='cool', width=0.003)

    ax.add_feature(cf.COASTLINE)
    ax.add_feature(cf.BORDERS, linewidth=1)
    ax.add_feature(cf.OCEAN)
    ax.add_feature(cf.LAKES)
    ax.add_feature(cf.LAND)
    ax.add_feature(cf.RIVERS)
    ax.set_extent([87, 94, 20, 28])
    bx = ax.gridlines(draw_labels=True, linewidth=0.2)
    ax.set_title(seasons[i], fontsize=12, pad=10, y=1.04, loc="left")

    # Add quiver key to each subplot
    quiver_key = ax.quiverkey(q, 0.9, 1.06, 5, r'5 ${ms^{-1}}$', labelpos='E', coordinates='axes')

# Add a title to the figure
fig.suptitle('Seasonal Spatial Distribution of Wind Components', fontsize=20, fontweight='bold')

# Adjust the spacing between subplots and display the plot
plt.tight_layout()
plt.show()

7. Color maps in Matplotlib and creating customized color bar

8. Interactive plotting