Instead of manually selecting and submitting multiple request you can use cdsapi.
Prerequisite:
cdsapi.Client()
and the error message will tell you the location)#Install the CDS API client
!pip install cdsapi
import cdsapi
Sending request
The following cell sends requests to download the variables "relative_humidity, specific_humidity, u_component_of_wind, v_component_of_wind", at three pressure levels (150hPa, 500hPa, 850hPa) of all 24 hours everyday of 1981 to 1984 from the region at longitude 79 to 101 and latitude 25 to 0 in netcdf format.
#Example:
#you can also get the templet code from "Show API request" option after selecting data on your desired dataset
#remove download.nc from the templet. Also for the loop to work I replaced year value with loop counter.
request_id = [] #Each request has an unique id to be used later when dowloading
for i in range(1981,1985):
year = str(i)
c = cdsapi.Client(wait_until_complete=False, delete=False)
r = c.retrieve(
'reanalysis-era5-pressure-levels', #Name of the dataset
{
'product_type': 'reanalysis', #All the desired variables and coordinates
'variable': [
'relative_humidity', 'specific_humidity',
'u_component_of_wind', 'v_component_of_wind',
],
'pressure_level': [
'150', '500', '850',
],
'year': year,
'month': ['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12'],
'day': [
'01', '02', '03',
'04', '05', '06',
'07', '08', '09',
'10', '11', '12',
'13', '14', '15',
'16', '17', '18',
'19', '20', '21',
'22', '23', '24',
'25', '26', '27',
'28', '29', '30',
'31',
],
'time': [
'00:00', '01:00', '02:00',
'03:00', '04:00', '05:00',
'06:00', '07:00', '08:00',
'09:00', '10:00', '11:00',
'12:00', '13:00', '14:00',
'15:00', '16:00', '17:00',
'18:00', '19:00', '20:00',
'21:00', '22:00', '23:00',
],
'area': [ #[North, west, south, east]
25, 79, 0,
101,
],
'format': 'netcdf',
},)
request_id.append(r.reply['request_id']) #Storing request ID's
Saving Request ID's(Important)
Request takes may take long time to complete(Check here to see your request status) To avoid losing request ID's copy and paste them to another cell so they are saved even if you delete runtime.
request_id
request_id = ['fa0989d9-cea1-4a99-a842-9043912b1398',
'91b407ec-7986-4c25-861d-1e1ca108b62b',
'12338118-7efa-4138-b3ee-6d08cca25c0e',
'ec54ffd4-ba53-4c57-88e0-4a1f961d24aa']
Downloading data
At a later time when the requests are complete the following cell will sequencially download the files
#At a later time when requests are completed
for i in range(len(request_id)): # request_id from above
reply = dict(request_id=request_id[i])
new_client = cdsapi.Client(wait_until_complete=False, delete=False)
result = cdsapi.api.Result(new_client, reply)
result.update()
reply = result.reply
if reply['state'] == 'completed':
result.download('/content/drive/MyDrive/data/'+str(1981+i) +'.nc') #Specifying download location and file name ##netcdf file extention .nc
Interpolation is a technique of estimating the value of a function at a point that is not directly or actually measured by using the konwn data points.
In case of climate physics, interpolation is used to estimate the value of a climate variable such as temperature or precipitation at a location that is not actually measured.This is done by using the values of the climate variable at nearby locations that are known.
Here we'll create a dummy dataarray data
which has temperature values over a time axis.The time
array represents the time coordinate values and the temperature
array contains corresponding temperature values.We suppose this is an observed data array.
Then we define a new time array,new_time
where we want to know the values of the temperatures.But we don't have any values at these new_time
points in the observed data array.This is the stage where we use interpolation to know the temperature values at these new_time
values.
syntax of interpolation:
DataArray.interp(coords=None, method='linear', assume_sorted=False, kwargs=None, **coords_kwargs)
It has the following arguments:
coords
: Mapping from dimension names to the new coordinates. New coordinate can be a scalar, array-like or DataArray. If DataArrays are passed as new coordinates, their dimensions are used for the broadcasting. Missing values are skipped.method
: A string indicating which method to use for interpolation. The following methods are supported:linear
: Linear interpolation.nearest
: Nearest-neighbor interpolation.zero
: Interpolates all values to zero.slinear
: 2nd-order spline interpolation with first derivatives at the endpoints set to zero.quadratic
: 3rd-order spline interpolation with second derivatives at the endpoints set to zero.cubic
: 4th-order spline interpolation with third derivatives at the endpoints set to zero.assume_sorted
: A boolean indicating whether the new coordinates are assumed to be sorted. If True, the interpolation is performed more efficiently.kwargs
: Additional keyword arguments are passed to the interpolation routine.coords_kwargs
: Keyword arguments for the DataArray.sel() method used to select the values to interpolate.The DataArray.interp()
method returns a new DataArray with the interpolated values.
import numpy as np
import xarray as xr
# Create a sample DataArray with temperature values
time = np.arange(0, 5)
temperature = np.array([20, 25, 22, 18, 15])
data = xr.DataArray(temperature, coords={'time': time}, dims=['time'])
# Define new time values for interpolation
new_time = np.arange(0, 4.5, 0.5)
# Interpolate the temperature DataArray to the new time values using linear interpolation
interpolated_data = data.interp(coords={'time': new_time}, method='linear')
# Print the original and interpolated DataArrays
print("Original DataArray:")
print(data)
print("\nInterpolated DataArray:")
print(interpolated_data)
Important
If we carefully look at the time of original data array and time of the interpolated data array we will see both data range from o to 4, but the intervals are diffrerent.
But it is allowed to have the new time values extend beyond the range of the original time values when performing interpolation. In Xarray, the DataArray.interp()
method can handle extrapolation, which means it can estimate values outside the range of the original data based on the provided interpolation method.
For example, if you have original time values ranging from 0 to 4 and we define new time values from 0 to 8, the DataArray.interp()
method will still perform the interpolation using the defined interpolation method (e.g., linear) for the new time values within the original range (0 to 4), and then it will extrapolate values for the new time values beyond the original range (4 to 8).
However, it's important to note that extrapolation can be less reliable.
import matplotlib.pyplot as plt
# Create a figure with two subplots
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# Plot the original data in the first subplot
ax1.plot(data['time'], da, 'bo-', label='Original')
ax1.set_xlabel('Time')
ax1.set_ylabel('Temperature')
ax1.set_title('Originally observed Temperature')
# Plot the interpolated data in the second subplot
ax2.plot(interpolated_data['time'], interpolated_data, 'r.-', label='Interpolated')
ax2.set_xlabel('Time')
ax2.set_ylabel('Temperature')
ax2.set_title('Interpolated Temperature')
ax2.legend()
# Adjust the spacing between subplots
plt.tight_layout()
# Display the plot
plt.show()
# Spatial interpolation add Global
# Shift griding 0-360 and -180 to +180 LON SHift
# Run this cell when you are in colab and then restart runtime
# Do not worry about this cell, you can always copy paste
!apt-get install libproj-dev proj-data proj-bin
!apt-get install libgeos-dev
!pip install cython
!pip install cartopy
!apt-get -qq install python-cartopy python3-cartopy
!pip uninstall -y shapely # cartopy and shapely aren't friends (early 2020)
!pip install shapely --no-binary shapely
# Let's download data by another method. This could take couple of minutes.
!wget https://downloads.psl.noaa.gov/Datasets/noaa.oisst.v2.highres/sst.mon.mean.nc
# Let's install packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import xarray as xr
import cartopy.crs as ccrs
# Read the data using xarray
data = xr.open_dataset('/content/sst.mon.mean.nc').sel(time=slice('1982','2022'))
data
data.sst[0,:,:].plot()
# Let's explore our original lat and lon value
number_of_lon = len(data.lon.values)
number_of_lat = len(data.lat.values)
original_lon, original_lat = [], []
for i in range(number_of_lon):
original_lon.append(data.lon.values[i])
for j in range(number_of_lat):
original_lat.append(data.lat.values[j])
print('Original Longitude')
print(original_lon[0:3], original_lon[-4:-1])
print('logitudinal resolution: ', original_lon[1]-original_lon[0])
print('----------------------------------------------')
print('Original Latitude')
print(original_lat[0:3], original_lat[-4:-1])
print('latitudinal resolution: ', original_lat[1]-original_lat[0])
Now we will change the resolution from 0.25*0.25
to 4*4
.
Suggestion
# Extract variable to interpolate
var = data['sst']
# Define new grid to interpolate to
lon_new = np.arange(0, 361, 4)
lat_new = np.arange(-90, 91, 4)
# Create new dataset with new grid
data_new = xr.Dataset({'lon': (['lon'], lon_new), 'lat': (['lat'], lat_new)})
# Interpolate variable to new grid
var_new = var.interp(lon=lon_new, lat=lat_new)
# Add interpolated variable to new dataset
data_new['sst'] = var_new
data_new
data_new.sst[0,:,:].plot()
plt.figure(figsize=(10, 6))
plt.contourf(lon_new, lat_new, data_new.sst[0,:,:],levels = np.arange(-30,31),cmap='YlOrRd')
plt.colorbar(label='SST')
plt.title('Sea Surface Temperature')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.show()
plt.figure(figsize=(10, 6))
plt.contourf(data.lon, data.lat, data.sst[0,:,:] ,levels = np.arange(-50,50, 2),cmap='YlOrRd')
plt.colorbar(label='SST')
plt.title('Sea Surface Temperature')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.show()
levels = np.arange(-5, 36, 5)
fig = plt.figure(figsize=(10,10))
dataset = [data, data_new]
title = ['SST Original 0.25*0.25', 'SST Interpolated 4*4' ]
for i in range(1, 3):
ax = plt.subplot(2, 1, i)
plt.contourf(dataset[i-1].lon, dataset[i-1].lat, dataset[i-1].sst[0,:,:], levels=levels, cmap='YlOrRd')
plt.title(title[i-1])
#colorbar_setting
plt.subplots_adjust(bottom=0.1, right=0.8, top=0.9)
cax = plt.axes([0.84, 0.1, 0.025, 0.8]) #plt.axes([x_position,y_position,width,height])
plt.colorbar(cax=cax)
# Need suggestions on grid shifting
-- 0--360
-- -180 -- +180
# google