Geospatial analysis using Python

In this tutorial, we will learn what geospatial analysis is, where it is applied, and which key Python libraries support it. We will also walk through a geospatial analysis of real data using one of those libraries.

What is geospatial analysis?

Geospatial analysis is analysis performed on geospatial data. Geospatial data, also called spatial data, describes the geographic location of a country, state, or any other place on the Earth, typically in terms of longitude and latitude. With geospatial analysis we can locate a particular area and calculate its size, length, or other attributes.

Applications of geospatial analysis

  • Natural resource allocation and management
  • Agricultural management
  • Urban development
  • Transportation
  • Military management
  • Study of climate, crime, business, real estate, etc.

Which libraries do we use for geospatial analysis in Python?

  • Geopandas: It extends the pandas library. It provides the GeoDataFrame, the geospatial counterpart of the pandas DataFrame, which simplifies working with geospatial data.
  • Fiona: It reads and writes geospatial files. We can write geospatial data in formats such as GeoJSON and shapefiles.
  • Shapely: It lets us perform geometric operations on geospatial data. We can create and modify points, lines, polygons, and other geometries.
  • Folium: It creates interactive maps. We can add markers and visualize different locations on a map using Folium.
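As a rough illustration of how two of these libraries fit together, the sketch below builds a few Shapely geometries and stores them in a GeoDataFrame. The coordinates, the `square` polygon, and the `cities` table are made-up sample values, not data from this tutorial.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Shapely: build geometries and run geometric operations on them.
square = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])  # hypothetical 2 x 2 square
square_area = square.area                # area in squared coordinate units
inside = square.contains(Point(1, 1))    # is the point inside the square?

# Geopandas: keep geometries next to ordinary columns in a GeoDataFrame.
# Coordinates are (longitude, latitude); the values are illustrative only.
cities = gpd.GeoDataFrame(
    {"NAME": ["Berlin", "Paris"]},
    geometry=[Point(13.405, 52.52), Point(2.3522, 48.8566)],
    crs="EPSG:4326",  # WGS 84 longitude/latitude
)
print(cities)
```

The `geometry` column is what distinguishes a GeoDataFrame from a plain pandas DataFrame; everything else behaves like pandas.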

Which is the most widely used library for geospatial analysis?

Geopandas is the most widely used of these libraries for analysing geospatial data. It is often combined with Shapely, Fiona, and others.

How to perform geospatial analysis using Python?

First, we need to install the Geopandas library, which is widely used for geospatial analysis.

We can install it with pip using the following command:

pip install geopandas

This will install the geopandas library.

Now we will import the required libraries:

import geopandas as gpd
import warnings
warnings.filterwarnings("ignore")

In the above code, geopandas is used to read and analyse geospatial data. The warnings module is imported so that warning messages can be suppressed, which keeps the output clean. Note that this hides every warning, including ones that may point to real problems.

To import the geospatial data and display the first five rows:

df = gpd.read_file(r'C:\Users\DELL\Desktop\world.shp')
df.head(5)

In the above code, we read a .shp file (a shapefile) from the local system. A shapefile is a file format used to store geographic information.

Output:

Geospatial analysis using Python

To get a basic idea about the data:

df.info()

This lists the column names, their data types, and the number of non-null values in each column.

Output:

Geospatial analysis

To get unique names in the NAME column:

df.NAME.unique()

This will give unique names in the NAME column.
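The same inspection calls can be tried on a small stand-in table. This sketch assumes a made-up GeoDataFrame named `sample` with a repeated name, so that the effect of `unique()` is visible.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical sample with a duplicated name in the NAME column.
sample = gpd.GeoDataFrame(
    {"NAME": ["Germany", "India", "Germany"]},
    geometry=[Point(10.45, 51.17), Point(78.96, 20.59), Point(13.40, 52.52)],
)

sample.info()                                # column names, dtypes, non-null counts
unique_names = sample["NAME"].unique()       # each name once, in order of appearance
name_counts = sample["NAME"].value_counts()  # how often each name occurs

print(unique_names)
print(name_counts)
```

Bracket access (`sample["NAME"]`) is equivalent to attribute access (`sample.NAME`) here, and it also works for column names that contain spaces or clash with method names.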

Output:

geopandas

Performing geospatial visualizations

To plot all the locations in the file:

df.plot()

This will plot all the locations in the given data.

Output:

perform geospatial visualizations

Suppose we want to leave a particular area out of the plot. We can filter it out using the following commands:

df = df[df['NAME'] != 'India']
df.plot()

This plots the map without India. Because the filtered result is assigned back to df, India is also absent from every later step.

Output:

perform geospatial visualization

To visualize a particular country:

df[df.NAME=="Germany"].plot()

This plots only Germany.
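Both filters rely on boolean masks, exactly as in pandas. The sketch below shows the idea on a made-up three-row table (`df_demo` and its points are hypothetical) and stores each result under a new name so the original table stays intact.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical stand-in for the world data.
df_demo = gpd.GeoDataFrame(
    {"NAME": ["India", "Germany", "France"]},
    geometry=[Point(78.96, 20.59), Point(10.45, 51.17), Point(2.21, 46.23)],
)

# Keep every row except India.
without_india = df_demo[df_demo["NAME"] != "India"]

# Keep only Germany.
only_germany = df_demo[df_demo["NAME"] == "Germany"]

print(without_india)
print(only_germany)
```

Calling `.plot()` on either result draws just the selected rows. Using separate variable names avoids permanently dropping rows from the original table.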

Output:

germany geospatial

To select particular columns from the data:

df = df[['NAME', 'geometry']]
df.tail(3)

In the above code, we keep only the NAME and geometry columns; the geometry column holds the shape of each country. The code then displays the last 3 rows of the new data frame.

Output:

country based on the geometry of the country

To create a new column named Country_area:

df['Country_area'] = df.area
df.head(3)

This creates a new column named Country_area that holds the area of each country. We use the .area attribute, which computes the area from each country's geometry. The result is expressed in the squared units of the data's coordinate reference system.
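Because the area depends on the coordinate units, data stored as longitude/latitude degrees is usually reprojected before measuring. The following is a sketch only: it assumes a made-up one-degree cell at the equator and uses EPSG:6933, an equal-area projection in metres, as one possible choice. The tutorial's own shapefile may already use a different coordinate reference system.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical cell spanning 0-1 degrees longitude and 0-1 degrees latitude.
cell = gpd.GeoDataFrame(
    {"NAME": ["one-degree cell"]},
    geometry=[box(0, 0, 1, 1)],
    crs="EPSG:4326",  # longitude/latitude in degrees
)

# Reproject to an equal-area CRS whose units are metres (an assumed choice).
cell_equal_area = cell.to_crs("EPSG:6933")

# .area is now in square metres; divide by 1,000,000 for square kilometres.
cell_equal_area["Country_area"] = cell_equal_area.area / 1_000_000
area_km2 = cell_equal_area["Country_area"].iloc[0]
print(area_km2)
```

Checking `df.crs` on the loaded data shows which units `.area` will return before any conversion is applied.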

Output:

calculate the area of each country in Python

To draw a map with colors:

df.plot(column = 'NAME', cmap = 'hsv')

In the above code, we color the map by the values in the NAME column, so each country gets its own color. The cmap='hsv' parameter selects the colormap; 'hsv' is based on the Hue, Saturation, Value color model and cycles through the hues.

Output:

each country in Python with color maps

To visualize the countries according to their size:

df['Country_area'] = df.Country_area/1000000
df.plot(column = 'Country_area' , cmap = 'hsv' , legend = True, legend_kwds = {'label': "Area of the country (in Square Kilometer)"}, figsize = (10,10))

In the above code, we first convert the area to square kilometres and then plot it. Dividing by 1,000,000 turns square metres into square kilometres, so this step assumes the coordinate units of the data are metres. The figsize parameter sets the output figure size to (10,10). Setting legend=True adds a color bar beside the map that shows which area value each color represents, and legend_kwds sets its label.
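The same plotting call can be tested without the world data. This sketch assumes three made-up rectangles in place of countries and selects Matplotlib's non-interactive "Agg" backend so it runs without a display; both are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook

import geopandas as gpd
from shapely.geometry import box

# Hypothetical regions with areas of 1, 4, and 9 square units.
regions = gpd.GeoDataFrame(
    {"NAME": ["A", "B", "C"]},
    geometry=[box(0, 0, 1, 1), box(2, 0, 4, 2), box(5, 0, 8, 3)],
)
regions["Country_area"] = regions.area

# Color each region by its area and add a labelled color bar.
ax = regions.plot(
    column="Country_area",
    cmap="hsv",
    legend=True,
    legend_kwds={"label": "Area (hypothetical units)"},
    figsize=(10, 10),
)
fig = ax.get_figure()
fig.savefig("regions_by_area.png")  # hypothetical output file name
```

For a numeric column, a sequential colormap such as 'viridis' is often easier to read than 'hsv', because 'hsv' starts and ends on similar red hues.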

Output:

a map with a sidebar that indicates color values in Python

In this article, we learned basic visualization techniques using the Geopandas library. More complex analysis of geospatial data is possible with libraries such as Shapely, Folium, and Fiona.
