It takes time to process a large amount of spatial data with GIS alone </ b>, and I felt the need for code processing in order to proceed with my research. Also, the plugins that exist in QGIS do not do enough network analysis and graphic simplification, and Python's rich library is an effective tool.
This time, I will use Automating GIS processes provided by the University of Helsinki as a study material. It consists of 7 weeks, and you can learn especially about visualization using GeoPnadas and matplotlib. In addition, homework (practice questions) for students is given each week, and this article mainly provides examples of answers. Part 1: Week1-Week2 Part 2: Week3-Week4 Part 3: Week5-Week6 (Week7 has no issues) Part 4 (under construction): Final Assignment
The following is a part of the contents of the weekly teaching materials.
week | Contents |
---|---|
Week1 | point,line,Drawing polygons |
Week2 | Introducing Geopandas,CRS settings,Distance calculation |
Week3 | Spatial coupling,Neighborhood analysis |
Week4 | Spatial coupling 2,Data tiering |
Week5 | Static / dynamic map drawing, |
Week6 | OSM analysis,Network analysis |
Week7 | QGIS plugin |
Week 1
As a condition of the problem, some requirements are required for the function to function properly.
from shapely.geometry import Point, LineString, Polygon
#point
def create_point_geom(x, y):
point = Point(x, y)
return point
#Takes a line list as an argument
def create_line_geom(points):
assert type(points)=="list", "Input should be a list!"
assert len(points)>=2, "LineString object requires at least two Points!"
line = LineString([points[0], points[1]])
return line
#Takes a polygon list as an argument
def create_poly_geom(coords):
assert type(coords) is list, "Input should be a list!"
assert len(coords)>=3, "Polygon object requires at least three Points!"
for i in coords:
assert type(i) is tuple, "All list values should be coordinate tuples!"
poly = Polygon(coords)
return poly
The assert statement is not essential, so some parts are omitted.
#Center of gravity argument is point or line or polygon
def get_centroid(gem):
assert type(gem) == 'shapely', "Input should be a Shapely geometry!"
assert gem.geom_type in ['Point', 'LineString', 'Polygon'], "Input should be a Shapely geometry!"
centroid = gem.centroid
return centroid
#area
def get_area(poly):
return poly.area
#distance
def get_length(geom):
if geom.geom_type == 'LineString':
return geom.length
elif geom.geom_type == 'Polygon':
return geom.exterior.length
This is a simple data processing exercise using Helsinki's mobile data. from_x, from_y indicate the starting point, to_x, to_y indicate the arrival point, and the goal is to obtain the total distance traveled from these data.
import pandas as pd
#Data read
data = pd.read_table("data/travelTimes_2015_Helsinki.txt", sep=";",)
data = data[['from_x','from_y', 'to_x', 'to_y', 'total_route_time',]]
#Creating coordinate data
orig_points = []
dest_points = []
from shapely.geometry import Point
for index, row in data.iterrows():
orig = Point(row['from_x'], row['from_y'])
dest = Point(row['to_x'], row['to_y'])
orig_points.append(orig)
dest_points.append(dest)
#Creating a moving line
from shapely.geometry import LineString
lines = []
for orig, dest in zip(orig_points, dest_points):
line = LineString([orig, dest])
lines.append(line)
#Obtaining the total distance traveled
total_length = 0
for line in lines:
total_length += line.length
Week2 Learn a library called Geopandas, an extension of Pandas. Like Pandas, it's very easy to use and space manipulation is easy.
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import Polygon
#Coordinate data to use
longitudes = [29.99671173095703, 31.58196258544922, 27.738052368164062, 26.50013542175293, 26.652359008789062, 25.921663284301758, 22.90027618408203, 23.257217407226562,
23.335693359375, 22.87444305419922, 23.08465003967285, 22.565473556518555, 21.452774047851562, 21.66388702392578, 21.065969467163086, 21.67659568786621,
21.496871948242188, 22.339998245239258, 22.288192749023438, 24.539581298828125, 25.444232940673828, 25.303749084472656, 24.669166564941406, 24.689163208007812,
24.174999237060547, 23.68471908569336, 24.000761032104492, 23.57332992553711, 23.76513671875, 23.430830001831055, 23.6597900390625, 20.580928802490234, 21.320831298828125,
22.398330688476562, 23.97638702392578, 24.934917449951172, 25.7611083984375, 25.95930290222168, 26.476804733276367, 27.91069221496582, 29.1027774810791, 29.29846954345703,
28.4355525970459, 28.817358016967773, 28.459857940673828, 30.028610229492188, 29.075136184692383, 30.13492774963379, 29.818885803222656, 29.640830993652344, 30.57735824584961,
29.99671173095703]
latitudes = [63.748023986816406, 62.90789794921875, 60.511383056640625, 60.44499588012695, 60.646385192871094, 60.243743896484375, 59.806800842285156, 59.91944122314453,
60.02395248413086, 60.14555358886719, 60.3452033996582, 60.211936950683594, 60.56249237060547, 61.54027557373047, 62.59798049926758, 63.02013397216797,
63.20353698730469, 63.27652359008789, 63.525691986083984, 64.79915618896484, 64.9533920288086, 65.51513671875, 65.65470886230469, 65.89610290527344, 65.79151916503906,
66.26332092285156, 66.80228424072266, 67.1570053100586, 67.4168701171875, 67.47978210449219, 67.94589233398438, 69.060302734375, 69.32611083984375, 68.71110534667969,
68.83248901367188, 68.580810546875, 68.98916625976562, 69.68568420410156, 69.9363784790039, 70.08860778808594, 69.70597076416016, 69.48533630371094, 68.90263366699219,
68.84700012207031, 68.53485107421875, 67.69471740722656, 66.90360260009766, 65.70887756347656, 65.6533203125, 64.92096710205078, 64.22373962402344, 63.748023986816406]
#Polygonization
coordpairs = list(zip(longitudes, latitudes))
poly = Polygon(coordpairs)
#Creating a GeoDataFrame
geo = gpd.GeoDataFrame(index=[0], columns=['geometry'])
geo['geometry'] = poly
#Illustrated
import matplotlib.pyplot as plt
geo.plot()
#Save shp file
fp = 'polygon.shp'
geo.to_file(fp)
The hateful CRS is here. The apply function works well with pandas and can be processed with short code.
#Library
import pandas as pd
from shapely.geometry import Point, LineString, Polygon
from pyproj import CRS
import matplotlib.pyplot as plt
#read csv
data = pd.read_csv('data/some_posts.csv')
#Point data creation
data = pd.read_csv('data/some_posts.csv')
make_point = lambda row:Point(row['lat'],row['lon'])
data['geometry'] = data.apply(make_point, axis=1)
#You need to specify the transformation geometry and coordinate system to GeoDataFrame.
geo = gpd.GeoDataFrame(data, geometry='geometry',crs=CRS.from_epsg(4326).to_wkt())
#plot
geo.plot()
Calculate the distance traveled by each user from the coordinate data posted on SNS. Coordinate system conversion will appear, but note the following points. CRS is a nuisance everywhere. .. ..
-Because the GeoDataFrame data is not converted only by the definition of the coordinate system (ex. Data.crs = CRS.from_espg (4276)), the conversion of the coordinate system (ex. Data = data.to_crs (epsg = 4276)) is required. -If the coordinate system of GeoDataFrame is not defined, define the coordinate system before converting the coordinate system.
import geopandas as gpd
import pandas as pd
from pyproj import CRS
from shapely.geometry import Point, LineString, Polygon
#Data read
data = gpd.read_file('Kruger_posts.shp')
#Coordinate system conversion
data = data.to_crs(epsg=32735)
#Create a moving line by classifying by id(If there is only one post, it has not moved,Enter a None value)
grouped = data.groupby('userid')
movements = gpd.GeoDataFrame(columns=['userid', 'geometry'])
for key, group in grouped:
group = group.sort_values('timestamp')
if len(group['geometry'])>=2:
line = (LineString(list(group['geometry'])))
else:
line=None
movements.at[count, 'userid'] = key
movements.at[count, 'geometry'] = line
movements.crs = CRS.from_epsg(32735)
#Calculation of travel distance
def cal_distance(x):
if x['geometry'] is None:
return None
else:
return x['geometry'].length
movements['distance'] = movements.apply(cal_distance, axis=1)
#Average distance traveled,Maximum value,minimum value
print(mean(movements['distance'])) #138871.14194459998
print(max(movements['distance'].dropna())) #8457917.497356484
print(min(movements['distance'])) #0.0
Recommended Posts