This article is part of the Power BI Advent Calendar 2019 by Prince (@yugoes1021).
I have written several geography-related series on Power BI, but when it comes to simply visualizing data on a map there is not much material left, so this article tends toward nitpicking. (I do not have the resources or time to cover the web version or Embedded, though.)
- Geographical analysis with Power BI (basic)
- Geographical analysis with Power BI (Application 1)
- Geographical analysis with Power BI (Application 2)
- US map with Power BI 2017 Advent Calendar
- Geographical analysis with Power BI (2018 summary) 2018 Advent Calendar
So I settled on drawing maps with R or Python in Power BI Desktop. The R and Python extensions are introduced in many places, so here I only link to the official documentation.
- Create Power BI visuals using R (https://docs.microsoft.com/en-us/power-bi/desktop-r-visuals)
- Run Python scripts in Power BI Desktop (https://docs.microsoft.com/en-us/power-bi/desktop-python-scripts)
As the task, I reuse the point-data visualization that I have been using as a benchmark. The evaluation uses the same Uber open data as before: taxi probe data from San Francisco.
Since the goal is simple display, the focus is on how to call R and Python from Power BI, how to display point data with each library, and how to change the displayed extent dynamically. Each library is built on a very different idea, and it would be nice to convey that as well, but that is not possible in an article of this size.
R
R has more variations, and R support in Power BI has a longer history than Python support. One small stumbling block is which R version and installation location Power BI uses. You can set this on the options page below, so specify the R interpreter you normally use. That saves you the trouble of installing the libraries again.
However, even with the same interpreter, libraries may have been installed into the user folder, in which case you will need to install them into the interpreter's global library.
library(maps)
It's an old library. Basically, it draws various blank outline maps and plots the data on top. (ggmap appears in the code only for make_bbox, its convenience function for getting the bounding box.) Points are overlaid by calling points() inside with().
library(maps)
library(ggmap)
sbbox <- make_bbox(lon = dataset$longitude, lat = dataset$latitude, f = 0)
map('usa', col = "grey", fill = TRUE, bg = "white", border = 0,
xlim = c(sbbox[1], sbbox[3]), ylim = c(sbbox[2], sbbox[4]))
with(dataset, points(longitude, latitude, pch = 1, col = 'blue', cex = .2))
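For readers more at home in Python, the bounding-box step can be sketched as follows. This is only an illustration of the idea, not ggmap's implementation: the function name `bbox_from_points` and the sample coordinates are made up, and the (left, bottom, right, top) order is inferred from how `sbbox[1]` to `sbbox[4]` are used in the R code above.

```python
from typing import Iterable, Tuple


def bbox_from_points(
    lons: Iterable[float], lats: Iterable[float], f: float = 0.0
) -> Tuple[float, float, float, float]:
    """Return (left, bottom, right, top), widened by fraction f of each range."""
    lons = list(lons)
    lats = list(lats)
    if not lons or len(lons) != len(lats):
        raise ValueError("lons and lats must be non-empty and of equal length")
    left, right = min(lons), max(lons)
    bottom, top = min(lats), max(lats)
    pad_x = f * (right - left)  # horizontal margin
    pad_y = f * (top - bottom)  # vertical margin
    return (left - pad_x, bottom - pad_y, right + pad_x, top + pad_y)


# Hypothetical sample points around San Francisco (not taken from the Uber data)
sample_lons = [-122.45, -122.40, -122.42]
sample_lats = [37.75, 37.80, 37.77]
bbox = bbox_from_points(sample_lons, sample_lats)
print(bbox)
```

With f = 0, as in the R code, the box hugs the data exactly; a small positive f leaves a margin so that points on the edge are not clipped.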
library(sf)
A library for handling spatial data properly. You first need to convert the data into an sf data frame, which can then be plotted directly.
library(sf)
library(sp)
# Convert the plain data frame to an sf object; EPSG:4326 is WGS 84 longitude/latitude
dfsf <- st_as_sf(dataset, coords = c('longitude', 'latitude'), crs = 4326)
plot(dfsf, col = "blue", pch = 21)
library(tmap)
It is a library that makes it relatively easy to draw various thematic maps. It is convenient because you can switch between the normal plot mode and a view mode that launches a Leaflet viewer. As with the other libraries, however, Power BI cannot launch a browser or an external Leaflet window. And as you can see below, a basemap cannot be added in plot mode. Sorry.
library(tmap)
library(dplyr)
library(sf)
library(sp)
# EPSG:4326 is WGS 84 longitude/latitude
dfsf <- dataset %>% st_as_sf(coords = c('longitude', 'latitude'), crs = 4326)
tmap_mode("plot")
map <- tm_shape(dfsf, name = "uber") +
tm_symbols(shape = 21, col = "blue", size = 0.05) +
tm_basemap("Stamen.Watercolor")
map
library(ggplot2)
Map drawing is integrated into ggplot2. For people who process data regularly, the tool they use every day is probably the one that works best.
library(ggplot2)
library(mapproj)
library(ggmap)
sbbox <- make_bbox(lon = dataset$longitude, lat = dataset$latitude, f = 0)
usmap <- map_data("state")
ggplot() +
geom_polygon(data = usmap, aes(x = long, y = lat, group = group), fill = "grey", alpha = 0.5) +
geom_point(data = dataset, aes(x = longitude, y = latitude)) +
theme_void() + coord_map(xlim = c(sbbox[1], sbbox[3]), ylim = c(sbbox[2], sbbox[4]))
library(ggmap)
If you want a more detailed background map, this is the one. You now need to register an API key, probably because the Google Maps API restrictions have been tightened. Be sure to get the development version below, which includes a convenient registration function.
Installing the latest version into your R environment as follows gives you register_google, a function for setting the key, so upgrade.
devtools::install_github("dkahle/ggmap")
library(ggplot2)
library(mapproj)
library(ggmap)
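# Register the Google API key (needed by ggmap's Google-backed functions)
register_google(key = "YOUR_API_KEY")
# Bounding box of the data, used as the extent of the background map
sbbox <- make_bbox(lon = dataset$longitude, lat = dataset$latitude, f = 0)
# Fetch Stamen background tiles for that extent
map <- get_stamenmap(bbox = sbbox, zoom = 13, maptype = "toner-lite")
ggmap(map) +
geom_point(aes(x = longitude, y = latitude), color = "blue", data = dataset, alpha = .5)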
Python
Python has full-fledged libraries for map visualization and geometry such as Folium and Shapely, and an easy-to-use geographic data processing library in GeoPandas, but when I tried them in Power BI they did not work easily. Someone else also wanted to run Folium, but as shown below, only a limited set of libraries seems to work in the current Power BI, so I gave up without a fight.
Help to implement Python Script - Microsoft Power BI Community
The following Python packages (non-Intel MKL) are currently supported for use in your Power BI reports. Reference: Python packages and versions
- Matplotlib
The Python interpreter is also set on the options page below. It will probably be Anaconda, but note that installing a new library does not mean it will work in Power BI.
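A quick way to see which interpreter a script runs under, and whether a given package can be imported by it, is sketched below. It uses only the standard library. The package names in `wanted` are simply the ones mentioned in this article, and the helper name `is_importable` is made up. Being importable by the interpreter is not the same as being supported inside a Power BI visual, so treat the result as a first check only.

```python
import importlib.util
import sys


def is_importable(name):
    """Return True if this interpreter can locate the named package or module."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or the module has no usable spec
        return False


# Packages mentioned in this article; adjust the list to your own needs.
wanted = ["matplotlib", "mpl_toolkits.basemap", "folium", "geopandas"]

print("Interpreter:", sys.executable)
for package in wanted:
    print(package, "->", "found" if is_importable(package) else "missing")
```

Running this with the interpreter configured in the Power BI options shows at least whether a missing installation is the cause when a script fails.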
Matplotlib
Matplotlib, for its part, has a companion toolkit, mpl_toolkits.basemap. It is not part of standard Matplotlib and must be installed separately. At the time of writing it cannot be installed with pip, so use conda or similar.
conda install -c anaconda basemap
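After installing it with the command above, it can be used in the Anaconda environment.
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
# BBox is not defined in the original snippet; it is assumed here to be
# (lon_min, lon_max, lat_min, lat_max) computed from the data
BBox = (dataset.longitude.min(), dataset.longitude.max(),
        dataset.latitude.min(), dataset.latitude.max())
# Map limited to the bounding box (lower-left and upper-right corners)
m = Basemap(llcrnrlon=BBox[0], llcrnrlat=BBox[2], urcrnrlon=BBox[1], urcrnrlat=BBox[3])
m.drawcoastlines()
# Convert longitude/latitude to map coordinates and plot the points
x, y = m(dataset.longitude, dataset.latitude)
m.plot(x, y, 'o')
plt.show()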
Results in VS Code:
However, it did not work in Power BI, because basemap is a library other than Matplotlib itself to begin with. orz
We are using the same data as before, so let's compare the results with the standard map visuals. I narrowed down the number of records in advance in the Query Editor. For Python, I fudged it and simply displayed a 2D plot.
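A Matplotlib-only plot of this kind might look like the sketch below. It assumes, as in the Basemap snippet, that Power BI passes the selected fields as a pandas DataFrame named `dataset` with `longitude` and `latitude` columns. The helper names `lonlat_aspect` and `plot_points` are made up for illustration. The aspect correction is my own addition: one degree of longitude is shorter than one degree of latitude by a factor of cos(latitude), so the y axis is stretched by 1 / cos(latitude) to keep shapes roughly true.

```python
import math


def lonlat_aspect(latitude_deg):
    """Aspect ratio (y unit / x unit) that keeps shapes roughly true at this latitude."""
    return 1.0 / math.cos(math.radians(latitude_deg))


def plot_points(dataset):
    """Scatter the longitude/latitude columns of a DataFrame using Matplotlib only."""
    # Imported here so that the helper above works without Matplotlib installed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(8, 8))
    ax.scatter(dataset["longitude"], dataset["latitude"], s=1, c="blue", alpha=0.5)
    mean_lat = sum(dataset["latitude"]) / len(dataset["latitude"])
    ax.set_aspect(lonlat_aspect(mean_lat))
    ax.set_xlabel("longitude")
    ax.set_ylabel("latitude")
    plt.show()


# Inside a Power BI Python visual `dataset` exists; elsewhere nothing is drawn.
if "dataset" in globals():
    plot_points(dataset)
```

There is no background map, only the points themselves, which is all a plain Matplotlib visual offers here.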
Everything is displayed without any problem, including the standard map. It's just the number displayed.
The standard map shows a message that not all points are displayed. Compared with the others, there seem to be no major omissions. The speed doesn't change much either.
The ArcGIS map has started to drop points. The standard map seems to sample randomly, so its overall appearance has not changed much. The other libraries presumably do not know they are running inside Power BI and seem to show everything. (Really?) Not much changes except that tmap and ggmap are a little slow, and you won't have to wait a minute.
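Whether the standard map really samples at random is only my impression from the output. If you wanted to thin the data yourself before plotting, uniform random sampling could be sketched like this. The function name `sample_rows` and its default seed are made up, and this says nothing about what Power BI does internally.

```python
import random


def sample_rows(rows, max_rows, seed=0):
    """Return at most max_rows rows, chosen uniformly at random without replacement."""
    rows = list(rows)
    if len(rows) <= max_rows:
        return rows  # nothing to thin out
    rng = random.Random(seed)  # fixed seed so the same subset is drawn on every refresh
    return rng.sample(rows, max_rows)


# Hypothetical example: 1,000 record ids thinned to 100
thinned = sample_rows(range(1000), 100)
print(len(thinned))
```

Because every record has the same chance of being kept, dense areas stay dense and the overall shape of the point cloud is preserved, which would match the impression that the appearance does not change much.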
At this point, the data seems to be thinned out for the R visuals as well. Also, the Uber data includes a car that goes all the way to Las Vegas, so ggmap takes time to display the whole extent (the map zoom level needs to be adjusted).
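One way to keep a single long trip from stretching the map extent is to take the display range from quantiles instead of the minimum and maximum. The sketch below is a swapped-in technique, not something the libraries above do for you. The function name `trimmed_range`, the 1% and 99% cut-offs, and the sample longitudes are all made up for illustration.

```python
def trimmed_range(values, lower=0.01, upper=0.99):
    """Return (low, high) taken at the given quantiles, ignoring extreme outliers."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    last = len(ordered) - 1
    low = ordered[int(round(lower * last))]   # nearest-rank lower quantile
    high = ordered[int(round(upper * last))]  # nearest-rank upper quantile
    return low, high


# Hypothetical longitudes: 99 points in San Francisco plus one near Las Vegas
lons = [-122.4 + 0.001 * i for i in range(99)] + [-115.14]
low, high = trimmed_range(lons)
print(low, high)
```

The single far-away point is still in the data; it simply no longer determines the displayed range, so the background map can be fetched at a sensible zoom level.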
For a map this simple there is little point in using R code for the visualization, but if you need special drawing or calculation, you can build on a solid R library, so I think R may still have its turn.