当前位置: 首页 > news >正文

小试R空间处理新库sf

不久以前在StackOverflow上看到一个空间处理的问题,很有意思,便借着这个问题试了试R新出的空间处理库sf

问题描述

Starting from a shapefile containing a fairly large number (about 20000) of potentially partially-overlapping polygons, I'd need to extract all the sub-polygons originated by intersecting their different "boundaries".

In practice, starting from some mock-up data:

library(tibble)
library(dplyr)
library(sf)

ncircles <- 9
rmax     <- 120
x_limits <- c(-70,70)
y_limits <- c(-30,30)
set.seed(100) 
xy <- data.frame(
  id = paste0("id_", 1:ncircles), 
  x = runif(ncircles, min(x_limits), max(x_limits)),
  y = runif(ncircles, min(y_limits), max(y_limits))) %>% 
  as_tibble()

polys <- st_as_sf(xy, coords = c(2,3)) %>% 
  st_buffer(runif(ncircles, min = 1, max = 20)) 

plot(polys[1])  

clipboard.png

I'd need to derive an sf or sp multipolygon containing ALL and ONLY the polygons generated by the intersections, something like:

clipboard.png

(note that the colors are there just to exemplify the expected result, in which each "differently colored" area is a separate polygon which doesn't overlay any other polygon)

I know I could work my way out by analyzing one polygon at a time, identifying and saving all its intersections and then "erase" those areas form the full multipolygon and proceed in a cycle, but that is quite slow.

I feel there should be a more efficient solution for this, but I am not able to figure it out, so any help would be appreciated! (Both sf and sp based solutions are welcome)

UPDATE:

In the end, I found out that even going "one polygon at a time" the task is far from simple! I'm really struggling on this apparently "easy" problem! Any hints? Even a slow solution or hints for starting on a proper path would be appreciated!

UPDATE 2:

Maybe this will clarify things: the desired functionality would be similar to the one described here:

https://it.mathworks.com/matl...

UPDATE 3:

I awarded the bounty to @shuiping-chen (thanks !), whose answer correctly solved the problem on the example dataset provided. The "method" has however to be generalized to situations were "quadruple" or "n-uple" intersections are possible. I'll try to work on that in the coming days and post a more general solution if I manage !

我的回答

Input

I modify the mock-up data a bit in order to illustrate the ability to deal with multiple attributes.

library(tibble)
library(dplyr)
library(sf)

ncircles <- 9
rmax     <- 120
x_limits <- c(-70,70)
y_limits <- c(-30,30)
set.seed(100) 
xy <- data.frame(
  id = paste0("id_", 1:ncircles), 
  val = paste0("val_", 1:ncircles),
  x = runif(ncircles, min(x_limits), max(x_limits)),
  y = runif(ncircles, min(y_limits), max(y_limits)),
  stringsAsFactors = FALSE) %>% 
  as_tibble()

polys <- st_as_sf(xy, coords = c(3,4)) %>% 
  st_buffer(runif(ncircles, min = 1, max = 20)) 
plot(polys[1])

clipboard.png

Basic Operation

Then define the following two functions.

  • cur: the current index of the base polygon

  • x: the index of polygons, which intersects with cur

  • input_polys: the simple feature of the polygons

  • keep_columns: the vector of names of attributes needed to keep after the geometric calculation

get_difference_region() get the difference between the base polygon and other intersected polygons; get_intersection_region() get the intersections among the intersected polygons.

library(stringr)
get_difference_region <- function(cur, x, input_polys, keep_columns=c("id")){
  x <- x[!x==cur] # remove self 
  len <- length(x)
  input_poly_sfc <- st_geometry(input_polys)
  input_poly_attr <- as.data.frame(as.data.frame(input_polys)[, keep_columns])
  
  # base poly
  res_poly <- input_poly_sfc[[cur]]
  res_attr <- input_poly_attr[cur, ]
  
  # substract the intersection parts from base poly
  if(len > 0){
    for(i in 1:len){
      res_poly <- st_difference(res_poly, input_poly_sfc[[x[i]]])
    }
  }
  return(cbind(res_attr, data.frame(geom=st_as_text(res_poly))))
}


get_intersection_region <- function(cur, x, input_polys, keep_columns=c("id"), sep="&"){
  x <- x[!x<=cur] # remove self and remove duplicated obj 
  len <- length(x)
  input_poly_sfc <- st_geometry(input_polys)
  input_poly_attr <- as.data.frame(as.data.frame(input_polys)[, keep_columns])
  
  res_df <- data.frame()
  if(len > 0){
    for(i in 1:len){
      res_poly <- st_intersection(input_poly_sfc[[cur]], input_poly_sfc[[x[i]]])
      res_attr <- list()
      for(j in 1:length(keep_columns)){
        pred_attr <- str_split(input_poly_attr[cur, j], sep, simplify = TRUE)
        next_attr <- str_split(input_poly_attr[x[i], j], sep, simplify = TRUE)
        res_attr[[j]] <- paste(sort(unique(c(pred_attr, next_attr))), collapse=sep)
      }
      res_attr <- as.data.frame(res_attr)
      colnames(res_attr) <- keep_columns
      res_df <- rbind(res_df, cbind(res_attr, data.frame(geom=st_as_text(res_poly))))
    }
  }
  return(res_df)
}

First Level

Difference

Let's see the difference function effect on the mock-up data.

flag <- st_intersects(polys, polys)

first_diff <- data.frame()
for(i in 1:length(flag)) {
  cur_df <- get_difference_region(i, flag[[i]], polys, keep_column = c("id", "val"))
  first_diff <- rbind(first_diff, cur_df)
}
first_diff_sf <- st_as_sf(first_diff, wkt="geom")
first_diff_sf
plot(first_diff_sf[1])

clipboard.png

Intersection


first_inter <- data.frame()
for(i in 1:length(flag)) {
  cur_df <- get_intersection_region(i, flag[[i]], polys, keep_column=c("id", "val"))
  first_inter <- rbind(first_inter, cur_df)
}
first_inter <- first_inter[row.names(first_inter %>% select(-geom) %>% distinct()),]
first_inter_sf <- st_as_sf(first_inter, wkt="geom")
first_inter_sf
plot(first_inter_sf[1])

clipboard.png

Second Level

use the intersection of first level as input, and repeat the same process.

Difference

flag <- st_intersects(first_inter_sf, first_inter_sf)
# Second level difference region
second_diff <- data.frame()
for(i in 1:length(flag)) {
  cur_df <- get_difference_region(i, flag[[i]], first_inter_sf, keep_column = c("id", "val"))
  second_diff <- rbind(second_diff, cur_df)
}
second_diff_sf <- st_as_sf(second_diff, wkt="geom")
second_diff_sf
plot(second_diff_sf[1])

clipboard.png

Intersection

second_inter <- data.frame()
for(i in 1:length(flag)) {
  cur_df <- get_intersection_region(i, flag[[i]], first_inter_sf, keep_column=c("id", "val"))
  second_inter <- rbind(second_inter, cur_df)
}
second_inter <- second_inter[row.names(second_inter %>% select(-geom) %>% distinct()),]  # remove duplicated shape
second_inter_sf <- st_as_sf(second_inter, wkt="geom")
second_inter_sf
plot(second_inter_sf[1])

clipboard.png

Get the distinct intersections of the second level, and use them as the input of the third level. We could get that the intersection results of the third level is NULL, then the process should end.

Summary

We put all the difference results into close list, and put all the intersection results into open list. Then we have:

  • When open list is empty, we stop the process

  • The results is close list

Therefore, we get the final code here (the basic two functions should be declared):

# init
close_df <- data.frame()
open_sf <- polys

# main loop
while(!is.null(open_sf)) {
  flag <- st_intersects(open_sf, open_sf)
  for(i in 1:length(flag)) {
    cur_df <- get_difference_region(i, flag[[i]], open_sf, keep_column = c("id", "val"))
    close_df <- rbind(close_df, cur_df)
  }
  cur_open <- data.frame()
  for(i in 1:length(flag)) {
    cur_df <- get_intersection_region(i, flag[[i]], open_sf, keep_column = c("id", "val"))
    cur_open <- rbind(cur_open, cur_df)
  }
  if(nrow(cur_open) != 0) {
    cur_open <- cur_open[row.names(cur_open %>% select(-geom) %>% distinct()),]
    open_sf <- st_as_sf(cur_open, wkt="geom")
  }
  else{
    open_sf <- NULL
  }
}

close_sf <- st_as_sf(close_df, wkt="geom")
close_sf
plot(close_sf[1])

clipboard.png

clipboard.png

相关文章:

  • Gnuplot 使用技巧
  • js去除空格12种方法
  • 与、或、异或、取反、左移和右移
  • oracle数据库设置了默认字段,但默认值无效
  • MFC应用程序向导生成的最简单程序HelloMFC详解
  • miniconda管理python[windows10]
  • eclipse中字体太小
  • C# Math函数 字符串、整数装换
  • 存储与虚拟机主机管理(五)
  • 短文本合并重复(去重)的简单有效做法
  • css布局,左右固定中间自适应实现
  • 写给兔小白的js教程(4)
  • 锐捷网络:让“店商”感知“大数据”的力量
  • 未来已来,4K激活字库产业新世代
  • Hadoop家族学习路线图
  • .pyc 想到的一些问题
  • Brief introduction of how to 'Call, Apply and Bind'
  • CentOS7 安装JDK
  • echarts的各种常用效果展示
  • express + mock 让前后台并行开发
  • JS进阶 - JS 、JS-Web-API与DOM、BOM
  • 程序员最讨厌的9句话,你可有补充?
  • 大快搜索数据爬虫技术实例安装教学篇
  • 仿天猫超市收藏抛物线动画工具库
  • 搞机器学习要哪些技能
  • 海量大数据大屏分析展示一步到位:DataWorks数据服务+MaxCompute Lightning对接DataV最佳实践...
  • 基于Android乐音识别(2)
  • 利用jquery编写加法运算验证码
  • 那些被忽略的 JavaScript 数组方法细节
  • 批量截取pdf文件
  • 如何抓住下一波零售风口?看RPA玩转零售自动化
  • 手写双向链表LinkedList的几个常用功能
  • 扩展资源服务器解决oauth2 性能瓶颈
  • ​渐进式Web应用PWA的未来
  • ​决定德拉瓦州地区版图的关键历史事件
  • ​香农与信息论三大定律
  • "无招胜有招"nbsp;史上最全的互…
  • $(function(){})与(function($){....})(jQuery)的区别
  • $Django python中使用redis, django中使用(封装了),redis开启事务(管道)
  • %@ page import=%的用法
  • (八十八)VFL语言初步 - 实现布局
  • (博弈 sg入门)kiki's game -- hdu -- 2147
  • (附源码)springboot助农电商系统 毕业设计 081919
  • (附源码)ssm教师工作量核算统计系统 毕业设计 162307
  • (附源码)计算机毕业设计高校学生选课系统
  • (六)库存超卖案例实战——使用mysql分布式锁解决“超卖”问题
  • (强烈推荐)移动端音视频从零到上手(下)
  • (原创) cocos2dx使用Curl连接网络(客户端)
  • (转)Android学习系列(31)--App自动化之使用Ant编译项目多渠道打包
  • (转)shell中括号的特殊用法 linux if多条件判断
  • ... 是什么 ?... 有什么用处?
  • .apk 成为历史!
  • .NET CORE Aws S3 使用
  • .NET 自定义中间件 判断是否存在 AllowAnonymousAttribute 特性 来判断是否需要身份验证
  • .NET业务框架的构建