• Production environment: trajectory compression for moving targets, using the Douglas-Peucker trajectory compression algorithm


    Scenario:

    The scenario I am currently working with is vehicles driving on roads: in a map application, GPS trajectory data is compressed to reduce data transmission and storage overhead. The raw trajectory points are sampled so frequently that they take up too much space and slow down the processing nodes, so after team discussion we decided to adopt this algorithm.

    Algorithms involved

    1. Douglas-Peucker algorithm: recursively splits the trajectory into segments and discards points that deviate only slightly from the overall shape, thereby compressing the trajectory.
    2. Visvalingam-Whyatt algorithm: based on triangle areas; it repeatedly removes the point forming the smallest-area triangle with its neighbors until the trajectory is sufficiently compressed.
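As a rough illustration of the Visvalingam-Whyatt idea, here is a minimal standalone sketch assuming planar (x, y) coordinates and a target point count as the stopping criterion; the helper names `triangleArea` and `visvalingamWhyatt` are our own, not from any library:

```scala
// Area of triangle (a, b, c): half the magnitude of the cross product.
def triangleArea(a: (Double, Double), b: (Double, Double), c: (Double, Double)): Double =
  math.abs((b._1 - a._1) * (c._2 - a._2) - (c._1 - a._1) * (b._2 - a._2)) / 2.0

// Repeatedly remove the interior point whose triangle with its two
// neighbours has the smallest area, until only `keep` points remain.
def visvalingamWhyatt(points: List[(Double, Double)], keep: Int): List[(Double, Double)] = {
  var pts = points
  while (pts.length > math.max(keep, 2)) {
    // Index of the interior point with the smallest triangle area
    val victim = (1 until pts.length - 1).minBy(i => triangleArea(pts(i - 1), pts(i), pts(i + 1)))
    pts = pts.patch(victim, Nil, 1) // drop that point
  }
  pts
}

val compressed = visvalingamWhyatt(
  List((0.0, 0.0), (1.0, 0.01), (2.0, 0.5), (3.0, 0.0)), keep = 3)
// (1.0, 0.01) forms the smallest triangle with its neighbours, so it is removed first.
```

Unlike Douglas-Peucker, which stops when all deviations fall below a distance threshold, this variant stops at a target point count, which makes the output size predictable.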

                                    Image source: Dr. Yu Zheng, Computing with Spatial Trajectories

    Haversine distance calculation and Douglas-Peucker compression, Scala implementation

    import org.apache.spark.sql.{DataFrame, SparkSession}
    import org.apache.spark.sql.functions._
    import scala.math._

    // Class representing a trajectory point
    case class Point(lon: Double, lat: Double, time: String, id: String)

    // Haversine great-circle distance between two points, in metres
    def haversineDistance(point1: Point, point2: Point): Double = {
      val R = 6371000.0 // Earth radius (metres)
      val dLat = toRadians(point2.lat - point1.lat)
      val dLon = toRadians(point2.lon - point1.lon)
      val a = pow(sin(dLat / 2), 2) +
        cos(toRadians(point1.lat)) * cos(toRadians(point2.lat)) * pow(sin(dLon / 2), 2)
      val c = 2 * atan2(sqrt(a), sqrt(1 - a))
      R * c
    }

    // Initial bearing from p1 to p2, in radians
    def bearing(p1: Point, p2: Point): Double = {
      val dLon = toRadians(p2.lon - p1.lon)
      val lat1 = toRadians(p1.lat)
      val lat2 = toRadians(p2.lat)
      atan2(sin(dLon) * cos(lat2),
        cos(lat1) * sin(lat2) - sin(lat1) * cos(lat2) * cos(dLon))
    }

    // Cross-track distance from `point` to the great circle through
    // `start` and `end`, in metres (the "perpendicular distance" used below)
    def perpendicularDistance(point: Point, start: Point, end: Point): Double = {
      val R = 6371000.0
      val d13 = haversineDistance(start, point) / R // angular distance start -> point
      abs(asin(sin(d13) * sin(bearing(start, point) - bearing(start, end)))) * R
    }

    // Douglas-Peucker trajectory compression
    def douglasPeucker(points: List[Point], epsilon: Double): List[Point] = {
      if (points.length < 3) {
        return points
      }
      // Distance of each interior point from the start-end chord
      val distances = points.zipWithIndex.map { case (point, index) =>
        if (index != 0 && index != points.length - 1)
          perpendicularDistance(point, points.head, points.last)
        else
          0.0
      }
      val dmax = distances.max
      if (dmax > epsilon) {
        // Split at the farthest point and recurse on both halves
        val index = distances.indexOf(dmax)
        val recResults1 = douglasPeucker(points.take(index + 1), epsilon)
        val recResults2 = douglasPeucker(points.drop(index), epsilon)
        recResults1.init ::: recResults2 // drop the duplicated split point
      } else {
        // Every interior point is within epsilon: keep only the endpoints
        List(points.head, points.last)
      }
    }

    val spark = SparkSession.builder().appName("TrajectoryCompression").getOrCreate()
    import spark.implicits._ // required for toDF and .as below

    // Build a DataFrame with lon, lat, time and id columns (longitude first)
    val data = Seq(
      (-74.0060, 40.7128, "2023-11-18 08:00:00", "1"),
      (-74.0112, 40.7215, "2023-11-18 08:05:00", "1"),
      (-74.0146, 40.7312, "2023-11-18 08:10:00", "1"),
      (-74.0162, 40.7356, "2023-11-18 08:15:00", "1"),
      (-74.0182, 40.7391, "2023-11-18 08:20:00", "1"),
      (-74.0224, 40.7483, "2023-11-18 08:25:00", "1"),
      (-74.0260, 40.7527, "2023-11-18 08:30:00", "1")
    ).toDF("lon", "lat", "time", "id")

    // Replace the constant id with a generated unique id
    val dfWithId = data.withColumn("id", monotonically_increasing_id())

    // Convert the DataFrame into a List[Point]
    val points = dfWithId.as[(Double, Double, String, Long)].collect()
      .map(p => Point(p._1, p._2, p._3, p._4.toString)).toList

    // Run the compression; epsilon is the tolerance in metres
    val compressedPoints = douglasPeucker(points, epsilon = 10)

    // Convert the compressed points back into a DataFrame
    val df2 = compressedPoints.toDF("lon", "lat", "time", "id")
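As a quick sanity check on the haversine helper outside Spark (a standalone sketch with arbitrary sample coordinates; `Pt` and `haversine` here are local stand-ins for the `Point` and `haversineDistance` above), one degree of latitude should come out at roughly 111 km:

```scala
import scala.math._

case class Pt(lon: Double, lat: Double)

// Haversine great-circle distance in metres (same formula as in the listing above)
def haversine(p1: Pt, p2: Pt): Double = {
  val R = 6371000.0
  val dLat = toRadians(p2.lat - p1.lat)
  val dLon = toRadians(p2.lon - p1.lon)
  val a = pow(sin(dLat / 2), 2) +
    cos(toRadians(p1.lat)) * cos(toRadians(p2.lat)) * pow(sin(dLon / 2), 2)
  2 * R * atan2(sqrt(a), sqrt(1 - a))
}

// One degree of latitude is roughly 111 km anywhere on Earth
val d = haversine(Pt(-74.0, 40.0), Pt(-74.0, 41.0))
println(d) // roughly 111e3 metres
```

This also gives a rule of thumb for choosing epsilon: at about 111 km per degree of latitude, a 10 m tolerance corresponds to deviations of roughly 0.0001 degrees from the chord.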

    References

    • Douglas, D.H., and Peucker, T.K. "Algorithms for the reduction of the number of points required to represent a digitized line or its caricature." The Canadian Cartographer 10.2 (1973): 112-122.
    • Visvalingam, M., and Whyatt, J.D. "Line generalization by repeated elimination of the smallest-area triangle." Cartographic Journal 30.1 (1993): 46-51.
    • "Douglas-Peucker algorithm for trajectory data compression (with code and raw data)," Zhihu.
  • Original article: https://blog.csdn.net/qq_52128187/article/details/134528123