
Learn about Geo-Spark with Mohamed Sarwat

Building Systems to Enable Spatial Data Science at Scale

Over the last 20 years, geo-spatial data (extracted from GPS traces, geo-tagged social media, weather maps, natural disasters, satellite imagery, and epidemic situations) has become ubiquitous. This has led to the rise of spatial data science as a field, which usually refers to extracting meaningful information from geo-spatial data. However, the lack of scalability and interactivity in state-of-the-art spatial data systems makes it extremely difficult for a data scientist to store, retrieve, explore, analyze, visualize, and learn from large-scale geo-spatial data.

This webinar will shed light on Geo-Spark, an open-source data system that builds upon the core engine of Apache Spark to efficiently process large-scale geo-spatial data in a cluster computing environment.

Internally, Geo-Spark represents geo-spatial data as a SpatialRDD, which is tailored for Apache Spark's in-memory data processing paradigm. Geo-Spark allows users to write their spatial data processing tasks in Spatial SQL, compiles the input SQL into a set of optimized SpatialRDD operations, and finally executes those operations in the cluster.
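
To make the idea of partitioned spatial processing concrete, here is a minimal single-machine sketch in plain Python. It does not use Geo-Spark or Apache Spark, and it is not Geo-Spark's implementation: the uniform grid, the function names (`grid_cell`, `partition_points`, `range_query`), and the sample points are all assumptions made for illustration. It only shows why grouping points by location lets a range query skip most of the data.

```python
from collections import defaultdict

def grid_cell(x, y, cell_size):
    """Map a point to the integer grid cell that contains it."""
    return (int(x // cell_size), int(y // cell_size))

def partition_points(points, cell_size):
    """Group points by grid cell (a stand-in for spatial partitioning)."""
    partitions = defaultdict(list)
    for x, y in points:
        partitions[grid_cell(x, y, cell_size)].append((x, y))
    return partitions

def range_query(partitions, cell_size, xmin, ymin, xmax, ymax):
    """Return points inside the rectangle, scanning only overlapping cells."""
    cx_min, cy_min = grid_cell(xmin, ymin, cell_size)
    cx_max, cy_max = grid_cell(xmax, ymax, cell_size)
    hits = []
    for cx in range(cx_min, cx_max + 1):
        for cy in range(cy_min, cy_max + 1):
            # Cells with no points are simply absent from the dictionary.
            for x, y in partitions.get((cx, cy), []):
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    hits.append((x, y))
    return sorted(hits)

# Hypothetical sample data, not taken from the webinar.
points = [(1.0, 1.0), (2.5, 3.5), (7.2, 8.1), (9.9, 0.3), (4.0, 4.0)]
partitions = partition_points(points, cell_size=5.0)
result = range_query(partitions, 5.0, 0.0, 0.0, 4.5, 4.5)
print(result)
```

In a cluster setting each grid cell would roughly correspond to a partition held by one worker, so a query touching one cell would not need to read the others.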

Mohamed Sarwat, assistant professor at Arizona State University, will give an overview of Hippo, a lightweight indexing scheme that outperforms de facto database indexes such as the B-tree and R-tree in terms of storage and maintenance overhead, while still executing range queries at performance comparable to those indexes.
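
The trade-off described above (a much smaller index that still prunes most of the data for a range query) can be illustrated with a toy page-summary index. This is a sketch of the general idea only, not Hippo's actual data structure or algorithm: the per-page histogram bitmap, the equal-width buckets, and every name in the code are assumptions chosen for illustration. Instead of one index entry per value, as in a B-tree, it keeps one small bitmap per page.

```python
def bucket_of(value, lo, hi, num_buckets):
    """Return the equal-width histogram bucket index for a value in [lo, hi]."""
    width = (hi - lo) / num_buckets
    return min(int((value - lo) / width), num_buckets - 1)

def build_index(pages, lo, hi, num_buckets):
    """Store one bitmap per page; bit b is set if the page holds a value in bucket b."""
    index = []
    for page in pages:
        bitmap = 0
        for value in page:
            bitmap |= 1 << bucket_of(value, lo, hi, num_buckets)
        index.append(bitmap)
    return index

def range_search(pages, index, lo, hi, num_buckets, qmin, qmax):
    """Read only pages whose bitmap overlaps the query's buckets, then filter."""
    query_bitmap = 0
    first = bucket_of(qmin, lo, hi, num_buckets)
    last = bucket_of(qmax, lo, hi, num_buckets)
    for b in range(first, last + 1):
        query_bitmap |= 1 << b
    results, pages_read = [], 0
    for page, bitmap in zip(pages, index):
        if bitmap & query_bitmap:  # the page may contain matching values
            pages_read += 1
            results.extend(v for v in page if qmin <= v <= qmax)
    return sorted(results), pages_read

# Hypothetical table of four pages with values in [0, 100].
pages = [[3, 8, 12], [55, 61, 58], [91, 97, 99], [14, 60, 95]]
index = build_index(pages, lo=0, hi=100, num_buckets=10)
matches, pages_read = range_search(pages, index, 0, 100, 10, 50, 65)
print(matches, pages_read)
```

The index here costs one integer per page regardless of how many values the page holds, at the price of occasionally reading a page that turns out to contain no match.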

Furthermore, a data scientist may sometimes accept a slight trade-off between the accuracy and the scalability of the analysis. To allow for such a trade-off, Sarwat will present a sampling middleware system called Tabula, which sits between the data system and the data science tool to make the inherently iterative, human-in-the-loop analysis process more seamless and interactive.
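
The accuracy-for-scalability trade-off can be seen in a very small example. The sketch below is not Tabula and makes no claim about how Tabula chooses or stores its samples: it uses plain uniform random sampling, and the function name `approximate_mean`, the 10% sample fraction, and the synthetic data are assumptions for illustration.

```python
import random

def approximate_mean(values, sample_fraction, seed=0):
    """Estimate the mean from a uniform random sample instead of a full scan."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    k = max(1, int(len(values) * sample_fraction))
    sample = rng.sample(values, k)
    return sum(sample) / k, k

# Hypothetical dataset: the integers 1..10000.
values = list(range(1, 10001))
exact = sum(values) / len(values)
estimate, rows_read = approximate_mean(values, sample_fraction=0.1)
print(f"exact={exact}, estimate={estimate:.1f}, rows read={rows_read}")
```

The estimate reads a tenth of the rows and lands close to the exact answer, which is often good enough for an exploratory step that will be refined in the next iteration.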

Speaker Bio:
  • Mohamed Sarwat: Assistant Professor of Computer Science, Arizona State University

Publisher: Qatar Computing Research Institute (QCRI)
Tags: data-science
Age Group: All Ages
Language: English
