Urban Remote Sensing with Spatial Big Data: Comparison
Please note this is a comparison between Version 1 by Danlin Yu and Version 2 by Lindsay Dong.

During the past decades, multiple remote sensing data sources, including nighttime light images, high spatial resolution multispectral satellite images, unmanned drone images, and hyperspectral images, among many others, have provided fresh opportunities to examine the dynamics of urban landscapes. In the meantime, the rapid development of telecommunications and mobile technology, alongside the emergence of online search engines and social media platforms with geotagging technology, has fundamentally changed how human activities and the urban landscape are recorded and depicted. The combination of these two types of data sources results in explosive and mind-blowing discoveries in contemporary urban studies, especially for the purposes of sustainable urban planning and development. Urban scholars are now equipped with abundant data to examine many theoretical arguments that often result from limited and indirect observations and less-than-ideal controlled experiments. For the first time, urban scholars can model, simulate, and predict changes in the urban landscape using real-time data to produce the most realistic results, providing invaluable information for urban planners and governments to aim for a sustainable and healthy urban future. 

  • urban studies
  • remote sensing
  • spatial big data

1. Introduction

Remote sensing technologies have experienced unprecedented development over the past decades, thanks primarily to sensor advancements and continuously expanding information infrastructure [1][123]. One of the key advancements in remote sensing technology, and one closely related to urban science, is object detection from remote sensing images. After an intensive review of recent progress in deep learning-based object detection, Li, et al. [2][124] proposed a large-scale, publicly available benchmark for object detection in optical remote sensing images (DIOR), which contains 23,463 images and 192,472 instances covering 20 object classes. The benchmark established a baseline against which scholars can develop and validate their own methods, which is particularly useful in urban science.

Clearly, while research in urban studies now primarily falls within the fields of environmental sciences and studies, and focuses mostly on sustainable development agendas, approaches, action plans, and strategic operations, works that take advantage of the most recent developments in observational technology (remote sensing), geotagged data generating platforms (spatial big data), and advanced spatiotemporal data analysis techniques (such as spatial econometrics and Bayesian hierarchical spatiotemporal modeling, among many others) are only starting to take off.

2. Remote Sensing and the Advancement of Urban Science

In the early development stages of remote sensing technology, the term “big data” was not on the horizon. Back then, applications of remote sensing technology were primarily for observation, change detection, and information extraction, limited by the available spatial and temporal resolutions [3][151]. The rapid development of various sensors and the accumulation of remote sensing images in the recent decade, coupled with the recognition that we are entering a “big data” era, however, have greatly changed the ways remote sensing images are stored, processed, analyzed, and utilized. In their study, Xu, et al. [4][152] regard remote sensing as a form of “big data” (remote sensing big data) and propose a modular framework that attempts to connect the data (remote sensing images) and computation (big data computation). This is especially effective given the advances in computer science and the computational capabilities of today’s networked hardware and software environment. Consequently, Xu, et al. [5][153] argue that cloud computing is an effective way to activate and mine large-scale heterogeneous data such as remote sensing big data. In addition, Zhang, et al. [6][154] suggest that deep learning algorithms are effective and efficient ways to process and analyze remote sensing big data, including geometric and radiometric rectification and processing, cloud detection and removal, data fusion, object identification and extraction, land-use and cover classification, change evaluation, and multitemporal analysis. The coupling of remote sensing and big data thus starts off as a mutually supportive relationship. While accumulated remote sensing images are undoubtedly a form of spatial big data, spatial big data also extends its horizon to include data acquired from geotagged social sensing, in which the sensors are none other than the people who are also part of the dynamic urban space complex.

2.1. Remote Sensing and Its Application to Urban Studies

The term “urban remote sensing,” or, rather, the application of remote sensing technologies to study urban phenomena and urban environments, first appeared in the late 1950s. Norman and colleagues started to explore urban environments in the late 1950s, using aerial photographs to interpret the social structure, human geography, and human ecology of cities [7][8][9][157,158,159]. A report submitted to NASA and the Geological Survey [10][160] attempted to use color infrared aerial photos to analyze urban residential environments in the Los Angeles basin. In their report, the authors noted that applications of remote sensing techniques in urban studies had developed more slowly than in other fields such as land use land cover change detection, water resource management, and forest management. They argued that this was because of the “great diversity of the urban environment,” and the “complex nature of the spatial relationships” among different urban elements. In addition, the remote sensing techniques at the time were limited by the spatial and temporal resolutions of the available products, which were typically too coarse for urban applications. Urban studies, unlike other fields where remote sensing found lively applications, require much finer spatial and temporal resolutions to produce meaningful and actionable results.
Still, the sheer volume of information contained within remote sensing products (even though most such products were in physical paper formats, and were often produced by unstable airplane-borne sensors for urban application) was very tempting for urban scholars, especially since such approaches provided timely and abundant information that traditional approaches could not, such as large area land use change detection [11][12][13][14][15][16][17][18][8,161,162,163,164,165,166,167], urban waterbody and green space extraction and mapping [19][20][21][22][23][24][168,169,170,171,172,173], urban environmental justice evaluation [22][25][26][171,174,175], and urban heat island detection and mechanism studies [27][28][29][30][31][32][33][34][35][36][37][38][39][176,177,178,179,180,181,182,183,184,185,186,187,188], among many others.

2.1.1. Extracting and Analyzing Physical Environments of Urban Areas

Using remote sensing images (be they aerial photographs or satellite-borne sensor images) to detect land use land cover change, detect changes in environmental conditions, and monitor the urban heat island phenomenon was among the most obvious choices, because different land use land cover types reflect differently in both panchromatic and multispectral bands and show distinct thermal signatures at different temperatures. Common algorithms that classify land use land cover [19][40][41][42][43][44][45][46][47][48][49][50][168,191,192,193,194,195,196,197,198,199,200,201], detect and monitor the general urban environments [51][52][53][54][55][56][57][58][59][202,203,204,205,206,207,208,209,210], assess air quality and identify pollution hotspots [60][61][62][63][211,212,213,214], and extract the percentage of impervious surfaces [32][64][65][66][67][68][69][70][71][72][73][74][75][76][181,215,216,217,218,219,220,221,222,223,224,225,226,227] are widely applied in this aspect of urban studies. This is understandable since the practices are a natural extension of applying remote sensing techniques to study natural environments. However, urban areas are more fragmented, more complex, and fluctuate more often and more irregularly than natural environments. Still, the newly developed machine learning algorithms, such as random forest [77][78][228,229], support vector machine [53][79][204,230], neural network [14][80][81][82][83][163,231,232,233,234], deep learning [6][84][85][86][154,235,236,237], and estimation techniques, including classification and regression tree (CART) [87][238], geographically weighted regression [68][69][88][89][81,219,220,239], and Bayesian learning [90][240], among many others, provide an ever increasing arsenal for urban scholars to take advantage of the growing remote sensing datasets, be they regular 30 m spatial resolution multispectral images or sub-meter spatial resolution hyperspectral images.
Undoubtedly, applying remote sensing techniques to study urban environments, air quality, and urban land use land cover will continue to dominate the frontline of urban remote sensing scholarly activities.

2.1.2. Morphological Analysis of Urban Landscapes

Analyzing urban morphology and detecting urban spatial patterns from remote sensing data is straightforward, and of particular importance for urbanization assessments. As noted by Zhu, et al. [91][241], an accurate account of urban morphological features is “at the core of many international endeavors to address issues of urbanization, such as the United Nations’ call for Sustainable Cities and Communities” [91][241]. From the late 1980s onwards, urbanization has picked up its pace, especially in developing countries, due to increased globalization and industrialization worldwide. One of the major issues of rapid urbanization, as manifested in the developed world right after the Second World War, is rapid and uncontrolled urban sprawl, which caused urban centers to decline and suburban and exurban areas to emerge with spider-web-like highway networks. Not only did the decline of urban centers exacerbate the deterioration of urban environments and socioeconomic prosperity in the centers and in urban areas as a whole, but the natural environments that used to surround the cities were also fragmented. Natural habitats for many species, including endangered ones, were disrupted, and pristine forests, wetlands, and waterbodies were infringed upon and polluted [92][93][94][95][96][97][98][99][12,149,150,242,243,244,245,246]. Morphological analysis appears to be a powerful tool enabling urban scholars and practitioners to understand, monitor, model, and predict the extent of urban sprawl and the change of urban spatial structures [88][92][100][101][12,81,247,248].

2.1.3. Deducing Demographic, Social and Economic Characteristics of Cities

While a healthy urban environment and an accurate account of urban morphology are surely critical for a sustainable urban future, the urban complex is a myriad of intermingling environmental, social, demographic, economic, and physical built-up elements, along with unique land cover (impervious surface), among many others. At the center of the urban environment are the urban residents and all of the activities that are caused by or occur around them. A sustainable urban future only makes sense when there is a harmonious and virtuous relationship between the urban dwellers and the urban environment. Scholars, especially those in the social sciences and humanities, have also attempted to utilize remote sensing techniques in their respective domains. For instance, using remote sensing techniques to estimate the population in an urban area was an early attempt to capitalize on remote sensing images’ convenient accessibility and cost-effectiveness compared to a full-scale census or even a 1% or 5% demographic survey (such as the American Community Survey conducted annually). In recent years, other than the common multispectral remote sensing images, nighttime light images collected from the US Defense Meteorological Satellite Program’s Operational Linescan System (OLS) sensor (from 1971–2011) and from the later NASA-launched Suomi National Polar-orbiting Partnership (NPP) satellite and the NOAA-20 satellite (since 2018), which carry the visible infrared imaging radiometer suite (VIIRS) instrument and produce day/night band (DNB) data, are attracting much attention in socioeconomic, demographic, and built environment fields of study. This is because the intensity of various modern human activities in a place is closely related to the amount of energy consumed there.
Nighttime light emission provides an immediate proxy for the intensity of energy consumption, hence a good proxy for a wide variety of human socioeconomic activities [102][103][104][105][257,258,259,260]. In addition, the new generation of nighttime light satellites with a much finer spatial resolution (130 m), like the Luojia 1-01, is also providing much needed data that might be more suitable for urban studies [102][106][257,261]. While nighttime light remote sensing data have been available since the early 1970s, early studies often focused on using them as a proxy to map the city [107][108][262,263], because of the relatively coarse spatial resolution (2.7 km) and the poor, inconsistent radiometric quality caused by the lack of on-board calibration. The improved spatial resolution (375 and 750 m depending on the band for VIIRS, and 130 m for the Luojia 1-01) and the on-board radiometric calibration of the VIIRS instrument greatly enhanced the application scope of nighttime light images in urban studies. It was soon found that nighttime light data were a very promising data source in urban studies to estimate population size [106][109][110][111][112][261,264,265,266,267], explore the urban socioeconomic landscape [113][114][45,268], estimate poverty [102][257], model urban morphology, expansion, and growth [115][116][117][118][119][105,269,270,271,272], and investigate urban energy exchange with the environment [39][117][120][121][122][123][188,270,273,274,275,276], among many other things. This booming application of nighttime light data in urban studies is understandable. While it is true that there are many sources of illumination during nighttime, most notably moonlight and surface albedo, the light produced from various anthropogenic activities is the most obvious and consistent information. The intensity and density of light distribution are directly related to the intensity and density of human activities.
For instance, Chen and Nordhaus [124][277] examined the usefulness of the VIIRS data in the estimation of economic activity with both US states and metropolitan statistical areas (MSAs). Not surprisingly, with enhanced spatial resolution and wider coverage, their results suggested that high-resolution VIIRS light data provides a better prediction for an MSA’s GDP than for state GDP. This suggests that lights may be more closely related to urban sectors than rural sectors, hence better suited for urban-related studies. 
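The idea of using light intensity as a proxy for economic activity can be illustrated with a simple log-log regression. The sketch below is not the specification used by Chen and Nordhaus; it assumes a hypothetical setup in which each region has one summed radiance value and one GDP value, and all numbers are synthetic and made up purely for illustration. The function name `fit_loglog` is likewise an assumption.

```python
import math

def fit_loglog(lights, gdp):
    """Ordinary least squares fit of log(gdp) = intercept + slope * log(lights).

    Returns (intercept, slope, r_squared). The slope can be read as the
    elasticity of GDP with respect to summed nighttime radiance.
    """
    xs = [math.log(v) for v in lights]
    ys = [math.log(v) for v in gdp]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared

# Synthetic values for five hypothetical regions (NOT real VIIRS or GDP data).
summed_radiance = [120.0, 450.0, 980.0, 2300.0, 5100.0]
regional_gdp = [3.1, 10.5, 24.0, 51.0, 118.0]

intercept, slope, r_squared = fit_loglog(summed_radiance, regional_gdp)
print(f"elasticity = {slope:.2f}, R^2 = {r_squared:.3f}")
```

Fitting the same model separately for metropolitan areas and for states, and comparing the goodness of fit, would be one simple way to examine the kind of contrast the study describes.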

2.2. Social Sensing—A New Frontier of Remote Sensing and Interface with “Big Data” Semantics

The term “social sensing” refers primarily to people’s ability to perceive and make inferences about what others think and do in their own environments [125][285]. In their edited seminal book, Social Sensing, Wang, et al. [126][286] define social sensing as a set of sensing and data collection paradigms where data are collected from humans or devices on their behalf. In this definition, society as a whole (humans, or devices on their behalf) is both the context and object of sensing and the means of data acquisition. This is viewed as a direct result of the proliferation of social media and social network platforms such as Facebook, Twitter, LinkedIn, Sina Weibo, Google Search, and Baidu Search, among others. The COVID-19 outbreak has further accelerated the use of these social media platforms to facilitate data acquisition and database construction, which in turn provides powerful means to fight back against the spread of the disease [127][128][287,288]. Aggarwal and Abdelzaher [129][289] presented a broad overview of social sensing and suggested that the growing availability of such socially sensed data provides a natural way to predict and monitor individual as well as societal behaviors, trends, and patterns. The rise of social sensing, coupled with the geotagging capabilities provided by embedded GPS in increasingly available smart devices, and the internet-enabled data sharing mechanism, enabled the arrival of a context-aware computing environment, which proves to be particularly useful and relevant in urban studies [130][290]. It remains debatable whether social sensing is a type of remote sensing, since remote sensing has traditionally referred to information acquired by sensors that record electromagnetic energy.
Social sensing, however, relies more on individual perceptions and observations of their environments and is facilitated by the rapidly developed telecommunication technology and widely available personal mobile devices with geotagged social media platforms. In their research, Liu, et al. [131][295] regarded each individual who supplied information via social media platforms as playing a “role of a sensor,” which might be analogous to the electromagnetic energy sensors in traditional remote sensing. This analogy bridges social sensing with remote sensing, even if it does not make social sensing a form of remote sensing. In addition, they also argued that social sensing information captures socioeconomic features well, while traditional remote sensing information might need complex algorithms and conversions (such as using nighttime light data, high-resolution images for impervious surfaces identification, etc.) to do so [131][295].

2.3. Limitations and Challenges of Remote Sensing in Urban Science

While the integration of remote sensing data sources is applauded as a great leap in urban studies/science, the urban science community in particular also acutely recognizes that significant challenges exist in this new frontier. As pointed out in the early studies by Mullens Jr and Senger [10][160], spatial resolution is a big hurdle in applying remote sensing technologies to urban science. The spatial resolution of remote sensing data determines the level of detail that can be obtained from an image. For example, satellite images with a low spatial resolution may not be able to capture small-scale features, such as individual buildings or small patches of vegetation. This limitation can be particularly challenging when studying urban areas, where high spatial resolution is often needed to accurately capture the complex and heterogeneous urban environment. Admittedly, more recent remote sensing sensors and equipment, including satellites and unmanned drones, are able to provide sufficient spatial resolution for urban areas. However, the conundrum of cost, availability, and added noise with finer spatial resolution could quickly amount to a grave challenge for urban scholars to effectively take advantage of this new data source [132][133][125,296].
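The trade-off between detail and data volume can be made concrete with a little arithmetic. The sketch below is only an illustration: the 15 m building width and the 100 km² study area are hypothetical values, and the helper names `pixels_across` and `pixels_for_area` are assumptions chosen for this example. It compares a 30 m pixel with a 0.5 m pixel.

```python
def pixels_across(feature_size_m, pixel_size_m):
    """Number of pixels spanning a feature of the given width (both in meters)."""
    return feature_size_m / pixel_size_m

def pixels_for_area(area_km2, pixel_size_m):
    """Number of square pixels needed to tile an area, per band and per acquisition."""
    area_m2 = area_km2 * 1_000_000
    return area_m2 / (pixel_size_m ** 2)

# Hypothetical values chosen only to illustrate the scale of the trade-off.
building_width_m = 15.0
study_area_km2 = 100.0

for pixel_size_m in (30.0, 0.5):
    across = pixels_across(building_width_m, pixel_size_m)
    total = pixels_for_area(study_area_km2, pixel_size_m)
    print(f"{pixel_size_m} m pixels: {across} pixels across the building, "
          f"{total:,.0f} pixels for the study area")
```

At 30 m the hypothetical building occupies half a pixel and cannot be resolved individually, while at 0.5 m it spans 30 pixels; the same change in pixel size multiplies the number of pixels for the study area by 3600, which is where the cost and processing burden comes from.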

3. The Emergence of Big Data Thinking, and How Big Data Supports Urban Studies/Science

3.1. The Big Data Era

While applying remote sensing information in urban studies has proven to be a long road to trek, the recent buzzword, “Big Data,” seems to be naturally suited to studying urban phenomena from the onset. The essence of big data is not necessarily a new concept, although the term itself was first used only in the early 1990s. From a broad perspective, big data is only relative to the analytical approaches and means (hardware), collectively the computational capability. When computing power was low, a dataset that could not be adequately analyzed by the then computational capability was legitimately considered “big data” in the sense that it was too “big” to be processed. In the era before computers and before modern transportation and telecommunication, data accumulation and analytical power often went hand in hand in a parallel fashion. While we understood data could be potentially big, the data that concerned us often was within an analytically manageable level. Alternatively, statistical approaches that “sampled” the population satisfied the need to explore and understand the story behind the data. Such an analytical paradigm changed dramatically during the globalization and high-speed, high-powered computational era when clustered computation became increasingly popular for data management and analysis [134][135][297,298]. Accumulation of information was explosive, and, while the computational power and analytical power were also growing, their growth did not keep pace with the increased amount of information. As a matter of fact, the renowned urban geographer, Batty [136][103] cited an anonymous source defining “big data” as “any data that cannot fit into an Excel spreadsheet.” This is particularly true in urban science since the highly dynamic everyday urban events are now able to be recorded, layered, assessed, analyzed, and incorporated into real-time decision making for a more livable and sustainable urban environment [137][138][30,299].
The development of the general idea of “big data” also originated from constantly arising urban development and planning problems that could not be adequately handled by conventional means [139][140][300,301], as noted in the seminal book by Mayer-Schönberger and Cukier [141][302].

3.2. Big Data Thinking

It is generally agreed that there are roughly three phases of the concept and understanding of “big data” [142][303], based on how data is accumulated, stored, and analyzed. The first phase concerns primarily the structured content of information, roughly covering the period from 1970 to 2000. It was directly linked to the long-standing domain of database management. During this phase, data storage, extraction, and optimization techniques were the foci. The prominent development in this phase was the transition from flat-file data storage to hierarchical data storage, and then to relational database management systems (RDBMS), which are still used today as a standard data storage format to facilitate fundamental data analytics. Data warehousing, data mining through traditional statistical analysis, and dynamic near-real time information updating via online dashboards and scorecards were the primary activities in this phase of big data development. The second phase of big data development ran from the early 2000s to around 2010, when the internet and relevant web applications produced enormous amounts of data. In addition, search engines including Yahoo®, Google®, and Baidu®, among others, also produced enormous amounts of web-based unstructured content. Big data development in this phase, therefore, is concerned primarily with extracting regularities from the seemingly irregular, unstructured data. For instance, many big internet commerce companies, such as Amazon® and eBay®, and major online news agencies often analyzed customer behaviors through their click rate, content-viewing trends, search logs, and even IP-address associated geographic locations to generate highly targeted, specific content and recommendations for their customers. The massive increase of data resulting from fast-growing web traffic and the wide reach of the internet globally during this phase demanded more advanced data analytical techniques.
Coupled with increased computational power, new network analysis, web mining, and spatiotemporal analysis methods emerged rapidly during this phase. The third phase of big data development runs from 2010 to the present. This is the phase when mobile devices (mobile phones, tablets, and mobile workstations, among many others) came to dominate the consumer electronics market. In 2020, it was estimated that there were 10 billion devices connected to the internet [143][304]. The emergence of social media and mobile browsing, together with mobile devices’ constant connection to the internet and their embedded GPS tracking, enables us to collect enormous amounts of data regarding individual behaviors and movements, and even to deduce individual health status, shopping preferences, and detailed daily activity patterns. Not only is the number of mobile devices increasing; sensor-based and internet-enabled devices, such as smart TVs, internet-enabled thermostats, smartwatches, and household appliances, all belonging to the so-called “Internet of Things” (IoT), are also rapidly increasing in number. These devices generate huge amounts of data almost constantly as well.

3.3. Big Data Supported Urban Studies/Science

Through a meta-analysis of 48 urban big data studies, Wang and Yin [144][306] identified the essential qualities of urban big data. In a nutshell, urban big data focuses on refined spatiotemporal features and individual attributes at very fine levels (a street block, a building, etc.), and also has the capacity and impact to depict, predict, and manage cities through the complex interactions among individual data points and the collective trend such interactions demonstrate. This investigation agrees well with Batty’s [145][4] insightful observation that “cities are complex systems that mainly grow from the bottom up, their size and shape following well-defined scaling laws that result from intense competition for space.” The emergence of urban big data provides a much-needed means to support the investigation of cities from the “bottom-up,” and supplies a pathway to evaluate and investigate the scaling laws. An integrated urban theory is being gradually developed based on centuries of investigations of urban economics, urban land use, urban spatial and social structures, and urban transportation systems. Understanding the urban landscape and inherent urban growth dynamics requires in-depth investigation facilitated by modern network science, allometric growth theory, and fractal geometry. With the arrival of mobile devices, the IoT has stirred up “urban big data” and infuses an enormous amount of information that facilitates theoretical breakthroughs in urban science as well as the understanding of the socioeconomic environments of cities [131][295]. In the forum Dialogues in Human Geography, Batty [136][103] argues that the arrival of urban big data represents a sea change in understanding what happens where and when in cities. This is especially true with new methodological advancements for analyzing social sensing data for urban studies, such as temporal signature analysis, text analysis, and image analysis [146][307].
In addition, due to its dynamic characteristics, urban big data is shifting the emphasis of urban studies from longer term strategic planning to short-term thinking about how cities function and can be managed. This is evident in recently published big-data driven urban studies; see [147][148][149][150][151][152][54,114,115,117,122,308] for a few examples. Long-term planning, missions, and visions for urban development are critical for sustainable urban development, in both socioeconomic and environmental aspects. Long-term perspectives, however, are an averaged accumulation of short-term dynamics. The advent of urban big data and available means to acquire the data enable the in-depth exploration and understanding of short-term dynamics of the everyday urban landscape. Studies of urban vibrancy have recently seen booming growth as a response to this change, which provides a chance for long-term planning to set a more practical goal based on everyday dynamics. In a recent study, Jia, Liu, Du, Huang and Fei [150][117] argue that urban vibrancy plays an important role in evaluating the quality of urban areas and guiding urban construction. The concept of urban vibrancy was proposed in 1961 by an American writer and urban activist, Jane Jacobs [153][309], in an attempt to oppose the then modernist urban planning efforts that overlooked and oversimplified the complexity of human lives in diverse communities within cities. In her mind, cities are prosperous, healthy, and sustainable only when their neighborhoods are vibrant and lively. Instead of intensive, large-scale, city-wide “renewal” or formulated planning practices, she valued urban vibrancy that originated from individual urban communities as an integrated part of a truly sustainable city. Her advocacy for dense mixed-use development and walkable streets has influenced later sustainable urban planning practices that focus on walkability and compact city spatial development in the US.
The purpose of the vibrant planning idea is to bring “people” together, rather than to produce the structured, formulaic, grey, and impervious land uses that signify what cities used to be.

3.4. Big Data Facilitated Urban and Rural Integrated Development

In the recent trend of urban development, urban agglomeration [154][155][156][311,312,313] has become a focus for many urban scholars. One of the key features within an urban agglomeration is the integrated development of urban centers and peripheral areas, including the rural area within the urban agglomeration [155][312]. In this trend of study, spatial big data plays an increasingly important role in facilitating integrated development in both urban and rural areas. For instance, in 2010, Wang and Kilmartin [157][314] analyzed the call detail record data generated by mobile networks to reflect the dynamic behavior of humans across a range of temporal and spatial scales in Uganda. They examined the responses of subscribers to an economic incentive program regarding the mobile calling rate and identified distinctive patterns of rural and urban areas. More importantly, the analysis of the call detail record also reveals heightened economic activities in both urban and rural regions in Uganda. The approach reflects an objective spatial pattern that emerges naturally from people’s daily activities and their economic status. In another study, Fang, Yu, Zhang, Fang and Liu [156][313] designed a web crawler to acquire 500,000 sets of geotagged Sina Weibo data in the Greater Beijing area (Beijing–Tianjin–Hebei) to study the spatial linkage between various places within the urban agglomeration. The results from analyzing the Sina Weibo data suggest that a strong hierarchical structure exists within the urban agglomeration around the three cities (Beijing, Tianjin, and Shijiazhuang). The strongest linkages are found at the centers, whereas the rural areas are only loosely connected, even to the urban centers. They contended that the application of spatial big data reveals the need for more strategies to integrate urban and rural development for the healthy construction of vibrant urban agglomerations.
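One simple way to turn geotagged posts into a measure of spatial linkage is to count how many users have posted from each pair of places. The sketch below is a minimal illustration under that assumption; it is not the method used in the study cited above, the function name `linkage_counts` is hypothetical, and the records are invented for the example.

```python
from collections import defaultdict
from itertools import combinations

def linkage_counts(posts):
    """Count, for every pair of places, the users who posted from both.

    posts: iterable of (user_id, place) tuples, one per geotagged post.
    Returns a dict mapping (place_a, place_b), sorted alphabetically,
    to the number of distinct users seen in both places.
    """
    places_by_user = defaultdict(set)
    for user_id, place in posts:
        places_by_user[user_id].add(place)

    counts = defaultdict(int)
    for places in places_by_user.values():
        for place_a, place_b in combinations(sorted(places), 2):
            counts[(place_a, place_b)] += 1
    return dict(counts)

# Invented records for illustration only (not real Sina Weibo data).
posts = [
    ("u1", "Beijing"), ("u1", "Tianjin"),
    ("u2", "Beijing"), ("u2", "Tianjin"), ("u2", "Shijiazhuang"),
    ("u3", "Beijing"), ("u3", "rural county A"),
    ("u4", "Tianjin"),
]

links = linkage_counts(posts)
for pair, n_users in sorted(links.items(), key=lambda item: -item[1]):
    print(pair, n_users)
```

In a table like this, pairs of urban centers with many shared users would indicate strong linkages, while rural places that share few users with any center would appear loosely connected.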

3.5. Limitations and Challenges of Applying Spatial Big Data in Urban Studies/Science

Conceptually, the availability and understandability of spatial big data, especially the ones acquired from social media platforms and global search engines, are easy to grasp. The meaning of such data and what it poses for urban science is also intriguing and informative. The hurdle is how to dig the stories out of the massive amount of information. With the increasing availability of spatial data from various sources, such as satellites, vehicle-bound sensors, social media, and search engines, the amount of data that needs to be processed and analyzed has grown exponentially. This requires significant computational resources and expertise, which can be a challenge for researchers with limited access to these resources or limited training in processing the data [158][159][160][132,322,323]. In addition, with the increased amount of data, the need for appropriate data management and quality control is also increasing. Spatial data can be complex and often requires pre-processing and cleaning before it can be analyzed. This is a time-consuming and challenging task, particularly when dealing with data from multiple sources or when integrating data from different spatial scales as is often required in urban studies [151][161][162][163][121,122,324,325].