2. Crowdsourcing: Definitions
Since Howe
[1], several studies have provided different definitions of crowdsourcing. These definitions are important as they provide a basis for what should be considered crowdsourcing and what should not. For example, some studies perceive YouTube and Wikipedia as crowdsourcing
[23] while others do not
[24].
In urban planning, concepts such as problem-solving, idea generation, and collaborative mapping are widely accepted as crowdsourcing
[23][25][26][27], while data collection methods such as social media scraping and crowdsensing are subject to debate
[25][28]. Brabham
[25] defines crowdsourcing as a top-down approach to solving planning problems. This definition includes approaches such as idea generation for smart city solutions
[23][29] but excludes data collection methods such as crowdsensing, Public Participation Geographic Information Systems (PPGIS), social media, etc. Nakatsu et al.
[28] argue for a broader definition that includes “geo-located data collection” (e.g., GPS tracking, a form of crowdsensing) but excludes social media. Their main argument for excluding social media was the absence of explicit outsourcing of a task to the crowd. Furthermore, although social media data have been widely adopted as crowdsourced data, they are usually obtained by extracting people’s posts (social media scraping) through Application Programming Interfaces (APIs) without their consent. This raises ethical concerns, as the people whose posts are extracted may not be willing to participate in data collection. Moreover, Howe, who introduced the concept of crowdsourcing, defined it as a voluntary process. Finally, Estellés-Arolas and González-Ladrón-De-Guevara
[24] have provided a definition of crowdsourcing based on a thorough review of the existing literature. They found voluntary participation and a clearly defined task among the main criteria for crowdsourcing. Based on the aforementioned studies, the adoption of methods that do not necessarily require voluntary participation (such as social media scraping and crowdsensing) may be problematic. However, it would be too simplistic to discard all studies using social media or crowdsensing without exploring cases where the participation is voluntary and the task clearly defined. The next subsections address this issue in detail.
2.1. Social Media Data
Although most studies use social media scraping, there are specific cases in which the methods described meet the criteria described above. These cases are:
2.2. Crowdsensing
Crowdsensing leverages the proliferation of low-cost sensing devices and citizen engagement for collecting and sharing data in different domains (environmental monitoring, traffic management, waste management, etc.). Participation in crowdsensing can be voluntary or non-voluntary. For example, a crowdsensing application can combine sensing data (e.g., GPS data) with location-based service network datasets such as social media check-ins
[31]. Thus, similar to social media, we will carefully identify the studies in which participation in crowdsensing is voluntary.
Web-based PPGIS has also been used to crowdsource data for urban planning
[32]. Some Web-based PPGIS projects provide an online platform where participants can share local knowledge through open calls, which is consistent with the basic principles of crowdsourcing.
Therefore, in line with the arguments discussed above, we adopt a broader definition of crowdsourcing that covers voluntary crowdsensing, dedicated social media campaigns, and collaborative websites (web-based PPGIS, collaborative mapping, and idea generation).
3. Main Research Areas
3.1. Urban Morphology
These studies use data shared by the public to examine urban forms, their formation and evolution, and their impact on different aspects of urban life. The main elements of urban form investigated in the reviewed papers are land use, infrastructure, and housing. The GS is experiencing rapid urbanization, which negatively affects these elements, and strong measures are needed to overcome the resulting challenges. In terms of land use, studies in the GS focused on the classification of functional zones to determine the main areas where human activities usually occur
[33][34][35]. Such studies are important for the GS because they can, among other things, help detect rapid urbanization and thereby support better management of existing resources. In this case, crowdsourcing serves as a source of training datasets for the classification algorithms. Infrastructure should also be a major domain of investigation, given the lack of basic infrastructure in many areas of the GS
[36]. Some studies investigated the effects of the road network on cyclist behavior
[37]. Studies on urban design focus on the effects of the urban landscape and street configuration on human activities and/or behavior. For example, Mohamed & Stanek
[30] examined the effects of street configuration on sexual harassment, while other researchers analyzed the impact of the urban landscape on physical activities
[38][39]. Such studies can help guide future urban design toward safer, more equitable, and healthier urban environments. Housing has been a major cause of concern in the GS, mainly due to the lack of affordable housing and the proliferation of informal settlements. Sub-Saharan Africa has the highest proportion of slums in the world (50.2%), followed by Central and Southern Asia (48.2%)
[40]. To tackle these challenges, some studies have involved the public in the mapping of informal settlements in the GS. However, they usually rely on the most basic forms of community mapping with paper drawings and limited sample sizes
[41][42]. With the proliferation of smartphones in some parts of the GS, more advanced methods through crowdsourcing could help reach larger samples.
3.2. Urban Transportation
Owing to its importance and its implications for many aspects of urban life, transportation is one of the most represented areas in the reviewed papers (16 papers). The wide variety of domains covered also explains the large number of papers. As a service designed for the public, transportation is heavily shaped by how people behave across time and space and by how they respond to different transportation-related services. Investigating travelers’ behavior could help clarify their impact on the urban space (e.g., through their travel patterns) and inform more data-driven policies to support better transportation planning in the GS. In some cities of the GS, crowdsourcing has been used to examine users’ travel behavior through travel patterns
[43], route choice
[44], travel behavior’s impact on congestion
[45], etc. Travelers’ responses to mobility services as well as strategies to improve them were also investigated. Musakwa and Selala
[46] used crowdsourced GPS data to investigate cycling patterns, while other studies developed multimodal or public transportation networks with crowdsourced data
[47][48]. Other studies focused on traffic signal optimization
[49], traffic density estimation
[50], etc. Given the large number of social media users among young people, researchers have also looked for ways to involve the youth in transportation planning by crowdsourcing through dedicated social media pages.
3.3. Environmental Monitoring and Management
In an era of sustainable urban planning, research on how public engagement could foster the development of more sustainable cities has become a trend in some cities of the GS. This is also in line with the United Nations’ 2030 agenda for sustainable development goals (SDGs) regarding sustainable cities and communities
[51], which supports the improvement of urban planning in participatory and inclusive ways. For this reason, researchers have leveraged the power of public engagement through crowdsourcing to monitor the environment and, in some cases, develop decision support systems for both the public and decision-makers. The proliferation of smartphones has made this process easier as smartphones can capture and share data without any technical knowledge from the users. This made possible the collaborative collection of noise data
[52], air temperature from smartphone batteries
[53][54], the reporting of pollution of coastal zones
[55], etc.
3.4. Data Collection and Optimization
These studies demonstrate the potential of crowdsourcing as a source of data for the GS as well as ways to optimize the data collection methods. For example, in China, several research efforts have developed new methods to increase the spatio-temporal coverage of voluntary crowdsensing tasks to obtain larger and more representative datasets while minimizing the cost and improving privacy. These methods include protecting participants’ privacy, increasing the coverage distribution of sensing tasks through incentive mechanisms
[56], and enhancing data forwarding performance through cooperative data forwarding mechanisms
[57][58]. Taking into consideration the characteristics of the GS, other studies showed different solutions to involve the public in data gathering and experiment design
[59]. Recently, there has been growing interest in the potential of crowdsourcing as a data collection method for monitoring sustainable development goals (SDGs) in the GS. Pateman et al.
[60] provided a review of the use of citizen science for monitoring SDGs in low- and middle-income countries, while Fraisl et al.
[6] introduced a citizen science tool (Picture Pile) for monitoring SDGs.
3.5. Assessment of Crowdsourcing Methods for Urban Planning
Some studies have assessed crowdsourcing methods in the context of urban planning in the GS. Given the novelty of crowdsourcing in the GS, such studies are crucial for assessing its applicability and usefulness for cities in this part of the world. While most studies adopt a more objective approach using statistical evaluations (of the density, accuracy, nature of the crowd, etc.), others opt for a subjective method based on users’ perceptions (perceived usefulness, perceived ease of use, perceived satisfaction, etc.). The objective assessments mainly focused on collaborative mapping and were conducted in China
[61][62], Turkey
[63], Kenya
[64], as well as cities in Argentina and Uruguay
[65], most of them focusing on OSM. Regarding the subjective assessments, Cilliers & Flowerday
[66] investigated the subjective factors affecting the intention to use the Interactive Voice Response (IVR) system in South Africa, while Bugs et al.
[67] examined the perceived ease of use, perceived usefulness, and satisfaction with a Web-based PPGIS platform for urban planning in Brazil.
3.6. Smart City Management
Smart cities put the public at the center of the planning process. Therefore, participatory approaches such as crowdsourcing play an important role, as they allow the public to share their ideas and opinions for more efficient planning practices. However, the GS is behind the rest of the world in terms of smart city management due to a lack of basic infrastructure and of a clear understanding of what a smart city should be in local contexts. For this reason, crowdsourcing could start with an exchange on steps towards smart city transformation in the context of the GS. This is the method adopted by Kumar et al.
[29], who crowdsourced ideas (idea generation) for smart city transformation in India. Another step would be to consult the public on the efficient management of the existing resources, as demonstrated by other studies in the GS
[68].
3.7. Urban Demographics
The rapid population growth in many cities of the GS, especially African cities, raises some challenges which could be mitigated with data-driven methods. Such methods could help monitor the changes in the population, predict future trends and implement proactive policies to face future challenges. However, despite the potential advantages for the GS, urban population estimation has not been widely investigated in the area as all reviewed studies were conducted in China
[69][70][71][72]. In the aforementioned studies, crowdsourcing (collaborative mapping through OSM) was adopted as supplementary open data so as to improve the accuracy of the mapping algorithms.
3.8. Disaster Detection and Management
Although natural disasters occur in all regions of the world, the GS is particularly vulnerable to them due to the lack of resources for disaster detection and management. Crowdsourcing, especially collaborative mapping, has played an important role in helping the GS face these challenges. One of the main examples is the use of OSM for disaster relief during the 2010 earthquake in Haiti. Some studies have shown how public engagement can help improve flood mapping in the GS
[73][74][75]. Crowdsourced data can supplement other datasets (e.g., wireless sensor networks data) to develop spatial decision support systems (SDSS) for flood management, as demonstrated by Horita et al.
[75].