Shenyang, located in Liaoning Province, China, has passed through several historical phases: a frontier military town, a subsidiary of the South Manchurian Railway, a national heavy industrial base, and a city in transition (Xu et al., 2019). Shenyang is currently the only mega-city in Northeast China, with an administrative area of 1.29 × 10⁴ km² and a resident population of 8.32 × 10⁶ in 2018 (Shenyang Statistics Bureau, 2020). It has ten municipal districts, two counties and one county-level city under its jurisdiction. We chose the core area of Shenyang (hereinafter referred to as ‘Shenyang Proper’) as the research area, which includes five central districts (Heping, Shenhe, Dadong, Huanggu, and Tiexi) and four suburban districts (Hunnan, Shenbei New District, Yuhong, and Sujiatun) (Fig. 1).
The residential and employment population data were extracted from cell phone signalling data covering one consecutive month, July 2018. Cell phone signalling data consist of both active and passive signalling data. Active signalling data are generated when users switch their cell phones on or off, make calls, send or receive SMS messages, or carry their cell phones to a new location so that the base station connected to the phone changes. Passive signalling data are generated when the cell phone connects to the base station at occasional intervals. The algorithm is deployed on the mobile operator’s platform. It first estimated the workplace and residence of each individual user, and then took a 250 m × 250 m grid as the statistical unit to obtain the number of people working and living in each grid. The heat map, obtained from the operator and comprising 1.51 × 10⁷ items in total, represents the total population appearing in each grid per hour per day and was used to calculate the leisure population in each grid. The operator derived the heat map from the active and passive cell phone signalling data collected from July 16th to July 22nd, 2018, using the algorithm deployed on its platform. We selected the non-holiday heat map to calculate the leisure population. We clipped the employment population, residential population and heat map, all stored in grid form, to the administrative boundary data and unified the projected coordinate system as WGS-84, UTM.
An individual user’s workplace is estimated from the actively and passively observed base station locations of users aged 16–64 during the observation period (9:00–17:00); the user’s actual location is measured by weighted extrapolation of the base station locations and their observation frequencies (referred to as the ‘base station weighted center of mass algorithm’). An individual user’s residence is estimated in the same way from the actively and passively observed base station locations during the observation period (21:00–8:00 the next day), again using the base station weighted center of mass algorithm. The ‘heat map’ is based on the active and passive cell phone signalling data received by each grid in each hour of the day, and the grid square in which a user is actually located is determined by the base station weighted center of mass algorithm. The above algorithm is based on continuous active and passive cell phone signalling for one month. Compared with previous data that were short-term (1–2 weeks) and covered only one type of signalling (active or passive), it identifies the location of individual users more accurately by increasing the number of signalling data samples. In addition, two constraints were added when inferring the workplace and residence of individual users to locate them more accurately. For the first constraint, the algorithm automatically added up the frequency of the user’s occurrence at each location within a month and then took the most frequent location as the user’s location. For the second constraint, the user should occur at the most frequent location on more than 10 days. Thus, the algorithm can effectively filter out the positioning errors of individual users and further improve the accuracy of the number of people working and living in each grid.
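The weighted center of mass step and the two constraints can be illustrated with a minimal sketch. The operator’s actual implementation is not described in detail, so everything here is an assumption made for illustration: the function names, the input layout (projected coordinates in metres with an observation count per base station), the use of one centroid per day, and the mapping of centroids to 250 m grid cells.

```python
from collections import Counter

def weighted_centroid(observations):
    """Frequency-weighted center of mass of base station locations.

    observations: list of (x, y, count) tuples, where x and y are projected
    coordinates in metres and count is how often the station was observed.
    (Hypothetical input layout, not the operator's data format.)
    """
    total = sum(count for _, _, count in observations)
    if total == 0:
        raise ValueError("no observations to weight")
    cx = sum(x * count for x, _, count in observations) / total
    cy = sum(y * count for _, y, count in observations) / total
    return cx, cy

def to_grid(x, y, cell=250.0):
    """Index of the square grid cell (cell x cell metres) containing a point."""
    return int(x // cell), int(y // cell)

def infer_location(daily_observations, min_days=10, cell=250.0):
    """Apply the two constraints to one user's observations for a month.

    daily_observations: dict mapping a day label to that day's list of
    (x, y, count) tuples within the observation period.
    Returns the most frequent grid cell if the user appeared there on
    more than min_days days, otherwise None.
    """
    days_per_cell = Counter()
    for observations in daily_observations.values():
        if not observations:
            continue
        cx, cy = weighted_centroid(observations)
        days_per_cell[to_grid(cx, cy, cell)] += 1
    if not days_per_cell:
        return None
    # Constraint 1: take the most frequent location.
    best_cell, n_days = days_per_cell.most_common(1)[0]
    # Constraint 2: require more than min_days days at that location.
    return best_cell if n_days > min_days else None
```

For example, a user whose daily centroid falls in the same cell on 11 days is assigned to that cell, whereas a user seen there on only 10 days is left unassigned and does not contribute to the grid totals.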
Using the above method, we finally identified about 2.32 million residents and 1.43 million workers in Shenyang City. The residential population identified for each district correlates strongly with the government demographic data (Shenyang Statistics Bureau, 2020), indicating that inferring residential and working populations from cell phone data is feasible (Fig. 2).
Figure 2. Comparison of the population identified from cell phone data with the demographic statistics for the districts of Shenyang City, China
The Points of Interest (POI) data of Shenyang City were mainly collected from Amap. They comprise 4.90 × 10⁵ items in total, each recording the name, type, latitude, and longitude of an urban socio-economic entity. After data cleaning, nine types of urban facility POI data were used to analyse the interrelationship between population activities and urban infrastructure, namely daily services (convenience stores, beauty salons, intermediaries), medical care (healthcare services), public services (public facilities), transportation (bus stations, parking lots), restaurants (food services), shopping (shopping malls, hypermarkets, supermarkets), attractions (scenic spots), accommodation (hotels, hostels, guest houses), and financial offices (excluding ATMs). This yielded a total of 1.81 × 10⁵ urban facility POIs. The road network data include expressways, ring roads, national roads, provincial roads, railroads, and county roads. The socio-economic data include the demographic data of Shenyang City in 2019 (Shenyang Statistics Bureau, 2020). The ‘Land use planning map of the central urban area of Shenyang’, the ‘Spatial structure planning map of central urban area’ and other planning documents were collected from the ‘Shenyang City’s Master Plan (2011–2020)’ (Chinese Urban Planning Society, 2016).
The flowchart of the proposed research framework is illustrated in Fig. 3. Based on the residential and working population numbers in grids obtained from cell phone signalling and other auxiliary data, we measured the residential and employment population densities at different scales. Then, by integrating cell phone data, heat map data, boundary data and demographic data, we calculated the leisure population and mapped the distribution of the residential, employment and leisure populations at different scales. Next, we extracted multi-scale and multi-type UFAs of Shenyang City using the multi-level functional area division method we constructed. Finally, with the support of POI data, we calculated the correlation between active population density and urban facility density.
The district units are the basic administrative units of urban research and can reflect the overall urban functional layout at the macro-level (Tian et al., 2010). The block units are the basic support for understanding urban morphological structure (Xue et al., 2020b). The grid units can characterize the details of land functions from a more microscopic perspective (Crooks et al., 2015). Therefore, this study takes the grid, block, and district as three different unit scales (Fig. 3). The grid and district units were obtained directly from the cell phone signalling data, while the block units were delineated from the study area boundary data and road data. Road space is outside the scope of this study, so road areas were excluded from the study area. According to the average width of each road class, the road areas, i.e., highway, ring road, national highway, provincial highway, railroad, and county road, were widened by 40 m, 40 m, 40 m, 40 m, 30 m, and 20 m, respectively. The unoccupied green land within the ring road was also excluded. Finally, a total of 330 block units were formed for functional area classification.
The spatial join tool of ArcGIS 10.3 was applied to map the residential and employment populations at different scales. Following Niu et al. (2014), we considered 15:00 on weekends to be the most intensive period for leisure activities. The leisure population in each grid was therefore obtained by averaging the number of people present during 15:00–16:00 on Saturdays and Sundays. To distinguish the types of mixed functional areas more accurately and to reduce errors in identifying non-leisure areas where users stayed during rest time, three indicators were used to identify the different categories of functional areas: the residential population density, the ratio of employment to residential population density, and the ratio of leisure to residential population density at 15:00 on weekends (Table 1). Niu et al. (2014) classified the functional areas in the central city of Shanghai into high, medium, and low residential population densities, and selected only the areas with ‘high’ and ‘low’ population densities for study. However, the research area of this study consists of both central and suburban areas. If the population density were classified only as ‘high’ and ‘low’, the large suburban areas with ‘medium’ population density would be ignored. Therefore, medium-density residential areas are included in the functional area identification method. The residential population density values were classified into three levels, high (H), medium (M) and low (L), by the natural breaks method (Table 1). A ratio of employment to residential population density less than or equal to 1 is defined as ‘la’, and a ratio greater than 1 as ‘ha’. A ratio of leisure to residential population density at 15:00 on weekends less than or equal to 1 is defined as ‘lb’, and a ratio greater than 1 as ‘hb’ (Table 1).
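The leisure population calculation described above can be sketched as follows. This is only an illustration under stated assumptions: the heat map for one grid is represented as a hypothetical dictionary keyed by the start of each hour, and the function name and input layout are not taken from the study.

```python
from datetime import datetime
from statistics import mean

def leisure_population(hourly_counts, hour=15):
    """Average weekend headcount of one grid for the hour starting at `hour`.

    hourly_counts: dict mapping a datetime (start of the hour) to the number
    of people the heat map records in the grid during that hour.
    (Hypothetical layout chosen for illustration.)
    """
    weekend_counts = [
        count
        for timestamp, count in hourly_counts.items()
        # weekday() is 5 for Saturday and 6 for Sunday
        if timestamp.weekday() >= 5 and timestamp.hour == hour
    ]
    return mean(weekend_counts) if weekend_counts else 0.0

# One grid during the heat map week of July 16th to 22nd, 2018.
counts = {
    datetime(2018, 7, 20, 15): 999,  # Friday, ignored
    datetime(2018, 7, 21, 14): 999,  # Saturday, wrong hour, ignored
    datetime(2018, 7, 21, 15): 100,  # Saturday 15:00-16:00
    datetime(2018, 7, 22, 15): 140,  # Sunday 15:00-16:00
}
grid_leisure = leisure_population(counts)
```

With these made-up counts, only the Saturday and Sunday values for 15:00–16:00 enter the average.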
We identified different functional areas according to the combination of residential population density level (RPDL), the level of the ratio of employment to residential population density (RERPDL), and the level of the ratio of leisure to residential population density (RLRPDL). For example, when RPDL is H, RERPDL is ha, and RLRPDL is hb, the area is defined as a ‘residential-employment-leisure mixed functional area with high residential population density’, represented by ‘H: R-E-L’. In Fig. 3, we refer to mixed functional areas with different residential population density levels collectively as ‘mixed functional areas’. Using the spatial join tool of ArcGIS 10.3, the ratio of employment to residential population density and the ratio of leisure to residential population density at 15:00 on weekends were calculated for the different spatial units. Based on the three indicators, the areas were divided into residential functional areas (R), employment functional areas (E), leisure functional areas (L), and mixed functional areas (R-E, R-L, E-L, R-E-L) composed of different functions. The mixed functional areas were further divided into more detailed types (see the last column of Table 1) according to the residential population density level within each unit.
| Residential population density | Ratio of employment to residential population density | Ratio of leisure to residential population density at 15:00 on weekends | Functional area |
|---|---|---|---|
| H | ha | hb | H: R-E-L |
| H | ha | lb | H: R-E |
| H | la | hb | H: R-L |
| H | la | lb | H: R |
| M | ha | hb | M: R-E-L |
| M | ha | lb | M: R-E |
| M | la | hb | M: R-L |
| M | la | lb | M: R |
| L | ha | hb | L: E-L |
| L | ha | lb | L: E |
| L | la | hb | L: L |
| L | la | lb | L: R |

Notes: High level (H), Medium level (M), Low level (L)
Table 1. Identification criteria of Shenyang City’s urban functional area in July 2018
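The identification criteria in Table 1 amount to a lookup on three levels, which the following minimal sketch makes explicit. The function and variable names are illustrative, and the residential density level is taken as an input because the natural breaks thresholds are not reported here. One further assumption is flagged in the code: a unit with zero residential density is handled by comparing the densities directly instead of dividing.

```python
# Functional area for each (RPDL, RERPDL, RLRPDL) combination in Table 1.
TABLE_1 = {
    ("H", "ha", "hb"): "H: R-E-L",
    ("H", "ha", "lb"): "H: R-E",
    ("H", "la", "hb"): "H: R-L",
    ("H", "la", "lb"): "H: R",
    ("M", "ha", "hb"): "M: R-E-L",
    ("M", "ha", "lb"): "M: R-E",
    ("M", "la", "hb"): "M: R-L",
    ("M", "la", "lb"): "M: R",
    ("L", "ha", "hb"): "L: E-L",
    ("L", "ha", "lb"): "L: E",
    ("L", "la", "hb"): "L: L",
    ("L", "la", "lb"): "L: R",
}

def classify(rpdl, residential, employment, leisure):
    """Return the Table 1 functional area label for one spatial unit.

    rpdl: residential population density level, 'H', 'M' or 'L'.
    residential, employment, leisure: population densities of the unit.
    Assumption: 'ratio > 1' is tested as numerator > denominator, so a unit
    with zero residential density counts as 'high' whenever the other
    density is positive. The study does not state how this case is handled.
    """
    rerpdl = "ha" if employment > residential else "la"
    rlrpdl = "hb" if leisure > residential else "lb"
    return TABLE_1[(rpdl, rerpdl, rlrpdl)]
```

For instance, `classify("H", 100, 150, 120)` gives the residential-employment-leisure mixed type, while a low-density unit whose leisure density alone exceeds its residential density is labelled a leisure functional area.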
To analyse the spatial distribution of population activities, we divided Shenyang into three circles: 1) the central urban area within the Third Ring Road, 2) the suburban area outside the Third Ring Road and within the Third Ring Road, and 3) the remaining area outside the Third Ring Road. The residential, employment, and leisure population densities were each divided into five levels by the natural breaks method. At the grid scale, all three densities gradually decreased from the center to the periphery of the central urban area. There are several high-density centers in the suburbs and many high-density small-scale centers in the suburban area (Fig. 4). This distribution characteristic of the population density is consistent with the urban structure pattern (i.e., ‘main city + sub-city + multi-center’) proposed in the Shenyang City’s Master Plan (2011–2020). The distribution characteristics of the residential and leisure population densities are similar, with values ranging from 0 to 24 224 persons/km² and from 0 to 20 368 persons/km², respectively. Both densities decreased in a stepwise manner along the First, Second and Third Ring Roads. The employment population density ranged from 0 to 40 496 persons/km², showing a more pronounced concentric pattern and a steep decline outside the Second Ring Road.
At the grid scale, single functional areas covered 2.24 × 10³ km², dominated by the residential functional area (H: R, Table 1) with 1.55 × 10³ km², followed by the leisure functional area (L: L, Table 1) with 515 km² and the employment functional area (L: E, Table 1) with 175 km². Mixed functional areas covered 254 km² and were dominated by the employment-leisure functional area (L: E-L) with 222 km², followed by the mixed residential-leisure functional area (R-L, including M: R-L and H: R-L) with 17 km², the mixed residential-employment-leisure functional area (R-E-L, including H: R-E-L and M: R-E-L) with 13 km², and the mixed residential-employment functional area (R-E, including H: R-E and M: R-E) with 2.4 km². Single functional areas were widely distributed (Fig. 5a), while mixed functional areas were clustered in the central urban area (Fig. 5b). In particular, the R-E-L area within the Second Ring Road accounted for 89% of all R-E-L area in Shenyang City. This is mainly because the central urban area was densely occupied by residences and offices and its populations were highly mixed, whereas the population in the near and far suburban areas was relatively sparse and the different types of functional spaces were dispersed. In addition, single and mixed functional areas appeared to be synergistically clustered in the near and far suburbs (Fig. 5c, Fig. 5d).
Figure 5. Spatial distribution of single functional areas, including the leisure functional area with low residential population density (L: L), the employment functional area with low residential population density (L: E) and the residential functional area with high residential population density (H: R), and of mixed functional areas, including the employment-leisure functional area (E-L), the residential-leisure functional area (R-L), the residential-employment functional area (R-E) and the residential-employment-leisure functional area (R-E-L), in Shenyang City, China
At the block scale, the only single functional area was the leisure functional area (L: L), which covered 3.04 × 10³ km² and was widely distributed in the periphery of the city. Mixed functional areas covered 389 km² (Fig. 6a) and were dominated by H: R-L (283.7 km²), followed by E-L, which includes M: E-L and H: E-L (95.3 km²), and R-E-L, which includes H: R-E-L and M: R-E-L (9.6 km²). The H: R-L areas were mainly distributed in the central urban area, extending southward along Shenyang City’s central axis (i.e., Youth Street) to the suburbs. The E-L areas were scattered across the suburbs like ‘satellites’. The R-E-L areas were mainly located along Taiyuan Street and Zhongjie Street, which were proposed as future ‘municipal centers’ in the ‘Shenyang City’s Master Plan (2011–2020)’ and planned to be developed into comprehensive service centers. At the district scale, there were three kinds of UFAs: L: L, H: R-L and M: R-L (Fig. 6b). The L: L areas were located in northern and southern Shenyang, while the M: R-L and H: R-L areas were in eastern and western Shenyang, reflecting that the residential function was strong in the west, weak in the east, and weaker still in the north and south. Mixed functional areas were few at every scale, and most of them were residential-leisure mixed functional areas. The residential and employment functions tended to be distributed independently.
Figure 6. Spatial distribution of the single functional area (leisure functional area with low residential population density, L: L) and of mixed functional areas, including the employment-leisure functional area with low residential population density (L: E-L), the residential-employment-leisure mixed functional area with high residential population density (H: R-E-L), the residential-leisure mixed functional area with high residential population density (H: R-L), the employment-leisure mixed functional area with medium residential population density (M: E-L), the residential-employment-leisure mixed functional area with medium residential population density (M: R-E-L) and the residential-leisure mixed functional area with medium residential population density (M: R-L) (left: block scale, right: district scale), in Shenyang City, China
Comparing the identification results across scales, we found that there are fewer functional area types at the block or district scale than at the grid scale. The functions tend to be diverse and mixed as the study unit becomes larger, showing a scale effect (Fig. 7). A small-scale study unit accommodates a smaller population and reveals diverse local functional details, which is helpful for refined space management. At a larger scale, local functional details are harder to discover, and a block or district is more likely to be identified as a mixed functional area. However, block-scale UFAs can be compared with traditional urban spatial structure models, which helps to develop new urban structure theory for newly urbanized areas and supports urban master planning.
At the grid scale, residential functional areas became increasingly scarce from the center to the periphery. There were few residential functional areas in the areas planned to be built into ‘new cities’. The H: R areas were concentrated in the central urban area. The M: R areas decreased from the First Ring Road outward (Fig. 8a). The L: R areas were widely distributed outside the Third Ring Road. The residential functional areas mainly correspond to the residential land on the land use planning map (Fig. 8b), showing that the UFA identification results are reliable. There were multiple clusters of employment functional areas in the suburban industrial new city, economic development zones and high-tech industrial zones. These employment functional areas corresponded to the industrial land and storage land on the land use planning map, which further supports the reliability of the identification results. The large employment functional areas were mostly surrounded by L: R areas, so the industrial new city should strengthen its comprehensive functions to promote its own internal circulation. There were few leisure functional areas in the central urban area, but more in the suburbs.
Figure 8. Comparison between the single functional areas and the land use planning map of central city of Shenyang City, China. The single functional areas include residential functional area with high residential population density (H:R), residential functional area with medium residential population density (M:R), employment functional area with low residential population density (L:E), residential functional area with low residential population density (L:R) and leisure functional area with low residential population density (L:L)
The distribution of R-E-L was consistent with planning expectations and was driven by policy forces. There were four contiguous R-E-L clusters in the central city (Fig. 9). Comparison with digital maps and field surveys showed that the four clusters were located at Shenyang North Railway Station, Taiyuan Street, Xiaodong Road and the Northeastern University area, respectively. These areas were to be built into ‘regional comprehensive service industry centers’ under the ‘Shenyang City’s Master Plan (2011–2020)’. The R-E areas were concentrated in the old town of Tiexi District and in southern Huanggu District. After the municipal government replanned Tiexi District in 2002 and industrial enterprises were relocated to the new town of Tiexi District, the old industrial Tiexi was transformed into a modern commercial and residential center (Xue et al., 2016). The R-E type was the least widely distributed. The E-L areas were scattered across the whole city, but were less common within the Second Ring Road.
Figure 9. Spatial distribution of mixed functional areas, including the residential-leisure mixed functional area with high residential population density (H: R-L), the residential-employment mixed functional area with high residential population density (H: R-E), the residential-employment-leisure mixed functional area with high residential population density (H: R-E-L), the residential-leisure mixed functional area with medium residential population density (M: R-L), the residential-employment mixed functional area with medium residential population density (M: R-E), the residential-employment-leisure mixed functional area with medium residential population density (M: R-E-L) and the employment-leisure mixed functional area with low residential population density (L: E-L)
The correlation coefficients between urban facilities and active population density at block scale passed the two-sided hypothesis test. Daily service facilities and catering facilities were highly correlated with the residential population density. Transportation facilities, catering facilities, and daily service facilities were highly correlated with the employment population density. Transportation facilities, daily service facilities and catering facilities were highly correlated with the leisure population density (Table 2). Therefore, four types of facilities, namely, transportation, catering, daily service and shopping, were most closely related to population activities. Three types of facilities, namely, public service, accommodation and financial office, were less closely related to population activities. Medical and attraction facilities were least related to population activities. In addition, there was also a high degree of pairwise correlation among residential population density, employment population density and leisure population density.
| | RP | EP | LP | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 | A9 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RP | 1 | | | | | | | | | | | |
| EP | 0.932 | 1 | | | | | | | | | | |
| LP | 0.976 | 0.968 | 1 | | | | | | | | | |
| A1 | 0.916 | 0.888 | 0.94 | 1 | | | | | | | | |
| A2 | 0.565 | 0.527 | 0.562 | 0.555 | 1 | | | | | | | |
| A3 | 0.732 | 0.81 | 0.808 | 0.824 | 0.409 | 1 | | | | | | |
| A4 | 0.906 | 0.943 | 0.943 | 0.941 | 0.542 | 0.837 | 1 | | | | | |
| A5 | 0.867 | 0.911 | 0.929 | 0.964 | 0.533 | 0.894 | 0.953 | 1 | | | | |
| A6 | 0.907 | 0.877 | 0.926 | 0.957 | 0.582 | 0.789 | 0.906 | 0.927 | 1 | | | |
| A7 | 0.416 | 0.454 | 0.468 | 0.517 | 0.262 | 0.613 | 0.474 | 0.551 | 0.5 | 1 | | |
| A8 | 0.654 | 0.758 | 0.776 | 0.798 | 0.43 | 0.802 | 0.791 | 0.887 | 0.758 | 0.452 | 1 | |
| A9 | 0.681 | 0.851 | 0.74 | 0.706 | 0.352 | 0.751 | 0.795 | 0.791 | 0.673 | 0.407 | 0.696 | 1 |

Notes: RP is residential population density, EP is employment population density, LP is leisure population density; A1–A9 are daily service facility density, medical facility density, public service facility density, transportation facility density, restaurant facility density, shopping facility density, attraction facility density, accommodation facility density and financial office facility density, respectively
Table 2. Correlation coefficient matrix between active population density / (persons/km²) and urban facility density / (units/km²)
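Each entry of Table 2 is a correlation coefficient between two density series measured over the same block units. The type of coefficient is not stated, so the sketch below assumes the Pearson coefficient and pairs it with the t statistic commonly used for a two-sided significance test. The function names and the sample values are made up for illustration and are not the study’s data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)

def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom.

    A two-sided test compares |t| with the critical value of Student's t
    distribution at the chosen significance level.
    """
    if n < 3:
        raise ValueError("need at least 3 observations")
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Illustrative densities for five hypothetical blocks (not real data).
population_density = [1.0, 2.0, 3.0, 4.0, 5.0]
facility_density = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(population_density, facility_density)
t = t_statistic(r, len(population_density))
```

Applying `pearson_r` to every pair of the twelve density series would fill the lower triangle of a matrix laid out like Table 2.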
Multi-source Data-driven Identification of Urban Functional Areas: A Case of Shenyang, China
- Received Date: 2022-05-14
- Accepted Date: 2022-09-09
- Available Online: 2022-11-02
- Publish Date: 2023-01-05
- Keywords: human-land relationship; multi-source big data; urban functional area; identification method; Shenyang City
Abstract: Urban functional area (UFA) is a core scientific issue affecting urban sustainability. The current knowledge gap is mainly reflected in the lack of multi-scale quantitative interpretation methods from the perspective of human-land interaction. In this paper, based on multi-source big data including 250 m × 250 m resolution cell phone data, 1.81 × 10⁵ Points of Interest (POI) data and administrative boundary data, we built a UFA identification method and demonstrated it empirically in Shenyang City, China. We argue that the method can effectively identify multi-scale, multi-type UFAs based on human activity and further reveal the spatial correlation between urban facilities and human activity. The empirical study suggests that the employment functional zones in Shenyang City are more concentrated in the central city than other single functional zones. There are more mixed functional areas in the central city, while the planned industrial new cities in Shenyang need to develop comprehensive functions. UFAs have scale effects and human-land interaction patterns. We suggest that city decision makers apply multi-source big data to measure urban functional services in a more refined manner from a supply-demand perspective.
Citation: XUE Bing, XIAO Xiao, LI Jingzhong, ZHAO Bingyu, FU Bo, 2023. Multi-source Data-driven Identification of Urban Functional Areas: A Case of Shenyang, China. Chinese Geographical Science, 33(1): 21−35. doi: 10.1007/s11769-022-1320-2