**Abstract**

Metro ridership prediction is a crucial but challenging task in intelligent transportation systems because of its widespread real-world applications. However, conventional methods either ignore the topological information of metro systems or learn directly on the physical topology, and therefore cannot fully explore the patterns of ridership evolution. To address this problem, we model a metro system as graphs with various topologies and propose a unified Physical-Virtual Collaboration Graph Network (PVCGN), which can effectively learn the complex ridership patterns from these tailor-designed graphs. Specifically, a physical graph is built directly from the realistic topology of the studied metro system, while a similarity graph and a correlation graph are built with virtual topologies under the guidance of the inter-station passenger flow similarity and correlation. These complementary graphs are incorporated into a Graph Convolution Gated Recurrent Unit (GC-GRU) for spatial-temporal representation learning. Further, a Fully-Connected Gated Recurrent Unit (FC-GRU) is applied to capture the global evolution tendency. Finally, we develop a Seq2Seq model with GC-GRU and FC-GRU to forecast the future metro ridership sequentially. Extensive experiments on two large-scale benchmarks (i.e., Shanghai Metro and Hangzhou Metro) demonstrate the superiority of PVCGN for station-level metro ridership prediction. Moreover, we apply the proposed PVCGN to online origin-destination (OD) ridership prediction, and the experimental results show the universality of our method.

**Framework**

We develop a unified Physical-Virtual Collaboration Graph Network (PVCGN) to address station-level metro ridership prediction. PVCGN consists of an encoder and a decoder, both of which contain two Collaborative Gated Recurrent Modules (CGRMs). The encoder takes {X_{t−n+1}, X_{t−n+2}, ..., X_t} as input, and the decoder forecasts a future ridership sequence {X̂_{t+1}, X̂_{t+2}, ..., X̂_{t+m}} with fully-connected (FC) layers.
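The encoder input and decoder target described above can be read as a sliding window over the ridership history: n past snapshots in, m future snapshots out. The following is a minimal sketch of that windowing, not the authors' data pipeline; the function name `make_windows` and the representation of each snapshot as an arbitrary Python object are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple, TypeVar

Snapshot = TypeVar("Snapshot")  # one ridership snapshot X_t (hypothetical type)


def make_windows(
    series: Sequence[Snapshot], n: int, m: int
) -> List[Tuple[List[Snapshot], List[Snapshot]]]:
    """Split a time-ordered series into (history, future) training pairs.

    For each time index t, the history is X_{t-n+1}..X_t (n snapshots)
    and the future is X_{t+1}..X_{t+m} (m snapshots).
    """
    if n <= 0 or m <= 0:
        raise ValueError("n and m must be positive")
    pairs = []
    # t is the index of the last observed snapshot in each window.
    for t in range(n - 1, len(series) - m):
        history = list(series[t - n + 1 : t + 1])
        future = list(series[t + 1 : t + m + 1])
        pairs.append((history, future))
    return pairs


# Toy series of six snapshots, using integers in place of ridership matrices.
windows = make_windows(list(range(6)), n=3, m=2)
```

With six snapshots, n = 3 and m = 2, this yields two pairs; a series shorter than n + m yields none.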

Each CGRM consists of a Graph Convolution Gated Recurrent Unit (GC-GRU) and a Fully-Connected Gated Recurrent Unit (FC-GRU). FC denotes a fully-connected layer.
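Both units in a CGRM are variants of the Gated Recurrent Unit. As a reminder of the gating they share, here is a minimal sketch of one standard GRU update on a scalar input and hidden state. It is the textbook GRU, not the GC-GRU or FC-GRU themselves: in those units the scalar multiplications would be replaced by graph convolutions or fully-connected layers. The parameter names (`wz`, `uz`, and so on) and the helper `encode` are hypothetical.

```python
from math import exp, tanh
from typing import Dict, Sequence


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + exp(-v))


def gru_step(x: float, h: float, p: Dict[str, float]) -> float:
    """One standard GRU update for a scalar input x and hidden state h."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])  # reset gate
    # Candidate state: the reset gate controls how much of h is reused.
    cand = tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
    # Blend the old state and the candidate according to the update gate.
    return (1.0 - z) * h + z * cand


def encode(xs: Sequence[float], p: Dict[str, float], h0: float = 0.0) -> float:
    """Run the GRU over a sequence and return the final hidden state."""
    h = h0
    for x in xs:
        h = gru_step(x, h, p)
    return h


# Illustrative parameters only: all weights and biases set to zero.
zero_params = {k: 0.0 for k in ("wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh")}
h_next = gru_step(1.0, 2.0, zero_params)
```

With all parameters at zero, both gates equal 0.5 and the candidate state is 0, so the hidden state is simply halved at each step.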

Specifically, PVCGN incorporates a physical graph, a similarity graph and a correlation graph into a Graph Convolution Gated Recurrent Unit to facilitate the spatial-temporal representation learning. Furthermore, a Fully-Connected Gated Recurrent Unit is utilized to learn the semantic feature of global evolution tendency. Finally, we apply a Seq2Seq model to sequentially forecast the metro ridership at the next several time intervals.
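To make the idea of physical and virtual graphs concrete, the sketch below builds a physical adjacency matrix from track links and a virtual adjacency matrix from historical flow series, then aggregates station features over several graphs. It rests on several assumptions that are not taken from the text above: Pearson correlation stands in for the flow-based similarity and correlation measures, only the `top_k` most related stations are kept as virtual neighbours, and the per-graph results are simply summed with a self term. All function names are hypothetical.

```python
from math import sqrt
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / sqrt(va * vb)


def physical_graph(num_stations: int, links: Sequence[Tuple[int, int]]) -> Matrix:
    """Undirected adjacency matrix built from physically connected station pairs."""
    adj = [[0.0] * num_stations for _ in range(num_stations)]
    for i, j in links:
        adj[i][j] = adj[j][i] = 1.0
    return adj


def virtual_graph(flows: Sequence[Sequence[float]], top_k: int) -> Matrix:
    """Link each station to its top_k most positively correlated stations."""
    n = len(flows)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        scores = [(pearson(flows[i], flows[j]), j) for j in range(n) if j != i]
        scores.sort(reverse=True)
        for score, j in scores[:top_k]:
            if score > 0:  # assumption: drop non-positive relations
                adj[i][j] = score
    return adj


def row_normalize(adj: Matrix) -> Matrix:
    """Scale each row to sum to 1; rows without neighbours stay all zero."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([w / total if total > 0 else 0.0 for w in row])
    return out


def aggregate(graphs: Sequence[Matrix], features: Sequence[float]) -> List[float]:
    """Self feature plus the neighbour average taken over every graph."""
    n = len(features)
    result = list(features)
    for adj in graphs:
        norm = row_normalize(adj)
        for i in range(n):
            result[i] += sum(norm[i][j] * features[j] for j in range(n))
    return result


# Toy example: three stations on a line, with four time steps of flow each.
phys = physical_graph(3, [(0, 1), (1, 2)])
virt = virtual_graph([[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]], top_k=1)
mixed = aggregate([phys], [1.0, 2.0, 3.0])
```

In the toy example, stations 0 and 1 have identically shaped flow curves and become virtual neighbours even though the virtual graph never looks at the track layout, while station 2, whose flow moves in the opposite direction, gets no virtual neighbour.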

**Experiment**

We compare the proposed PVCGN with eleven representative approaches on the whole testing sets (including all time intervals and all metro stations). The methods are Historical Average (HA), Random Forest (RF), Gradient Boosting Decision Trees (GBDT), Multi-Layer Perceptron (MLP), Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), ASTGCN [1], STG2Seq [2], DCRNN [3], GCRNN, and Graph-WaveNet [4]. Their performance on the SHMetro and HZMetro datasets is summarized as follows.

We also employ the proposed PVCGN to forecast the online metro origin-destination (OD) ridership on the SHMetro dataset. This task is more challenging because the complete OD distribution cannot be obtained immediately in online metro systems.

**Conclusion**

In this work, we propose a unified Physical-Virtual Collaboration Graph Network to address station-level metro ridership prediction. Unlike previous works that either ignored the topological information of a metro system or modeled directly on the physical topology, we model the studied metro system as a physical graph and two virtual graphs (a similarity graph and a correlation graph) to fully capture the ridership evolution patterns. Specifically, the physical graph is built on the realistic topology of the metro system. The similarity graph and correlation graph are constructed with virtual topologies under the guidance of the historical passenger flow similarity and correlation among different stations. We incorporate these graphs into a Graph Convolution Gated Recurrent Unit (GC-GRU) to learn spatial-temporal representations and apply a Fully-Connected Gated Recurrent Unit (FC-GRU) to capture the global evolution tendency. Finally, these GRUs are used to build a Seq2Seq model for forecasting the ridership of each station. To verify the effectiveness of our method, we construct two real-world benchmarks from massive transaction records of the Shanghai and Hangzhou metro systems. Extensive experiments on these benchmarks show the superiority of the proposed PVCGN.

In future work, we will pay more attention to online origin-destination ridership prediction, where several improvements should be considered. First, the data of unfinished orders can also provide useful information, and we will attempt to estimate the potential OD distribution of unfinished orders. Second, metro ridership evolves periodically. For instance, the ridership at 9:00 on every weekday is usually similar. Therefore, we should also utilize the periodic distribution of OD ridership to facilitate representation learning. Last but not least, some external factors (such as weather and holiday events) may greatly affect the ridership evolution, and we should incorporate these factors to forecast the ridership dynamically.

**Acknowledgement**

This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant U1811463, Grant 61876045, and Grant 61976250; in part by the Natural Science Foundation of Guangdong Province under Grant 2017A030312006; in part by the Guangdong Basic and Applied Basic Research Foundation under Grant 2020B1515020048; in part by the Zhujiang Science and Technology New Star Project of Guangzhou under Grant 201906010057; and in part by the CCF-Tencent Open Research Fund. The Associate Editor for this article was X. Ma.

**References**

[1] S. Guo, Y. Lin, N. Feng, C. Song, and H. Wan, “Attention based spatial-temporal graph convolutional networks for traffic flow forecasting,” in Proc. AAAI, vol. 33, 2019, pp. 922–929.

[2] L. Bai et al., “Stg2seq: Spatial-temporal graph to sequence model for multi-step passenger demand forecasting,” 2019, arXiv:1905.10069. [Online]. Available: https://arxiv.org/abs/1905.10069

[3] Y. Huang, Y. Weng, S. Yu, and X. Chen, “Diffusion convolutional recurrent neural network with rank influence learning for traffic forecasting,” in Proc. 18th IEEE Int. Conf. Trust, Secur. Privacy Comput. Commun., Aug. 2019, pp. 678–685.

[4] Z. Wu, S. Pan, G. Long, J. Jiang, and C. Zhang, “Graph WaveNet for deep spatial-temporal graph modeling,” in Proc. Int. Joint Conf. Artif. Intell., Aug. 2019, pp. 1907–1913.