Assuming that crowd walking load is a jointly stationary stochastic process, a cross-spectral model for low and high pedestrian densities in restricted traffic is proposed based on the coherence function and the auto-power spectral density of the individual walking load. 3-D motion capture technology, widely used in medical science, is introduced to resolve two existing problems in crowd walking load experiments: the inability to record the actions of all pedestrians simultaneously, and the inability to track the movements of pedestrians during walking. Based on statistical analysis of the tests, the distributions of walking frequency, time lag, and coherence function under different pedestrian densities are determined. Furthermore, a methodology for calculating the root-mean-square structural acceleration response at a check point of the structure is established based on the cross-spectral model and stochastic vibration theory. Under the assumption that the locations of pedestrians on the structure are fixed, the proposed model is verified by comparing a crowd walking experiment on a footbridge with the corresponding theoretical prediction obtained using the developed methodology. The proposed model thus provides a new frequency-domain method for analyzing structural response and assessing vibration serviceability under crowd walking load.
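
As an illustration of the kind of frequency-domain calculation described above, the following is a minimal sketch and not the authors' implementation. It assumes a single structural mode, pedestrians at fixed positions, and a cross-spectrum of the generic form S_ij(f) = coh_ij · S_F(f) · exp(−i2πf·τ_ij). The function name `rms_acceleration`, the Gaussian-shaped load spectrum, and all numerical values are hypothetical placeholders; in the paper the coherence and time-lag distributions are obtained from the tests.

```python
import numpy as np

def rms_acceleration(freqs, S_F, phi, coh, tau, f_n, zeta, M):
    """RMS modal acceleration of a single-mode structure under N walking loads.

    freqs : (K,) frequency grid in Hz
    S_F   : (K,) one-sided auto-PSD of one pedestrian's load
    phi   : (N,) mode-shape values at the fixed pedestrian positions
    coh   : (N, N) symmetric coherence matrix (ones on the diagonal)
    tau   : (N, N) antisymmetric time lags in s (tau[j, i] == -tau[i, j])
    f_n, zeta, M : natural frequency (Hz), damping ratio, modal mass
    """
    w = 2.0 * np.pi * freqs
    w_n = 2.0 * np.pi * f_n
    # Force-to-acceleration frequency response of the assumed single mode.
    H_a = -w**2 / (M * (w_n**2 - w**2 + 2j * zeta * w_n * w))
    # Assumed cross-spectrum: coherence scales the magnitude, lag sets the phase.
    phase = np.exp(-2j * np.pi * freqs[:, None, None] * tau[None, :, :])
    S_ij = coh[None, :, :] * S_F[:, None, None] * phase      # shape (K, N, N)
    # PSD of the modal force: sum_i sum_j phi_i * S_ij * phi_j (real by symmetry).
    S_modal = np.einsum("i,kij,j->k", phi, S_ij, phi).real
    S_a = np.abs(H_a) ** 2 * S_modal
    # Trapezoidal integration of the acceleration PSD gives the variance.
    variance = np.sum(0.5 * (S_a[1:] + S_a[:-1]) * np.diff(freqs))
    return np.sqrt(variance)

# Hypothetical inputs, chosen only to exercise the function.
freqs = np.linspace(0.1, 10.0, 2000)
S_F = np.exp(-0.5 * ((freqs - 2.0) / 0.1) ** 2)   # placeholder load spectrum
n_ped = 4
phi = np.ones(n_ped)
no_lag = np.zeros((n_ped, n_ped))
params = dict(f_n=2.0, zeta=0.01, M=1.0e4)

rms_single = rms_acceleration(freqs, S_F, np.ones(1), np.ones((1, 1)),
                              np.zeros((1, 1)), **params)
rms_coherent = rms_acceleration(freqs, S_F, phi, np.ones((n_ped, n_ped)),
                                no_lag, **params)
rms_incoherent = rms_acceleration(freqs, S_F, phi, np.eye(n_ped),
                                  no_lag, **params)
```

In this sketch the two limiting cases behave as random vibration theory predicts: with unit coherence and zero lag the response of N pedestrians is N times that of one pedestrian, and with zero coherence it is √N times. Measured coherence and lag values would fall between these bounds.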