Full-face hard rock tunnel boring machines (TBMs) automatically generate massive volumes of excavation data during tunnel construction. Proper screening and cleaning of these data is crucial to data quality and, in turn, to the intelligent construction of tunnels. Based on the characteristics of TBM excavation data from the Yinchuo Project, an integrated preprocessing method is proposed that comprises complete excavation segment extraction, internal excavation segmentation, and excavation parameter noise reduction. To verify the effectiveness of the method, a torque penetration index (TPI) prediction model is developed using the long short-term memory (LSTM) algorithm, which is well suited to time-series prediction. The results demonstrate that the proposed preprocessing method significantly improves data quality and enhances the prediction accuracy of the deep learning model. On the validation dataset, R² increases from 0.503 to 0.721, R′ increases from 0.809 to 0.900, and the mean relative error (MRE) decreases from 3.107 to 0.096. These results have practical implications for improving the precision and reliability of TBM tunneling data and provide a reference for further research in related areas.
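For readers who want to reproduce the reported evaluation, the following minimal sketch shows how two of the metrics could be computed. It assumes the conventional definitions of the coefficient of determination (R²) and the mean relative error (MRE); the abstract does not state the exact formulas used, and R′ is left out because it is not defined here. The function names `r_squared` and `mean_relative_error` are illustrative, not taken from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares between measured and predicted values
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares of the measured values around their mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_relative_error(y_true, y_pred):
    """Mean of |measured - predicted| / |measured| (assumed MRE definition).

    Measured values must be non-zero, otherwise the ratio is undefined.
    """
    ratios = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Hypothetical TPI values, for illustration only (not project data)
    measured = [2.0, 4.0, 6.0]
    predicted = [2.0, 5.0, 6.0]
    print(r_squared(measured, predicted))
    print(mean_relative_error(measured, predicted))
```

Under this definition, MRE is sensitive to measured values close to zero, which is one way unscreened data could inflate the metric.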