CatBoost for big data: an interdisciplinary review

JT Hancock, TM Khoshgoftaar - Journal of Big Data, 2020 - Springer
Gradient Boosted Decision Trees (GBDT's) are a powerful tool for classification and
regression tasks in Big Data. Researchers should be familiar with the strengths and …

Machine learning for spatial analyses in urban areas: a scoping review

Y Casali, NY Aydin, T Comes - Sustainable Cities and Society, 2022 - Elsevier
The challenges for sustainable cities to protect the environment, ensure economic growth,
and maintain social justice have been widely recognized. Along with the digitization …

Using explainable machine learning to understand how urban form shapes sustainable mobility

F Wagner, N Milojevic-Dupont, L Franken… - … Research Part D …, 2022 - Elsevier
Municipalities are increasingly acknowledging the importance of urban form interventions
that can reduce intra-city car travel in achieving more sustainable cities. Current academic …

Real-time hard-rock tunnel prediction model for rock mass classification using CatBoost integrated with Sequential Model-Based Optimization

Y Bo, Q Liu, X Huang, Y Pan - Tunnelling and underground space …, 2022 - Elsevier
In-time perception of changing geological conditions is crucial for safe and efficient TBM
tunneling. Precisely detecting or predicting the rock mass qualities ahead of the tunnel face …

Integration of dockless bike-sharing and metro: Prediction and explanation at origin-destination level

C Fu, Z Huang, B Scheuer, J Lin, Y Zhang - Sustainable Cities and Society, 2023 - Elsevier
Dockless bike-sharing is an effective solution for the metro's first- and last-mile connections.
To create a more bicycle-friendly environment, there is a need to accurately predict the use …

ConvGCN-RF: A hybrid learning model for commuting flow prediction considering geographical semantics and neighborhood effects

G Yin, Z Huang, Y Bao, H Wang, L Li, X Ma, Y Zhang - GeoInformatica, 2023 - Springer
Commuting flow prediction is a crucial issue for transport optimization and urban planning.
However, the two existing types of solutions have inherent flaws. One is traditional models …

A global feature-rich network dataset of cities and dashboard for comprehensive urban analyses

W Yap, F Biljecki - Scientific Data, 2023 - nature.com
Urban network analytics has become an essential tool for understanding and modeling the
intricate complexity of cities. We introduce the Urbanity data repository to nurture this …

Urbanity: automated modelling and analysis of multidimensional networks in cities

W Yap, R Stouffs, F Biljecki - npj Urban Sustainability, 2023 - nature.com
Urban networks play a vital role in connecting multiple urban components and developing
our understanding of cities and urban systems. Despite the significant progress we have …

Performance of CatBoost and XGBoost in Medicare fraud detection

J Hancock, TM Khoshgoftaar - 2020 19th IEEE international …, 2020 - ieeexplore.ieee.org
Due to the size of the data involved, performance is an important consideration in the task of
detecting fraudulent Medicare insurance claims. We evaluate CatBoost and XGBoost on the …

Fintech for the poor: Financial intermediation without discrimination

P Tantri - Review of Finance, 2021 - academic.oup.com
I ask whether machine learning (ML) algorithms improve the efficiency in lending without
compromising on equity in a credit environment where soft information dominates. I obtain …