REAL ESTATE PRICE PREDICTION BASED ON DATA FROM ADVERTISEMENTS
Abstract
This paper presents a model for predicting real estate prices from data in advertisements. From advertisements collected from a listing website, the technical specifications of each property, its images and, where available, its geographic coordinates were extracted. The geographic coordinates were used to compute a location quality score. The images were used to train a neural network for detecting significant objects in images. Three datasets were formed for training the predictive models. The first contains only the technical specifications of the properties, the second adds the location score, and the third contains both the location score and the objects detected in the advertisement images. For each dataset, several regression models were trained to predict price, and their performance, expressed as R², was compared. The best performance was achieved by the GBT (Gradient Boosted Trees) model on the dataset with images and location score, with an R² value of 0.856.
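The comparison metric, R², is the standard coefficient of determination, 1 - SS_res / SS_tot. The following is a minimal sketch of how models trained on the three datasets might be ranked by it. The function name, the dataset labels, and all prices and predictions are hypothetical and chosen only for illustration; none of them come from the paper's data or results.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean price.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical prices and hypothetical predictions, one list per dataset.
# These numbers are illustrative assumptions, not results from the paper.
actual = [50.0, 80.0, 120.0, 200.0]
predictions = {
    "specs_only": [70.0, 70.0, 130.0, 180.0],
    "specs_location": [60.0, 75.0, 125.0, 190.0],
    "specs_location_images": [55.0, 80.0, 120.0, 195.0],
}

# Score every model on the same held-out prices and pick the best one.
scores = {name: r2_score(actual, pred) for name, pred in predictions.items()}
best = max(scores, key=scores.get)
```

An R² of 1 means the predictions match the prices exactly, while an R² of 0 means the model does no better than always predicting the mean price, so higher values indicate a better fit.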