Publications in Information Sciences and Technology
-
Jian Wu, Athar Sefid, Allen C. Ge, and C. Lee Giles. "A Supervised Learning Approach To Entity Matching Between Scholarly Big Datasets." In: Proceedings of the 9th International Conference on Knowledge Capture (K-CAP 2017), December 4-6, 2017, Austin, Texas, USA. [Accepted]
-
Hung-Hsuan Chen, Jian Wu, and C. Lee Giles. "Compiling Keyphrase Candidates for Scientific Literature Based on Wikipedia." In: (meta)-data quality workshop part of the 21st International Conference on Theory and Practice of Digital Libraries (MDQual 2017), September 18-21, 2017, Thessaloniki, Greece
-
Yufeng Ma, Tingting Jiang, Chandani Shrestha, Edward A. Fox, Jian Wu, and C. Lee Giles. "Scenarios for Advanced Services in an ETD Digital Library." In: ETD2017, the 20th international symposium on electronic theses and dissertations, Washington, DC
-
Jian Wu, Sagnik Ray Choudhury, Agnese Chiatti, Chen Liang, and C. Lee Giles. "HESDK: A Hybrid Approach to Extracting Scientific Domain Knowledge Entities." In: Proceedings of ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2017), Toronto, Canada.
-
Nir Nissim, Aviad Cohen, Jian Wu, Andrea Lanzi, Lior Rokach, Yuval Elovici, and Lee Giles. "Scholarly Digital Libraries as a Platform for Malware Distribution." In: Proceedings of the 2nd Singapore Cyber Security R&D Conference (SG-CRC 2017), Singapore.
-
Jian Wu, Chen Liang, Huaiyu Yang, and C. Lee Giles. "CiteSeerX data: semanticizing scholarly papers." In: Proceedings of the International Workshop on Semantic Big Data, (SBD 2016), San Francisco, CA, USA
-
Kyle Williams, Jian Wu, Zhaohui Wu, and C. Lee Giles. "Information Extraction for Scholarly Digital Libraries." In: Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries, (JCDL 2016), Newark, NJ, USA
-
Cornelia Caragea, Jian Wu, Sujatha Das G., and C. Lee Giles. "Document Type Classification in Online Digital Libraries." In: Proceedings of the 26th Innovative Applications of Artificial Intelligence Conference (IAAI 2016).
-
Jian Wu, Kyle Williams, Hung-Hsuan Chen, Madian Khabsa, Cornelia Caragea, Suppawong Tuarob, Alexander Ororbia, Douglas Jordan, Prasenjit Mitra, and C. Lee Giles. "CiteSeerX: AI in a Digital Library Search Engine." Artificial Intelligence Magazine (AI Magazine), 2015.
-
Jian Wu and C. Lee Giles. "Information Extraction for Scholarly Document Big Data." In: The 1st International Workshop on Capturing scientific knowledge, co-located with K-CAP 2015, (SciKnow 2015), Palisades, NY, 2015.
-
Jian Wu, Jason Killian, Huaiyu Yang, Kyle Williams, Sagnik Ray Choudhury, Suppawong Tuarob, Cornelia Caragea, and C. Lee Giles. "PDFMEF: A Multi-Entity Knowledge Extraction Framework for Scholarly Documents and Semantic Search." In: Proceedings of The 8th International Conference on Knowledge Capture (K-CAP 2015), Palisades, NY, USA. [Best Paper Nomination]
-
Alexander Ororbia, David Reitter, Jian Wu, and C. Lee Giles. "Online Learning of Deep Hybrid Architectures for Semi-Supervised Categorization." In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2015), Porto, Portugal.
-
Alexander G. Ororbia II, Jian Wu, and C. Lee Giles. "Big Scholarly Data in CiteSeerX: Information Extraction from the Web." In: The 2nd WWW Workshop on Big Scholarly Data: Towards the Web of Scholars (BigScholar 2015), Florence, Italy, 2015.
-
Alexander G. Ororbia II, Jian Wu, and C. Lee Giles. "CiteSeerX: Intelligent Information Extraction and Knowledge Creation from Web-Based Data." In: The 4th Workshop on Automated Knowledge Base Construction at NIPS 2014, (AKBC 2014), Montreal, Canada, 2014.
-
Jian Wu, Kyle Williams, Madian Khabsa and C. Lee Giles. "The Impact of User Corrections on a Crawl-Based Digital Library: A CiteSeerX Perspective." In: The 10th International Conference on Collaborative Computing: Networking, Applications, and Worksharing (CollaborateCom 2014), Miami, Florida, USA. [Invited Paper]
-
Kyle Williams, Jian Wu, and C. Lee Giles. "SimSeerX: A Similar Document Search Engine." In: The 14th ACM Symposium on Document Engineering (DocEng 2014), Fort Collins, CO, USA.
-
Jian Wu, Alexander Ororbia, Kyle Williams, Madian Khabsa, Zhaohui Wu and C. Lee Giles. "Utility-Based Control Feedback in a Digital Library Search Engine: Cases in CiteSeerX." In: The 9th USENIX International Workshop on Feedback Computing (Feedback 2014), Philadelphia, PA, USA, 2014.
-
Jian Wu, Kyle Williams, Hung-Hsuan Chen, Madian Khabsa, Cornelia Caragea, Alexander Ororbia, Douglas Jordan and C. Lee Giles. "CiteSeerX: AI in a Digital Library Search Engine". In: The 26th Annual Conference on Innovative Applications of Artificial Intelligence (IAAI 2014), Quebec City, Quebec, Canada. [Best Application Paper].
-
Zhaohui Wu, Jian Wu, Madian Khabsa, Kyle Williams, Hung-Hsuan Chen, Wenyi Huang, Suppawong Tuarob, Sagnik Ray Choudhury, Alexander Ororbia, Prasenjit Mitra, and C. Lee Giles. "Towards Building a Scholarly Big Data Platform: Challenges, Lessons, and Opportunities". In: Proceedings of the 14th ACM/IEEE-CS Joint Conference on Digital Libraries (DL 2014), London, UK.
-
Kyle Williams, Lichi Li, Madian Khabsa, Jian Wu, Patrick Shih, and C. Lee Giles. "A Web Service for Scholarly Big Data Information Extraction". In: The 21st IEEE International Conference on Web Services (ICWS 2014), Anchorage, AK, USA.
-
Kyle Williams, Jian Wu, Sagnik Choudhury, Madian Khabsa, and C. Lee Giles. "Scholarly Big Data Information Extraction and Integration in the CiteSeerX Digital Library." In: The 10th International Workshop on Information Integration on the Web (IIWeb 2014), Chicago, IL, USA, 2014.
-
Jian Wu, Pradeep Teregowda, Kyle Williams, Madian Khabsa, Douglas Jordan, Eric Treece, Zhaohui Wu and C. Lee Giles. "Migrating A Digital Library to A Private Cloud". In: Proceedings of the IEEE International Conference on Cloud Engineering 2014 (IC2E 2014), Boston, MA, USA. [Best Paper Nomination]
-
Cornelia Caragea, Jian Wu, Kyle Williams, Sujatha Das G., Madian Khabsa, Pradeep Teregowda and C. Lee Giles. "Automatic Identification of Research Articles from Crawled Documents." In: WSDM 2014 Workshop on Web-scale Classification: Classifying Big Data from the Web (WSCBD 2014), New York City, NY, USA, 2014.
-
Cornelia Caragea, Jian Wu, Alina Ciobanu, Kyle Williams, Juan Fernandez-Ramirez, Hung-Hsuan Chen, Zhaohui Wu and C. Lee Giles. "CiteSeerX: A Scholarly Big Dataset". In: Proceedings of the 36th European Conference on Information Retrieval (ECIR 2014), Amsterdam, Netherlands.
-
Jian Wu, Pradeep Teregowda, Eric Treece, Madian Khabsa, Douglas Jordan, Stephen Carman, Prasenjit Mitra and C. Lee Giles. "Scalability Bottlenecks of the CiteSeerX Digital Library Search Engine." In: The 10th International Workshop on Large-Scale and Distributed System for Information Retrieval co-located with WSDM 2013 (LSDS-IR 2013), Rome, Italy, 2013.
-
Jian Wu, Pradeep Teregowda, Madian Khabsa, Stephen Carman, Douglas Jordan, Jose San Pedro Wandelmer, Xin Lu, Prasenjit Mitra, and C. Lee Giles. "Web crawler middleware for search engine digital libraries: a case study for CiteSeerX." In: Proceedings of the 12th international workshop on Web information and data management, collocated with CIKM 2012 (WIDM 2012), Maui, HI, USA, 2012.
-
Jian Wu, Pradeep Teregowda, Juan Pablo Fernandez Ramirez, Prasenjit Mitra, Shuyi Zheng, and C. Lee Giles. "The evolution of a crawling strategy for an academic document search engine: whitelists and blacklists." In: Proceedings of the 3rd Annual ACM Web Science Conference (WebSci 2012), Evanston, IL, USA.
-
Sumit Bhatia, Cornelia Caragea, Hung-Hsuan Chen, Jian Wu, Pucktada Treeratpituk, Zhaohui Wu, Madian Khabsa, Prasenjit Mitra and C. Lee Giles. "Specialized Research Datasets in the CiteSeerX Digital Library." D-Lib Magazine, Vol. 18, Number 7/8, 2012.