dc.contributor.advisor: Yu, Philip S. [en_US]
dc.contributor.author: Kong, Xiangnan [en_US]
dc.date.accessioned: 2014-10-28T20:22:56Z
dc.date.available: 2016-10-29T09:30:07Z
dc.date.created: 2014-08 [en_US]
dc.date.issued: 2014-10-28
dc.date.submitted: 2014-08 [en_US]
dc.identifier.uri: http://hdl.handle.net/10027/19119
dc.description.abstract: Graphs are ubiquitous and have become increasingly important in modeling diverse kinds of objects. In many real-world applications, instances are represented not as feature vectors but as graphs with complex structures, e.g., chemical compounds, program flows, XML web documents and brain networks. One central issue in graph mining research is graph classification, which has a wide variety of real-world applications, e.g., drug activity prediction, toxicology testing and kinase inhibition. Real-world graph classification problems pose several major challenges: 1) Learning from graphs with multiple labels: For example, a chemical compound can inhibit the activities of multiple types of kinases, e.g., ATPase and MEK kinase; one drug molecule can have anti-cancer efficacy against multiple types of cancer. 2) Learning from a small number of labeled graphs: In many real-world applications, the labels of graph data are very expensive or difficult to obtain. Creating a large training dataset can be too expensive, time-consuming or even infeasible. For example, in molecular medicine, testing a drug's anti-cancer efficacy through pre-clinical studies and clinical trials requires time, effort and extensive resources, while copious amounts of unlabeled drugs or molecules are often available from various sources. 3) Learning from uncertain graphs: For example, in neuroimaging, the functional connectivities among different brain regions are highly uncertain. In such applications, each human brain can be represented as an uncertain graph rather than a certain graph. In this thesis, we explore four different settings of graph classification: the multi-label setting, the semi-supervised setting, the active learning setting, and the uncertain graph setting. In the multi-label setting, each graph object can be assigned multiple labels. In the semi-supervised and active learning settings, we explore two different ways to reduce the labeling costs in graph classification problems. In the uncertain graph setting, we explore how to incorporate the uncertainty information in the graph structure into graph classification. [en_US]
dc.language.iso: en [en_US]
dc.rights: [en_US]
dc.rights: Copyright 2014 Xiangnan Kong [en_US]
dc.subject: Graph Mining [en_US]
dc.subject: Data Mining [en_US]
dc.subject: Big Data [en_US]
dc.subject: Data Variety [en_US]
dc.subject: Subgraph Pattern [en_US]
dc.subject: Feature Selection [en_US]
dc.subject: Uncertain Data [en_US]
dc.subject: Drug Discovery [en_US]
dc.subject: Brain Network [en_US]
dc.title: Modeling Big Data Variety with Graph Mining Techniques [en_US]
thesis.degree.department: Computer Science [en_US]
thesis.degree.discipline: Computer Science [en_US]
thesis.degree.grantor: University of Illinois at Chicago [en_US]
thesis.degree.level: Doctoral [en_US]
thesis.degree.name: PhD, Doctor of Philosophy [en_US]
dc.type.genre: thesis [en_US]
dc.contributor.committeeMember: Liu, Bing [en_US]
dc.contributor.committeeMember: Lillis, John [en_US]
dc.contributor.committeeMember: Wang, Junhui [en_US]
dc.contributor.committeeMember: Ragin, Ann B. [en_US]
dc.type.material: text [en_US]

