Faculty Contact: Xifeng Yan
Abstract: This project seeks to learn node embeddings from large graphs in a scalable manner. Here, "learning a node embedding" means mapping each node to a vector representation that can later be used for tasks such as link prediction, community detection, and node label classification. More specifically, the project aims to improve upon previously introduced methods (see references) built on the elegant intuition that a short random walk can be treated as a sentence, with each node treated as a word. Both of these papers apply word2vec to learn vector representations of nodes. In our observations, although word2vec is a good starting point, it cannot capture node order and is therefore not applicable to directed graphs. We instead attack the same problem with another NLP method, language modeling, which can capture node ordering. Early results on small graphs look promising.
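To make the "random walks as sentences" intuition concrete, the following is a minimal sketch in plain Python. It is an illustration, not the project's implementation. The toy graph and the helper names `random_walks`, `bigram_counts`, and `window_pairs` are assumptions introduced here. Ordered bigram counts stand in for a very simple language model, and unordered window pairs stand in for the symmetric context window that word2vec-style training typically uses.

```python
import random
from collections import Counter, defaultdict


def random_walks(adj, walk_length, walks_per_node, seed=0):
    """Generate short random walks; each walk is a 'sentence' of node 'words'."""
    rng = random.Random(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj.get(walk[-1], [])
                if not neighbors:  # dead end: node has no outgoing edges
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def bigram_counts(walks):
    """Ordered (previous, next) counts: direction of each step is preserved."""
    counts = defaultdict(Counter)
    for walk in walks:
        for prev, nxt in zip(walk, walk[1:]):
            counts[prev][nxt] += 1
    return counts


def window_pairs(walks, window=1):
    """Unordered co-occurrence counts, as a symmetric context window sees them."""
    pairs = Counter()
    for walk in walks:
        for i, node in enumerate(walk):
            lo = max(0, i - window)
            hi = min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if i != j:
                    pairs[frozenset((node, walk[j]))] += 1
    return pairs


# Hypothetical toy directed graph (an assumption for illustration): a -> b -> c -> a
adj = {"a": ["b"], "b": ["c"], "c": ["a"]}

walks = random_walks(adj, walk_length=4, walks_per_node=2)
ordered = bigram_counts(walks)
unordered = window_pairs(walks)

# The ordered counts record a -> b but never b -> a,
# while the unordered pair {a, b} cannot tell the two directions apart.
print(ordered["a"]["b"], ordered["b"]["a"])
print(unordered[frozenset(("a", "b"))])
```

In this sketch the edge direction survives only in the ordered counts, which is the property the abstract attributes to language modeling on directed graphs.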
- Perozzi, B., Al-Rfou, R., & Skiena, S. (2014, August). DeepWalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 701-710). ACM.
- Leskovec, J., Huttenlocher, D., & Kleinberg, J. (2010, April). Predicting positive and negative links in online social networks. In Proceedings of the 19th International Conference on World Wide Web (pp. 641-650). ACM.
- Spring 2017: Furkan Kocayusufoglu