NS Seminar

Date and Location

Feb 23, 2018 - 3:00pm to 4:00pm
NS Lab, Bldg 434, room 122

Abstract

Learning Latent Representations of Nodes for Classifying in Heterogeneous Social Networks  (presented by Vania Wang, Geography)

Jacob, Y., Denoyer, L., & Gallinari, P. (2014, February). Learning latent representations of nodes for classifying in heterogeneous social networks. In Proceedings of the 7th ACM International Conference on Web Search and Data Mining (pp. 373–382). ACM.

Social networks are heterogeneous systems composed of different types of nodes (e.g., users, content, groups) and relations (e.g., social or similarity relations). While learning and inference on homogeneous networks have motivated a large amount of research, little work exists on heterogeneous networks, and methods previously developed for homogeneous networks face open and challenging issues there. We address the specific problem of node classification and tagging in heterogeneous social networks, where different types of nodes are considered, each type with its own label or tag set. We propose a new method for learning node representations in a latent space common to all the different node types. Inference is then performed in this latent space. In this framework, two nodes connected in the network will tend to share similar representations regardless of their types. This bypasses the limitations of methods based on direct extensions of homogeneous frameworks and exploits the dependencies and correlations between the different node types. The proposed method is tested on two representative datasets and compared to state-of-the-art methods and to baselines.
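The core idea of the shared latent space can be illustrated with a minimal NumPy sketch. This is a toy, not the paper's full algorithm (which also trains per-type classifiers in the latent space): here, connected nodes of different types (hypothetical users `u*` and content items `c*`) are simply pulled toward each other, so directly linked nodes end up with more similar representations than distant ones.

```python
import numpy as np

# Toy heterogeneous graph: user nodes ("u*") linked to content nodes ("c*").
edges = [("u1", "c1"), ("u1", "c2"), ("u2", "c2"), ("u2", "c3")]
nodes = sorted({n for e in edges for n in e})

rng = np.random.default_rng(0)
dim = 4
# One embedding per node, regardless of its type -- the shared latent space.
z = {n: rng.normal(size=dim) for n in nodes}

lr = 0.1
for _ in range(50):
    for a, b in edges:
        # Gradient step on ||z_a - z_b||^2: connected nodes are pulled
        # together even though they are of different types.
        diff = z[a] - z[b]
        z[a] -= lr * diff
        z[b] += lr * diff

# Directly linked nodes end up closer than distant ones.
d_linked = np.linalg.norm(z["u1"] - z["c1"])
d_unlinked = np.linalg.norm(z["u1"] - z["c3"])
print(d_linked < d_unlinked)
```

Once all node types live in one space, classification can be performed there with a single model instead of one per node type.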


struc2vec: Learning Node Representations from Structural Identity  (presented by Su Burtner, Geography)

Figueiredo, D. R., Ribeiro, L. F., & Saverese, P. H. (2017). struc2vec: Learning Node Representations from Structural Identity. arXiv preprint arXiv:1704.03165.

Structural identity is a concept of symmetry in which network nodes are identified according to the network structure and their relationship to other nodes. Structural identity has been studied in theory and practice over the past decades, but only recently has it been addressed with representation learning techniques. This work presents struc2vec, a novel and flexible framework for learning latent representations for the structural identity of nodes. struc2vec uses a hierarchy to measure node similarity at different scales, and constructs a multilayer graph to encode structural similarities and generate structural context for nodes. Numerical experiments indicate that state-of-the-art techniques for learning node representations fail to capture stronger notions of structural identity, while struc2vec exhibits superior performance on this task, as it overcomes limitations of prior approaches. As a consequence, struc2vec improves performance on classification tasks that depend more on structural identity.
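The first ingredient of struc2vec's hierarchy can be sketched in a few lines: compare the ordered degree sequences of two nodes' k-hop neighborhoods. In this toy graph (my own construction, not from the paper), the two hub nodes are in disconnected components yet look structurally alike; the paper's full method goes on to build a multilayer similarity graph and run random walks on it, which is omitted here.

```python
from collections import deque

# Two small components: a 3-star around "a" and a similar hub "x".
graph = {
    "a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"],
    "x": ["y", "z", "w"], "y": ["x"], "z": ["x"], "w": ["x", "v"], "v": ["w"],
}

def ring(graph, root, k):
    """Nodes at exactly distance k from root, via BFS."""
    dist = {root: 0}
    q = deque([root])
    while q:
        u = q.popleft()
        for nb in graph[u]:
            if nb not in dist:
                dist[nb] = dist[u] + 1
                q.append(nb)
    return [n for n, d in dist.items() if d == k]

def degree_seq(graph, nodes):
    """Ordered degree sequence -- struc2vec compares these across scales."""
    return sorted(len(graph[n]) for n in nodes)

# Hubs "a" and "x" match exactly at distance 0 and nearly at distance 1,
# despite having no path between them.
print(degree_seq(graph, ring(graph, "a", 0)))  # [3]
print(degree_seq(graph, ring(graph, "x", 0)))  # [3]
print(degree_seq(graph, ring(graph, "a", 1)))  # [1, 1, 1]
print(degree_seq(graph, ring(graph, "x", 1)))  # [1, 1, 2]
```

Because the comparison uses only degree sequences, not node identities or proximity, far-apart nodes with similar roles receive similar structural context.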


Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change (presented by Devin Cornell, Sociology)

Hamilton, W. L., Leskovec, J., & Jurafsky, D. (2016). Diachronic word embeddings reveal statistical laws of semantic change. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (pp. 1489–1501). http://doi.org/10.18653/v1/P16-1141

Understanding how words change their meanings over time is key to models of language and cultural evolution, but historical data on meaning is scarce, making theories hard to develop and test. Word embeddings show promise as a diachronic tool, but have not been carefully evaluated. We develop a robust methodology for quantifying semantic change by evaluating word embeddings (PPMI, SVD, word2vec) against known historical changes. We then use this methodology to reveal statistical laws of semantic evolution. Using six historical corpora spanning four languages and two centuries, we propose two quantitative laws of semantic change: (i) the law of conformity—the rate of semantic change scales with an inverse power-law of word frequency; (ii) the law of innovation—independent of frequency, words that are more polysemous have higher rates of semantic change.
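A key step in comparing embeddings across time periods is that separately trained embedding spaces differ by an arbitrary rotation, so they must be aligned before distances mean anything. Below is a hedged sketch of that step on synthetic data (the word list is illustrative; "gay" is a well-known example of semantic change): align the two spaces with an orthogonal Procrustes rotation, then score each word's change as the cosine distance between its aligned vectors.

```python
import numpy as np

rng = np.random.default_rng(1)
words = ["gay", "broadcast", "cell", "the", "of"]
X = rng.normal(size=(5, 3))                    # period-1 embeddings (rows = words)
R = np.linalg.qr(rng.normal(size=(3, 3)))[0]   # arbitrary rotation between training runs
Y = X @ R                                      # period-2 embeddings: same meanings, rotated
Y[0] *= -1.0                                   # synthetic semantic shift for word 0

# Orthogonal Procrustes: the rotation Q minimizing ||XQ - Y||_F is UV^T,
# where U, V come from the SVD of X^T Y.
U, _, Vt = np.linalg.svd(X.T @ Y)
Q = U @ Vt
X_aligned = X @ Q

def cosine_dist(u, v):
    return 1.0 - (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Per-word change score: cosine distance between the aligned vectors.
change = [cosine_dist(X_aligned[i], Y[i]) for i in range(len(words))]
print(words[int(np.argmax(change))])
```

The rotation is fit on all words jointly, so the stable words anchor the alignment and the word whose vector actually moved stands out with the largest change score; frequency and polysemy statistics can then be regressed against these scores to test the two proposed laws.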