About

I am a first year PhD student at the University of Pennsylvania, where I have the fortune of being advised by Erik Waingarten. I graduated from UC Santa Cruz with a Master's and Bachelor's in Computer Science and a Bachelor's in Mathematics, where I was advised by C. Seshadhri with whom I researched the properties of low-dimensional graph embeddings.

My research interests lie mostly in theoretical computer science.

Education

PhD in Computer and Information Science,
University of Pennsylvania, 2023 - Present

  • Advisor: Erik Waingarten

M.S. in Computer Science and Engineering,
University of California, Santa Cruz, 2021 - 2022

  • M.S. Advisor: C. Seshadhri

B.S. in Computer Science & B.S. in Mathematics,
University of California, Santa Cruz, 2017 - 2021

  • Senior Thesis (Computer Science): Knowledge Graph Embeddings in Probablistic Soft Logic
  • Senior Paper (Mathematics): Graph Coloring and Ramsey Theory

Publications

Link Prediction using Low-dimensional node embeddings: the measurement problem
Nicolas Menand and C. Seshadhri
PNAS 2024
PDF Code
Graph representation learning is a fundamental technique for machine learning (ML) on complex networks. Given an input network, these methods represent the vertices by low-dimensional real-valued vectors. These vectors can be used for a multitude of downstream ML tasks. We study one of the most important such task, link prediction. Much of the recent literature on graph representation learning has shown remarkable success in link prediction. On closer investigation, we observe that the performance is measured by the AUC (area under the curve), which suffers biases. Since the ground truth in link prediction is sparse, we design a vertex-centric measure of performance, called the VCMPR@k plots. Under this measure, we show that link predictors using graph representations show poor scores. Despite having extremely high AUC scores, the predictors miss much of the ground truth We identify a mathematical connection between this performance, the sparsity of the ground truth, and the low-dimensional geometry of the node embeddings. Under a formal theoretical framework, we prove that low-dimensional vectors cannot capture sparse ground truth using dot product similarities (the standard practice in the literature). Our results call into question existing results on link prediction and pose a significant scientific challenge for graph representation learning. The VCMPR plots identify specific scientific challenges for link prediction using low-dimensional node embeddings.

Teaching

Other

Honors & Awards
  • Highest Honors in the Major (Computer Science), UC Santa Cruz, 2021
  • Highest Honors in the Major (Mathematics), UC Santa Cruz, 2021
  • University Honors, UC Santa Cruz, 2021