About
Me
Hi! My name is Musashi (sometimes Jacobs-) Harukawa. I’m a postdoctoral researcher at the Data-Driven Social Science Initiative at Princeton University. I research computational/quantitative methods, especially those concerning unstructured data. Recently, I have been interested in applications of deep learning tools in computational social science, and I have worked on multimodal classifiers for political advertisements and on using model interpretability methods to characterize campaign style.
Previously, I was a pre-doctoral researcher at University College London working with Lucy Barnes on the UKRI-funded MENMOPE project, and did my DPhil in Politics at the University of Oxford supervised by Andy Eggers and Ray Duch.
Before my graduate studies, I worked as a Data Scientist at a finance/IT conglomerate in Tokyo, and as a teacher in Moscow. A long time ago, I grew up in NYC.
- CV
- teaching website
- email:
mjacobsharukawa[at]princeton[fullstop]edu
This Blog
While computational methods have become essential for political (and other social) science research, our research environment and support infrastructure have not caught up. Tasks that are usually handled by different teams in an IT company are all down to the researcher; you have to become your own sys admin, front end developer, back end developer, database administrator, and data analyst.
The infeasibility of this task leads to lax practices, from poor data storage/management techniques (Dropbox full of zipped CSVs, anyone?) to idiosyncratic development environments. All of this hampers reproducibility and wastes time.
This blog collects my notes and tutorials on the tools available to computational social science researchers, at all stages of development. I aim to write on topics from web development to causal inference to machine learning to academic writing, using only free/open-source software. The end goal: to save you time and help standardize reproducible research.
I write on topics close to my own research and teaching, such as computational linguistics, machine learning, causal inference, statistics, and quantitative social science methodology.