muhark.github.io

I am a Research Scientist in the Emergent Artificial Intelligence Lab at Intel Labs.

My research focuses on characterizing latent social biases in large generative models and on techniques for correcting or steering these biases.

In practice, my work spans the AI research and engineering stack, from optimizing training for large multimodal models on hundreds of accelerators, to writing custom models and pipelines, to designing experiments and evaluation metrics.

Prior to Intel, I was a postdoc at Princeton University working with Professor Brandon Stewart, and I did my PhD at the University of Oxford with Professors Andy Eggers and Raymond Duch.

News

Feb 2025: New dataset on HuggingFace: IssueBench
Dec 2024: Scholar Award for top Intel Labs Academic Author
Dec 2024: Spotlight Paper at the Creativity and AI Workshop at NeurIPS 2024
Oct 2024: 3 papers accepted to NeurIPS 2024 Workshops
Sep 2024: 2 papers accepted to EMNLP 2024
Aug 2024: ACL Outstanding Paper Award!
Mar 2024: New model on HuggingFace: intel/llava-gemma-2b
Feb 2024: Started at Intel Labs as AI Research Scientist

Publications

For a full list of papers, see here.

Highlights

Using Imperfect Surrogates for Downstream Inference: Design-based Supervised Learning for Social Science Applications of Large Language Models

Naoki Egami, Musashi Hinck, Hanying Wei and Brandon Stewart.

NeurIPS 2023 (Main) 📄 Paper

Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models

Paul Röttger, Valentin Hofmann, Valentina Pyatkin, Musashi Hinck, Hannah Rose Kirk, Hinrich Schütze, Dirk Hovy.

ACL 2024 (Main) ⭐Outstanding Paper Award⭐ 📄 Paper

Why do LLaVA Vision-Language Models Reply to Images in English?

Musashi Hinck*, Carolin Holtermann, Matthew Lyle Olson, Florian Schneider, Sungduk Yu, Anahita Bhiwandiwalla, Anne Lauscher, Shaoyen Tseng, Vasudev Lal.

EMNLP 2024 (Findings) 📄 Paper

AutoPersuade: A Framework for Evaluating and Explaining Persuasive Arguments

Till Raphael Saenger, Musashi Hinck, Justin Grimmer, and Brandon M Stewart.

EMNLP 2024 (Main) 📄 Paper

Dr Musashi Hinck

Preprints

IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance

ClimDetect: A Benchmark Dataset for Climate Change Detection and Attribution

Debias your Large Multi-Modal Model at Test-Time with Non-Contrastive Visual Attribute Steering

Semantic Specialization in MoE Appears with Scale: A Study of DeepSeek-R1 Expert Specialization