The MIT AI Risk Initiative is seeking support to build LLM-augmented pipelines that accelerate evidence synthesis and systematic reviews of AI risks and mitigations. The initial contract is six months, part-time, with the possibility of extension.

The immediate use case is building out modules to support our review of global organizations’ AI risk responses: identifying public documents, screening them for relevance, extracting claims about AI risks and mitigations, and classifying outputs against several taxonomies.

The bigger picture includes generalizing and adapting this pipeline to support living updates & extensions for our risk repository, incident tracker, mitigations review, and governance mapping work.

By contributing your skills to the MIT AI Risk Initiative, you’ll help us provide the authoritative data and frameworks that enable decision-makers across the AI ecosystem to understand & address AI risks.

What you’ll do:

Phase 1: Org review pipeline (Jan–Mar)

  • Build/improve modules for document identification, screening, extraction, and classification
  • Build/improve human validation / holdout sampling processes and interfaces so we can measure performance against humans at each step
  • Integrate modules into an end-to-end evidence synthesis pipeline
  • Ship something that helps us complete the org review by ~March

Phase 2: Generalization & learning (Mar onwards)

  • Refactor for reuse across different AI Risk Initiative projects (incidents, mitigations, governance mapping)
  • Implement adaptive example retrieval
  • Build change tracking: when prompts or criteria change, what shifts in outputs?
  • Help us understand where LLM judgments can exceed human performance and thus be fully automated, and what still needs human review (and design interfaces / processes to enable this)
  • Document architecture and findings for handoff

Required skills

  • Strong software engineering fundamentals
  • Hands-on experience building LLM pipelines
  • Python proficiency
  • Comfort working on ambiguous problems where "what should we build?" is part of the work
  • Can communicate clearly with researchers who aren't software engineers

Nice to have

  • Prior work in research, systematic review, or annotation/labeling contexts
  • Experience with evaluation/QA/human validation
  • Familiarity with embeddings + vector search for example retrieval
  • API integrations (Airtable or similar) and ETL (Extract, Transform, Load) or scraping-adjacent work

Read more: https://futuretech.mit.edu/opportunities/ml-engineer---mit-ai-risk-initiative-contractor-part-time-6-months

Express interest:

https://mitfuturetech.atlassian.net/jira/core/form/a35da49a-3ed9-4722-8eda-2258b30bcc29

Please share with anyone relevant.
