Founding AI/ML Engineering Lead (LLM Specialist)

About Mongabay

Mongabay is a leading environmental news platform that reaches over 70 million people annually with trusted journalism about conservation, climate change, and environmental issues. Founded 25 years ago, we operate a global network of correspondents across 80+ countries delivering critical information to decision-makers worldwide.

About Story Transformer

Story Transformer is Mongabay’s groundbreaking initiative to democratize access to environmental information. Using generative AI, we’re building a system that automatically transforms our environmental journalism into multiple languages and accessible formats for vulnerable communities in the Global South. This is a rare opportunity to work on AI for social impact at scale—potentially reaching 18+ million people in frontline communities.

Human-AI Partnership: Story Transformer uses AI for speed and scale, but maintains human editorial oversight at critical points. The dual-model verification system is designed to make editors more efficient by automatically flagging potential issues, not to replace their expertise or judgment.

The Role

We’re seeking an experienced AI/ML Engineering Lead with deep expertise in Large Language Models to design and implement the foundational architecture and production-ready MVP for Story Transformer’s AI core. You’ll build the initial dual-model verification system, establish prompt engineering frameworks for environmental content, and create evaluation pipelines to ensure outputs are accurate and trustworthy. This lead IC role requires both technical sophistication and a pragmatic approach to deploying AI in real-world, high-stakes contexts, with a focus on delivering a working system in Phase 1 (6 months) that can scale in future phases.

What You’ll Build

- Dual-Model Verification System: Design and implement a Writer/Reviewer pipeline where two LLMs cross-check translations for accuracy, semantic consistency, and completeness across the initial 5 languages (English, Spanish, Indonesian, Portuguese, French). This system is designed to surface discrepancies and risk signals for human review, not to replace editorial judgment—the goal is to make human editors more efficient by flagging potential issues automatically.
- Domain-Specific Model Optimization: Fine-tune LLMs on Mongabay’s 60,000+ article archive to improve performance on environmental terminology and scientific concepts, establishing a methodology that can scale in Phase 2. For low-resource expansion languages, focus on establishing baseline performance metrics and progressive improvement strategies rather than immediate optimization.

- Non-Global Language Enhancement: Document requirements and develop scalable methods to progressively improve baseline performance for underserved languages like Arabic, Bengali, Malay, Malayalam, Marathi, Nepali, Swahili, Tagalog (Filipino), Tamil, and Vietnamese. This includes approaches for building contextual glossaries and identifying linguistic partnerships needed for Phase 2 expansion.
- Prompt Engineering Framework: Create comprehensive prompt libraries optimized for content transformation across languages, formats, and audience types.
- Evaluation & Quality Systems: Build automated evaluation frameworks to measure translation accuracy, cultural appropriateness, and semantic preservation.
- Multi-Modal Integration: Implement text-to-speech systems optimized for multiple languages, dialects, and low-bandwidth environments.
- Continuous Improvement Pipeline: Design feedback loops that capture user data and model performance to refine outputs over time.

Our AI Stack

Primary LLMs: AWS Bedrock (Claude, Llama, Titan, etc. – with flexibility to use multiple models)
Framework: Python, PyTorch or TensorFlow
Platform: Amazon Web Services (Bedrock, SageMaker)
Tools: LangChain, Hugging Face, custom evaluation frameworks
Data: 60,000+ articles in 6 languages, multilingual terminology databases

Required Skills & Experience

5+ years of ML/AI experience with at least 2 years focused on Large Language Models. This is a lead individual contributor role with architectural authority – prior experience designing and owning AI systems end-to-end is essential.
LLM expertise: Deep understanding of transformer architectures, fine-tuning approaches (LoRA, PEFT), and prompt engineering.
Multilingual NLP: Experience working with non-English languages, with realistic understanding of low-resource language challenges and progressive improvement approaches.
Python proficiency: Strong coding skills with ML frameworks (PyTorch, TensorFlow, or JAX)
API integration: Experience working with LLM APIs (OpenAI, Anthropic, Google Gemini/PaLM)
Evaluation design: Building metrics and frameworks to assess model quality and reliability
Production ML: Experience deploying models in production environments, not just research/notebooks
Technical leadership experience: You’ve architected and owned AI/ML systems from design through production deployment, ideally as a lead engineer or founding technical team member
Understanding of responsible AI practices and bias mitigation

Strongly Preferred

Experience with AWS (Bedrock, SageMaker) or other cloud ML platforms
Background in machine translation or cross-lingual NLP
Work with text-to-speech systems or audio generation
Experience fine-tuning models for specific domains (scientific, technical, medical)
Previous work on social impact or mission-driven AI projects
Academic background in NLP, computational linguistics, or related field, especially with experience in low-resource language challenges and incremental quality improvement strategies

Key Technical Challenges You’ll Solve

Accuracy Detection at Scale: How do we automatically surface potential accuracy issues so human editors can verify that environmental content is translated accurately without losing scientific nuance? The goal is efficient flagging, not autonomous verification.
Low-Resource Languages: How do we improve LLM performance for languages like Quechua or Nigerian Pidgin where training data is scarce? Success means establishing scalable methods for improvement, not achieving immediate parity with high-resource languages.
Cultural Adaptation: How do we ensure AI-generated content is culturally appropriate and resonates with local communities?
Format Transformation: How do we intelligently adapt long-form journalism into 90-second audio messages or simplified texts while preserving key information?
Trust & Reliability: How do we build systems that communities can trust with information that affects their livelihoods and ecosystems?

What Makes You a Great Fit

Experienced architect: You’ve previously designed and owned AI systems end-to-end, not just contributed to them as a team member
Mission alignment: You’re excited about using AI to address environmental challenges and serve underserved communities
Pragmatic mindset: You balance cutting-edge techniques with practical deployment constraints
Quality-focused: You understand that accuracy matters deeply when information affects real-world decisions, and you’re realistic about what AI can verify autonomously versus what requires human editorial judgment
Collaborative: You can explain complex AI concepts to non-technical stakeholders
Experimental: You’re comfortable trying multiple approaches and learning from failures
Ethical foundation: You think critically about AI risks, biases, and unintended consequences

What You’ll Get

Real-world impact: Your AI systems will help millions of people access critical environmental information
Technical ownership: Lead AI architecture decisions and shape the model strategy
Cutting-edge work: Apply state-of-the-art LLM techniques to novel, meaningful problems
Remote flexibility: Work from anywhere with flexible hours
Mission-driven team: Join people passionate about conservation, technology, and equity

Timeline & Expectations

Start date: April 15, 2026
Initial commitment: 1 year (with strong potential for extension)
Time commitment: Full-time (40 hours/week)
Reports to: Head of Product – Story Transformer
Collaboration: Close partnership with Full Stack Engineer, Product Manager/Designer, and native-language editors

6-Month Milestone (Phase 1)

By month 6, you will have built and deployed:

Dual-model verification system (Writer/Reviewer architecture) operational on AWS Bedrock
Multi-language transformation capability for 5 languages (English, Spanish, Indonesian, Portuguese, French)
Prompt engineering framework with language-specific optimization
Quality evaluation metrics and automated accuracy reporting
Processing of 300+ adapted outputs to validate model performance and reliability
Documentation and training materials for 6 native-language editors who will support expansion

To Apply

WE ARE NO LONGER ACCEPTING APPLICATIONS. On the application, you will be asked to provide basic contact information, pay rate expectations, and contact details for two references, and to upload the following documents:

Cover letter – please provide specific examples of your experience related to this position, as well as specific examples of your ability to work effectively from home and meet deadlines (1-2 pages)
Resume (1-2 pages)
Two references

Applications must be submitted in English. Please note that you do not need to provide letters of recommendation and we will not reach out to references before getting your permission to do so.

Benefits, Holidays, and Paid Time Off

It is important for all staff to step away from work to renew themselves. That is why Mongabay provides generous paid time off (PTO) for professional development, holidays, sick time, rest and relaxation, and vacation, along with a flexible work schedule.

We observe 11 holidays and support staff who celebrate other personal and religious holidays.

Equal Employment Opportunity

Mongabay is an equal employment opportunity (EEO) employer and provides EEO to all employees and applicants for employment without regard to race, color, religion, sex, sexual orientation, national origin, age, disability, genetics, or any other status prohibited by law.

* The above statements are intended to describe the general nature and level of work assigned to this position. They are not intended to be an exhaustive list of all duties, responsibilities, and qualifications. Management reserves the right to change or modify such duties as required. All staff are responsible for following applicable Mongabay policies and procedures as defined by their manager and Employment Handbook.