Application-Specific Language Model (ASLM) Developer (Remote) (downtown / civic / van ness)
compensation: Flat fee per model (project-based)
### Overview
We’re looking for an Application-Specific Language Model (ASLM) Developer to help us design, train, and ship language-model-powered agents tailored to specific business workflows.
You’ll work with our team to turn a defined workflow into a reliable model/agent that performs consistently, with clear evaluation metrics and iteration cycles.
### What You’ll Do
- Define the target workflow and success criteria (inputs/outputs, constraints, edge cases)
- Build a training/evaluation dataset from provided examples (and propose what’s missing)
- Design the approach: prompt system, RAG, fine-tuning (if needed), tools/function calling, guardrails
- Implement an evaluation harness (offline tests + regression suite); a minimal sketch appears after this list
- Iterate to improve reliability, reduce hallucinations, and increase consistency
- Deliver a packaged “model + instructions + tests” bundle we can run and maintain
- Document the system clearly (setup, prompts, data format, metrics, retraining steps)
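To make the "evaluation harness" item concrete, here is a minimal sketch of the kind of offline regression suite we have in mind. Everything in it is hypothetical and for illustration only: the test cases, the `model_fn` stub (which stands in for a real model/agent call), and the pass-rate gate are assumptions, not a prescribed implementation.

```python
# Minimal evaluation-harness sketch (all names hypothetical).
# `model_fn` stands in for the model/agent under test; it is stubbed
# here so the script runs offline with no API access.
import json

# Offline test set: each case pairs an input with the exact JSON we expect.
# In a real project these would come from the provided workflow examples.
TEST_CASES = [
    {"input": "Order #123: 2x widgets", "expected": {"order_id": "123", "qty": 2}},
    {"input": "Order #456: 5x gadgets", "expected": {"order_id": "456", "qty": 5}},
]

def model_fn(text: str) -> str:
    """Stub standing in for an LLM call; returns a JSON string."""
    order_id = text.split("#")[1].split(":")[0]
    qty = int(text.split(":")[1].strip().split("x")[0])
    return json.dumps({"order_id": order_id, "qty": qty})

def run_suite(model_fn) -> float:
    """Score the model on every case; return the pass rate."""
    passed = 0
    for case in TEST_CASES:
        try:
            output = json.loads(model_fn(case["input"]))
        except json.JSONDecodeError:
            output = None  # malformed JSON counts as a failure
        if output == case["expected"]:
            passed += 1
    return passed / len(TEST_CASES)

if __name__ == "__main__":
    rate = run_suite(model_fn)
    print(f"pass rate: {rate:.0%}")
    # Regression gate: fail the run if the score drops below the baseline.
    assert rate >= 1.0, "regression: pass rate fell below baseline"
```

In practice the exact-match check would be swapped for whatever scoring rubric the workflow calls for, but the shape stays the same: fixed test set, automated scoring, and a gate that catches regressions between iterations.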
### Qualifications (What We’re Looking For)
- Strong experience building with modern LLMs (OpenAI/Anthropic/etc.), including:
  - prompt design, structured outputs (JSON), function/tool calling
  - RAG (embeddings, chunking, retrieval evaluation)
  - fine-tuning / adapters (LoRA) or other model customization (nice to have)
- Practical experience shipping agentic workflows (automation, code assistants, ops bots)
- Solid software engineering fundamentals (Python/TypeScript preferred)
- Experience with evaluation:
  - test sets, scoring rubrics, regression testing, error analysis
- Comfort working with real business constraints:
  - reliability, traceability, security, cost controls
- Bonus:
  - Playwright/Selenium/UI automation experience
  - GitHub PR tooling, static analysis, CI integration
  - CRM workflows (HubSpot) / sales ops automation
### Deliverables (Per Model)
Each ASLM project typically includes:
- A clear spec: scope, inputs/outputs, non-goals
- Prompt/system design + tooling interface
- Training/evaluation dataset (or dataset generation pipeline)
- Automated evaluation suite + benchmark results
- Implementation (repo/module) + deployment notes
- Documentation for handoff and future iteration
### Compensation
- Flat fee per model (agreed upfront per scope)
- Potential for ongoing work across multiple models if the collaboration goes well
### How to Apply
Please email or message us with:
1. A short intro + your location/time zone
2. 2–3 relevant projects (links/screenshots/brief write-ups)
3. Your preferred stack (Python/TS, frameworks, eval tooling)
4. How you approach “reliability + eval” for LLM systems
5. If possible, a sample: a short write-up of how you’d build an ASLM for one of these:
- UI automation agent
- Code review bot
- Sales bot