$30/hour Posted: 2 days ago
<h3>Job Description</h3>
<p><strong>Software Engineer, AI: Code Evaluation & Training (Remote)</strong></p>
<p><strong>List of accepted countries and locations</strong></p>
<p>Help train large language models (LLMs) to write production-grade code across a wide range of programming languages:</p>
<ul><li><p><strong>Compare & rank multiple code snippets</strong>, explaining which is best and why.</p></li><li><p><strong>Repair & refactor AI-generated code</strong> for correctness, efficiency, and style.</p></li><li><p><strong>Inject feedback</strong> (ratings, edits, test results) into the RLHF pipeline and keep it running smoothly.</p></li></ul>
<p><strong>End result:</strong> the model learns to propose, critique, and improve code the way <em>you</em> do.</p>
<p><strong>RLHF in one line</strong><br />Generate code ➜ expert engineers rank, edit, and justify ➜ convert that feedback into reward signals ➜ reinforcement learning tunes the model toward code you’d actually ship.</p>
<h3>What You’ll Need</h3>
<ul><li><p><strong>3+ years of professional software engineering experience</strong> in <strong>C#</strong><br />(Constraint programming experience is a bonus, but not required)</p></li><li><p><strong>Strong code-review instincts:</strong> you can spot logic errors, performance traps, and security issues quickly.</p></li><li><p><strong>Extreme attention to detail and excellent written communication skills.</strong><br />Much of this role involves explaining <em>why</em> one approach is better than another. This cannot be overstated.</p></li><li><p>You <strong>enjoy reading documentation and language specs</strong> and thrive in an asynchronous, low-oversight environment.</p></li></ul>
<h3>What You Don’t Need</h3>
<ul><li><p>No prior RLHF (Reinforcement Learning from Human Feedback) or AI training experience.</p></li><li><p>No deep machine learning knowledge. If you can review and critique code clearly, we’ll teach you the rest.</p></li></ul>
<h3>Tech Stack</h3>
<p>We are looking for engineers with a strong command of <strong>C#</strong>.</p>
<h3>Logistics</h3>
<ul><li><p><strong>Location:</strong> Fully remote; work from anywhere</p></li><li><p><strong>Compensation:</strong> $30/hr to $70/hr, depending on location and seniority</p></li><li><p><strong>Hours:</strong> Minimum 15 hrs/week, up to 40 hrs/week available</p></li><li><p><strong>Engagement:</strong> 1099 contract</p></li></ul>
<p>Straightforward impact, zero fluff. If this sounds like a fit, apply here!</p>