# AI Coding Assistants: Copilot, SWE-bench, and Agentic Tools

Status: public · Confidence: medium (0.83) · Basis: verified_sources

## TL;DR

AI coding assistants range from inline completion tools to agentic systems that modify entire codebases. Controlled studies measure their effect on developer productivity, and benchmarks such as SWE-bench measure whether models can resolve real software issues.

## Core Explanation

The field can be read through three lenses: controlled productivity studies, benchmarks that test issue-resolution ability, and agentic tools that operate across a local codebase.

## Source-Mapped Facts

- A Microsoft Research controlled experiment asked developers to implement an HTTP server in JavaScript as quickly as possible. The group with access to GitHub Copilot completed the task 55.8% faster than the control group. ([source](https://www.microsoft.com/en-us/research/publication/the-impact-of-ai-on-developer-productivity-evidence-from-github-copilot/))
- SWE-bench is a benchmark that evaluates whether language models can resolve real-world GitHub issues. It consists of 2,294 task instances drawn from issues and their corresponding pull requests across 12 popular Python repositories. ([source](https://arxiv.org/abs/2310.06770))
- Claude Code documentation describes Claude Code as an AI-powered coding assistant for building features, fixing bugs, and automating development tasks across a codebase. ([source](https://code.claude.com/docs/en/overview))
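
The two quantitative ideas in the facts above, "X% faster" and "resolved issue", can be made concrete with a small sketch. It assumes that "faster" means a relative reduction in completion time, and that a benchmark task counts as resolved only when the tests targeted by the fix now pass and the tests that passed beforehand still pass, which is how SWE-bench-style evaluation is usually described. The names `percent_faster`, `TaskResult`, `is_resolved`, and `resolved_rate` are hypothetical, and the numbers are illustrative rather than taken from the study or the benchmark.

```python
from dataclasses import dataclass


def percent_faster(control_minutes: float, treated_minutes: float) -> float:
    """Relative reduction in completion time, expressed as a percentage."""
    if control_minutes <= 0:
        raise ValueError("control_minutes must be positive")
    return 100.0 * (control_minutes - treated_minutes) / control_minutes


@dataclass
class TaskResult:
    """Hypothetical record of one benchmark task after a model's patch is applied."""
    task_id: str
    fail_to_pass: dict[str, bool]  # tests the fix should make pass -> did they pass?
    pass_to_pass: dict[str, bool]  # tests that passed before -> do they still pass?


def is_resolved(result: TaskResult) -> bool:
    # Resolved only if there is at least one target test, every target test
    # now passes, and no previously passing test was broken by the patch.
    return (
        bool(result.fail_to_pass)
        and all(result.fail_to_pass.values())
        and all(result.pass_to_pass.values())
    )


def resolved_rate(results: list[TaskResult]) -> float:
    """Percentage of tasks that count as resolved."""
    if not results:
        return 0.0
    return 100.0 * sum(is_resolved(r) for r in results) / len(results)


if __name__ == "__main__":
    # Illustrative numbers only: a 100-minute baseline versus 44.2 minutes.
    print(round(percent_faster(100.0, 44.2), 1))

    demo = [
        TaskResult("task-a", {"test_fix": True}, {"test_existing": True}),
        TaskResult("task-b", {"test_fix": False}, {"test_existing": True}),
    ]
    print(resolved_rate(demo))
```

Read this way, a "55.8% faster" result means the treated group needed a little under half the control group's time, and a resolved rate is strict: a patch that fixes the issue but breaks an existing test does not count.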

## Further Reading

- [The Impact of AI on Developer Productivity: Evidence from GitHub Copilot](https://www.microsoft.com/en-us/research/publication/the-impact-of-ai-on-developer-productivity-evidence-from-github-copilot/)
- [SWE-bench: Can Language Models Resolve Real-World GitHub Issues?](https://arxiv.org/abs/2310.06770)
- [Claude Code Overview](https://code.claude.com/docs/en/overview)