CodeCoT and Beyond: Learning to Program and Test like a Developer Paper • 2308.08784 • Published Aug 17, 2023 • 5
Lemur: Harmonizing Natural Language and Code for Language Agents Paper • 2310.06830 • Published Oct 10, 2023 • 33
CodePlan: Repository-level Coding using LLMs and Planning Paper • 2309.12499 • Published Sep 21, 2023 • 79
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines Paper • 2310.03714 • Published Oct 5, 2023 • 37
Prompt2Model: Generating Deployable Models from Natural Language Instructions Paper • 2308.12261 • Published Aug 23, 2023 • 1
AskIt: Unified Programming Interface for Programming with Large Language Models Paper • 2308.15645 • Published Aug 29, 2023 • 2
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning Paper • 2310.03731 • Published Oct 5, 2023 • 29
SCREWS: A Modular Framework for Reasoning with Revisions Paper • 2309.13075 • Published Sep 20, 2023 • 17
MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning Paper • 2309.05653 • Published Sep 11, 2023 • 10
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving Paper • 2309.17452 • Published Sep 29, 2023 • 3
Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification Paper • 2308.07921 • Published Aug 15, 2023 • 23
Octopus: Embodied Vision-Language Programmer from Environmental Feedback Paper • 2310.08588 • Published Oct 12, 2023 • 38
CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules Paper • 2310.08992 • Published Oct 13, 2023 • 12
Ranking LLM-Generated Loop Invariants for Program Verification Paper • 2310.09342 • Published Oct 13, 2023 • 3
LMDX: Language Model-based Document Information Extraction and Localization Paper • 2309.10952 • Published Sep 19, 2023 • 66
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion Paper • 2310.11248 • Published Oct 17, 2023 • 4
Pop Quiz! Do Pre-trained Code Models Possess Knowledge of Correct API Names? Paper • 2309.07804 • Published Sep 14, 2023 • 2
CAT-LM: Training Language Models on Aligned Code And Tests Paper • 2310.01602 • Published Oct 2, 2023 • 1
SWE-bench: Can Language Models Resolve Real-World GitHub Issues? Paper • 2310.06770 • Published Oct 10, 2023 • 9
Large Language Models (GPT) Struggle to Answer Multiple-Choice Questions about Code Paper • 2303.08033 • Published Mar 9, 2023 • 1
Evaluating Instruction-Tuned Large Language Models on Code Comprehension and Generation Paper • 2308.01240 • Published Aug 2, 2023 • 1
OctoPack: Instruction Tuning Code Large Language Models Paper • 2308.07124 • Published Aug 14, 2023 • 31
Can Programming Languages Boost Each Other via Instruction Tuning? Paper • 2308.16824 • Published Aug 31, 2023 • 11
Exploring Parameter-Efficient Fine-Tuning Techniques for Code Generation with Large Language Models Paper • 2308.10462 • Published Aug 21, 2023 • 2
LLaMA-Reviewer: Advancing Code Review Automation with Large Language Models through Parameter-Efficient Fine-Tuning Paper • 2308.11148 • Published Aug 22, 2023 • 2
ViperGPT: Visual Inference via Python Execution for Reasoning Paper • 2303.08128 • Published Mar 14, 2023 • 2
Visual Programming: Compositional visual reasoning without training Paper • 2211.11559 • Published Nov 18, 2022 • 1
LayoutNUWA: Revealing the Hidden Layout Expertise of Large Language Models Paper • 2309.09506 • Published Sep 18, 2023 • 14
Evaluating and Explaining Large Language Models for Code Using Syntactic Structures Paper • 2308.03873 • Published Aug 7, 2023 • 1
CodeFusion: A Pre-trained Diffusion Model for Code Generation Paper • 2310.17680 • Published Oct 26, 2023 • 73
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning Paper • 2207.01780 • Published Jul 5, 2022 • 1
Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation Paper • 2310.18628 • Published Oct 28, 2023 • 8
Safurai 001: New Qualitative Approach for Code LLM Evaluation Paper • 2309.11385 • Published Sep 20, 2023 • 2
Large Language Models for Software Engineering: A Systematic Literature Review Paper • 2308.10620 • Published Aug 21, 2023 • 1
Software Testing with Large Language Model: Survey, Landscape, and Vision Paper • 2307.07221 • Published Jul 14, 2023 • 1
ComputeGPT: A computational chat model for numerical problems Paper • 2305.06223 • Published May 8, 2023 • 1
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning Paper • 2309.10814 • Published Sep 19, 2023 • 3
Code Prompting: a Neural Symbolic Method for Complex Reasoning in Large Language Models Paper • 2305.18507 • Published May 29, 2023 • 1
Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks Paper • 2211.12588 • Published Nov 22, 2022 • 3
Structured Chain-of-Thought Prompting for Code Generation Paper • 2305.06599 • Published May 11, 2023 • 1
Pair Programming with Large Language Models for Sampling and Estimation of Copulas Paper • 2303.18116 • Published Mar 31, 2023 • 1
EcoAssistant: Using LLM Assistant More Affordably and Accurately Paper • 2310.03046 • Published Oct 3, 2023 • 6
CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets Paper • 2309.17428 • Published Sep 29, 2023 • 1
Bridging Code Semantic and LLMs: Semantic Chain-of-Thought Prompting for Code Generation Paper • 2310.10698 • Published Oct 16, 2023 • 2
Test-Case-Driven Programming Understanding in Large Language Models for Better Code Generation Paper • 2309.16120 • Published Sep 28, 2023 • 1
The Program Testing Ability of Large Language Models for Code Paper • 2310.05727 • Published Oct 9, 2023 • 2
ClarifyGPT: Empowering LLM-based Code Generation with Intention Clarification Paper • 2310.10996 • Published Oct 17, 2023 • 1
Towards an Understanding of Large Language Models in Software Engineering Tasks Paper • 2308.11396 • Published Aug 22, 2023 • 1
Leveraging Training Data in Few-Shot Prompting for Numerical Reasoning Paper • 2305.18170 • Published May 29, 2023 • 2
Impact of Large Language Models on Generating Software Specifications Paper • 2306.03324 • Published Jun 6, 2023 • 2
Transformer-based Vulnerability Detection in Code at EditTime: Zero-shot, Few-shot, or Fine-tuning? Paper • 2306.01754 • Published May 23, 2023 • 1
An Empirical Evaluation of Using Large Language Models for Automated Unit Test Generation Paper • 2302.06527 • Published Feb 13, 2023 • 1
Effective Test Generation Using Pre-trained Large Language Models and Mutation Testing Paper • 2308.16557 • Published Aug 31, 2023 • 1
A Static Evaluation of Code Completion by Large Language Models Paper • 2306.03203 • Published Jun 5, 2023 • 3
CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors Paper • 2305.05711 • Published May 9, 2023 • 2
Few-shot training LLMs for project-specific code-summarization Paper • 2207.04237 • Published Jul 9, 2022 • 1
Improving Few-Shot Prompts with Relevant Static Analysis Products Paper • 2304.06815 • Published Apr 13, 2023 • 1
Large Language Models are Few-Shot Summarizers: Multi-Intent Comment Generation via In-Context Learning Paper • 2304.11384 • Published Apr 22, 2023 • 1
Repository-Level Prompt Generation for Large Language Models of Code Paper • 2206.12839 • Published Jun 26, 2022 • 3
A Systematic Evaluation of Large Language Models of Code Paper • 2202.13169 • Published Feb 26, 2022 • 1
Private-Library-Oriented Code Generation with Large Language Models Paper • 2307.15370 • Published Jul 28, 2023 • 1
CodeTF: One-stop Transformer Library for State-of-the-art Code LLM Paper • 2306.00029 • Published May 31, 2023 • 2
Bias Assessment and Mitigation in LLM-based Code Generation Paper • 2309.14345 • Published Sep 3, 2023 • 1
A Simple, Yet Effective Approach to Finding Biases in Code Generation Paper • 2211.00609 • Published Oct 31, 2022 • 1
Execution-Based Evaluation for Open-Domain Code Generation Paper • 2212.10481 • Published Dec 20, 2022 • 1
CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis Paper • 2203.13474 • Published Mar 25, 2022 • 2
Improving Code Generation by Training with Natural Language Feedback Paper • 2303.16749 • Published Mar 28, 2023 • 1
Large Language Models of Code Fail at Completing Code with Potential Bugs Paper • 2306.03438 • Published Jun 6, 2023 • 2
How Effective Are Neural Networks for Fixing Security Vulnerabilities Paper • 2305.18607 • Published May 29, 2023 • 2
Copiloting the Copilots: Fusing Large Language Models with Completion Engines for Automated Program Repair Paper • 2309.00608 • Published Sep 1, 2023 • 2
Towards Generating Functionally Correct Code Edits from Natural Language Issue Descriptions Paper • 2304.03816 • Published Apr 7, 2023 • 1
Enhancing Automated Program Repair through Fine-tuning and Prompt Engineering Paper • 2304.07840 • Published Apr 16, 2023 • 1
Generating High-Precision Feedback for Programming Syntax Errors using Large Language Models Paper • 2302.04662 • Published Jan 24, 2023 • 1
FLAG: Finding Line Anomalies (in code) with Generative AI Paper • 2306.12643 • Published Jun 22, 2023 • 1
The potential of LLMs for coding with low-resource and domain-specific programming languages Paper • 2307.13018 • Published Jul 24, 2023 • 1
Knowledge Transfer from High-Resource to Low-Resource Programming Languages for Code LLMs Paper • 2308.09895 • Published Aug 19, 2023 • 1
ClassEval: A Manually-Crafted Benchmark for Evaluating LLMs on Class-level Code Generation Paper • 2308.01861 • Published Aug 3, 2023 • 1
ToolCoder: Teach Code Generation Models to use API search tools Paper • 2305.04032 • Published May 6, 2023 • 1
Out of the BLEU: how should we assess quality of the Code Generation models? Paper • 2208.03133 • Published Aug 5, 2022 • 2
MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning Paper • 2311.02303 • Published Nov 4, 2023 • 12
WizardCoder: Empowering Code Large Language Models with Evol-Instruct Paper • 2306.08568 • Published Jun 14, 2023 • 32
ChatCoder: Chat-based Refine Requirement Improves LLMs' Code Generation Paper • 2311.00272 • Published Nov 1, 2023 • 11
CodeApex: A Bilingual Programming Evaluation Benchmark for Large Language Models Paper • 2309.01940 • Published Sep 5, 2023 • 2
Prompt Engineering or Fine Tuning: An Empirical Assessment of Large Language Models in Automated Software Engineering Tasks Paper • 2310.10508 • Published Oct 11, 2023 • 1
A Critical Review of Large Language Model on Software Engineering: An Example from ChatGPT and Automated Program Repair Paper • 2310.08879 • Published Oct 13, 2023 • 1
Large Language Model-Aware In-Context Learning for Code Generation Paper • 2310.09748 • Published Oct 15, 2023 • 2
B-Coder: Value-Based Deep Reinforcement Learning for Program Synthesis Paper • 2310.03173 • Published Oct 4, 2023 • 1
SteloCoder: a Decoder-Only LLM for Multi-Language to Python Code Translation Paper • 2310.15539 • Published Oct 24, 2023 • 1
T5APR: Empowering Automated Program Repair across Languages through Checkpoint Ensemble Paper • 2309.15742 • Published Sep 27, 2023 • 1
InstructCoder: Empowering Language Models for Code Editing Paper • 2310.20329 • Published Oct 31, 2023 • 2
The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation Paper • 2305.06156 • Published May 9, 2023 • 2
Constructing Multilingual Code Search Dataset Using Neural Machine Translation Paper • 2306.15604 • Published Jun 27, 2023 • 1
On Learning Meaningful Code Changes via Neural Machine Translation Paper • 1901.09102 • Published Jan 25, 2019 • 1
Dynamically Relative Position Encoding-Based Transformer for Automatic Code Edit Paper • 2205.13522 • Published May 26, 2022 • 1
CoCoSoDa: Effective Contrastive Learning for Code Search Paper • 2204.03293 • Published Apr 7, 2022 • 1
ContraBERT: Enhancing Code Pre-trained Models via Contrastive Learning Paper • 2301.09072 • Published Jan 22, 2023 • 1
Model-Agnostic Syntactical Information for Pre-Trained Programming Language Models Paper • 2303.06233 • Published Mar 10, 2023 • 1
One Adapter for All Programming Languages? Adapter Tuning for Code Search and Summarization Paper • 2303.15822 • Published Mar 28, 2023 • 1
CodeAttack: Code-Based Adversarial Attacks for Pre-trained Programming Language Models Paper • 2206.00052 • Published May 31, 2022 • 1
Explainable AI for Pre-Trained Code Models: What Do They Learn? When They Do Not Work? Paper • 2211.12821 • Published Nov 23, 2022 • 2
Benchmarking Language Models for Code Syntax Understanding Paper • 2210.14473 • Published Oct 26, 2022 • 1
Are Code Pre-trained Models Powerful to Learn Code Syntax and Semantics? Paper • 2212.10017 • Published Dec 20, 2022 • 1
Diet Code Is Healthy: Simplifying Programs for Pre-trained Models of Code Paper • 2206.14390 • Published Jun 29, 2022 • 1
The EarlyBIRD Catches the Bug: On Exploiting Early Layers of Encoder Models for More Efficient Code Classification Paper • 2305.04940 • Published May 8, 2023 • 1
Method-Level Bug Severity Prediction using Source Code Metrics and LLMs Paper • 2309.03044 • Published Sep 6, 2023 • 1
WELL: Applying Bug Detectors to Bug Localization via Weakly Supervised Learning Paper • 2305.17384 • Published May 27, 2023 • 1
GAMMA: Revisiting Template-based Automated Program Repair via Mask Prediction Paper • 2309.09308 • Published Sep 17, 2023 • 1
Too Few Bug Reports? Exploring Data Augmentation for Improved Changeset-based Bug Localization Paper • 2305.16430 • Published May 25, 2023 • 1
CCT-Code: Cross-Consistency Training for Multilingual Clone Detection and Code Search Paper • 2305.11626 • Published May 19, 2023 • 1
Searching by Code: a New SearchBySnippet Dataset and SnippeR Retrieval Model for Searching by Code Snippets Paper • 2305.11625 • Published May 19, 2023 • 1
ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages Paper • 2212.06742 • Published Dec 13, 2022 • 3
Towards Understanding the Capability of Large Language Models on Code Clone Detection: A Survey Paper • 2308.01191 • Published Aug 2, 2023 • 1
Assessing the Use of AutoML for Data-Driven Software Engineering Paper • 2307.10774 • Published Jul 20, 2023 • 1
CAT-probing: A Metric-based Approach to Interpret How Pre-trained Models for Programming Language Attend Code Structure Paper • 2210.04633 • Published Oct 7, 2022 • 1
Towards Efficient Fine-tuning of Pre-trained Code Models: An Experimental Study and Beyond Paper • 2304.05216 • Published Apr 11, 2023 • 1
Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation Paper • 2310.02304 • Published Oct 3, 2023 • 1
Is AI the better programming partner? Human-Human Pair Programming vs. Human-AI pAIr Programming Paper • 2306.05153 • Published Jun 8, 2023 • 1
"Teach AI How to Code": Using Large Language Models as Teachable Agents for Programming Education Paper • 2309.14534 • Published Sep 25, 2023 • 2
When to Show a Suggestion? Integrating Human Feedback in AI-Assisted Programming Paper • 2306.04930 • Published Jun 8, 2023 • 3
A Large-Scale Survey on the Usability of AI Programming Assistants: Successes and Challenges Paper • 2303.17125 • Published Mar 30, 2023 • 1
LORD: Low Rank Decomposition Of Monolingual Code LLMs For One-Shot Compression Paper • 2309.14021 • Published Sep 25, 2023 • 1
Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding Paper • 2306.06094 • Published Jun 9, 2023 • 1
WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation Paper • 2312.14187 • Published Dec 20, 2023 • 49
Neural Rankers for Code Generation via Inter-Cluster Modeling Paper • 2311.03366 • Published Oct 16, 2023 • 1
LLM-Assisted Code Cleaning For Training Accurate Code Generators Paper • 2311.14904 • Published Nov 25, 2023 • 5
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator Paper • 2312.04474 • Published Dec 7, 2023 • 34
Unifying the Perspectives of NLP and Software Engineering: A Survey on Language Models for Code Paper • 2311.07989 • Published Nov 14, 2023 • 26
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution Paper • 2401.03065 • Published Jan 5, 2024 • 11
Is Model Attention Aligned with Human Attention? An Empirical Study on Large Language Models for Code Generation Paper • 2306.01220 • Published Jun 2, 2023 • 1
CodeFuse-13B: A Pretrained Multi-lingual Code Large Language Model Paper • 2310.06266 • Published Oct 10, 2023 • 2
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents Paper • 2401.00812 • Published Jan 1, 2024 • 11
Code as Policies: Language Model Programs for Embodied Control Paper • 2209.07753 • Published Sep 16, 2022 • 1
Program Merge Conflict Resolution via Neural Transformers Paper • 2109.00084 • Published Aug 31, 2021 • 1
AceCoder: Utilizing Existing Code to Enhance Code Generation Paper • 2303.17780 • Published Mar 31, 2023 • 1
SkCoder: A Sketch-based Approach for Automatic Code Generation Paper • 2302.06144 • Published Feb 13, 2023 • 1
What Makes Good In-context Demonstrations for Code Intelligence Tasks with LLMs? Paper • 2304.07575 • Published Apr 15, 2023 • 1
The Good, the Bad, and the Missing: Neural Code Generation for Machine Learning Tasks Paper • 2305.09082 • Published May 16, 2023 • 1
RestGPT: Connecting Large Language Models with Real-World RESTful APIs Paper • 2306.06624 • Published Jun 11, 2023 • 1
Leveraging Large Language Models to Improve REST API Testing Paper • 2312.00894 • Published Dec 1, 2023 • 2
Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering Paper • 2401.08500 • Published Jan 16, 2024 • 5
On the Usage of Continual Learning for Out-of-Distribution Generalization in Pre-trained Language Models of Code Paper • 2305.04106 • Published May 6, 2023 • 1
LEVER: Learning to Verify Language-to-Code Generation with Execution Paper • 2302.08468 • Published Feb 16, 2023 • 1
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback Paper • 2402.01391 • Published Feb 2, 2024 • 43
Improving Natural Language Capability of Code Large Language Model Paper • 2401.14242 • Published Jan 25, 2024 • 1
Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models Paper • 2401.00788 • Published Jan 1, 2024 • 23
Leveraging Large Language Models for Automated Proof Synthesis in Rust Paper • 2311.03739 • Published Nov 7, 2023 • 8
Guiding Language Models of Code with Global Context using Monitors Paper • 2306.10763 • Published Jun 19, 2023 • 7
CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay Paper • 2402.04858 • Published Feb 7, 2024 • 15
MARIO: MAth Reasoning with code Interpreter Output -- A Reproducible Pipeline Paper • 2401.08190 • Published Jan 16, 2024
Copilot Evaluation Harness: Evaluating LLM-Guided Software Programming Paper • 2402.14261 • Published Feb 22, 2024 • 10
PYInfer: Deep Learning Semantic Type Inference for Python Variables Paper • 2106.14316 • Published Jun 27, 2021
Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models Paper • 2404.02575 • Published Apr 3, 2024 • 50
Advancing LLM Reasoning Generalists with Preference Trees Paper • 2404.02078 • Published Apr 2, 2024 • 46
NExT: Teaching Large Language Models to Reason about Code Execution Paper • 2404.14662 • Published Apr 23, 2024 • 4
MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning Paper • 2405.07551 • Published May 13, 2024
Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective Paper • 2404.07549 • Published Apr 11, 2024
DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning Paper • 2402.09136 • Published Feb 14, 2024 • 1
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset Paper • 2402.10176 • Published Feb 15, 2024 • 38
Grounding Data Science Code Generation with Input-Output Specifications Paper • 2402.08073 • Published Feb 12, 2024
SemCoder: Training Code Language Models with Comprehensive Semantics Paper • 2406.01006 • Published Jun 3, 2024 • 1
AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology Paper • 2406.11912 • Published Jun 16, 2024 • 27
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging Paper • 2410.01215 • Published Oct 2, 2024 • 39
ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs Paper • 2506.18896 • Published Jun 23, 2025 • 29
Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model Paper • 2601.15892 • Published 2 days ago • 43