AI Code Fails the Test: ChatGPT Stumbles on New Problems
The Rise of ChatGPT: Boon or Bane for Human Jobs?
ChatGPT’s arrival in 2022 sparked debate about its potential to replace human workers. While some experts predict AI will take over tasks like coding, others believe it will enhance human capabilities. Tech leaders have even predicted a future with no human programmers, replaced entirely by AI. However, a new study suggests otherwise.
ChatGPT’s Code: A Mixed Bag of Success
A study published in IEEE Transactions on Software Engineering evaluated code generated by ChatGPT against human-written code, comparing the two on functionality, complexity, and security. The results varied widely. ChatGPT’s success rate at producing working code ranged from as low as 0.66% to as high as 89%, depending on the difficulty of the task and when the problem was published. This spread suggests that while ChatGPT can occasionally rival or even outperform humans, it also has serious shortcomings.
Yutian Tang, a researcher involved in the study, sees AI code generation as a potential productivity booster and automation tool for software development. However, he stresses the importance of understanding both the strengths and weaknesses of these AI models. A thorough analysis is needed, says Tang, to identify potential problems and refine techniques for AI-generated code.
ChatGPT Stumbles on Newer Coding Challenges
To understand ChatGPT’s limitations better, the researchers tested its ability to solve coding problems on the LeetCode platform. They used GPT-3.5 and focused on 728 problems across five languages (C, C++, Java, JavaScript, and Python).
ChatGPT performed well on problems added to LeetCode before 2021. It achieved an 89% success rate on easy problems, 71% on medium problems, and 40% on hard problems.
However, on problems added after 2021, ChatGPT’s performance dropped sharply. The success rate for easy problems fell to 52%, and for hard problems it fell to just 0.66%. This suggests that ChatGPT struggles with newer coding challenges. The likely cause is that its training data does not include these recent problems, which limits its ability to adapt.
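For readers curious how a breakdown like this can be tallied, here is a minimal Python sketch. It is not the researchers' code. The Attempt record, the success_rates function, the choice to count a problem from 2021 itself as "after", and the sample records are all assumptions made up for illustration, not data from the study.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    """One generated solution for one problem (hypothetical record layout)."""
    year_added: int   # year the problem was published on the platform
    difficulty: str   # "easy", "medium", or "hard"
    accepted: bool    # True if the generated code passed all tests


def success_rates(attempts, cutoff_year=2021):
    """Return {(period, difficulty): percent accepted}.

    Assumption: problems added before cutoff_year count as "before",
    everything else (including cutoff_year itself) counts as "after".
    """
    totals = defaultdict(int)
    passed = defaultdict(int)
    for attempt in attempts:
        period = "before" if attempt.year_added < cutoff_year else "after"
        key = (period, attempt.difficulty)
        totals[key] += 1
        if attempt.accepted:
            passed[key] += 1
    return {key: 100.0 * passed[key] / totals[key] for key in totals}


# Invented sample records, for illustration only (not the study's data).
sample = [
    Attempt(2019, "easy", True),
    Attempt(2019, "easy", True),
    Attempt(2020, "easy", True),
    Attempt(2020, "easy", False),
    Attempt(2022, "easy", True),
    Attempt(2022, "easy", False),
    Attempt(2023, "hard", False),
]

for (period, difficulty), rate in sorted(success_rates(sample).items()):
    print(f"{period:>6} {difficulty:<6} {rate:.1f}%")
```

Grouping by both publication period and difficulty is what makes the gap visible: a single overall average would hide the difference between older and newer problems.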
Why ChatGPT Struggles with Newer Code: A Training Gap
The researchers attribute ChatGPT’s inconsistent performance to its training data. They propose that ChatGPT does well on pre-2021 coding problems because its training data likely included a large number of these problems and their solutions. Problems added later were not part of that data. A human programmer can apply critical thinking to an unfamiliar task, but ChatGPT has difficulty adapting to problems it has not seen before. In simpler terms, ChatGPT can handle familiar problems but struggles with anything new.
The study emphasizes that while AI models like ChatGPT can be productivity boosters and automate repetitive tasks, they’re not programmer replacements. ChatGPT’s struggles with new problems highlight the need for continuous development and training to keep pace with the ever-changing world of software engineering.