Exploring AI's Power to Write Code

AI is one of the most discussed topics in recent years. Many believe that it will replace developers, making them unnecessary. I decided to check how true this statement is and conduct a study of the capabilities of AI in writing code.

This article is the first in a series. As part of the research, I set myself several tasks:

Check how high-quality and viable the code generated by AI is.
To find out whether AI can really replace developers and whether it is possible to create a working application without technical knowledge.
Compare the output of several LLMs (Anthropic Claude 3.5, OpenAI ChatGPT-4o, OpenAI ChatGPT o1-preview, Deepseek R1) and the ease of working with them.
Share your experience working with AI, using various prompting techniques to get high-quality search results.

Methodology

A Spring Boot project template will be prepared as a starting point.

Stack:

Java 17
Spring Boot with JPA
Liquibase, Hibernate, Lombok

You will need to implement an application in which CRUD operations are available:

Creating an Operation
Operation Update
Search all operations (Operation)
Search for an operation by ID (Operation)
Removing Operation You will also need to write unit and MVC tests for a small piece of logic.

Each AI model will be given the same initial prompt. Then I will ask for implementation of different parts of the functionality. I will not endlessly improve the code: if it does not work, I will ask for bug fixes, and if it works, I will note possible improvements.

Communication with the AI will be in English. When I first worked with Anthropic Claude, it did not understand Russian well, which significantly reduced the quality of its responses. Therefore, to ensure the same conditions, I will use English for all models.

So, the first test subject will be Anthropic Claude 3.5 - or simply Claude.

Claude: The Beginning of the Journey

All source code is available on GitHub . As I wrote earlier, I created a project template using Spring Initializr . This is what I will fill with code.

First, you need to create a prompt for Claude. It will set the tone for the conversation and affect the quality of the output.

What is important to indicate in the prompt?

The role of the model and the list of technologies it should work with.
The purpose of communication is what I want from AI.
Rules of interaction - for example, how he should answer and clarify questions.

Initial Prompt

You will be acting as a backend developer. 
You are have an expertise in the following technologies: 
Java 21+, Spring boot, Spring JPA, Hibernate, Lombok, Spring Web, REST API, SQL.
Your goal is to create a production-ready solution for the user and answer 
their questions. You should clarify questions to provide the best possible answer. 
If you have any questions, ask them first without providing a solution. 
Only after all questions have been clarified, you provide a solution for the user.

You should maintain a friendly and professional tone.

Here are some important rules of conduct:
 - If you're not sure how to respond, say: "Sorry, I didn't understand you. 
    Could you please rephrase your question, please?"
 - If you don't know the answer to a question, say: 
    "I am sorry, but I don't know that answer. 
    Can you please clarify your question for me?"

Here is the user question:
 I have already created an application using Spring Initializr with the following dependencies:
Lombok, Spring Web, Spring Data JPA, PostgreSQL Driver, Liquibase Migration. 

Write a Liquibase migration for the tables.
<table1>
Category table with columns: 
  - category_id (bigint), 
  - category_name (varchar), 
  - category_description (varchar)
<table1>
<table2>
Operation table with columns: 
  - operation_id (bigint), 
  - operation_public_id (varchar), 
  - operation_name (varchar), 
  - operation_amount(bigdecimal), 
  - operation_type (enum), 
  - category_id (FK to category table)
<table2>

Suggest how to store the operation_type. 
In the code, the operation type will be an Enum with the values (DEPOSIT, WITHDRAW).

Think about the answer first before responding. Think step-by-step.

Keywords that improve search results

“If you have any questions, ask them first without providing a solution.” This phrase significantly improves the model’s answers. Instead of immediately providing a solution, it will first ask clarifying questions. This allows you to add details that may have been missed.
“Think step-by-step.” The phrase forces the model to think through the answer better and reduces the number of errors and guesses.

In addition to the basic instructions, I also asked Claude to write a table migration using Liquibase. I specified which columns I needed and asked him to suggest a way to store the transaction types.

The prompt is ready, the first question has been asked. We send it to Claude and wait for an answer.

That’s why it’s important to give the right instructions. Claude didn’t give a solution right away, but clarified the details. He talked about what he needed to do and asked questions about the table structure (some details I deliberately didn’t specify). This is a good sign: now I can count on a correct migration.

Additionally, Claude suggested several options for storing operation_type and explained which one he prefers and why.

We answer Claude’s questions and wait for the final version of the migration.

1. Correct
2. Questions for clarification:
	1. Yes, category_id and operation_id should be auto-incremented
	2. For category_name, operation_public_id and operation_name - varchar(255), and for category_description (varchar 1000)
	3. Nullable fields: category_name
	4. 20 digits before decimal point and 2 after
	5. Yes

Regarding the operation_type, let's take the first approach.

Bugs, fixes and the first Pull Request

After clarifying the details, he gave me the migration code, but in XML format. Neither I nor he specified what format the result should be in, so I asked him to remake the migration using YAML.

Could you re-write the migration using YAML instead of XML?

As expected, not everything went smoothly. Claude did not add the migration to databaseChangelogMaster(and did not even create this file). Well, it happens to everyone. I also made a mistake in the prompt: Claude made category_namenullable, but it should have been category_description. I will fix this manually.

I commended Claude for asking clarifying questions before generating the migration, but it seems he could have asked for more information about the planned data manipulation, which would have allowed the indexes to be tuned in advance. However, he did not, and I, as part of the experiment, decided to “not know” about the indexes and ignore the issue.

Setting up a connection to the database

The migration is ready, but before we can run it, we need to set up a connection to the database. We ask Claude to help.

Write the configuration to connect the application to the PostgreSQL database.

Claude suggested setting up multiple profiles, which may be redundant at the start, but is useful for production-ready code. He also added a basic connection pool and gave recommendations for setting up a production environment.

But there are some problems: Claude incorrectly declared connection-pool in**application.yaml** , which prevents Hikari from working. The error is not critical, but it is not obvious either — Spring will simply issue a warning in the log at startup. Such bugs are the most unpleasant, because they do not lead to an obvious crash, but can affect the operation of the application later.

Conclusion: Always check the settings suggested by the AI. It does not guarantee the functionality of the code.

The correct option is:

spring:  
  datasource:  
    url: jdbc:postgresql://localhost:5432/anthropic_claude?currentSchema=anthropic_claude  
    username: anthropic_claude_app  
    password: strongPassword  
    driver-class-name: org.postgresql.Driver  
    # Connection pool properties (using HikariCP - Spring Boot default)  
    hikari:  
      minimum-idle: 5  
      maximum-pool-size: 20  
      idle-timeout: 300000   # 5 minutes  
      pool-name: HikariPool  
      max-lifetime: 1200000  # 20 minutes  
      connection-timeout: 20000 # 20 seconds

Create init.sql and raise the database in Docker

The next step is to create an init.sql for local development convenience. I asked Claude to create a file with access rights settings.

Write init.sql with the following information:
  - Create an "app" role with a login password of 'strongPassword'.
  - Create a schema "my_app_schema" and authorize the "app" role to use it.
  - Grant all privileges to the "app" role on the "my_app_schema"

Claude did the right thing by creating a limited role for the app and reminding me to update application.yaml. That’s nice.

Now we need to set up PostgreSQL in Docker. I decided not to waste time and wrote it docker-compose.yamlmanually:

version: '3.1'

services:
  anthropic-claude-db:
    container_name: anthropic-claude-postgres
    image: postgres:15
    restart: always
    environment:
      POSTGRES_USER: anthropic_claude_user
      POSTGRES_PASSWORD: strongUserPassword
      POSTGRES_DB: anthropic_claude
    volumes:
      - ./db-volume:/var/lib/postgresql
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    ports:
      - "5432:5432"

Let’s try to launch it.

The first serious mistake

As expected, Liquibase couldn’t find**db.changelog-master.yaml** ( Liquibase failed to start because no changelog could be found at 'classpath:/db/changelog/db.changelog-master.yaml). Why? Because Claude forgot to create it. I’m writing to him about it and asking him to fix it.

When I start the application, I get an error. 
Liquibase failed to start because no changelog could be found at
'classpath:/db/changelog/db.changelog-master.yaml'.

Claude added db.changelog-master.yamlthe file and even suggested a structure for storing migrations. However, now a new problem has arisen - a schema access error .

Claude starts to “get clever”

We are trying to solve this problem together with Claude.

I’ve added a changelog to the master file, but when I run the application, a new error occurs: permission denied for schema anthropic_clude

I have seen this behavior more than once: if an LLM encounters several errors in a row, it starts “fantasizing” and offering bad solutions. And each subsequent fix is worse than the previous one. This is a typical problem for all LLMs - they do not analyze the history of interactions globally, but simply try to find the nearest possible fix. Sometimes each subsequent fix is worse than the previous one. In such cases, it is better to stop, review the situation manually and set more specific instructions.

His suggested fix turned out to be incorrect. I didn’t bother him and just added the fix manually:

CREATE SCHEMA my_schema_app AUTHORIZATION app;

After that the application launched without any problems.

What conclusions can be drawn?

The more errors, the worse the output becomes. LLM does not analyze the entire history of interactions, but simply tries to guess the next fix.
Check your infrastructure settings. Errors may be subtle but critical.

Create entities

The application is launched successfully, tables are created, migrations are running. The next step is to create entities .

I also asked Claude to suggest a package structure to help me understand where to store code.

Write entities for operation and category tables.
Provide a path to the package where I should create the entities.

Errors in entity generation

The first thing I noticed was that Claude started to lose context :

Category The field disappeared description.
Suddenly Operation appeared operationDate.
He changed the column names , which caused the app to stop running.

I had to manually correct the column names. I will continue to add missing information to the context to avoid such errors.

Problem with equals() & hashCode()

Another mistake is that and are not redefined .**equals()hashCode()**

Claude uses Set for Operation, which without correctness equals()can lead to duplication of identical objects with different references.

I asked him to override equals()and hashCode()to check if it would add OneToMany/ objects to those methods ManyToOneand create a circular dependency (which could lead to OutOfMemory).

Why is it important to redefine **equals()**and **hashCode()**?

If you don’t do this, objects are compared by reference , which can lead to bugs.
Lombok annotations @EqualsAndHashCodealso @Dataimplicitly override these methods using all available properties of the class (for example, they can take into account related entities, which will lead to recursion).
The best option is to override manually.

Please redefine equals()&hashCode()

Override equals and hashCode for Category and Operation entites.

What Claude did well:

Correctly arranged**@Transactional** - divided readOnly- transactions and full ones.
Added DTO - did not return the entity outside.
I indicated the structure of the project , which means that part of the context has been preserved.
Suggested further steps.

What Claude did wrong:

The entity “leaked” into the controller - mapping to DTO occurs at the controller level (outside the transaction), which can cause**LazyInitializationException** .
- Now there will be no error (we only take idcategories), but this is even worse - it creates a false feeling that everything is working correctly .
Didn’t specify which fields should be in the DTO .
- Because of this, he pulled **categoryId**out , although this is not critical .
I made a mistake with**@Transactional** .
- I explicitly specified the DAO layer , but Claude hung it **@Transactional**on the service .
- This is a problem because business logic (service) can call third-party APIs - and in this case, all this will be performed inside a transaction , which is inefficient.

Can these errors be corrected later? Of course. But if you do it right the first time , it will save a lot of time in the future .

Why I Clear Context

When an LLM changes a large piece of code, he often edits even parts that shouldn’t be touched.

This leads to:

“Littering” the context
Corruption of logic even in those places where it does not need to be changed
Deterioration in the quality of subsequent responses

So it’s easier to clear the context and ask the task again . I’ll ask a new prompt and correct Claude’s output to avoid errors.

This approach shows the importance of breaking complex tasks into steps .
Context cleaning is a useful tool if the model starts to lose its logic.

Next I will try to create a CREATE operation step by step to minimize errors.

Trying CRUD. Part 2

Clear the context and set a prompt with history

Preparing a new prompt with interaction history:

I had to rewrite the prompt twice due to errors in the wording.

The first time Claude clarified the details, and the code turned out a little better , but here I will analyze the second version, since it was the one that was sent to work.

Resetting the context and updating the prompt helps the AI remember its initial settings .

This is one way to improve the accuracy of the answer, but it has a downside - this approach takes more time , since you have to collect the history of communication with the model (tag <history>in the prompt).

I recommend using tags**<history></history>** only in two cases:

The AI has reached a dead end and is producing incorrect solutions.
There were errors in the prompt and I want to get rid of the errors.

The advantage of this method is the ability to edit the communication history , as well as restore the original requirements for the interaction of the model with the user, which ultimately improves the final result.

What turned out better compared to the first attempt

Limiting the task allowed Claude to improve the quality of the code :

Now the service does not return the entity to the controller .
The code has become cleaner and the output has become better.
Added basic business processing .

But it’s not without its drawbacks - No @JsonProperty in DTO. The presence of this annotation makes it safe to change DTO fields in the code without affecting the API contract.

Testing the functionality

Sending a request:

{
    "name": "AVDS",
    "amount": 125.23,
    "type": "WITHDRAW",
    "categoryId": 1
}

We get the answer:

{
	"publicId":"31cf0666-38b6-4aa3-9c6d-2547fe15e237",
	 "name":"ABCSD",
	 "amount":125.23,
	 "type":"WITHDRAW",
	 "categoryId":1
}

Selecting a strategy for the remaining CRUD operations

I tried two approaches:

Create all CRUD operations at once.
Create each operation separately.

The second option turned out to be better quality , so READ, UPDATE, DELETE will be implemented according to it .

Implementing UPDATE logic for operations

Write an UPDATE logic for the Operation entity (Controller - Service - DAO). 
In the first iteration, skip validation of all fields.

OperationMapper.updateEntityFromDto - what is it anyway?

Claude added some strange construction :

operationMapper.updateEntityFromDto(operation, dto);

Sounds logical, but there are a few problems:

Moved logic that doesn’t apply to the mapper into the mapper.
I divided it into different layers : three parameters are set in the mapper, and one in the service.

This is a bad design because the mapper should just convert the DTO to an entity and back , not make changes to the object.

OperationNotFoundException, but… not quite

Claude added a custom exception:

public class OperationNotFoundException extends RuntimeException { ... }

At first glance, everything is correct , but there is one nuance - he doesn’t use it anywhere!

Where he was supposed to throw OperationNotFoundException, Claude throws a regular RuntimeException. In the end, a good idea, but not fully implemented.

Problem with handling NotFound errors

Further exceptions in the same style will appear (for example CategoryNotFoundException). As a result, each exception will be unique - this will complicate support. It is better to make one **NotFoundException**, and if necessary - inherit from it:

public class NotFoundException extends RuntimeException {
    public NotFoundException(String message) {
        super(message);
    }
}

And then:

public class OperationNotFoundException extends NotFoundException {
    public OperationNotFoundException() {
        super("Operation not found");
    }
}

This approach has two advantages:

Single**ExceptionHandler** - can be processed NotFoundException, not a bunch of separate classes.
Cleaner code - if a new type appears NotFound, it will not need to be added to the handler.

Despite these issues, Claude got the job done . The code works, and even provided a test JSON for testing.

Now we move on to searching for operations (READ).

Implement READ logic for operations

Asked Claude to create two endpoints:

Get all operations with pagination.
Search for operation by publicId.

Write a FIND logic for the Operation entity (Controller - Service - DAO). 
In the first iteration, skip validation of all fields. 
You need to add two endpoints, the first one that finds all operations with pagination, 
and the second one that finds an operation by publicId.

How did Claude accomplish the task?

Claude used Spring JPA pagination , which makes sense since we’re working with Hibernate.

The code turned out almost perfect :

**PageResponse**made generic - can be reused for any answers.
The context is preserved - Claude does not duplicate code already written.

Disadvantages of implementation

The problem is**sortBy** that Claude refers to a column in the DB , not to an entity field . In this form, sorting will not work , since JPA uses the name of the entity fields for sorting.

Spring Data JPA uses reflection to work with entities. When we pass a sort parameter sortBy, Spring Data JPA tries to find the corresponding field by name inside the entity class . JPA looks for a property (field) inside the Java class Operation, not a column in the database.

We correct it and move on to DELETE .

DELETE – logic for operations

The final part of CRUD operations is deletion. I ask Claude to write an endpoint that deletes the operation by publicId.

Write a DELETE logic for the Operation entity (Controller - Service - DAO). 
In the first iteration, skip validation of all fields.

Result

Claude handled it without any problems – everything is obvious here, there is nothing special to comment on.

PR to remove the operation
https://github.com/nzinovev/anthropic-claude/pull/8

Adding validation

CRUD for operation ready, but it is still quite primitive . In addition, we have no tests at all - and this is an important part of quality development. In order not to stretch the article, I will not cover all endpoints with tests and validation.

I’ll pick one - CREATE Operation- and ask Claude to add:

Request validation
Unit tests for the service layer
MVC test for controller

Let’s go!

Adding validation to the CREATE operation

I remind Claude what the logic associated with creating an operation looks like and ask him to add validation for the CREATE operation.

Result: pros and cons

Claude implemented proper error handling :

All errors are returned at once - the user does not need to send multiple requests.
Added**ExceptionHandler** - error handling is centralized.

But validation doesn’t work .

Problem: Missing dependencies

Claude did not add the required dependencies:

<dependency>  
   <groupId>org.springframework.boot</groupId>  
   <artifactId>spring-boot-starter-validation</artifactId>  
</dependency>
<!-- NotBlank, NotNull, etc. -->
<dependency>  
   <groupId>jakarta.validation</groupId>  
   <artifactId>jakarta.validation-api</artifactId>  
   <version>3.1.0</version>  
</dependency>

If the need jakarta.validationcan still be guessed by importing annotations into DTO, then it spring-boot-starter-validationis less obvious.

But I’m imitating a developer with minimal experience , and without documentation it’s hard for me to understand why validation doesn’t work.

Error with handling NotFound exceptions

As with the CRUD logic, Claude added a specific exception :

public class CategoryNotFoundException extends RuntimeException { ... }

This approach requires:

Create a separate exception for each entity.
Add its processing to **ExceptionHandler**.

It is better to use a single one**NotFoundException** and inherit from it (this has already been discussed earlier).

In addition, an extra annotation has been added :

@ResponseStatus(HttpStatus.NOT_FOUND)

This is redundant because the exception is already handled by the global ExceptionHandler.

Trying to fix it is a new problem

Asked Claude to correct the errors:

I've added annotations to OperationCreateRequest. 
However, when I send a request with incorrect data, 
the application does not respond with a validation error. 
The app ignores all annotations relating to validation and passes the request on.

The answer is given, but it is too redundant . The model started adding unnecessary code , including:

// src/main/java/com/yourcompany/config/WebConfig.java

package com.yourcompany.config;

import org.springframework.context.annotation.Configuration;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@Configuration
public class WebConfig implements WebMvcConfigurer {
    // Additional web configuration can be added here if needed
}

This class is not needed at all - validation works without it. This is a configuration for an MVC application, and we have a REST API.

The problem is familiar: with complex queries, LLM starts generating extra code .

It was possible to clear the context and recreate the prompt, but I decided to continue with the current version. Validation is ready , let’s move on to tests.

Unit tests for OperationService

Asked Claude to write positive and negative tests for OperationService

Duplication verify & when

In the test , and createOperation_Success are duplicated .**verifywhen**

Why is this bad?

when(...) already checks the method call.
verify(...)makes sense only if all calls in createOperation. are checked.

Which is better?

Or use verifyonly on voidmethods.
Or check all calls , not just some.

Problem with any(Operation.class)

The test does not check the contents of the object, only its type.

when(operationRepository.save(any(Operation.class))).thenReturn(operation);

This means that if createOperation the object inside **Operation**changes , the test will not notice it.

Example of error:

They forgot to specify the category in the code before saving.
The test still passes.

    @Transactional  
    public OperationResponse createOperation(OperationCreateRequest request) {  
        final var category = categoryRepository.findById(request.getCategoryId())  
                .orElseThrow(() -> new CategoryNotFoundException(  
                        String.format("Category with id %d not found", request.getCategoryId())));  
  
        final var operation = operationMapper.toEntity(request);  
//        operation.setCategory(category);  

        final var savedOperation = operationRepository.save(operation);  
        return operationMapper.toDto(savedOperation);  
    }

The test will not detect this error ! How to fix? Use ArgumentCaptorto intercept the object passed to save().

Corrected version of the test with ArgumentCaptor

@Captor  
ArgumentCaptor operationArgumentCaptor;

@Test  
void createOperation_Success() {  
    // Arrange  
    when(categoryRepository.findById(1L)).thenReturn(Optional.of(category));  
    when(operationMapper.toEntity(createRequest)).thenReturn(operation);  
    when(operationRepository.save(any(Operation.class))).thenReturn(operation);  
    when(operationMapper.toDto(operation)).thenReturn(operationResponse);  
  
    // Act  
    OperationResponse result = operationService.createOperation(createRequest);  
  
    // Assert  
    assertThat(result).isNotNull();  
    assertThat(result.getPublicId()).isEqualTo("test-public-id");  

    verify(operationRepository).save(operationArgumentCaptor.capture());  
    assertEquals(1, operationArgumentCaptor.getAllValues().size());  
  
    var savedOperation = operationArgumentCaptor.getValue();  
    assertEquals(category, savedOperation.getCategory());  
}

The test now checks the correctness of the object, not just its type.

Redundant data in setUp method

Let’s look at the same test createOperation_Success(). How createOperation()it works in code:

Searches for a category in the database
Muppet OperationCreateRequestinOperation
Sets the category to an objectOperation
Saves the objectOperaion

In the test, the object Operationis created once in setUp()– but already with the category set . Then this object is returned to the mock

when(operationMapper.toEntity(createRequest)).thenReturn(operation);

This implementation reduces the quality of the test: some logic createOperation()is not checked. In the future, this may lead to bugs that will go unnoticed.

Codestyle and “taste”

The class under test is named operationServiceinstead of sut(System Under Test), the second option immediately shows which service is being tested
Test data generation is moved to @BeforeEach, but not all data is needed in every test. It is better to move it to a private method and call it in the right places.

private OperationCreateRequest buildCreateRequest() {
    return new OperationCreateRequest("Тест", 100.0, "WITHDRAW", 1L);
}

This will make the tests cleaner and more understandable .

MVC tests

Unit tests are ready, now I asked Claude to write MVC tests for the controller .

What went wrong?

All tests pass except three :

findByPublicId_NotFound
deleteOperation_NotFound
updateOperation_NotFound

And here is a very interesting point, Claude expects the service to return the 500th error , but in fact it should respond with the 404th (and the name of the test hints at this). The test did not pass initially, because Claude never added the handling of 500th errors to ExceptionHandler. If the test expected the 404th error, it would still fail, because when processing validation, only the handling was added CategoryNotFoundException, but there is also an advantage: tests fail, which means they cannot be ignored. This gives a chance to fix the error in time .