Saturday, December 13, 2008

Branching and Merging with SVN

To merge a change from the trunk into the current working directory (we assume that we are working on a branch), we issue this command:

$ svn merge -c 344 http://svn.example.com/repos/calc/trunk
U integer.c

After the merge is done, you can commit the change as usual. At that point, the change has been merged into your branch in the repository. In version control terminology, this act of copying changes between branches is commonly called porting changes.
When you commit the local modification, make sure your log message mentions that you're porting a specific change from one branch to another. For example:

$ svn commit -m "integer.c: ported r344 (spelling fixes) from trunk."
Sending integer.c
Transmitting file data .
Committed revision 360.

This will help you keep track of which changes have already been ported, so you don't merge the same change more than once; this matters especially when you repeatedly merge changes from one branch to another.

Now, there are times when you want to preview a merge operation without actually applying the changes, for example because you have local edits and reverting the files afterwards is not an option. Issue this command to do so:

$ svn merge --dry-run -c 344 http://svn.example.com/repos/calc/trunk
U integer.c

This shows only the status codes that a real merge would produce. It's more concise than the output of svn diff, and is perfect when you don't want to be flooded with the details of every change. If you do need the details, run svn diff with the same arguments you passed to svn merge in the previous example.
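For instance, a sketch of that detailed preview, reusing the same revision and the same hypothetical repository URL as the merge commands above:

```shell
# Full textual diff of the change the merge would apply --
# same arguments as the svn merge / --dry-run commands above:
svn diff -c 344 http://svn.example.com/repos/calc/trunk
```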

Now, an excerpt from the SVN handbook, Chapter 4: Branching and Merging

Merges and Moves


A common desire is to refactor source code, especially in Java-based software projects. Files and directories are shuffled around and renamed, often causing great disruption to everyone working on the project. Sounds like a perfect case to use a branch, doesn't it? Just create a branch, shuffle things around, then merge the branch back to the trunk, right?

Alas, this scenario doesn't work so well right now, and is considered one of Subversion's current weak spots. The problem is that Subversion's update command isn't as robust as it should be, particularly when dealing with copy and move operations.

When you use svn copy to duplicate a file, the repository remembers where the new file came from, but it fails to transmit that information to the client which is running svn update or svn merge. Instead of telling the client, “Copy that file you already have to this new location”, it instead sends down an entirely new file. This can lead to problems, especially because the same thing happens with renamed files. A lesser-known fact about Subversion is that it lacks “true renames”—the svn move command is nothing more than an aggregation of svn copy and svn delete.
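To see what that last point means in practice, here is a sketch (assuming a hypothetical working copy of the branch):

```shell
# A "rename" in Subversion...
svn move integer.c whole.c

# ...is recorded exactly as if you had run a copy plus a delete:
#   svn copy integer.c whole.c
#   svn delete integer.c
```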

For example, suppose that while working on your private branch, you rename integer.c to whole.c. Effectively you've created a new file in your branch that is a copy of the original file, and deleted the original file. Meanwhile, back on trunk, Sally has committed some improvements to integer.c. Now you decide to merge your branch to the trunk:

$ cd calc/trunk

$ svn merge -r 341:405 http://svn.example.com/repos/calc/branches/my-calc-branch
D integer.c
A whole.c

This doesn't look so bad at first glance, but it's also probably not what you or Sally expected. The merge operation has deleted the latest version of the integer.c file (the one containing Sally's latest changes), and blindly added your new whole.c file, which is a duplicate of the older version of integer.c. The net effect is that merging your “rename” to the trunk has removed Sally's recent changes from the latest revision!

This isn't true data loss; Sally's changes are still in the repository's history, but it may not be immediately obvious that this has happened. The moral of this story is that, until Subversion improves, be very careful about merging copies and renames from one branch to another.

Another excerpt from the SVN handbook, Chapter 4: Common Use-Cases

Merging a Whole Branch to Another


To complete our running example, we'll move forward in time. Suppose several days have passed, and many changes have happened on both the trunk and your private branch. Suppose that you've finished working on your private branch; the feature or bug fix is finally complete, and now you want to merge all of your branch changes back into the trunk for others to enjoy.

So how do we use svn merge in this scenario? Remember that this command compares two trees, and applies the differences to a working copy. So to receive the changes, you need to have a working copy of the trunk. We'll assume that either you still have your original one lying around (fully updated), or that you recently checked out a fresh working copy of /calc/trunk.

But which two trees should be compared? At first glance, the answer may seem obvious: just compare the latest trunk tree with your latest branch tree. But beware—this assumption is wrong, and has burned many a new user! Since svn merge operates like svn diff, comparing the latest trunk and branch trees will not merely describe the set of changes you made to your branch. Such a comparison shows too many changes: it would not only show the addition of your branch changes, but also the removal of trunk changes that never happened on your branch.

To express only the changes that happened on your branch, you need to compare the initial state of your branch to its final state. Using svn log on your branch, you can see that your branch was created in revision 341. And the final state of your branch is simply a matter of using the HEAD revision. That means you want to compare revisions 341 and HEAD of your branch directory, and apply those differences to a working copy of the trunk.

Note:

A nice way of finding the revision in which a branch was created (the “base” of the branch) is to use the --stop-on-copy option to svn log. The log subcommand normally shows every change ever made to the branch, tracing back through the copy which created it, so you would normally see history from the trunk as well. The --stop-on-copy option halts log output as soon as svn log detects that its target was copied or renamed.

So in our continuing example,

$ svn log -v --stop-on-copy \
http://svn.example.com/repos/calc/branches/my-calc-branch

------------------------------------------------------------------------
r341 | user | 2002-11-03 15:27:56 -0600 (Sun, 03 Nov 2002) | 2 lines
Changed paths:
A /calc/branches/my-calc-branch (from /calc/trunk:340)

$

As expected, the final revision printed by this command is the revision in which my-calc-branch was created by copying.

Here's the final merging procedure, then:

$ cd calc/trunk
$ svn update
At revision 405.

$ svn merge -r 341:405 http://svn.example.com/repos/calc/branches/my-calc-branch
U integer.c
U button.c
U Makefile

$ svn status
M integer.c
M button.c
M Makefile

# ...examine the diffs, compile, test, etc...

$ svn commit -m "Merged my-calc-branch changes r341:405 into the trunk."
Sending integer.c
Sending button.c
Sending Makefile
Transmitting file data ...
Committed revision 406.

Monday, December 1, 2008

Improvements to code-generated projects

In my previous post, Executing a maven 2 plugin task more than once, I showed an example of running a plugin task several times to generate multiple sets of JAXB Java beans from XSD schemas. The problem with that approach is that it is still not very intuitive, since you end up filling your project directory with extra directories for generated source files ($basedir/packageA and $basedir/packageB in our case). This creates confusion: you'll soon lose track of these files and, if you are not careful, attempt to check them in to the repository.

With that in mind, I'll show you a better way to organize your projects. To get started, let's lay out the proposed structure (this time we will have one project for each XSD, plus another project for the Java code that uses them):


+-my_java_project
| +-src/main/java
| `-src/main/resources
|
+-my_xsd_project_a
| `-src/main/resources
|
`-my_xsd_project_b
  `-src/main/resources


Now, let's examine the pom.xml used for XSD projects:


<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mycompany</groupId>
  <artifactId>my_xsd_project_a</artifactId>
  <packaging>jar</packaging>
  <name>My Company - Project A</name>
  ...
  <build>
    <plugins>
      <plugin>
        <groupId>org.jvnet.jaxb2.maven2</groupId>
        <artifactId>maven-jaxb2-plugin</artifactId>
        <executions>
          <!-- Configuration for packageA -->
          <execution>
            <goals>
              <goal>generate</goal>
            </goals>
          </execution>
        </executions>

        <!-- Common JAXB configuration -->
        <configuration>
          <!-- Changes the default schema directory -->
          <schemaDirectory>../xml-schemas-project/</schemaDirectory>
          <extension>true</extension>
          <generatePackage>com.mycompany.packageA</generatePackage>
          <generateDirectory>generated-src-packageA</generateDirectory>
          <schemaIncludes>
            <include>xml-schemas/packageA/packageA.xsd</include>
          </schemaIncludes>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>


Note that the configuration section has been extracted out of the <execution/> tag: since no other <execution/> needs its own particular configuration, we can move it all to the global configuration part of the plugin.

Next, if we run "mvn clean install" we will get a JAR containing the generated classes. So, to use the generated classes, we only need to declare my_xsd_project_a as a Maven 2 dependency of my_java_project, and that's it. Here's an example:


<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mycompany</groupId>
  <artifactId>my_java_project</artifactId>
  <packaging>jar</packaging>
  <name>My Company - My Java Project</name>
  ...
  <dependencies>
    <!-- Dependency on my_xsd_project_a (generated by JAXB) -->
    <dependency>
      <groupId>com.mycompany</groupId>
      <artifactId>my_xsd_project_a</artifactId>
    </dependency>

    <!-- Dependency on my_xsd_project_b (generated by JAXB) -->
    <dependency>
      <groupId>com.mycompany</groupId>
      <artifactId>my_xsd_project_b</artifactId>
    </dependency>
  </dependencies>
</project>
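As a possible refinement (not part of the original setup), a parent aggregator pom.xml in the directory above the three projects would let you build everything with a single "mvn clean install"; the parent coordinates here are illustrative:

```xml
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mycompany</groupId>
  <artifactId>my_projects_parent</artifactId>
  <packaging>pom</packaging>
  <name>My Company - Parent</name>
  ...
  <!-- Maven derives the build order from the inter-module dependencies,
       so the XSD projects are installed before my_java_project -->
  <modules>
    <module>my_xsd_project_a</module>
    <module>my_xsd_project_b</module>
    <module>my_java_project</module>
  </modules>
</project>
```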

Executing a maven 2 plugin task more than once

For one particular project I needed to execute a task defined by a maven2 plugin more than once. I had to dig around the Maven site for a few hours before finally arriving at the deeply buried "Using the executions tag" page, which explains that Maven 2 supports <executions/> for cases where you need to run several <execution/> blocks with different configurations. To save myself the trouble in the future, and for you folks, here's an example:


<plugin>
  <groupId>org.jvnet.jaxb2.maven2</groupId>
  <artifactId>maven-jaxb2-plugin</artifactId>
  <executions>
    <!-- Configuration for packageA -->
    <execution>
      <id>JAXB - packageA</id>
      <goals>
        <goal>generate</goal>
      </goals>
      <configuration>
        <generatePackage>com.mycompany.packageA</generatePackage>
        <generateDirectory>generated-src-packageA</generateDirectory>
        <schemaIncludes>
          <include>xml-schemas/packageA/packageA.xsd</include>
        </schemaIncludes>
      </configuration>
    </execution>

    <!-- Configuration for packageB -->
    <execution>
      <id>JAXB - packageB</id>
      <goals>
        <goal>generate</goal>
      </goals>
      <configuration>
        <generatePackage>com.mycompany.packageB</generatePackage>
        <generateDirectory>generated-src-packageB</generateDirectory>
        <schemaIncludes>
          <include>xml-schemas/packageB/packageB.xsd</include>
        </schemaIncludes>
      </configuration>
    </execution>
  </executions>

  <!-- Common JAXB configuration -->
  <configuration>
    <!-- Changes the default schema directory -->
    <schemaDirectory>../xml-schemas-project/</schemaDirectory>
    <extension>true</extension>
  </configuration>
</plugin>

Working with JDBC datasources with Maven2

During my career I've seen too many database-driven Maven 2 projects that use Maven 2 resource filtering of the JDBC properties file in order to externalize the username, password, JDBC URL and other JDBC-specific configuration to $USER/.m2/settings.xml. The problem with this approach is that we become locked in to Maven 2 and can't run a simple JUnit test without first running "mvn clean install", so that the placeholders in the JDBC properties file get replaced with the appropriate values from settings.xml.
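To make the anti-pattern concrete, here is a sketch of what such a filtered file typically looks like (all property names are illustrative):

```properties
# src/main/resources/jdbc.properties
# Before filtering, the file holds only placeholders; the real values
# live in a profile in each developer's $USER/.m2/settings.xml.
jdbc.driver=${jdbc.driver}
jdbc.url=${jdbc.url}
jdbc.username=${jdbc.username}
jdbc.password=${jdbc.password}
```

Launched outside the Maven build (an IDE JUnit run, for instance), the placeholders are never substituted and the connection fails.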

In my opinion, a JDBC properties file should be avoided altogether if possible. In a web application, for example, it's possible to create a datasource in the container (e.g. Tomcat) and then reference it from the application via JNDI. By doing this, we can happily use JDBC datasources without having to run "mvn clean install", making life much easier for those building inside Eclipse.
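As a sketch of that setup, assuming Tomcat (the JNDI name, driver, database and credentials below are all made up), the datasource is declared once in the container:

```xml
<!-- META-INF/context.xml in the web application (Tomcat) -->
<Context>
  <Resource name="jdbc/MyProjectDS"
            type="javax.sql.DataSource"
            auth="Container"
            driverClassName="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost:3306/mydb"
            username="myuser"
            password="secret"/>
</Context>
```

The application then obtains it with a plain JNDI lookup, e.g. (DataSource) new InitialContext().lookup("java:comp/env/jdbc/MyProjectDS"), with no Maven filtering involved.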

Now, we don't always want to run our code inside a container; one such case is when we are running unit tests. For those cases we should create a base JUnit class specific to our module, say "MyProjectModuleTestCase", and inject the JNDI datasource as explained in Randy Carver's blog post Injecting JNDI datasources for JUnit Tests outside of a container.
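The linked post has the full recipe; as a rough, self-contained sketch of the underlying idea (every name here is hypothetical, and a real test would bind a properly configured javax.sql.DataSource instead of a placeholder string), a tiny in-memory JNDI provider is enough for tests:

```java
import java.lang.reflect.Proxy;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.naming.spi.InitialContextFactory;

// Hypothetical in-memory JNDI provider so tests can run outside a container.
public class InMemoryJndiFactory implements InitialContextFactory {
    static final Map<String, Object> BINDINGS = new ConcurrentHashMap<>();

    @Override
    public Context getInitialContext(Hashtable<?, ?> env) {
        // A dynamic proxy keeps the sketch short: only bind/rebind/lookup matter here.
        return (Context) Proxy.newProxyInstance(
                Context.class.getClassLoader(), new Class<?>[] { Context.class },
                (proxy, method, args) -> {
                    switch (method.getName()) {
                        case "bind":
                        case "rebind":
                            BINDINGS.put(args[0].toString(), args[1]);
                            return null;
                        case "lookup":
                            return BINDINGS.get(args[0].toString());
                        default:
                            return null; // close() and friends are no-ops
                    }
                });
    }

    // Base class a module's tests could extend (name is illustrative).
    public abstract static class MyProjectModuleTestCase {
        static {
            System.setProperty(Context.INITIAL_CONTEXT_FACTORY,
                    InMemoryJndiFactory.class.getName());
        }

        protected Object lookup(String jndiName) throws NamingException {
            return new InitialContext().lookup(jndiName);
        }
    }

    public static void main(String[] args) throws Exception {
        System.setProperty(Context.INITIAL_CONTEXT_FACTORY,
                InMemoryJndiFactory.class.getName());
        // In a real test you'd bind a configured javax.sql.DataSource here.
        new InitialContext().rebind("jdbc/myProjectDS", "placeholder-datasource");
        Object found = new InitialContext().lookup("jdbc/myProjectDS");
        if (!"placeholder-datasource".equals(found)) {
            throw new AssertionError("JNDI lookup failed");
        }
        System.out.println("looked up: " + found);
    }
}
```

The static initializer points java.naming at the in-memory factory, so any code doing a plain new InitialContext().lookup(...) inside a test finds whatever the test bound, exactly as it would find the container-managed datasource in production.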