Analyzing a Java codebase with CodeCity in 2016
By Matthew Miller

You’re probably a visual thinker. Almost everybody’s a visual thinker.

And software is, unfortunately, terrible at making itself visible. Your options are generally to see it in motion (as a running, preferably working product) or see it as code, with very little in between. (Maybe some log messages if you’re unlucky.)

This is why CodeCity is an interesting concept. It’s a visualization environment that allows a user to graph a codebase in three (or more) dimensions. The tool dates from 2009; I first discovered it through Adam Tornhill’s fun read “Your Code as a Crime Scene” (CaaCS).

Unfortunately, since it’s written in Smalltalk, the tool operates on a fairly obscure model: FAMIX 2.1, serialized in MSE, the interchange format of the Moose analysis platform. It’s what JSON would look like if Smalltalk had become the language of the web instead of JavaScript.

You’ll need a tool to analyze your code and export it into this format. There are two tools that I’m aware of: inFusion (whose demo doesn’t produce the correct output, and which is no longer available for purchase anyway) and iPlasma 6, which I describe below.

Exporting a Java codebase to MOOSE / FAMIX 2.1:

  1. Download iPlasma. The link is on the site above, but it’s easy to miss. Here’s a deep link.
  2. Unzip wherever.
  3. Update the iPlasma front end. iPlasma contains a pure Java Swing app called Insider, which includes a pair of launch scripts: insider.bat and insider.sh. They offer a front end onto the tool we’ll be using, but I found they need some doctoring first:
    1. The batch file attempts to launch with a bundled version of the JRE that isn’t modern. I altered it to use Java 1.8 from my path.
    2. Both scripts attempt to launch with a gig and a half of heap. I found that to be insufficient, so I bumped to 4G.
    3. The tool logs to console, so I found it valuable to launch from a shell.
  4. On launch, you’re presented with a good old-fashioned Swing UI. Select the option Load->Java Sources.
  5. A Swing modal dialog is displayed. Click the button with the three dots and browse to the root of your code repo. It doesn’t really matter how the directories are laid out or what build tool you use; iPlasma is just looking for your source files.
  6. Click the Open button, then OK.
  7. Wait a while. Expect errors as a tool built for Java 1.5 tries to parse your modern code; these shouldn’t matter too much. Be aware: iPlasma caches your classes as it reads them to save time on the next read. You’ll need to clear this cache (located in a directory named “temp” in the iPlasma launch location) before using the tool in the future when you point to different repos, branches, or make other changes to the code.
  8. After what seems like an eternity, you’ll have this rather unusual UI:

    The code you want to export will be in a tab in the upper right pane. Left-click the name of the folder, then right-click the same name.

  9. A context menu is displayed. Browse to Run Tool -> Moose MSE Exporter.
  10. You’ll be asked to give a filename. Enter one and click OK.
  11. Expect to wait a SECOND eternity as your model is exported.
    1. The model of a 1M LOC codebase is about 125 MB (which is why I needed more heap to build it).
    2. I found that, for whatever reason, some classes caused the export to fail. However, the offending class’s name was logged right before the failure. Fixing things involved closing iPlasma, removing that class’s source file, clearing the cache and starting over at step 4 above.
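
Two of the chores above (raising the heap in the launch scripts and clearing the cache between runs) are easy to script. The following is a minimal sketch rather than anything iPlasma ships with. It assumes the launch scripts set the heap with the standard JVM `-Xmx` flag and that the cache is the “temp” directory directly under the launch location; the function names are my own.

```python
import re
import shutil
from pathlib import Path

# Matches a JVM max-heap flag such as -Xmx1536m or -Xmx2g.
# Assumption: insider.sh / insider.bat set the heap with this flag.
HEAP_FLAG = re.compile(r"-Xmx\d+[kKmMgG]?")


def bump_heap(script_text: str, new_size: str = "4g") -> str:
    """Return the launch script text with every -Xmx flag set to new_size."""
    return HEAP_FLAG.sub(f"-Xmx{new_size}", script_text)


def clear_cache(launch_dir: Path) -> bool:
    """Delete the 'temp' cache directory under the iPlasma launch location.

    Returns True if a cache directory existed and was removed.
    """
    cache = launch_dir / "temp"
    if cache.is_dir():
        shutil.rmtree(cache)
        return True
    return False
```

To use it, read a launch script with `Path.read_text()`, pass the text through `bump_heap`, and write it back; call `clear_cache` with the directory you launch iPlasma from whenever you switch repos or branches.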

Now all you need to do is load this MSE file into CodeCity. That’s pretty straightforward, though for a codebase our size it can take forever (about 20 minutes). Once it’s loaded, you can generate a “city” based on attributes of your code. You can reconfigure these attributes to tell a better story about its complexity, such as basing the height of the “buildings” in your city on invocations or accesses rather than lines of code, to illustrate importance rather than size.
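
Given how long the import takes, it may be worth sanity-checking the MSE file before handing it to CodeCity. Here is a rough sketch that counts the exported elements by type. It assumes each element opens with `(FAMIX.<Type>`, which is how the MSE files I have seen look; the sample text is hand-written for illustration and is not real iPlasma output.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed element opening, e.g. "(FAMIX.Class (id: 2) (name 'Foo'))".
ELEMENT = re.compile(r"\(FAMIX\.(\w+)")

# Hand-written illustration of the assumed format, not real exporter output.
SAMPLE = """(
  (FAMIX.Namespace (id: 1) (name 'app'))
  (FAMIX.Class (id: 2) (name 'Foo') (belongsTo (idref: 1)))
  (FAMIX.Class (id: 3) (name 'Bar') (belongsTo (idref: 1)))
)"""


def count_entities(mse_text: str) -> Counter:
    """Count FAMIX elements by type name in the text of an MSE file."""
    return Counter(ELEMENT.findall(mse_text))


def count_entities_in_file(path: Path) -> Counter:
    """Count elements line by line so a 100+ MB model is never held in memory whole."""
    totals = Counter()
    with path.open(encoding="utf-8", errors="replace") as handle:
        for line in handle:
            totals.update(ELEMENT.findall(line))
    return totals
```

If the class count comes back far lower than the number of classes you expect, the export probably failed partway through, and finding that out now costs less than a wasted import.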

Bad news:

  • There’s no Groovy or Clojure support. This means our model doesn’t include a number of interesting classes, including most of our tests.
  • When viewing a very large city, CodeCity has a tendency to crash at the slightest provocation, which means restarting the 20-minute import procedure. I keep a smaller version of the full codebase for testing modeling options before applying them to the larger system.
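
To get a feel for how much of a repo the model leaves out, you can count source files by extension. This is a small sketch under my own assumptions: the extension list is illustrative, and a file count is only a crude stand-in for lines of code.

```python
from collections import Counter
from pathlib import Path

# Extensions of interest; only .java files end up in the iPlasma model.
EXTENSIONS = {".java", ".groovy", ".clj"}


def count_sources(repo_root: Path) -> Counter:
    """Count source files under repo_root, grouped by file extension."""
    counts = Counter()
    for path in repo_root.rglob("*"):
        if path.is_file() and path.suffix in EXTENSIONS:
            counts[path.suffix] += 1
    return counts


def modeled_share(counts: Counter) -> float:
    """Fraction of the counted files that are Java, i.e. visible to the model."""
    total = sum(counts.values())
    return counts[".java"] / total if total else 0.0
```

Running `modeled_share(count_sources(Path("path/to/repo")))` gives a quick ratio. A low number is a hint that the city will be missing whole neighborhoods.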