Sunday, January 11, 2015

OpenJDK Cookbook



It has been a long time since I published my last post. Nowadays, work and the second child consume most of my time, so finding even a few free minutes is very hard :) But I haven't just been surviving these uneasy conditions; I also managed to do some work outside of my day-to-day routine, and that work is a book which I wrote with two other guys - Alex Kasko and Alexey Mironchenko. It was quite a challenging experience which, as you would expect, took much more effort than I had anticipated and brought me a few sleepless nights. But I have to admit it was a positive experience, and it is a very pleasing feeling to see the work completed.
As you can guess, the book is about OpenJDK. We specifically tried to avoid pure Java topics: you will not find chapters on how to work with collections or how to tune the JVM, since there are plenty of other books on those topics and there is nothing new to cover. Most of the material is about OpenJDK specifics, things which can hardly be found anywhere else. It will be useful to anyone who is going to hack on OpenJDK, make changes to its source code and experiment with it. The content covers building various versions of OpenJDK on various platforms, making code changes and, of course, testing them.
At the moment the book is at the editing stage and will be published at the end of January. It is available for pre-order from Amazon UK or Amazon US.

Tuesday, May 7, 2013

Embedding Jetty9 & Spring MVC

This post is a redo of one of my previous posts, which was about embedding Jetty7. Now it covers the new version - Jetty9 - with Spring MVC support as well. I just thought it would be a good idea to keep something like this as a reference. There is not much text below, because the source is clear enough and doesn't need much explanation. Still, feel free to raise questions in the comments.
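
For reference, here is a minimal sketch of the kind of setup this post describes (an illustrative sketch rather than the exact source from the post; AppConfig stands in for your own Spring @Configuration class):

  import org.eclipse.jetty.server.Server;
  import org.eclipse.jetty.servlet.ServletContextHandler;
  import org.eclipse.jetty.servlet.ServletHolder;
  import org.springframework.web.context.support.AnnotationConfigWebApplicationContext;
  import org.springframework.web.servlet.DispatcherServlet;

  public class EmbeddedJettyApp {

      public static void main(String[] args) throws Exception {
          // Spring MVC context driven by an annotated configuration class
          AnnotationConfigWebApplicationContext webContext =
              new AnnotationConfigWebApplicationContext();
          webContext.register(AppConfig.class); // AppConfig is a placeholder for your @Configuration class

          // servlet context with Spring's DispatcherServlet mapped to everything
          ServletContextHandler handler = new ServletContextHandler(ServletContextHandler.SESSIONS);
          handler.setContextPath("/");
          handler.addServlet(new ServletHolder(new DispatcherServlet(webContext)), "/*");

          // embedded Jetty server listening on port 8080
          Server server = new Server(8080);
          server.setHandler(handler);
          server.start();
          server.join();
      }
  }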

Wednesday, March 27, 2013

AtomicFieldUpdater vs. Atomic

Java 1.5 introduced a new family of classes (Atomic*FieldUpdater) for atomic updates of object fields, with properties similar to the Atomic* set of classes, and there seems to be some confusion about their purpose. That confusion is understandable: the reason for their existence is not very obvious. First of all, they are in no way faster than Atomics; if you look at the source, you will see lots of access control checks. Secondly, they are not handy - the developer has to write more code, understand a new API, etc.

So why would you bother? There are two main use cases where an Atomic*FieldUpdater can be considered an option:

  • There is a field which is mostly read and rarely changed. In that case, a volatile field can be used for read access and an Atomic*FieldUpdater for occasional updates. Though that optimization is arguable, because there is a good chance that in the latest JVMs Atomic*.get() is an intrinsic and should not be slower than a volatile read.
  • Atomics have much higher memory overhead than primitives. When memory footprint is critical, an Atomic* object can be replaced with a volatile primitive field plus an Atomic*FieldUpdater (see the sketch below).
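
As a rough illustration of both points, here is a minimal sketch (my own example, not taken from any particular library) where reads go through a plain volatile field and updates go through an AtomicLongFieldUpdater, so each instance carries just a primitive long instead of a separate AtomicLong object:

  import java.util.concurrent.atomic.AtomicLongFieldUpdater;

  class Counter {
      // one volatile primitive per instance instead of a separate AtomicLong object
      private volatile long count;

      // a single shared updater serves all Counter instances
      private static final AtomicLongFieldUpdater<Counter> COUNT_UPDATER =
          AtomicLongFieldUpdater.newUpdater(Counter.class, "count");

      long get() {
          return count;                        // cheap volatile read on the hot path
      }

      void increment() {
          COUNT_UPDATER.incrementAndGet(this); // occasional atomic update
      }
  }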

References:
http://concurrency.markmail.org/message/ns4c5376otat2p54?q=FieldUpdater
http://concurrency.markmail.org/message/mpoy74yhuwgi52fa?q=FieldUpdater

Tuesday, March 12, 2013

Scala: Automatic resource management

After completing the wonderful course by Martin Odersky, I eventually had a chance to play a little with Scala and create something more useful than a "hello world" app. And even though I had gained some experience with the language just a few weeks before, I felt slightly frustrated. I reckon that's because I have become too dull and lazy after spending too much time with Java :) The first surprise was realizing that this language really has a compiler - with Java it almost doesn't exist: you never 'compile', you 'build', which is a very different kind of thing. With Java you are almost always certain that your code compiles, because modern IDEs (like IntelliJ) do not give you a chance to leave a compilation error in your code. Another surprise is that the Scala compiler is deadly slow; I have a strong feeling that big projects will suffer from it. So you could say that with Scala it feels like coming back to the good old C++ days :)

OK, that was the introduction. Here is some stuff I wrote, which I am almost sure is just another reinvented wheel, but it was useful for me. After some time with the language, I realized that it doesn't have any standard resource-management construct, which is probably fine for Scala - the language is so flexible that it allows you to build your own without much effort (most of the code is borrowed from this post):

  import java.io.{BufferedReader, Closeable, FileReader}

  trait Managed[T] {
    def onEnter(): T
    def onExit(t: Throwable = null)
    // swallow failures from cleanup code so they don't mask the original exception
    def attempt(block: => Unit) {
      try { block } catch { case _: Throwable => }
    }
  }

  def using[T <: Any, R](managed: Managed[T])(block: T => R): R = {
    val resource = managed.onEnter()
    var exception = false
    try {
      block(resource)
    } catch  {
      case t:Throwable => {
        exception = true
        managed.onExit(t)
        throw t
      }
    } finally {
      if (!exception) {
        managed.onExit()
      }
    }
  }

  def using[T <: Any, U <: Any, R] (managed1: Managed[T], managed2: Managed[U]) (block: T => U => R): R = {
    using[T, R](managed1) { r =>
      using[U, R](managed2) { s => block(r)(s) }
    }
  }

  class ManagedClosable[T <: Closeable](closable:T) extends Managed[T] {
    def onEnter(): T = closable
    def onExit(t:Throwable = null) {
      attempt(closable.close())
    }
  }

  implicit def closable2managed[T <: Closeable](closable:T): Managed[T] = {
    new ManagedClosable(closable)
  }
and the usage looks like this:
  def readLine() {
    using(new BufferedReader(new FileReader("file.txt"))) {
      file => {
        file.readLine()
      }
    }
  }

Monday, February 4, 2013

Evil of microbenchmarking & CAS performance on Ivy Bridge

A few days back Martin Thompson published an investigation into the results of a controversial CAS (compare-and-swap) performance test he had made a few months earlier. That investigation really impressed me - it shows how microbenchmarking can go really wrong, even when it is done by such a smart guy.

Just to recap: the test ran several threads which were hammering the CPU with CAS operations, and it showed that on average a CAS on a modern Ivy Bridge processor is significantly slower than on the older Nehalem architecture. A few months later Martin found the reason for this strange behaviour, and the amazing thing is that the test is slower precisely because Ivy Bridge is actually faster.
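
To give an idea of the shape of such a test (this is my own rough sketch, not Martin's actual benchmark), every thread spins on compareAndSet against a single shared counter and the total run time is reported:

  import java.util.concurrent.atomic.AtomicLong;

  public class CasHammer {
      private static final AtomicLong counter = new AtomicLong();
      private static final long ITERATIONS_PER_THREAD = 50_000_000L;

      public static void main(String[] args) throws InterruptedException {
          int threadCount = args.length > 0 ? Integer.parseInt(args[0]) : 2;
          Thread[] threads = new Thread[threadCount];
          long start = System.nanoTime();
          for (int i = 0; i < threadCount; i++) {
              threads[i] = new Thread(() -> {
                  for (long j = 0; j < ITERATIONS_PER_THREAD; j++) {
                      long current;
                      do {
                          current = counter.get();
                      } while (!counter.compareAndSet(current, current + 1)); // CAS retry loop
                  }
              });
              threads[i].start();
          }
          for (Thread t : threads) {
              t.join();
          }
          long elapsedMs = (System.nanoTime() - start) / 1_000_000;
          System.out.println(threadCount + " threads took " + elapsedMs + " ms");
      }
  }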

To understand why that happens, let's see what is going on when a CAS is executed. Generally speaking, at a high level, the memory that is about to be written can be in one of two states relative to a CPU core: the core either exclusively owns the cache line containing it, or it doesn't. If the core owns that line, the CAS is extremely fast - the core doesn't need to notify the other cores to perform the operation. If it doesn't own the line, the situation is very different - the core has to send a request to fetch the cache line in exclusive mode, and such a request requires communication with all the other cores. That negotiation is not fast, but on Ivy Bridge it is much faster than on Nehalem. And because it is faster on Ivy Bridge, a core gets less time to perform a run of fast local CAS operations while it owns the cache line, so the total throughput of the test is lower.

I suppose there is a very good lesson here - microbenchmarking is tricky and not easy to do properly, and the results can easily be interpreted the wrong way. So, be careful!

Thursday, December 20, 2012

git hangs after "Resolving deltas"

I have had a funny problem with Git, which I suppose is proxy-related. I am writing it down because I am sure I will hit the same problem again some time, and I also hope it will help other people who are suffering from it.

As a precondition, I have Git configured with the following in '.gitconfig':

[http]
proxy=http://user:password@proxy:8080

When I tried to clone a repository I got this:

$ git clone https://code.google.com/p/caliper/
Cloning into 'caliper'...
remote: Counting objects: 3298, done.
remote: Finding sources: 100% (3298/3298), done.
remote: Total 3298 (delta 1755)
Receiving objects: 100% (3298/3298), 7.14 MiB | 1.94 MiB/s, done.
Resolving deltas: 100% (1755/1755), done.

And then nothing, it just hangs. If you go and have a look, you can see that the files are downloaded but not unpacked. Like everyone else on the Internet, I have no idea why that happens, but eventually I found a way to get the files out of it.

When it hangs, just kill the process with Ctrl+C and run this command in the repository folder:

$ git fsck
notice: HEAD points to an unborn branch (master)
Checking object directories: 100% (256/256), done.
Checking objects: 100% (3298/3298), done.
notice: No default references
dangling commit 2916d1238ca0f4adecbda580ef4329a649fc777c
Now just merge that dangling commit:
$ git merge 2916d1238ca0f4adecbda580ef4329a649fc777c
and from now on you can enjoy the repository content in any way you want.

Thursday, December 13, 2012

File.setLastModified & File.lastModified

I have observed interesting behaviour of the File.lastModified property on Linux. Basically, my problem was that I was incrementing the value of that property by 1 in one thread and monitoring the change in another thread, and apparently no change in the property's value happened - the other thread did not see the increment. After some time trying to make it work, I realized that I have to increment it by at least 1000 to make the change visible.

Wondering why that was happening, I had a look at the JDK source code, and this is what I found:

JNIEXPORT jlong JNICALL
Java_java_io_UnixFileSystem_getLastModifiedTime(JNIEnv *env, jobject this,
                                                jobject file)
{
    jlong rv = 0;

    WITH_FIELD_PLATFORM_STRING(env, file, ids.path, path) {
        struct stat64 sb;
        if (stat64(path, &sb) == 0) {
            rv = 1000 * (jlong)sb.st_mtime;
        }
    } END_PLATFORM_STRING(env, path);
    return rv;
}

What happens is that on Linux File.lastModified has one-second resolution and simply drops the milliseconds. I'm not an expert in Linux programming, so I'm not sure whether there is a way to get that time with millisecond resolution on Linux. I assume it should be possible, because 'setLastModified' seems to work as expected - it sets the modification time with millisecond resolution (you can find the source code in 'UnixFileSystem_md.c').

So, just a nice thing to remember: when you work with files on Linux, you may not see a change in File.lastModified if its value was bumped by less than 1000 ms.
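
Here is a tiny demo of the effect (on the Linux and JDK combination I was using; newer JDKs or other platforms may behave differently): a 1 ms bump is invisible through File.lastModified, while a 1000 ms bump shows up:

  import java.io.File;
  import java.io.IOException;

  public class LastModifiedDemo {
      public static void main(String[] args) throws IOException {
          File file = File.createTempFile("lastmod", ".tmp");
          long original = file.lastModified();

          file.setLastModified(original + 1);     // sub-second bump
          System.out.println("after +1ms:    " + (file.lastModified() - original)); // prints 0 here

          file.setLastModified(original + 1000);  // whole-second bump
          System.out.println("after +1000ms: " + (file.lastModified() - original)); // prints 1000

          file.delete();
      }
  }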

Wednesday, October 24, 2012

Effective Concurrency by Herb Sutter

I have never written feedback on events or courses, but this time I decided to write some. It is about the "Effective Concurrency" course by Herb Sutter. Hopefully this post will help someone get approval to attend the course :)

So, as I have already said, a few weeks back I was lucky enough to attend the "Effective Concurrency" course by Herb Sutter. He is a software architect at Microsoft, where he has been the lead designer of C++/CLI, C++/CX, C++ AMP and other technologies. He has also served for a decade as chair of the ISO C++ standards committee. Many people also know him for his books.

Tuesday, September 11, 2012

Building OpenJDK on Windows

Experimenting with some stuff, I found that it is often useful to have the JDK source code at hand so I can make changes, play with it, etc. So I decided to download and compile that beast. It took me quite some time, although my initial thought was that it should be as simple as running the make command :). As you can guess, I found that it's not a trivial task, and to simplify my life in the future it seemed useful to keep a record of what I was doing.

Saturday, May 19, 2012

Bug in Java Memory Model implementation

I have just come across an amazing question on Stack Overflow:

http://stackoverflow.com/questions/10620680/why-volatile-in-java-5-doesnt-synchronize-cached-copies-of-variables-with-main

Basically the guy there is trying to use "piggybacking" to publish a non-volatile variable, and it doesn't work. "Piggybacking" is a technique that uses the visibility guarantees of a volatile variable or a monitor to publish non-volatile data. For example, this technique is used in ConcurrentHashMap#containsValue() and ConcurrentHashMap#containsKey(). The fact that it doesn't work in that case is a bug in Oracle's Java implementation. And that is rather scary - concurrency problems are very hard to identify even on a bug-free JVM, and such bugs in the Memory Model implementation make things much worse. Hopefully this is the only JMM-related bug and Oracle has good test coverage for such cases.
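
For context, here is a minimal sketch of the piggybacking idiom in general (my own illustration, not the code from the question): the plain write to data is published by the subsequent volatile write, so a reader that observes ready == true is guaranteed to see the value written before it:

  class Piggyback {
      private int data;               // non-volatile payload
      private volatile boolean ready; // volatile flag whose write "carries" the payload

      void publish(int value) {
          data = value;   // plain write
          ready = true;   // volatile write: happens-before any later read that observes 'true'
      }

      Integer tryRead() {
          if (ready) {     // volatile read
              return data; // guaranteed to see the value written before 'ready = true'
          }
          return null;     // not published yet
      }
  }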

The good news is that this particular problem appears only with C1 (the client HotSpot compiler), and even then not in all cases; it doesn't happen with C2 (the server compiler, enabled with the "-server" switch). Fortunately, most people run Java on the server side, and only a few client applications use advanced concurrency features.

For those who want to understand the case better, please follow the link I provided at the beginning of the post. There is also a very useful thread on "concurrency-interest" with a good explanation of what is going on: http://cs.oswego.edu/pipermail/concurrency-interest/2012-May/009449.html