JavaScript & Hashtable

JavaScript doesn’t come with a native Hashtable object.

Yet in JavaScript, any object can be used as an associative array, similar to a Hashtable structure.

Since arbitrary properties can be added to an object on the fly, any object is effectively a set of key/value pairs.

The only constraint is that the key must be a string.

e.g.

var myStrMap = new Object(); // init an object

myStrMap["key"] = "some value"; // put a value

var val = myStrMap["key"]; // get a value

Notice that if a non-string object is used as a key, JavaScript converts it to a string (using its toString() function).

This can cause serious problems, since different objects might return the same toString() value.

Moreover, all objects that don’t override toString() return the same value (for plain objects, “[object Object]”), so they all collide on a single key.

To use a non-string object as a key, one could do one of the following:

  • Implement a hash(myObject) function
  • Implement a myObj.toString() function; bear in mind that toString() might be needed for other things
  • Use or develop a hashtable library for JavaScript (e.g. http://www.timdown.co.uk/jshashtable/ )

If you are using Chrome, Chrome Frame, Node.js and the like, you are using Google’s JavaScript virtual machine, V8, behind the scenes.

Specifically, V8 doesn’t implement object property access as a hashtable lookup; it implements it in a way that performs better.

So how does it work? “V8 does not use dynamic lookup to access properties. Instead, V8 dynamically creates hidden classes behind the scenes”, which makes property access almost as fast as accessing fields of Java/C++ objects.

Why? Because in a fixed class, each property can be found at a fixed offset in memory.

So in general, accessing a property of an object in V8 is faster than a hashtable lookup. This means that a Hashtable (with string keys) built on plain object properties may well perform better than a Hashtable in Java, where Hashtable comes out of the box.
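For comparison, here is a minimal sketch of the Java side of that statement: the same put/get as the JavaScript example, using java.util.HashMap (the unsynchronized sibling of java.util.Hashtable). The class name StringMapDemo and the value 42 are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class StringMapDemo {
    // Mirrors the JavaScript put/get example with Java's built-in map type.
    static Map<String, Integer> build() {
        Map<String, Integer> myStrMap = new HashMap<>(); // init a map
        myStrMap.put("key", 42);                          // put a value
        return myStrMap;
    }

    public static void main(String[] args) {
        Integer val = build().get("key"); // get a value
        System.out.println(val);
    }
}
```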

More info can be found here: http://code.google.com/intl/sv/apis/v8/design.html#prop_access

The good news is that the next JavaScript version will probably ship with a native Map object, as proposed in ECMAScript Harmony.


One Java, Two compilers..

Lately I was discussing Java with a few students of mine.

It seems that students are often confused about how Java and the JVM work because there are TWO compilers involved. When someone mentions a compiler or the Just-In-Time compiler, some of them imagine it is the same one, the Java compiler.

So how does it really work?

It’s simple.

1) You write Java code (file.java), which is compiled to “bytecode”. This is done by javac, the first compiler.

It’s a well-known fact that Java can be written once, compiled, and run anywhere (on any platform). This means that a different JVM can be installed on each platform and still read the same good old bytecode.

2) Upon execution of a Java program (a class file, or a JAR file that consists of classes and other resources), the JVM must execute the program and translate it into machine code for the specific platform.

In the first versions of Java, the JVM was a “stupid” interpreter that executed the bytecode instruction by instruction. That was extremely slow; people got mad, there were a lot of “lame Java, awesome C” talks, and the JVM guys got irritated and reinvented the JVM.

The “new” JVM was initially available as an add-on for Java 1.2; later it became the default Sun JVM (1.3+).

So what did they do? They added a second compiler: the Just-In-Time compiler (aka JIT).

Instead of interpreting instruction by instruction, the JIT compiler compiles the bytecode to machine code at run time, just before it is executed.

Moreover, the JVM gets smarter with every release: it “knows” when it should interpret the code and which parts of the code are worth compiling (still at run time).

It does that by collecting real-usage statistics and applying a long list of super-awesome heuristics.

The user can configure the JVM to disable or enable some of those heuristics.

To summarize: in order to execute Java code, you use two different compilers. The first one (javac) is generic and compiles Java to bytecode; the second (JIT) is platform-dependent and compiles some portions of the bytecode to machine code at run time!
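One rough way to watch the second compiler at work is to time the same method over several rounds: on a JIT-enabled JVM the later rounds are usually faster than the first ones. The sketch below is only an illustration; the class and method names are made up, and the actual timings depend on the JVM and the machine.

```java
class JitWarmupDemo {
    // A small, CPU-bound method that the JIT is likely to compile once it gets hot.
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    // Runs the method `rounds` times and records each round's duration in nanoseconds.
    static long[] timeRounds(int rounds, int n) {
        long[] nanos = new long[rounds];
        long sink = 0; // keeps the results live so the work is not optimised away
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            sink += sumOfSquares(n);
            nanos[r] = System.nanoTime() - start;
        }
        if (sink < 0) throw new IllegalStateException("unexpected overflow");
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeRounds(20, 1_000_000);
        for (int r = 0; r < nanos.length; r++) {
            System.out.println("round " + r + ": " + nanos[r] + " ns");
        }
    }
}
```

On HotSpot, running the same class with -Xint forces pure interpretation, and -XX:+PrintCompilation prints the methods as they get compiled, which makes the difference between the two modes visible.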

Quick JDK 8 Suggestion

JDK 7 is just around the corner, and I am really excited about all the goodies that it brings with it.

While I was trying to benchmark JDK 7 against older JDKs, I realized that the GC (Garbage Collector) is an unknown factor. While some piece of code is running, one can never know whether the GC is running in parallel; in that case the specific iteration might take much more time, and the benchmark data gets corrupted.

So my suggestion is to add an awesome new block type (JDK 8!!!):

no Garbage Collection block – noGC

noGC{

//some code here;

}

catch(AlmostOutOfMemoryError err){

}

While the code inside the noGC block is running, it is guaranteed that the GC won’t run in parallel.

The AlmostOutOfMemoryError will be thrown when the heap is X% full (where X is configurable as -xnogcf).

Just to be clear, it wouldn’t help me with benchmarking, since the older JDKs would not support it; yet that was the trigger.
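Until something like noGC exists, a benchmark can at least try to detect after the fact that a collection happened during a measured iteration and throw that sample away. The following is a minimal sketch using the standard java.lang.management API; the class GcAwareTimer and its methods are made up for illustration, and the collection counters only tell us about collections that were counted in between, so treat the result as a hint rather than a guarantee.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcAwareTimer {
    // Total number of collections reported by all collectors so far.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not report it
            if (count > 0) total += count;
        }
        return total;
    }

    // Runs the task once; returns elapsed nanoseconds, or -1 if a collection
    // was counted while the task was running (so the sample can be discarded).
    static long timeUnlessGc(Runnable task) {
        long gcBefore = totalGcCount();
        long start = System.nanoTime();
        task.run();
        long elapsed = System.nanoTime() - start;
        return totalGcCount() == gcBefore ? elapsed : -1;
    }

    public static void main(String[] args) {
        long nanos = timeUnlessGc(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10_000; i++) sb.append(i);
        });
        System.out.println(nanos < 0 ? "GC ran, discard sample" : nanos + " ns");
    }
}
```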

How such a block would work in a multi-threaded environment is another issue.

In my opinion, applications with a small real-time need would benefit a lot from such a block, much more than from all those real-time Java frameworks out there.

Would love to hear your opinion.

Avoid memory leaks using Weak&Soft references

Some Java developers believe that there is no such thing as a memory leak in Java (thanks to the fabulous concept of automatic garbage collection).

Others have met the OutOfMemoryError and understood that the JVM has run into some memory issue, but they are not sure whether it is caused by their code or maybe even by the OS.

The OutOfMemoryError API docs reveal that it is “Thrown when the Java Virtual Machine cannot allocate an object because it is out of memory, and no more memory could be made available by the garbage collector.”

As we know, the JVM has a parameter that sets the maximum heap size (-Xmx), hence we can definitely try to increase the heap size. Yet some code generates new instances all the time; if those instances stay reachable (referenced, directly or transitively, by the main program) for the entire life span of the program, the GC won’t reclaim them. Hence the heap usage will keep growing and eventually an OutOfMemoryError will be thrown. We call that a memory leak.
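A minimal sketch of such a leak, assuming a hypothetical request handler that stores its working buffers in a static list and never removes them (the names LeakDemo, HISTORY and handleRequest are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;

class LeakDemo {
    // A static list is reachable for the whole life of the program,
    // so everything added to it stays reachable too.
    static final List<byte[]> HISTORY = new ArrayList<>();

    static void handleRequest() {
        byte[] buffer = new byte[1024]; // per-request working data
        HISTORY.add(buffer);            // the leak: this reference is never released
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1_000; i++) {
            handleRequest();
        }
        System.out.println("retained buffers: " + HISTORY.size());
        HISTORY.clear(); // releasing the references makes the buffers collectable again
    }
}
```

With a loop that never ends and no call to clear(), the heap would fill up no matter how large -Xmx is.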

Our job as Java developers is to release references (reachable from the main program) to instances that we won’t use in the future. By doing that we make sure that the GC can reclaim those instances (free the memory they occupy in the heap).

In some cases we reference an instance from two different roots. One root represents a fast-retrieval space (e.g. a HashMap) and the other manages the real lifespan of that instance. Sometimes we would like to remove the reference from one root and have the reference in the other root (the fast-retrieval one) removed automatically.

We wouldn’t want to do it manually, since we are not C++ developers and we don’t like to manage memory manually.

Weak references

To solve that we can use WeakReference.

Instances that are referenced only through weak references (weakly reachable) become eligible for collection at the next garbage collection. In other words, those references don’t protect their referent from the garbage collector.

Hence, if we would like to manage the life span of an instance through one reference only, we use WeakReference objects to create all the other references (usage: WeakReference<Object> wr = new WeakReference<>(someObject);).

In some apps we would like to add all our existing references to some static list. Those references should not be strong, otherwise we would have to clean them up manually. We would add them to the list using this code:

private static final List<WeakReference<Object>> refList = new ArrayList<>();

public static void addWeakReference(Object o) {
    refList.add(new WeakReference<>(o));
}
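Reading from such a list needs a little care: get() returns null once the referent has been collected, and the WeakReference wrappers themselves stay in the list until someone removes them. A minimal sketch of a list that handles both follows; the class WeakRefList and its live() method are made up for illustration.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class WeakRefList {
    private final List<WeakReference<Object>> refs = new ArrayList<>();

    void add(Object o) {
        refs.add(new WeakReference<>(o));
    }

    // Returns the referents that are still alive and drops the cleared entries.
    List<Object> live() {
        List<Object> alive = new ArrayList<>();
        Iterator<WeakReference<Object>> it = refs.iterator();
        while (it.hasNext()) {
            Object o = it.next().get(); // null once the referent has been collected
            if (o == null) {
                it.remove();
            } else {
                alive.add(o);
            }
        }
        return alive;
    }

    public static void main(String[] args) {
        WeakRefList list = new WeakRefList();
        Object kept = new Object();
        list.add(kept);
        list.add(new Object()); // nobody keeps a strong reference to this one
        System.gc();            // only a hint; the JVM decides when to collect
        List<Object> alive = list.live();
        System.out.println("alive: " + alive.size() + ", kept is among them: " + alive.contains(kept));
    }
}
```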

Since most WeakReference use cases need a Map data structure, there is a Map implementation that holds its keys through weak references automatically for you: WeakHashMap.
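A minimal WeakHashMap sketch, assuming we want to attach a label to objects whose lifespan is managed elsewhere (the class LabelRegistry is made up for illustration). Note that only the keys are weak; the values are held strongly until their entry is dropped, so a value should not reference its own key.

```java
import java.util.Map;
import java.util.WeakHashMap;

class LabelRegistry {
    // Keys are held weakly; values are held strongly until the entry is dropped.
    private final Map<Object, String> labels = new WeakHashMap<>();

    void label(Object target, String text) { labels.put(target, text); }

    String labelOf(Object target) { return labels.get(target); }

    int size() { return labels.size(); }

    public static void main(String[] args) {
        LabelRegistry registry = new LabelRegistry();
        Object session = new Object(); // the "real" owner of the lifespan
        registry.label(session, "user session");
        System.out.println(registry.labelOf(session));

        session = null; // the owner lets go
        System.gc();    // only a hint; the entry disappears once the key is collected
        System.out.println("entries left: " + registry.size());
    }
}
```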

Soft References

I have seen a few cache implementations that use weak references (e.g. the cache is just a WeakHashMap, so the GC cleans old objects out of the cache). Without weak references a naive cache can easily cause memory leaks, and therefore weak references might be a solution for that.

The main problem is that the GC will most likely clean the cached objects sooner than you would like.

Soft references solve that. They are just like weak references, yet the GC won’t reclaim them as eagerly. We can be sure that the JVM won’t throw an OutOfMemoryError before it has cleared all the soft and weak references!

Using soft references for caching is considered the naive, generic cache solution (the poor man’s cache).

(usage: SoftReference<Object> sr = new SoftReference<>(someObject);)
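Such a poor man’s cache could be sketched as follows. This is an illustration only: the class SoftCache and its loader function (which recomputes a value that is missing or was cleared by the GC) are assumptions, not an existing library API.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();
    private final Function<K, V> loader; // hypothetical: recomputes a missing value

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    synchronized V get(K key) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get(); // null if never cached or cleared by the GC
        if (value == null) {
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    public static void main(String[] args) {
        SoftCache<Integer, int[]> cache = new SoftCache<>(n -> new int[n]);
        int[] first = cache.get(1_000);
        int[] second = cache.get(1_000); // served from the cache while `first` is still in use
        System.out.println("same instance: " + (first == second));
    }
}
```

The keys and the SoftReference wrappers themselves are still held strongly, so a long-lived cache of this kind would also want to remove entries whose referent has been cleared.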

How-to speed-up your java code myths

In my last post I covered tips that I have collected throughout the years on how to speed up your Java code.

After reviewing the tips and reading my friends’ criticism, I updated the list and created a new list of myths. Here it is:

final: developers might think that final methods are more efficient because the compiler will be able to inline them. That is false. Imagine that you compile the class Main together with the class Inline; the method Main.main() creates an instance of Inline and invokes the method inline.finalMethod(), which is final. At compile time everything looks great, yet at run time we might use a different version of the compiled Inline class, in which finalMethod is not final and can be overridden.

Synchronization blocks: old VMs used to pay a lot of overhead for running a synchronized method; new VMs mostly know how to detect a synchronized method that is not running concurrently and treat it as a non-synchronized one.

Calling the garbage collector manually: calling System.gc() is usually a mistake. The garbage collection mechanisms of the new VMs are state of the art and will most likely invoke the GC at a better time. Moreover, a manual GC triggers a full collection of all generations, and that is not a smart move.

Object pooling: allocating an object on the heap is not cheap, but for non-complex objects it is not that expensive either; designing an object pool for simple objects will, in many cases, just add the overhead of managing the pool.

In general, it seems that performance tips should always be revisited, since new compilers and VMs try to solve exactly those problems.

Immutable objects: in general, immutable objects have many advantages: (1) automatic thread-safety, (2) their hashCode value is cacheable, (3) they are easy to work with.

A quote from Effective Java: “Classes should be immutable unless there’s a very good reason to make them mutable … If a class cannot be made immutable, limit its mutability as much as possible.”
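A minimal sketch of such an immutable class, showing the cached hashCode and a "modification" that returns a new instance (the class ImmutablePoint is made up for illustration):

```java
import java.util.Objects;

final class ImmutablePoint {
    private final int x;
    private final int y;
    private final int hash; // safe to cache: the fields can never change

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
        this.hash = Objects.hash(x, y);
    }

    int x() { return x; }

    int y() { return y; }

    // "Mutation" returns a new instance and leaves this one untouched.
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, y);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof ImmutablePoint)) return false;
        ImmutablePoint p = (ImmutablePoint) other;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return hash;
    }

    public static void main(String[] args) {
        ImmutablePoint a = new ImmutablePoint(1, 2);
        ImmutablePoint b = a.withX(5);
        System.out.println(a.x() + " " + b.x());
    }
}
```

Since no field can change after construction, instances can be shared between threads without synchronization.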

Google Gears vs. JDBC (performance)

This year I’m teaching a course at the Hebrew University (advanced Internet technologies). I’m currently guiding one of my students; we are designing a web framework that makes heavy use of Google Gears. I will elaborate on this project in a future post.

I must share with you a performance test that we ran as a proof of concept for the efficiency of a client-side database.

As I belong to the Java/J2EE school, I compared a JDBC method to a JavaScript function that uses the Gears API (mainly the Database object).

My test case was simple: creating a new table with one column (int), inserting N rows (increasing numbers from 0 to N-1), selecting * from that table, and finally iterating over the result row by row.

To isolate the database connectivity question from a few multithreading issues, I used IE8 to test the JavaScript code (and not Chrome or Firefox).

The results are unbelievable:

The X axis represents N (the number of rows I inserted), the Y axis represents time (ms).

Gears works faster than JDBC!

A few issues I had:

1) This happens only when I wrap the Gears code in a transaction (I think that is not an issue given the nature of client-side programming: no more than one concurrent user). Thanks to the Gears team for helping me here.

2) Of course you can claim that JDBC is just an underlying layer and that it can work beneath a 2nd-level cache layer. Well, we can definitely implement a 2nd-level cache for Gears as well, as part of our framework 😉

3) Why IE8? Chrome and maybe Firefox 3.5 work differently with threads, and I did not want to add code that deals with threads (WorkerPool).

4) I used a MySQL database to test the JDBC code.

5) Gears code:

function db(numOfRows) {
  var startTime = new Date().getTime();
  var db = google.gears.factory.create("beta.database");
  db.open("gearsDb");
  db.execute("BEGIN"); // start tx
  // create the table first, so that the delete cannot fail on a fresh database
  db.execute("create table if not exists data (rowId int)");
  db.execute("delete from data");
  for (var i = 0; i < numOfRows; i++) {
    db.execute("insert into data values (" + i + ")");
  }
  db.execute("COMMIT");
  var rs = db.execute("select rowId from data");
  var val = 0;
  while (rs.isValidRow()) {
    val = rs.field(0);
    rs.next();
  }
  rs.close();
  var elapsed = new Date().getTime() - startTime;
  document.write(elapsed + "," + numOfRows + "<BR/>");
}

6) JDBC code:

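// Assumes java.sql.* and the MySQL Connector/J Driver class are imported,
// and that param holds the number of rows to insert.
long time = System.currentTimeMillis();
DriverManager.registerDriver(new Driver());
Connection con = DriverManager.getConnection("jdbc:mysql://myLab:3306/jdbcPerTest", "root", "pass");
con.setAutoCommit(false); // run everything in one transaction
// a plain Statement runs the fixed SQL; a PreparedStatement must not be given an SQL string
Statement plain = con.createStatement();
plain.execute("delete from data");
PreparedStatement stmt = con.prepareStatement("insert into data values (?)");
for (int i = 0; i < param; i++) {
    stmt.setInt(1, i);
    stmt.execute();
}
con.commit();
ResultSet rset = plain.executeQuery("select * from data");
int result = 0;
while (rset.next()) {
    result = rset.getInt(1);
}
long time2 = System.currentTimeMillis();

System.out.println(param + "," + (time2 - time));

rset.close();
stmt.close();
plain.close();
con.close();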

For me, such results are the beginning of an era, yet in hindsight they are obvious.

If you would like to join the development of such a web framework, please let me know.