May 23, 2013

Writing cross platform compatible Java code

Java is a cross-platform language in the sense that a compiled Java program runs on every platform for which a JVM exists, such as Windows, Mac OS and Unix. Having said this, there are scenarios where Java programmers need to code carefully. Experienced Java programmers will be well placed to answer the following question.

Q. Can you list some of the cross platform issues that a Java programmer needs to be mindful of based on your experience?

1. Carriage return and new line characters across different platforms.

If you are processing a file such as a CSV file and need to parse the text line by line, you need to be aware of the newline characters used by different operating systems.

Windows: \r\n 
Unix: \n 
Mac: \n (Mac OS X and later), \r (classic Mac OS 9 and earlier)

Here is an example of the split function in Java that handles both Windows (\r\n) and Unix (\n) line endings. In a Java string literal the backslash must itself be escaped, which is why an extra "\" appears in the pattern.

  String[] split = fileInput.split("\\r?\\n");
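If the input may also contain files that end lines with a lone \r, the pattern can be widened. The following is a minimal sketch rather than a definitive implementation; LineSplitter and splitLines are illustrative names, and the sample data is made up.

```java
import java.util.Arrays;

class LineSplitter {
    // Alternatives are tried left to right, so "\r\n" is consumed as one
    // separator before a lone "\n" or a lone "\r" is considered.
    private static final String ANY_NEWLINE = "\\r\\n|\\n|\\r";

    static String[] splitLines(String text) {
        return text.split(ANY_NEWLINE);
    }

    public static void main(String[] args) {
        // Windows, Unix and old-style Mac line endings mixed in one input.
        String mixed = "a,1\r\nb,2\nc,3\rd,4";
        System.out.println(Arrays.toString(splitLines(mixed)));
        // prints [a,1, b,2, c,3, d,4]
    }
}
```

Note that String.split drops trailing empty strings, so a final newline at the end of the input does not produce an extra empty line.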

Other areas need the same care. The formatter classes, for example, support %n as a platform-independent newline:

String s2 = String.format("Use %%n as a platform independent newline.%n"); 

You can also get the line separator via

System.getProperty("line.separator");

and in Java 7 or later

System.lineSeparator();

2. The File path separator

Windows uses "\" and Unix systems use "/" to separate the names in a file path. So, you need to be careful when constructing file paths like

File file = new File(parentDirectory + "\\" + resourceDirectory + "\\" + fileName);

Instead, you should use 

File file = new File(parentDirectory + File.separator  + resourceDirectory +  File.separator  + fileName);

File.separator takes care of cross-platform compatibility by using the correct separator for the platform. Even better is to nest your File construction using the "public File(String parent, String child)" and "public File(File parent, String child)" constructors.

File dir = new File(parentDirectory, resourceDirectory);
File finalFile = new File(dir, fileName);
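On Java 7 and later, the java.nio.file package offers another way to join path segments without naming a separator at all. This is a minimal sketch under that assumption; PathBuilder and build are illustrative names, and the directory and file names are invented for the example.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

class PathBuilder {
    // Paths.get joins the segments with the separator of the platform's
    // default file system, so no "\" or "/" appears in the code.
    static Path build(String parentDirectory, String resourceDirectory, String fileName) {
        return Paths.get(parentDirectory, resourceDirectory, fileName);
    }

    public static void main(String[] args) {
        Path path = build("data", "resources", "input.csv");
        // The printed form depends on the platform, for example
        // data\resources\input.csv on Windows or data/resources/input.csv on Unix.
        System.out.println(path);

        // A Path converts back to a File for APIs that still expect one.
        File legacy = path.toFile();
        System.out.println(legacy.getName());
    }
}
```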

3. Threading priorities

Thread priorities are another thing to consider across platforms. Java defines ten priority levels (Thread.MIN_PRIORITY to Thread.MAX_PRIORITY), but how they map onto native priorities depends on the OS. Solaris, for example, has more thread priorities than Windows. So, if your program relies heavily on multi-threading, the OS may affect its behavior.

4. Using Native Code

Using native code (via JNI) can cause cross-platform issues, because the native library has to be built and shipped separately for each target platform.

5. Beware of the System class

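System.getProperty(...) can return OS-dependent values, for example for the keys "os.name", "file.separator" and "line.separator". The other most common one is Runtime.getRuntime().exec(), as it calls another application on your system, and you should know whether the application you are calling is available and works the same way on other systems.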
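When an external program really has to be launched, one option is to choose the command from the "os.name" system property. The sketch below is only an illustration: PlatformCommand and its methods are hypothetical names, and the directory-listing commands are assumptions about what is installed on the target machines.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class PlatformCommand {
    // The "os.name" property starts with "Windows" on Windows JVMs.
    static boolean isWindows(String osName) {
        return osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    // Picks a directory-listing command for the given OS name.
    static List<String> listDirectoryCommand(String osName) {
        if (isWindows(osName)) {
            return Arrays.asList("cmd.exe", "/c", "dir");
        }
        return Arrays.asList("ls", "-l");
    }

    public static void main(String[] args) throws Exception {
        String osName = System.getProperty("os.name");
        ProcessBuilder builder = new ProcessBuilder(listDirectoryCommand(osName));
        builder.inheritIO(); // show the child's output on this console
        int exitCode = builder.start().waitFor();
        System.out.println("exit code: " + exitCode);
    }
}
```

Keeping the command selection in its own method makes it possible to test the branch for each OS without actually running on that OS.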

Even though Java is touted as a Write Once Run Anywhere (WORA) programming language, you need to be aware of potential issues like these and test your code properly on the other platforms. Watch out for these gotchas in code reviews.

6. Character sets 

When converting bytes to a String or reading a file in different environments, it is imperative to use the right character set. Methods that do not take a character set fall back to the platform default, which differs between systems. Otherwise, you can have cross-platform issues such as characters displaying properly on a Win32 platform but not on a Unix platform, and vice versa.

So, instead of

String str = new String(bytes);

specify the character encoding explicitly, like

String str = new String(bytes, "utf-8");
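To see why the charset matters, the same bytes can be decoded two ways. This is a small sketch, assuming Java 7 or later for StandardCharsets; CharsetDemo and its method names are illustrative only.

```java
import java.nio.charset.StandardCharsets;

class CharsetDemo {
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "caf" followed by e-acute; the escape keeps the source file pure ASCII.
        byte[] bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // UTF-8 stores the accented letter as two bytes, so 5 bytes in total.
        System.out.println(bytes.length);               // 5
        // Decoding with the matching charset restores the 4 characters.
        System.out.println(decodeUtf8(bytes).length());   // 4
        // Decoding as ISO-8859-1 turns each byte into its own character.
        System.out.println(decodeLatin1(bytes).length()); // 5
    }
}
```

Using the StandardCharsets constants instead of a charset name string also avoids the checked UnsupportedEncodingException.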

Also, instead of

   FileInputStream fis = new FileInputStream("specialcharacters.txt");  
   InputStreamReader irs = new InputStreamReader(fis);  

use

   FileInputStream fis = new FileInputStream("specialcharacters.txt");  
   InputStreamReader irs = new InputStreamReader(fis,"utf-8");  

You can also set the default encoding via a JVM argument, which must come before the class name, as shown below

java -Dfile.encoding=ISO-8859-1 MyApp

Recently UTF-8 has become the default encoding on many systems, but sometimes you have to deal with files originating from older systems with other encodings. ASCII is an encoding that uses 7 bits per character and covers the basic US English characters. UTF-8 was designed for backward compatibility with ASCII and to avoid the complications of endianness and byte order marks in UTF-16 and UTF-32. The most useful and practical file encoding today is UTF-8, because it supports Unicode and is widely used on the internet. UTF-8 encodes each of the 1,112,064 code points in the Unicode character set using one to four 8-bit bytes, and it has become the dominant character encoding for the World Wide Web.
