The Future of File Utilities: encryption, hashing and compression in the browser

I was wondering if the time has come where browser tech is secure and powerful enough to replace some of the niche command line tools that we are used to in Linux. Normally I’d have to download some variant of them on each operating system or device I’m working on. This can get annoying. Some of the most common niche tools are encryption, hashing and compression of files.

Turns out that the browser is more than capable to deliver this funcitonality thanks to some awesome open source projects. Imagine in the future having a plethora of file tools implemented in the browser: file editor, diff, hex dump, hex edit, all kinds of compression, encryption, hashing, search and replace of text and so on. We can potentially pipe the output of these tools to third party APIs to save the files on cloud storage or process them further. This can open a new era for file tools. And the best part is its all cross-platform. The tools can run uniformly on any operating system and any device that supports a major browser.

File Utilities

After my experience with building SecureMyFiles, a client-side encryption product for desktop operating systems, I got curious if we can achieve similar functionality directly in the browser. My main requirement was that all file operations must be performed on the client side with no data being transferred to a server. The idea is to use the bowser as a cross-platform execution engine to run the file utilities on any operating system, including mobile devices. This way we can avoid downloading specialized tools as we move from device to device.

I’ve open sourced the tools on GitHub:

File Encryption in the Browser

I was able to create an efficient file encryption and decryption utility in the browser that runs completely on the client side. Both encryption and decryption are efficient and run in constant memory achieving 9MB/s. They don’t match the performance of their command-line alternatives but given the fact that they can run on most devices that can run a browser I think it is a good start.

The Javascript library I decieded to use is forge. The need to run with constant memory regardless of the file size requires a cipher that can be updated in chunks. The forge implementation worked quite well. Unfortunately the native Web Crypto API does not support chunked based encryption which was disappointing.

The encryption algorithm I use is AES-GCM with 12 byte IV and 128 bit tag. The key is derived from a password and a randomly generated 128 bit salt passed through PBKDF2 with 100K iterations. The IV is initialized to all 0s (deterministic construction). It is never re-used and each encryption generates a new key. The structure of the encrypted file is:

| 128 bit salt | encrypted file contents | 128 bit tag |

File Encryptor on GitHub:

File Decryptor on GitHub:

File Hashing in the Browser

The file hasher produces SHA-256 hash. It also uses the forge crypto library. It operates on 1MB chunks and can read files with arbitrary size using constant memory. The speed is around 6 times slower than the command line utility sha256sum on my MacBook Pro. But nevertheless it works great in the browser.

File Hasher on GitHub:

File Compression

I decided to use the pako compression library. It achieves good speed and uses the zlib compression algorithm. I tested compressing a 34GB file (VirtualBox vdi file). It took some time but finished successfully on my MacBook Pro with 16GB of RAM. The file format is not pure gzip which is unfortunate since other tools cannot be used to decompress the file.

File Compressor on GitHub:

File Decompressor on GitHub:

What’s next

Why not compress and encrypt at the same time? Why not compress, encrypt and push to Dropbox using their API? This way I have my file ready to be shared with whoever I want. What other use cases can you think of?


Exceptions Guidelines

Exception handling is tricky. It requires careful consideration of the project at hand and the way errors can be dealt with. I am presenting a general Exceptions Guidelines best practices that I have come up with after extensive research of the subject. Most of these guidelines can be used to all projects in any language that supports exceptions. Some of the guidelines will be Java specific. At the end you need to have a robust set of guidelines that can help you handle and deal with exceptions and error conditions.

Advantages of exceptions

I won’t go into details about the advantages of exceptions. But it mainly boils down to these three benefits:

  • Separate error handling code from regular code
  • Propagate errors up the call stack
  • Grouping and differentiating error types

Three types of exceptions in Java

Find more details here.

  1. Checked exceptions – these exceptions are checked at compile time and methods are forced to either declare the exception in its throws clause or catch it and handle it.
  2. Unchecked exceptions – these exceptions are checked at runtime so no compilation warnings will be issued if they are not handled. This is the behavior in C++ and PHP as well.
  3. Error – these are unrecoverable exceptions that the JVM can throw such as out of memory exceptions. These should not be caught.

Exceptions architecture and design

Your exception handling should be designed carefully from the start. A bad exception handling design can make your project hard to debug, buggy and inflexible to future changes that will slow down your development lifecycle. Here are some tips on good exceptions design that I have found useful.

Keep the number of custom exceptions to a minimum

As little as necessary to get the job done. Having many unnecessary custom exceptions makes your code bloat in size and harder to maintain. Also your API clients might get overwhelmed and mishandle the exceptions which defeats their purpose. Generic exceptions should cover most of the error use cases.

For unrecoverable errors don’t throw specific exceptions when a generic exception, such as a RuntimeException or a generic unchecked custom exception will do the job. Your clients can’t do anything about this exception. Why force them to worry about a specific exception. Specific custom exceptions should be reserved when different behavior is necessary to handle an error condition.

Limit the use of checked exceptions

Limit the amount of checked exceptions to the minimum and carefully consider each and every use case. There is a lot of controversy surrounding Java’s checked exceptions. There are many people that think checked exceptions are of dubious value. It forces the clients of your API to have a ton of boilerplate code to handle these exceptions. If the exception is not recoverable then throw a generic unchecked exception. If the exception is recoverable then think about either using a specific checked exception (if really needed) or a specific unchecked exception. This way the client can decide on its own if the exception needs specific handling or not. For example an ElementNotFoundException if the database doesn’t contain an ID you are looking for.

Think twice before you define interfaces that have checked exceptions in their method signatures. If later you have some implementations that don’t throw an exception, then you’ll still be forced to throw the interface defined checked exceptions. Your clients would be forced to have boilerplate exception handling code that is completely useless, because they’ll be handling exceptions that are not even thrown. A better approach would be to use unchecked exceptions which will leave the decision of handling the exceptions up to the client.

In general think really hard if you even need checked exceptions in your code. Only explicitly recoverable situations warrant the need of checked exceptions. And even then you can still get by without them. The overhead of handling checked exceptions is significant because of the boilerplate code your clients will need to have. They also make refactoring your code more difficult by making you update all your method definitions that pass through those checked exceptions.

Don’t expose internal, implementation specific details to your clients

Avoid exposing internal implementation specific exceptions to your clients, especially those contained in a third party library. This is a general object oriented rule of thumb and it’s as valid for your exceptions hierarchy design. You have no control over the third party library which can change its exceptions signatures and break all of your API contracts with your clients. Instead wrap those third party exceptions (such as an SQLException) in your own custom exceptions. This way you’ll have much greater flexibility to change the third party library in the future without breaking your clients’ API contract.

Create your own exceptions hierarchy for complex projects

Generally speaking create your own exceptions hierarchy for more complex modules especially if you are dealing with implementation specific exceptions in third party libraries. Each of your packages/modules could have its own top-level generic exceptions. For Java at least one should be defined that inherits from RuntimeException. Wrap all implementation specific exceptions in your custom exceptions so that your clients should only depend on your custom exceptions and/or generic Java exceptions. This will give you greater flexibility to refactor the implementation specific code later without breaking your API contracts.

If more fine grained error handling is necessary then you can further subclass your custom exceptions to handle specific cases and allow for error recovery. For example if you are connecting to an SQL database you can throw a ConnectionTimeoutException in such a way that if needed the client can retry the connection N times before giving up. This way you can later change your database engine to NoSQL and still allow for reconnects and the client code will stay the same.

Document all exceptions

Carefully document all exceptions your package/module/app throws in the javadoc definition of each public method. Failing to do so will frustrate your API users and cause them to not trust your API docs. You don’t really want your clients to dig in your source just to find out that you are throwing a specific exception, right?

Throw exceptions as early as possible.

Check all inputs to your public API methods and throw an exception as soon as you find inconsistencies between your expected parameters and what has been supplied. The earlier you throw an exception the less will be the chance of data corruption, because bad data won’t make it into the deeper parts of your code. It also gives valuable feedback to your clients in a timely manner instead of deep in your code where something throws an obscure exception with a bad message such as ‘Internal Error’ or NullPointerException.

Log exceptions properly

Follow the guidelines of your logging framework in order to properly log exceptions with their message and stack trace. You don’t want to loose either.

Add more context to thrown exceptions

Every time you can add more context to a thrown exception do it! It will be invaluable in the debugging stage. Different contexts can add their own information to a thrown exception by extending the thrown exception’s message or wrapping the exception in a more granular custom exception. Follow the throw path of exceptions through your code and make sure that important information is contained in the exception class or in the exception message so that your clients can properly document or recover from the exception.

Follow the principle of handle-or-propagate:

  • Don’t just catch and re-throw exceptions for no reason. This has significant performance penalties to your code and it is of no use to anybody.
  • Don’t catch, log and re-throw exceptions. This will probably cause you to log the same exception multiple times which should be avoided. It can lead to filling your logs with multiple entries for the same exception. There is nothing worse than a bloated log you have to go through in order to find what went wrong.
  • Don’t ever swallow an exception without proper comment in the code for why you are doing it! Explain why you are not even logging it!
  • Only catch exceptions if you need to extend their error information or handle them. Otherwise let them propagate.
  • Log exceptions once and only once!

Handle all exceptions at the top level of your code

At the top level of your code handle all propagated exceptions correctly. This means to order your catch clauses from the most specific to the most general. You can use multi-catch statements to reduce the boilerplate code you need to write. Make sure that every exception you catch here is logged with the appropriate log level. At the end make sure that your users receive notification for each exception received. Your users should know if something bad has happened and if they can do anything about it.

Some exceptions should cause your program to fail so don’t swallow them! Log them and quit.

Don’t catch top-level exceptions

Don’t catch the top-level exception classes such as Throwable, Exception or RuntimeException directly in your API, unless you really know you need to and you are at the very top of your code base (your main method and top level server/daemon code). These exception classes contain many unrecoverable exceptions that can’t be handled safely. This is especially true for the Throwable class which also catches Error exceptions that the JVM might throw such as out of memory exceptions. Catching these exceptions and continuing to run your application may result in data corruption and undefined behavior.

Make sure that your catch() statements are ordered correctly from most specific to most general. You can use multi-catch statements to group exceptions that need the same treatment like logging an error:

catch( Exception1 | Exception2 | Exception3 e) {
    logger.log( "Bad Exception: " + e.getMessage(), e );

Use the common parent if multiple exceptions can be thrown and handled the same way. For example catching IOException will catch automatically FileNotFoundException because it inherits form IOException.

Avoid catching the Throwable class! Bad things will happen when you start catching unrecoverable JVM exceptions.

General rules when working with exceptions

This section will give you general tips on how to deal with exceptions. It should extend and compliment your exceptions design and architecture.

When to consider checked exceptions

Choose checked exceptions only if the client can or should do something to recover from an error. For example if the client specifies a file that doesn’t exist he should be notified to correct this.

Make sure that the checked exception is a specific exception rather than the ‘Exception’ class itself. Otherwise your clients will be forced to catch more exceptions than they intend to handle. For example FileNotFoundException is a specific exception.

The checked exception can be a more general parent exception class such as IOException which is the parent of many IO related exceptions.

For checked exceptions follow the Catch or Specify principle. It means that every method that throws checked exceptions should be wrapped in a try…catch block or the exception needs to be reported in its throws clause.

Wrap lower level checked exceptions in unchecked exceptions

For example if you receive an SQLException in your database handling class you can wrap it in a RuntimeException and throw it. This way you won’t expose unnecessary internal implementation details to higher level clients. But a good rule of thumb is to create your own, package specific, exception that inherits from RuntimeException and use it instead. This way other RuntimeExceptions can still be propagated, but your clients can take specific actions based on this particular exception type.

Preserve encapsulation when converting exceptions from one to another

Always specify the original exception as part of the new exception so the stack trace is not lost. This will help greatly during debugging and root cause analysis.

Choose good error messages

When throwing exceptions, always specify a readable string with the proper description of the error and any additional data that might be helpful to debug the problem further. Such as missing IDs, bad parameters etc. Go through the stack of the thrown exception and figure out if any other place can add additional information to the exception. Adding additional information to the exception at the right context level can greatly enhance the usefulness of your logging.

Document Exceptions

This helps the next guy. It also helps you in a couple of months when you have forgotten about the code you wrote!

Clean up after yourself

Before throwing an exception make sure you clean up any resources that are created in your try block and in your methods. Do this either with try-with-resources or in the finally section of a try…catch…finally block.

Make sure that your objects are in a good state when an exception is thrown. If things need to be deallocated or cleaned up then do it before the throw. If the exception is recoverable make sure that the object from which the exception is thrown is re-initialized correctly to handle a retry.

Don’t ignore exceptions thrown in a thread. If an InterruptedException is thrown in a thread make sure that the thread is properly shutdown and cleaned up in order to avoid data corruption.

Don’t throw exceptions from a finally block

This would cause all exceptions thrown in the try block to be lost and only the finally block exception will be propagated. It is better to handle the finally block exception in the finally block and not let it propagate. This way the exception in the finally block can be logged and dealt with immediately and any other exception that was thrown from the try {} block will be correctly propagated up the call stack.

Don’t use exceptions for flow-control

It slows down your code and makes it less maintainable. It is considered very bad practice.

Don’t suppress or ignore exceptions

It creates hard to find bugs in your code!

Only catch specific exceptions you can handle

Let all other exceptions propagate to a place where they can be handled.

Log exceptions just once

It sucks to debug a program and find the same message appearing multiple times from different places. It will make the log files cleaner and easier to use for debugging.

Don’t create custom exceptions if you can avoid it

For simple projects try not to create custom exceptions if they don’t provide any additional data to the client code. A descriptive name is nice to have but it’s not that helpful. The standard JAVA API has hundreds of exceptions already predefined. Look through those first and see if they fit your business needs.

On the other hand, if state information is needed to be bundled with your exception then use it. For example if a file can’t be opened you can specify the file name, path, permissions, type (symlink, regular file, etc) as separate variables in a CustomException class. Exceptions are regular classes and they can have their own variables and methods.

If you need to create a custom exception then follow the principle of: name the problem not the thrower. Keep in mind the exception should reflect the reason of the problem instead of the operation causing the problem, i.e., FileNotFoundException or DuplicateKeyException instead of CreateException from a create() method. You can already see it is coming from the create() method in the stack trace

When to create custom exceptions

For more involved projects you can create a class hierarchy of exceptions that belong to your package/module/project and use this instead of using the Java provided exceptions. Be careful to not create too many custom exceptions, but the bare minimum to get your errors handled. In most cases a general unchecked custom exception will do just fine so think carefully if you really need anything else.

For example, start with a subclass of the checked Exception class and the unchecked RuntimeException class. If a more specific exception is needed by some of your modules then create it. This should be rare though. A good rule of thumb is to use custom specific exceptions if a specific recovery operation is needed or the client needs to do something different in case of a particular exception. But try to limit the amount of custom exceptions to the bare minimum. Normally having just one of each should do the job in most cases. This way you can allow for substantial refactoring of your code with minimum to no changes to your API client contracts.

Exceptions in Unit Tests

Specific exceptions are great for reducing the amount of false positives during unit testing. If a particular method can throw an IllegalArgumentException because of many different things then how do you know which one caused the exception to be thrown? If you have finer grained control over this you can eliminate the false positives by catching specific exceptions linked to specific issues in that method in your unit tests.

This is tedious but allows for good unit tests. For example a method expects three parameters: name, path, attribute. We can throw three runtime exceptions that inherit from IllegalArgumentException: IllegalNameException, IllegalPathException and IllegalAtrributeException. Those are runtime exceptions which you can specifically check during unit testing. This way you can be certain that your throw statements are being executed as expected in your unit tests.

Properly notify the end user

At the top level of your application throw a generic exception to the end users, notifying them about the error that happened. Make sure that all other error cases have been correctly covered and resources freed.

Final thoughts

Thanks for reading this far! I hope the tips here will be useful to other software engineering projects. Please post your thoughts and ideas in the comments below. It will help evolve the material and keep it current.


This information has been gathered from different source. The following resources are the ones explicitly mentioned: