Jul 21 2014
 

I’ve been playing around with Heroku at work for the past week or two. Heroku is pretty awesome if you want to get an app up and running quickly. Heroku supports Java and offers a few Java templates. Their current Java offering uses Spring 3 and Tomcat 7.0.54 with Java 7. However, that version of Spring is somewhat old, and they use OpenJDK’s Java instead of Oracle’s Java. I wanted to try out Java 8 with a newer version of Spring, so I upgraded the existing template to support both (I used a forked version of a custom buildpack for Java 8). I also had to update Heroku’s Web Runner to use Tomcat 7.0.54 (I have a pull request waiting, but I’m not sure if or when it will get approved, so I have an artifact on GitHub that Maven can pull).

You can check out the template here.

Jun 26 2014
 

In 2005 my National Guard unit was deployed to Iraq as part of Operation Iraqi Freedom. My MOS (Military Occupational Specialty) in the Army was 92A, which is basically a logistics and supplies specialist. My job was to order parts for mechanics, pick them up, return old parts, manage HAZMAT, dispatch/return vehicles from missions, and handle licenses. I also did a few other things that I don’t remember right now. Anyway, at the time, the heart of this system was a tool called ULLS-G (Unit Level Logistics System – Ground). I say “at the time” because shortly after we came back, ULLS-G was replaced by SAMS-E (Standard Army Maintenance System – Enhanced), which incidentally uses Oracle as a back-end database. Compared to SAMS-E, ULLS-G was a dinosaur. I had used it quite a bit, of course, having been in the Army for about 4 years by the time I was deployed. It was a complete pain to use. ULLS-G was a DOS application (yes, MS-DOS) and most of the computers I used it on at the armory were only running DOS (this was circa the early 2000s, so it wasn’t too uncommon to still see DOS systems around). By the time I was deployed, most computers were running WinXP/2K or something like that, so you could run ULLS-G in “MS-DOS compatibility mode”.

Continue reading »

Dec 12 2013
 

A few weeks ago, I ran into a puzzling issue at work. Someone was uploading an image which made it past our file-size checks, but caused an OutOfMemoryError and a heap dump when the code attempted to resize it. The heap dump showed that the code was trying to allocate memory for an image whose resolution was 18,375×11,175. What didn’t make sense was how this image was even getting through our file-size checks, because there is no way we would ever let in an image of that size.

In the code, we have a global limit for the largest file we will accept. We also have a separate limit for image uploads. If the image is over this size, but under the global limit, we will resize the image to a smaller size. The strange part was that the large image was making it past the global check, which meant that the size of the incoming data was below the global limit, but above the image-size limit. How could this be?

On a hunch, I hypothesized that perhaps the entire image wasn’t making it through. Perhaps only a part of the image was making it through, with its header left intact. An image file usually starts with a header that conveys information about the file format, the color space, and the resolution of the image. I figured that even though the data were incomplete, enough information was present in the header for the resizing code to make sense of it. When the code tries to allocate memory for this image, based on the resolution it gets from the header, it runs out of memory!

To verify this, I manually hex-edited a file of acceptable size so that its header declared a ridiculously large resolution. I then uploaded this file and was able to reproduce the behavior even though the file was of acceptable size! So what’s the lesson here? Don’t rely only on file-size limits for images; you have to look at the resolution as well!
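To illustrate the idea, here is a minimal sketch of a resolution check that reads only the header. It is written in JavaScript rather than in the language of the code from the story, and it assumes the upload is a PNG, where the declared width and height sit at fixed offsets in the IHDR chunk; the post does not say which formats were involved. The names readPngDimensions, isResolutionAcceptable, and MAX_PIXELS are hypothetical, and the limit itself is an arbitrary value chosen for illustration.

```javascript
//Hypothetical limit on total pixels; the value is arbitrary and only for illustration.
var MAX_PIXELS = 50 * 1000 * 1000;

//Every PNG file starts with these eight bytes.
var PNG_SIGNATURE = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A];

/**
 * Reads the width and height declared in a PNG's IHDR chunk without decoding
 * any pixel data. bytes is a Uint8Array holding at least the first 24 bytes.
 */
function readPngDimensions(bytes) {
    if (bytes.length < 24) {
        throw new Error("Too short to contain a PNG header");
    }
    for (var i = 0; i < PNG_SIGNATURE.length; i++) {
        if (bytes[i] !== PNG_SIGNATURE[i]) {
            throw new Error("Not a PNG file");
        }
    }

    //Bytes 12-15 hold the type of the first chunk, which must be IHDR.
    var type = String.fromCharCode(bytes[12], bytes[13], bytes[14], bytes[15]);
    if (type !== "IHDR") {
        throw new Error("First chunk is not IHDR");
    }

    //Width and height are 4-byte big-endian integers at offsets 16 and 20.
    var view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    return {
        width: view.getUint32(16, false),
        height: view.getUint32(20, false)
    };
}

/**
 * Returns true if the declared resolution is within MAX_PIXELS. This only
 * looks at what the header claims, so it works on truncated files too.
 */
function isResolutionAcceptable(bytes) {
    var dims = readPngDimensions(bytes);
    return dims.width * dims.height <= MAX_PIXELS;
}
```

A check like this would run before any attempt to decode or resize the image, alongside the existing file-size checks rather than in place of them.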

Nov 14 2013
 

Yesterday, I came across an interesting question on StackOverflow. The question is as follows: assuming you have a sorted array, is it possible to increment locations by 1 (one at a time) and still keep the array sorted, in O(1) time per increment? Assuming you are not allowed to use any secondary data structures, the answer is no. This becomes evident when you have repeated values. Consider the degenerate case of an array of identical values, say [0, 0, 0, 0, 0]. If you increment the value in location 0, you now have [1, 0, 0, 0, 0]. For this array to be sorted, the value 1 must be moved to the end of the array (index 4). This requires four comparisons (n - 1 in general), which means that each increment is O(n) in the worst case.
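The straightforward approach described above can be sketched as follows. This is only an illustration of the O(n) behavior, not code from the StackOverflow question; the function name incrementSorted is hypothetical, and the comparison counter exists purely to make the cost visible.

```javascript
/**
 * Increments sorted[index] by 1, then restores ascending order by swapping
 * the incremented value to the right past every smaller neighbor.
 * Returns the number of comparisons that were needed.
 */
function incrementSorted(sorted, index) {
    sorted[index] += 1;

    var comparisons = 0;
    var i = index;
    while (i + 1 < sorted.length) {
        comparisons++;
        if (sorted[i] <= sorted[i + 1]) {
            break; //Already in order; nothing more to do.
        }

        //Swap the incremented value with its smaller right-hand neighbor.
        var tmp = sorted[i];
        sorted[i] = sorted[i + 1];
        sorted[i + 1] = tmp;
        i++;
    }

    return comparisons;
}

var values = [0, 0, 0, 0, 0];
var comparisons = incrementSorted(values, 0);
//values is now [0, 0, 0, 0, 1] and comparisons is 4
```

With no repeated values next to the incremented location the loop stops after a single comparison, so the linear cost only shows up when there is a run of equal values.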

How can we do better?

Continue reading »

Sep 24 2013
 

I am at JavaOne! The last time I was here was in ’08, when it was run by Sun. Of course, it’s run by Oracle now. The first day has been pretty good. I attended sessions on Garbage Collection, the Nashorn JavaScript engine, writing DSLs, and the parallelization options offered by JDK 7 and 8 to leverage multicore processors. Pretty good first day!

Nashorn looks pretty interesting and I will be giving it a closer look when I come back home. They are also looking for people to help out so I am going to see if I can contribute anything useful.

Today I will be checking out a talk on Lambdas by Brian Goetz and will also be going to a talk on Big Data. Finally I also have some BOF (birds of a feather) sessions on evolutionary algorithms and writing parsers in Scala. Pretty interesting day and I’m looking forward to it!

Jul 14 2013
 

Recently I’ve noticed that YouTube’s performance on my machines has been terrible. It’s constantly buffering, or it will stop randomly in the middle of a video. I’ll get a few seconds of playback and then 10-30 seconds of buffering. On Windows I have been able to use the helpful workaround from here and performance has definitely improved. On top of that, I’m also using the SmartVideo plugin on Chrome (it’s also available for Firefox). But on my Linux boxes, I’m still having the same problem in spite of having the SmartVideo plugin. There is a Linux alternative to the guide above, but it uses the ipfw program, which is not natively available on Ubuntu/Linux (at least from my understanding) because it is a BSD program. I didn’t want to compile it and install it from source, so I decided to use ufw instead, which is the “Uncomplicated Firewall” that comes with Ubuntu. It was pretty simple to convert the rules over. But first you will need to enable ufw (if you haven’t already). You can do that with:

sudo ufw enable

Then you can enable logging also, if you want:

sudo ufw logging on

If you SSH into your machine or if you use your machine as a webserver, you will need to enable a few more rules:

sudo ufw allow ssh/tcp
sudo ufw allow http/tcp
sudo ufw allow 8080/tcp

And of course, you can add the rules that will prevent your ISP from caching YouTube:

sudo ufw deny from 173.194.55.0/24
sudo ufw deny from 206.111.0.0/16

You can then use ufw status to verify that your rules are in place:

$ sudo ufw status
Status: active

To                         Action      From
--                         ------      ----
22/tcp                     ALLOW       Anywhere
80/tcp                     ALLOW       Anywhere
8080/tcp                   ALLOW       Anywhere
Anywhere                   DENY        173.194.55.0/24
Anywhere                   DENY        206.111.0.0/16
22/tcp                     ALLOW       Anywhere (v6)
80/tcp                     ALLOW       Anywhere (v6)
8080/tcp                   ALLOW       Anywhere (v6)

Jun 06 2013
 

I have been working on a project for the last few days that deals with rendering PDFs in-browser. Initially, I was going to parse the PDF and extract the text content, but then I ran into pdf.js, which is a library developed by Mozilla for rendering PDFs in-browser via JavaScript. The project I am working on has a requirement that users should be able to select text within the PDF. This is possible using pdf.js. Unfortunately, the example code only shows you how to render a PDF, but not how to enable text-selection. I wasn’t able to find any API access to enable text-selection either. I finally ended up on the #pdfjs IRC channel and the friendly folks there gave me some direction. The logic for enabling text-selection was buried inside the code for Mozilla’s PDF viewer, and was heavily intertwined with the viewer code as well. I spent a few days playing around with the viewer and tracing through the code. I was stumped many times since the code was complex and I know jack about parsing PDFs. But eventually I was able to focus on the part of the code that actually took care of enabling text-selection.

pdf.js’ approach to enabling text-selection is actually quite clever. The library overlays divs over the PDF, and these divs contain text that matches the PDF text that they are floating over. So when you select the text, you are actually selecting the text inside the overlaid divs. This was fine and dandy, but I was still stuck as far as getting this to work on my project. What I needed was a minimal example that I could adapt for my uses. After a day or two of tracing code, experimenting, debugging, and staring at the screen in frustration, I was eventually able to come up with a minimal example! To accomplish this, I extracted code that was relevant to creating the overlays out of the viewer code, into its own independent file. I also removed a lot of code that was dependent on the viewer itself. Keep in mind that this example doesn’t have functionality like text finding or matching, and that code is also heavily intertwined with the viewer code. All this example does is render a PDF with text-selection enabled. However, I think this is a good start!

If you are interested, you can check out the code on GitHub and a working example on this fiddle.

The pertinent code is as follows (keep in mind you still require additional resources; all of that information is available on GitHub):

window.onload = function () {
    var pdfBase64 = "..."; //base64 representing the PDF

    var scale = 1.5; //Set this to whatever you want. This is basically the "zoom" factor for the PDF.

    /**
     * Converts a base64 string into a Uint8Array
     */
    function base64ToUint8Array(base64) {
        var raw = atob(base64); //This is a native function that decodes a base64-encoded string.
        var uint8Array = new Uint8Array(new ArrayBuffer(raw.length));
        for (var i = 0; i < raw.length; i++) {
            uint8Array[i] = raw.charCodeAt(i);
        }

        return uint8Array;
    }

    function loadPdf(pdfData) {
        PDFJS.disableWorker = true; //Not using web workers. Not disabling results in an error. This line is
        //missing in the example code for rendering a pdf.

        var pdf = PDFJS.getDocument(pdfData);
        pdf.then(renderPdf);
    }

    function renderPdf(pdf) {
        pdf.getPage(1).then(renderPage);
    }

    function renderPage(page) {
        var viewport = page.getViewport(scale);
        var $canvas = jQuery("<canvas></canvas>");

        //Set the canvas height and width to the height and width of the viewport
        var canvas = $canvas.get(0);
        var context = canvas.getContext("2d");
        canvas.height = viewport.height;
        canvas.width = viewport.width;

        //Append the canvas to the pdf container div
        var $pdfContainer = jQuery("#pdfContainer");
        $pdfContainer.css("height", canvas.height + "px").css("width", canvas.width + "px");
        $pdfContainer.append($canvas);

        //Create the text layer div first, because the HiDPI scaling code below refers to it
        var canvasOffset = $canvas.offset();
        var $textLayerDiv = jQuery("<div />")
            .addClass("textLayer")
            .css("height", viewport.height + "px")
            .css("width", viewport.width + "px")
            .offset({
                top: canvasOffset.top,
                left: canvasOffset.left
            });

        $pdfContainer.append($textLayerDiv);

        //The following few lines of code set up scaling on the context if we are on a HiDPI display
        var outputScale = getOutputScale();
        if (outputScale.scaled) {
            var cssScale = 'scale(' + (1 / outputScale.sx) + ', ' +
                (1 / outputScale.sy) + ')';
            CustomStyle.setProp('transform', canvas, cssScale);
            CustomStyle.setProp('transformOrigin', canvas, '0% 0%');

            if ($textLayerDiv.get(0)) {
                CustomStyle.setProp('transform', $textLayerDiv.get(0), cssScale);
                CustomStyle.setProp('transformOrigin', $textLayerDiv.get(0), '0% 0%');
            }
        }

        context._scaleX = outputScale.sx;
        context._scaleY = outputScale.sy;
        if (outputScale.scaled) {
            context.scale(outputScale.sx, outputScale.sy);
        }

        page.getTextContent().then(function (textContent) {
            var textLayer = new TextLayerBuilder($textLayerDiv.get(0), 0); //The second argument (0) is an index
            //identifying the page. It is set to page.number - 1.
            textLayer.setTextContent(textContent);

            var renderContext = {
                canvasContext: context,
                viewport: viewport,
                textLayer: textLayer
            };

            page.render(renderContext);
        });
    }

    var pdfData = base64ToUint8Array(pdfBase64);
    loadPdf(pdfData);
};
Mar 31 2013
 

I’ve known about tries for some time. They’re a pretty neat data structure for storing and looking up strings. I decided to try and implement one in Java so that I can learn more about them. I’ll post another article later that goes into some more detail about this implementation, but for now you can check out the source here. It’s not production-ready by any means; it’s just me playing around.
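For readers who haven’t seen one, here is a minimal sketch of the general idea. It is written in JavaScript and is not the Java implementation linked above; the names Trie, insert, contains, and startsWith are chosen here purely for illustration. Each node holds a map from a character to a child node, plus a flag that marks whether the path from the root to that node spells a complete stored string.

```javascript
function Trie() {
    this.root = { children: {}, isWord: false };
}

//Adds a string to the trie, creating nodes along the path as needed.
Trie.prototype.insert = function (word) {
    var node = this.root;
    for (var i = 0; i < word.length; i++) {
        var ch = word.charAt(i);
        if (!node.children.hasOwnProperty(ch)) {
            node.children[ch] = { children: {}, isWord: false };
        }
        node = node.children[ch];
    }
    node.isWord = true;
};

//Walks the trie along the given string; returns the final node, or null if the path does not exist.
Trie.prototype.findNode = function (prefix) {
    var node = this.root;
    for (var i = 0; i < prefix.length; i++) {
        var ch = prefix.charAt(i);
        if (!node.children.hasOwnProperty(ch)) {
            return null;
        }
        node = node.children[ch];
    }
    return node;
};

//True only if this exact string was inserted.
Trie.prototype.contains = function (word) {
    var node = this.findNode(word);
    return node !== null && node.isWord;
};

//True if any inserted string begins with the given prefix.
Trie.prototype.startsWith = function (prefix) {
    return this.findNode(prefix) !== null;
};
```

Both lookups take time proportional to the length of the string being looked up, regardless of how many strings are stored.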
