Getting JPG dimensions with AS3 without loading the entire file
Does this sound familiar: you're loading a JPG file and want to know its dimensions before it has finished loading? This is useful if, for example, you're drawing a border or background that will contain the image. If you don't know what size the image is, you don't know what size the box should be. I'm sure there are better and more common uses for this, but I know for sure it can be really useful.
Anyway, I decided to solve this problem. If you don't care about how it works, fine, I'll give you the short version first. After all, the class I've created does all the heavy lifting for you, so if you just want it to work, you can use it without worrying about what's under the hood.
Here's the class: JPGSizeExtractor.as. Download it and put it in a folder com/anttikupila/utils in your classpath. Then you can just use it like this:
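    import flash.events.Event;
    import com.anttikupila.utils.JPGSizeExtractor;

    var je : JPGSizeExtractor = new JPGSizeExtractor( );
    je.addEventListener( JPGSizeExtractor.PARSE_COMPLETE, sizeHandler );
    je.extractSize( "your_jpg_file.jpg" ); // the URL is passed as a string

    // Called as soon as the dimensions have been read from the stream
    function sizeHandler( e : Event ) : void {
        trace( "Dimensions: " + je.width + " x " + je.height );
    }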
To trace out debug information, you can say je.debug = true;
What it actually does is start loading the JPG file and analyze every byte as it arrives. When it finds the JPG's frame header (SOF0 in the JPEG specs), it can determine the dimensions of the file and close the stream. The dimensions in a JPG are stored before the actual image data, meaning that you can point it at a 10 MB file and get its dimensions in a fraction of a second, instead of waiting for the entire file to load.
Note: This doesn't work for all files. Some files I tried did not follow the JPEG specs (or I misunderstood something, which is more likely, since other software showed the correct dimensions). Don't use this in mission-critical projects unless you have control over the JPG files. I take no responsibility for it working correctly.
Download the class, with an example
Not so much interested in “the what”, but more in “the how”? Read on. Let's see what we can do about that.
What we need to do is look at the JPEG specs to see where the image dimensions are stored, then go to that place in the file and read out those dimensions. Simple enough, eh? Well, it pretty much is, but there are some bumps in the road.
First we have to know how a JPEG file is structured. Luckily JPEG is a very common file format, and there's plenty of documentation online. Fire up a hex editor and open a JPG image to see what's actually happening. Note that 0x marks a hexadecimal number: 0x0A = 10, 0xA0 = 160.
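If hex notation is new to you, a few lines of ActionScript make the conversions concrete. This is only an illustration using the built-in `parseInt` and `toString`; the values are the ones used in this post.

```actionscript
// Hex literals are ordinary numbers, just written in base 16
trace( 0x0A );                                   // 10
trace( 0xA0 );                                   // 160

// Converting between hex strings (as shown in a hex editor) and numbers
trace( parseInt( "C8", 16 ) );                   // 200
trace( ( 200 ).toString( 16 ).toUpperCase( ) );  // C8
```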
This is the basic structure (a jpg can contain more sections, but these are the most common ones):
| Identifier | Name | Description |
|---|---|---|
| 0xFF 0xD8 | SOI | Start of Image |
| 0xFF 0xE0 | APP0 | First JFIF segment |
| 0xFF 0xEn | APPn | An application-specific segment, where n = 1 to F |
| 0xFF 0xDB | DQT | Define Quantization Table |
| 0xFF 0xC0 | SOF0 | Start of Frame 0 |
| 0xFF 0xC4 | DHT | Define Huffman Table |
| 0xFF 0xDA | SOS | Start of Scan |
| 0xFF 0xDD | DRI | Define Restart Interval |
| 0xFF 0xD9 | EOI | End of Image |
If we take a closer look at SOF0's structure, we'll notice something we want to see..
- 0xFF 0xC0 (SOF0 identifier)
- length (high byte, low byte), 8+components*3
- data precision (1 byte) in bits/sample, usually 8 (12 and 16 not supported by most software)
- image height (2 bytes, Hi-Lo), must be >0 if DNL not supported
- image width (2 bytes, Hi-Lo), must be >0 if DNL not supported
- number of components (1 byte), usually 1 = grey scaled, 3 = color YCbCr or YIQ, 4 = color CMYK
- for each component: 3 bytes
  - component id (1 = Y, 2 = Cb, 3 = Cr, 4 = I, 5 = Q)
  - sampling factors (bit 0-3 vert., 4-7 hor.)
  - quantization table number
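To make the layout above concrete, here is a minimal sketch that reads those fields in order from a `ByteArray`. The helper name `readSOF0` and the hand-built sample bytes are assumptions made for illustration; they are not part of the JPGSizeExtractor class.

```actionscript
import flash.utils.ByteArray;

// Hypothetical helper: reads the SOF0 fields from a ByteArray positioned
// right after the 0xFF 0xC0 identifier. ByteArray is big-endian (Hi-Lo) by default.
function readSOF0( bytes : ByteArray ) : Object {
    var segmentLength : uint = bytes.readUnsignedShort( ); // 8 + components * 3
    var precision : uint = bytes.readUnsignedByte( );      // bits/sample, usually 8
    var imageHeight : uint = bytes.readUnsignedShort( );   // height is stored first
    var imageWidth : uint = bytes.readUnsignedShort( );    // then the width
    var components : uint = bytes.readUnsignedByte( );     // 1 = grey, 3 = color
    return { length : segmentLength, precision : precision,
             width : imageWidth, height : imageHeight, components : components };
}

// Hand-built SOF0 payload for a 200x200 three-component image
var sample : ByteArray = new ByteArray( );
sample.writeShort( 0x0011 ); // length: 8 + 3 * 3 = 17
sample.writeByte( 8 );       // precision
sample.writeShort( 200 );    // height
sample.writeShort( 200 );    // width
sample.writeByte( 3 );       // components
sample.position = 0;

var info : Object = readSOF0( sample );
trace( info.width + " x " + info.height ); // 200 x 200
```

The per-component triplets that follow are not read here, since they aren't needed for the dimensions.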
So, what we want to look for is SOF0 and from there get the width & height of the image. SOF0 is identified by 0xFF 0xC0, so if we find that, we're pretty much done. The following image shows you the header of an actual 200x200px jpg file (0x00 0xC8 = 200) with the header info highlighted.

The other info we don't really need, but having it makes it easier to find the correct address in the file. We now know where and how the information is stored; we just need to find it.
What we want to look for is 0xFF 0xC0, but since the length in most cases will be 0x11 (17) bytes (8 + 3x3, as we're assuming a three-component color image) and the bit depth is 8 bits (12-bit JPGs exist in medical imaging, but they are rarely supported by consumer software), we can require those too. In other words, we can look for 0xFF 0xC0 0x00 0x11 0x08. The trade-off is that a file with a different component count won't match this signature: a greyscale JPG, for example, has a length of 8 + 1x3 = 0x0B.
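The "Hi-Lo" byte pairs in the header are plain big-endian numbers: the high byte counts in units of 256. A tiny worked example, with `toShort` being a hypothetical helper name (in practice `readUnsignedShort( )` does this for you):

```actionscript
// Combine a Hi-Lo byte pair into one number: hi * 256 + lo
function toShort( hi : uint, lo : uint ) : uint {
    return ( hi << 8 ) | lo;
}

trace( toShort( 0x00, 0xC8 ) ); // 200, the value from the 200x200 file above
trace( toShort( 0x01, 0x90 ) ); // 400
```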
    protected static const SOF0 : Array = [ 0xFF, 0xC0, 0x00, 0x11, 0x08 ];
What we do is simply to examine every incoming byte and check if it matches with the SOF0's first byte. If it does, we take the next byte, and the next, until we have matched all the bytes in the header and know we're at the correct place. After this we can just read the height & width of the file.
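    var index : int = 0;
    while ( bytesAvailable >= SOF0.length + 4 ) {
        var byte : int = readUnsignedByte( ); // consume the next byte of the stream
        if ( byte == SOF0[ index ] ) {
            index++;
            if ( index >= SOF0.length ) {
                // Whole signature matched: height is stored first, then width
                jpgHeight = readUnsignedShort( );
                jpgWidth = readUnsignedShort( );
                break;
            }
        } else {
            // Mismatch: start over, letting this byte begin a new match
            index = ( byte == SOF0[ 0 ] ) ? 1 : 0;
        }
    }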
The code is pretty straightforward if you've worked with ByteArrays before. So that we don't read past the part of the file that has loaded so far (we're running this on progress, not on complete, so we only know that a portion of the file has loaded), we check that enough bytes are available. We create a loop that compares every byte with the array and moves on to the next byte if a match is found. When the position matches the length of the array, we know that we've matched the entire header and can read out the dimensions. As you can see from the image and specs earlier in this post, the dimensions are stored as shorts (2 bytes each, hence the + 4 in the while loop). If you try this, it actually works..
..in most cases. The problem is that a JPG file may contain more than just one image. Why? Well, a JPG may have other info embedded, such as thumbnails, which are also stored as JPGs inside the JPG. This means that if we just pick the first SOF0, we might end up with a thumbnail's SOF0. Photoshop's "Save for Web", for example, doesn't embed a thumbnail, while the normal "Save As" does. Hence you might get different (wrong) results on the normally saved file if you just pick the first SOF0. Now, how can we solve this?
Taking a closer look at the JPEG specs, we notice something called APPn segments. These are segments that contain additional info, such as EXIF information. The thumbnail is in here, if it exists. So what we need to look for first is the APPn sections, and if we find any, skip them, as all the image info in there is wrong for our purposes anyway. Like most other JPEG segments, an APPn segment's identifier is always followed by the segment's length, so when we find an APPn identifier, we can read the next short and skip that many bytes (minus 2, since the length counts its own two bytes). Seems simple enough.
Here's the complete function from the class, which does just that:
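The skip-by-length idea can also be shown on its own. The sketch below is not how the class works (the class scans a live stream byte by byte); it is a simplified variant that assumes the header bytes are already in a `ByteArray` and walks from segment to segment. The function name `findDimensions` and the hand-built sample data are made up for this illustration.

```actionscript
import flash.utils.ByteArray;

// Hypothetical helper: walks the segments of an in-memory JPG header and
// returns { width, height } from the first top-level SOF0, or null.
function findDimensions( bytes : ByteArray ) : Object {
    bytes.position = 0;
    if ( bytes.bytesAvailable < 2 || bytes.readUnsignedShort( ) != 0xFFD8 ) return null; // no SOI
    while ( bytes.bytesAvailable >= 4 ) {
        if ( bytes.readUnsignedByte( ) != 0xFF ) return null;  // not at a segment boundary
        var marker : uint = bytes.readUnsignedByte( );
        var segmentLength : uint = bytes.readUnsignedShort( ); // counts its own 2 bytes
        if ( marker == 0xC0 ) {
            if ( bytes.bytesAvailable < 5 ) return null;
            bytes.readUnsignedByte( );                         // precision, not needed
            var imageHeight : uint = bytes.readUnsignedShort( );
            var imageWidth : uint = bytes.readUnsignedShort( );
            return { width : imageWidth, height : imageHeight };
        }
        if ( marker == 0xDA ) return null;                     // SOS: image data starts here
        if ( segmentLength < 2 || bytes.bytesAvailable < segmentLength - 2 ) return null;
        bytes.position += segmentLength - 2;                   // skip the rest of the segment
    }
    return null;
}

// SOI, then an APP1 segment whose payload contains a decoy "FF C0", then the real SOF0
var jpg : ByteArray = new ByteArray( );
jpg.writeShort( 0xFFD8 );
jpg.writeShort( 0xFFE1 ); jpg.writeShort( 6 ); jpg.writeUnsignedInt( 0xFFC00011 );
jpg.writeShort( 0xFFC0 ); jpg.writeShort( 0x0011 ); jpg.writeByte( 8 );
jpg.writeShort( 100 ); jpg.writeShort( 200 ); jpg.writeByte( 3 );

var size : Object = findDimensions( jpg );
trace( size.width + " x " + size.height ); // 200 x 100
```

Because the APP1 segment is skipped as a whole, the decoy bytes inside it are never examined, which is exactly why skipping APPn segments avoids picking up a thumbnail's SOF0.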
    protected function progressHandler( e : ProgressEvent ) : void {
        dataLoaded = bytesAvailable;
        var APPSections : Array = new Array( );
        for ( var i : int = 1; i < 16; i++ ) {
            APPSections[ i ] = [ 0xFF, 0xE0 + i ];
        }
        var index : uint = 0;
        var byte : int = 0;
        var address : int = 0;
        while ( bytesAvailable >= SOF0.length + 4 ) {
            var match : Boolean = false;
            // Only look for new APP table if no jump is in queue
            if ( jumpLength == 0 ) {
                byte = readUnsignedByte( );
                address++;
                // Check for APP table
                for each ( var APP : Array in APPSections ) {
                    if ( byte == APP[ index ] ) {
                        match = true;
                        if ( index+1 >= APP.length ) {
                            if ( traceDebugInfo ) trace( "APP" + Number( byte - 0xE0 ).toString( 16 ).toUpperCase( ) + " found at 0x" + address.toString( 16 ).toUpperCase( ) );
                            // APP table found, skip it as it may contain thumbnails in JPG (we don't want their SOF's)
                            jumpLength = readUnsignedShort( ) - 2; // -2 for the short we just read
                        }
                    }
                }
            }
            // Jump here, so that data has always loaded
            if ( jumpLength > 0 ) {
                if ( traceDebugInfo ) trace( "Trying to jump " + jumpLength + " bytes (available " + Math.round( Math.min( bytesAvailable / jumpLength, 1 ) * 100 ) + "%)" );
                if ( bytesAvailable >= jumpLength ) {
                    if ( traceDebugInfo ) trace( "Jumping " + jumpLength + " bytes to 0x" + Number( address + jumpLength ).toString( 16 ).toUpperCase( ) );
                    jumpBytes( jumpLength );
                    match = false;
                    jumpLength = 0;
                } else break; // Load more data and continue
            } else {
                // Check for SOF
                if ( byte == SOF0[ index ] ) {
                    match = true;
                    if ( index+1 >= SOF0.length ) {
                        // Matched SOF0
                        if ( traceDebugInfo ) trace( "SOF0 found at 0x" + address.toString( 16 ).toUpperCase( ) );
                        jpgHeight = readUnsignedShort( );
                        jpgWidth = readUnsignedShort( );
                        if ( traceDebugInfo ) trace( "Dimensions: " + jpgWidth + " x " + jpgHeight );
                        removeEventListener( ProgressEvent.PROGRESS, progressHandler ); // No need to look for dimensions anymore
                        if ( stopWhenParseComplete && connected ) close( );
                        dispatchEvent( new Event( PARSE_COMPLETE ) );
                        break;
                    }
                }
                if ( match ) {
                    index++;
                } else {
                    index = 0;
                }
            }
        }
    }

    protected function jumpBytes( count : uint ) : void {
        for ( var i : uint = 0; i < count; i++ ) {
            readByte( );
        }
    }
Yep, it gets a hell of a lot more complicated because of this. But it works. I won't go through every single line here, as a lot of it is by now pretty self-explanatory. What it basically does is try to find an APPn section (0xFF 0xE1, 0xFF 0xE2, 0xFF 0xE3 etc., up to 0xFF 0xEF), and if it finds one, it reads the following short and jumps that many bytes. Because of the nature of Flash, where we're loading files over the network, we might not have that much data available yet, so we have to queue up the jump until we have the data we need. Isn't working with networks just pure fun?
Sidenote: I think it's quite weird that URLStream does not extend ByteArray, since it works almost the same way. The position property of ByteArray would have been useful here for jumping bytes, but since it doesn't exist, I had to do it with a for loop. Can anybody come up with a reason why it doesn't extend ByteArray?
Alright, we're pretty much done here. As I said in the beginning, this works almost every time. I found images that don't give any dimensions at all (looking at the files, they don't follow the specs, but they still render fine in other software, hmm..). If anybody has an idea of what's common to these files and how to solve this problem, I'd love to hear it.
What all this shows to me, in addition to being a super useful class in real-life projects, is the power of ActionScript 3. I mean, getting JPG information (one could get EXIF info with this technique?) is just one thing; you could actually read and analyze any file. With this you could, for example, bring in formats that are not natively supported by Flash. And well, a hell of a lot more. The sky is truly the limit.
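For what it's worth, the queued jump can be written without the for loop: `readBytes( )` exists on both URLStream and ByteArray (through the `IDataInput` interface), so the skipped bytes can be read into a throwaway buffer in one call. This is only a sketch of that alternative, not the code in the class; `pendingSkip`, `scratch` and `flushPendingSkip` are hypothetical names.

```actionscript
import flash.utils.ByteArray;
import flash.utils.IDataInput;

var pendingSkip : uint = 0;                 // bytes still waiting to be discarded
var scratch : ByteArray = new ByteArray( ); // throwaway buffer for skipped bytes

// Discards the queued bytes once they have all arrived.
// Returns false when more data has to load first (try again on the next progress event).
function flushPendingSkip( input : IDataInput ) : Boolean {
    if ( pendingSkip == 0 ) return true;    // nothing queued (a length of 0 would read everything)
    if ( input.bytesAvailable < pendingSkip ) return false;
    scratch.length = 0;                     // reuse the buffer
    input.readBytes( scratch, 0, pendingSkip );
    pendingSkip = 0;
    return true;
}

// A ByteArray stands in for the stream here; a URLStream can be passed the same way
var demo : ByteArray = new ByteArray( );
demo.writeUnsignedInt( 0x01020304 );
demo.position = 0;

pendingSkip = 6;
trace( flushPendingSkip( demo ) );  // false, only 4 bytes have "loaded"
pendingSkip = 3;
trace( flushPendingSkip( demo ) );  // true
trace( demo.readUnsignedByte( ) );  // 4, the byte right after the skipped ones
```

The cost is that the skipped bytes are briefly held in memory, which for an APPn segment is at most 64 KB.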
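One possible explanation for some of the failing files, offered as an assumption rather than something checked against those particular images: the signature used here only matches baseline frames with three 8-bit components. The JPEG spec defines other start-of-frame markers with the same length/precision/height/width layout (progressive files, for instance, use SOF2, 0xFF 0xC2), and a greyscale file has a SOF0 length of 0x0B instead of 0x11. A looser marker check could be sketched like this, with `isFrameMarker` being a hypothetical helper:

```actionscript
// Hypothetical helper: true for any start-of-frame marker byte, not just SOF0.
// 0xC4 (DHT), 0xC8 (reserved) and 0xCC (arithmetic coding conditioning)
// sit in the same range but are not frame headers.
function isFrameMarker( marker : uint ) : Boolean {
    return marker >= 0xC0 && marker <= 0xCF
        && marker != 0xC4 && marker != 0xC8 && marker != 0xCC;
}

trace( isFrameMarker( 0xC0 ) ); // true  (SOF0, baseline)
trace( isFrameMarker( 0xC2 ) ); // true  (SOF2, progressive)
trace( isFrameMarker( 0xC4 ) ); // false (DHT)
```

Matching on the marker alone, without requiring the 0x00 0x11 0x08 bytes, only makes sense when walking the file segment by segment; in a raw byte scan it would produce many more false positives.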
Hope you found this useful, good luck
Download the class, with an example
Update Apr 10: the length in the description (11) should be 0x11. Fixed to avoid confusion
Comments
21 Comments
2007-04-09, 4:59 by sike
yeah, maybe we can use this to checked the flv size before load it to the stage.
2007-04-27, 2:54 by Kenneth Woodruff
Or EXIF data! Nice work.
2007-04-27, 9:15 by Torben
Thanks for this helpfully class - nice work!!!
2007-06-21, 15:49 by Kevin Hoyt
I actually just wrote an application that does exactly this same thing, though your is a bit cleaner. Here’s a twist though, I wrote it using the File IO features of AIR against local images on disk. Good times to have so many options!
Nice work,
Kevin
2007-06-21, 16:35 by Jaap Kooiker
Excellent work! Thanx…
2007-06-21, 16:55 by Theo
Good work, this could be really useful. But if I understand it correctly you stop loading as soon as you know the dimensions? Given that the overhead of establishing the connection is often the bottleneck for images that are not big (say over 1000 pixels to one side) it would be useful if this worked more like a regular loader, that is, instead of just notifying me when the image was loaded it first notifies me when the image size is available. Having to do two connections to get the image would in many cases double the load time (because establishing the connection takes almost as long as sending a couple of 100 K).
On the other hand your implementation is great if I want to choose whether or not to load the image based on the dimensions.
2007-06-21, 18:55 by Jon B
I agree with Theo, it would be cool if this was a regular loader which could fire an event as soon as the dimensions were downloaded and then optionally continue to load the rest of the image - although I imagine that if you did this the image wouldn’t get cached either (you are stream bytes rather than loading a whole image) - in some situations this would be a huge bonus - in others it wouldn’t be at all - but I think the option would be nice
2007-06-22, 17:38 by Antti Author comment
Actually stopping the load progress is optional, so using it as a loader at the same time is already implemented (although not all events are fired, but that’s easy for anybody to add anyway; also, an Event.COMPLETE probably shouldn’t be fired when the dimensions have been found, but when the file has completely loaded). Just set the second parameter of the extractSize function to false and it will continue loading (the default is true).
2007-06-22, 20:17 by Mario Klingemann
That’s a great idea - thanks for sharing it!
2007-06-26, 22:26 by Jon B
Would I be right in thinking that the loaded image wouldn’t be cached? I’m not too sure on how URLStream works…
This is really cool
2007-08-12, 19:02 by Matt W
Dude - You are a genius - this is the cleverest class I’ve found to date and mega usefull! Much appreciate the public use of it - Will defo use this alot.
You are a walking Legend!
2008-02-08, 12:28 by Barry
Fantastic!
Do you know what the max size image you are allowed to load is? I got an 8400×600px image, loads but just shows a black box …
Sorry if this is off topic.
2008-02-08, 16:08 by Pawel
Thanks a lot! That’s a very useful thing. I think I’m gonna use this class in my upcoming new portfolio site :]
2008-02-08, 18:48 by Antti Author comment
Barry: There shouldn’t be a max limit, but i’m sure there can be problems with this. Have you looked at the actual hex chunks of the image, trying to figure out what’s wrong? If you find out what it is, please let me know.
One common problem (that I mentioned above too) is that a JPG can contain multiple images within itself. Try resaving the file with Photoshop’s “Save for Web”.
I can also take a look at the image if you want, trying to see what may cause the problem?
2008-02-23, 7:08 by Sachi
How to work out with sub samling factor in SOF0?
2008-03-18, 18:56 by UnkemptRich
Hi - i am new to this as3 and class stuff.
Could someone help me and give me a .zip file with an .fla in it that calls this Main.as file or the JPGExtractor - i just cant figure out how to import this as or get it to work.
Really sorry for my lack of knowledge
My email address is unkemptrich@hotmail.com
Cheers
Rich
2008-03-19, 11:24 by Antti Author comment
Rich: Check your mail.
I updated the zip to contain a .fla (it just sets Main as its document class).
2008-03-19, 20:34 by JPGSizeExtractor multi image example – Antti Kupila
[…] made a quick example to demonstrate the power of the JPGSizeExtractor class i wrote about a year ago. This is a demo that came up from a brief mail exchange with Richard Bacon […]
2008-04-02, 0:24 by jamie
I can’t get this to output the correct image size? my image was output in photoshop and is 350×255 but the trace says 160×139? any help would be great
2008-04-02, 0:36 by Antti Author comment
Jamie: I haven’t run into this issue but it’s fully possible something still breaks. Do you think you could send me the jpg that’s causing problems so that i can take a look at it?
Have you tried ’save for web’ in photoshop? It only saves the jpg data (stripping all meta data, thumbnails etc) and is less likely to cause problems. Still, even if you get it to work i’d like to take a look at the image that’s causing problems so that i can fix the class.