Working with files in JavaScript, Part 1: The Basics
<input type="file">
control. Even as the rest of the web evolved, the way we dealt with
files had not changed since the control was first introduced. Thankfully, with
HTML5 and related APIs, we now have far more options for working with
files than ever before in the latest versions of desktop browsers (iOS
still has no support for the File API).
The File type
The File type is defined in the File API[1] specification and is an abstract representation of a file. Each instance of File has several properties:
- name – the filename
- size – the size of the file in bytes
- type – the MIME type for the file
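As a minimal sketch of how these three properties might be used together, the helper below builds a one-line summary of a file. The name describeFile and the choice of units are assumptions made for illustration; they are not part of the File API.

```javascript
// Hypothetical helper: summarize the metadata exposed by a File object.
// It reads only name, size, and type, so no file contents are touched.
function describeFile(file) {
    var units = ["bytes", "KB", "MB", "GB"],
        size = file.size,
        i = 0,
        formatted;

    // Step up one unit for every factor of 1024
    while (size >= 1024 && i < units.length - 1) {
        size = size / 1024;
        i++;
    }

    // Whole numbers for bytes, one decimal place for larger units
    formatted = (i === 0 ? String(size) : size.toFixed(1)) + " " + units[i];

    // An empty type means the browser could not determine a MIME type
    return file.name + " (" + (file.type || "unknown type") + ", " + formatted + ")";
}
```

Anything with name, size, and type properties works here, so the same function can be called on each entry of a FileList.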
A File object basically gives you essential information
about the file without providing direct access to the file contents.
That’s important because reading from files requires disk access, and
depending on the size of the file, that process has the potential to
take a significant amount of time. A File object is just a reference to a file, and getting data from that file is a separate process altogether.
Getting File references
Of course, access to user files is strictly forbidden on the web because it’s a very obvious security issue. You wouldn’t want to load up a web page and then have it scan your hard drive and figure out what’s there. You need permission from the user in order to access files from their computer. There’s no need for messy permission windows, however, because users grant permission for web pages to read files all the time when they decide to upload something.
When you use a
<input type="file"> control, you’re
giving the web page (and the server) permission to access that file. So
it makes sense that the first place you can retrieve File objects is through a <input type="file"> control.
HTML5 defines a
files property for all <input type="file"> controls. This collection is a FileList, an array-like structure containing File
objects for each selected file in the control (remember, HTML5 allows
multiple file selection in these controls). So at any point in time, you
can get access to the files a user has selected using code similar to
this:
<input type="file" id="your-files" multiple>
<script>
    var control = document.getElementById("your-files");
    control.addEventListener("change", function(event) {
        // When the control has changed, there are new files
        var i = 0,
            files = control.files,
            len = files.length;
        for (; i < len; i++) {
            console.log("Filename: " + files[i].name);
            console.log("Type: " + files[i].type);
            console.log("Size: " + files[i].size + " bytes");
        }
    }, false);
</script>
This relatively simple code listens for the change event
on the file control. When the event fires, it signifies that the file
selection has changed, and the code iterates through each File object and outputs its information. Keep in mind that the files property is always accessible from JavaScript, so you don’t have to wait for change to try to read it.
Drag and drop files
Accessing files from form controls still requires the form control and the associated user action of browsing to find the files of interest. Fortunately, HTML5 Drag and Drop[2] provides another way for users to grant access to their files: by simply dragging a file from the desktop into the web browser. All you have to do to take advantage is listen for two events.
In order to read files that are dropped onto an area of the page, you must listen for the
dragover and drop
events and cancel the default action of both. Doing so tells the
browser that you are handling the action directly and it shouldn’t, for
example, open an image file.
<div id="your-files"></div>
<script>
    var target = document.getElementById("your-files");
    target.addEventListener("dragover", function(event) {
        event.preventDefault();
    }, false);
    target.addEventListener("drop", function(event) {
        // cancel default actions
        event.preventDefault();
        var i = 0,
            files = event.dataTransfer.files,
            len = files.length;
        for (; i < len; i++) {
            console.log("Filename: " + files[i].name);
            console.log("Type: " + files[i].type);
            console.log("Size: " + files[i].size + " bytes");
        }
    }, false);
</script>
The event.dataTransfer.files is another FileList object that you can access to get file information. The code is almost exactly the same as using the file form control and the File objects can be accessed in the same way.
Ajax file upload
Once you have a reference to the file, you’re able to do something that’s pretty cool: upload a file via Ajax. This is all possible due to the FormData object, which is defined in XMLHttpRequest Level 2[3]. This object represents an HTML form and allows you to add key-value pairs to be submitted to the server via the append() method:
var form = new FormData();
form.append("name", "Nicholas");
The great thing about the FormData object is that you
can add a file directly to it, effectively mimicking a file upload by
HTML form. All you have to do is add the File reference with a specific name, and the browser does the rest. For example:
// create a form with a couple of values
var form = new FormData();
form.append("name", "Nicholas");
form.append("photo", control.files[0]);
// send via XHR - look ma, no headers being set!
var xhr = new XMLHttpRequest();
xhr.onload = function() {
    console.log("Upload complete.");
};
xhr.open("post", "/entrypoint", true);
xhr.send(form);
Once the FormData object is passed into send(),
the proper HTTP headers are automatically set for you. You don’t have
to worry about setting the correct form encoding when using files, so
the server gets to act as if a regular HTML form has been submitted,
reading file data from the “photo” key and text data from the “name”
key. This gives you the freedom to write processing code on the backend
that can easily work with both traditional HTML forms and Ajax forms of
this nature.
And all of this works in the most recent version of every browser, including Internet Explorer 10.
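The same approach extends to several files at once. The sketch below is a hypothetical helper (appendFiles and the "photos" field name are assumptions, not part of any API) that adds every File in a FileList to a FormData object under one field name, using the optional third argument of append() to pass along each filename.

```javascript
// Hypothetical helper: add every File in a FileList to an existing FormData
// under the same field name, so the server receives a multi-file field.
function appendFiles(form, files, fieldName) {
    var i = 0,
        len = files.length;

    for (; i < len; i++) {
        // The third argument is the filename reported to the server
        form.append(fieldName, files[i], files[i].name);
    }

    // Report how many files were added
    return len;
}

// Usage sketch in a browser ("control" is the file input from earlier):
// var form = new FormData();
// form.append("name", "Nicholas");
// appendFiles(form, control.files, "photos");
// var xhr = new XMLHttpRequest();
// xhr.open("post", "/entrypoint", true);
// xhr.send(form);
```

Because the helper only calls append(), it works equally well with event.dataTransfer.files from a drop event.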
Up next
You now know the two methods of accessing File
information in the browser: through a file upload control and through
native drag and drop. There will likely be other ways to access files in
the future, but for now, these are the two you need to know. Of course,
reading information about files is just part of the problem. The next
step is to read data from those files, and that’s where part 2 will pick
up.
Working with files in JavaScript, Part 2: FileReader
File
objects. These objects contain file metadata obtained only when the
user either uploads a file or drags and drops one onto the web
page. Once you have files, however, the next step is to read data from
them.
The FileReader type
The FileReader type has a single job: to read data from a
file and store it in a JavaScript variable. The API is intentionally
designed to be similar to XMLHttpRequest since both are
loading data from an external (outside of the browser) resource. The
read is done asynchronously so as not to block the browser.
There are several formats that a
FileReader can create
to represent the file data, and the format must be requested when asking
the file to be read. Reading is done through calling one of these
methods:
- readAsText() – returns the file contents as plain text
- readAsBinaryString() – returns the file contents as a string of encoded binary data (deprecated – use readAsArrayBuffer() instead)
- readAsArrayBuffer() – returns the file contents as an ArrayBuffer (good for binary data such as images)
- readAsDataURL() – returns the file contents as a data URL
Calling one of these methods starts the read, much like an XHR object’s send() method initiating an HTTP request. As such, you must listen for the load event before starting to read. The result of the read is always represented by event.target.result. For example:
var reader = new FileReader();
reader.onload = function(event) {
    var contents = event.target.result;
    console.log("File contents: " + contents);
};
reader.onerror = function(event) {
    console.error("File could not be read! Code " + event.target.error.code);
};
reader.readAsText(file);
This example simply reads the contents of a file and outputs it in plain text to the console. The onload handler is called when the file is successfully read whereas the onerror handler is called if the file wasn’t read for some reason. The FileReader instance is available inside of the event handler via event.target and it’s recommended to use that instead of referencing the reader variable directly. The result property contains the file contents on success and error contains error information about the failed operation.
Reading data URIs
You can use the same basic setup for reading to a data URI. Data URIs (sometimes called data URLs) are an interesting option if you want to, for example, display an image that was just read from disk. You could do so with the following code:
var reader = new FileReader();
reader.onload = function(event) {
    var dataUri = event.target.result,
        img = document.createElement("img");
    img.src = dataUri;
    document.body.appendChild(img);
};
reader.onerror = function(event) {
    console.error("File could not be read! Code " + event.target.error.code);
};
reader.readAsDataURL(file);
This code simply inserts an image that was read from disk into a
page. Since the data URI contains all of the image data, it can be
passed directly into the src attribute of an image and displayed on the page. You could, alternately, load the image and draw it onto a <canvas> as well:
var reader = new FileReader();
reader.onload = function(event) {
    var dataUri = event.target.result,
        context = document.getElementById("mycanvas").getContext("2d"),
        img = new Image();
    // wait until the image has been fully processed
    img.onload = function() {
        context.drawImage(img, 100, 100);
    };
    img.src = dataUri;
};
reader.onerror = function(event) {
    console.error("File could not be read! Code " + event.target.error.code);
};
reader.readAsDataURL(file);
This code loads the image data into a new Image object and then uses that to draw the image onto a canvas, placing its top-left corner at the coordinates (100, 100).
Data URIs are generally used for this purpose, but can be used with any type of file. The most common use case for reading a file into a data URI is to display the file contents on a web page immediately.
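Because the result of readAsDataURL() is an ordinary string, it can also be inspected before it is used. The sketch below is a hypothetical helper (parseDataUri is not a browser API) that splits a data URI into its MIME type, a flag for base64 encoding, and the payload. It assumes the usual data:<type>;base64,<data> shape and falls back to text/plain when no type is given.

```javascript
// Hypothetical helper: split a data URI into its component parts.
function parseDataUri(dataUri) {
    var comma = dataUri.indexOf(","),
        header;

    if (dataUri.indexOf("data:") !== 0 || comma === -1) {
        throw new Error("Not a data URI");
    }

    // Everything between "data:" and the first comma describes the payload
    header = dataUri.slice(5, comma).split(";");

    return {
        // An omitted type is treated as text/plain
        mimeType: header[0] || "text/plain",
        // The encoding marker, when present, is the last header segment
        isBase64: header[header.length - 1] === "base64",
        data: dataUri.slice(comma + 1)
    };
}
```

Inside an onload handler, parseDataUri(event.target.result).mimeType could be used, for example, to confirm that the file really is an image before assigning it to an <img> element.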
Reading ArrayBuffers
The ArrayBuffer type[1] was first introduced as part of WebGL. An ArrayBuffer represents a finite number of bytes that may be used to store numbers of any size. The way data is read from an ArrayBuffer is by using a specific view, such as Int8Array, which treats the underlying bytes as a collection of 8-bit signed integers, or Float32Array, which treats the underlying bytes as a collection of 32-bit floating point numbers. These are called typed arrays[2], which force you to work with a specific numeric type rather than containing any type of data (as with traditional arrays).
You use an ArrayBuffer primarily when dealing with binary files, to have more fine-grained control over the data. It’s beyond the scope of this post to explain all the ins and outs of ArrayBuffer; just realize that you can read a file into an ArrayBuffer pretty easily if you need it. You can pass an ArrayBuffer directly into an XHR object’s send() method to send the raw data to the server (you’ll have to read this data from the request on the server to reconstruct the file), so long as your browser fully supports XMLHttpRequest Level 2[3] (most recent browsers, including Internet Explorer 10 and Opera 12).
Up next
Reading data from a file using a FileReader is pretty simple. If you know how to use XMLHttpRequest,
there’s no reason you can’t also be reading data from files. In the
next part of this series, you’ll learn more about using the FileReader events and understanding more about possible errors.
Working with files in JavaScript, Part 3: Progress events and errors
The FileReader object is used to read data from files that are made accessible through the browser. In my previous post, you learned how to use a FileReader object to easily read data from a file in a variety of formats. The FileReader is very similar to XMLHttpRequest in many ways.
Progress events
Progress events are becoming so common that they’re actually written up in a separate specification[1]. These events are designed to generically indicate the progress of data transfers. Such transfers occur when requesting data from the server, but also when requesting data from disk, which is what FileReader does. There are six progress events:
- loadstart – indicates that the process of loading data has begun. This event always fires first.
- progress – fires multiple times as data is being loaded, giving access to intermediate data.
- error – fires when loading has failed.
- abort – fires when data loading has been canceled by calling abort() (available on both XMLHttpRequest and FileReader).
- load – fires only when all data has been successfully read.
- loadend – fires when the object has finished transferring data. Always fires and will always fire after error, abort, or load.
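One way to see the order in which these events fire is to record them as they happen. The sketch below is a hypothetical helper (recordProgressEvents is not part of any API); it assumes only that the target has an addEventListener() method, as FileReader and XMLHttpRequest do.

```javascript
// The six progress event names listed above
var PROGRESS_EVENTS = ["loadstart", "progress", "error", "abort", "load", "loadend"];

// Hypothetical helper: record the name of each progress event as it fires.
// "target" is anything with addEventListener(), such as a FileReader.
function recordProgressEvents(target) {
    var seen = [];

    PROGRESS_EVENTS.forEach(function(name) {
        target.addEventListener(name, function() {
            seen.push(name);
        }, false);
    });

    // The returned array fills up over time as events fire
    return seen;
}
```

Calling recordProgressEvents(reader) before starting a read and logging the returned array inside a loadend handler should show loadstart first and loadend last.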
Two of these events, error and load, were discussed in my previous post. The other events give you more fine-grained control over data transfers.
Tracking progress
When you want to track progress of a file reader, use the progress event. The event object for this event contains three properties to monitor the data being transferred:
- lengthComputable – a boolean indicating if the browser can determine the complete size of the data.
- loaded – the number of bytes that have been read already.
- total – the total number of bytes to be read.
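As a small illustration of how the three properties combine, the sketch below turns a progress event into a percentage. The function name percentLoaded and the choice to return null for an unknown size are assumptions made for this example.

```javascript
// Hypothetical helper: convert a progress event into a whole-number percentage.
// Returns null when the total size is unknown, so callers can show an
// indeterminate indicator instead of a misleading number.
function percentLoaded(event) {
    if (!event.lengthComputable || event.total === 0) {
        return null;
    }

    return Math.round((event.loaded / event.total) * 100);
}
```

Checking lengthComputable first matters because total is not meaningful when the browser cannot determine the size.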
These properties make it possible to drive a progress bar from the progress event. For example, you may be using an HTML5 <progress> element to monitor the progress of reading a file. You can tie the progress value to the actual data using code like this:
var reader = new FileReader(),
    progressNode = document.getElementById("my-progress");
reader.onprogress = function(event) {
    if (event.lengthComputable) {
        progressNode.max = event.total;
        progressNode.value = event.loaded;
    }
};
reader.onloadend = function(event) {
    var contents = event.target.result,
        error = event.target.error;
    if (error != null) {
        console.error("File could not be read! Code " + error.code);
    } else {
        progressNode.max = 1;
        progressNode.value = 1;
        console.log("Contents: " + contents);
    }
};
reader.readAsText(file);
This is similar to the approach that Gmail uses for its drag and drop
file upload implementation, where you see a progress bar immediately
after dropping a file onto the email. That progress bar indicates how
much of the file has been transferred to the server.
Dealing with errors
Even though you’re reading a local file, it’s still possible for the read to fail. The File API specification[2] defines four types of errors:
- NotFoundError – the file can’t be found.
- SecurityError – something about the file or the read is dangerous. The browser has some leeway as to when this occurs, but generally if the file is dangerous to load into the browser or the browser has been performing too many reads, you’ll see this error.
- NotReadableError – the file exists but can’t be read, most likely due to a permissions problem.
- EncodingError – primarily when trying to read as a data URI and the length of the resulting data URI is beyond the maximum length supported by the browser.
When a read fails, the FileReader object’s error property is assigned to be an instance of one of the above mentioned errors. At least, that’s how the spec is written. In reality, browsers implement this as a FileError object that has a code property indicating the type of error that has occurred. Each error type is represented by a numeric constant value:
- FileError.NOT_FOUND_ERR for file not found errors.
- FileError.SECURITY_ERR for security errors.
- FileError.NOT_READABLE_ERR for not readable errors.
- FileError.ENCODING_ERR for encoding errors.
- FileError.ABORT_ERR when abort() is called while there is no read in progress.
You can check for the type of error either during the error event or during loadend:
var reader = new FileReader();
reader.onloadend = function(event) {
    var contents = event.target.result,
        error = event.target.error;
    if (error != null) {
        switch (error.code) {
            case error.ENCODING_ERR:
                console.error("Encoding error!");
                break;
            case error.NOT_FOUND_ERR:
                console.error("File not found!");
                break;
            case error.NOT_READABLE_ERR:
                console.error("File could not be read!");
                break;
            case error.SECURITY_ERR:
                console.error("Security issue with file!");
                break;
            default:
                console.error("I have no idea what's wrong!");
        }
    } else {
        // progressNode is the <progress> element from the earlier example
        progressNode.max = 1;
        progressNode.value = 1;
        console.log("Contents: " + contents);
    }
};
reader.readAsText(file);
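Since the specification identifies errors by name while FileError-style objects identify them by numeric code, a helper can try both. The sketch below is hypothetical (describeReadError is not part of the File API) and assumes the error object exposes either a name property matching the spec's error types or a code property alongside the *_ERR constants described above.

```javascript
// Hypothetical helper: turn a read error into a message, whichever way the
// browser chooses to identify it.
function describeReadError(error) {
    var messages = {
        NotFoundError: "File not found!",
        SecurityError: "Security issue with file!",
        NotReadableError: "File could not be read!",
        EncodingError: "Encoding error!"
    };

    // Spec-style errors identify themselves by name
    if (error.name && messages.hasOwnProperty(error.name)) {
        return messages[error.name];
    }

    // FileError-style objects expose a numeric code plus matching constants
    if (typeof error.code === "number") {
        switch (error.code) {
            case error.NOT_FOUND_ERR:
                return messages.NotFoundError;
            case error.SECURITY_ERR:
                return messages.SecurityError;
            case error.NOT_READABLE_ERR:
                return messages.NotReadableError;
            case error.ENCODING_ERR:
                return messages.EncodingError;
        }
    }

    return "I have no idea what's wrong!";
}
```

Inside an onerror or onloadend handler this could be called as console.error(describeReadError(event.target.error)).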
Up next
The FileReader object is a fully-featured object with a lot of functionality and a lot of similarities to XMLHttpRequest.
By following these last three posts, you should now be able to read
data from files using JavaScript and send that data back to the server
if necessary. However, the File API ecosystem is quite a
bit larger than has been already discussed in this series, and in the
next part you’ll learn about powerful new features designed to work
with files.
Working with files in JavaScript, Part 4: Object URLs
What is an object URL?
Object URLs are URLs that point to files on disk. Suppose, for example, that you want to display an image from the user’s system on a web page. The server never needs to know about the file, so there’s no need to upload it. You just want to load the file into a page. You could, as shown in the previous posts, get a reference to a File object, read the data into a data URI, and then assign the data URI to an <img>
element. But think of all the waste: the image already exists on disk,
so why read the image into another format in order to use it? If you create
an object URL, you could assign that to the <img> and access that local file directly.
How does it work?
The File API[1] defines a global object called URL that has two methods. The first is createObjectURL(), which accepts a reference to a File and returns an object URL. This instructs the browser to create and manage a URL to the local file. The second method is revokeObjectURL(),
which instructs the browser to destroy the URL that is passed into it,
effectively freeing up memory. Of course, all object URLs are revoked
once the web page is unloaded, but it’s good to free them up when
they’re no longer needed anyway.
Support for the URL object isn’t as good as for other parts of the File API. As of the time of my writing, Internet Explorer 10+ and Firefox 9+ support a global URL object. Chrome supports it in the form of webkitURL while Safari and Opera have no support.
Example
So how would you display an image from disk without reading the data first? Suppose that you’ve given the user a way to select a file and now have a reference to it in a variable called file. You can then use the following:
var URL = window.URL || window.webkitURL,
    imageUrl,
    image;
if (URL) {
    imageUrl = URL.createObjectURL(file);
    image = document.createElement("img");
    image.onload = function() {
        URL.revokeObjectURL(imageUrl);
    };
    image.src = imageUrl;
    document.body.appendChild(image);
}
This example creates a local URL variable that normalizes the browser implementations. Assuming that URL is supported, the code goes on to create an object URL directly from file and stores it in imageUrl. A new <img> element is created and given an onload event handler that revokes the object URL (more on that in a minute). Then, the src property is assigned to the object URL and the element is added to the page (you may want to use an already-existing image). Why revoke the object URL once the image is loaded? After the image is loaded, the URL is no longer needed unless you intend to reuse it with another element. In this example, the image is being loaded into a single element, and once the image has been completely loaded, the URL isn’t serving any useful purpose. That’s the perfect time to free up any memory associated with it.
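The same pattern can be wrapped in a reusable function. The sketch below is a hypothetical helper (showFileInImage is not a browser API) that takes the URL implementation as a parameter, assumed to be window.URL or window.webkitURL, and revokes the object URL whether the image loads or fails to load.

```javascript
// Hypothetical helper: point an image at a file through an object URL and
// revoke the URL once the image has either loaded or failed to load.
// "urlApi" is assumed to be window.URL or window.webkitURL.
function showFileInImage(file, image, urlApi) {
    var objectUrl = urlApi.createObjectURL(file);

    function cleanup() {
        urlApi.revokeObjectURL(objectUrl);
        // Drop the handlers so the URL is never revoked twice
        image.onload = image.onerror = null;
    }

    image.onload = cleanup;
    image.onerror = cleanup;
    image.src = objectUrl;

    return objectUrl;
}

// Usage sketch in a browser:
// var image = document.createElement("img");
// showFileInImage(file, image, window.URL || window.webkitURL);
// document.body.appendChild(image);
```

Revoking on error as well as on load is a design choice made here so that a file that cannot be decoded as an image does not leave its URL allocated until the page unloads.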
Security and other considerations
At first glance, this capability is a bit scary. You’re actually loading a file directly from the user’s machine via a URL. There are, of course, security implications to this capability. The URL itself isn’t a big security issue because it’s a URL that’s assigned dynamically by the browser and would be useless on any other computer. What about cross-origin?
The File API disallows using object URLs on different origins. When an object URL is created, it is tied to the origin of the page in which the JavaScript executed, so you can’t use an object URL from www.wrox.com on a page at p2p.wrox.com (an error occurs). However, two pages from www.wrox.com, where one is embedded in the other with an iframe, are capable of sharing object URLs.
Object URLs exist only so long as the document that created them. When the document is unloaded, all object URLs are revoked. So, it doesn’t make sense to store object URLs in client-side data storage to use later; they are useless after the page has been unloaded.
You can use object URLs anywhere the browser would make a GET request, which includes images, scripts, web workers, style sheets, audio, and video. You can never use an object URL when the browser would perform a POST, such as within a
<form> whose method is set to “post”.
Up next
The ability to create URLs that link directly to local files is a powerful one. Instead of needing to read a local file into JavaScript in order to display it on a page, you can simply create a URL and point the page to it. This process greatly simplifies the use case of including local files in a page. However, the fun of working with files in JavaScript has only just begun. In the next post, you’ll learn some interesting ways to work with file data.
Working with files in JavaScript, Part 5: Blobs
File objects. The File object is actually a more specific version of a Blob, which represents a chunk of binary data. The size and type properties exist on Blob objects and are inherited by File.
In most cases, Blobs and Files can be used in the same places. For example, you can read from a Blob using a FileReader and you can create an object URL from a Blob using URL.createObjectURL().
Slicing
One of the interesting things you can do with Blobs (and therefore, also Files) is to create a new Blob based on a subsection of another one. Since each Blob just represents pointers to data rather than the data itself, you can quickly create new Blob objects pointing to subparts of others. This is accomplished by using the slice() method.
You may be familiar with
slice() on strings and arrays, and the one for Blobs
behaves in a similar manner. The method accepts three arguments: the
offset of the starting byte, the offset of the ending byte, and an
optional MIME type to apply to the Blob. If the MIME type isn’t specified, the File API specification gives the new Blob an empty type, so pass the original type explicitly if you want to keep it.
Browser support for slice() isn’t yet ubiquitous, with Firefox supporting it via mozSlice() and Chrome via webkitSlice() (no other browsers support this method currently). Here’s an example:
function sliceBlob(blob, start, end, type) {
    // default to the MIME type of the original blob
    type = type || blob.type;
    if (blob.mozSlice) {
        return blob.mozSlice(start, end, type);
    } else if (blob.webkitSlice) {
        return blob.webkitSlice(start, end, type);
    } else {
        throw new Error("This doesn't work!");
    }
}
You can then use this function to, for example, split up a large file to upload it in chunks. Each new Blob
being produced is independent from the original even though the data
each references has an overlap. The engineers at Flickr use blob slicing
to read the Exif information from photos that are uploaded[1]
rather than waiting to do it on the server. When the file is selected, the
Flickr upload page simultaneously starts to upload the file as well as
read the Exif information from the photo. This allows them to give a
preview of the extracted metadata in the page as the file is being
uploaded.
Creating Blobs the old way
Very soon after File objects started appearing in browsers, developers realized that Blob
objects were actually quite powerful and so wanted to be able to create
them without user interaction. After all, any data can be represented
in a Blob; it doesn’t necessarily have to be tied to a file. Browsers quickly responded by creating BlobBuilder, a type whose sole purpose is to wrap some data in a Blob object. This is a non-standard type and has been implemented in Firefox (as MozBlobBuilder), Internet Explorer 10 (as MSBlobBuilder), and Chrome (as WebKitBlobBuilder).
The BlobBuilder works by creating a new instance and calling the append() method with a string, ArrayBuffer, or Blob. Once all of the data has been added, you call getBlob() and pass in an optional MIME type that should be applied to the Blob. Here’s an example:
var builder = new BlobBuilder();
builder.append("Hello world!");
var blob = builder.getBlob("text/plain");
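Because each browser exposes the builder under a different name, code written against BlobBuilder needs some feature detection. The sketch below is one hypothetical way to do it (getBlobBuilder and buildBlob are illustrative names); it assumes "root" is the global object, and it returns null or throws in any browser that provides none of the variants.

```javascript
// Hypothetical helper: find whichever BlobBuilder variant the browser exposes.
// "root" is assumed to be the global object (window in a browser).
function getBlobBuilder(root) {
    return root.BlobBuilder ||
        root.WebKitBlobBuilder ||
        root.MozBlobBuilder ||
        root.MSBlobBuilder ||
        null;
}

// Wrap a list of parts in a Blob using that builder
function buildBlob(root, parts, type) {
    var Builder = getBlobBuilder(root),
        builder,
        i;

    if (!Builder) {
        throw new Error("BlobBuilder is not available");
    }

    builder = new Builder();
    for (i = 0; i < parts.length; i++) {
        builder.append(parts[i]);
    }

    return builder.getBlob(type);
}

// Usage sketch in a browser:
// var blob = buildBlob(window, ["Hello ", "world!"], "text/plain");
```

Passing the global object in as a parameter keeps the detection logic separate from any particular environment.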
The ability to create URLs for arbitrary pieces of data is incredibly
powerful, allowing you to dynamically create objects that can be
addressed as files in the browser. You could, for example, use a Blob to create a web worker without having a separate file for the worker code. This technique was written up in The Basics of Web Workers[2]:
// Prefixed in Webkit, Chrome 12, and FF6: window.WebKitBlobBuilder, window.MozBlobBuilder
var bb = new BlobBuilder();
bb.append("onmessage = function(e) { postMessage('msg from worker'); }");
// Obtain a blob URL reference to our worker 'file'.
// Note: window.webkitURL.createObjectURL() in Chrome 10+.
var blobURL = window.URL.createObjectURL(bb.getBlob());
var worker = new Worker(blobURL);
worker.onmessage = function(e) {
    // e.data == 'msg from worker'
};
worker.postMessage("start"); // Start the worker (a message argument is required).
This code creates a simple script and then creates an object URL. The
object URL is assigned to a web worker in place of a script URL. You can call
append() as many times as you like, building up the contents of the Blob.
Creating Blobs the new way
Because developers kept clamoring for a way to create Blob objects directly, and browsers responded by coming up with BlobBuilder, it was decided to add a Blob constructor. This constructor is now part of the specification and will be the way that Blob objects are created in the future.
The constructor accepts two arguments. The first is an array of parts to combine into a Blob. These would be the same values as passed into the append() method of BlobBuilder and can be any number of strings, Blobs, and ArrayBuffers. The second argument is an object containing properties for the newly-created Blob. There are currently two properties defined: type, which specifies the MIME type of the Blob, and endings, which can be either “transparent” (default) or “native”. Here’s an example:
var blob = new Blob(["Hello world!"], { type: "text/plain" });
As you can see, this is much simpler than using BlobBuilder.
The Blob constructor is currently in the nightly builds of Chrome and will be in Firefox 13. Other browsers have not yet announced plans to implement this constructor; however, it is now part of the File API[3] standard and is expected to be implemented universally.
Conclusion
This is the last part of the series on working with files in JavaScript. As I hope you learned, the File API is incredibly powerful and opens up entirely new ways of working with files in web applications. You no longer need to stick with plain file upload boxes when users need to upload files, and now that you can read the files in the client, that opens up all sorts of possibilities for client-side manipulation. You could resize an image that’s too large before uploading (using FileReader and <canvas>);
you could create a text editor that works purely in the browser; you
could split up large files to upload piece by piece. The possibilities
aren’t quite endless, but are pretty damn close.
References
- Parsing Exif client-side using JavaScript by Flickr Team
- The Basics of Web Workers by Eric Bidelman
- File API – Blob Constructor