Working with files in JavaScript, Part 1: The Basics
For years, the only way to work with files on the web was the <input type="file"> control. Even as the rest of the web was evolving, the way we dealt with files had not changed since that control was first introduced. Thankfully, with HTML5 and related APIs, we now have far more options for working with files than ever before in the latest versions of desktop browsers (iOS still has no support for the File API).

The File type
The File type is defined in the File API[1] specification and is an abstract representation of a file. Each instance of File has several properties:

- name – the filename
- size – the size of the file in bytes
- type – the MIME type for the file
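To make those three properties concrete, here’s a minimal sketch of a helper that turns them into a one-line summary. The describeFile() name is my own invention for illustration and not part of the File API; it works on any object that has name, size, and type.

```javascript
// Hypothetical helper: build a one-line summary from the three File
// properties. Works on any object with name, size, and type.
function describeFile(file) {
    // Browsers report an empty string when they cannot determine the type.
    var type = file.type || "unknown type";
    return file.name + " (" + type + ", " + file.size + " bytes)";
}

// Example with a plain object standing in for a real File:
var summary = describeFile({ name: "photo.png", size: 2048, type: "image/png" });
console.log(summary);
```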
A File object basically gives you essential information about the file without providing direct access to the file contents. That’s important because reading from files requires disk access, and depending on the size of the file, that process has the potential to take a significant amount of time. A File object is just a reference to a file, and getting data from that file is a separate process altogether.

Getting File references
Of course, access to user files is strictly forbidden on the web because it’s a very obvious security issue. You wouldn’t want to load up a web page and then have it scan your hard drive and figure out what’s there. You need permission from the user in order to access files from their computer. There’s no need for messy permission windows, however, because users grant permission for web pages to read files all the time when they decide to upload something.

When you use a <input type="file"> control, you’re giving the web page (and the server) permission to access that file. So it makes sense that the first place you can retrieve File objects is through a <input type="file"> control.

HTML5 defines a files property for all <input type="file"> controls. This collection is a FileList, an array-like structure containing a File object for each selected file in the control (remember, HTML5 allows multiple file selection in these controls). So at any point in time, you can get access to the files a user has selected using code similar to this:

<input type="file" id="your-files" multiple>
<script>
var control = document.getElementById("your-files");
control.addEventListener("change", function(event) {

    // When the control has changed, there are new files
    var i = 0,
        files = control.files,
        len = files.length;

    for (; i < len; i++) {
        console.log("Filename: " + files[i].name);
        console.log("Type: " + files[i].type);
        console.log("Size: " + files[i].size + " bytes");
    }

}, false);
</script>
This relatively simple code listens for the change event on the file control. When the event fires, it signifies that the file selection has changed, and the code iterates through each File object and outputs its information. Keep in mind that the files property is always accessible from JavaScript, so you don’t have to wait for change to try to read it.

Drag and drop files
Accessing files from form controls still requires the form control and the associated user action of browsing to find the files of interest. Fortunately, HTML5 Drag and Drop[2] provides another way for users to grant access to their files: by simply dragging a file from the desktop into the web browser. All you have to do to take advantage of this is listen for two events.

In order to read files that are dropped onto an area of the page, you must listen for the dragover and drop events and cancel the default action of both. Doing so tells the browser that you are handling the action directly and it shouldn’t, for example, open an image file.

<div id="your-files"></div>
<script>
var target = document.getElementById("your-files");

target.addEventListener("dragover", function(event) {
    event.preventDefault();
}, false);

target.addEventListener("drop", function(event) {

    // cancel default actions
    event.preventDefault();

    var i = 0,
        files = event.dataTransfer.files,
        len = files.length;

    for (; i < len; i++) {
        console.log("Filename: " + files[i].name);
        console.log("Type: " + files[i].type);
        console.log("Size: " + files[i].size + " bytes");
    }

}, false);
</script>
The event.dataTransfer.files property is another FileList object that you can access to get file information. The code is almost exactly the same as using the file form control, and the File objects can be accessed in the same way.

Ajax file upload

Once you have a reference to the file, you’re able to do something that’s pretty cool: upload a file via Ajax. This is all possible due to the FormData object, which is defined in XMLHttpRequest Level 2[3]. This object represents an HTML form and allows you to add key-value pairs to be submitted to the server via the append() method:

var form = new FormData();
form.append("name", "Nicholas");
The great thing about the FormData object is that you can add a file directly to it, effectively mimicking a file upload by HTML form. All you have to do is add the File reference with a specific name, and the browser does the rest. For example:

// create a form with a couple of values
var form = new FormData();
form.append("name", "Nicholas");
form.append("photo", control.files[0]);

// send via XHR - look ma, no headers being set!
var xhr = new XMLHttpRequest();
xhr.onload = function() {
    console.log("Upload complete.");
};
xhr.open("post", "/entrypoint", true);
xhr.send(form);
Once the FormData object is passed into send(), the proper HTTP headers are automatically set for you. You don’t have to worry about setting the correct form encoding when using files, so the server gets to act as if a regular HTML form has been submitted, reading file data from the “photo” key and text data from the “name” key. This gives you the freedom to write processing code on the backend that can easily work with both traditional HTML forms and Ajax forms of this nature.

And all of this works in the most recent version of every browser, including Internet Explorer 10.
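If the control allows multiple selection, the same approach extends to every selected file. What follows is a rough sketch under a few assumptions: buildUploadForm() is a hypothetical helper of my own, and the “photos” field name is just an example that the server would have to expect.

```javascript
// Hypothetical helper: copy plain text fields and a list of files into a
// FormData object. The field name "photos" is an assumption for this sketch;
// use whatever name your server expects.
function buildUploadForm(fields, files) {
    var form = new FormData();

    Object.keys(fields).forEach(function (key) {
        form.append(key, fields[key]);
    });

    // `files` can be a FileList or any array-like collection of File/Blob objects.
    for (var i = 0; i < files.length; i++) {
        form.append("photos", files[i]);
    }

    return form;
}

// Browser usage, assuming a file input stored in a variable named `control`:
//
//   var xhr = new XMLHttpRequest();
//   xhr.open("post", "/entrypoint", true);
//   xhr.send(buildUploadForm({ name: "Nicholas" }, control.files));
```

Appending several values under the same key is how a multi-file form field normally reaches the server, so the backend can keep treating the request like a regular form submission.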
Up next
You now know the two methods of accessing File information in the browser: through a file upload control and through native drag and drop. There will likely be other ways to access files in the future, but for now, these are the two you need to know. Of course, reading information about files is just part of the problem. The next step is to read data from those files, and that’s where part 2 will pick up.

References
Working with files in JavaScript, Part 2: FileReader
The previous part of this series focused on how to get access to File objects. These objects contain file metadata obtained only when the user opts to either upload a file or drag and drop a file onto the web page. Once you have files, however, the next step is to read data from them.

The FileReader type
The FileReader type has a single job: to read data from a file and store it in a JavaScript variable. The API is intentionally designed to be similar to XMLHttpRequest since both are loading data from an external (outside of the browser) resource. The read is done asynchronously so as not to block the browser.

There are several formats that a FileReader can create to represent the file data, and the format must be requested when asking for the file to be read. Reading is done by calling one of these methods:

- readAsText() – returns the file contents as plain text
- readAsBinaryString() – returns the file contents as a string of encoded binary data (deprecated – use readAsArrayBuffer() instead)
- readAsArrayBuffer() – returns the file contents as an ArrayBuffer (good for binary data such as images)
- readAsDataURL() – returns the file contents as a data URL
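Which method to call depends on what you plan to do with the data. As a rough illustration, the hypothetical pickReadMethod() helper below maps a MIME type to one of the method names in the list; the mapping itself is an assumption made for this sketch, not something the File API prescribes.

```javascript
// Hypothetical helper: choose a FileReader method name from a MIME type.
// The mapping is an illustrative assumption, not a rule from the File API.
function pickReadMethod(mimeType) {
    if (/^text\//.test(mimeType) || mimeType === "application/json") {
        return "readAsText";
    }
    if (/^image\//.test(mimeType)) {
        return "readAsDataURL";
    }
    // Anything else is treated as raw binary data.
    return "readAsArrayBuffer";
}

// Browser usage, assuming `reader` is a FileReader with its handlers already
// attached and `file` is a File:
//
//   reader[pickReadMethod(file.type)](file);
```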
These methods initiate a file read in much the same way as the XHR object’s send() method initiating an HTTP request. As such, you must listen for the load event before starting to read. The result of the read is always represented by event.target.result. For example:

var reader = new FileReader();
reader.onload = function(event) {
    var contents = event.target.result;
    console.log("File contents: " + contents);
};

reader.onerror = function(event) {
    console.error("File could not be read! Code " + event.target.error.code);
};

reader.readAsText(file);
This example simply reads the contents of a file and outputs it in plain text to the console. The onload handler is called when the file is successfully read, whereas the onerror handler is called if the file wasn’t read for some reason. The FileReader instance is available inside of the event handler via event.target, and it’s recommended to use that instead of referencing the reader variable directly. The result property contains the file contents on success, and error contains error information about the failed operation.

Reading data URIs

You can use the same basic setup for reading to a data URI. Data URIs (sometimes called data URLs) are an interesting option if you want to, for example, display an image that was just read from disk. You could do so with the following code:

var reader = new FileReader();
reader.onload = function(event) {
    var dataUri = event.target.result,
        img = document.createElement("img");

    img.src = dataUri;
    document.body.appendChild(img);
};

reader.onerror = function(event) {
    console.error("File could not be read! Code " + event.target.error.code);
};

reader.readAsDataURL(file);
This code simply inserts an image that was read from disk into a page. Since the data URI contains all of the image data, it can be passed directly into the src attribute of an image and displayed on the page. You could, alternately, load the image and draw it onto a <canvas> as well:

var reader = new FileReader();
reader.onload = function(event) {
    var dataUri = event.target.result,
        context = document.getElementById("mycanvas").getContext("2d"),
        img = new Image();

    // wait until the image has been fully processed
    img.onload = function() {
        context.drawImage(img, 100, 100);
    };
    img.src = dataUri;
};

reader.onerror = function(event) {
    console.error("File could not be read! Code " + event.target.error.code);
};

reader.readAsDataURL(file);
This code loads the image data into a new Image object and then uses that to draw the image onto a canvas. The two numbers passed to drawImage() are the x and y coordinates of the image’s top-left corner, so the image is drawn at its natural size starting at (100, 100).

Data URIs are generally used for this purpose, but can be used on any type of file. The most common use case for reading a file into a data URI is to display the file contents on a web page immediately.
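If you want the drawn image to occupy a specific box on the canvas, drawImage() also accepts a width and height after the coordinates. Here’s a minimal sketch of how the size might be computed; fitWithin() is a hypothetical helper of my own, not part of any browser API.

```javascript
// Hypothetical helper (not part of any browser API): scale a width/height
// pair down so it fits inside maxWidth x maxHeight, keeping the aspect ratio.
// Images that already fit are left at their original size.
function fitWithin(width, height, maxWidth, maxHeight) {
    var scale = Math.min(maxWidth / width, maxHeight / height, 1);
    return {
        width: Math.round(width * scale),
        height: Math.round(height * scale)
    };
}

// Browser-only usage, assuming `context` is a 2D canvas context and `img`
// is an Image whose load event has already fired:
//
//   var size = fitWithin(img.naturalWidth, img.naturalHeight, 100, 100);
//   context.drawImage(img, 0, 0, size.width, size.height);
```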
Reading ArrayBuffers
The ArrayBuffer type[1] was first introduced as part of WebGL. An ArrayBuffer represents a finite number of bytes that may be used to store numbers of any size. Data is read from an ArrayBuffer through a specific view, such as Int8Array, which treats the underlying bytes as a collection of 8-bit signed integers, or Float32Array, which treats the underlying bytes as a collection of 32-bit floating point numbers. These are called typed arrays[2], which force you to work with a specific numeric type rather than containing any type of data (as with traditional arrays).

You use an ArrayBuffer primarily when dealing with binary files, to have more fine-grained control over the data. It’s beyond the scope of this post to explain all the ins and outs of ArrayBuffer; just realize that you can read a file into an ArrayBuffer pretty easily if you need it. You can pass an ArrayBuffer directly into an XHR object’s send() method to send the raw data to the server (you’ll have to read this data from the request on the server to reconstruct the file), so long as your browser fully supports XMLHttpRequest Level 2[3] (most recent browsers, including Internet Explorer 10 and Opera 12).

Up next
Reading data from a file using a FileReader is pretty simple. If you know how to use XMLHttpRequest, there’s no reason you can’t also be reading data from files. In the next part of this series, you’ll learn more about using the FileReader events and understanding more about possible errors.

References
Working with files in JavaScript, Part 3: Progress events and errors
The FileReader object is used to read data from files that are made accessible through the browser. In my previous post, you learned how to use a FileReader object to easily read data from a file in a variety of formats. The FileReader is very similar to XMLHttpRequest in many ways.

Progress events

Progress events are becoming so common that they’re actually written up in a separate specification[1]. These events are designed to generically indicate the progress of data transfers. Such transfers occur when requesting data from the server, but also when requesting data from disk, which is what FileReader does. There are six progress events:
- loadstart – indicates that the process of loading data has begun. This event always fires first.
- progress – fires multiple times as data is being loaded, giving access to intermediate data.
- error – fires when loading has failed.
- abort – fires when data loading has been canceled by calling abort() (available on both XMLHttpRequest and FileReader).
- load – fires only when all data has been successfully read.
- loadend – fires when the object has finished transferring data. Always fires and will always fire after error, abort, or load.
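One way to get a feel for the order of these events is to record them as they fire. The sketch below assumes nothing beyond the standard addEventListener() method; recordProgressEvents() is a hypothetical helper name, and it accepts a FileReader, an XMLHttpRequest, or any other event target.

```javascript
var PROGRESS_EVENTS = ["loadstart", "progress", "error", "abort", "load", "loadend"];

// Hypothetical helper: record the name of each progress event fired on
// `target`. Returns the array that the names are pushed into.
function recordProgressEvents(target) {
    var seen = [];

    PROGRESS_EVENTS.forEach(function (name) {
        target.addEventListener(name, function (event) {
            seen.push(event.type);
        }, false);
    });

    return seen;
}

// Browser usage, assuming `file` is a File:
//
//   var reader = new FileReader();
//   var seen = recordProgressEvents(reader);
//   reader.onloadend = function() { console.log(seen.join(" -> ")); };
//   reader.readAsText(file);
```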
Two of these events, error and load, were discussed in my previous post. The other events give you more fine-grained control over data transfers.

Tracking progress

When you want to track progress of a file reader, use the progress event. The event object for this event contains three properties to monitor the data being transferred:

- lengthComputable – a boolean indicating if the browser can determine the complete size of the data.
- loaded – the number of bytes that have been read already.
- total – the total number of bytes to be read.
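A common use for these three properties is computing a percentage. Here’s a minimal sketch; percentComplete() is a hypothetical helper, and returning null when the size is unknown is a choice made for this example rather than anything the specification requires.

```javascript
// Hypothetical helper: convert a progress event into a whole-number
// percentage, or null when the total size cannot be determined.
function percentComplete(event) {
    if (!event.lengthComputable || event.total === 0) {
        return null;
    }
    return Math.round((event.loaded / event.total) * 100);
}

// Browser usage, assuming `reader` is a FileReader:
//
//   reader.onprogress = function(event) {
//       var percent = percentComplete(event);
//       if (percent !== null) {
//           console.log(percent + "% read");
//       }
//   };
```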
Together, these properties let you build a progress indicator from the progress event. For example, you may be using an HTML5 <progress> element to monitor the progress of reading a file. You can tie the progress value to the actual data using code like this:

var reader = new FileReader(),
    progressNode = document.getElementById("my-progress");

reader.onprogress = function(event) {
    if (event.lengthComputable) {
        progressNode.max = event.total;
        progressNode.value = event.loaded;
    }
};

reader.onloadend = function(event) {
    var contents = event.target.result,
        error = event.target.error;

    if (error != null) {
        console.error("File could not be read! Code " + error.code);
    } else {
        progressNode.max = 1;
        progressNode.value = 1;
        console.log("Contents: " + contents);
    }
};

reader.readAsText(file);
This is similar to the approach that Gmail uses for its drag and drop file upload implementation, where you see a progress bar immediately after dropping a file onto the email. That progress bar indicates how much of the file has been transferred to the server.

Dealing with errors

Even though you’re reading a local file, it’s still possible for the read to fail. The File API specification[2] defines four types of errors:

- NotFoundError – the file can’t be found.
- SecurityError – something about the file or the read is dangerous. The browser has some leeway as to when this occurs, but generally if the file is dangerous to load into the browser or the browser has been performing too many reads, you’ll see this error.
- NotReadableError – the file exists but can’t be read, most likely due to a permissions problem.
- EncodingError – primarily when trying to read as a data URI and the length of the resulting data URI is beyond the maximum length supported by the browser.
When a read fails, the FileReader object’s error property is assigned to be an instance of one of the above mentioned errors. At least, that’s how the spec is written. In reality, browsers at the time of writing implement this as a FileError object that has a code property indicating the type of error that has occurred. Each error type is represented by a numeric constant value:

- FileError.NOT_FOUND_ERR for file not found errors.
- FileError.SECURITY_ERR for security errors.
- FileError.NOT_READABLE_ERR for not readable errors.
- FileError.ENCODING_ERR for encoding errors.
- FileError.ABORT_ERR when abort() is called while there is no read in progress.

You can test for the type of error either during the error event or during loadend:

var reader = new FileReader();
reader.onloadend = function(event) {
    var contents = event.target.result,
        error = event.target.error;

    if (error != null) {
        switch (error.code) {
            case error.ENCODING_ERR:
                console.error("Encoding error!");
                break;

            case error.NOT_FOUND_ERR:
                console.error("File not found!");
                break;

            case error.NOT_READABLE_ERR:
                console.error("File could not be read!");
                break;

            case error.SECURITY_ERR:
                console.error("Security issue with file!");
                break;

            default:
                console.error("I have no idea what's wrong!");
        }
    } else {
        console.log("Contents: " + contents);
    }
};

reader.readAsText(file);
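In browsers that follow the specification more closely, the error object carries a name such as NotFoundError, which can be checked instead of a numeric code. The following is only a sketch under that assumption; describeReadError() and its messages are hypothetical and not part of the File API.

```javascript
// Hypothetical helper: turn a failed read's error into a readable message.
// Assumes `error` has a `name` property such as "NotFoundError", as the
// File API specification describes; anything else gets a generic message.
function describeReadError(error) {
    var messages = {
        NotFoundError: "File not found.",
        SecurityError: "Security issue with file.",
        NotReadableError: "File could not be read.",
        EncodingError: "Encoding error.",
        AbortError: "Read was aborted."
    };

    if (!error) {
        return "No error.";
    }

    if (Object.prototype.hasOwnProperty.call(messages, error.name)) {
        return messages[error.name];
    }

    return "Unknown error: " + (error.name || "unnamed");
}

// Browser usage, assuming `reader` is a FileReader:
//
//   reader.onerror = function(event) {
//       console.error(describeReadError(event.target.error));
//   };
```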
Up next
The FileReader object is a fully-featured object with a lot of functionality and a lot of similarities to XMLHttpRequest. By following these last three posts, you should now be able to read data from files using JavaScript and send that data back to the server if necessary. However, the File API ecosystem is quite a bit larger than has been already discussed in this series, and in the next part you’ll learn about a powerful new feature designed to work with files.

References
Working with files in JavaScript, Part 4: Object URLs
What is an object URL?
Object URLs are URLs that point to files on disk. Suppose, for example, that you want to display an image from the user’s system on a web page. The server never needs to know about the file, so there’s no need to upload it. You just want to load the file into a page. You could, as shown in the previous posts, get a reference to a File object, read the data into a data URI, and then assign the data URI to an <img> element. But think of all the waste: the image already exists on disk, so why read the image into another format in order to use it? If you create an object URL, you could assign that to the <img> and access that local file directly.

How does it work?
The File API[1] defines a global object called URL that has two methods. The first is createObjectURL(), which accepts a reference to a File and returns an object URL. This instructs the browser to create and manage a URL to the local file. The second method is revokeObjectURL(), which instructs the browser to destroy the URL that is passed into it, effectively freeing up memory. Of course, all object URLs are revoked once the web page is unloaded, but it’s good to free them up when they’re no longer needed anyway.

Support for the URL object isn’t as good as for other parts of the File API. As of the time of my writing, Internet Explorer 10+ and Firefox 9+ support a global URL object. Chrome supports it in the form of webkitURL while Safari and Opera have no support.

Example

So how would you display an image from disk without reading the data first? Suppose that you’ve given the user a way to select a file and now have a reference to it in a variable called file. You can then use the following:

var URL = window.URL || window.webkitURL,
    imageUrl,
    image;

if (URL) {
    imageUrl = URL.createObjectURL(file);
    image = document.createElement("img");

    image.onload = function() {
        URL.revokeObjectURL(imageUrl);
    };

    image.src = imageUrl;
    document.body.appendChild(image);
}
This example creates a local URL variable that normalizes the browser implementations. Assuming that URL is supported, the code goes on to create an object URL directly from file and stores it in imageUrl. A new <img> element is created and given an onload event handler that revokes the object URL (more on that in a minute). Then, the src property is assigned to the object URL and the element is added to the page (you may want to use an already-existing image).

Why revoke the object URL once the image is loaded? After the image is loaded, the URL is no longer needed unless you intend to reuse it with another element. In this example, the image is being loaded into a single element, and once the image has been completely loaded, the URL isn’t serving any useful purpose. That’s the perfect time to free up any memory associated with it.
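When a URL does get reused across several elements, revoking it in the first onload handler is too early. One option, sketched below, is to keep track of every URL you create and revoke them together later. The createObjectUrlPool() helper is hypothetical, and the optional urlApi parameter exists only so the sketch can run where the global URL object is prefixed or missing.

```javascript
// Hypothetical helper: remember every object URL handed out so they can all
// be revoked together (for example, when a preview panel is closed).
// `urlApi` defaults to the global URL object.
function createObjectUrlPool(urlApi) {
    var api = urlApi || URL,
        urls = [];

    return {
        // Create an object URL for a File or Blob and remember it.
        create: function (blob) {
            var url = api.createObjectURL(blob);
            urls.push(url);
            return url;
        },

        // Revoke every remembered URL; returns how many were revoked.
        revokeAll: function () {
            var count = urls.length;
            while (urls.length) {
                api.revokeObjectURL(urls.pop());
            }
            return count;
        }
    };
}

// Browser usage, assuming `file` is a File and `image` is an <img> element:
//
//   var pool = createObjectUrlPool(window.URL || window.webkitURL);
//   image.src = pool.create(file);
//   // ...later, once no element needs the URLs any more:
//   pool.revokeAll();
```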
Security and other considerations
At first glance, this capability is a bit scary. You’re actually loading a file directly from the user’s machine via a URL. There are, of course, security implications to this capability. The URL itself isn’t a big security issue because it’s a URL that’s assigned dynamically by the browser and would be useless on any other computer. What about cross-origin?

The File API disallows using object URLs on different origins. When an object URL is created, it is tied to the origin of the page in which the JavaScript executed, so you can’t use an object URL from www.wrox.com on a page at p2p.wrox.com (an error occurs). However, two pages from www.wrox.com, where one is embedded in the other with an iframe, are capable of sharing object URLs.

Object URLs exist only so long as the document that created them. When the document is unloaded, all object URLs are revoked. So, it doesn’t make sense to store object URLs in client-side data storage to use later; they are useless after the page has been unloaded.
You can use object URLs anywhere the browser would make a GET request, which includes images, scripts, web workers, style sheets, audio, and video. You can never use an object URL when the browser would perform a POST, such as within a <form> whose method is set to “post”.

Up next

The ability to create URLs that link directly to local files is a powerful one. Instead of needing to read a local file into JavaScript in order to display it on a page, you can simply create a URL and point the page to it. This process greatly simplifies the use case of including local files in a page. However, the fun of working with files in JavaScript has only just begun. In the next post, you’ll learn some interesting ways to work with file data.

References
Working with files in JavaScript, Part 5: Blobs
Up to this point, this series has focused on files specified by the user and accessed through File objects. The File object is actually a more specific version of a Blob, which represents a chunk of binary data. The size and type properties exist on Blob objects and are inherited by File.

In most cases, Blobs and Files can be used in the same places. For example, you can read from a Blob using a FileReader and you can create an object URL from a Blob using URL.createObjectURL().

Slicing
One of the interesting things you can do with Blobs (and therefore, also Files) is to create a new Blob based on a subsection of another one. Since each Blob just represents pointers to data rather than the data itself, you can quickly create new Blob objects pointing to subparts of others. This is accomplished by using the slice() method.

You may be familiar with slice() on strings and arrays, and the one for Blobs behaves in a similar manner. The method accepts three arguments: the offset of the starting byte, the offset of the ending byte, and an optional MIME type to apply to the Blob. If the MIME type isn’t specified, the specification gives the new Blob an empty type rather than the original’s, so pass the original’s type along if you want to keep it.

Browser support for slice() isn’t yet ubiquitous: at the time of writing, Firefox supports it via mozSlice() and Chrome via webkitSlice() (no other browsers support this method currently). Here’s an example:

function sliceBlob(blob, start, end, type) {

    // fall back to the original MIME type when none is given
    type = type || blob.type;

    if (blob.mozSlice) {
        return blob.mozSlice(start, end, type);
    } else if (blob.webkitSlice) {
        return blob.webkitSlice(start, end, type);
    } else {
        throw new Error("This doesn't work!");
    }
}
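Slicing is what makes chunked processing possible. Here’s a rough sketch of splitting a Blob into fixed-size pieces; it assumes a browser that provides the standard, unprefixed slice() method, and chunkBlob() is a hypothetical helper rather than anything defined by the File API.

```javascript
// Hypothetical helper: split a Blob (or File) into consecutive chunks of at
// most chunkSize bytes using the standard, unprefixed slice() method.
function chunkBlob(blob, chunkSize) {
    if (!(chunkSize > 0)) {
        throw new RangeError("chunkSize must be a positive number");
    }

    var chunks = [],
        start,
        end;

    for (start = 0; start < blob.size; start += chunkSize) {
        end = Math.min(start + chunkSize, blob.size);
        // Pass blob.type explicitly so each chunk keeps the original MIME type.
        chunks.push(blob.slice(start, end, blob.type));
    }

    return chunks;
}

// Browser usage, assuming `file` is a File: one-megabyte chunks
//
//   var chunks = chunkBlob(file, 1024 * 1024);
```

Each chunk is itself a Blob, so it can be passed to a FileReader or appended to a FormData object like any other file.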
You can then use a function like sliceBlob() to, for example, split up a large file to upload it in chunks. Each new Blob being produced is independent from the original even though the data each references overlaps. The engineers at Flickr use blob slicing to read the Exif information from photos that are uploaded[1] rather than waiting to do it on the server. When the file is selected, the Flickr upload page starts to upload the file and read the Exif information from the photo at the same time. This allows them to give a preview of the extracted metadata in the page as the file is being uploaded.

Creating Blobs the old way
Very soon after File objects started appearing in browsers, developers realized that Blob objects were actually quite powerful and so wanted to be able to create them without user interaction. After all, any data can be represented in a Blob; it doesn’t necessarily have to be tied to a file. Browsers quickly responded by creating BlobBuilder, a type whose sole purpose is to wrap some data in a Blob object. This is a non-standard type and has been implemented in Firefox (as MozBlobBuilder), Internet Explorer 10 (as MSBlobBuilder), and Chrome (as WebKitBlobBuilder).

The BlobBuilder works by creating a new instance and calling the append() method with a string, ArrayBuffer, or Blob. Once all of the data has been added, you call getBlob() and pass in an optional MIME type that should be applied to the Blob. Here’s an example:

var builder = new BlobBuilder();
builder.append("Hello world!");
var blob = builder.getBlob("text/plain");
The ability to create URLs for arbitrary pieces of data is incredibly powerful, allowing you to dynamically create objects that can be addressed as files in the browser. You could, for example, use a Blob to create a web worker without having a separate file for the worker code. This technique was written up in The Basics of Web Workers[2]:

// Prefixed in Webkit, Chrome 12, and FF6: window.WebKitBlobBuilder, window.MozBlobBuilder
var bb = new BlobBuilder();
bb.append("onmessage = function(e) { postMessage('msg from worker'); }");

// Obtain a blob URL reference to our worker 'file'.
// Note: window.webkitURL.createObjectURL() in Chrome 10+.
var blobURL = window.URL.createObjectURL(bb.getBlob());

var worker = new Worker(blobURL);
worker.onmessage = function(e) {
    // e.data == 'msg from worker'
};
// Start the worker. postMessage() needs an argument; the worker ignores its content.
worker.postMessage("");
This code creates a simple script and then creates an object URL. The object URL is assigned to a web worker in place of a script URL.

You can call append() as many times as you like, building up the contents of the Blob.

Creating Blobs the new way
Because developers kept clamoring for a way to create Blob objects directly, and browsers kept coming up with their own BlobBuilder variants, it was decided to add a Blob constructor. This constructor is now part of the specification and will be the way that Blob objects are created in the future.

The constructor accepts two arguments. The first is an array of parts to combine into a Blob. These would be the same values as passed into the append() method of BlobBuilder and can be any number of strings, Blobs, and ArrayBuffers. The second argument is an object containing properties for the newly-created Blob. There are currently two properties defined: type, which specifies the MIME type of the Blob, and endings, which can be either “transparent” (default) or “native”. Here’s an example:

var blob = new Blob(["Hello world!"], { type: "text/plain" });
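The parts array can mix all three kinds of values in one call. The short sketch below assumes an environment where the Blob constructor is available; the variable names are just for illustration.

```javascript
// Build one Blob from mixed parts: another Blob, a string, and an
// ArrayBuffer (the bytes 0x21 0x0A are "!" followed by a newline).
var greeting = new Blob(["Hello "], { type: "text/plain" });
var punctuation = new Uint8Array([0x21, 0x0a]).buffer;

var combined = new Blob([greeting, "world", punctuation], { type: "text/plain" });

// The parts are concatenated in order: 6 + 5 + 2 = 13 bytes.
console.log(combined.size, combined.type);
```

The parts are joined in the order given, which mirrors calling append() repeatedly on a BlobBuilder.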
The Blob constructor is much simpler to use than BlobBuilder.

The Blob constructor is currently in the nightly builds of Chrome and will be in Firefox 13. Other browsers have not yet announced plans to implement this constructor; however, it is now part of the File API[3] standard and is expected to be implemented universally.

Conclusion

This is the last part of the series on working with files in JavaScript. As I hope you learned, the File API is incredibly powerful and opens up entirely new ways of working with files in web applications. You no longer need to stick with plain file upload boxes when users need to upload files, and now that you can read the files in the client, that opens up all sorts of possibilities for client-side manipulation. You could resize an image that’s too large before uploading (using FileReader and <canvas>); you could create a text editor that works purely in the browser; you could split up large files to upload piece by piece. The possibilities aren’t quite endless, but are pretty damn close.

References
- Parsing Exif client-side using JavaScript by Flickr Team
- The Basics of Web Workers by Eric Bidelman
- File API – Blob Constructor