This document discusses Java input/output (I/O) streams and readers/writers for processing files, URLs, and other sources of input and output. It covers obtaining and working with input and output streams, reading and writing bytes, character encodings, text I/O, random access files, file operations, URL connections, and object serialization. The key classes for I/O include InputStream, OutputStream, Reader, Writer, File, Path, and URLConnection.
1. PROCESSING INPUT AND OUTPUT
CONCURRENT AND DISTRIBUTED PROGRAMMING
Università degli Studi di Padova
Department of Mathematics
Bachelor's Degree in Computer Science, A.Y. 2015 – 2016
rcardin@math.unipd.it
SUMMARY
Introduction
Obtaining streams
Reading / writing bytes
Character encodings
Text input / output
Random access files
Processing files
URLs
Serialization
Riccardo Cardin
INTRODUCTION
Java provides an elegant API to read and write
data in binary and text format
The API lets you work in the same way with files,
directories, web pages and so on
A source from which one can read bytes is called an
input stream
Bytes can come from a file, a network connection or an array
in memory
A destination for bytes is called an output stream
In contrast, readers and writers consume and produce
sequences of characters
OBTAINING STREAMS
A stream can be obtained from different sources
From a file
From a URL
From an array of bytes
Once the stream is obtained we can perform I/O from and to
the source using the same API
// From a file
InputStream in = Files.newInputStream(path);
OutputStream out = Files.newOutputStream(path);
// From a URL
URL url = new URL("http://rcardin.github.io");
InputStream in = url.openStream();
// From an array of bytes in memory
byte[] bytes = /* ... */;
InputStream in = new ByteArrayInputStream(bytes);
// toByteArray() is declared by ByteArrayOutputStream, not by OutputStream
ByteArrayOutputStream out = new ByteArrayOutputStream();
bytes = out.toByteArray();
READING BYTES
The type to read bytes from a source is
java.io.InputStream
An input stream can read a single byte
Cast to byte only after you’ve checked that it is not -1
Or it can read in bulk into a byte array
InputStream in = /* ... */;
// read() returns the next byte as an int from 0 to 255, or -1 at
// the end of the input stream
int b = in.read();
byte[] bytes = /* ... */;
// Returns the number of bytes actually read, or -1 at the end
int actualBytesRead = in.read(bytes);
// Reads at most 'length' bytes into the array, starting at
// index 'start'
actualBytesRead = in.read(bytes, start, length);
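The two reading styles above can be sketched as loops that stop when read returns -1. This is a minimal sketch: the class ReadBytesSketch and its method names are hypothetical, and an in-memory ByteArrayInputStream is assumed as the source so the code runs without any file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReadBytesSketch {
    // Sums all bytes of the stream, reading one byte at a time.
    static long sumSingleBytes(InputStream in) throws IOException {
        long sum = 0;
        int b;
        while ((b = in.read()) != -1) { // -1 signals the end of the stream
            sum += b;                   // b is in the range 0..255 here
        }
        return sum;
    }

    // Counts the bytes of the stream using bulk reads into a small buffer.
    static long countBulk(InputStream in) throws IOException {
        byte[] buffer = new byte[4];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n; // only the first n entries of buffer are valid
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, (byte) 200};
        System.out.println(sumSingleBytes(new ByteArrayInputStream(data)));
        System.out.println(countBulk(new ByteArrayInputStream(data)));
    }
}
```

Note that (byte) 200 is stored as a negative byte, but read() hands it back as the int 200, which is why the cast to byte should happen only after the -1 check.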
WRITING BYTES
The type to write bytes to a target is
java.io.OutputStream
An output stream can write a single byte or a byte array
When done, you have to close streams
Streams implement AutoCloseable, so you can use a
try-with-resources statement
OutputStream out = /* ... */;
int b = /* ... */;
out.write(b);
byte[] bytes = /* ... */;
out.write(bytes);
out.write(bytes, start, length);
// The stream is closed automatically at the end of the block
try (OutputStream out = /* ... */) {
    out.write(bytes);
}
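Reading and writing are often combined in a copy loop. The following is a sketch under stated assumptions: CopyStreamSketch, copy and copyFile are hypothetical names, and the 8192-byte buffer size is an arbitrary choice.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class CopyStreamSketch {
    // Copies everything from in to out and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // write only the bytes actually read
            total += n;
        }
        return total;
    }

    // Copies a file; both streams are closed even if an exception is thrown.
    static long copyFile(Path source, Path target) throws IOException {
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            return copy(in, out);
        }
    }
}
```

Declaring both streams in the same try-with-resources header closes them in reverse order of declaration, so no explicit finally block is needed.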
CHARACTER ENCODINGS
In many cases you will interpret bytes as a
sequence of characters
How were such characters encoded?
Java uses the Unicode standard for characters
Each character, or code point, is identified by a 21-bit integer
UTF-8 encodes each Unicode code point into a
sequence of one to four bytes
Characters in the ASCII set are represented with only 1 byte
UTF-16 encodes Unicode code points into one or two
16-bit values
This is the encoding used by Java strings
CHARACTER ENCODINGS
There is no reliable way to automatically detect
the encoding from a stream of bytes
You should always explicitly specify the encoding
using the StandardCharsets class
Use the Charset object when reading or writing text
If you specify nothing, the platform default charset is used
// Values are of type Charset
StandardCharsets.UTF_8
StandardCharsets.UTF_16
StandardCharsets.UTF_16BE
StandardCharsets.UTF_16LE
StandardCharsets.ISO_8859_1
StandardCharsets.US_ASCII
String str = new String(bytes, StandardCharsets.UTF_8);
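The variable widths of UTF-8 and UTF-16, and the effect of decoding with the wrong charset, can be checked with a small sketch. EncodingSketch and its methods are hypothetical names; the sample characters are written as Unicode escapes so the source file encoding does not matter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingSketch {
    // Number of bytes needed to encode the string in the given charset.
    static int encodedLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    // Encodes as UTF-8, then (wrongly) decodes the bytes as ISO-8859-1.
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String euro = "\u20AC";        // EURO SIGN, U+20AC
        String smile = "\uD83D\uDE00"; // U+1F600 as a UTF-16 surrogate pair
        System.out.println(encodedLength("A", StandardCharsets.UTF_8));   // 1
        System.out.println(encodedLength(euro, StandardCharsets.UTF_8));  // 3
        System.out.println(encodedLength(smile, StandardCharsets.UTF_8)); // 4
        System.out.println(smile.length());                               // 2
        System.out.println(smile.codePointCount(0, smile.length()));      // 1
    }
}
```

The last two lines show why String.length() counts UTF-16 code units rather than characters: a code point above U+FFFF occupies two 16-bit values.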
TEXT INPUT
To read text input use a Reader
Obtain a reader from an input stream using an
InputStreamReader decorator
It is not very convenient to read one char at a time
If you want to read input line by line, use the
decorator class BufferedReader
InputStream stream = /* ... */;
// Here you can view the "onion" structure
Reader in = new InputStreamReader(stream, charset);
// Reads a UTF-16 code unit between 0 and 65535, or -1
int ch = in.read();
try (BufferedReader reader =
        new BufferedReader(new InputStreamReader(stream, charset))) {
    // A null is returned when the stream is done
    String line = reader.readLine();
}
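Reading a whole input line by line usually means looping until readLine returns null. A minimal sketch, assuming UTF-8 input; ReadLinesSketch and readLines are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ReadLinesSketch {
    // Decodes the stream as UTF-8 and returns its lines without terminators.
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) { // null means end of input
                lines.add(line);
            }
        }
        return lines;
    }
}
```

Closing the outermost decorator also closes the wrapped InputStreamReader and the underlying stream.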
TEXT OUTPUT
To write text use the class Writer
To turn an output stream into a writer use the
OutputStreamWriter decorator
The write method writes strings to the stream
Class PrintWriter adapts the interface of writers to
the print, println and printf used with System.out
StringWriter lets you collect the written output into a String
You can combine it with a PrintWriter
OutputStream stream = /* ... */;
Writer out = new OutputStreamWriter(stream, charset);
out.write("Any string");
// The encoding is already fixed by the OutputStreamWriter
PrintWriter wrt = new PrintWriter(out);
wrt.println("Any string");
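The combination of StringWriter and PrintWriter mentioned above can be sketched as follows. TextOutputSketch and render are hypothetical names, and Locale.ROOT is passed to printf only to keep the number formatting independent of the machine's locale.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Locale;

final class TextOutputSketch {
    // Formats a two-line report into a String instead of a file or console.
    static String render(String name, int count) {
        StringWriter buffer = new StringWriter();
        try (PrintWriter out = new PrintWriter(buffer)) {
            out.printf(Locale.ROOT, "%s: %d", name, count);
            out.println();      // writes the platform line separator
            out.print("done");
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("items", 3));
    }
}
```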
DEALING WITH BINARY DATA
The DataInput type reads from a binary source
The DataOutput type writes in binary format
Dealing with binary data is useful because it is fixed in
width and efficient
No parsing operation is needed
The classes DataInputStream and
DataOutputStream adapt streams to these interfaces
byte readByte();
char readChar();
int readInt();
long readLong();
// ...
DataInput in = new DataInputStream(Files.newInputStream(path));
DataOutput out = new DataOutputStream(Files.newOutputStream(path));
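A round trip through DataOutputStream and DataInputStream shows the fixed-width property. This is a sketch with hypothetical names (BinaryDataSketch, encode, decode), using in-memory byte arrays instead of files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

final class BinaryDataSketch {
    // Writes an int followed by a double in binary format.
    static byte[] encode(int id, double price) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(id);       // always 4 bytes
            out.writeDouble(price); // always 8 bytes
        }
        return bytes.toByteArray();
    }

    // Reads the values back in the same order they were written.
    static String decode(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int id = in.readInt();
            double price = in.readDouble();
            return id + ":" + price;
        }
    }
}
```

The reader must know the layout in advance: nothing in the 12 bytes says which of them form the int and which the double.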
RANDOM ACCESS FILES
The type RandomAccessFile lets you read or
write data anywhere in a file
Implements DataInput / DataOutput interfaces
Use "r" to open in read mode, "rw" in read-write
Such a file has a pointer that indicates the position of
the next byte
Use the seek method to position the pointer inside the file
RandomAccessFile file = new RandomAccessFile(path.toString(), "rw");
// Read the next integer in the file
int value = file.readInt();
// Move the pointer back to the start of that integer (4 bytes)
file.seek(file.getFilePointer() - 4);
// Write an integer to the file; writing moves the pointer
// to the next sequence of bytes
file.writeInt(value + 1);
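The read, seek back, write pattern generalizes to updating the n-th integer of a file in place. A minimal sketch, assuming the file holds a sequence of 4-byte ints; RandomAccessSketch and incrementAt are hypothetical names.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

final class RandomAccessSketch {
    // Increments the int stored at the given index and returns the new value.
    static int incrementAt(Path path, int index) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw")) {
            file.seek(4L * index);                // each int occupies 4 bytes
            int value = file.readInt();           // pointer is now past the int
            file.seek(file.getFilePointer() - 4); // step back over it
            file.writeInt(value + 1);
            return value + 1;
        }
    }
}
```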
FILE LOCKING
If more than one program tries to modify the same
file, it can easily become damaged
The FileLock solves the problem
Use the lock method to lock a file; it blocks until the lock is available
Use the tryLock method to try to acquire the lock without blocking
Returns null if the lock is not available
Use a FileChannel to obtain the lock
The file remains locked until the lock or the channel is closed
It is best to use try-with-resources
// An exclusive lock requires a channel opened for writing
FileChannel channel = FileChannel.open(path, StandardOpenOption.WRITE);
FileLock lock = channel.lock();
// or
FileLock lock = channel.tryLock();
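The advice to use try-with-resources can be sketched like this. FileLockSketch and appendLocked are hypothetical names; the sketch assumes that giving up when the lock is unavailable is acceptable, and it does not handle the OverlappingFileLockException thrown when the same JVM already holds a lock on the file.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class FileLockSketch {
    // Appends text while holding an exclusive lock.
    // Returns false if another program currently holds the lock.
    static boolean appendLocked(Path path, String text) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
             FileLock lock = channel.tryLock()) { // a null resource is simply not closed
            if (lock == null) {
                return false;
            }
            ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            return true;
        }
    }
}
```

Resources are closed in reverse order, so the lock is released before the channel is closed.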
FILE CREATION
A Path is a sequence of directory names,
optionally followed by a file name
If the first component is a root element, then the
path is absolute, otherwise it is relative
Root elements may be «/» or «C:»
The Paths class is a companion class with a lot of utilities
An InvalidPathException will be thrown if the path is
not valid in the given filesystem
A Path does not have to correspond to a file that
actually exists
Path abs = Paths.get("/", "home", "rcardin");
Path rel = Paths.get("home", "rcardin");
Path another = Paths.get("/home/rcardin");
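That a Path is only a name, independent of what exists on disk, can be seen in a short sketch. PathSketch and describe are hypothetical names.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class PathSketch {
    // Summarizes a path without requiring that it exists on disk.
    static String describe(Path path) {
        String kind = path.isAbsolute() ? "absolute" : "relative";
        return kind + ", " + path.getNameCount() + " names, exists=" + Files.exists(path);
    }

    public static void main(String[] args) {
        System.out.println(describe(Paths.get("home", "rcardin")));
        System.out.println(describe(Paths.get("home").resolve("rcardin")));
    }
}
```

Both calls in main build the same relative path, once from components and once with resolve.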
FILE CREATION
Create a directory
Create only the last part of the path
Create intermediate directories as well
Create a file
Throws an exception if the file already exists
Checking for existence and creating the file are one atomic operation
Files.createDirectory(path);
Files.createDirectories(path);
Files.createFile(path);
// Checks if a file already exists
Files.exists(path);
Files.isDirectory(path); // Test if the path is a directory
Files.isRegularFile(path); // Test if the path is a regular file
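Because createFile checks and creates atomically, catching its exception is safer than calling exists first. A minimal sketch with hypothetical names (CreateSketch, createIfAbsent):

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

final class CreateSketch {
    // Creates missing parent directories, then the file itself.
    // Returns false if the file already existed.
    static boolean createIfAbsent(Path file) throws IOException {
        Path parent = file.getParent();
        if (parent != null) {
            Files.createDirectories(parent); // no error if they already exist
        }
        try {
            Files.createFile(file);          // check and creation are atomic
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }
}
```

A separate Files.exists check followed by createFile would leave a window in which another program could create the file.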
FILE CREATION
It’s possible to create temporary files / folders
Names of temp files and folders are generated
randomly
It is possible to give a prefix, a suffix and a target path
If no path is given, the default OS temporary folder is used
The file / folder will not be deleted automatically on
JVM termination
Call deleteOnExit() on the corresponding File, or open the file
with the option StandardOpenOption.DELETE_ON_CLOSE
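Path tempFile = Files.createTempFile(dir, prefix, suffix);
Path tempDir = Files.createTempDirectory(dir, prefix);
// DELETE_ON_CLOSE is an open option, not a creation attribute:
// the file is deleted when the stream is closed
OutputStream out = Files.newOutputStream(tempFile,
        StandardOpenOption.DELETE_ON_CLOSE);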
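The DELETE_ON_CLOSE option can be sketched end to end. TempFileSketch and writeScratch are hypothetical names, as are the "scratch-" prefix and ".tmp" suffix; the default temporary folder is assumed to be writable.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class TempFileSketch {
    // Writes data to a temporary file that is removed when the stream is closed.
    static Path writeScratch(byte[] data) throws IOException {
        Path temp = Files.createTempFile("scratch-", ".tmp");
        try (OutputStream out = Files.newOutputStream(temp,
                StandardOpenOption.DELETE_ON_CLOSE)) {
            out.write(data);
        }
        return temp; // the path object survives, the file does not
    }
}
```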
OPERATIONS ON FILES
The Files class provides the operations that act on
the files and directories that paths refer to
It’s possible to copy or move a file or an empty dir.
If StandardCopyOption.REPLACE_EXISTING is
not set, the above operations will fail if the target exists
With StandardCopyOption.ATOMIC_MOVE, a move either completes
or leaves the source file in place if the operation fails
It’s possible to delete a file or an empty directory
Files.copy(fromPath, toPath);
// Moves a file or an empty directory
Files.move(fromPath, toPath);
Files.delete(path);
boolean deleted = Files.deleteIfExists(path);
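The effect of REPLACE_EXISTING can be sketched with two small helpers. CopyMoveSketch, copyIfAbsent and copyReplacing are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class CopyMoveSketch {
    // Copies without options: fails if the target already exists.
    static boolean copyIfAbsent(Path from, Path to) throws IOException {
        try {
            Files.copy(from, to);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // the target is left untouched
        }
    }

    // Copies and overwrites an existing target.
    static void copyReplacing(Path from, Path to) throws IOException {
        Files.copy(from, to, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Files.move accepts the same option, plus StandardCopyOption.ATOMIC_MOVE where the file system supports it.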
OPERATIONS ON FILES
It is possible to walk through a file tree
Use the walkFileTree method
The FileVisitor interface defines methods that are
called before, during and after a directory visit
The class SimpleFileVisitor gives a default
implementation
The FileVisitResult defines how to continue the walking
process
// The second argument is an implementation of FileVisitor
Files.walkFileTree(startingDir, fv);
// Methods of FileVisitor<T>, in the order they are called
FileVisitResult preVisitDirectory(T dir, BasicFileAttributes attrs)
FileVisitResult visitFile(T file, BasicFileAttributes attrs)
FileVisitResult visitFileFailed(T file, IOException exc)
FileVisitResult postVisitDirectory(T dir, IOException exc)
OPERATIONS ON FILES
public static class PrintFiles extends SimpleFileVisitor<Path> {
    // Print information about each type of file.
    @Override
    public FileVisitResult visitFile(Path file,
            BasicFileAttributes attr) {
        if (attr.isSymbolicLink()) {
            System.out.format("Symbolic link: %s ", file);
        } else if (attr.isRegularFile()) {
            System.out.format("Regular file: %s ", file);
        } else {
            System.out.format("Other: %s ", file);
        }
        return FileVisitResult.CONTINUE;
    }
    // Print each directory visited.
    @Override
    public FileVisitResult postVisitDirectory(Path dir,
            IOException exc) {
        System.out.format("Directory: %s%n", dir);
        return FileVisitResult.CONTINUE;
    }
}
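Since only empty directories can be deleted, a common use of walkFileTree is removing a whole tree bottom-up. The following is a sketch, not a hardened utility: DeleteTreeSketch and deleteTree are hypothetical names, and symbolic links are left to the default behavior of not being followed.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

final class DeleteTreeSketch {
    // Deletes root and everything below it; returns the number of entries removed.
    static int deleteTree(Path root) throws IOException {
        int[] deleted = {0};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                deleted[0]++;
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc;     // listing the directory failed
                }
                Files.delete(dir); // its entries were already visited and deleted
                deleted[0]++;
                return FileVisitResult.CONTINUE;
            }
        });
        return deleted[0];
    }
}
```

The directory is deleted in postVisitDirectory rather than preVisitDirectory because it is empty only after all of its entries have been visited.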
URL CONNECTIONS
The simplest method to read from a URL is to
get an input stream from it
Using URLConnection it is possible to retrieve
additional information about the resource and to write to it
URLConnection has a subtype for each kind of URL (e.g., HttpURLConnection)
Send data to the server using an output stream
// Opens an input stream on the URL
InputStream in = url.openStream();
// From an HTTP URL returns an HttpURLConnection
URLConnection con = url.openConnection();
con.setDoOutput(true);
try (OutputStream out = con.getOutputStream()) {
// Write to out
}
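Reading through a URLConnection follows the same stream API. This is a minimal sketch with hypothetical names (UrlConnectionSketch, fetch); it assumes the content is UTF-8 text, and it works with any readable URL, including a file: URL, which makes it easy to try without a network.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

final class UrlConnectionSketch {
    // Opens a connection to the URL and returns its content as a string.
    static String fetch(URL url) throws IOException {
        URLConnection con = url.openConnection();
        try (InputStream in = con.getInputStream()) {
            ByteArrayOutputStream content = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                content.write(buffer, 0, n);
            }
            // Assumption: the resource is UTF-8 encoded text
            return new String(content.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

A local file can be passed as path.toUri().toURL(). For an HTTP URL, the returned connection can be cast to HttpURLConnection to inspect details such as the response code.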
SERIALIZATION
Representation of an object as a sequence of
bytes, that includes data as well as type
The process is JVM independent
The type must implement java.io.Serializable
Marker interface with no methods
All fields in the class must be serializable. If a field is not, it
must be marked as transient
Transient fields are not serialized and are lost intentionally
Static fields are never serialized, so transient is meaningful only for instance fields
public class Employee implements Serializable {
private String name;
private double salary;
// ...
}
SERIALIZATION
Serialization process uses the stream API
To serialize objects, use an ObjectOutputStream
The writeObject method serializes the object
To deserialize objects, use an ObjectInputStream
The readObject method reads the object back from the stream
Serialization process is recursive on object attributes
Name of class and name / values of attributes are saved
try (ObjectOutputStream out =
        new ObjectOutputStream(Files.newOutputStream(path))) {
    Employee paul = new Employee("Paul", 29000D);
    out.writeObject(paul);
}
try (ObjectInputStream in =
        new ObjectInputStream(Files.newInputStream(path))) {
    Employee paul = (Employee) in.readObject();
}
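A full round trip, including what happens to a transient field, can be sketched in memory. SerializationSketch, toBytes and fromBytes are hypothetical names, and the note field is added to the Employee class here purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SerializationSketch {
    static class Employee implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String name;
        private final double salary;
        private transient String note = "scratch"; // skipped by serialization

        Employee(String name, double salary) {
            this.name = name;
            this.salary = salary;
        }
        String getName() { return name; }
        double getSalary() { return salary; }
        String getNote() { return note; }
    }

    // Serializes an object into a byte array.
    static byte[] toBytes(Serializable object) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        }
        return bytes.toByteArray();
    }

    // Rebuilds an object from bytes produced by toBytes.
    static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }
}
```

After the round trip the copy is a distinct object with the same name and salary, while note is null: deserialization does not run the constructor or the field initializer of a Serializable class.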
SERIALIZATION
Serialization works fine with objects that are referenced more than once
Each object gets a serial number when it is saved
An ObjectOutputStream checks if the same object was previously
written. In that case, it just writes out
the serial number
The ObjectInputStream works conversely
It is possible to tweak the serialization process
Define private readObject and writeObject methods in the class
Use defaultWriteObject and defaultReadObject on streams
// Used to deserialize an object
private void readObject(ObjectInputStream in)
throws IOException, ClassNotFoundException
// Used to serialize an object
private void writeObject(ObjectOutputStream out) throws IOException
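Both points, shared references and custom readObject / writeObject, can be sketched together. All names here (CustomSerializationSketch, Reading, roundTrip) are hypothetical; the transient fahrenheit field stands in for any derived value that is cheaper to recompute than to store.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class CustomSerializationSketch {
    static class Reading implements Serializable {
        private static final long serialVersionUID = 1L;
        final double celsius;
        transient double fahrenheit; // derived value, not written to the stream

        Reading(double celsius) {
            this.celsius = celsius;
            this.fahrenheit = toFahrenheit(celsius);
        }

        private static double toFahrenheit(double c) {
            return c * 9 / 5 + 32;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject(); // writes the non-transient fields
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();   // restores celsius
            fahrenheit = toFahrenheit(celsius); // recompute the derived field
        }
    }

    // Serializes and immediately deserializes an object in memory.
    static Object roundTrip(Object object) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }
}
```

If an array holds the same Reading twice, the deserialized array again holds one object twice, not two equal copies, because the second occurrence was written as a serial number only.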
SERIALIZATION
Versioning
If you use serialization for long term persistence, you
need to consider what happens when classes evolve
How do we deserialize objects into new versions of a class?
Serialization supports versioning
Assign a serialVersionUID to a Serializable class. The
version uid is written with other information
When the class changes in an incompatible way, modify the
version uid
If uids are different, an InvalidClassException is thrown
If none is declared, a default uid is computed from a hash of the class structure
private static final long serialVersionUID = 1L;
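The uid that serialization will use for a class can be inspected with ObjectStreamClass. A minimal sketch; VersionUidSketch, Versioned and uidOf are hypothetical names, and the value 42L is arbitrary.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

final class VersionUidSketch {
    static class Versioned implements Serializable {
        private static final long serialVersionUID = 42L;
        int value;
    }

    // Returns the serialVersionUID that serialization associates with the type.
    static long uidOf(Class<?> type) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        if (descriptor == null) { // lookup returns null for non-serializable classes
            throw new IllegalArgumentException(type + " is not serializable");
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Versioned.class));
    }
}
```

For a class without a declared serialVersionUID, the same call returns the computed default, which can change when the class structure changes.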
REFERENCES
Cay Horstmann, Core Java for the Impatient, Chap. 9 «Processing Input and Output», Addison-Wesley, 2015
Walking the File Tree, https://docs.oracle.com/javase/tutorial/essential/io/walk.html