Java - Processing input and output

This presentation introduces basic concepts about the Java I/O framework based on input and output streams. These slides introduce the following concepts:

- Obtaining streams
- Reading / writing bytes
- Character encodings
- Text input / output
- Random access files
- Processing files
- URLs
- Serialization

The presentation is taken from the Java course I run in the bachelor-level informatics curriculum at the University of Padova.

  1. PROCESSING INPUT AND OUTPUT
     Programmazione concorrente e distribuita. Università degli Studi di Padova, Dipartimento di Matematica, Corso di Laurea in Informatica, academic year 2015-2016
  2. SUMMARY
     - Introduction
     - Obtaining streams
     - Reading / writing bytes
     - Character encodings
     - Text input / output
     - Random access files
     - Processing files
     - URLs
     - Serialization
     Riccardo Cardin
  3. INTRODUCTION
     - Java provides an elegant API to read and write data in binary and text format
       - The API lets you work in the same way with files, directories, web pages and so on
     - A source from which one can read bytes is called an input stream
       - Bytes can come from a file, a network connection or an array in memory
     - A destination for bytes is an output stream
     - In contrast, readers and writers consume and produce sequences of characters
  4. OBTAINING STREAMS
     - A stream can be obtained from different sources
       - From a file
       - From a URL
       - From an array of bytes
     - Once the stream is obtained, we can perform I/O from and to the source using the same API
         // From a file
         InputStream in = Files.newInputStream(path);
         OutputStream out = Files.newOutputStream(path);
         // From a URL
         URL url = new URL("");
         InputStream in = url.openStream();
         // From an array of bytes
         byte[] bytes = /* ... */;
         InputStream in = new ByteArrayInputStream(bytes);
         // toByteArray is declared by ByteArrayOutputStream, not by OutputStream
         ByteArrayOutputStream out = new ByteArrayOutputStream();
         bytes = out.toByteArray();
  5. READING BYTES
     - The type to read bytes from a source is InputStream
     - An input stream can read a single byte
       - Cast to byte only after you have checked that the value is not -1
     - Or it can read in bulk into a byte array
         InputStream in = /* ... */;
         // A byte is returned as an integer from 0 to 255; -1 is reserved
         // for the end of the input stream
         int b = in.read();
         byte[] bytes = /* ... */;
         // Returns the number of bytes actually read
         int actualBytesRead = in.read(bytes);
         // Reads into the array from position 'start', for at most
         // 'length' bytes
         actualBytesRead = in.read(bytes, start, length);
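The reading rules on this slide can be sketched with an in-memory source. The class and method names below (`ReadBytesSketch`, `readAll`, `countNegativeBytes`) are hypothetical helpers, not part of the slides, and the 4-byte buffer is deliberately tiny so that several bulk reads are needed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReadBytesSketch {
    // Reads the stream to its end in fixed-size chunks.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buffer = new byte[4];
        int n;
        while ((n = in.read(buffer)) != -1) {   // -1 signals the end of the stream
            collected.write(buffer, 0, n);      // only the first n bytes are valid
        }
        return collected.toByteArray();
    }

    // Reads one byte at a time, casting to byte only after the -1 check.
    static int countNegativeBytes(InputStream in) throws IOException {
        int count = 0;
        int b;
        while ((b = in.read()) != -1) {
            if ((byte) b < 0) {                 // values 128..255 become negative bytes
                count++;
            }
        }
        return count;
    }
}
```

Comparing the `int` result with -1 before casting matters because the byte value 255 casts to -1 and would otherwise be mistaken for the end of the stream.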
  6. WRITING BYTES
     - The type to write bytes to a target is OutputStream
     - An output stream can write a single byte or a byte array
     - When done, you have to close the streams
       - Streams implement AutoCloseable, so you can use a try-with-resources statement
         OutputStream out = /* ... */;
         int b = /* ... */;
         out.write(b);
         byte[] bytes = /* ... */;
         out.write(bytes);
         out.write(bytes, start, length);
         // The stream is closed automatically at the end of the block
         try (OutputStream out = /* ... */) {
             out.write(bytes);
         }
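A minimal sketch of the three `write` overloads inside a try-with-resources statement follows. It writes to a `ByteArrayOutputStream` so the result can be inspected, and the class name `WriteBytesSketch` is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class WriteBytesSketch {
    static byte[] writeSample() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] bytes = {10, 20, 30, 40};
        try (OutputStream out = sink) {
            out.write(65);            // a single byte
            out.write(bytes);         // the whole array
            out.write(bytes, 1, 2);   // two bytes starting at index 1
        }                             // close() runs here, even if a write throws
        return sink.toByteArray();
    }
}
```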
  8. CHARACTER ENCODINGS
     - In many cases you will interpret bytes as a sequence of characters
       - How were such characters encoded?
     - Java uses the Unicode standard for characters
       - Each character, or code point, has a 21-bit integer number
     - UTF-8 encodes each Unicode code point into a sequence of one to four bytes
       - Characters in the ASCII set are represented with only 1 byte
     - UTF-16 encodes Unicode code points into one or two 16-bit values
       - This is the encoding used in Java strings
  9. CHARACTER ENCODINGS
     - There is no reliable way to automatically detect the encoding from a stream of bytes
     - You should always specify the encoding explicitly, using the StandardCharsets class
     - Use the Charset object when reading or writing text
       - If you specify nothing, the platform default charset is used
         // Values are of type Charset
         StandardCharsets.UTF_8
         StandardCharsets.UTF_16
         StandardCharsets.UTF_16BE
         StandardCharsets.UTF_16LE
         StandardCharsets.ISO_8859_1
         StandardCharsets.US_ASCII
         String str = new String(bytes, StandardCharsets.UTF_8);
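The encoding sizes and the effect of a wrong charset can be checked with a few lines. This is only a sketch: `EncodingSketch` and its methods are hypothetical names, and the sample characters are written as Unicode escapes so the source file encoding does not matter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingSketch {
    // Number of bytes needed to encode the string in UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of Unicode code points, as opposed to UTF-16 code units (s.length()).
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Encodes with one charset and decodes with another.
    static String recode(String s, Charset writtenAs, Charset readAs) {
        return new String(s.getBytes(writtenAs), readAs);
    }
}
```

For example, `utf8Length("A")` is 1, `utf8Length("\u20AC")` (the euro sign) is 3, and `"\uD83D\uDE00"` is a single code point that occupies two UTF-16 code units and four UTF-8 bytes. Calling `recode("\u00E9", StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1)` yields two characters instead of one, which is the typical symptom of decoding with the wrong charset.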
  10. TEXT INPUT
      - To read text input use a Reader
        - Obtain a reader from an input stream using an InputStreamReader decorator
      - It is not very convenient to read one char at a time
        - If you want to read the input line by line, use the decorator class BufferedReader
          InputStream stream = /* ... */;
          // Here you can view the "onion" structure
          Reader in = new InputStreamReader(stream, charset);
          // Reads a code unit between 0 and 65535, or -1
          int ch = in.read();
          try (BufferedReader reader = new BufferedReader(
                  new InputStreamReader(stream, charset))) {
              // null is returned when the stream is done
              String line = reader.readLine();
          }
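The decorator chain on this slide is usually wrapped in a loop that stops when `readLine` returns null. The sketch below assumes UTF-8 input, and the names `ReadLinesSketch` and `readLines` are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ReadLinesSketch {
    // Collects every line of the stream; line terminators are not included.
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {   // null marks the end of input
                lines.add(line);
            }
        }
        return lines;
    }
}
```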
  11. TEXT OUTPUT
      - To write text use the class Writer
        - To turn an output stream into a writer, use the OutputStreamWriter decorator
        - The write method writes strings to the stream
      - The class PrintWriter adapts the interface of writers to the print, println and printf methods used with System.out
      - StringWriter lets you write into a String
        - You can combine it with a PrintWriter
          OutputStream stream = /* ... */;
          Writer out = new OutputStreamWriter(stream, charset);
          out.write("Any string");
          // The charset is already fixed by the OutputStreamWriter
          PrintWriter wrt = new PrintWriter(out);
          wrt.println("Any string");
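Combining a `StringWriter` with a `PrintWriter` might look like the following sketch. The class and method names are hypothetical, and `Locale.ROOT` is passed to `printf` so the formatted number does not depend on the default locale.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Locale;

final class PrintToStringSketch {
    // Builds a string with the same print/printf methods used on System.out.
    static String report(String name, int count) {
        StringWriter buffer = new StringWriter();
        try (PrintWriter out = new PrintWriter(buffer)) {
            out.print("name=");
            out.print(name);
            out.printf(Locale.ROOT, " count=%d", count);
        }
        return buffer.toString();
    }
}
```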
  13. DEALING WITH BINARY DATA
      - The DataInput type reads from a binary source
      - The DataOutput type writes in binary format
      - Dealing with binary data is useful because it is fixed in width and efficient
        - No parsing operation is needed
      - The classes DataInputStream and DataOutputStream adapt streams to these interfaces
          // Some of the methods declared by DataInput
          byte readByte();
          char readChar();
          int readInt();
          long readLong();
          // ...
          DataInput in = new DataInputStream(Files.newInputStream(path));
          DataOutput out = new DataOutputStream(Files.newOutputStream(path));
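The fixed-width property can be seen in a small in-memory round trip. The record layout used here (an int, a long and a char) and the names `BinaryRecordSketch`, `encode` and `decode` are assumptions made only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

final class BinaryRecordSketch {
    // Always produces 4 + 8 + 2 = 14 bytes, whatever the values are.
    static byte[] encode(int id, long timestamp, char grade) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(id);
            out.writeLong(timestamp);
            out.writeChar(grade);
        }
        return bytes.toByteArray();
    }

    // Values must be read back in the same order they were written.
    static String decode(byte[] record) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(record))) {
            int id = in.readInt();
            long timestamp = in.readLong();
            char grade = in.readChar();
            return id + "/" + timestamp + "/" + grade;
        }
    }
}
```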
  14. RANDOM ACCESS FILES
      - The type RandomAccessFile lets you read or write data anywhere in a file
        - It implements the DataInput / DataOutput interfaces
        - Use "r" to open in read mode, "rw" in read-write mode
      - Such a file has a pointer that indicates the position of the next byte
        - Use the seek method to position the pointer inside the file
        - Writing moves the pointer to the next sequence of bytes
          RandomAccessFile file = new RandomAccessFile(path.toString(), "rw");
          // Read the next integer in the file
          int value = file.readInt();
          // Move the pointer
          file.seek(/* ... */);
          // Write an integer to the file
          file.writeInt(value + 1);
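One way to use the file pointer is to update a value in place: read it, seek back to where it starts, and overwrite it. The sketch below assumes a file made of consecutive 4-byte integers; the class name, the method names and the temporary file are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

final class RandomAccessSketch {
    // Increments the int stored at the given index and returns the new value.
    static int incrementAt(Path path, int index) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path.toString(), "rw")) {
            long position = (long) index * Integer.BYTES;
            file.seek(position);
            int value = file.readInt();   // the pointer advances by 4 bytes
            file.seek(position);          // go back to overwrite the value just read
            file.writeInt(value + 1);
            return value + 1;
        }
    }

    // Writes 0, 10, 20 to a temporary file, then increments the last value.
    static int demo() throws IOException {
        Path tmp = Files.createTempFile("ints", ".bin");
        try {
            try (RandomAccessFile file = new RandomAccessFile(tmp.toString(), "rw")) {
                for (int i = 0; i < 3; i++) {
                    file.writeInt(i * 10);
                }
            }
            return incrementAt(tmp, 2);
        } finally {
            Files.delete(tmp);
        }
    }
}
```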
  15. FILE LOCKING
      - If more than one program tries to modify the same file, it can easily become damaged
        - The FileLock type solves the problem
      - Use the lock method to lock a file
      - Use the tryLock method to attempt to acquire the lock without blocking
        - It returns null if the lock is not available
      - Use a FileChannel to obtain the lock
      - The file remains locked until the lock or the channel is closed
        - It is best to use try-with-resources
          FileChannel channel = /* ... */;
          FileLock lock = channel.lock();
          // or
          FileLock lock = channel.tryLock();
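A possible shape for the locking advice, with both the channel and the lock managed by try-with-resources, is sketched here. Opening the channel with `FileChannel.open` and the write/append options is an assumption of this example, and `FileLockSketch` and `appendWithLock` are hypothetical names.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class FileLockSketch {
    // Appends data only if the exclusive lock can be acquired immediately.
    static boolean appendWithLock(Path path, byte[] data) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                     StandardOpenOption.WRITE, StandardOpenOption.APPEND);
             FileLock lock = channel.tryLock()) {
            if (lock == null) {
                return false;   // another program holds the lock
            }
            ByteBuffer buffer = ByteBuffer.wrap(data);
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            return true;
        }                       // the lock is released first, then the channel is closed
    }
}
```

A null resource is legal in try-with-resources and is simply skipped when the block exits, so the null check on `lock` is enough.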
  16. FILE CREATION
      - A Path is a sequence of directory names, optionally followed by a file name
        - If the first component is a root element, the path is absolute; otherwise it is relative
        - Root elements may be «/» or «C:\»
      - The Paths class is a companion class with a lot of utilities
        - An InvalidPathException is thrown if the path is not valid in the given filesystem
      - A Path does not have to correspond to a file that actually exists
          Path abs = Paths.get("/", "home", "rcardin");
          Path rel = Paths.get("home", "rcardin");
          Path another = Paths.get("/home/rcardin");
  17. FILE CREATION
      - Create a directory
        - createDirectory creates only the last part of the path
        - createDirectories creates the intermediate directories as well
      - Create a file
        - createFile throws an exception if the file already exists
        - The check for existence and the creation are a single atomic operation
          Files.createDirectory(path);
          Files.createDirectories(path);
          Files.createFile(path);
          // Checks if a file already exists
          Files.exists(path);
          Files.isDirectory(path);   // Tests if the path is a directory
          Files.isRegularFile(path); // Tests if the path is a regular file
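The creation methods can be tried safely inside a temporary directory. In this sketch the directory and file names are made up, and `CreateFilesSketch.demo` is a hypothetical helper that cleans up after itself.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

final class CreateFilesSketch {
    static String demo() throws IOException {
        Path base = Files.createTempDirectory("create-demo");
        Path nested = base.resolve("a").resolve("b");
        Files.createDirectories(nested);          // creates both a and b
        Path file = nested.resolve("data.txt");
        Files.createFile(file);
        String secondAttempt;
        try {
            Files.createFile(file);               // the file is already there
            secondAttempt = "created";
        } catch (FileAlreadyExistsException e) {
            secondAttempt = "exists";
        }
        String result = Files.isDirectory(nested) + ","
                + Files.isRegularFile(file) + "," + secondAttempt;
        // Clean up from the innermost element outwards.
        Files.delete(file);
        Files.delete(nested);
        Files.delete(base.resolve("a"));
        Files.delete(base);
        return result;
    }
}
```

Because `createFile` checks and creates atomically, catching `FileAlreadyExistsException` is more robust than calling `Files.exists` first and creating afterwards.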
  18. FILE CREATION
      - It is possible to create temporary files and folders
        - Names of temp files and folders are generated randomly
        - It is possible to give a prefix, a suffix and a target path
        - If no path is given, the default temporary folder of the operating system is used
      - The file / folder will not be deleted automatically on JVM termination
        - Call File.deleteOnExit() or specify the option StandardOpenOption.DELETE_ON_CLOSE when opening the file
          Path tmpFile = Files.createTempFile(dir, prefix, suffix);
          Path tmpDir = Files.createTempDirectory(dir, prefix);
          // DELETE_ON_CLOSE is an open option: pass it when opening the file
          OutputStream out = Files.newOutputStream(tmpFile, StandardOpenOption.DELETE_ON_CLOSE);
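The following sketch shows one way to get a temporary file that disappears when its stream is closed. It assumes the default temporary folder, and the names `TempFileSketch` and `existsAfterClose` are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class TempFileSketch {
    // Returns whether the temporary file is still present after the stream is closed.
    static boolean existsAfterClose() throws IOException {
        Path tmp = Files.createTempFile("scratch", ".tmp");   // random name in the default temp folder
        try (OutputStream out = Files.newOutputStream(tmp,
                StandardOpenOption.WRITE, StandardOpenOption.DELETE_ON_CLOSE)) {
            out.write(42);
        }
        return Files.exists(tmp);
    }
}
```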
  19. OPERATIONS ON FILES
      - The file abstractions exist to let you operate on the files they denote
      - It is possible to copy or move a file or an empty directory
        - If StandardCopyOption.REPLACE_EXISTING is not set, the above operations fail if the target exists
        - With StandardCopyOption.ATOMIC_MOVE, a move either completes or leaves the source file in place
      - It is possible to delete a file or an empty directory
          Files.copy(fromPath, toPath);
          // Moves a file or an empty directory
          Files.move(fromPath, toPath);
          Files.delete(path);
          boolean deleted = Files.deleteIfExists(path);
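The behaviour with and without `REPLACE_EXISTING` can be observed in a temporary directory. The file names and the helper `CopyMoveSketch.demo` are hypothetical; this is a sketch, not a complete treatment of the copy options.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class CopyMoveSketch {
    static String demo() throws IOException {
        Path dir = Files.createTempDirectory("copy-demo");
        Path from = dir.resolve("from.txt");
        Path to = dir.resolve("to.txt");
        Files.write(from, "new".getBytes(StandardCharsets.UTF_8));
        Files.write(to, "old".getBytes(StandardCharsets.UTF_8));
        String firstAttempt;
        try {
            Files.copy(from, to);                 // the target exists, so this fails
            firstAttempt = "copied";
        } catch (FileAlreadyExistsException e) {
            firstAttempt = "refused";
        }
        Files.copy(from, to, StandardCopyOption.REPLACE_EXISTING);
        String content = new String(Files.readAllBytes(to), StandardCharsets.UTF_8);
        Files.delete(from);
        boolean deletedAgain = Files.deleteIfExists(from);   // nothing left to delete
        Files.delete(to);
        Files.delete(dir);
        return firstAttempt + "," + content + "," + deletedAgain;
    }
}
```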
  20. OPERATIONS ON FILES
      - It is possible to walk through a file tree
        - Use the walkFileTree method
      - The FileVisitor interface defines methods that are called before, during and after a directory visit
        - The class SimpleFileVisitor gives a default implementation
      - The FileVisitResult defines how to continue the walking process
          // The second argument is an implementation of FileVisitor
          Files.walkFileTree(startingDir, fv);
          // Methods declared by FileVisitor<T>
          FileVisitResult postVisitDirectory(T dir, IOException exc);
          FileVisitResult preVisitDirectory(T dir, BasicFileAttributes attrs);
          FileVisitResult visitFile(T file, BasicFileAttributes attrs);
          FileVisitResult visitFileFailed(T file, IOException exc);
  21. OPERATIONS ON FILES
          public static class PrintFiles extends SimpleFileVisitor<Path> {
              // Print information about each type of file.
              @Override
              public FileVisitResult visitFile(Path file, BasicFileAttributes attr) {
                  if (attr.isSymbolicLink()) {
                      System.out.format("Symbolic link: %s%n", file);
                  } else if (attr.isRegularFile()) {
                      System.out.format("Regular file: %s%n", file);
                  } else {
                      System.out.format("Other: %s%n", file);
                  }
                  return FileVisitResult.CONTINUE;
              }
              // Print each directory visited.
              @Override
              public FileVisitResult postVisitDirectory(Path dir, IOException exc) {
                  System.out.format("Directory: %s%n", dir);
                  return FileVisitResult.CONTINUE;
              }
          }
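Since files and directories can only be deleted one at a time, and directories only when empty, a file visitor is a natural fit for removing a whole tree. The sketch below is one possible approach; `TreeWalkSketch`, `deleteTree` and the sample tree built in `demo` are hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

final class TreeWalkSketch {
    // Deletes a directory tree and returns the number of regular files removed.
    static int deleteTree(Path root) throws IOException {
        int[] regularFiles = {0};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                if (attrs.isRegularFile()) {
                    regularFiles[0]++;
                }
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc;
                }
                Files.delete(dir);   // every entry has been visited, so it is empty now
                return FileVisitResult.CONTINUE;
            }
        });
        return regularFiles[0];
    }

    // Builds root/a.txt, root/sub/b.txt and root/sub/c.txt, then deletes the tree.
    static String demo() throws IOException {
        Path root = Files.createTempDirectory("walk-demo");
        Path sub = Files.createDirectory(root.resolve("sub"));
        Files.createFile(root.resolve("a.txt"));
        Files.createFile(sub.resolve("b.txt"));
        Files.createFile(sub.resolve("c.txt"));
        int removed = deleteTree(root);
        return removed + "," + Files.exists(root);
    }
}
```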
  23. URL CONNECTIONS
      - The simplest way to read from a URL is to get an input stream from it
      - Using URLConnection it is possible to retrieve additional information about the resource and to write to it
        - This type has a subtype for each kind of URL
        - Send data to the server using an output stream
          // Opens an input stream on the URL
          InputStream in = url.openStream();
          // From an HTTP URL returns an HttpURLConnection
          URLConnection con = url.openConnection();
          con.setDoOutput(true);
          try (OutputStream out = con.getOutputStream()) {
              // Write to out
          }
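Reading from a URL uses the same stream API as any other source. To keep the sketch runnable without a network, it reads from a `file:` URL that points to a temporary file; that choice, the UTF-8 assumption and the names `UrlReadSketch`, `readAll` and `demo` are all assumptions of this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class UrlReadSketch {
    // Reads the whole resource behind the URL as UTF-8 text.
    static String readAll(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            ByteArrayOutputStream collected = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                collected.write(buffer, 0, n);
            }
            return new String(collected.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // Uses a file: URL so that no server is needed.
    static String demo() throws IOException {
        Path tmp = Files.createTempFile("page", ".txt");
        try {
            Files.write(tmp, "hello from a URL".getBytes(StandardCharsets.UTF_8));
            return readAll(tmp.toUri().toURL());
        } finally {
            Files.delete(tmp);
        }
    }
}
```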
  24. SERIALIZATION
      - Representation of an object as a sequence of bytes, which includes its data as well as its type
        - The process is JVM independent
      - The type must implement Serializable
        - It is a marker interface with no methods
      - All fields in the class must be serializable. If a field is not, it must be marked as transient
        - Transient fields are not serialized and are lost intentionally
        - Marking a static field transient has no effect, because static fields are never serialized; a transient field should not be final, because it cannot be reassigned after deserialization
          public class Employee implements Serializable {
              private String name;
              private double salary;
              // ...
          }
  25. SERIALIZATION
      - The serialization process uses the stream API
      - To serialize objects, use an ObjectOutputStream
        - The writeObject method serializes the object
      - To deserialize objects, use an ObjectInputStream
        - The readObject method does the magic
      - The serialization process is recursive on object attributes
        - The name of the class and the names / values of the attributes are saved
          // Serialize
          ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(path));
          Employee paul = new Employee("Paul", 29000D);
          out.writeObject(paul);
          // Deserialize
          ObjectInputStream in = new ObjectInputStream(Files.newInputStream(path));
          Employee copy = (Employee) in.readObject();
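A complete round trip, including the fate of a transient field, can be sketched in memory. The `Employee` class below is a stand-in written for this example (the slides do not show its full body), and the `sessionToken` field is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SerializationSketch {
    static class Employee implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String name;
        private final double salary;
        private transient String sessionToken = "secret";   // not written to the stream

        Employee(String name, double salary) {
            this.name = name;
            this.salary = salary;
        }

        String describe() {
            return name + "/" + salary + "/" + sessionToken;
        }
    }

    static byte[] toBytes(Object object) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        }
        return bytes.toByteArray();
    }

    static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    static String demo() throws IOException, ClassNotFoundException {
        Employee paul = new Employee("Paul", 29000D);
        Employee copy = (Employee) fromBytes(toBytes(paul));
        return copy.describe();
    }
}
```

The copy comes back with a null `sessionToken`: deserialization does not run the constructor or the field initializers of a serializable class, so a transient field gets its default value.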
  26. SERIALIZATION
      - Serialization works fine when the same object is referenced more than once
        - Each object gets a serial number when it is saved
        - An ObjectOutputStream checks whether an object was previously written. In that case, it just writes out its serial number
        - The ObjectInputStream works conversely
      - It is possible to tweak the serialization process
        - Define the readObject and writeObject methods in the class
        - Use defaultWriteObject and defaultReadObject on the streams
          // Used to deserialize an object
          private void readObject(ObjectInputStream in)
              throws IOException, ClassNotFoundException;
          // Used to serialize an object
          private void writeObject(ObjectOutputStream out)
              throws IOException;
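Both points on this slide, shared references and custom `readObject` / `writeObject` methods, fit in one small sketch. The `Name` class and its cached field are hypothetical and exist only to give the custom methods something to do.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class CustomSerializationSketch {
    static class Name implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String value;
        private transient int cachedLength;   // derived data, rebuilt on read

        Name(String value) {
            this.value = value;
            this.cachedLength = value.length();
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();          // writes the non-transient fields
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();            // restores 'value'
            cachedLength = value.length();     // recomputes the transient field
        }

        int cachedLength() {
            return cachedLength;
        }
    }

    static Object roundTrip(Object object) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    static String demo() throws IOException, ClassNotFoundException {
        Name paul = new Name("Paul");
        Name[] pair = {paul, paul};            // the same object referenced twice
        Name[] copy = (Name[]) roundTrip(pair);
        return copy[0].cachedLength() + "," + (copy[0] == copy[1]);
    }
}
```

After the round trip the two array slots still point to one object, because the second occurrence was written as a back reference rather than as a second copy.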
  27. SERIALIZATION
      - Versioning
        - If you use serialization for long-term persistence, you need to consider what happens when classes evolve
        - How do we deserialize objects into new versions of a class?
      - Serialization supports versioning
        - Assign a serialVersionUID to a Serializable class. The version UID is written with the other information
        - When the class changes in an incompatible way, modify the version UID
        - If the UIDs are different, an InvalidClassException is thrown
        - If none is declared, a default UID is computed from a hash of the class structure (name, interfaces, fields and methods)
          private static final long serialVersionUID = 1L;
