Implementing the Auphonic Web Application Programming Interface

André Rattinger

Bachelor Thesis

Institute of Information Systems and Computer Media
Graz University of Technology

Advisor

Univ.-Doz. Dipl.-Ing. Dr. techn. Martin Ebner

Contents

1 Abstract

2 Zusammenfassung (Summary)

3 Introduction
3.1 Web Application Programming Interfaces (APIs)
3.2 Auphonic & API Features

4 Libraries
4.1 Django
4.2 Piston & PBS Education

5 Design & Implementation
5.1 General Architecture
5.2 RESTful Web Services
5.3 Design of Uniform Resource Identifiers (URI)
5.4 Data exchange formats
5.5 Versioning

6 Authentication
6.1 HTTP Basic
6.2 OAuth 1.0
6.3 OAuth 2.0

7 Auphonic Simple API

8 Auphonic API

9 Details of Productions and Presets
9.1 Creating a Production with Detailed Audio Metadata
9.2 Output Files
9.3 Outgoing File Transfers
9.4 Audio Algorithms
9.5 Creation of Presets

10 Conclusion and future work

1 | Abstract

Application Programming Interfaces (APIs) [C2] allow automated communication and data transfer between different entities on the web. This enables a new range of possibilities for community interaction, customization and contribution. Over the past years, most of the major online service providers, such as Google, Facebook and other platforms, have implemented an Application Programming Interface. A vast number of different approaches to design, techniques and protocols originated from this trend, such as Representational State Transfer (REST) [C1] or the Simple Object Access Protocol (SOAP).
Auphonic is an audio post-production web service that automatically applies different audio algorithms, sets metadata and deploys the content to several other platforms. An API is used to enable users to write customized scripts or to integrate Auphonic seamlessly into their systems.
This thesis presents the challenges that arise when creating a scalable interface for an existing web service, and shares practical solutions to these problems. It highlights the design concepts that need to be considered, as well as the technologies used, their alternatives, and implementation details of the Auphonic API.


2 | Zusammenfassung (Summary)

Creating application programming interfaces (APIs) for web applications opens up the possibility of simple data exchange between different entities on the web. This approach enables contributions, customizations and new ways of interacting with users or external developers. In recent years, the majority of online service providers, such as Google, Facebook and other platforms, have implemented a programming interface to extend their functionality. This development has given rise to a variety of techniques, protocols and design principles, such as Representational State Transfer or the Simple Object Access Protocol (SOAP).
Auphonic is a web service that offers automatic audio post-production on the web. When audio files are processed, various algorithms are applied that, among other things, level out loudness, remove noise or adjust other parameters. Besides this basic functionality, the service offers further options that simplify setting metadata or make the publication of audio files automatable. The programming interface is used to enable users to write their own scripts or to integrate Auphonic seamlessly into their own system.
This bachelor thesis deals with the challenges that the creation of a scalable programming interface entails, and presents solutions to the problems that arise during the design. In particular, it discusses the libraries, techniques and alternatives that came up during the implementation of the Auphonic programming interface.


3 | Introduction

Application Programming Interfaces (APIs) have become the pervasive method to provide and gather data from external sources. Users gain access to the resources without being bound by the natural limitations of a predesigned application.
Past examples show that third-party developers have a huge interest in developing external interfaces, driven by the possibility of extending their own service or even earning money with enhanced user interfaces. The trend of offering interfaces has led to an increase in useful libraries in the past years. Chapter 4 gives an overview of the libraries used in the Auphonic API and the reasons behind these choices.
One of the big upsides of enabling clients to access the data directly is extensibility. Some previously unexposed aspects of the system's specification have to be documented to enable developers to work with them. In most use cases, user authentication needs to be handled by another entity in the middle. This raises a couple of security concerns and is not well suited to traditional authentication mechanisms. Chapter 6 discusses different ways to solve this problem.
In the Auphonic API some trade-offs had to be made concerning usability. One of the goals is that normal users can use the API without background knowledge of programming. The Auphonic Simple API therefore requires neither complex parameters nor specially developed authentication methods.
The first part of the thesis focuses on the design and technologies that can be used to create a functioning API.
The chapters starting from “Auphonic Simple API” focus on the detailed implementation of the system that is deployed on the Auphonic website.

3.1 Web Application Programming Interfaces (APIs)

Numerous websites provide data feeds or offer their content in other ways that can be accessed by automated queries. Many of these provide methods for data retrieval, but no possibilities for content creation. Unlike pure distribution, content creation often requires a custom interface that depends on the underlying structure of the service. This is the reason why many standardized methods like Representational State Transfer (REST) or the Atom Publishing Protocol (AtomPub) were developed. With each design and protocol decision, a few aspects have to be considered. The main candidates considered for the design of the Auphonic API were the protocol SOAP and the architectural style REST. SOAP is an established protocol and standard that has wide tool support, but it has interoperability issues and is quite complex. REST is very simple, and has the advantage that it is built on HTTP methods and URIs. REST APIs differ widely in their implementation, however, which is bad for future extensions and for uses like discoverability 1 and mashups.
If done right, a web API can provide great possibilities for content creation and mashups between different services. One of the main objectives in creating the Auphonic API is to adhere as closely as possible to the REST principles in order to make interoperability as easy as possible.

3.2 Auphonic & API Features

Auphonic is an audio post-production web service that enables users with little or no background knowledge of audio processing to enhance and deploy their media. The enhancement stage is driven by several automated machine learning algorithms such as loudness normalization and noise reduction.
A typical workflow includes the following steps:
• The audio is uploaded from the user's own system or from a web service such as Dropbox or SoundCloud.
• Metadata (chapter marks, URLs, title) is entered; it is applied to the result file, depending on the encoding.
• The result files and their encodings may be specified.
• The file is enhanced by the Auphonic system and encoded in different formats.
• The results are published to the web services or FTP and SFTP servers the user entered, and can be accessed through them or through a direct download.
The system distinguishes between two models in the resource design, the production resource and the preset resource. Creating a production for post-processing is the main part of the web service and defines all sub-resources. A production in the Auphonic system is an uploaded file that will be enhanced with metadata and several algorithms, depending on the decisions made by the user.
The typical workflow described above (Figure 3.1) is also the workflow for creating a production. The second big resource, the preset, is used as a template for creating a new production by referencing it, and contains the same information entered by the user, except for chapter marker resources.
The main purpose of the API is to map all Auphonic features to a programmable interface. Chapter markers or other resources can be sent in the form of a file or a simple request instead of being uploaded through the online web form.

1 http://stackoverflow.com/questions/9204110/restful-api-runtime-discoverability-hateoas-client-design accessed on 2013-01-13



Figure 3.1: The Auphonic web-service


4 | Libraries

4.1 Django

Django 1 is the high-level Python web development framework the Auphonic web site is based on. Most of the following libraries are therefore based on the Model-Template-View design principle, which is similar to the Model-View-Controller 2 design principle. The main idea behind the principle is to split the representation, the data structures and the logic into different layers.
Django provides a rich set of functionality, which is used throughout the Auphonic system and the API. The framework eases the creation of complex websites and enables the creation of dynamic sites in a short time.

4.2 Piston & PBS Education

Piston 3 is the Django application used to create the API. It provides the necessary features to build a RESTful API and integrates with Django functionality.
Auphonic uses a modified version of the framework from PBS Education 4 . The PBS Education version offers features for better error handling, easier definition of the returned data (PistonViews) and pluggable envelopes. Piston bases the returned information entirely on the internal data structures in Django, an approach that is impractical for the Auphonic API because some requests should return aggregated data. Another advantage Piston offers is the possibility to entirely rewrite the returned information structures.

1 https://www.djangoproject.com/ accessed on 2013-01-13
2 http://www.codinghorror.com/blog/2008/05/understanding-model-view-controller.html accessed on 2013-01-13
3 https://bitbucket.org/jespern/django-piston/wiki/Home accessed on 2013-01-13
4 https://github.com/pbs-education/django-piston accessed on 2013-01-13


5 | Design & Implementation

5.1 General Architecture

5.1.1 Resources

One of the main principles behind the Auphonic API is to offer the content in the form of resources and views. A resource is mapped to a URL, and different functionality can be requested from the same URL by using different HTTP request methods such as GET, POST, PUT or DELETE. A resource can represent anything from a simple log (Figure 5.1) or a blog post to more complex data structures. The most used resources in the Auphonic API are the main parts of the web site, productions and presets.
from piston.handler import BaseHandler

class LogHandler(BaseHandler):
    # allowed methods in REST usually are GET, POST, PUT, DELETE
    allowed_methods = ('GET', 'POST',)

    # GET maps to the read function
    def read(self, request):
        # read_log is an application helper defined elsewhere
        log = read_log(request)
        # returned data is automatically converted to the requested type
        # the default response type in the Auphonic API is JSON
        return log

    # POST maps to the create function
    def create(self, request):
        # write_log is an application helper defined elsewhere
        write_log(request)
        return

Figure 5.1: LogHandler: A simple Log Resource with GET and POST defined.
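
The mapping from HTTP methods to handler functions described above can be illustrated without Django or Piston. The following is a minimal sketch, not Piston's actual dispatch code: the class names `SketchHandler` and `LogSketch` are hypothetical, and the `update` and `delete` function names are assumed to follow the same pattern as the `read` and `create` functions shown in Figure 5.1.

```python
# Hypothetical stand-in for a Piston-style handler; not Piston's real code.
METHOD_TO_ACTION = {
    "GET": "read",
    "POST": "create",
    "PUT": "update",     # assumed name, following the read/create pattern
    "DELETE": "delete",  # assumed name, following the read/create pattern
}


class SketchHandler:
    """Dispatches an HTTP method name to the matching handler function."""

    allowed_methods = ()

    def dispatch(self, method, payload=None):
        method = method.upper()
        if method not in self.allowed_methods:
            return 405, None  # Method Not Allowed
        action = getattr(self, METHOD_TO_ACTION[method])
        return 200, action(payload)


class LogSketch(SketchHandler):
    """A log resource that only allows GET and POST, as in Figure 5.1."""

    allowed_methods = ("GET", "POST")

    def __init__(self):
        self.entries = []

    def read(self, payload):
        # GET: return a copy of all log entries
        return list(self.entries)

    def create(self, payload):
        # POST: append a new entry and echo it back
        self.entries.append(payload)
        return payload
```

A DELETE request to `LogSketch` is rejected with status 405, because the method is not listed in `allowed_methods`.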



5.1.2 Views

The Piston version from PBS Education offers the possibility to write custom views. This practice is heavily used in the Auphonic API. Information exchange structures are organized in views, which hold a detailed blueprint of the returned information (Figure 5.2).
class AlgorithmView(PistonView):
    fields = [
        'hipfilter',
        'leveler',
        'normloudness',
        'loudnesstarget',
        'denoise',
        'denoiseamount',
    ]

Figure 5.2: AlgorithmView: Every field of the returned data can be specified.

curl -X POST -H "Content-Type: application/json" \
    https://auphonic.com/api/productions.json \
    -u username:password \
    -d '{
        "preset": "ceigtvDv8jH6NaK52Z5eXH",
        "metadata": { "title": "My first Production" }
    }'

Figure 5.3: Sample request with PBS Education and Piston

{
    "status_code": 200,
    "form_errors": {},
    "error_code": null,
    "error_message": "",
    "data": {
        ...
        "uuid": "KKw7AxpLrDBQKLVnQCBtCh",
        ...
        "algorithms": {
            "hipfilter": true,
            "leveler": false,
            "denoise": false,
            "normloudness": true,
            "loudnesstarget": -23,
            "denoiseamount": 12
        }
    }
}

Figure 5.4: Response with PBS Education and Piston

The data requested in Figure 5.3 is returned in an envelope (Figure 5.4) that contains error information and the status code, so that error messages are easy to parse.
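
A client can exploit the envelope by checking the error fields before touching the payload. The sketch below uses the field names from Figure 5.4; the exception class `ApiError`, the function name `unwrap` and the assumption that a successful request always reports `status_code` 200 are illustrative and not taken from the Auphonic implementation.

```python
import json


class ApiError(Exception):
    """Raised when the envelope reports a failed request (hypothetical)."""


def unwrap(envelope_text):
    """Return the "data" payload of an envelope, or raise ApiError."""
    envelope = json.loads(envelope_text)
    # Assumption: success is signalled by status 200 and an empty error code.
    if envelope.get("status_code") != 200 or envelope.get("error_code"):
        raise ApiError(
            "%s: %s (form errors: %s)" % (
                envelope.get("error_code"),
                envelope.get("error_message"),
                envelope.get("form_errors"),
            )
        )
    return envelope["data"]
```

Because the error information always sits at the same place, the same `unwrap` function can be used for every resource of the API.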

5.2 RESTful Web Services

Representational State Transfer (REST) defines a set of software engineering principles that contain rules and constraints for communication between web systems. One of the goals of REST is to enhance existing infrastructures. This is especially useful in combination with the HTTP protocol 1 . REST constrains the way HTTP requests are used so that the desired system behavior follows naturally from the design.
The basic design principles behind these constraints are listed in the following sections.

5.2.1 Client-server

The representation of the data is completely separated from the data storage and logic. This improves portability and scalability. Clients and servers depend only on the interface between them, meaning that they can evolve and change independently. This is an advantage for complex systems that are updated independently of the API itself and only need a common interface to retain full functionality, like the Auphonic Mobile Application 2 .

5.2.2 HTTP methods are used explicitly

Four of the basic HTTP methods have an explicitly defined use:
• GET: retrieve a resource
• POST: create a resource
• PUT: update or change a resource
• DELETE: delete a resource
Many web APIs use HTTP methods in ways that were not intended, leading to great inconsistencies between implementations. GET parameters, for example, are described as search parameters in the HTTP/1.1 RFC. A GET request should therefore return a resource or a list of resources.
1 http://www.ics.uci.edu/~fielding/pubs/dissertation/evaluation.htm#sec_6_3 accessed on 2013-01-23
2 https://auphonic.com/blog/2012/11/19/auphonic-mobile-app-ios/ accessed on 2013-01-23



Many web APIs and web services use request forms that are incorrect in REST [C1], as in the following listing:
GET /adduser?name=username HTTP/1.1

The correct version would be to issue a POST request:
POST /users HTTP/1.1
Host: server
Content-Type: application/xml

<?xml version="1.0"?>
<user>
    <name>username</name>
</user>

This leads to more natural, guessable URIs and helps in creating a coherent URI scheme in the API.
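
The corrected POST request can also be assembled programmatically. The following sketch uses only the Python standard library and builds the request without sending it; the function name `build_create_user_request` and the base URL passed to it are hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_create_user_request(base_url, name):
    """Build (but do not send) a POST request that creates a user resource."""
    # The user to create travels in the body, not in the query string.
    user = ET.Element("user")
    ET.SubElement(user, "name").text = name
    body = ET.tostring(user, encoding="utf-8")
    return urllib.request.Request(
        url=base_url + "/users",
        data=body,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )
```

The noun `/users` identifies the collection, and the HTTP method alone expresses the intent to create a new element in it.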

5.2.3 Statelessness

Every request needs to contain all the information the server requires to understand it, without relying on session state stored on the server.
This means that the server does not need to retrieve the application context or state. This has several advantages for scalability and performance, because without server-side state, synchronization and lookups of session states are no longer needed. The example in Figure 5.5 shows a stateful design approach. The client sends its request to the server assuming that the server keeps track of the last page the client visited. This means that the server needs to save a state for every client it encounters.

Figure 5.5: Stateful Design [C1]. The next page is requested, and the client assumes that the
service tracks which page was visited.

The stateless approach (Figure 5.6) solves the problem by simply letting the client keep track of its own state. In the example, the number of the next page is sent by the client instead of being looked up by the server.



Figure 5.6: Stateless Design [C1]. The client takes care of the page counter and sends the server
the next page it wants to retrieve.
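
The difference between the two designs can be sketched in a few lines. The page contents, the class `StatefulService` and the function `get_page` are hypothetical and only illustrate where the page counter lives.

```python
PAGES = ["page one", "page two", "page three"]  # hypothetical content


class StatefulService:
    """Stateful design: the server remembers the last page per client."""

    def __init__(self):
        self.last_page = {}  # client id -> index of the last page served

    def next_page(self, client_id):
        page = self.last_page.get(client_id, -1) + 1
        self.last_page[client_id] = page  # state grows with every client
        return PAGES[page]


def get_page(page):
    """Stateless design: the request itself names the wanted page."""
    return PAGES[page]
```

In the stateless variant any server instance can answer any request, because nothing about the client has to be stored or synchronized.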

5.2.4 Cacheable

Resources provided by the API need to be declared as cacheable or non-cacheable. Caching can provide efficiency and scalability by eliminating unnecessary information exchange with the server, while an explicit non-cacheable declaration prevents clients from working with outdated or irrelevant data. Almost every piece of information in the Auphonic API is prone to change and not available in a static context. Parts of the responses are even created at runtime, which is a problem in the context of caching. If the need for caching arises in the future because of speed issues, specific parts of the response might be cached.
Caching is currently disabled in the Django URL configuration by wrapping each resource:
never_cache(resource)

The consequence is that the user always works with the current information and does not overwrite changes in later requests.
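
The effect of such a wrapper can be illustrated with a plain Python decorator. This is a simplified stand-in and not Django's `never_cache` implementation: the view signature, the tuple-based response and the exact `Cache-Control` value are assumptions made for this sketch.

```python
import functools


def never_cache_sketch(view):
    """Wrap a view so that every response tells clients not to cache it."""

    @functools.wraps(view)
    def wrapper(request):
        status, headers, body = view(request)
        headers = dict(headers)
        # Illustrative header value; Django sets its own set of headers.
        headers["Cache-Control"] = "no-store, max-age=0"
        return status, headers, body

    return wrapper


@never_cache_sketch
def production_list(request):
    # Hypothetical resource returning an empty JSON list
    return 200, {"Content-Type": "application/json"}, "[]"
```

Every response of the wrapped resource now carries the header, so clients and intermediaries fetch fresh data on each request.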

5.2.5 Layered system

The client sees only the first layer of a multi-layered system, but no intermediary systems. The server should be able to run in a multi-server configuration, but should always provide a common interface. This reduces the overall system complexity visible to the client and enables the encapsulation of legacy software.
This is a non-issue for the Auphonic web site, which contains no legacy software and provides all its services from a single-layered system.

5.2.6 Code on demand

The code-on-demand constraint is an optional constraint that enables extending a service by downloading and executing code from the server. This technique is used by the Auphonic mobile application, which accesses its current implementation through a web view and therefore updates automatically if the version on the server changes. Current versions are hosted on the server. At the moment this option is only provided for the mobile application, but it might be extended to third-party developers when needed.

5.3 Design of Uniform Resource Identifiers (URI)

Good URI design means that many URIs can be guessed by the client developer or the client application, which leads to an intuitive API. This makes client development easier and makes URIs self-explanatory to the point where less documentation of the basic structures is needed. This enhances the overall usability and gives the API a certain level of predictability.
One way to achieve this goal is to use nouns instead of verbs. A bad example that uses a verb is the following request:
GET /add?type=user&name=username HTTP/1.1

This example is particularly bad because it misuses REST principles and makes it unclear how other structures of the service operate. A better request would be:
POST /users HTTP/1.1
Host: server
Content-Type: application/json

{
    "username": "{username}"
}

Another important guideline is to use directory-like URIs that mirror the way a user would expect the data to be stored on the server. This approach is also used in the Auphonic API. If a user wants to retrieve a list of all productions, the user simply calls:
GET /api/productions.json HTTP/1.1

The exact same thing can be done for presets, but with a different virtual directory:
GET /api/presets.json HTTP/1.1

This approach also achieves an additional goal: unintuitive query strings are entirely avoided and not used at all by the web site. It is generally advisable to avoid query strings as much as possible, because they are hard to predict and often point to bad URI design.
The pattern of accessing similar directory-like structures is only broken when accessing single elements. To keep the URIs readable and predictable, the singular form is used to address single productions or presets:
GET /api/production/{uuid}.json HTTP/1.1



Another important part is to remove any unnecessary information from the URIs, such as server-side extensions from scripting languages like .php.
Showing this information to the client provides no additional value, and it can easily be replaced by a useful extension. In the case of the Auphonic API, the extension is replaced by the response format, which signals the desired data encoding to the server. This also has the benefit that the user knows exactly what to expect from the server response.
This is another example of the directory-like approach, with the effect that a request looks like direct file access on a local system. Furthermore, it removes the need for a query string that specifies the return format.
Other good practices in URI design include using underscores instead of spaces and keeping every URI lowercase:
GET /api/production/{uuid}/outgoing_services HTTP/1.1
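
The URI conventions of this section are regular enough to be generated by a few helper functions. The helpers below are a hypothetical client-side sketch; only the resulting URI shapes are taken from the examples above.

```python
def collection_uri(resource, fmt="json"):
    """Plural, directory-like URI for a list of resources."""
    return "/api/%ss.%s" % (resource, fmt)


def item_uri(resource, uuid, fmt="json"):
    """Singular URI for one element, addressed by its UUID."""
    return "/api/%s/%s.%s" % (resource, uuid, fmt)


def sub_resource_uri(resource, uuid, name):
    """Lowercase sub-resource URI with underscores instead of spaces."""
    slug = name.strip().lower().replace(" ", "_")
    return "/api/%s/%s/%s" % (resource, uuid, slug)
```

For example, `collection_uri("preset")` yields `/api/presets.json`, and `sub_resource_uri` turns the name "Outgoing Services" into the path segment `outgoing_services`.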

5.4 Data exchange formats

A great number of notation formats can be chosen for the data representation in the communication with a service. The Auphonic API uses JSON as the default data exchange format. All inputs in the main part of the API need to be sent to the server as JSON. Describing a model becomes easier, because JSON can represent hierarchical data structures.
The API can return multiple formats, selected by adding a format specifier to the URI:
GET /api/productions.{format} HTTP/1.1

Possible formats are json, jsonp and xml. To receive all productions as XML, a user has to send the request:
GET /api/productions.xml HTTP/1.1
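
How a format specifier could select the serialization is sketched below. The function names, the JSONP callback name and especially the XML layout (a `response` root element with one child per key) are assumptions for illustration; the actual XML structure returned by the Auphonic API is not shown in this thesis.

```python
import json
import xml.etree.ElementTree as ET


def render(data, fmt, callback="callback"):
    """Serialize a flat dictionary in one of the three supported formats."""
    if fmt == "json":
        return json.dumps(data, sort_keys=True)
    if fmt == "jsonp":
        # JSONP wraps the JSON document in a JavaScript function call.
        return "%s(%s)" % (callback, json.dumps(data, sort_keys=True))
    if fmt == "xml":
        root = ET.Element("response")  # hypothetical root element
        for key in sorted(data):
            ET.SubElement(root, key).text = str(data[key])
        return ET.tostring(root, encoding="unicode")
    raise ValueError("unsupported format: %r" % fmt)


def render_for_uri(data, uri):
    """Pick the format from the extension at the end of the URI."""
    return render(data, uri.rsplit(".", 1)[-1])
```

The same data can thus be requested in a different encoding by changing only the extension of the URI.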

5.5 Versioning

The URIs in the API should be static. This implies that backwards compatibility cannot be maintained under the same URI if implementation details change. Versioning the API helps to overcome this problem. The standard URI of the API stays static, although the implementation behind it may change:
GET /api/productions.json HTTP/1.1

If a client relies on old features, it can access the API by issuing:
GET /api/{version}/productions.json HTTP/1.1

where the version is, for example:
GET /api/v1/productions.json HTTP/1.1
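
A server or client can separate the optional version segment from the rest of the URI with a single pattern. The sketch below is hypothetical: the pattern, the function name `parse_api_uri` and the assumption that unversioned URIs map to `v1` are not taken from the Auphonic implementation.

```python
import re

# Assumption: an unversioned URI is served by the current version.
DEFAULT_VERSION = "v1"

URI_PATTERN = re.compile(
    r"^/api/(?:(?P<version>v\d+)/)?(?P<resource>[a-z_]+)\.(?P<format>[a-z]+)$"
)


def parse_api_uri(path):
    """Split an API path into version, resource and format."""
    match = URI_PATTERN.match(path)
    if match is None:
        raise ValueError("not an API URI: %r" % path)
    parts = match.groupdict()
    parts["version"] = parts["version"] or DEFAULT_VERSION
    return parts
```

Clients that never mention a version keep working against the static URI, while clients that depend on older behavior pin the version explicitly.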


6 | Authentication

The standard way to access resources on the web is to authenticate a user with a combination of two text strings, the username and the password. This method might be secure enough for direct web interaction by the user, but if the API provider enables a third party to access resources on behalf of the user, security concerns arise. Untrusted parties, for example, should never be allowed to handle the unencrypted credentials, because of the possibilities for abuse.
Auphonic uses the HTTP Basic method for users who want to use the simple API for batch productions or other simple tasks. OAuth 2.0 workflows are used for applications that can be used by multiple users. The following sections list the most popular choices and the reasons why they are used in Auphonic.

6.1 HTTP Basic

HTTP Basic authentication is probably the most used authentication method on the web, and provides a very simple way to authenticate a user:
curl https://auphonic.com/api/{command}.{format} -u username:password

Using HTTP Basic securely requires an HTTPS connection with TLS or SSL [C10].
On the client side, HTTP Basic authentication works as follows:
• Encode username and password with base64 in the form username:password
• Add an Authorization header with “Basic base64-encoded-string”
Using this method in third-party Auphonic applications is discouraged, and clients that violate this rule might be disabled. The method can still be used in simple scripts that don’t store any user credentials.
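
The two client-side steps can be written down directly with the Python standard library. The function name `basic_auth_header` is hypothetical; the encoding itself follows the steps listed above.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header for HTTP Basic authentication."""
    # Step 1: join the credentials with a colon and encode them with base64.
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    # Step 2: send the encoded string in the Authorization header.
    return {"Authorization": "Basic " + token}
```

Base64 is an encoding, not encryption: anyone who sees the header can decode the credentials, which is why the HTTPS connection is required.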



6.2 OAuth 1.0

The limitations of HTTP Basic authentication and other methods such as OpenID [C3] led to the development of other protocols such as Flickr Auth 1 , Google AuthSub 2 or Yahoo! BBAuth 3 . Because none of these solutions was open, OAuth was developed as an open standard, inspired by the same design principles as OpenID and based on Flickr’s API Auth and Google’s AuthSub. OAuth provides a way for clients to access resources on behalf of their users without the users sharing their credentials.
The third-party application requests access to the user's resources, and the user has to give permission [C5]. The application then obtains the permission in the form of a token and a matching shared secret. The obtained token is used to access user resources without the user sharing his credentials. Tokens can be revoked at any time, and may have a special scope and a limited lifetime, unlike the user credentials (username & password).
OAuth 1.0 was often criticized because its complexity and the poor documentation of its cryptographic parameters 4 led to failed or flawed implementation attempts.

6.3 OAuth 2.0

OAuth 2.0 is the next version of the OAuth protocol [C6], and is not backwards compatible with the previous protocol version. The new version was developed because some aspects, such as the cryptographic requirements and the general user experience, needed improvement. In addition to the easier implementation, OAuth 2.0 offers new flows to improve the user experience. One of the main differences is that it offers flows that work entirely without cryptographic methods of their own, relying only on TLS or SSL from the HTTPS protocol [C11]. Basic flows like the web server flow are essentially the same as in OAuth 1.0, but lack some of the cryptographic requirements. The basic flow is shown in Figure 6.1 (Basic OAuth 2.0 Authentication flow [C12]).
OAuth 2.0 flows:
• Web Server Flow: The same principle that OAuth 1.0 provided, but with simpler parameters.
• Web Browser Flow: May be used on the client side by a user agent (browser).
• Device Flow: If the application is used from a device that has no possibility of direct access to the service, the tokens can be obtained by accessing the service with a browser or another user agent on a different device.
• Username and Password Flow: For cases where the user trusts the client, but it is insecure to store the user credentials. This method is used in the Auphonic API to authenticate the mobile application, but should only be used if it is necessary. User credentials might still be compromised by third-party applications.
• Client Credentials Flow: The client (third-party application) uses its own credentials to gain access to its own resources.
The Auphonic API currently supports the web server flow and the username and password flow. Other flows can easily be activated in the hiidef 5 authentication framework if the need arises.

1 http://www.flickr.com/services/api/auth.spec.html accessed on 2013-01-21
2 https://developers.google.com/accounts/docs/AuthSub accessed on 2013-01-21
3 http://developer.yahoo.com/auth/ accessed on 2013-01-21
4 http://josephsmarr.com/2009/02/17/implementing-oauth-is-still-too-hard-but-it-doesnt-have-to-be/ accessed on 2013-01-21

Figure 6.1: Basic OAuth 2.0 Authentication flow [C12]

The OAuth 2.0 flows establish various new possibilities for authentication, but also introduce new problems [C8]. The total lack of signatures and cryptography means that tokens are inherently insecure, and all protocol security relies on the security of HTTPS. In addition, the necessary token state management for expiring tokens makes an implementation harder than an OAuth 1.0 implementation (except for the cryptographic parameters).
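
The token state management mentioned above means that a client has to remember when a token expires and refuse to use it afterwards. The class below is a minimal sketch of that bookkeeping; the class name `StoredToken` is hypothetical, and sending the token as an `Authorization: Bearer` header is an assumption that is not confirmed for the Auphonic API by this thesis.

```python
import time


class StoredToken:
    """Client-side record of an access token and its expiry time."""

    def __init__(self, access_token, expires_at):
        self.access_token = access_token
        self.expires_at = expires_at  # Unix timestamp in seconds

    def is_expired(self, now=None):
        now = time.time() if now is None else now
        return now >= self.expires_at

    def auth_header(self, now=None):
        """Return the request header, or fail if the token has expired."""
        if self.is_expired(now):
            raise RuntimeError("token expired; obtain a new one first")
        # Assumption: the token is transmitted as a bearer token.
        return {"Authorization": "Bearer " + self.access_token}
```

Since no signature protects the token, the header must only ever be sent over HTTPS.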

5 http://hiidef.github.com/oauth2app/ accessed on 2013-01-21


7 | Auphonic Simple API

The Auphonic Simple API is used for basic scripts like batch processing of multiple files. It is possible to save and start productions in a single request by referencing an existing preset, without storing an internal state, which complies with the REST constraints.
This makes it ideal for quick shell scripts 1 and basic integrations 2 . In this version of the API it is possible to create a production without using data structures like JSON or XML.
All following examples use the command line tool curl 3 , which supports many URL-based data transfer methods as well as username and password authentication. With the simple API it is possible to write a batch upload script that needs only one command per file (Figure 7.1).
for f in $FILES
do
    curl -X POST https://auphonic.com/api/simple/productions.json \
        -u "$username:$password" \
        -F "preset=$preset" \
        -F "action=start" \
        -F "input_file=@$f"
done

Figure 7.1: Batch script for uploading multiple files.
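
The same batch loop can be prepared from Python by assembling the curl arguments of Figure 7.1 for every file. The function names below are hypothetical, and the sketch only builds the argument lists without executing them.

```python
def build_curl_command(path, preset, username, password):
    """Return the curl argument list that uploads one file and starts it."""
    return [
        "curl", "-X", "POST",
        "https://auphonic.com/api/simple/productions.json",
        "-u", "%s:%s" % (username, password),
        "-F", "preset=%s" % preset,
        "-F", "action=start",
        "-F", "input_file=@%s" % path,
    ]


def build_batch(paths, preset, username, password):
    """Build one command per input file, mirroring the shell loop."""
    return [build_curl_command(p, preset, username, password) for p in paths]
```

Each list could then be passed to `subprocess.run(command, check=True)` to perform the upload. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.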

The example iterates through a number of files, uploads them and starts their productions on the server. The simple API always uses an existing preset 4 from the Auphonic server as the basis for commonly used data, although a number of parameters can be overridden as well.
The most basic action that can be taken by an API user is uploading a file and starting a production (Figure 7.2).
1 https://github.com/auphonic/auphonic-api-examples accessed on 2013-01-23
2 https://auphonic.com/blog/2012/10/08/auphonic-audio-processing-web-api-version-1-released/ accessed on 2013-01-23
3 http://curl.haxx.se/ accessed on 2013-01-23
4 https://auphonic.com/engine/presets/ accessed on 2013-01-23



-F "preset=ceigtvDv8jH6NaK52Z5eXH"
-F "title=My First Production"
-F "input_file=@/home/user/audio_or_video_file.mp3"
-F "action=start"

Figure 7.2: Uploading a file and starting a production.

Auphonic supports many different upload types, such as uploads by URL (HTTP) (Figure 7.3) or uploads from services like Dropbox and SoundCloud (Figure 7.4). The services can be registered on the web site and are then referenced by a unique ID together with a filename to upload the file and start the production.
-F "title=My First HTTP Production"
-F "input_file=http://the_server.com/somefile.mp3"
-F "action=start"

Figure 7.3: Upload with HTTP.

-F "service=pmefeNCzkyT4TbRbDmoCDf"
-F "input_file=my_dropbox_file.mp3"
-F "title=My First Dropbox Production"
-F "action=start"

Figure 7.4: Upload with Dropbox/Soundcloud.

The following pages will discuss the more complex JSON-based API.

19

8 | Auphonic API

The main part of the Auphonic API is based on JSON 1 for data exchange, but can return other data formats based on the URL schema (Figure 8.1). JSON enables the system to encapsulate complex data in lists and dictionaries, which simplifies the data transfer and makes the requests more human-readable (Figure 8.2).
-F "preset={preset_uuid}"
-F "service=pmefeNCzkyT4TbRbDmoCDf"
-F "input_file=local_file.mp3"
-F "title=Big Request with Details"
-F "artist=The Artist"
-F "album=Our Album"
-F "track=1"
-F "subtitle=Our subtitle"
-F "append_chapters=true"
-F "summary=Our very long summary."
-F "genre=Podcast"
-F "year=2012"
-F "publisher=that’s me"
-F "url=https://auphonic.com"
-F "license=Creative Commons Attribution 3.0 Austria"
-F "license_url=http://creativecommons.org/licenses/by/3.0/at/"
-F "tags=podcast, auphonic api, metadata"
-F "image=@/home/user/a_image_used_as_thumbnail.jpg"
-F "output_basename=basename"
-F "chapters=@/home/user/chapters.txt"
-F "hipfilter=false"
-F "leveler=false"
-F "normloudness=false"
-F "denoise=true"
-F "loudnesstarget=-23"
-F "action=start"

Figure 8.1: Simple Request: A complete request to the service with all supported fields in the
non-JSON based simple API.

¹ http://www.json.org/ (accessed on 2013-01-23)

-d '{
    "preset": "{preset_uuid}",
    "metadata": {
        "title": "Production Title",
        "artist": "The Artist",
        "album": "Our Album",
        "track": 1,
        "subtitle": "Our subtitle",
        "append_chapters": true,
        "summary": "Our very long summary.",
        "genre": "Podcast",
        "year": 2012,
        "publisher": "that’s me",
        "url": "https://auphonic.com",
        "license": "Creative Commons Attribution 3.0 Austria",
        "license_url": "http://creativecommons.org/licenses/by/3.0/at/",
        "tags": ["podcast", "auphonic api", "metadata"]
    },
    "output_basename": "production-filename",
    "algorithms": {
        "hipfilter": true, "leveler": true,
        "normloudness": true, "denoise": false,
        "loudnesstarget": -23
    }
}'

Figure 8.2: JSON Request: JSON helps in structuring the request, and provides better readability.
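To illustrate how the flat fields of Figure 8.1 relate to the nested structure of Figure 8.2, the following Python sketch regroups simple API fields into a JSON request body. The grouping follows the two figures; the function to_json_request and the rule of skipping file-related fields are assumptions made for this illustration and do not describe the server implementation.

```python
import json

# Field groups as they appear in Figure 8.2.
METADATA_KEYS = {"title", "artist", "album", "track", "subtitle",
                 "append_chapters", "summary", "genre", "year", "publisher",
                 "url", "license", "license_url", "tags"}
ALGORITHM_KEYS = {"hipfilter", "leveler", "normloudness", "denoise",
                  "loudnesstarget"}
TOP_LEVEL_KEYS = {"preset", "output_basename"}

BOOLEAN_KEYS = {"append_chapters", "hipfilter", "leveler", "normloudness",
                "denoise"}
INTEGER_KEYS = {"track", "year", "loudnesstarget"}


def convert(key, value):
    """Turn a form string into the JSON type used in Figure 8.2."""
    if key in BOOLEAN_KEYS:
        return value == "true"
    if key in INTEGER_KEYS:
        return int(value)
    if key == "tags":
        return [tag.strip() for tag in value.split(",")]
    return value


def to_json_request(flat_fields):
    """Regroup flat form fields; file-related fields are skipped (assumption)."""
    request = {"metadata": {}, "algorithms": {}}
    for key, value in flat_fields.items():
        if key in METADATA_KEYS:
            request["metadata"][key] = convert(key, value)
        elif key in ALGORITHM_KEYS:
            request["algorithms"][key] = convert(key, value)
        elif key in TOP_LEVEL_KEYS:
            request[key] = value
    return request


flat = {
    "preset": "{preset_uuid}",
    "title": "Big Request with Details",
    "track": "1",
    "tags": "podcast, auphonic api, metadata",
    "denoise": "true",
    "loudnesstarget": "-23",
    "input_file": "local_file.mp3",   # skipped: files are not part of the JSON body
}
print(json.dumps(to_json_request(flat), indent=4))
```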

JSON requests are, however, more complex to send. Neither the libraries used nor HTTP in general
support encoding and transferring file data in a request that uses JSON as its content type.
The simplest solution that does not require client-side libraries is to separate the upload of
local files from the JSON part of the request.
Three steps are needed with this approach to create and start a production:
Step 1: The user creates a production and references a preset if one should be used (Figure 8.3).

-d '{
    "preset": "{preset_uuid}",
    "metadata": {
    },
    "output_files": [
        {"format": "mp3", "bitrate": "96",
         "mono_mixdown": true},
        {"format": "aac", "bitrate": "32",
         "suffix": "-small", "ending": "m4a"}
    ],
    "algorithms": {
        "hipfilter": true, "leveler": true,
        "loudnesstarget": -23
    }
}'

Figure 8.3: The request returns the full production with information like the uuid which can be
used for further requests.

Step 2 (optional): The user uploads a file and adds it to the newly created production (Figure 8.4).
curl -X POST https://auphonic.com/api/production/{uuid}/upload.json \
    -F "input_file=@/home/user/the_audio_file.mp3"

Figure 8.4: File upload to the Auphonic server.

Step 3: After all data has been entered and a file has been added, the user can start the audio
post-production on the Auphonic servers (Figure 8.5).
curl -X POST https://auphonic.com/api/production/{uuid}/start.json

Figure 8.5: Starting a production.
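The three steps can be sketched in Python with the standard library alone. The sketch only constructs the requests and does not send them; the function names are hypothetical, the URLs and the JSON content type are taken from the figures in this thesis, and authentication is left out for brevity.

```python
import json
from urllib.request import Request

API_ROOT = "https://auphonic.com/api"


def create_production_request(payload):
    """Step 1: POST the JSON description of the production."""
    return Request(
        API_ROOT + "/productions.json",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def upload_url(uuid):
    """Step 2: target of the separate multipart/form-data file upload."""
    return "{}/production/{}/upload.json".format(API_ROOT, uuid)


def start_production_request(uuid):
    """Step 3: POST without a body starts the audio post-production."""
    return Request("{}/production/{}/start.json".format(API_ROOT, uuid),
                   data=b"", method="POST")


create = create_production_request({"metadata": {"title": "Production Title"}})
# The UUID would be read from the response of step 1; this one is a placeholder.
uuid = "KKw7AxpLrDBQKLVnQCBtCh"
start = start_production_request(uuid)

for method, url in [(create.get_method(), create.full_url),
                    ("POST", upload_url(uuid)),
                    (start.get_method(), start.full_url)]:
    print(method, url)
```

Keeping the upload as its own request means that the JSON body of step 1 never has to carry binary data.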

This approach is especially useful for more complex workflows and for detailed control of the
exchanged data. The general format for data exchange is JSON, but XML or JSONP can also be used
for the returned information (Figure 8.6).

{
    "status_code": 200,
    "form_errors": {},
    "error_code": null,
    "data": {
        "status": 3,
        "status_string": "Done",
        ...
    }
}

Figure 8.6: Returned information with json.

Figure 8.7 shows the response to the file upload from the previous example when XML is requested instead.
<?xml version="1.0" encoding="UTF-8"?>
<response>
<status_code>200</status_code>
<form_errors />
<error_code>None</error_code>
<error_message />
<data>
<resource>
<status>3</status>
<status_string>Done</status_string>
...
</resource>
</data>
</response>

Figure 8.7: Returned information with xml.

This is done by changing the return format in the URL of the file upload request:
curl -X POST https://auphonic.com/api/production/{uuid}/upload.xml \
    -F "input_file=@/home/user/the_audio_file.mp3"
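Because both formats carry the same envelope, a client can reduce them to one internal representation. The following Python sketch reads the fields shown in Figures 8.6 and 8.7 from shortened sample responses; the function names are hypothetical, and the samples contain only the fields visible in the figures.

```python
import json
import xml.etree.ElementTree as ET

JSON_RESPONSE = (
    '{"status_code": 200, "form_errors": {}, "error_code": null, '
    '"data": {"status": 3, "status_string": "Done"}}'
)

XML_RESPONSE = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b"<response><status_code>200</status_code><form_errors />"
    b"<error_code>None</error_code><error_message />"
    b"<data><resource><status>3</status>"
    b"<status_string>Done</status_string></resource></data></response>"
)


def summary_from_json(text):
    """Extract the status fields from a JSON response (Figure 8.6)."""
    body = json.loads(text)
    return {
        "status_code": body["status_code"],
        "status": body["data"]["status"],
        "status_string": body["data"]["status_string"],
    }


def summary_from_xml(raw):
    """Extract the same fields from an XML response (Figure 8.7)."""
    root = ET.fromstring(raw)
    resource = root.find("data/resource")
    return {
        # XML carries everything as text, so numbers are converted explicitly.
        "status_code": int(root.findtext("status_code")),
        "status": int(resource.findtext("status")),
        "status_string": resource.findtext("status_string"),
    }


print(summary_from_json(JSON_RESPONSE))
print(summary_from_xml(XML_RESPONSE))
```

The XML variant wraps the payload in an additional resource element and has no native number type, which is why the conversion step is needed there.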

9 | Details of Productions and Presets

The most important parts of the Auphonic API and the web service are productions and presets.
This chapter describes the detailed structure of the requests and explains the options that the
different combinations of these resources provide.
Presets can be created in a similar way as productions:
https://auphonic.com/api/{presets|productions}.json

The general structure of presets and productions differs in only two respects: a preset defines
a subset of the production functionality, so it has neither a file nor any metadata that belongs
to the file, such as chapter markers.
The basic syntax for creating and changing productions and presets is very similar; only the URLs
differ.
Create a production/preset:
/api/{presets|productions}.json
This creates a new preset or production, where {presets|productions} should
be either presets or productions.
Change a production/preset:
/api/{preset|production}/{uuid}/{command}.json
All parts of a production or a preset can be changed separately. This has the advantage
that writing requests for testing becomes much easier for a user. At the same time all
data can still be changed at once.
Upload files:
/api/{preset|production}/{uuid}/upload.json
File uploads must be handled in an additional request and must be encoded as
multipart/form-data, not JSON. This is necessary to let the server know how
it should handle the data it receives from API requests.
Start a production:
/api/production/{uuid}/start.json

This starts the audio post-production of the production with the given UUID.
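The URL patterns listed above use the plural resource name for creation and the singular name for everything that addresses an existing resource. A minimal Python sketch of this scheme follows; the helper names create_url and command_url are hypothetical.

```python
API_ROOT = "https://auphonic.com/api"

# Singular resource name -> plural name used for creation.
PLURAL = {"preset": "presets", "production": "productions"}


def create_url(kind, fmt="json"):
    """URL for creating a new preset or production."""
    return "{}/{}.{}".format(API_ROOT, PLURAL[kind], fmt)


def command_url(kind, uuid, command, fmt="json"):
    """URL for changing one part of an existing preset or production."""
    if kind not in PLURAL:
        raise ValueError("kind must be 'preset' or 'production'")
    return "{}/{}/{}/{}.{}".format(API_ROOT, kind, uuid, command, fmt)


print(create_url("production"))
print(command_url("production", "{uuid}", "upload"))
print(command_url("production", "{uuid}", "start"))
print(command_url("preset", "{uuid}", "metadata", fmt="xml"))
```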

9.1 Creating a Production with Detailed Audio Metadata

Audio metadata can be set using the metadata parameter (Figure 9.1).
-d '{
    "metadata": {
        "artist": "The Artist",
        "album": "Our Album",
        "track": 1,
        "subtitle": "Our subtitle",
        "append_chapters": true,
        "summary": "Our very long summary.",
        "genre": "Podcast",
        "year": 2012,
        "publisher": "that’s me",
        "url": "https://auphonic.com",
        "license": "Creative Commons Attribution 3.0 Austria",
        "license_url": "http://creativecommons.org/licenses/by/3.0/at/",
        "tags": ["podcast", "auphonic api", "metadata"]
    }
}'

Figure 9.1: Create a new production and add metadata.

The response contains the UUID of the created production/preset (Figure 9.2).
{
"status_code": 200,
...
"data": {
...
"uuid": "KKw7AxpLrDBQKLVnQCBtCh",
...
}
}

Figure 9.2: Response for a new production.
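A client typically checks the response envelope before it uses the returned UUID. The Python sketch below does this for a shortened sample response; the function created_uuid and its error handling are assumptions for illustration, based only on the envelope fields status_code, error_code and data shown in this thesis.

```python
import json

SAMPLE_RESPONSE = """
{
    "status_code": 200,
    "form_errors": {},
    "error_code": null,
    "data": {"uuid": "KKw7AxpLrDBQKLVnQCBtCh"}
}
"""


def created_uuid(response_text):
    """Return the UUID of the created resource or raise on an error response."""
    body = json.loads(response_text)
    if body["status_code"] != 200 or body.get("error_code") is not None:
        # How errors are reported in detail is not modelled here.
        raise RuntimeError("request failed: {!r}".format(body.get("form_errors")))
    return body["data"]["uuid"]


uuid = created_uuid(SAMPLE_RESPONSE)
print("https://auphonic.com/api/production/{}/metadata.json".format(uuid))
```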

All metadata can be changed after this step, either by overwriting the old data or by sending a
dedicated request to the URL:
https://auphonic.com/api/production/{uuid}/metadata.json

9.2 Output Files

The same mechanism for adding metadata can be used for adding output files to a production or a
preset. Adding output files to an existing production/preset:
curl -H "Content-Type: application/json" -X POST \
    https://auphonic.com/api/production/{uuid}/output_files.json \
    -d '[{"format":"mp3","bitrate":"96"},
         {"format":"aac","bitrate":"64"},
         {"format":"flac"}]'

All Auphonic output file formats are available through the API as well¹. Another goal
of the API was to make information about the Auphonic system public. This is often
detailed production information that lets the user determine programmatically what the Auphonic
subsystems support.
It’s possible to query all available formats, bitrates and filename endings (Figure 9.3):
curl https://auphonic.com/api/info/output_files.json

{
    "status_code": 200,
    ...
    "data": {
        "mp3": {
            "type": "lossy",
            "bitrates": ["32", "40", "48", ... ],
            "bitrate_strings":
                ["32 kbps (~14MB/h)",
                 "40 kbps (~18MB/h)", ...],
            "display_name": "MP3",
            "default_bitrate": "96",
            "endings": ["mp3"]
        },
        "opus": {
            "type": "lossy",
            "bitrates": ["6", "12", "16", ...],
            "bitrate_strings":
                ["~6 kbps (~3MB/h)",
                 "~12 kbps (~5MB/h)", ...],
            "display_name": "Opus",
            "endings": ["opus"]
        },
        "aac": {
            "type": "lossy",
            "bitrates": ["24", "32", "40", "48", ... ],
            "bitrate_strings":
                ["24 kbps, HE AAC (~11MB/h)",
                 "32 kbps, HE AAC (~14MB/h)", ...],
            "display_name": "AAC (M4A, MP4, M4B)",
            "endings": ["mp4", "m4a", "m4b"]
        },
        "vorbis": {
            "type": "lossy",
            "bitrates": ["32", "40", "48", ... ],
            "bitrate_strings":
                ["~32 kbps (~14MB/h)",
                 "~40 kbps (~18MB/h)", ...],
            "display_name": "Ogg Vorbis",
            "endings": ["ogg", "oga"]
        },
        ...
    }
}

Figure 9.3: Detailed information for creating formats.

¹ https://auphonic.com/blog/2011/07/13/audio-file-formats-podcasts/ (accessed on 2013-01-23)

If no bitrate is specified for a lossy audio format, the default_bitrate is used.
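The default rule can be mirrored on the client side with the information returned by the query above. The following Python sketch is only an illustration: the function with_default_bitrate is hypothetical, the sample data is reduced to the two fields it needs, and the actual defaulting is performed by the server.

```python
# Reduced excerpt of the "data" part of Figure 9.3.
OUTPUT_FORMATS = {
    "mp3": {"type": "lossy", "default_bitrate": "96"},
}


def with_default_bitrate(output_file, formats):
    """Return a copy of the output file entry with the bitrate filled in."""
    info = formats[output_file["format"]]
    completed = dict(output_file)
    if info["type"] == "lossy" and "bitrate" not in completed:
        completed["bitrate"] = info["default_bitrate"]
    return completed


print(with_default_bitrate({"format": "mp3"}, OUTPUT_FORMATS))
print(with_default_bitrate({"format": "mp3", "bitrate": "48"}, OUTPUT_FORMATS))
```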

9.2.1 Setting Filenames

It’s also possible to control how output filenames are constructed, which gives the API user
detailed control over the results.
{"format":"mp3", "bitrate":"48", "filename":"TheFilename.mp3",
"mono_mixdown":true}
Directly generates the file TheFilename.mp3.
{"format":"aac", "bitrate":"64", "ending":"mp4"}
This takes the basename of the input file and adds mp4 as the ending, e.g. original-filename.mp4.
A filename in Auphonic consists of an output_basename, a suffix and an ending. The output
basename can be set using the URL https://auphonic.com/api/productions.json or
https://auphonic.com/api/production/{uuid}.json (Figure 9.4).
curl -X POST -H "Content-Type: application/json" \
-d '{
    ...
    "output_files": [
        {..., "suffix": "-small", "ending": "m4a"},
        {..., "ending": "m4a"}
    ]
    ...
}'

Figure 9.4: Set the output filename.

This will create the output files production-filename-small.m4a and
production-filename.m4a.
It’s also possible to set the full output filename for an output format. The full filename takes
priority over the individual components such as suffix and basename (Figure 9.5).
https://auphonic.com/api/production/KKw7AxpLrDBQKLVnQCBtCh.json
-d '{
    ...
    "output_files": [
        {"format": "mp3", "bitrate": "48",
         "filename": "TheFilename1.mp3", "suffix": "-small"},
        {..., "filename": "TheFilename2.mp4", "ending": "m4a"}
    ]
    ...
}'

Figure 9.5: Create a detailed filename.

This will create the output files TheFilename1.mp3 and TheFilename2.mp4.
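The naming rules of this section can be summarized in a few lines of Python. The function output_filename is a hypothetical client-side model of the behavior described above, not the server implementation; it assumes that an ending is always given and does not model server-side defaults.

```python
def output_filename(output_file, output_basename):
    """Compose the name of one output file from its components."""
    # A full filename has priority over basename, suffix and ending.
    if "filename" in output_file:
        return output_file["filename"]
    suffix = output_file.get("suffix", "")
    return "{}{}.{}".format(output_basename, suffix, output_file["ending"])


basename = "production-filename"
print(output_filename({"suffix": "-small", "ending": "m4a"}, basename))
print(output_filename({"ending": "m4a"}, basename))
print(output_filename({"filename": "TheFilename1.mp3", "suffix": "-small"}, basename))
```

The first two calls reproduce the names given for Figure 9.4, the third the priority rule of Figure 9.5.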

9.3 Outgoing File Transfers

Auphonic integrates a number of external services, such as YouTube, SoundCloud, Dropbox and
others. File transfers to these services can be set up directly through the API
by referencing them in a request:
https://auphonic.com/api/production/{uuid}/outgoing_services.json
-d '[{"uuid": "{service1_uuid}"}, {"uuid": "{service2_uuid}"}, ...]'

where {service1_uuid}, {service2_uuid}, etc. are the UUIDs of the external services the
user wants to add for outgoing file transfers. It’s also possible to query all registered external
services (Figure 9.6) of a user:
curl https://auphonic.com/api/services.json -u username:password

{
"status_code": 200,
...
"data": [
{
"display_name": "my soundcloud account",
"type": "soundcloud",
"uuid": "Asu5PxueRRxtqfZhe7zdia",
"incoming": true,
"outgoing": true
},
{
"display_name": "my ftp server",
"path": "mirror/",
"host": "ftp.myserver.at",
"type": "ftp",
"uuid": "r6MSycBwyeWFAJYqUKtGeX",
"port": 21,
"base_url": "",
"permissions": "",
"incoming": true,
"outgoing": true
},
...
]
}

Figure 9.6: Get registered external services.
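The two requests of this section combine naturally: the list of registered services yields the UUIDs for the outgoing_services request. The Python sketch below builds that request body from a shortened copy of the data in Figure 9.6; the function outgoing_services_payload is hypothetical, and selecting every service that is marked as outgoing is an arbitrary choice for the example.

```python
import json

# Shortened "data" list of Figure 9.6.
REGISTERED_SERVICES = [
    {"display_name": "my soundcloud account", "type": "soundcloud",
     "uuid": "Asu5PxueRRxtqfZhe7zdia", "incoming": True, "outgoing": True},
    {"display_name": "my ftp server", "type": "ftp",
     "uuid": "r6MSycBwyeWFAJYqUKtGeX", "incoming": True, "outgoing": True},
]


def outgoing_services_payload(services):
    """Build the body of the outgoing_services request from a service list."""
    return [{"uuid": service["uuid"]}
            for service in services if service.get("outgoing")]


print(json.dumps(outgoing_services_payload(REGISTERED_SERVICES)))
```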

All possible external service types on auphonic.com and their parameters can be queried with
(Figure 9.7):
curl https://auphonic.com/api/info/service_types.json

This also includes all available parameter values for YouTube and SoundCloud outgoing file
transfers.
{
"status_code": 200,
...
"data": {
"youtube": {
"display_name": "YouTube",
"parameters": {
"category": {
"default_value": "",
"type": "select",
"display_name": "Category",
...
},
...
}
},
"dropbox": {
"display_name": "Dropbox",
"parameters": null
},
...
}
}

Figure 9.7: Get service types and fields.

9.4 Audio Algorithms

Audio Algorithms can also be set in the same way (Figure 9.8).
https://auphonic.com/api/production/{uuid}/algorithms.json
-d '{"hipfilter": true, "leveler": true,
     "loudnesstarget": -23}'

Figure 9.8: Add Audio Algorithms to the request.

Available Audio Algorithms and parameters can be queried by (Figure 9.9):
curl https://auphonic.com/api/info/algorithms.json

{
"status_code": 200,
"form_errors": {},
"error_code": null,
"data": {
"hipfilter": {
"default_value": true,
"type": "checkbox",
"display_name": "Filtering",
"description":
"Filters unnecessary and disturbing low frequencies
depending on the context (speech, music, noise)."
},
"denoise": {
"default_value": false,
"type": "checkbox",
"display_name": "Noise Reduction ",
"description": "Classifies regions with different
background noises and automatically removes noise and hum."
},
"leveler": {
"type": "checkbox",
"display_name": "Adaptive Leveler ",
"description": "Corrects level differences between speakers,
music and speech, etc. to achieve a balanced overall loudness."
},
"normloudness": {
"type": "checkbox",
"display_name": "Global Loudness Normalization",
"description": "Adjusts the global, overall loudness to the
specified Loudness Target, so that all processed files have
a similar average loudness."
},

"loudnesstarget": {
"default_value": -18,
"type": "select",
"display_name": "Loudness Target",
"description": "Select the loudness target in LUFS for Global
Loudness Normalization, higher values result in louder audio
outputs.",
"options": [
{
"display_name": "-15 LUFS (loud)",
"value": -15
},
{
"display_name": "-18 LUFS (internet audio)",
"value": -18
},
...
]
}
}
}

Figure 9.9: All available audio algorithms are shown.
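The type information in this response is sufficient to check algorithm settings on the client before they are sent. The following Python sketch does so for a reduced copy of Figure 9.9; the function validate_algorithms is hypothetical, and the option list contains only the two values visible in the figure, so other valid loudness targets would be rejected by this sample data.

```python
# Reduced excerpt of the "data" part of Figure 9.9.
ALGORITHM_INFO = {
    "hipfilter": {"type": "checkbox", "default_value": True},
    "denoise": {"type": "checkbox", "default_value": False},
    "loudnesstarget": {
        "type": "select",
        "default_value": -18,
        "options": [{"display_name": "-15 LUFS (loud)", "value": -15},
                    {"display_name": "-18 LUFS (internet audio)", "value": -18}],
    },
}


def validate_algorithms(settings, info):
    """Return a list of problems found in the given algorithm settings."""
    problems = []
    for name, value in settings.items():
        spec = info.get(name)
        if spec is None:
            problems.append("unknown algorithm: {}".format(name))
        elif spec["type"] == "checkbox" and not isinstance(value, bool):
            problems.append("{} must be true or false".format(name))
        elif spec["type"] == "select":
            allowed = [option["value"] for option in spec["options"]]
            if value not in allowed:
                problems.append("{} must be one of {}".format(name, allowed))
    return problems


print(validate_algorithms({"hipfilter": True, "loudnesstarget": -18}, ALGORITHM_INFO))
print(validate_algorithms({"hipfilter": "yes", "reverb": True}, ALGORITHM_INFO))
```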

9.5 Creation of Presets

Presets are created in much the same way as productions, with the exception that the user has to
provide a preset_name and that it’s not possible to add chapters or an input file (Figure 9.10).
https://auphonic.com/api/presets.json
-d '{
    "preset_name": "The New Preset",
    "metadata": {
        ...
    },
    ...
}'

Figure 9.10: Create a preset.
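The two restrictions on presets can be enforced before the request is sent. In the Python sketch below, the function preset_payload is hypothetical, and the key names chapters and input_file are assumed placeholders for the parts that a preset cannot contain; where chapters are placed in a real request is not specified in this section.

```python
import json

# Parts of a production that a preset cannot have (key names are assumptions).
NOT_ALLOWED_IN_PRESETS = {"chapters", "input_file"}


def preset_payload(preset_name, **sections):
    """Build the JSON body for creating a preset."""
    if not preset_name:
        raise ValueError("a preset requires a preset_name")
    rejected = NOT_ALLOWED_IN_PRESETS & set(sections)
    if rejected:
        raise ValueError("not allowed in presets: {}".format(sorted(rejected)))
    payload = {"preset_name": preset_name}
    payload.update(sections)
    return payload


body = preset_payload("The New Preset",
                      metadata={"genre": "Podcast"},
                      algorithms={"hipfilter": True})
print(json.dumps(body, indent=4))
```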

Furthermore the user can change and update existing presets with the command:

https://auphonic.com/api/preset/{uuid}/{command}.json

where command should be one of metadata, output_files, outgoing_services or
algorithms.

10 | Conclusion and future work

Almost all functionality of the Auphonic web service is covered by the API methods, enabling
users and third-party developers to build their own functionality on top of the system infrastructure.
The API provides an interface that is easily testable and has made it possible to write tests
for the majority of the internal system.
Many design decisions had to be made in order to create a practicable, scalable interface that
can be used by normal users as well as advanced developers. In particular, implementing
a second, simpler version of the API that does not rely on JSON is a measure that can improve the
overall usability of an API.
It proved possible to create a fast interface that lets developers implement clients without
caching or long loading screens. This is required for the mobile application¹ and other
projects, such as a publishing system², which rely solely on the API for their data manipulation
and exchange with other web services.
Future versions will focus on extending the API to cover the entire system functionality. This will
enable a rewrite of the web service so that it operates only on API functions, and will provide a
common, specified and well-documented interface for the whole system.

¹ https://github.com/auphonic/auphonic-mobile (accessed on 2013-01-23)
² https://github.com/tisba/gst-kitchen (accessed on 2013-01-23)

List of Figures

3.1 The Auphonic web-service

5.1 LogHandler: A simple Log Resource with GET and POST defined.
5.2 AlgorithmView: Every field of the returned data can be specified.
5.3 Sample request PBS Learning and Piston
5.4 Response with PBS Learning and Piston
5.5 Stateful Design [C1]. The next page is requested, and the client assumes that the service tracks which page was visited.
5.6 Stateless Design [C1]. The client takes care of the page counter and sends the server the next page it wants to retrieve.

6.1 Basic OAuth 2.0 Authentication flow [C12]

7.1 Batch script for uploading multiple files.
7.2 Batch script for uploading multiple files.
7.3 Upload with HTTP.
7.4 Upload with Dropbox/Soundcloud.

8.1 Simple Request: A complete request to the service with all supported fields in the non-JSON based simple API.
8.2 JSON Request: JSON helps in structuring the request, and provides better readability.
8.3 The request returns the full production with information like the uuid which can be used for further requests.
8.4 File upload to the Auphonic server.
8.5 Starting a production.
8.6 Returned information with json.
8.7 Returned information with xml.

9.1 Create a new production and add metadata.
9.2 Response for a new production.
9.3 Detailed information for creating formats.
9.4 Set the output filename.
9.5 Create a detailed filename.
9.6 Get registered external services.
9.7 Get service types and fields.
9.8 Add Audio Algorithms to the request.
9.9 All available audio algorithms are shown.
9.10 Create a preset.

Bibliography

[C1] Rodriguez, A. (2008). RESTful Web services: The basics. http://www.ibm.com/developerworks/webservices/library/ws-restful/ (accessed on 2012-12-17)
[C2] Fielding, R. T. (2000). Architectural Styles and the Design of Network-based Software Architectures. http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm (accessed on 2012-12-17)
[C3] Hammer-Lahav, E. (2010). The OAuth 1.0 Guide. http://hueniverse.com/oauth/guide/ (accessed on 2012-12-17)
[C4] Atwood, M., Conlan, R. M., et al. (2007). OAuth Core 1.0. http://oauth.net/core/1.0/ (accessed on 2012-12-17)
[C5] Hammer-Lahav, E. (2010). The OAuth 1.0 Protocol. http://tools.ietf.org/html/rfc5849 (accessed on 2012-12-17)
[C6] Hammer-Lahav, E. (2010). Introducing OAuth 2.0. http://hueniverse.com/2010/05/introducing-oauth-2-0/ (accessed on 2012-12-17)
[C7] Hardt, D. (Ed.) (2012). The OAuth 2.0 Authorization Framework, draft-ietf-oauth-v2-31. http://tools.ietf.org/html/draft-ietf-oauth-v2-31 (accessed on 2012-12-17)
[C8] Hammer-Lahav, E. (2012). OAuth 2.0 and the Road to Hell. http://hueniverse.com/2012/07/oauth-2-0-and-the-road-to-hell/ (accessed on 2012-12-17)
[C9] Fielding, R., et al. (1999). Hypertext Transfer Protocol – HTTP/1.1. http://www.ietf.org/rfc/rfc2616.txt (accessed on 2012-12-17)
[C10] Franks, J., et al. (1999). HTTP Authentication: Basic and Digest Access Authentication. http://tools.ietf.org/html/rfc2617 (accessed on 2012-12-17)
[C11] Rescorla, E. (2000). HTTP Over TLS. http://tools.ietf.org/html/rfc2818 (accessed on 2012-12-17)
[C12] Ortiz, E. C. (2010). Introduction to Facebook APIs. http://www.ibm.com/developerworks/library/x-androidfacebookapi/ (accessed on 2012-12-17)
