Reading Report 3. Data Representation
Valerii Klymchuk
April 3, 2015
0. EXERCISE 0
Chapter 3. Summary
0.1 Continuous Data
Many phenomena are modeled in terms of physical quantities. In data representation these quantities
fall into two fundamentally different categories: intrinsically continuous and intrinsically
discrete ones. Continuous data are usually manipulated by computers in some finite, approximate form.
Sampled continuous data are also discrete, since they consist of a finite set of data elements; however,
in contrast to intrinsically discrete data, sampled data always originate from, and are intended to
approximate, a continuous quantity. Intrinsically discrete data, such as a page of text, have no
counterpart in the continuous world. This is the fundamental difference between continuous (sampled)
and discrete data.
Mathematically, continuous data can be modeled as a function
$$f : D \subset \mathbb{R}^d \to C \subset \mathbb{R}^c$$
between domain and codomain respectively. f is called a d-dimensional, or d-variate, c-value function. In
visualization f sometimes is called a field.
Function f is continuous if the graph of the function is a connected surface without “holes” or “jumps.”
The Cauchy ε-δ criterion states that f is continuous if for every point p ∈ D the following holds:
$$\forall \epsilon > 0,\ \exists \delta > 0 : \text{if } \|x - p\| < \delta,\ x \in D \Rightarrow \|f(x) - f(p)\| < \epsilon.$$
Also, f is continuous of order k if the function itself and all its derivatives up to and including order k are
also continuous in this sense. This is denoted as $f \in C^k$.
Functions f whose derivatives are continuous on compact intervals are called piecewise continuous.
The triplet $\mathcal{D} = (D, C, f)$ defines a continuous dataset. The dimension d of the space $\mathbb{R}^d$ in which
the function's domain is embedded is called the geometrical dimension. The topological dimension of the
dataset is the dimension s ≤ d of the function domain D itself, i.e., the number of independent variables that
we need to represent D. For a line or curve in Euclidean space $\mathbb{R}^3$ we have s = 1 and d = 3; if D
is a plane or curved surface, then s = 2. In practice the geometrical dimension is always fixed to d = 3; hence
the only dimension that varies between datasets is the topological dimension s, which is therefore often called
the dataset dimension. We assume throughout that the geometrical dimension is always three.
The co-dimension of an object of topological dimension s and geometrical dimension d is the difference
d − s.
The function values are usually called dataset attributes. The dimensionality c of the function codomain
C is also called the attribute dimension (usually ranges from 1 to 4).
0.2 Sampled Data
Two operations relate sampled data and continuous data:
• sampling: given a continuous dataset, we have to be able to produce sampled data from it;
• reconstruction: given a sampled dataset, we have to be able to recover an (approximated) version of
the original continuous data.
Reconstruction involves specifying the value of the function between its sample points from the
sample values, using a technique called interpolation. The reconstruction quality depends on the number
and distribution of the sample points used.
To be used in practice, a sampled dataset should comply with several requirements: it should be accurate,
minimal, generic, efficient, and simple. By accurate, we mean that one should be able to control the
production of a sampled dataset $\mathcal{D}_s$ from a continuous one $\mathcal{D}$ such that $\mathcal{D}$ can be reconstructed from $\mathcal{D}_s$ with
a small, user-specified error. By minimal, we mean that $\mathcal{D}_s$ contains the least number of sample points needed
to ensure a reconstruction with the desired error. By generic, we mean that we can easily replace the various
data processing operations we had for the continuous $\mathcal{D}$ with equivalent counterparts for the sampled $\mathcal{D}_s$.
By efficient, we mean that both the reconstruction operation and the data processing operations we wish to
perform on $\mathcal{D}_s$ can be done efficiently from an algorithmic point of view. By simple, we mean that we can
design a reasonably simple software implementation of both $\mathcal{D}_s$ and the operations we want to perform on it.
We define reconstruction as follows: given a sampled dataset $\{p_i, f_i\}$ consisting of a set of N sample
points $p_i \in D$ and sample values $f_i \in C$, we want to produce a continuous function $\hat f : D \to C$ that
approximates the original f. The reconstructed function should equal the original one at all sample points,
i.e., $\hat f(p_i) = f(p_i) = f_i$. One way to define a reconstructed function that satisfies this property is to set
$$\hat f = \sum_{i=1}^{N} f_i \phi_i,$$
where $\phi_i : D \to C$ are called basis functions or interpolation functions. In other
words, we define the reconstruction operation as a weighted sum of a given set of basis functions $\phi_i$,
where the weights are exactly our sample values $f_i$. Since we want $\hat f(p_j) = f_j$ for all sample points $p_j$, we get
$$\sum_{i=1}^{N} f_i \phi_i(p_j) = f_j, \quad \forall j.$$
This equation must hold for any function f. It is satisfied if the basis functions obey
$$\phi_i(p_j) = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases}$$
The equation above is sometimes referred to as the orthogonality of basis functions. If we now consider
the constant function g(x) = 1 for any x ∈ D, whose sample values are all equal to 1, exact reconstruction
requires $\sum_{i=1}^{N} \phi_i(p_j) = 1$ at every sample point $p_j$ or, more generally,
$$\sum_{i=1}^{N} \phi_i(x) = 1, \quad \forall x \in D.$$
The property described in the equation above is called the normality of basis functions. Basis functions
that are both orthogonal and normal are called orthonormal. To reconstruct a sampled function, we can
use different orthonormal basis functions.
A grid, sometimes also called a mesh, is a subdivision of a given domain $D \subset \mathbb{R}^d$ into a collection of cells,
sometimes also called elements, denoted $c_i$. The union of the cells completely covers the sample domain, i.e.,
$\bigcup_i c_i = D$, and the cells are non-overlapping, i.e., $c_i \cap c_j = \emptyset,\ \forall i \neq j$.
We can now define the simplest set of basis functions, the constant basis functions. These functions
approximate a given function by the piecewise, per-cell, constant sample value $f_i$ (every point x ∈ D
is assigned the sample value of the nearest cell center). For this reason, piecewise constant interpolation
is also called nearest-neighbor interpolation. Constant basis functions are simple to implement,
have virtually no computational cost, and work for any cell shape and in any dimension. However, they
provide a poor, staircase-like approximation $\hat f$ of the original f: the reconstruction is discontinuous
across cell boundaries, and these jumps are visible in the visualization.
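As an illustration of piecewise constant reconstruction, here is a minimal 1D sketch. The function name `nearest_neighbor` and the sample data are hypothetical and chosen only for this example; the sketch assumes the sample points play the role of cell centers.

```python
def nearest_neighbor(points, values, x):
    """Piecewise constant reconstruction: return the sample value whose
    sample point is closest to x (ties go to the first such sample)."""
    best = min(range(len(points)), key=lambda i: abs(points[i] - x))
    return values[best]

# Hypothetical 1D samples of f(x) = x**2 at four points.
points = [0.0, 1.0, 2.0, 3.0]
values = [p * p for p in points]

# Two nearby query points fall into different "cells", so the
# reconstructed value jumps: this is the staircase effect.
print(nearest_neighbor(points, values, 1.4))  # taken from the sample at x = 1
print(nearest_neighbor(points, values, 1.6))  # taken from the sample at x = 2
```

The reconstruction reproduces the samples exactly at the sample points, but jumps halfway between neighboring samples.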
By using higher-order basis functions we can provide a better and more continuous reconstruction. The
next-simplest basis functions beyond the constant ones are the linear basis functions. To use these,
however, we need to make some assumptions about the cell types used in the grid. Let us consider a single
quadrilateral cell c having the vertices $(v_1, v_2, v_3, v_4)$, where $v_1 = (0, 0)$, $v_2 = (1, 0)$, $v_3 = (1, 1)$ and
$v_4 = (0, 1)$, i.e., an axis-aligned square of edge size 1 with the origin as first vertex. We call this the reference
cell in $\mathbb{R}^2$. Coordinates in the reference cell $[0, 1]^d$ are called reference coordinates: $r_1, \ldots, r_d$ (or r, s, t for
d = 3). We now define four local basis functions $\Phi^1_1, \Phi^1_2, \Phi^1_3, \Phi^1_4$, with $\Phi^1_i : [0, 1]^2 \to \mathbb{R}$, as follows:
$$\begin{aligned}
\Phi^1_1(r, s) &= (1 - r)(1 - s), \\
\Phi^1_2(r, s) &= r(1 - s), \\
\Phi^1_3(r, s) &= rs, \\
\Phi^1_4(r, s) &= (1 - r)s.
\end{aligned}$$
These basis functions are indeed orthonormal. For any point (r, s) in the reference cell, we can now use
these basis functions to define a linear function
$$\hat f(r, s) = \sum_{i=1}^{4} f_i \Phi^1_i(r, s)$$
as a sum of linear basis functions, which makes it a first-order continuous reconstruction of the four sample
values $f_1, f_2, f_3, f_4$ defined at the cell vertices. For every arbitrary quadrilateral cell c in $\mathbb{R}^3$, we can
define a coordinate transformation $T : [0, 1]^2 \to \mathbb{R}^3$ that maps our reference cell to c. We want to map the
reference cell vertices $v_i$ to the corresponding world cell vertices $p_i$, so $T(v_i) = p_i$. We define our
transformation T using our reference basis functions to map from a point r, s, t in the reference cell
coordinate system to a point x, y, z in the actual cell:
$$(x, y, z) = T(r, s, t) = \sum_{i=1}^{n} p_i \Phi^1_i(r, s, t),$$
where n is the number of cell vertices.
If T maps the reference cell to the world cell, then its inverse $T^{-1}$ maps points x, y, z in the world cell
to points r, s, t in the reference cell, where our basis functions $\Phi^1_i$ are defined. Using $T^{-1}$, we can rewrite
equation (3.2) for our quad cell c:
$$\hat f(x, y) = \sum_{i=1}^{4} f_i \Phi^1_i(T^{-1}(x, y)).$$
In order to compute the inverse transformation $T^{-1}$, we must invert the expression given by Equation (3.8).
This inversion depends on the actual cell type.
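The bilinear reference basis functions and the forward transform T can be sketched in a few lines. This is an illustrative sketch, not the report's implementation: the helper names `quad_basis`, `interpolate`, and `to_world` and the example cell are assumptions made for this example.

```python
def quad_basis(r, s):
    """The four bilinear reference basis functions evaluated at (r, s)."""
    return [(1 - r) * (1 - s), r * (1 - s), r * s, (1 - r) * s]

def interpolate(values, r, s):
    """Weighted sum of the four vertex values with the basis functions."""
    return sum(f * phi for f, phi in zip(values, quad_basis(r, s)))

def to_world(vertices, r, s):
    """T(r, s): map reference coordinates to a point of the world cell."""
    phi = quad_basis(r, s)
    dim = len(vertices[0])
    return tuple(sum(phi[i] * vertices[i][k] for i in range(4)) for k in range(dim))

ref_vertices = [(0, 0), (1, 0), (1, 1), (0, 1)]
# Orthogonality: basis function i is 1 at vertex i and 0 at the other vertices.
for i, (r, s) in enumerate(ref_vertices):
    assert quad_basis(r, s) == [1 if j == i else 0 for j in range(4)]
# Normality: the basis functions sum to 1 anywhere in the cell.
assert abs(sum(quad_basis(0.3, 0.8)) - 1.0) < 1e-12

# Hypothetical quad cell embedded in 3D.
world_quad = [(0.0, 0.0, 0.0), (2.0, 0.0, 1.0), (2.0, 1.0, 1.0), (0.0, 1.0, 0.0)]
print(interpolate([10.0, 20.0, 30.0, 40.0], 0.5, 0.5))  # average of the four values
print(to_world(world_quad, 0.5, 0.5))                    # center of the world cell
```

Note that the same basis functions interpolate both the attribute values and the vertex positions; only the weights change.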
We now have a way to reconstruct a piecewise linear function $\hat f$ from samples on any quad grid: for
every cell c in the grid, we simply apply Equation (3.9). We can now finally define our piecewise linear
reconstruction in terms of a set of global basis functions φ, just like we did for piecewise constant
reconstruction (Equation (3.6)). Given a grid with sample points $p_i$ and quad cells $c_i$, we can define our
grid-wise linear basis functions $\phi^1_i$ as follows:
$$\phi^1_i(x, y) = \begin{cases} 0, & \text{if } (x, y) \notin \text{cells}(p_i), \\ \Phi^1_j(T^{-1}(x, y)), & \text{if } (x, y) \in c = \{v_1, v_2, v_3, v_4\}, \text{ where } v_j = p_i, \end{cases}$$
where cells($p_i$) denotes the cells that have $p_i$ as a vertex. Sampling the continuous signal f produces a set
of samples $f_i$. Multiplying the samples by the global basis functions $\phi_i$, obtained from the reference basis
functions $\Phi_j$ via the transform T, and summing the results, we obtain the reconstructed signal $\hat f$.
We can apply the basis-function machinery and the sampling and reconstruction mechanisms to more data
attributes than surface geometry alone, e.g., to shading. Gouraud shading produces a smooth illumination
over the polygon by reconstructing the original continuous surface using piecewise linear interpolation for both
the geometry and the illumination.
0.3 Discrete Datasets
We can say that, given:
• a grid in terms of a set of cells defined by a set of sample points,
• some sampled values at the cell centers or cell vertices,
• a set of basis functions,
we can define a piecewise continuous reconstruction of the sampled signal on this grid and work with it.
We defined a continuous dataset for a function f : D → C as the triplet $\mathcal{D} = (D, C, f)$. In the
discrete case, we replace the function domain D by the sampling grid $(p_i, c_i)$, and the continuous function
f by its piecewise k-order continuous reconstruction $\hat f$ computed using the grid, the sample values $f_i$, and
a set of basis functions $\{\Phi^k_i\}$. Hence, the discrete (sampled) dataset counterpart of (D, C, f) is the tuple
$\mathcal{D}_s = (\{p_i\}, \{c_i\}, \{f_i\}, \{\Phi^k_i\})$: grid points, grid cells, sample values, and reference basis functions.
Replacing a continuous dataset $\mathcal{D}$ with its discrete counterpart $\mathcal{D}_s$ means working with a piecewise
k-order continuous function $\hat f$ instead of a potentially higher-order continuous function f. The dataset
requirements (accurate, minimal, generic, efficient, and simple) translate, for a discrete dataset, to constraints
on the number and position of sample points $p_i$, the shape of the cells $c_i$, the type of reference basis functions
$\Phi_i$, and the number and type of sample values $f_i$. These constraints determine specific implementation
solutions as follows. The cell shapes, together with the basis functions, determine different cell types. The
number and type of sample values $f_i$ determine the attribute types.
0.4 Cell Types
A grid is a collection of cells ci, whose vertices are the grid sample points pi. Given some data sampled at
the points pi, the cells are used to define supports for the basis functions φi used to interpolate the data
between the sample points.
The dimensionality d of the cells $c_i$ has to be the same as the topological dimension of the sampled
domain D, if we want to approximate D by the union of all cells $\bigcup_i c_i$. For example, if D is a plane (d = 2),
we must use planar cells, such as polygons. If D is a volume (d = 3), we must use volumetric cells, such
as tetrahedra. For each cell type we shall present the linear basis functions it supports, as well as the
coordinate transformation $T^{-1}$ that maps from locations (x, y, z) in the actual world cell to locations (r, s, t)
in the reference cell.
0.4.1 Vertex
The simplest cell type, of dimension d = 0, is identical to its single vertex, $c = \{v_1\}$. The vertex has a single,
constant basis function $\Phi^0_1(r) = 1$. In practice there is no distinction between sample points and vertex
cells.
0.4.2 Line
Line cells have dimension d = 1 and two vertices, $c = \{v_1, v_2\}$. Line cells are used to interpolate along any kind of
curve embedded in any dimension. Given the reference line cell defined by the points $v_1 = 0$, $v_2 = 1$, the
two linear basis functions are
$$\begin{aligned}
\Phi^1_1(r) &= 1 - r, \\
\Phi^1_2(r) &= r.
\end{aligned}$$
The transformation $T^{-1}$ for line cells is simply the dot product between the position vector of the desired
point in the cell p = (x, y, z) with respect to the first cell vertex $p_1$ and the cell vector $p_2 - p_1$, normalized
by the squared length of that vector so that r runs from 0 at $p_1$ to 1 at $p_2$:
$$T^{-1}_{line}(x, y, z) = r = \frac{(p - p_1) \cdot (p_2 - p_1)}{\|p_2 - p_1\|^2}.$$
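A minimal sketch of the line-cell inverse transform and interpolation follows. It assumes the dot product is normalized by the squared edge length, so that r falls in [0, 1] along the cell; the function names `line_inverse` and `line_interpolate` are hypothetical.

```python
def line_inverse(p, p1, p2):
    """Reference coordinate r of world point p on the line cell (p1, p2).
    Assumes the dot product is divided by the squared edge length."""
    edge = [b - a for a, b in zip(p1, p2)]
    rel = [x - a for a, x in zip(p1, p)]
    return sum(u * v for u, v in zip(rel, edge)) / sum(v * v for v in edge)

def line_interpolate(f1, f2, r):
    """Linear reconstruction with the basis functions (1 - r) and r."""
    return (1 - r) * f1 + r * f2

p1, p2 = (0.0, 0.0, 0.0), (2.0, 0.0, 0.0)
r = line_inverse((0.5, 0.0, 0.0), p1, p2)
print(r, line_interpolate(10.0, 20.0, r))
```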
0.4.3 Triangle
The simplest cell type in dimension d = 2 is the triangle, i.e., $c = \{v_1, v_2, v_3\}$. Triangles can be used to
interpolate along any kind of surface (planar or curved) embedded in any dimension. Given the reference
triangle cell defined by the points $v_1 = (0, 0)$, $v_2 = (1, 0)$, $v_3 = (0, 1)$, the three linear basis functions are
$$\begin{aligned}
\Phi^1_1(r, s) &= 1 - r - s, \\
\Phi^1_2(r, s) &= r, \\
\Phi^1_3(r, s) &= s.
\end{aligned}$$
The transformation $T^{-1}$ for triangular cells is
$$T^{-1}_{tri}(x, y, z) = (r, s) = \left( \frac{\|(p - p_1) \times (p_3 - p_1)\|}{\|(p_2 - p_1) \times (p_3 - p_1)\|},\ \frac{\|(p - p_1) \times (p_2 - p_1)\|}{\|(p_3 - p_1) \times (p_2 - p_1)\|} \right).$$
It is computed from cross products between the position vector $p - p_1$ of the point p in the world cell with respect
to the world cell's first vertex $p_1$ and the world cell edges $p_2 - p_1$ and $p_3 - p_1$.
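The triangle transform can be checked numerically with a short sketch. It assumes the cross-product form of the formula (ratios of cross-product magnitudes), which is valid for points inside the triangle since magnitudes discard the sign; all function names are hypothetical.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def tri_inverse(p, p1, p2, p3):
    """Reference coordinates (r, s) of a point p inside triangle (p1, p2, p3)."""
    e2, e3, d = sub(p2, p1), sub(p3, p1), sub(p, p1)
    r = norm(cross(d, e3)) / norm(cross(e2, e3))
    s = norm(cross(d, e2)) / norm(cross(e3, e2))
    return r, s

def tri_interpolate(values, r, s):
    """Linear reconstruction with the basis functions 1 - r - s, r, s."""
    return values[0] * (1 - r - s) + values[1] * r + values[2] * s

p1, p2, p3 = (0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)
r, s = tri_inverse((0.5, 1.0, 0.0), p1, p2, p3)
print(r, s, tri_interpolate([0.0, 4.0, 8.0], r, s))
```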
0.4.4 Quad
Another possibility to interpolate over two-dimensional surfaces is to use quadrilateral cells, or quads.
The reference quad is defined by the points $v_1 = (0, 0)$, $v_2 = (1, 0)$, $v_3 = (1, 1)$ and $v_4 = (0, 1)$ and is an
axis-aligned square of edge size 1. On this reference quad the basis functions are
$$\begin{aligned}
\Phi^1_1(r, s) &= (1 - r)(1 - s), \\
\Phi^1_2(r, s) &= r(1 - s), \\
\Phi^1_3(r, s) &= rs, \\
\Phi^1_4(r, s) &= (1 - r)s.
\end{aligned}$$
A good trade-off between flexibility and simplicity is to support quad cells as input data, but transform
them internally into triangle cells, by dividing every quad into two triangles using one of its two diagonals.
The transformation $T^{-1}_{quad}$ for a general quad cell deals with bilinear basis functions and cannot
be easily inverted. We can only solve it numerically for r, s as functions of x, y, z. If our actual cells are
rectangular instead of arbitrary quads, as in a uniform or rectilinear grid, we can do better. In this case
the transformation $T^{-1}_{rect}$ is
$$T^{-1}_{rect}(x, y, z) = (r, s) = \left( \frac{(p - p_1) \cdot (p_2 - p_1)}{\|p_2 - p_1\|^2},\ \frac{(p - p_1) \cdot (p_4 - p_1)}{\|p_4 - p_1\|^2} \right).$$
0.4.5 Tetrahedron
The simplest cell type in dimension d = 3 is the tetrahedron, defined by its four vertices $c = (v_1, v_2, v_3, v_4)$.
On the reference tetrahedron defined by the points $v_1 = (0, 0, 0)$, $v_2 = (1, 0, 0)$, $v_3 = (0, 1, 0)$ and $v_4 = (0, 0, 1)$,
the four linear basis functions are
$$\begin{aligned}
\Phi^1_1(r, s, t) &= 1 - r - s - t, \\
\Phi^1_2(r, s, t) &= r, \\
\Phi^1_3(r, s, t) &= s, \\
\Phi^1_4(r, s, t) &= t.
\end{aligned}$$
Given a tetrahedral cell with vertices $p_1, p_2, p_3, p_4$, the transformation $T^{-1}_{tet} = (r, s, t)$ follows the same pattern:
$$\begin{aligned}
r &= \frac{|(p - p_4) \cdot ((p_1 - p_4) \times (p_3 - p_4))|}{|(p_1 - p_4) \cdot ((p_2 - p_4) \times (p_3 - p_4))|}, \\
s &= \frac{|(p - p_4) \cdot ((p_1 - p_4) \times (p_2 - p_4))|}{|(p_1 - p_4) \cdot ((p_2 - p_4) \times (p_3 - p_4))|}, \\
t &= \frac{|(p - p_3) \cdot ((p_1 - p_3) \times (p_2 - p_3))|}{|(p_1 - p_4) \cdot ((p_2 - p_4) \times (p_3 - p_4))|}.
\end{aligned}$$
Some applications also use pyramid cells and prism cells to discretize volumetric domains. Pyramid and
prism cells can be split into tetrahedral cells.
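The tetrahedron transform above can be sketched with scalar triple products. This is an illustrative sketch with hypothetical helper names; because of the absolute values it is meant for points inside the cell.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def tet_inverse(p, p1, p2, p3, p4):
    """Reference coordinates (r, s, t) of a point p inside the tetrahedron."""
    # Shared denominator: six times the volume of the tetrahedron.
    vol = abs(dot(sub(p1, p4), cross(sub(p2, p4), sub(p3, p4))))
    r = abs(dot(sub(p, p4), cross(sub(p1, p4), sub(p3, p4)))) / vol
    s = abs(dot(sub(p, p4), cross(sub(p1, p4), sub(p2, p4)))) / vol
    t = abs(dot(sub(p, p3), cross(sub(p1, p3), sub(p2, p3)))) / vol
    return r, s, t

# On the reference tetrahedron, the transform must return the point itself.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
print(tet_inverse((0.2, 0.3, 0.4), *ref))
```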
0.4.6 Hexahedron
The next d = 3 dimensional cell type is the hexahedron, or hex, defined by its eight vertices $c = (v_1, \ldots, v_8)$.
The reference hexahedron is the axis-aligned cube of unit edge length, with $v_1$ at the origin. On this cell the
eight linear basis functions are
$$\begin{aligned}
\Phi^1_1(r, s, t) &= (1 - r)(1 - s)(1 - t), \\
\Phi^1_2(r, s, t) &= r(1 - s)(1 - t), \\
\Phi^1_3(r, s, t) &= rs(1 - t), \\
\Phi^1_4(r, s, t) &= (1 - r)s(1 - t), \\
\Phi^1_5(r, s, t) &= (1 - r)(1 - s)t, \\
\Phi^1_6(r, s, t) &= r(1 - s)t, \\
\Phi^1_7(r, s, t) &= rst, \\
\Phi^1_8(r, s, t) &= (1 - r)st.
\end{aligned}$$
We can split hexahedral cells into six tetrahedra each and then use only tetrahedra as 3D cell types, simplifying
software implementation and maintenance. $T^{-1}_{hex}$ for hexahedral cells cannot be computed analytically
and must be determined using numerical methods. However, if our actual hex cells are parallelepipeds
with orthogonal edges, they are called box cells. In this case, $T^{-1}_{hex}$ can be computed by taking dot
products of the position vector $p - p_1$ with the cell edges. For a box cell with vertices $p_1, \ldots, p_8$, we obtain:
$$T^{-1}_{box}(x, y, z) = (r, s, t) = \left( \frac{(p - p_1) \cdot (p_2 - p_1)}{\|p_2 - p_1\|^2},\ \frac{(p - p_1) \cdot (p_4 - p_1)}{\|p_4 - p_1\|^2},\ \frac{(p - p_1) \cdot (p_5 - p_1)}{\|p_5 - p_1\|^2} \right).$$
Software packages sometimes offer more cell types, such as squares and pixels (equivalent to rectangular cells),
triangle strips (a memory-efficient way to store sequences of triangle cells that share edges), polygons in 2D,
and cubes and voxels in 3D (which play the same role as squares and pixels in 2D). Some applications use quadratic
cells, which support quadratic basis functions and provide a piecewise quadratic (smoother) reconstruction of
the data, which is $C^2$ continuous; they are often used in numerical simulation applications such as finite element
methods. Quadratic cells also contain a midpoint for every edge and, for 3D cells, the centers of the cell faces.
In general, you should add new cell types to your application's data representation only if they allow you
to implement some particular visualization or data processing algorithm much more easily and/or efficiently
than the cell types your software already supports.
0.5 Grid Types
0.5.1 Uniform Grids
In a uniform grid, the domain D is an axis-aligned box, e.g., a line segment for d = 1, a rectangle for d = 2,
or a parallelepiped for d = 3. On a uniform grid, the sample points $p_i \in D \subset \mathbb{R}^d$ are equally spaced along the d
axes of the domain D. Hence, in a uniform grid, a sample point is described by its d integer coordinates
$n_1, \ldots, n_d$. These integer coordinates are sometimes called structured coordinates. A simple example of a
uniform grid is a 2D pixel image, where every pixel $p_i$ is located by two integer coordinates. This regular
point ordering allows us to define the grid cells implicitly by using the point indices.
The major advantages of uniform grids are their simple implementation and practically zero storage
requirements. Regardless of its size, storing a d-dimensional grid itself takes 3d floating-point values, i.e., only
12d bytes of memory at 4 bytes per value. Storing the actual sample values at the grid points takes storage
proportional to the number of sample points.
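To illustrate how a uniform grid defines points and cells implicitly, here is a minimal 2D sketch. It assumes the grid is described by an origin, a spacing, and a sample count per axis, and that points are numbered row by row; the function names and this layout are assumptions made for the example.

```python
def point_index(i, j, n1):
    """Flat index of the sample point with structured coordinates (i, j)."""
    return i + j * n1

def point_position(i, j, origin, spacing):
    """World position of point (i, j); nothing is stored per point."""
    return (origin[0] + i * spacing[0], origin[1] + j * spacing[1])

def cell_vertices(ci, cj, n1):
    """Point indices of quad cell (ci, cj), counter-clockwise from lower left."""
    return [point_index(ci, cj, n1), point_index(ci + 1, cj, n1),
            point_index(ci + 1, cj + 1, n1), point_index(ci, cj + 1, n1)]

# Hypothetical 4 x 3 grid: 3 values per axis describe the whole grid.
n1, n2 = 4, 3
origin, spacing = (0.0, 0.0), (0.5, 2.0)
print(point_position(2, 1, origin, spacing))
print(cell_vertices(1, 1, n1))
```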
0.5.2 Rectilinear Grids
Uniform grids are simple and efficient, but have limited modeling power. To accurately represent a function
with a non-uniform variation rate, we need either to use a high sampling density on a uniform grid, or to use a
grid with non-uniform sample density. Rectilinear grids relax the constraint of equal sampling distances for
a given axis, but keep the axis-aligned, matrix-like point ordering and implicit cell definition. These grids
are similar to the uniform ones, except that the distances $\delta_{i,j}$ between the sample points are no longer equal
along the grid axes. Implementing a rectilinear grid implies storing the grid origin $m_i$ and the sample
count $N_i$ for every dimension, as for the uniform grid. Additionally, we must store the sample steps. In total,
the storage requirements are $2d + \sum_{i=1}^{d} N_i$ values.
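A rectilinear axis can be sketched as an origin plus a list of steps, from which the sample coordinates are accumulated. This layout and the name `axis_coordinates` are assumptions for illustration, not the only possible storage scheme.

```python
from itertools import accumulate

def axis_coordinates(origin, steps):
    """Sample coordinates along one axis, built from the origin and the
    (possibly unequal) steps between consecutive samples."""
    return list(accumulate(steps, initial=origin))

# Hypothetical axes: samples get denser towards the right along x.
xs = axis_coordinates(0.0, [1.0, 0.5, 0.25])
ys = axis_coordinates(10.0, [2.0, 2.0])

def point_position(i, j):
    """Position of the point with structured coordinates (i, j)."""
    return (xs[i], ys[j])

print(xs, ys, point_position(2, 1))
```

As with uniform grids, the cells stay implicit; only the per-axis coordinates are stored.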
0.5.3 Structured Grids
In rectilinear grids the sample domain is still a rectangular box, and the sample point density can be
changed only along one axis at a time. Rectilinear grids, for example, do not allow us to place more sample
points only in the central peak region of an exponential function.
Structured grids allow explicit placement of every sample point $p_i = (x_{i1}, \ldots, x_{id})$. The user can freely
specify the coordinates $x_{ij}$ of all points. At the same time, structured grids preserve the matrix-like ordering
of the sample points. Implementing a structured grid implies storing the coordinates of all grid sample points
$p_i$ and the number of points $N_1, \ldots, N_d$ per dimension. Structured grids can represent a large number of
shapes.
0.5.4 Unstructured Grids
Structured grids can be seen as a deformation of uniform grids, where the topological ordering of the points
(cells) stays the same, but their geometrical position is allowed to vary freely.
There are, however, shapes that cannot be effectively modeled by structured grids. Unstructured grids handle
these cases by allowing both their sample points and their cells to be defined explicitly. An unstructured grid
can be modeled as a collection of sample points $p_i$, $i \in [0, N]$, and cells $c_i = (v_{i1}, \ldots, v_{iC_i})$. The values
$v_{ij} \in [0, N]$ are called cell vertices and refer to the sample points $p_{v_{ij}}$ used by the cell. A cell is thus an
ordered list of sample point indices. This model allows us to define every cell separately and independently of
the other cells. Also, cells of different type and even dimensionality can be freely mixed in the same grid, if
desired. If cells share the same sample points as their vertices, this can be directly expressed, which is useful
in several contexts:
• Storing an index represented by an integer is usually cheaper than storing a d-dimensional coordinate
(d floating-point numbers).
• We can process the grid geometry (the positions of the sample points $p_i$) independently of the grid topology,
i.e., the cell definitions.
In practice, it is preferable to use unstructured grids containing a single cell type, as these
are simpler to implement and can also lead to faster application code. The cost of storing an unstructured
grid depends on the types of cells used and the actual grid. For example, a grid of C d-dimensional cells with
V vertices per cell and N sample points would require dN + CV values.
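The point-plus-index model of an unstructured grid can be sketched as two plain lists. The data below (a unit square split into two triangles) and the helper names are hypothetical.

```python
# Geometry: explicit sample point coordinates (d = 2 values per point).
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
# Topology: each cell is an ordered list of point indices.
cells = [(0, 1, 2), (0, 2, 3)]

def cell_coordinates(points, cell):
    """Look up the vertex positions of one cell through its indices."""
    return [points[v] for v in cell]

def storage_values(points, cells):
    """Number of stored values: d*N coordinates plus all vertex indices
    (C*V when every cell has V vertices)."""
    return len(points) * len(points[0]) + sum(len(c) for c in cells)

print(cell_coordinates(points, cells[1]))
print(storage_values(points, cells))  # 2*4 coordinates + 2*3 indices
```

The shared edge between the two triangles is expressed by both cells referring to points 0 and 2, without duplicating their coordinates.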
0.6 Attributes
In visualization, the set of sample values of a sampled dataset is usually called attribute data. Attribute
data can be characterized by their dimension c, as well as the semantics of the data they represent. This
gives rise to several attribute types.
0.6.1 Scalar Attributes
Scalar attributes are c = 1 dimensional. They are represented by plain real numbers. They encode various
physical quantities such as temperature, concentration, pressure, or density, or geometrical measures, such
as length or height (e.g., an elevation plot function $f : \mathbb{R}^2 \to \mathbb{R}$).
0.6.2 Vector Attributes
Vector attributes are usually c = 2 or c = 3 dimensional. They can encode position, direction, force, or
gradients of scalar functions. Usually vectors have an orientation and a magnitude, also called length or
norm.
0.6.3 Color Attributes
Color attributes are usually c = 3 dimensional and represent the displayable colors on a computer screen.
The three components of a color attribute can have different meanings, depending on the color system in use.
In the RGB system, which is an additive system, every color is represented as a mix of pure red, green,
and blue colors in different amounts. Equal amounts of these colors determine gray shades, whereas other
combinations determine various hues.
Another popular color representation system is the HSV system, where the three color components specify
the hue, saturation, and value of a given color. The advantage of the HSV system is that it is more intuitive
for the human user. Hue distinguishes between colors of different wavelengths, such as red, yellow,
and blue. Saturation represents the color purity. A saturation of 1 corresponds to a pure, undiluted color,
whereas a saturation of 0 corresponds to white (at full value; gray at lower values). Value represents the
brightness, or luminance, of a given color. A value of 0 is always black, whereas a value of 1 is the brightest
color of a given hue and saturation that can be represented on a given system. The value, or luminance,
component of an HSV color is equal to the maximum of the R, G, and B components.
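The relation between RGB and HSV can be tried out with Python's standard `colorsys` module, which works on components in the range [0, 1]. The wrapper name `describe` and the example colors are chosen only for this illustration.

```python
import colorsys

def describe(r, g, b):
    """Convert an RGB color to HSV and return the (h, s, v) triple."""
    return colorsys.rgb_to_hsv(r, g, b)

pure_red = describe(1.0, 0.0, 0.0)   # fully saturated, full value
mid_gray = describe(0.5, 0.5, 0.5)   # equal R, G, B: saturation is 0
bluish = describe(0.2, 0.4, 0.6)

print(pure_red, mid_gray, bluish)
# The value component equals the largest of the R, G, B components.
print(bluish[2] == max(0.2, 0.4, 0.6))
```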
0.6.4 Tensor Attributes
Tensor attributes are high-dimensional generalizations of vectors and matrices. We can compute the curvature
of a planar curve using its second derivative $\frac{d^2 f}{dx^2}$, and the curvature of a 3D surface in a given direction
using its Hessian matrix H of partial derivatives. The Hessian matrix is also called the curvature tensor
of the given surface.
Besides curvature, tensors can describe other physical quantities that depend on direction, such as water
diffusivity or stress and strain in materials. Tensors are characterized by their rank. Scalars are tensors of
rank 0. Vectors are tensors of rank 1. The Hessian curvature tensor is a rank 2 symmetric tensor, since it is
expressed by a symmetric matrix.
0.6.5 Non-Numerical Attributes
Examples of possible non-numerical attribute types are text, images, file names, or even sound samples.
The main requirement on $\mathcal{D}_s$ is that it permits us to reconstruct some piecewise k-order continuous function
$\hat f : D \to C$, given the sample values $f_i \in C$. For non-numerical attributes this raises a question: what should
the multiplication between sample values $f_i$ and real-valued basis functions $\Phi_i$, and the addition of the
sample values $f_i$, in Equation (3.9) mean?
0.6.6 Properties of Attribute Data
The main purpose of attribute data is to allow a reconstruction $\hat f$ of the sampled information $f_i$. Attribute
data has several general properties:
• Attribute data, i.e., the sample values $f_i$, must be defined for all sample points $p_i$ of a dataset $\mathcal{D}_s$. If samples
at some points $p_i$ are missing, there are several solutions: (1) remove these points completely from the grid;
(2) define the missing values $f_i$ in some way, e.g., replace them with some special value (like 0); (3) define the
missing values from the existing values, using some complex interpolation scheme.
• A cell can contain any number of attributes, of any type, as long as these are defined for all data
points. We can choose whether to model our data as a single c-value dataset or as c one-value
datasets. The answer is to treat all attributes that have a related meaning as a single higher-dimensional
attribute, and to keep attributes with different meanings separate.
Operations on color attributes must consider all color components simultaneously, as the color components
R, G, B have a related meaning.
Some data visualization applications classify attribute data into:
• node or vertex attributes, defined at the vertices of the grid cells, which correspond to a sampled
dataset, and
• cell attributes, defined at the center points of the grid cells, which correspond to a sampled dataset that
uses constant basis functions.
Vertex attributes can be converted to cell attributes and conversely by resampling.
The attribute components are sometimes related by constraints. This happens for normal attributes
$n \in \mathbb{R}^3$, where the three components are constrained to yield unit-length normals, i.e.,
$|n| = \sqrt{n_x^2 + n_y^2 + n_z^2} = 1$.
Depending on the choice of the basis functions, interpolating these components separately as scalar values
may not preserve the unit-length property of the interpolated normal n. A first solution is to interpolate the
components separately and then enforce the desired constraint on the result by normalizing it, i.e., replacing
n with n/|n| (this works when the sample values do not vary too strongly across a grid cell). A second solution
is to represent the constraint directly in the data attributes, rather than enforcing it after interpolation. For
normal attribute types, this means representing 3D normals as two independent orientations, e.g., using
polar coordinates α, β, instead of using the three x, y, z components, which are dependent via the unit-length
constraint. We can now interpolate the normal orientations α, β using the desired basis functions, and will
always obtain a unit-length normal.
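The first solution (interpolate the components, then renormalize) can be sketched as follows. The function names are hypothetical, and the sketch uses linear interpolation along a line cell for simplicity.

```python
import math

def length(n):
    return math.sqrt(sum(c * c for c in n))

def normalize(n):
    """Enforce the unit-length constraint by replacing n with n/|n|."""
    size = length(n)
    return tuple(c / size for c in n)

def lerp_normal(n1, n2, r):
    """Interpolate the components separately, then renormalize.
    Returns both the raw and the normalized result for comparison."""
    raw = tuple((1 - r) * a + r * b for a, b in zip(n1, n2))
    return raw, normalize(raw)

raw, unit = lerp_normal((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), 0.5)
print(length(raw))   # shorter than 1: the constraint is violated
print(length(unit))  # back to unit length after normalization
```

For two nearly opposite normals the raw interpolated vector approaches zero length, which is why this approach works best when the normals vary little across a cell.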
0.7 Computing Derivatives of Sampled Data
One of the requirements for a sampled dataset $\mathcal{D}_s = (p_i, c_i, f_i, \Phi_i)$ is that it should be generic: we can easily
replace various data processing operations available for the continuous counterpart with equivalent operations
on $\mathcal{D}_s$. Computing derivatives is one such operation.
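If $\hat f = \sum_{i=1}^{N} f_i \phi_i$, then
$$\frac{\partial \hat f}{\partial x_i} = \sum_{j=1}^{N} f_j \frac{\partial \phi_j}{\partial x_i}.$$
Using the expressions of the reference basis functions:
$$\frac{\partial \hat f}{\partial x_i} = \sum_{j=1}^{N} f_j \frac{\partial \Phi_j}{\partial x_i}(r).$$
We now use the chain rule,
$$\frac{\partial \Phi_j}{\partial x_i} = \sum_{k=1}^{d} \frac{\partial \Phi_j}{\partial r_k} \frac{\partial r_k}{\partial x_i},$$
to obtain
$$\frac{\partial \hat f}{\partial x_i} = \sum_{j=1}^{N} f_j \sum_{k=1}^{d} \frac{\partial \Phi_j}{\partial r_k} \frac{\partial r_k}{\partial x_i}.$$
Finally, we can rewrite the last equation in a convenient matrix form, as follows:
$$\begin{pmatrix} \frac{\partial \hat f}{\partial x_1} \\ \frac{\partial \hat f}{\partial x_2} \\ \vdots \\ \frac{\partial \hat f}{\partial x_d} \end{pmatrix}
= \sum_{j=1}^{N} f_j
\underbrace{\begin{pmatrix}
\frac{\partial r_1}{\partial x_1} & \frac{\partial r_2}{\partial x_1} & \cdots & \frac{\partial r_d}{\partial x_1} \\
\frac{\partial r_1}{\partial x_2} & \frac{\partial r_2}{\partial x_2} & \cdots & \frac{\partial r_d}{\partial x_2} \\
\vdots & \vdots & & \vdots \\
\frac{\partial r_1}{\partial x_d} & \frac{\partial r_2}{\partial x_d} & \cdots & \frac{\partial r_d}{\partial x_d}
\end{pmatrix}}_{\text{inverse Jacobian matrix } J^{-1}}
\begin{pmatrix} \frac{\partial \Phi_j}{\partial r_1} \\ \frac{\partial \Phi_j}{\partial r_2} \\ \vdots \\ \frac{\partial \Phi_j}{\partial r_d} \end{pmatrix}$$
The matrix above is called the inverse Jacobian matrix $J^{-1} = (\partial r_j / \partial x_i)_{ij}$, with row index i and
column index j as displayed. This matrix is the inverse of the Jacobian matrix $J = (\partial x_j / \partial r_i)_{ij}$. Using $T^{-1}$,
we can rewrite the inverse Jacobian as
$$J^{-1} = \left( \frac{\partial T^{-1}_j(x_1, \ldots, x_d)}{\partial x_i} \right)_{ij},$$
where $T^{-1}_j$ denotes the j-th component of the function $T^{-1}$. Putting it all together,
we get the formula for computing the partial derivatives of a sampled dataset $\hat f$ with respect to all
coordinates $x_i$:
$$\left( \frac{\partial \hat f}{\partial x_i} \right)_i = \sum_{k=1}^{N} f_k \left( \frac{\partial T^{-1}_j}{\partial x_i} \right)_{ij} \left( \frac{\partial \Phi_k}{\partial r_j} \right)_j.$$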
To use this equation in practice, we need to evaluate the derivatives of both the reference basis functions
$\Phi_k$ and $T^{-1}$ for every cell type. Alternatively, we can evaluate the Jacobian matrix instead of its inverse,
using the reference-cell to world-cell coordinate transform T instead of $T^{-1}$, then numerically invert J, and
finally apply Equation (3.33). For all cells described in Section 3.4, the coordinate transformations $T^{-1}$ are
linear functions of the arguments $x_i$, so their derivatives are constant. Hence, the derivatives of $\hat f$ are of the
same order as those of the basis functions $\Phi_k$ we choose to use.
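As a worked instance of the derivative formula, the sketch below computes the gradient inside an axis-aligned rectangular cell with bilinear basis functions. It assumes that for such a cell r = (x - x1)/width and s = (y - y1)/height, so the inverse Jacobian reduces to diag(1/width, 1/height); the function names are hypothetical.

```python
def quad_basis_gradients(r, s):
    """(dPhi/dr, dPhi/ds) for the four bilinear reference basis functions."""
    return [(-(1 - s), -(1 - r)), (1 - s, -r), (s, r), (-s, 1 - r)]

def gradient_in_rect_cell(values, width, height, r, s):
    """Gradient (df/dx, df/dy) of the reconstruction at reference point (r, s).
    For an axis-aligned rectangle the inverse Jacobian is diag(1/width, 1/height)."""
    grads = quad_basis_gradients(r, s)
    df_dr = sum(f * g[0] for f, g in zip(values, grads))
    df_ds = sum(f * g[1] for f, g in zip(values, grads))
    return df_dr / width, df_ds / height

# Hypothetical test: f(x, y) = 3x + 2y sampled at the corners of a 2 x 4 rectangle
# with its first vertex at the origin; the exact gradient is (3, 2) everywhere.
width, height = 2.0, 4.0
values = [0.0, 3 * width, 3 * width + 2 * height, 2 * height]
print(gradient_in_rect_cell(values, width, height, 0.3, 0.7))
```

Since the basis functions are bilinear, the reconstructed gradient of a linear function is exact; for general data it varies linearly across the cell.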
Partial derivatives of $\hat f$ inside a given cell are computed by linearly interpolating the 1D derivatives of $\hat f$
along opposite cell edges. A similar result can be obtained for rectilinear grids as well as for hexahedral cells.
If a dataset is noisy, the computed derivatives tend to exhibit even stronger noise than the original data. A
simple method to limit these problems is to pre-filter the input dataset in order to eliminate high-frequency
noise, using methods such as the Laplacian smoothing described in Section 8.4. However, smoothing can
also eliminate important information from the dataset together with the noise.
0.8 Implementation
0.8.1 Grid Implementation
0.9 Advanced Data Representation
Sometimes more advanced forms of data manipulation and representation are needed. We will describe the
task of data resampling, which is used in the process of converting information between different types of
datasets that have different sample points, cells or basis functions.
0.9.1 Data Resampling
Let us consider piecewise constant normals, i.e., the polygon normals themselves. These are discontinuous at the
polygon vertices and, actually, along the complete polygon edges, so we cannot use them directly as approximations
of the vertex normals. How can we compute vertex normal values from the known polygon normals? The
answer is provided by an operation called resampling.
Resampling computes the values $f'_i$ of the target dataset as a function of the values $f_i$ of the source dataset.
For simplicity, we assume that both datasets use the same set of basis functions $\Phi_i$.
Let us now consider a common resampling operation in data visualization: converting cell attributes ($f_i$)
to vertex attributes ($f'_i$). Cell attributes imply the use of constant basis functions $\Phi_i$; vertex attributes, in
contrast, imply the use of higher-order basis functions, such as linear ones. In addition, we want the
sample points of the target grid cells (the target grid vertices) to be identical to the source vertices, so that
the two grids match.
Vertex data is the area-weighted average of the cell data in the cells that use a given vertex. Conversely, cell
attributes are the average of the cell’s vertex attributes.
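Both resampling directions can be sketched on an unstructured grid stored as index lists. The function names and the example data are hypothetical, and the sketch assumes the cell areas are already known and every vertex is used by at least one cell.

```python
def cells_to_vertices(num_points, cells, cell_values, cell_areas):
    """Vertex value = area-weighted average of the values of the cells
    that use the vertex. Assumes every vertex belongs to some cell."""
    sums = [0.0] * num_points
    weights = [0.0] * num_points
    for cell, value, area in zip(cells, cell_values, cell_areas):
        for v in cell:
            sums[v] += area * value
            weights[v] += area
    return [s / w for s, w in zip(sums, weights)]

def vertices_to_cells(cells, vertex_values):
    """Cell value = plain average of the cell's vertex values."""
    return [sum(vertex_values[v] for v in cell) / len(cell) for cell in cells]

# Two hypothetical triangles sharing the edge between points 0 and 2.
cells = [(0, 1, 2), (0, 2, 3)]
vertex_values = cells_to_vertices(4, cells, [1.0, 3.0], [1.0, 3.0])
print(vertex_values)
print(vertices_to_cells(cells, vertex_values))
```

Converting to vertices and back does not return the original cell values, which illustrates that resampling generally loses information.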
Resampling data from cells to vertices increases the assumed continuity. If our original sampled data
were indeed continuous of that order, no problem appears. However, if the original data contained, e.g.,
zero-order discontinuities, such as jumps or holes, resampling it to a higher-continuity grid also throws
away discontinuities that might have been a feature of the data and not a sampling artifact. In contrast,
resampling from a higher continuity (vertex data) to a lower continuity (cell data) has fewer side effects:
overall, the smoothness of the data decreases globally.
Two other frequently used resampling operations are subsampling and supersampling. Subsampling
reduces the number of sample points, typically keeping a subset of the original dataset points; working
with a smaller dataset improves processing speed and lowers memory demands. After eliminating some of
the points, a subsampling operation can choose or redistribute the remaining points in order to obtain a
better approximation of the original data. Subsampling implementations can take advantage of the dataset
topology. A desirable property of subsampling is to keep most samples in the regions of rapid data variation
and cull most samples from the regions of slow data variation. A technique called uniform subsampling is
simple and effective when the original dataset is densely sampled. It is used on uniform, rectilinear, and
structured grids to keep every k-th point along every dimension and discard the remaining ones.
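As a small illustration of uniform subsampling, the sketch below keeps every k-th sample along both axes of a 2D structured grid. It assumes the grid is stored as a list of rows; the function name uniform_subsample is made up for this example.

```python
def uniform_subsample(grid, k):
    """Keep every k-th sample along both axes of a 2D grid stored as a list of rows."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return [row[::k] for row in grid[::k]]


# A 5 x 7 grid whose sample at row j, column i stores the value 10 * j + i.
grid = [[10 * j + i for i in range(7)] for j in range(5)]
coarse = uniform_subsample(grid, 2)
print(coarse)  # [[0, 2, 4, 6], [20, 22, 24, 26], [40, 42, 44, 46]]
```

Because the grid is structured, no cell information has to be rebuilt: the kept samples again form a regular grid.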
Supersampling or refinement is the inverse of subsampling: more data points are created from an
existing dataset. It is useful in situations when we try to create or manipulate information on a dataset at
a level of detail, or scale, that is below the one captured by the sampling frequency. Uniform supersampling
introduces k points into every cell of the original dataset. An efficient supersampling implementation usually
inserts extra points only in those regions where we need to further add extra information.
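Uniform supersampling can be sketched in the same spirit for a 1D grid. The text does not say how the values of the inserted points are obtained; the sketch below assumes linear interpolation between the two cell vertices, and the function name uniform_supersample is hypothetical.

```python
def uniform_supersample(samples, k):
    """Insert k linearly interpolated points into every cell of a 1D grid.

    Assumes `samples` holds at least one value, stored at equally spaced vertices.
    """
    refined = []
    for a, b in zip(samples, samples[1:]):
        refined.append(a)
        for j in range(1, k + 1):
            t = j / (k + 1)  # position of the new point inside the cell
            refined.append((1 - t) * a + t * b)
    refined.append(samples[-1])
    return refined


print(uniform_supersample([0.0, 4.0, 2.0], 1))  # [0.0, 2.0, 4.0, 3.0, 2.0]
```

Note that the new points carry no new information by themselves; they only provide places where extra detail can be stored later.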
0.9.2 Scattered Point Interpolation
There are situations when we would like to avoid constructing and storing a grid of cells to represent
the data domain. A 3D scanner delivers a scattered 3D point set, also called a point cloud: points and their
corresponding data values (pi, fi). For a scanner, the data values fi are the surface normals and/or colors
measured by the device.
How do we reconstruct a continuous surface if we are given such a set of points and normals?
One way is to construct a grid from the scattered points (triangulation): an unstructured grid with 2D cells,
e.g., triangles, which have the points pi as vertices and approximate the surface as closely as possible.
A second way is gridless interpolation. Storing the cell information can double the amount of memory
required in the worst case. To reconstruct a continuous function from a scattered point set we need a set
of gridless basis functions. There are several ways to construct such functions; a frequently used choice for
gridless basis functions is radial basis functions, or RBFs. These functions depend only on the distance
between the current point and the origin:
\[ r = |x| = \sqrt{\sum_{i=1}^{d} x_i^2}. \]
RBFs smoothly drop from 1 at their origin (r = 0) to a vanishing value for large values of the distance
r. To limit the effect of a basis function to its immediate neighborhood, we specify a radius of influence R,
or support radius, beyond which Φ is equal to zero. In this setup a common RBF is the Gaussian function.
\[
\Phi(x) = \begin{cases} e^{-k r^2}, & r < R, \\ 0, & r \ge R, \end{cases}
\]
where r = |x|.
The parameter k ≥ 0 controls the decay speed, or the shape, of the radial basis functions. Setting k = 0
yields constant, cylinder-shaped radial functions, which are equivalent to the constant basis functions for
grid-based datasets. Another popular choice is the inverse distance function, defined as
\[
\Phi(x) = \begin{cases} \dfrac{1}{1 + r^2}, & r < R, \\ 0, & r \ge R, \end{cases}
\]
where r = |x|.
The radius values Ri control the influence of the sample data value of a point pi. Higher values of Ri yield
smoother reconstructions at higher computational cost; lower values of Ri yield less smooth reconstructions
but higher performance. In practice, setting Ri to the average inter-point distance in the neighborhood of
point pi gives a good balance between smoothness and efficiency.
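The two radial basis functions with compact support can be written down directly. This is a minimal sketch; the function names and the default value k = 1 are assumptions made for this example.

```python
import math


def gaussian_rbf(r, R, k=1.0):
    """Gaussian RBF exp(-k r^2), cut off to zero at the support radius R."""
    return math.exp(-k * r * r) if r < R else 0.0


def inverse_distance_rbf(r, R):
    """Inverse distance RBF 1 / (1 + r^2), cut off to zero at the support radius R."""
    return 1.0 / (1.0 + r * r) if r < R else 0.0


print(gaussian_rbf(0.0, R=1.0))         # 1.0 at the origin
print(gaussian_rbf(0.5, R=1.0, k=0.0))  # 1.0: k = 0 gives a constant function inside R
print(gaussian_rbf(2.0, R=1.0))         # 0.0 outside the support radius
print(inverse_distance_rbf(1.0, R=2.0)) # 0.5
```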
Given a point p, we shall sum only those basis functions φk that are nonzero at p. In case of radial
basis functions, we must find the k nearest sample points p1, . . . , pk to p so that |p − pk| < Rk. One way
to accomplish this is to store all sample points pi in a spatial search structure such as a kd-tree. Spatial
search structures provide efficient retrieval of the k nearest neighbors at any given location. A good, scalable
implementation of such a search structure is provided by the Approximate Nearest Neighbor (ANN) library.
Scattered point datasets are sometimes called unstructured point datasets. However, if the purpose
of a dataset is to provide a piecewise continuous reconstruction of its data samples, we also need to specify
a choice for the basis functions Φi to have a complete dataset (pi, fi, Φi). To effectively perform the
reconstruction, searching methods are needed that return the sample points pi located in the neighborhood
of a given point p.
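A gridless reconstruction from a complete dataset (pi, fi, Φi) can be sketched as follows. Several choices here are assumptions and not taken from the text: the basis functions are Gaussians, the sum is normalized by the total weight (a Shepard-style average, so that constant data is reproduced exactly), and the neighbor search is a brute-force loop standing in for a spatial search structure such as a kd-tree.

```python
import math


def reconstruct(p, points, values, radii, k=1.0):
    """Normalized sum of the Gaussian RBFs whose support contains p.

    Returns None when p lies outside the support of every sample point.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    # Brute-force search; a kd-tree would return the nearby points directly.
    for p_i, f_i, R_i in zip(points, values, radii):
        r = math.dist(p, p_i)
        if r < R_i:  # only basis functions that are nonzero at p contribute
            w = math.exp(-k * r * r)
            weighted_sum += w * f_i
            total_weight += w
    return weighted_sum / total_weight if total_weight > 0.0 else None


points = [(0.0, 0.0), (2.0, 0.0)]
values = [1.0, 3.0]
radii = [1.5, 1.5]

print(reconstruct((0.0, 0.0), points, values, radii))    # 1.0: only the first point is in range
print(reconstruct((1.0, 0.0), points, values, radii))    # halfway between the two samples
print(reconstruct((10.0, 10.0), points, values, radii))  # None: no sample point nearby
```

The brute-force loop costs time proportional to the number of sample points for every query, which is the reason the text recommends a spatial search structure for large point sets.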
What have you learned in this chapter?
This chapter lays out a discussion of discrete data representation, continuous data sampling, and
reconstruction. Fundamental differences between continuous (sampled) and discrete data are outlined. It
introduces basis functions, discrete meshes, and cells as means of constructing piecewise continuous
approximations from sampled data. I learned about various types of datasets commonly used in visualization
practice: their advantages, limitations, and constraints. This chapter gives an understanding of the various
trade-offs involved in the choice of a dataset for a given visualization application, while focusing on the
efficiency of implementing the most commonly used datasets, presented with cell types in d ∈ [0, 3] dimensions.
What surprised you the most? I was surprised to find out that there are a few representations of
colors and mappings between the RGB and HSV spaces.
I was surprised to find out how gridless interpolation works and that it exists. Also, that reconstruction
of scattered/unstructured point datasets requires searching methods to locate the nearest sample points
in the neighborhood of a given point.
I was surprised that datasets with attributes such as text, images, or relations form the target of infor-
mation visualization applications, since they are purely discrete, and often not defined on a spatial domain.
What applications not mentioned in the book could you imagine for the techniques explained
in this chapter? I can imagine a dataset that stores high-dimensional attributes in order
to allow just enough continuity to perform various types of resampling between target and source grids of
a certain type. Selecting a set of useful grids and proper resampling might improve the original visualization
model so that it focuses more on the nature of the signal and depends less on the structure/representation of
its sampled data.
1. EXERCISE 1
Consider the following datasets:
• The evolution in time of the prices of N different stock-exchange shares, recorded at one-second intervals
over the period of one hour.
• The paths covered by all cars driving through a given city, recorded at one-minute intervals over the
period of one hour. For each record, we store the car ID, the car’s position, and the car’s speed.
• The amount of rainfall and the air temperature, recorded at a given time instant at N
given weather stations over some geographical area.
Describe the kind of grid, grid cells, and data attributes that you would use to store such a dataset.
Argue your proposal by considering the kind of data to store, and the locations at which data is recorded
(sampled).
• grid: uniform 1D grid with 1-second intervals; grid cells: line segments 1 second long; data attributes:
the price of each of our N shares (3600 samples per hour times N shares = 3600N values to store).
• grid: a data structure with spatial search that utilizes the average inter-point distance between points;
cells: gridless radial basis functions with compact support; data attributes: car ID, car’s position, car’s
speed, basis functions.
• grid: rectilinear structured grid with specified sampling locations; cells: quads; data attributes:
amount of rainfall, temperature, location.
2. EXERCISE 2
Sampling and reconstruction are closely related operations which reduce a function y = f(x) to a finite
set of sample points (xi, yi) and, respectively, reconstruct an approximation ŷ = f̂(x) of f(x) from the
sample points. Consider an application where you have to perform the above reconstruction f̂(x), but you
are only allowed to use a fixed finite number N of sample points xi. How would you place these sample
points over the domain of definition of x so that the reconstruction error ‖f̂ − f‖ is equally well minimized
over the entire range of x?
Hints: first, consider the kinds of basis functions you want to use (e.g., constant or linear). Next, consider
how you can minimize the reconstruction error by shifting the points xi around the x axis.
• In the case of constant basis functions we can use a uniform sampling density, with N points placed at
equal distances from each other.
• In the case of linear basis functions we can use a non-uniform sampling density in order to assign more
sample points to those areas of the domain where the function’s higher-order derivatives change fast.
3. EXERCISE 3
In Figure 3.10 in Chapter 3 (also displayed below), it is shown that we can use structured grids to cover
a 2D disk shape. Now, consider an arbitrary convex 2D shape of genus 0 (that is, without holes). The
2D shape is specified by means of its contour, which is given as a closed 2D polyline of N points.
• Can we always construct a structured grid so that all points of this polyline will be also points on the
grid’s boundary? If not, sketch a simple counter-example.
• Can we always construct a structured grid with the conditions listed in the point above and the
additional condition that no grid-boundary point exists which is not a polyline point? If not, sketch a
simple counter-example.
Hints: Think about the number of points on the boundary of a structured grid.
• Yes, we can always construct a structured grid of N points and N − 2 triangular cells. Since all internal
angles of the shape are less than 180 degrees, it is always possible to take one vertex and connect
it to the remaining non-adjacent vertices, forming N − 2 triangles of a triangular structured grid.
• Yes, for a convex 2D polyline it is always possible to use triangles as described above, so that all
polyline points are also grid-boundary points.
4. EXERCISE 4
As shown in Figure 3.11 in Chapter 3 (also shown below), not all 2D shapes can be covered by structured
grids. Consider now a 3D (curved) surface of a half sphere. Can we cover this surface with a structured
grid? Argue your answer.
Yes, we can cover such a half sphere with a structured grid consisting of tetrahedral cells. The shape
consists of only one component, and I assume that the genus of the domain equals 0.
5. EXERCISE 5
Consider the 2D cells in the figure below. For each cell, scalar data values vi are indicated at its sample
points (vertices). Additionally, a separate point p inside the cell is indicated. If bilinear interpolation is
used, compute the interpolated value v(p) of the vertex data values vi at the point p. Detail your answer by
explaining how you computed the interpolated value.
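• For rectangular quad:
\[
T^{-1}_{rect} = (r, s) = \left( \frac{(p - p_1) \cdot (p_2 - p_1)}{\|p_2 - p_1\|^2},\; \frac{(p - p_1) \cdot (p_4 - p_1)}{\|p_4 - p_1\|^2} \right),
\]
where:
\[
\left\{ \begin{array}{l}
p - p_1 = (4 - x_1, 3 - y_1) = (3, 1), \\
p_2 - p_1 = (x_2 - x_1, y_2 - y_1) = (4, 0), \\
p_4 - p_1 = (x_4 - x_1, y_4 - y_1) = (0, 3), \\
\|p_2 - p_1\|^2 = (x_2 - x_1)^2 = 4^2 = 16, \\
\|p_4 - p_1\|^2 = (y_4 - y_1)^2 = 3^2 = 9.
\end{array} \right.
\]
\[
T^{-1}_{rect} = (r, s) = \left( \frac{(3, 1) \cdot (4, 0)}{16},\; \frac{(3, 1) \cdot (0, 3)}{9} \right) = \left( \frac{12}{16},\; \frac{3}{9} \right) = \left( \frac{3}{4},\; \frac{1}{3} \right).
\]
Calculating 4 basis functions as follows:
\[
\Phi_i(T^{-1}_{rect}) = \left\{ \begin{array}{l}
\Phi^1_1(r, s) = (1 - r)(1 - s) = (1 - 3/4)(1 - 1/3) = 1/6, \\
\Phi^1_2(r, s) = r(1 - s) = (3/4) \cdot (1 - 1/3) = 3/4 \times 2/3 = 1/2, \\
\Phi^1_3(r, s) = rs = 3/4 \times 1/3 = 3/12 = 1/4, \\
\Phi^1_4(r, s) = (1 - r)s = (1 - 3/4) \cdot (1/3) = 1/4 \times 1/3 = 1/12.
\end{array} \right.
\]
Finally, we calculate the value:
\[
v(p) = \hat{p}(x, y) = \sum_{i=1}^{4} v_i \Phi^1_i = 3 \cdot \frac{1}{6} + 1 \cdot \frac{1}{2} + 4 \cdot \frac{1}{4} + 0 \cdot \frac{1}{12} = 1/2 + 1/2 + 1 + 0 = 2.
\]
Answer: v(p) = 2.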
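The rectangular case can be cross-checked numerically with exact fractions. In this sketch the absolute coordinates p1 = (1, 2), p2 = (5, 2), p4 = (1, 5) and p = (4, 3) are inferred from the offsets used in the computation (the figure itself is not reproduced here), and the function name bilinear_rect is made up for this example.

```python
from fractions import Fraction


def bilinear_rect(p, p1, p2, p4, v1, v2, v3, v4):
    """Bilinear interpolation in an axis-aligned rectangle with corner p1 and edges p1p2, p1p4."""
    ax, ay = p2[0] - p1[0], p2[1] - p1[1]
    bx, by = p4[0] - p1[0], p4[1] - p1[1]
    dx, dy = p[0] - p1[0], p[1] - p1[1]
    # Reference coordinates (r, s): projections of p - p1 onto the two cell edges.
    r = Fraction(dx * ax + dy * ay) / Fraction(ax * ax + ay * ay)
    s = Fraction(dx * bx + dy * by) / Fraction(bx * bx + by * by)
    weights = [(1 - r) * (1 - s), r * (1 - s), r * s, (1 - r) * s]
    return sum(w * v for w, v in zip(weights, (v1, v2, v3, v4)))


value = bilinear_rect(p=(4, 3), p1=(1, 2), p2=(5, 2), p4=(1, 5), v1=3, v2=1, v3=4, v4=0)
print(value)  # 2
```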
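• For triangular cell:
\[
T^{-1}_{tri} = (r, s) = \left( \frac{(p - p_1) \times (p_3 - p_1)}{(p_2 - p_1) \times (p_3 - p_1)},\; \frac{(p - p_1) \times (p_2 - p_1)}{(p_3 - p_1) \times (p_2 - p_1)} \right),
\]
where a × b = a_x b_y − a_y b_x is the 2D cross product and:
\[
\left\{ \begin{array}{l}
p - p_1 = (3 - x_1, 3 - y_1) = (2, 1), \\
p_2 - p_1 = (x_2 - x_1, y_2 - y_1) = (4, 1), \\
p_3 - p_1 = (x_3 - x_1, y_3 - y_1) = (0, 3).
\end{array} \right.
\]
\[
T^{-1}_{tri} = (r, s) = \left( \frac{(2, 1) \times (0, 3)}{(4, 1) \times (0, 3)},\; \frac{(2, 1) \times (4, 1)}{(0, 3) \times (4, 1)} \right) = \left( \frac{6}{12},\; \frac{-2}{-12} \right) = \left( \frac{1}{2},\; \frac{1}{6} \right).
\]
Calculating 3 basis functions as follows:
\[
\Phi_i(T^{-1}_{tri}) = \left\{ \begin{array}{l}
\Phi^1_1(r, s) = 1 - r - s = 1 - 1/2 - 1/6 = 1/3, \\
\Phi^1_2(r, s) = r = 1/2, \\
\Phi^1_3(r, s) = s = 1/6.
\end{array} \right.
\]
Finally, we calculate the value:
\[
v(p) = \hat{p}(x, y) = \sum_{i=1}^{3} v_i \Phi^1_i = 3 \cdot \frac{1}{3} + 1 \cdot \frac{1}{2} + 4 \cdot \frac{1}{6} = 1 + 1/2 + 2/3 = \frac{13}{6}.
\]
Answer: v(p) = 13/6 ≈ 2.17.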
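The triangle case can be cross-checked in the same way. The sketch below evaluates the two cross-product ratios with exact fractions; the absolute coordinates p1 = (1, 2), p2 = (5, 3), p3 = (1, 5) and p = (3, 3) are inferred from the offsets listed in the computation, and the helper names cross2 and barycentric_weights are made up for this example.

```python
from fractions import Fraction


def cross2(a, b):
    """2D cross product a_x * b_y - a_y * b_x."""
    return a[0] * b[1] - a[1] * b[0]


def barycentric_weights(p, p1, p2, p3):
    """Linear basis function values (1 - r - s, r, s) of the point p in the triangle p1 p2 p3."""
    d = (p[0] - p1[0], p[1] - p1[1])
    e2 = (p2[0] - p1[0], p2[1] - p1[1])
    e3 = (p3[0] - p1[0], p3[1] - p1[1])
    r = Fraction(cross2(d, e3)) / Fraction(cross2(e2, e3))
    s = Fraction(cross2(d, e2)) / Fraction(cross2(e3, e2))
    return 1 - r - s, r, s


weights = barycentric_weights(p=(3, 3), p1=(1, 2), p2=(5, 3), p3=(1, 5))
vertex_values = (3, 1, 4)
value = sum(w * v for w, v in zip(weights, vertex_values))
print(weights, value)
```

A quick sanity check on any hand computation: for a point inside the triangle, the three weights must all be non-negative and sum to 1, so the interpolated value lies between the smallest and the largest vertex value.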
6. EXERCISE 6
Consider the 2D cells in the figures below. For each cell, vector data values vi are indicated at its sample
points (vertices). Additionally, a separate point p inside the cell is indicated. If bilinear interpolation is
used, compute the interpolated value v(p) of the vertex data values vi at the point p. Detail your answer by
explaining how you computed the interpolated value.
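• For rectangular quad:
\[
T^{-1}_{rect} = (r, s) = \left( \frac{(p - p_1) \cdot (p_2 - p_1)}{\|p_2 - p_1\|^2},\; \frac{(p - p_1) \cdot (p_4 - p_1)}{\|p_4 - p_1\|^2} \right),
\]
where:
\[
\left\{ \begin{array}{l}
p - p_1 = (4 - x_1, 3 - y_1) = (3, 1), \\
p_2 - p_1 = (x_2 - x_1, y_2 - y_1) = (4, 0), \\
p_4 - p_1 = (x_4 - x_1, y_4 - y_1) = (0, 3), \\
\|p_2 - p_1\|^2 = (x_2 - x_1)^2 = 4^2 = 16, \\
\|p_4 - p_1\|^2 = (y_4 - y_1)^2 = 3^2 = 9.
\end{array} \right.
\]
\[
T^{-1}_{rect} = (r, s) = \left( \frac{(3, 1) \cdot (4, 0)}{16},\; \frac{(3, 1) \cdot (0, 3)}{9} \right) = \left( \frac{12}{16},\; \frac{3}{9} \right) = \left( \frac{3}{4},\; \frac{1}{3} \right).
\]
Calculating 4 basis functions as follows:
\[
\Phi_i(T^{-1}_{rect}) = \left\{ \begin{array}{l}
\Phi^1_1(r, s) = (1 - r)(1 - s) = (1 - 3/4)(1 - 1/3) = 1/6, \\
\Phi^1_2(r, s) = r(1 - s) = (3/4) \cdot (1 - 1/3) = 3/4 \times 2/3 = 1/2, \\
\Phi^1_3(r, s) = rs = 3/4 \times 1/3 = 3/12 = 1/4, \\
\Phi^1_4(r, s) = (1 - r)s = (1 - 3/4) \cdot (1/3) = 1/4 \times 1/3 = 1/12.
\end{array} \right.
\]
Finally, we calculate the value:
\[
v(p) = \hat{p}(x, y) = \sum_{i=1}^{4} v_i \Phi^1_i = (1, 0) \cdot \frac{1}{6} + (0, 1) \cdot \frac{1}{2} + (1, 1) \cdot \frac{1}{4} + (2, 1) \cdot \frac{1}{12}
= \left(\frac{1}{6}, 0\right) + \left(0, \frac{1}{2}\right) + \left(\frac{1}{4}, \frac{1}{4}\right) + \left(\frac{1}{6}, \frac{1}{12}\right) = (7/12, 5/6).
\]
Answer: v(p) ≈ (0.58, 0.83).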
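For vector data the same basis function values are simply applied to every vector component separately. A short sketch with exact fractions, using the weights 1/6, 1/2, 1/4, 1/12 and the vertex vectors from the computation above (the function name interpolate_vectors is made up for this example):

```python
from fractions import Fraction


def interpolate_vectors(weights, vectors):
    """Weighted sum of vectors, computed component by component."""
    dimension = len(vectors[0])
    return tuple(
        sum(w * v[c] for w, v in zip(weights, vectors)) for c in range(dimension)
    )


weights = [Fraction(1, 6), Fraction(1, 2), Fraction(1, 4), Fraction(1, 12)]
vectors = [(1, 0), (0, 1), (1, 1), (2, 1)]
v_p = interpolate_vectors(weights, vectors)
print(v_p)  # (Fraction(7, 12), Fraction(5, 6))
```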
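• For triangular cell:
\[
T^{-1}_{tri} = (r, s) = \left( \frac{(p - p_1) \times (p_3 - p_1)}{(p_2 - p_1) \times (p_3 - p_1)},\; \frac{(p - p_1) \times (p_2 - p_1)}{(p_3 - p_1) \times (p_2 - p_1)} \right),
\]
where a × b = a_x b_y − a_y b_x is the 2D cross product and:
\[
\left\{ \begin{array}{l}
p - p_1 = (3 - x_1, 3 - y_1) = (2, 1), \\
p_2 - p_1 = (x_2 - x_1, y_2 - y_1) = (4, 1), \\
p_3 - p_1 = (x_3 - x_1, y_3 - y_1) = (0, 3).
\end{array} \right.
\]
\[
T^{-1}_{tri} = (r, s) = \left( \frac{(2, 1) \times (0, 3)}{(4, 1) \times (0, 3)},\; \frac{(2, 1) \times (4, 1)}{(0, 3) \times (4, 1)} \right) = \left( \frac{6}{12},\; \frac{-2}{-12} \right) = \left( \frac{1}{2},\; \frac{1}{6} \right).
\]
Calculating 3 basis functions as follows:
\[
\Phi_i(T^{-1}_{tri}) = \left\{ \begin{array}{l}
\Phi^1_1(r, s) = 1 - r - s = 1 - 1/2 - 1/6 = 1/3, \\
\Phi^1_2(r, s) = r = 1/2, \\
\Phi^1_3(r, s) = s = 1/6.
\end{array} \right.
\]
Finally, we calculate the value:
\[
v(p) = \hat{p}(x, y) = \sum_{i=1}^{3} v_i \Phi^1_i = (0, -1) \cdot \frac{1}{3} + (1, 0) \cdot \frac{1}{2} + (1, 1) \cdot \frac{1}{6}
= \left(0, -\frac{1}{3}\right) + \left(\frac{1}{2}, 0\right) + \left(\frac{1}{6}, \frac{1}{6}\right) = (2/3, -1/6).
\]
Answer: v(p) = (2/3, −1/6) ≈ (0.67, −0.17).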
7. EXERCISE 7
Color selection, by end users, is typically done by various widgets which represent the space of available
colors, such as the color wheel, color hexagon, or three separate color sliders for the R, G, and B (or
alternatively H, S, and V ) color components. Assume, now, that we want to select only colors present in a
given subset of the entire color space. Concretely, we have a large set of color photographs, and we next want
to select only colors predominantly present in these photographs, rather than any possible color. Sketch and
argue for a color-selection widget that would optimally help users to select only these specific colors. Hints:
Think how to modify any of the existing color-selection widgets to ‘focus’ on a specific color range where
many samples exist.
We can specify the subset of colors we are interested in by specifying R, G, B values for each color in
our sample. We then have a scattered field of dots inside the color cube. After doing so, we can perform
supersampling by adding even more dots in the neighbourhood of each specified color. Finally, we can
interpolate a 3D surface along each axis and project the result onto the color cube facets, or onto the RGB
hexagon.
Also, we can modify the HSV color wheel by cutting out the segments corresponding to the lowest density of
our color sample, so that the remaining colors represent the majority of our color samples.
8. EXERCISE 8
Consider a grid where we have color data values recorded at its cell vertices. We would like to use linear
interpolation to compute colors at all points inside the grid cells. We can do this by interpolating colors
represented as RGB triplets or, alternatively, colors represented as HSV triplets. Discuss the advantages
and disadvantages of both schemes. Can you imagine a situation where the RGB interpolation would be
arguably preferable to HSV interpolation? Can you imagine a situation when the converse (HSV interpolation
is preferable to RGB interpolation) is true? Describe such situations or alternatively argue for the fact that
they do not exist.
HSV interpolation gives better results since hue is separated from luminance and saturation. In the RGB
color scheme we need to interpolate along all 3 components.
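The difference between the two schemes can be seen on a single edge whose end points are pure red and pure green. The sketch below uses Python's standard colorsys module; the helper names are made up for this example, and the hue is interpolated naively, without taking the shorter way around the hue circle, which is itself a known weakness of HSV interpolation.

```python
import colorsys


def lerp(a, b, t):
    """Component-wise linear interpolation between two triplets."""
    return tuple((1 - t) * x + t * y for x, y in zip(a, b))


def lerp_rgb(c0, c1, t):
    """Interpolate two RGB colors directly in RGB space."""
    return lerp(c0, c1, t)


def lerp_hsv(c0, c1, t):
    """Interpolate two RGB colors in HSV space and convert the result back to RGB."""
    hsv0 = colorsys.rgb_to_hsv(*c0)
    hsv1 = colorsys.rgb_to_hsv(*c1)
    return colorsys.hsv_to_rgb(*lerp(hsv0, hsv1, t))


red, green = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
mid_rgb = lerp_rgb(red, green, 0.5)
mid_hsv = lerp_hsv(red, green, 0.5)
print(mid_rgb)  # (0.5, 0.5, 0.0): a darker, olive-like yellow
print(mid_hsv)  # approximately (1.0, 1.0, 0.0): a fully bright, saturated yellow
```

In RGB the midpoint loses brightness, while in HSV the value and saturation of the end points are preserved and only the hue changes. On the other hand, HSV interpolation between colors of very different hue passes through intermediate hues that are present in neither end point.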
9. EXERCISE 9
Consider a grid cell, such as a 1D line, 2D triangle or quad, or 3D parallelepiped or cube, and some
scalar values vi recorded at the cell vertices. Consider that we are using linear interpolation to reconstruct
the sampled scalar signal v(x) at any point x inside the cell. Does a cell shape exist, and a point x in that
cell, so that v(x) is larger than the maximum of vi over all cell vertices? Does a cell shape exist, and a point
x in that cell, so that v(x) is smaller than the minimum of vi over all cell vertices? Argue your answers.
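No. Inside a cell the linear basis functions are non-negative and sum to 1, so v(x) is a weighted average
of the vertex values vi and can neither exceed their maximum nor fall below their minimum.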
10. EXERCISE 10
Consider a grid-cell like in the Exercise 9, and some color values vi recorded at the cell vertices. Consider
that we are using linear interpolation to compute a color v(x) at any point x inside the cell. Does a point
x exist so that v(x) is brighter than any of the colors vi? Does a point x exist so that v(x) is darker than
any of the colors vi? Do the answers to the above two sub-questions depend on the choice of the system, or
space, used to represent colors (RGB or HSV )? Explain your answer.
No.
 
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 

03 Data Representation

  • 1. Reading Report 3. Data Representation. Valerii Klymchuk, April 3, 2015. 0. EXERCISE 0. Chapter 3. Summary. 0.1 Continuous Data. Many phenomena are modeled in terms of various physical quantities. In data representation these quantities can be classified into two fundamentally different categories: intrinsically continuous and intrinsically discrete ones. Continuous data are usually manipulated by computers in some finite approximate form. Continuous sampled data are also discrete, since they consist of a finite set of data elements; however, in contrast to intrinsically discrete data, sampled data always originate from, and are intended to approximate, a continuous quantity. Intrinsically discrete data have no counterpart in the continuous world, as is the case for a page of text, for example. This is a fundamental difference between continuous (sampled) and discrete data. Mathematically, continuous data can be modeled as a function $f : D \subset \mathbb{R}^d \to C \subset \mathbb{R}^c$ between the domain $D$ and the codomain $C$. $f$ is called a d-dimensional, or d-variate, c-value function. In visualization, $f$ is sometimes called a field. The function $f$ is continuous if its graph is a connected surface without “holes” or “jumps.” The Cauchy $\epsilon$-$\delta$ criterion states that $f$ is continuous if for every point $p \in D$ the following holds: $\forall \epsilon > 0, \exists \delta > 0 : \|x - p\| < \delta, x \in D \Rightarrow \|f(x) - f(p)\| < \epsilon$. Also, $f$ is continuous of order $k$ if the function itself and all its derivatives up to and including order $k$ are continuous in this sense. This is denoted as $f \in C^k$. Functions $f$ whose derivatives are continuous on compact intervals are called piecewise continuous. The triplet $\mathcal{D} = (D, C, f)$ defines a continuous dataset. The dimension $d$ of the space $\mathbb{R}^d$ in which the function's domain is embedded is called the geometrical dimension. The topological dimension of the dataset is the dimension $s \le d$ of the function domain $D$ itself, i.e., the number of independent variables that we need to represent our domain $D$.
For a line or curve in Euclidean space $\mathbb{R}^3$ we have $s = 1$ and $d = 3$; if $D$ is a plane or curved surface, then $s = 2$. We assume that the geometrical dimension is always fixed to $d = 3$; hence, the only dimension that varies among datasets is the topological dimension $s$, and therefore in practice it is often called the dataset dimension. The co-dimension of an object of topological dimension $s$ and geometrical dimension $d$ is the difference $d - s$. The function values are usually called dataset attributes. The dimensionality $c$ of the function codomain $C$ is also called the attribute dimension (it usually ranges from 1 to 4). 0.2 Sampled Data. Two operations relate sampled data and continuous data:
  • 2. • sampling: given a continuous dataset, we have to be able to produce sampled data from it; • reconstruction: given a sampled dataset, we have to be able to recover an (approximated) version of the original continuous data. Reconstruction involves specifying the value of the function between its sample points from the sample values, using a technique called interpolation. The reconstruction quality is a function of the amount and distribution of the sample points used. To be used in practice, a sampled dataset should comply with several requirements: it should be accurate, minimal, generic, efficient, and simple. By accurate, we mean that one should be able to control the production of a sampled dataset $\mathcal{D}_s$ from a continuous one $\mathcal{D}$ such that $\mathcal{D}$ can be reconstructed from $\mathcal{D}_s$ with a small, user-specified error. By minimal, we mean that $\mathcal{D}_s$ contains the least number of sample points needed to ensure a reconstruction with the desired error. By generic, we mean that we can easily replace the various data processing operations we had for the continuous $\mathcal{D}$ with equivalent counterparts for the sampled $\mathcal{D}_s$. By efficient, we mean that both the reconstruction operation and the data processing operations we wish to perform on $\mathcal{D}_s$ can be done efficiently from an algorithmic point of view. By simple, we mean that we can design a reasonably simple software implementation of both $\mathcal{D}_s$ and the operations we want to perform on it. We define reconstruction as follows: given a sampled dataset $\{p_i, f_i\}$ consisting of a set of $N$ sample points $p_i \in D$ and sample values $f_i \in C$, we want to produce a continuous function $\hat{f} : D \to C$ that approximates the original $f$. The reconstructed function should equal the original one at all sample points, i.e., $\hat{f}(p_i) = f(p_i) = f_i$. One way to define a reconstructed function that satisfies this property is to set $\hat{f} = \sum_{i=1}^{N} f_i \phi_i$, where $\phi_i : D \to C$ are called basis functions or interpolation functions.
In other words, we defined the reconstruction operation as a weighted sum of a given set of basis functions $\phi_i$, where the weights are exactly our sample values $f_i$. Since we want $\hat{f}(p_j) = f_j$ for all sample points $p_j$, we get $\sum_{i=1}^{N} f_i \phi_i(p_j) = f_j, \forall j$. This equation must hold for any function $f$, which requires $\phi_i(p_j) = 1$ if $i = j$ and $\phi_i(p_j) = 0$ if $i \ne j$. This property is sometimes referred to as the orthogonality of basis functions. Let us now consider the constant function $g(x) = 1$ for any $x \in D$. We obtain $\sum_{i=1}^{N} \phi_i(p_j) = 1, \forall p_j \in D$, or $\sum_{i=1}^{N} \phi_i(x) = 1, \forall x \in D$. The property described in the equation above is called the normality of basis functions. Basis functions that are both orthogonal and normal are called orthonormal. To reconstruct a sampled function, we can use different orthonormal basis functions. A grid, sometimes also called a mesh, is a subdivision of a given domain $D \subset \mathbb{R}^d$ into a collection of cells, sometimes also called elements, denoted $c_i$. The union of the cells completely covers the sampled domain, i.e., $\bigcup_i c_i = D$, and the cells are non-overlapping, i.e., $c_i \cap c_j = \emptyset, \forall i \ne j$. We can now define the simplest set of basis functions, the constant basis functions. These functions approximate a given function by the piecewise, per-cell, constant sample value $f_i$ (every point $x \in D$ is assigned the sample value of the nearest cell center). For this reason, piecewise constant interpolation is also called nearest-neighbor interpolation. Constant basis functions are simple to implement and have virtually no computational cost, and they work for any cell shape and in any dimension. However, they provide a poor, staircase-like approximation $\hat{f}$ of the original $f$: the reconstruction is visibly discontinuous from one cell to the next. By using higher-order basis functions we can provide a better and more continuous reconstruction. The next-simplest basis functions beyond the constant ones are the linear basis functions.
To use these, however, we need to make some assumptions about the cell types used in the grid. Let us consider a single quadrilateral cell $c$ having the vertices $(v_1, v_2, v_3, v_4)$, where $v_1 = (0, 0)$, $v_2 = (1, 0)$, $v_3 = (1, 1)$ and $v_4 = (0, 1)$, i.e., an axis-aligned square of edge size 1 with the origin as first vertex. We call this the reference
  • 3. cell in $\mathbb{R}^2$. Coordinates in the reference cell $[0, 1]^d$ are called reference coordinates: $r_1, \ldots, r_d$ (or $r, s, t$ for $d = 3$). We now define four local basis functions $\Phi^1_1, \Phi^1_2, \Phi^1_3$, and $\Phi^1_4$, with $\Phi^1_i : [0, 1]^2 \to \mathbb{R}$, as follows: $\Phi^1_1(r, s) = (1 - r)(1 - s)$, $\Phi^1_2(r, s) = r(1 - s)$, $\Phi^1_3(r, s) = rs$, $\Phi^1_4(r, s) = (1 - r)s$. These basis functions are indeed orthonormal. For any point $(r, s)$ in the reference cell, we can now use these basis functions to define a function $\hat{f}(r, s) = \sum_{i=1}^{4} f_i \Phi^1_i(r, s)$ as a sum of linear basis functions, which makes it a first-order continuous reconstruction of the four sample values $f_1, f_2, f_3, f_4$ defined at the cell vertices. For an arbitrary quadrilateral cell $c$ in $\mathbb{R}^3$, we can define a coordinate transformation $T : [0, 1]^2 \to \mathbb{R}^3$ that maps our reference cell to $c$. We want to map the reference cell vertices $v_i$ to the corresponding world cell vertices $p_i$, so $T(v_i) = p_i$. We define the transformation $T$ using our reference basis functions, mapping a point $(r, s, t)$ in the reference cell coordinate system to a point $(x, y, z)$ in the actual cell: $(x, y, z) = T(r, s, t) = \sum_{i=1}^{n} p_i \Phi^1_i(r, s, t)$. If $T$ maps the reference cell to the world cell, then its inverse $T^{-1}$ maps points $(x, y, z)$ in the world cell to points $(r, s, t)$ in the reference cell, where our basis functions $\Phi^1_i$ are defined. Using $T^{-1}$, we can rewrite Equation (3.2) for our quad cell $c$: $\hat{f}(x, y) = \sum_{i=1}^{4} f_i \Phi^1_i(T^{-1}(x, y))$. In order to compute the inverse transformation $T^{-1}$, we must invert the expression given by Equation (3.8). This inversion depends on the actual cell type. We now have a way to reconstruct a piecewise linear function $\hat{f}$ from samples on any quad grid: for every cell $c$ in the grid, we simply apply Equation (3.9). We can now finally define our piecewise linear reconstruction in terms of a set of global basis functions $\phi$, just as we did for piecewise constant reconstruction (Equation (3.6)).
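The reference-quad basis functions above can be checked numerically. The following is a minimal sketch in Python, not code from the report; the function names `quad_basis` and `interpolate_quad` are hypothetical and chosen only for this illustration. It evaluates the four bilinear basis functions, verifies orthogonality at the reference vertices, and reconstructs a value inside the reference cell.

```python
# Reference quad vertices v1..v4, in the order used in the text.
REFERENCE_VERTICES = [(0, 0), (1, 0), (1, 1), (0, 1)]


def quad_basis(r, s):
    """Values of the four bilinear basis functions at reference coordinates (r, s)."""
    return [(1 - r) * (1 - s), r * (1 - s), r * s, (1 - r) * s]


def interpolate_quad(samples, r, s):
    """Weighted sum of the vertex samples f1..f4 with the basis functions as weights."""
    return sum(f * phi for f, phi in zip(samples, quad_basis(r, s)))


# Orthogonality: basis function i is 1 at vertex i and 0 at the other vertices.
for i, (r, s) in enumerate(REFERENCE_VERTICES):
    assert quad_basis(r, s) == [1 if j == i else 0 for j in range(4)]

# Normality: the basis functions sum to 1 at an interior point.
assert abs(sum(quad_basis(0.25, 0.75)) - 1.0) < 1e-12

# At the cell center every vertex contributes with weight 1/4.
center_value = interpolate_quad([10.0, 20.0, 30.0, 40.0], 0.5, 0.5)
```

Because the weights at the center are all 1/4, `center_value` is the plain average of the four vertex samples.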
Given a grid with sample points $p_i$ and quad cells $c_i$, we can define our grid-wise linear basis functions $\phi^1_i$ as follows: $\phi^1_i(x, y) = 0$ if $(x, y) \notin \mathrm{cells}(p_i)$, and $\phi^1_i(x, y) = \Phi^1_j(T^{-1}(x, y))$ if $(x, y) \in c = \{v_1, v_2, v_3, v_4\}$, where $v_j = p_i$. Here $\mathrm{cells}(p_i)$ denotes the cells that have $p_i$ as a vertex. Sampling the continuous signal $f$ produces a set of samples $f_i$. Multiplying the samples by the global basis functions $\phi_i$, obtained from the reference basis functions $\Phi_j$ via the transform $T$, and summing, we obtain the reconstructed signal $\hat{f}$. The basis-function machinery and the sampling and reconstruction mechanisms can be applied to more data attributes than surface geometry alone, e.g., to shading. Gouraud shading produces a smooth illumination over a polygon by reconstructing the original continuous surface using piecewise linear interpolation for both the geometry and the illumination. 0.3 Discrete Datasets. We can say that, given: • a grid in terms of a set of cells defined by a set of sample points, • some sampled values at the cell centers or cell vertices, • a set of basis functions, we can define a piecewise continuous reconstruction of the sampled signal on this grid and work with it. We defined a continuous dataset for a function $f : D \to C$ as the triplet $\mathcal{D} = (D, C, f)$. In the discrete case, we replace the function domain $D$ by the sampling grid $(p_i, c_i)$, and the continuous function
  • 4. $f$ by its piecewise k-order continuous reconstruction $\hat{f}$ computed using the grid, the sample values $f_i$, and a set of basis functions $\{\Phi^k_i\}$. Hence, the discrete (sampled) dataset counterpart of $(D, C, f)$ is the tuple $\mathcal{D}_s = (\{p_i\}, \{c_i\}, \{f_i\}, \{\Phi^k_i\})$: grid points, grid cells, sample values, and reference basis functions. Replacing a continuous dataset $\mathcal{D}$ with its discrete counterpart $\mathcal{D}_s$ means working with a piecewise k-order continuous function $\hat{f}$ instead of a potentially higher-order continuous function $f$. The dataset requirements (accurate, minimal, generic, efficient, and simple) translate, for a discrete dataset, to constraints on the number and position of sample points $p_i$, the shape of cells $c_i$, the type of reference basis functions $\Phi_i$, and the number and type of sample values $f_i$. These constraints determine specific implementation solutions as follows. The cell shapes, together with the basis functions, determine different cell types. The number and type of sample values $f_i$ determine the attribute types. 0.4 Cell Types. A grid is a collection of cells $c_i$, whose vertices are the grid sample points $p_i$. Given some data sampled at the points $p_i$, the cells are used to define supports for the basis functions $\phi_i$ used to interpolate the data between the sample points. The dimensionality $d$ of the cells $c_i$ has to be the same as the topological dimension of the sampled domain $D$ if we want to approximate $D$ by the union of all cells $\bigcup_i c_i$. For example, if $D$ is a plane ($d = 2$), we must use planar cells, such as polygons. If $D$ is a volume ($d = 3$), we must use volumetric cells, such as tetrahedra. For each cell type we shall present the linear basis functions it supports, as well as the coordinate transformation $T^{-1}$ that maps from locations $(x, y, z)$ in the actual world cell to locations $(r, s, t)$ in the reference cell. 0.4.1 Vertex. The simplest cell type, of dimension $d = 0$, is identical to its single vertex, $c = \{v_1\}$. The vertex has a single, constant basis function $\Phi^0_1(r) = 1$.
In practice there is no distinction between sample points and vertex cells. 0.4.2 Line. Line cells have dimension $d = 1$ and two vertices, $c = \{v_1, v_2\}$. Line cells are used to interpolate along any kind of curve embedded in any dimension. Given the reference line cell defined by the points $v_1 = 0$, $v_2 = 1$, the two linear basis functions are $\Phi^1_1(r) = 1 - r$, $\Phi^1_2(r) = r$. The transformation $T^{-1}$ for line cells is simply the dot product between the position vector of the desired point $p = (x, y, z)$ with respect to the cell's first vertex $p_1$ and the cell vector $p_2 - p_1$, normalized by the squared cell length as in the rectangle case of Section 0.4.4: $T^{-1}_{line}(x, y, z) = r = \frac{(p - p_1) \cdot (p_2 - p_1)}{\|p_2 - p_1\|^2}$. 0.4.3 Triangle. The simplest cell type in dimension $d = 2$ is the triangle, i.e., $c = \{v_1, v_2, v_3\}$. Triangles can be used to interpolate along any kind of surface (planar or curved) embedded in any dimension. Given the reference triangle cell defined by the points $v_1 = (0, 0)$, $v_2 = (1, 0)$, $v_3 = (0, 1)$, the three linear basis functions are $\Phi^1_1(r, s) = 1 - r - s$, $\Phi^1_2(r, s) = r$, $\Phi^1_3(r, s) = s$. The transformation $T^{-1}$ for triangular cells is $T^{-1}_{tri}(x, y, z) = (r, s) = \left( \frac{(p - p_1) \times (p_3 - p_1)}{(p_2 - p_1) \times (p_3 - p_1)}, \frac{(p - p_1) \times (p_2 - p_1)}{(p_3 - p_1) \times (p_2 - p_1)} \right)$. It is computed from cross products between the position vector $p - p_1$ of the point $p$ in the world cell with respect to the world cell's first vertex $p_1$ and the world cell edges $p_2 - p_1$ and $p_3 - p_1$.
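The inverse transformation for triangles can be illustrated with a short Python sketch. It is restricted to triangles in the plane, where the cross product of two vectors reduces to a scalar; this restriction, and the helper names `cross2`, `triangle_reference_coords`, and `interpolate_triangle`, are assumptions made for the example rather than anything given in the report.

```python
def cross2(a, b):
    """Scalar (z-component) cross product of two 2D vectors."""
    return a[0] * b[1] - a[1] * b[0]


def sub(a, b):
    """Component-wise difference a - b of two 2D points."""
    return (a[0] - b[0], a[1] - b[1])


def triangle_reference_coords(p, p1, p2, p3):
    """Map a world point p to reference coordinates (r, s) of triangle p1, p2, p3."""
    d, e2, e3 = sub(p, p1), sub(p2, p1), sub(p3, p1)
    denom = cross2(e2, e3)  # twice the signed triangle area
    if denom == 0:
        raise ValueError("degenerate triangle")
    r = cross2(d, e3) / denom
    s = cross2(d, e2) / cross2(e3, e2)
    return r, s


def interpolate_triangle(samples, p, p1, p2, p3):
    """Linear reconstruction using the basis functions 1 - r - s, r, s."""
    r, s = triangle_reference_coords(p, p1, p2, p3)
    f1, f2, f3 = samples
    return f1 * (1 - r - s) + f2 * r + f3 * s


# Vertex samples of the linear field f(x, y) = 2x + 3y on one triangle.
P1, P2, P3 = (1.0, 1.0), (3.0, 1.0), (1.0, 4.0)
SAMPLES = [2 * x + 3 * y for x, y in (P1, P2, P3)]
value = interpolate_triangle(SAMPLES, (2.0, 2.0), P1, P2, P3)
```

Since the sampled field is itself linear, the reconstruction reproduces it exactly (up to rounding) at any point of the triangle.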
  • 5. 0.4.4 Quad. Another possibility to interpolate over two-dimensional surfaces is to use quadrilateral cells, or quads. The reference quad is defined by the points $v_1 = (0, 0)$, $v_2 = (1, 0)$, $v_3 = (1, 1)$ and $v_4 = (0, 1)$ and is an axis-aligned square of edge size 1. On this reference quad the basis functions are $\Phi^1_1(r, s) = (1 - r)(1 - s)$, $\Phi^1_2(r, s) = r(1 - s)$, $\Phi^1_3(r, s) = rs$, $\Phi^1_4(r, s) = (1 - r)s$. A good trade-off between flexibility and simplicity is to support quad cells as input data, but transform them internally into triangle cells by dividing every quad into two triangles using one of its two diagonals. The transformation $T^{-1}_{quad}$ for a general quad cell involves bilinear basis functions and cannot be easily inverted. We can only solve it numerically for $r, s$ as functions of $x, y, z$. If our actual cells are rectangular instead of arbitrary quads, as in a uniform or rectilinear grid, we can do better. In this case the transformation is $T^{-1}_{rect}(x, y, z) = (r, s) = \left( \frac{(p - p_1) \cdot (p_2 - p_1)}{\|p_2 - p_1\|^2}, \frac{(p - p_1) \cdot (p_4 - p_1)}{\|p_4 - p_1\|^2} \right)$. 0.4.5 Tetrahedron. The simplest cell type in dimension $d = 3$ is the tetrahedron, defined by its four vertices $c = (v_1, v_2, v_3, v_4)$. On the reference tetrahedron defined by the points $v_1 = (0, 0, 0)$, $v_2 = (1, 0, 0)$, $v_3 = (0, 1, 0)$ and $v_4 = (0, 0, 1)$, the four linear basis functions are $\Phi^1_1(r, s, t) = 1 - r - s - t$, $\Phi^1_2(r, s, t) = r$, $\Phi^1_3(r, s, t) = s$, $\Phi^1_4(r, s, t) = t$. Given a tetrahedral cell with vertices $p_1, p_2, p_3, p_4$, the transformation $T^{-1}_{tet}(x, y, z) = (r, s, t)$ follows the same pattern: $r = \frac{|(p - p_4) \cdot ((p_1 - p_4) \times (p_3 - p_4))|}{|(p_1 - p_4) \cdot ((p_2 - p_4) \times (p_3 - p_4))|}$, $s = \frac{|(p - p_4) \cdot ((p_1 - p_4) \times (p_2 - p_4))|}{|(p_1 - p_4) \cdot ((p_2 - p_4) \times (p_3 - p_4))|}$, $t = \frac{|(p - p_3) \cdot ((p_1 - p_3) \times (p_2 - p_3))|}{|(p_1 - p_4) \cdot ((p_2 - p_4) \times (p_3 - p_4))|}$. Some applications also use pyramid cells and prism cells to discretize volumetric domains. Pyramid and prism cells can be split into tetrahedral cells.
0.4.6 Hexahedron. The next $d = 3$ dimensional cell type is the hexahedron, or hex, defined by its eight vertices $c = (v_1, \ldots, v_8)$. The reference hexahedron is the axis-aligned cube of unit edge length, with $v_1$ at the origin. On this cell the eight linear basis functions are $\Phi^1_1(r, s, t) = (1 - r)(1 - s)(1 - t)$, $\Phi^1_2(r, s, t) = r(1 - s)(1 - t)$, $\Phi^1_3(r, s, t) = rs(1 - t)$, $\Phi^1_4(r, s, t) = (1 - r)s(1 - t)$, $\Phi^1_5(r, s, t) = (1 - r)(1 - s)t$, $\Phi^1_6(r, s, t) = r(1 - s)t$, $\Phi^1_7(r, s, t) = rst$, $\Phi^1_8(r, s, t) = (1 - r)st$. We can split hexahedral cells into six tetrahedra each and then use only tetrahedra as 3D cell types, simplifying software implementation and maintenance. $T^{-1}_{hex}$ for general hexahedral cells cannot be computed analytically and must be determined using numerical methods. However, in case our actual hex cells are parallelepipeds
  • 6. (orthogonal edges), these cells can be called box cells. In this case, $T^{-1}_{hex}$ can be computed by taking dot products of the position vector $p - p_1$ with the cell edges. For a box cell with vertices $p_1, \ldots, p_8$, we obtain: $T^{-1}_{box}(x, y, z) = (r, s, t) = \left( \frac{(p - p_1) \cdot (p_2 - p_1)}{\|p_2 - p_1\|^2}, \frac{(p - p_1) \cdot (p_4 - p_1)}{\|p_4 - p_1\|^2}, \frac{(p - p_1) \cdot (p_5 - p_1)}{\|p_5 - p_1\|^2} \right)$. Software packages sometimes offer more cell types, such as squares and pixels (identical to rectangle cells), triangle strips (a memory-efficient way to store sequences of triangle cells that share edges), polygons in 2D, and cubes and voxels in 3D (which play the same role as squares and pixels in 2D). Some applications use quadratic cells, which support quadratic basis functions and provide a piecewise quadratic (smoother) reconstruction of data, which is C2 continuous; they are often used in numerical simulation applications such as finite element methods. Quadratic cells also contain a midpoint for each edge and, for 3D cells, the centers of cell faces. In general, you should add new cell types to your application data representation only if these allow you to implement some particular visualization or data processing algorithms much more easily and/or efficiently than the cell types your software already supports. 0.5 Grid Types. 0.5.1 Uniform Grids. In a uniform grid, the domain $D$ is an axis-aligned box, e.g., a line segment for $d = 1$, a rectangle for $d = 2$, or a parallelepiped for $d = 3$. On a uniform grid, sample points $p_i \in D \subset \mathbb{R}^d$ are equally spaced along the $d$ axes of the domain $D$. Hence, in a uniform grid, a sample point is described by its $d$ integer coordinates $n_1, \ldots, n_d$. These integer coordinates are sometimes called structured coordinates. A simple example of a uniform grid is a 2D pixel image, where every pixel $p_i$ is located by two integer coordinates. This regular point ordering allows us to define the grid cells implicitly by using the point indices.
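The implicit cell definition on a uniform grid can be sketched as follows. This is a minimal illustration, not code from the report: the class name `UniformGrid2D`, its method names, and the row-major point numbering are assumptions chosen for the example.

```python
class UniformGrid2D:
    """Hypothetical 2D uniform grid that stores only extents and sample counts."""

    def __init__(self, x_min, x_max, y_min, y_max, nx, ny):
        if nx < 2 or ny < 2:
            raise ValueError("need at least two sample points per axis")
        self.x_min, self.y_min = x_min, y_min
        self.nx, self.ny = nx, ny
        # Constant spacing between neighboring sample points along each axis.
        self.dx = (x_max - x_min) / (nx - 1)
        self.dy = (y_max - y_min) / (ny - 1)

    def point_index(self, i, j):
        """Flat index of the sample point with structured coordinates (i, j)."""
        return i + j * self.nx

    def point(self, i, j):
        """World position of sample point (i, j), computed rather than stored."""
        return (self.x_min + i * self.dx, self.y_min + j * self.dy)

    def cell_vertices(self, ci, cj):
        """Point indices of quad cell (ci, cj), ordered like the reference quad."""
        return [
            self.point_index(ci, cj),
            self.point_index(ci + 1, cj),
            self.point_index(ci + 1, cj + 1),
            self.point_index(ci, cj + 1),
        ]


grid = UniformGrid2D(0.0, 2.0, 0.0, 1.0, nx=3, ny=2)
second_cell = grid.cell_vertices(1, 0)
```

No point coordinates or cell lists are stored: both are derived on demand from the integer coordinates, which is why the storage cost of the grid itself does not grow with the number of samples.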
The major advantages of uniform grids are their simple implementation and practically zero storage requirements. Regardless of its size, storing a d-dimensional grid itself takes $3d$ floating-point values, i.e., only $12d$ bytes of memory at 4 bytes per value. Storing the actual sample values at the grid points takes storage proportional to the number of sample points. 0.5.2 Rectilinear Grids. Uniform grids are simple and efficient, but have limited modeling power. To accurately represent a function with a non-uniform variation rate, we need either to use a high sampling density on a uniform grid, or to use a grid with non-uniform sample density. Rectilinear grids relax the constraint of equal sampling distances along a given axis, but keep the axis-aligned, matrix-like point ordering and the implicit cell definition. These grids are similar to the uniform ones, except that the distances $\delta_{i,j}$ between the sample points are no longer equal along the grid axes. Implementing a rectilinear grid implies storing the grid origin $m_i$ and the sample count $N_i$ for every dimension, as for the uniform grid. Additionally, we must store the sample steps. In total, the storage requirements are $2d + \sum_{i=1}^{d} N_i$ values. 0.5.3 Structured Grids. In rectilinear grids the sampled domain is still a rectangular box, and the sample point density can be changed only one axis at a time. Rectilinear grids, for example, do not allow us to place more sample points only in the central peak region of an exponential function. Structured grids allow explicit placement of every sample point $p_i = (x_{i1}, \ldots, x_{id})$. The user can freely specify the coordinates $x_{ij}$ of all points. At the same time, structured grids preserve the matrix-like ordering of the sample points. Implementing a structured grid implies storing the coordinates of all grid sample points $p_i$ and the number of points $N_1, \ldots, N_d$ per dimension. Structured grids can represent a large number of shapes.
  • 7. 0.5.4 Unstructured Grids. Structured grids can be seen as a deformation of uniform grids, where the topological ordering of the points (cells) stays the same, but their geometrical position is allowed to vary freely. There are, however, shapes that cannot be effectively modeled by structured grids. Unstructured grids allow defining both their sample points and their cells explicitly. An unstructured grid can be modeled as a collection of sample points $p_i$, $i \in [0, N]$, and cells $c_i = (v_{i1}, \ldots, v_{iC_i})$. The values $v_{ij} \in [0, N]$ are called cell vertices and refer to the sample points $p_{v_{ij}}$ used by the cell. A cell is thus an ordered list of sample point indices. This model allows us to define every cell separately and independently of the other cells. Also, cells of different types and even dimensionalities can be freely mixed in the same grid, if desired. If cells share the same sample points as their vertices, this can be directly expressed, which is useful in several contexts: * storing an index, represented by an integer, is usually cheaper than storing a d-dimensional coordinate (d floating-point numbers); * we can process the grid geometry (the positions of the sample points $p_i$) independently of the grid topology, i.e., the cell definitions. In practice, it is preferable to use unstructured grids containing a single cell type, as these are simpler to implement and can also lead to faster application code. The cost of storing an unstructured grid depends on the types of cells used and on the actual grid. For example, a grid of $C$ d-dimensional cells with $V$ vertices per cell and $N$ sample points would require $dN + CV$ values. 0.6 Attributes. In visualization, the set of sample values of a sampled dataset is usually called attribute data. Attribute data can be characterized by their dimension $c$, as well as by the semantics of the data they represent. This gives rise to several attribute types. 0.6.1 Scalar Attributes. Scalar attributes are $c = 1$ dimensional. They are represented by plain real numbers.
They encode various physical quantities such as temperature, concentration, pressure, or density, or geometrical measures such as length or height (e.g., an elevation plot function $f : \mathbb{R}^2 \to \mathbb{R}$). 0.6.2 Vector Attributes. Vector attributes are usually $c = 2$ or $c = 3$ dimensional. They can encode position, direction, force, or gradients of scalar functions. Usually vectors have an orientation and a magnitude, also called length or norm. 0.6.3 Color Attributes. Color attributes are usually $c = 3$ dimensional and represent the displayable colors on a computer screen. The three components of a color attribute can have different meanings, depending on the color system in use, such as the RGB system. RGB is an additive system, since every color is represented as a mix of pure red, green, and blue colors in different amounts. Equal amounts of these colors determine gray shades, whereas other combinations determine various hues. Another popular color representation system is the HSV system, where the three color components specify the hue, saturation, and value of a given color. The advantage of the HSV system is that it is more intuitive for the human user. Hue distinguishes between colors of different wavelengths, such as red, yellow, and blue. Saturation represents the color purity. A saturation of 1 corresponds to a pure, undiluted color, whereas a saturation of 0 corresponds to white. Value represents the brightness, or luminance, of a given color. A value of 0 is always black, whereas a value of 1 is the brightest color of a given hue and saturation that can be represented on a given system. The value (luminance) component of an HSV color is equal to the maximum of the R, G, and B components. 0.6.4 Tensor Attributes. Tensor attributes are high-dimensional generalizations of vectors and matrices. We can compute the curvature of a planar curve using its second derivative $\frac{d^2 f}{dx^2}$, and the curvature of a 3D surface in a given direction
  • 8. using its Hessian matrix $H$ of partial derivatives. The Hessian matrix is also called the curvature tensor of the given surface. Besides curvature, tensors can describe other physical quantities that depend on direction, such as water diffusivity or stress and strain in materials. Tensors are characterized by their rank. Scalars are tensors of rank 0. Vectors are tensors of rank 1. The Hessian curvature tensor is a rank 2 symmetric tensor, since it is expressed by a symmetric matrix. 0.6.5 Non-Numerical Attributes. Examples of possible non-numerical attribute types are text, images, file names, or even sound samples. The main property of $\mathcal{D}_s$ is to permit us to reconstruct some piecewise, k-order continuous function $\hat{f} : D \to C$, given the sample values $f_i \in C$. For non-numerical attributes this raises a question: what should be the meaning of the multiplication between the sample values $f_i$ and the real-valued basis functions $\Phi_i$, and of the addition of the sample values $f_i$, in Equation (3.9)? 0.6.6 Properties of Attribute Data. The main purpose of attribute data is to allow a reconstruction $\hat{f}$ of the sampled information $f_i$. Attribute data has several general properties: • attribute data, i.e., the sample values $f_i$, must be defined for all sample points $p_i$ of a dataset $\mathcal{D}_s$. If samples at some points $p_i$ are missing, there are several solutions: 1. remove these points completely from the grid; 2. define the missing values $f_i$ in some way or replace them with some special value (like 0); 3. define the missing values from the existing values, using some more complex interpolation scheme. • cells can contain any number of attributes, of any type, as long as these are defined for all data points. We can choose whether we want to model our data as a single c-value dataset or as c one-value datasets. The answer is to consider all attributes that have a related meaning as a single higher-dimensional attribute, and to keep attributes with different meanings separate.
Operations on color attributes must consider all color components simultaneously, as the color components R, G, B have a related meaning. Some data visualization applications classify attribute data into: • node or vertex attributes, defined at the vertices of the grid cells, which correspond to a sampled dataset, and • cell attributes, defined at the center points of the grid cells, which correspond to a sampled dataset that uses constant basis functions. Vertex attributes can be converted to cell attributes and conversely by resampling. The attribute components are sometimes related by some constraints. This happens for normal attributes $n \in \mathbb{R}^3$, where the three components are constrained to yield unit-length normals, i.e., $|n| = \sqrt{n_x^2 + n_y^2 + n_z^2} = 1$. Depending on the choice of the basis functions, interpolating these components separately as scalar values may not preserve the unit-length property of the interpolated normal $n$. The first solution is to interpolate the components separately, and then enforce the desired constraint on the result by normalizing it, i.e., replacing $n$ with $n/|n|$ (this works when the sample values do not vary too strongly across a grid cell). The second solution is to represent the constraint directly in the data attributes, rather than enforcing it after interpolation. For normal attribute types, this means representing 3D normals as two independent orientations, e.g., using polar coordinates $\alpha, \beta$, instead of using the three $x, y, z$ components, which are dependent via the unit-length constraint. We can now interpolate the normal orientations $\alpha, \beta$ using the desired basis functions, and will always obtain a unit-length result. 0.7 Computing Derivatives of Sampled Data. One of the requirements for a sampled dataset $\mathcal{D}_s = (p_i, c_i, f_i, \Phi_i)$ is that it should be generic: we can easily replace various data processing operations available for the continuous counterpart with equivalent operations on $\mathcal{D}_s$.
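  • 9. If $\hat{f} = \sum_{j=1}^{N} f_j \phi_j$, then $\frac{\partial \hat{f}}{\partial x_i} = \sum_{j=1}^{N} f_j \frac{\partial \phi_j}{\partial x_i}$. Using the expressions of the reference basis functions: $\frac{\partial \hat{f}}{\partial x_i} = \sum_{j=1}^{N} f_j \frac{\partial \Phi_j}{\partial x_i}(r)$. We now use the chain rule, $\frac{\partial \Phi_j}{\partial x_i} = \sum_{k=1}^{d} \frac{\partial \Phi_j}{\partial r_k} \frac{\partial r_k}{\partial x_i}$, to obtain $\frac{\partial \hat{f}}{\partial x_i} = \sum_{j=1}^{N} f_j \sum_{k=1}^{d} \frac{\partial \Phi_j}{\partial r_k} \frac{\partial r_k}{\partial x_i}$. Finally, we can rewrite the last equation in a convenient matrix form, as follows: $\begin{pmatrix} \frac{\partial \hat{f}}{\partial x_1} \\ \frac{\partial \hat{f}}{\partial x_2} \\ \vdots \\ \frac{\partial \hat{f}}{\partial x_d} \end{pmatrix} = \sum_{j=1}^{N} f_j \begin{pmatrix} \frac{\partial r_1}{\partial x_1} & \frac{\partial r_2}{\partial x_1} & \cdots & \frac{\partial r_d}{\partial x_1} \\ \frac{\partial r_1}{\partial x_2} & \frac{\partial r_2}{\partial x_2} & \cdots & \frac{\partial r_d}{\partial x_2} \\ \vdots & & & \vdots \\ \frac{\partial r_1}{\partial x_d} & \frac{\partial r_2}{\partial x_d} & \cdots & \frac{\partial r_d}{\partial x_d} \end{pmatrix} \begin{pmatrix} \frac{\partial \Phi_j}{\partial r_1} \\ \frac{\partial \Phi_j}{\partial r_2} \\ \vdots \\ \frac{\partial \Phi_j}{\partial r_d} \end{pmatrix}$. The matrix above is called the inverse Jacobian matrix $J^{-1} = (\partial r_i / \partial x_j)_{ij}$. This matrix is the inverse of the Jacobian matrix $J = (\partial x_i / \partial r_j)_{ij}$. Using $T^{-1}$, we can rewrite the inverse Jacobian as $J^{-1} = \left( \frac{\partial T^{-1}_i(x_1, \ldots, x_d)}{\partial x_j} \right)_{ij}$, where $T^{-1}_i$ denotes the i-th component of the function $T^{-1}$. Putting it all together, we get the formula for computing the partial derivatives of a sampled dataset $\hat{f}$ with respect to all coordinates $x_i$: $\left( \frac{\partial \hat{f}}{\partial x_i} \right)_i = \sum_{k=1}^{N} f_k \left( \frac{\partial T^{-1}_i}{\partial x_j} \right)_{ij} \left( \frac{\partial \Phi_k}{\partial r_i} \right)_i$. To use this equation in practice, we need to evaluate the derivatives of both the reference basis functions $\Phi_k$ and $T^{-1}$ for every cell type. Alternatively, we can evaluate the Jacobian matrix instead of its inverse, using the reference-cell to world-cell coordinate transform $T$ instead of $T^{-1}$, then numerically invert $J$, and finally apply Equation (3.33). For the cells described in Section 3.4 whose inverse transformation is given analytically (lines, triangles, rectangles, tetrahedra, and boxes), the coordinate transformations $T^{-1}$ are linear functions of the arguments $x_i$, so their derivatives are constant. Hence, the derivatives of $\hat{f}$ are of the same order as those of the basis functions $\Phi_k$ we choose to use. On a uniform grid with quad cells, for example, the partial derivatives of $\hat{f}$ inside a given cell are computed by linearly interpolating the 1D derivatives of $\hat{f}$ along opposite cell edges. A similar result can be obtained for rectilinear grids as well as for hexahedral cells.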
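As a concrete instance of the chain-rule formula, the sketch below computes the gradient of the bilinear reconstruction on one axis-aligned rectangular cell. It assumes a cell of given width and height whose vertices are ordered like the reference quad, so that $\partial r / \partial x = 1/\text{width}$ and $\partial s / \partial y = 1/\text{height}$; the function name `rect_cell_gradient` is hypothetical.

```python
def rect_cell_gradient(samples, width, height, r, s):
    """Gradient (df/dx, df/dy) of the bilinear reconstruction on a rectangle cell.

    samples holds the vertex values f1..f4 in reference-quad order, and (r, s)
    are the reference coordinates of the evaluation point.
    """
    f1, f2, f3, f4 = samples
    # Derivatives with respect to the reference coordinates: each one linearly
    # interpolates the 1D differences along two opposite cell edges.
    df_dr = (1 - s) * (f2 - f1) + s * (f3 - f4)
    df_ds = (1 - r) * (f4 - f1) + r * (f3 - f2)
    # Chain rule: the inverse Jacobian of this cell is diag(1/width, 1/height).
    return df_dr / width, df_ds / height


# Vertex samples of f(x, y) = 3x + 2y on the cell [0, 2] x [0, 4].
linear_samples = [0.0, 6.0, 14.0, 8.0]
gradient = rect_cell_gradient(linear_samples, 2.0, 4.0, 0.3, 0.7)
```

For a linear field the result does not depend on (r, s); for a genuinely bilinear field such as f = xy it varies linearly across the cell.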
If a dataset is noisy, the computed derivatives tend to exhibit even stronger noise than the original data. A simple method to limit these problems is to pre-filter the input dataset in order to eliminate high-frequency noise, using methods such as the Laplacian smoothing described in Section 8.4. However, smoothing can also eliminate important information from the dataset together with the noise.

0.8 Implementation
0.8.1 Grid Implementation
0.9 Advanced Data Representation
Sometimes more advanced forms of data manipulation and representation are needed. We will describe the task of data resampling, which is used to convert information between different types of datasets that have different sample points, cells, or basis functions.

0.9.1 Data Resampling
Let us consider piecewise constant normals, i.e., the polygon normals themselves. These are discontinuous at the polygon vertices (and, in fact, along the complete polygon edges), so we cannot use them directly as approximations for the vertex normals. How can we compute vertex normal values from the known polygon normals? The answer is provided by an operation called resampling. Resampling computes the values $f'_i$ of the target dataset as a function of the values $f_i$ of the source dataset. For simplicity, we assume that both datasets use the same set of basis functions $\Phi_i$.

Let us now consider a common resampling operation in data visualization: converting cell attributes ($f_i$) to vertex attributes ($f'_i$). Cell attributes imply the use of constant basis functions $\Phi_i$; vertex attributes, in contrast, imply the use of higher-order basis functions, such as linear ones. In addition, we want the sample points of the target grid (the target grid vertices) to be identical to the source vertices, so that the two grids match. Vertex data is then the area-weighted average of the cell data in the cells that use a given vertex. In the opposite direction, cell attributes are the average of the cell's vertex attributes.
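The two resampling rules just stated can be sketched on a small 2D triangle mesh. This is a minimal illustration, not a reference implementation: the mesh layout (a list of points plus a list of vertex-index triples) and the function names `cells_to_vertices` and `vertices_to_cells` are assumptions made for the example.

```python
def triangle_area(p, q, r):
    """Area of a 2D triangle from its three (x, y) vertices."""
    return 0.5 * abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]))


def cells_to_vertices(points, cells, cell_values):
    """Vertex value = area-weighted average of the values of the cells using that vertex."""
    weighted_sum = [0.0] * len(points)
    total_area = [0.0] * len(points)
    for cell, value in zip(cells, cell_values):
        area = triangle_area(*(points[i] for i in cell))
        for i in cell:
            weighted_sum[i] += area * value
            total_area[i] += area
    # Vertices not used by any cell get 0.0 (an arbitrary choice for this sketch).
    return [w / a if a > 0.0 else 0.0 for w, a in zip(weighted_sum, total_area)]


def vertices_to_cells(cells, vertex_values):
    """Cell value = plain average of the values at the cell's vertices."""
    return [sum(vertex_values[i] for i in cell) / len(cell) for cell in cells]
```

Applying `cells_to_vertices` followed by `vertices_to_cells` generally does not return the original cell values, which illustrates the smoothing introduced by a round trip between the two representations.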
Resampling data from cells to vertices increases the assumed continuity. If our original sampled data were indeed continuous of that order, no problem appears. However, if the original data contained, e.g., zero-order discontinuities such as jumps or holes, resampling it to a higher-continuity grid also throws away these discontinuities, which might have been a feature of the data and not a sampling artifact. In contrast, resampling from a higher continuity (vertex data) to a lower continuity (cell data) has fewer side effects; overall, the smoothness of the data decreases globally.

Two other frequently used resampling operations are subsampling and supersampling. Subsampling reduces the number of sample points, typically keeping a subset of the original dataset points (this reduces processing time and memory demands, since we work with smaller datasets). After eliminating some of the points, subsampling operations can choose or redistribute the remaining points in order to obtain a better approximation of the original data. Subsampling implementations can take advantage of the dataset topology. A desirable property of subsampling is to keep most samples in the regions of rapid data variation and cull most samples from the regions of slow data variation. A technique called uniform subsampling is simple and effective when the original dataset is densely sampled. It is used on uniform, rectilinear, and structured grids to keep every k-th point along every dimension and discard the remaining ones.

Supersampling, or refinement, is the inverse of subsampling: more data points are created from an existing dataset. It is useful in situations when we try to create or manipulate information on a dataset at a level of detail, or scale, that is below the one captured by the sampling frequency. Uniform supersampling introduces k points into every cell of the original dataset.
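Both uniform operations can be sketched for the simplest case of a 1D uniform grid stored as a list of samples. The function names are hypothetical, and the choice to compute the inserted values by linear interpolation is an assumption of this sketch; the text above only states that k points are introduced per cell.

```python
def uniform_subsample(samples, k):
    """Keep every k-th sample of a 1D uniform grid and discard the rest."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return samples[::k]


def uniform_supersample(samples, k):
    """Insert k new points inside every 1D cell (pair of consecutive samples).

    The new values are linearly interpolated from the two cell vertices,
    which is an assumption of this sketch rather than the only option.
    """
    if not samples:
        return []
    refined = []
    for left, right in zip(samples, samples[1:]):
        refined.append(left)
        for j in range(1, k + 1):
            t = j / (k + 1)
            refined.append((1.0 - t) * left + t * right)
    refined.append(samples[-1])
    return refined
```

For example, subsampling seven samples with `k = 2` keeps four of them, while supersampling three samples with `k = 2` yields seven samples, since each of the two cells receives two new points.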
An efficient supersampling implementation usually inserts extra points only in those regions where we need to add extra information.

0.9.2 Scattered Point Interpolation
There are situations when we would like to avoid constructing and storing a grid of cells to represent the data domain. A 3D scanner delivers a scattered 3D point set, also called a point cloud: points and their corresponding data values $(p_i, f_i)$. For a scanner, the data values $f_i$ are the surface normals and/or colors measured by the device. How do we reconstruct a continuous surface if we are given such a set of points and normals? One way is to construct a grid from the scattered points (triangulation): an unstructured grid with 2D cells, e.g., triangles, which have the points $p_i$ as vertices and approximate the surface as well as possible. However, storing the cell information can double the amount of memory required in the worst case. A second way is gridless interpolation.

To reconstruct a continuous function from a scattered point set, we need a set of gridless basis functions. There are several ways to construct such functions; a frequently used choice is radial basis functions, or RBFs. These functions depend only on the distance between the current point and the origin, $r = |x| = \sqrt{\sum_{i=1}^{d} x_i^2}$. RBFs smoothly drop from 1 at their origin ($r = 0$) to a vanishing value for large values of the distance $r$. To limit the effect of a basis function to its immediate neighborhood, we specify a radius of influence $R$, or support radius, beyond which $\Phi$ is equal to zero. In this setup, a common RBF is the Gaussian function
\[ \Phi(x) = \begin{cases} e^{-k r^2}, & r < R, \\ 0, & r \geq R, \end{cases} \quad \text{where } r = |x|. \]
The parameter $k \geq 0$ controls the decay speed, or the shape, of the radial basis functions. Setting $k = 0$ yields constant, cylinder-shaped radial functions, which are equivalent to the constant basis functions for grid-based datasets. Another popular choice is the inverse distance function, defined as
\[ \Phi(x) = \begin{cases} \frac{1}{1 + r^2}, & r < R, \\ 0, & r \geq R, \end{cases} \quad \text{where } r = |x|. \]
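The two radial basis functions defined above, and a gridless reconstruction built from them, can be sketched as follows. The function names are hypothetical, and the reconstruction divides by the sum of the weights (a Shepard-style normalization); that normalization is an assumption of this sketch, since the text above does not spell out the reconstruction formula. The sketch also loops over all sample points rather than using a spatial search structure.

```python
import math


def gaussian_rbf(r, k, R):
    """Gaussian RBF exp(-k r^2) with support radius R (zero for r >= R)."""
    return math.exp(-k * r * r) if r < R else 0.0


def inverse_distance_rbf(r, R):
    """Inverse distance RBF 1 / (1 + r^2) with support radius R (zero for r >= R)."""
    return 1.0 / (1.0 + r * r) if r < R else 0.0


def reconstruct(x, points, values, basis):
    """Weighted average of the sample values around x.

    basis: a function of the distance r only, e.g. a lambda wrapping one of
    the RBFs above. Returns None when no sample point influences x.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for p, f in zip(points, values):
        w = basis(math.dist(x, p))  # distance between x and the sample point
        weighted_sum += w * f
        weight_total += w
    return weighted_sum / weight_total if weight_total > 0.0 else None
```

A call such as `reconstruct((1.0, 0.0), points, values, lambda r: gaussian_rbf(r, 1.0, 1.5))` evaluates the reconstruction at one location; points farther than the support radius from every sample yield `None`.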
The radius values $R_i$ control the influence of the sample data value of a point $p_i$. Higher values of $R_i$ yield smoother reconstructions at a higher computational cost; lower values of $R_i$ yield less smooth reconstructions but higher performance. In practice, setting $R_i$ to the average inter-point distance in the neighborhood of point $p_i$ gives a good balance between smoothness and efficiency. Given a point $p$, we sum only those basis functions $\phi_k$ that are nonzero at $p$. In the case of radial basis functions, we must find the $k$ nearest sample points $p_1, \ldots, p_k$ to $p$ so that $|p - p_k| < R_k$. One way to accomplish this is to store all sample points $p_i$ in a spatial search structure such as a kd-tree. Spatial search structures provide efficient retrieval of the $k$ nearest neighbors at any given location. A good, scalable implementation of such a search structure is provided by the Approximate Nearest Neighbor (ANN) library.

Scattered point datasets are sometimes called unstructured point datasets. However, if the function of a dataset is to provide a piecewise continuous reconstruction of its data samples, we also need to specify a choice for the basis functions $\Phi_i$ to have a complete dataset $(p_i, f_i, \Phi_i)$. To perform the reconstruction effectively, searching methods are needed that return the sample points $p_i$ located in the neighborhood of a given point $p$.

What have you learned in this chapter?
This chapter lays out a discussion of discrete data representation, continuous data sampling, and reconstruction. Fundamental differences between continuous (sampled) and discrete data are outlined. It introduces basis functions, discrete meshes, and cells as means of constructing piecewise continuous approximations from sampled data. I learned about various types of datasets commonly used in visualization practice: their advantages, limitations, and constraints. This chapter gives an understanding of the various trade-offs involved in the choice of a dataset for a given visualization application, while focusing on the efficiency of implementing the most commonly used datasets, presented with cell types in $d \in [0, 3]$ dimensions.

What surprised you the most?
I was surprised to find out that there are a few representations of colors, and mappings between the RGB and HSV spaces. I was surprised to find out how gridless interpolation works and that it exists. Also, that the reconstruction of scattered/unstructured point datasets requires searching methods to locate the nearest sample points in the neighborhood of a given point.
I was surprised that datasets with attributes such as text, images, or relations form the target of information visualization applications, since they are purely discrete, and often not defined on a spatial domain.

What applications not mentioned in the book could you imagine for the techniques explained in this chapter?
I can only imagine a dataset that stores high-dimensional attributes in order to allow just enough continuity to perform various types of resampling between target and source grids of a certain type. Selecting a set of useful grids and proper resampling might improve the original visualization model so that it focuses more on the nature of the signal and depends less on the structure/representation of its sampled data.

1. EXERCISE 1
Consider the following datasets:
• The evolution in time of the prices of N different stock-exchange shares, recorded at one-second intervals over the period of one hour.
• The paths covered by all cars driving through a given city, recorded at one-minute intervals over the period of one hour. For each record, we store the car ID, the car's position, and the car's speed.
• The amount of rainfall and the air temperature, recorded at a given time instant at N given weather stations over some geographical area.
Describe the kind of grid, grid cells, and data attributes that you would use to store such a dataset. Argue your proposal by considering the kind of data to store, and the locations at which data is recorded (sampled).

• grid: uniform linear grid with 1-second intervals; grid cells: line segments 1 second long; data attributes: the price of each of our N shares (3600 samples per hour times N shares = 3600N values to store).
• grid: a data structure with spatial search that utilizes the average inter-point distance between points; cells: gridless radial basis functions with compact support; data attributes: car ID, car's position, car's speed, basis functions.
• grid: rectilinear structured grid with specified sampling locations; cells: quads; data attributes: amount of rainfall, temperature, location.

2. EXERCISE 2
Sampling and reconstruction are closely related operations which reduce a function $y = f(x)$ to a finite set of sample points $(x_i, y_i)$ and, respectively, reconstruct an approximation $\hat{y} = \hat{f}(x)$ of $f(x)$ from the sample points. Consider an application where you have to perform the above reconstruction $\hat{f}(x)$, but you are only allowed to use a fixed finite number N of sample points $x_i$. How would you place these sample points over the domain of definition of $x$ so that the reconstruction error $\|\hat{f} - f\|$ is equally well minimized over the entire range of $x$?
Hints: first, consider the kinds of basis functions you want to use (e.g., constant or linear). Next, consider how you can minimize the reconstruction error by shifting the points $x_i$ around the $x$ axis.

• In the case of constant basis functions, we can use a uniform sampling density, with N points placed at equal distances from each other.
• In the case of linear basis functions, we can use a non-uniform sampling density, in order to assign more sample points to those areas of the domain where the function's higher-order derivatives change fast.

3. EXERCISE 3
In Figure 3.10 in Chapter 3 (also displayed below), it is shown that we can use structured grids to cover a 2D disk shape. Now, consider an arbitrary convex 2D shape of genus 0 (that is, without holes). The 2D shape is specified by means of its contour, which is given as a closed 2D polyline of N points.
• Can we always construct a structured grid so that all points of this polyline will be also points on the grid's boundary? If not, sketch a simple counter-example.
• Can we always construct a structured grid with the conditions listed in the point above and the additional condition that no grid-boundary point exists which is not a polyline point? If not, sketch a simple counter-example.
Hints: Think about the number of points on the boundary of a structured grid.

• Yes, we can always construct a structured grid of N points and N − 2 triangular cells. Since all internal angles of the shape are less than 180 degrees, it is always possible to take one vertex and connect it to the remaining vertices in order to form a triangular structured grid.
• Yes, for a convex 2D polyline it is always possible to use triangles as described above, so that all polyline points are also grid-boundary points.

4. EXERCISE 4
As shown in Figure 3.11 in Chapter 3 (also shown below), not all 2D shapes can be covered by structured grids. Consider now a 3D (curved) surface of a half sphere. Can we cover this surface with a structured grid? Argue your answer.

Yes, we can cover such a half sphere with a structured grid consisting of tetrahedral cells. Such a shape consists of only one component, and I assume that the genus of the domain here equals 0.

5. EXERCISE 5
Consider the 2D cells in the figure below. For each cell, scalar data values $v_i$ are indicated at its sample points (vertices). Additionally, a separate point $p$ inside the cell is indicated. If bilinear interpolation is used, compute the interpolated value $v(p)$ of the vertex data values $v_i$ at the point $p$. Detail your answer by explaining how you computed the interpolated value.

• For the rectangular quad:
\[ T^{-1}_{rect} = (r, s) = \left( \frac{(p - p_1) \cdot (p_2 - p_1)}{\|p_2 - p_1\|^2}, \frac{(p - p_1) \cdot (p_4 - p_1)}{\|p_4 - p_1\|^2} \right), \]
where:
\[
\begin{cases}
(p - p_1) = (4 - x_1, 3 - y_1) = (3, 1), \\
(p_2 - p_1) = (x_2 - x_1, y_2 - y_1) = (4, 0), \\
(p_4 - p_1) = (x_4 - x_1, y_4 - y_1) = (0, 3), \\
\|p_2 - p_1\|^2 = (x_2 - x_1)^2 = 4^2 = 16, \\
\|p_4 - p_1\|^2 = (y_4 - y_1)^2 = 3^2 = 9.
\end{cases}
\]
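\[ T^{-1}_{rect} = (r, s) = \left( \frac{(3, 1) \cdot (4, 0)}{16}, \frac{(3, 1) \cdot (0, 3)}{9} \right) = \left( \frac{12}{16}, \frac{3}{9} \right) = \left( \frac{3}{4}, \frac{1}{3} \right). \]
Calculating the 4 basis functions as follows:
\[
\Phi_i(T^{-1}_{rect}) =
\begin{cases}
\Phi_1^1(r, s) = (1 - r)(1 - s) = (1 - 3/4)(1 - 1/3) = 1/6, \\
\Phi_2^1(r, s) = r(1 - s) = (3/4)(1 - 1/3) = 3/4 \times 2/3 = 1/2, \\
\Phi_3^1(r, s) = rs = 3/4 \times 1/3 = 3/12 = 1/4, \\
\Phi_4^1(r, s) = (1 - r)s = (1 - 3/4)(1/3) = 1/4 \times 1/3 = 1/12.
\end{cases}
\]
Finally, we calculate the value
\[ v(p) = \sum_{i=1}^{4} v_i \Phi_i^1 = 3 \cdot \frac{1}{6} + 1 \cdot \frac{1}{2} + 4 \cdot \frac{1}{4} + 0 \cdot \frac{1}{12} = 1/2 + 1/2 + 1 + 0 = 2. \]
Answer: $v(p) = 2$.

• For the triangular cell:
\[ T^{-1}_{tri} = (r, s) = \left( \frac{(p - p_1) \times (p_3 - p_1)}{(p_2 - p_1) \times (p_3 - p_1)}, \frac{(p - p_1) \times (p_2 - p_1)}{(p_3 - p_1) \times (p_2 - p_1)} \right), \]
where:
\[
\begin{cases}
(p - p_1) = (3 - x_1, 3 - y_1) = (2, 1), \\
(p_2 - p_1) = (x_2 - x_1, y_2 - y_1) = (4, 1), \\
(p_3 - p_1) = (x_3 - x_1, y_3 - y_1) = (0, 3).
\end{cases}
\]
Using the 2D cross product $(a_x, a_y) \times (b_x, b_y) = a_x b_y - a_y b_x$, we have $(2, 1) \times (0, 3) = 6$, $(4, 1) \times (0, 3) = 12$, $(2, 1) \times (4, 1) = -2$, and $(0, 3) \times (4, 1) = -12$, so
\[ T^{-1}_{tri} = (r, s) = \left( \frac{(2, 1) \times (0, 3)}{(4, 1) \times (0, 3)}, \frac{(2, 1) \times (4, 1)}{(0, 3) \times (4, 1)} \right) = \left( \frac{6}{12}, \frac{-2}{-12} \right) = \left( \frac{1}{2}, \frac{1}{6} \right). \]
Both coordinates are nonnegative and $r + s \leq 1$, as expected for a point inside the cell. Calculating the 3 basis functions as follows:
\[
\Phi_i(T^{-1}_{tri}) =
\begin{cases}
\Phi_1^1(r, s) = 1 - r - s = 1 - 1/2 - 1/6 = 1/3, \\
\Phi_2^1(r, s) = r = 1/2, \\
\Phi_3^1(r, s) = s = 1/6.
\end{cases}
\]
Finally, we calculate the value
\[ v(p) = \sum_{i=1}^{3} v_i \Phi_i^1 = 3 \cdot \frac{1}{3} + 1 \cdot \frac{1}{2} + 4 \cdot \frac{1}{6} = 1 + \frac{1}{2} + \frac{2}{3} = \frac{13}{6}. \]
Answer: $v(p) = 13/6 \approx 2.17$.

6. EXERCISE 6
Consider the 2D cells in the figures below. For each cell, vector data values $v_i$ are indicated at its sample points (vertices). Additionally, a separate point $p$ inside the cell is indicated. If bilinear interpolation is used, compute the interpolated value $v(p)$ of the vertex data values $v_i$ at the point $p$. Detail your answer by explaining how you computed the interpolated value.

• For the rectangular quad:
\[ T^{-1}_{rect} = (r, s) = \left( \frac{(p - p_1) \cdot (p_2 - p_1)}{\|p_2 - p_1\|^2}, \frac{(p - p_1) \cdot (p_4 - p_1)}{\|p_4 - p_1\|^2} \right), \]
where:
\[
\begin{cases}
(p - p_1) = (4 - x_1, 3 - y_1) = (3, 1), \\
(p_2 - p_1) = (x_2 - x_1, y_2 - y_1) = (4, 0), \\
(p_4 - p_1) = (x_4 - x_1, y_4 - y_1) = (0, 3), \\
\|p_2 - p_1\|^2 = (x_2 - x_1)^2 = 4^2 = 16, \\
\|p_4 - p_1\|^2 = (y_4 - y_1)^2 = 3^2 = 9.
\end{cases}
\]
\[ T^{-1}_{rect} = (r, s) = \left( \frac{(3, 1) \cdot (4, 0)}{16}, \frac{(3, 1) \cdot (0, 3)}{9} \right) = \left( \frac{12}{16}, \frac{3}{9} \right) = \left( \frac{3}{4}, \frac{1}{3} \right). \]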
Calculating the 4 basis functions as follows:
\[
\Phi_i(T^{-1}_{rect}) =
\begin{cases}
\Phi_1^1(r, s) = (1 - r)(1 - s) = (1 - 3/4)(1 - 1/3) = 1/6, \\
\Phi_2^1(r, s) = r(1 - s) = (3/4)(1 - 1/3) = 3/4 \times 2/3 = 1/2, \\
\Phi_3^1(r, s) = rs = 3/4 \times 1/3 = 3/12 = 1/4, \\
\Phi_4^1(r, s) = (1 - r)s = (1 - 3/4)(1/3) = 1/4 \times 1/3 = 1/12.
\end{cases}
\]
Finally, we calculate the value
\[ v(p) = \sum_{i=1}^{4} v_i \Phi_i^1 = (1, 0) \cdot \frac{1}{6} + (0, 1) \cdot \frac{1}{2} + (1, 1) \cdot \frac{1}{4} + (2, 1) \cdot \frac{1}{12} = \left(\tfrac{1}{6}, 0\right) + \left(0, \tfrac{1}{2}\right) + \left(\tfrac{1}{4}, \tfrac{1}{4}\right) + \left(\tfrac{1}{6}, \tfrac{1}{12}\right) = \left(\tfrac{7}{12}, \tfrac{5}{6}\right). \]
Answer: $v(p) \approx (0.58, 0.83)$.

• For the triangular cell:
\[ T^{-1}_{tri} = (r, s) = \left( \frac{(p - p_1) \times (p_3 - p_1)}{(p_2 - p_1) \times (p_3 - p_1)}, \frac{(p - p_1) \times (p_2 - p_1)}{(p_3 - p_1) \times (p_2 - p_1)} \right), \]
where:
\[
\begin{cases}
(p - p_1) = (3 - x_1, 3 - y_1) = (2, 1), \\
(p_2 - p_1) = (x_2 - x_1, y_2 - y_1) = (4, 1), \\
(p_3 - p_1) = (x_3 - x_1, y_3 - y_1) = (0, 3).
\end{cases}
\]
Using the 2D cross product $(a_x, a_y) \times (b_x, b_y) = a_x b_y - a_y b_x$:
\[ T^{-1}_{tri} = (r, s) = \left( \frac{(2, 1) \times (0, 3)}{(4, 1) \times (0, 3)}, \frac{(2, 1) \times (4, 1)}{(0, 3) \times (4, 1)} \right) = \left( \frac{6}{12}, \frac{-2}{-12} \right) = \left( \frac{1}{2}, \frac{1}{6} \right). \]
Calculating the 3 basis functions as follows:
\[
\Phi_i(T^{-1}_{tri}) =
\begin{cases}
\Phi_1^1(r, s) = 1 - r - s = 1 - 1/2 - 1/6 = 1/3, \\
\Phi_2^1(r, s) = r = 1/2, \\
\Phi_3^1(r, s) = s = 1/6.
\end{cases}
\]
Finally, we calculate the value
\[ v(p) = \sum_{i=1}^{3} v_i \Phi_i^1 = (0, -1) \cdot \frac{1}{3} + (1, 0) \cdot \frac{1}{2} + (1, 1) \cdot \frac{1}{6} = \left(0, -\tfrac{1}{3}\right) + \left(\tfrac{1}{2}, 0\right) + \left(\tfrac{1}{6}, \tfrac{1}{6}\right) = \left(\tfrac{2}{3}, -\tfrac{1}{6}\right). \]
Answer: $v(p) = (2/3, -1/6) \approx (0.67, -0.17)$.

7. EXERCISE 7
Color selection, by end users, is typically done by various widgets which represent the space of available colors, such as the color wheel, color hexagon, or three separate color sliders for the R, G, and B (or alternatively H, S, and V) color components. Assume, now, that we want to select only colors present in a given subset of the entire color space. Concretely, we have a large set of color photographs, and we next want to select only colors predominantly present in these photographs, rather than any possible color. Sketch and argue for a color-selection widget that would optimally help users to select only these specific colors.
Hints: Think how to modify any of the existing color-selection widgets to 'focus' on a specific color range where many samples exist.

We can specify the subset of colors we are interested in by specifying R, G, B values for each color in our sample. This gives a scattered field of dots inside the color cube. We can then perform supersampling by adding even more dots in the neighbourhood of each specified color. Finally, we can interpolate a 3D surface along each axis and project the result onto the color cube facets, or onto the RGB hexagon. We can also modify the HSV color wheel by cutting out the segments that correspond to the lowest density of our color samples, so that the remaining colors represent the majority of our color samples.

8. EXERCISE 8
Consider a grid where we have color data values recorded at its cell vertices. We would like to use linear interpolation to compute colors at all points inside the grid cells. We can do this by interpolating colors represented as RGB triplets or, alternatively, colors represented as HSV triplets. Discuss the advantages and disadvantages of both schemes. Can you imagine a situation where the RGB interpolation would be arguably preferable to HSV interpolation? Can you imagine a situation when the converse (HSV interpolation is preferable to RGB interpolation) is true? Describe such situations or alternatively argue for the fact that they do not exist.

HSV interpolation gives better results, since hue is separated from luminance and saturation. In the RGB color scheme we need to interpolate along all 3 components.

9. EXERCISE 9
Consider a grid cell, such as a 1D line, 2D triangle or quad, or 3D parallelepiped or cube, and some scalar values $v_i$ recorded at the cell vertices. Consider that we are using linear interpolation to reconstruct the sampled scalar signal $v(x)$ at any point $x$ inside the cell. Does a cell shape exist, and a point $x$ in that cell, so that $v(x)$ is larger than the maximum of $v_i$ over all cell vertices? Does a cell shape exist, and a point $x$ in that cell, so that $v(x)$ is smaller than the minimum of $v_i$ over all cell vertices? Argue your answers.

No.

10. EXERCISE 10
Consider a grid cell like in Exercise 9, and some color values $v_i$ recorded at the cell vertices. Consider that we are using linear interpolation to compute a color $v(x)$ at any point $x$ inside the cell. Does a point $x$ exist so that $v(x)$ is brighter than any of the colors $v_i$? Does a point $x$ exist so that $v(x)$ is darker than any of the colors $v_i$? Do the answers to the above two sub-questions depend on the choice of the system, or space, used to represent colors (RGB or HSV)? Explain your answer.

No.
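One way to support the answer to Exercise 9 is to note that the linear basis functions are nonnegative and sum to one inside a cell, so the interpolated value is a weighted average of the vertex values and stays between their minimum and maximum. The sketch below checks this numerically for a quad cell with bilinear weights; the function names are hypothetical, and the random sampling is only a sanity check, not a proof. For colors, the same bound holds for each interpolated component separately; whether that settles perceived brightness depends on how brightness is defined, which this sketch does not address.

```python
import random


def bilinear_weights(r, s):
    """The four bilinear basis functions evaluated at reference point (r, s)."""
    return [(1.0 - r) * (1.0 - s), r * (1.0 - s), r * s, (1.0 - r) * s]


def interpolate(values, weights):
    """Weighted sum of the vertex values."""
    return sum(w * v for w, v in zip(weights, values))


def stays_within_vertex_range(values, trials=1000, seed=0):
    """Check that interpolated values never leave [min(values), max(values)]."""
    rng = random.Random(seed)
    lo, hi = min(values), max(values)
    tolerance = 1e-12  # allow for floating-point rounding
    for _ in range(trials):
        r, s = rng.random(), rng.random()  # random point inside the reference cell
        v = interpolate(values, bilinear_weights(r, s))
        if v < lo - tolerance or v > hi + tolerance:
            return False
    return True
```

Using the scalar vertex values of the rectangular cell from Exercise 5, `stays_within_vertex_range([3.0, 1.0, 4.0, 0.0])` is expected to hold, and `interpolate([3.0, 1.0, 4.0, 0.0], bilinear_weights(0.75, 1.0 / 3.0))` reproduces the value computed there.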