Stockpile Resource Center –
Aircraft Compatibility
Summer Work Presentation:
Graflab Data Compression
Study
Myuran Kanga
August 12, 2010
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company,
for the United States Department of Energy’s National Nuclear Security Administration
under contract DE-AC04-94AL85000.
Presentation Outline
Introduction
Project Overview – Sam Stearns
Data Compression
Uses for Data Compression
Types of Data Compression
Three Algorithms
Testing Procedure
Compression/Decompression Example
Findings
Conclusion
Introduction
• Myuran Kanga
– Bachelor’s Degree:
Oklahoma State University – Electrical Engineering
– Master’s Fellowship Program:
Rice University – Electrical Engineering (Communications
Specialization)
– Sandia: Meaningful Work/Projects:
- Team Assimilation
- Shaker Testing
- Cadence ORCAD – Electronic Design Software familiarization
- ORCAD Installation/licensing procedure documentation
- Courses – Quality for Project Management, Engineering Excellence,
Labview Core I, and Labview Core II
- Graflab Data Compression Study/Evaluation
Project Overview – Graflab Data Compression Study
Summary: Evaluation of three Data Compression Algorithms
created by Dr. Samuel D. Stearns.
Primary Investigator/Technical Project Lead: Myuran Kanga
Key Personnel: Jerry Cap and Troy Skousen
Biography: Author – Compression Algorithms: Dr. Sam Stearns [1]
- Electrical Engineer specializing in digital signal processing
and adaptive signal processing
- Distinguished Member of the Technical Staff at Sandia
National Laboratories for 27 years. Retired in 1996.
- Author/Co-author of 7 signal processing textbooks
- Professor Emeritus at the University of New Mexico,
involved with teaching/research at the university since
1960.
Project: Evaluation and interpretation of three data compression
algorithms.
- Algorithms labeled “2”, “3”, and “4”
- Code written in Matlab
- Each similar in nature
- Each successive algorithm adds more sophisticated
methods of compression
- More complex algorithms said to require longer
computational time but greater accuracy
- Hope to utilize compression with GRAFLAB
- GRAFLAB is a database, analysis, and plotting
package used for data reduction, analysis, and
archival purposes at Sandia.
What is Data Compression?
[2]
Data Compression
Definition: The process of encoding information using
fewer units of storage than an un-encoded
representation of data, through the use of
specific encoding schemes. [3]
Data compression, sometimes called source coding, is
the process of converting input data into another data
stream that has a smaller size, but retains the essential
information contained within the original data stream.
Data Compression Implementations
- Compression is useful because it helps reduce the
consumption of resources, such as hard disk space or
transmission bandwidth.
- With the interest and surge in environmental test data for
the Surveillance Program, significant strains on computer
storage resources will occur.
- Archiving of environmental test data from legacy systems,
including data for the Environment Test lab.
- Familiar examples of compressed file formats include .zip and
.rar; .tar is an archive format that is typically paired with a
compressor such as gzip (.tar.gz).
[4]
Lossless vs. Lossy Compression
Two forms of compression: Lossless and Lossy
Lossless compression:
- These types of algorithms usually exploit statistical
redundancy to represent the user’s data more concisely
without error.
- Most real-world data has statistical redundancy
- Example – In English text, the letter ‘e’ is much more
common than the letter ‘z’. Similarly the probability that
the letter ‘q’ will be followed by the letter ‘z’ is very small.
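The redundancy argument above can be sketched with a general-purpose lossless compressor. This is only an illustration: it uses Python's standard `zlib` module (DEFLATE), not any of the three algorithms studied here, and the sample sentence is a made-up input.

```python
import zlib
from collections import Counter

# Hypothetical sample text, repeated so that it contains obvious redundancy.
text = ("the quick brown fox jumps over the lazy dog " * 50).encode("ascii")

# Symbol statistics: 'e' occurs more often than 'z' in this sample.
counts = Counter(text.decode("ascii"))

compressed = zlib.compress(text)
restored = zlib.decompress(compressed)

# Compression ratio = uncompressed size / compressed size.
ratio = len(text) / len(compressed)

print(f"'e' count: {counts['e']}, 'z' count: {counts['z']}")
print(f"compression ratio: {ratio:.1f}")
print("lossless round trip:", restored == text)
```

Because the compression is lossless, the decompressed bytes match the input exactly; the ratio depends entirely on how redundant the input is.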
Lossy Compression:
- Guided by research on how people perceive the data in
question.
- Used when some loss of fidelity is acceptable.
- As an example, the human eye is more sensitive to subtle
variations in luminance than to variations in color.
Therefore, color detail can be reduced with little visible
loss of image quality.
- JPEG image compression works in part by “rounding off”
some of this less important information.
- Lossy data compression provides a method of obtaining
the best fidelity for a given amount of compression
desired.
Compression Algorithms
Compression “2”
- Quantizes the data signal and packs the result into a sequence
of bytes.
Compression “3”
- Predicts the quantized data and packs the prediction error into
a sequence of bytes.
Compression “4”
- Said to provide the maximum compression
- Encodes the prediction error into a sequence of bytes using
adaptive arithmetic coding.
[5]
Quantization
- The process of mapping a continuous range of values to a
relatively small set of discrete symbols or integer values.
- Sampling occurs on a periodic basis to convert the continuous
signal to discrete values.
- Can be viewed as accumulating data in bins
[6]
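A minimal sketch of uniform quantization followed by byte packing, in the spirit of the "quantize and pack" description of Compression “2”. The bin width `step`, the sample rate, the test sine wave, and the one-signed-byte-per-sample packing are all assumptions made for illustration; the actual MATLAB code may differ.

```python
import math
import struct

def quantize(samples, step):
    """Map each sample to the integer index of the nearest bin of width `step`."""
    return [round(s / step) for s in samples]

def dequantize(indices, step):
    """Recover the bin-center value for each index."""
    return [i * step for i in indices]

sample_rate = 100   # samples per second (hypothetical)
step = 0.05         # quantization bin width (hypothetical)

# One second of a 5 Hz sine wave with unit amplitude.
signal = [math.sin(2 * math.pi * 5 * n / sample_rate) for n in range(sample_rate)]

indices = quantize(signal, step)
restored = dequantize(indices, step)

# With unit amplitude and step 0.05 the indices lie in [-20, 20],
# so each one fits in a single signed byte.
packed = struct.pack(f"{len(indices)}b", *indices)

# The quantization error is bounded by half a bin width.
max_error = max(abs(a - b) for a, b in zip(signal, restored))
print(len(packed), "bytes, max error", max_error)
```

A smaller `step` lowers the error but widens the range of indices, so fewer samples fit in a byte; that trade-off is what makes quantization the lossy stage.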
Linear Prediction [7]
- Signal processing tool used in which future values of a digital signal
are estimated as a linear function of previous samples in the data.
- Time varying digital filter, excitation function, desired output y(n)
- Finding the appropriate excitation function and filter coefficients to
minimize the error of the predicted y(n) and original y(n).
- Also called Linear Predictive Coding - Common application:
- Speech compression
- Transmit only filter coefficients (Hk) and excitation sequence
x(n)
- For extreme compression, only transmit filter coefficients and
use a fix-frequency excitation – voice-coder
$$y(n) = \sum_{j=1}^{N} a_j\, y(n-j) + \sum_{j=0}^{M} b_j\, x(n-j)$$
$$y(n) = \sum_{j=1}^{N} a_j\, y(n-j) + e(n)$$
$$\hat{y}(n) = \sum_{j=1}^{N} a_j\, y(n-j)$$
$$e(n) = y(n) - \hat{y}(n)$$
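The prediction and error relations can be sketched on an integer (already quantized) signal. This is an illustrative sketch, not the code under study: the second-order coefficients are chosen by hand for a pure sinusoid, the prediction is rounded to an integer so that reconstruction is exact, and the first `order` samples are passed through unchanged.

```python
import math

def predict(history, coeffs):
    """y_hat(n) = round(sum_j a_j * y(n - j)); history[-1] is y(n-1)."""
    return round(sum(a * history[-j] for j, a in enumerate(coeffs, start=1)))

def residual(signal, coeffs):
    """e(n) = y(n) - y_hat(n); the first len(coeffs) samples are kept as-is."""
    order = len(coeffs)
    errors = list(signal[:order])
    for n in range(order, len(signal)):
        errors.append(signal[n] - predict(signal[n - order:n], coeffs))
    return errors

def reconstruct(errors, coeffs):
    """Invert residual(): y(n) = y_hat(n) + e(n)."""
    order = len(coeffs)
    signal = list(errors[:order])
    for n in range(order, len(errors)):
        signal.append(predict(signal[n - order:n], coeffs) + errors[n])
    return signal

# Hypothetical test signal: a sine wave quantized to integers in [-1000, 1000].
w = 2 * math.pi * 5 / 1000
signal = [round(1000 * math.sin(w * n)) for n in range(1000)]

# A sinusoid satisfies y(n) = 2*cos(w)*y(n-1) - y(n-2), so use a1, a2 below.
coeffs = [2 * math.cos(w), -1.0]

errors = residual(signal, coeffs)
largest_error = max(abs(e) for e in errors[2:])
print("signal range:", max(signal), "largest prediction error:", largest_error)
```

The prediction errors span a far smaller range than the signal itself, so they need fewer bits per sample, which is the motivation for packing or encoding the error rather than the data.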
Arithmetic Coding [8]
- Long data strings are represented by a single number, which is
obtained by repeatedly partitioning the range of possible values in
proportion to the probabilities of the data string.
- Example string: DABDDB
Symbol   Part 1     Part 2 – Freq. Product    Total
D        6^5 x 3    (none)                    23328
A        6^4 x 0    3                             0
B        6^3 x 1    3 x 1                       648
D        6^2 x 3    3 x 1 x 2                   648
D        6^1 x 3    3 x 1 x 2 x 3               324
B        6^0 x 1    3 x 1 x 2 x 3 x 3            54
Sum                                           25002
Coded Data + Total Frequencies:
25002 + (3 x 1 x 2 x 3 x 3 x 2) = 25110
Part 1:
- 6-symbol string = radix of 6
- Each power of 6 is multiplied by the value assigned to the
symbol (A = 0, B = 1, D = 3)
Part 2:
- Multiply by the accumulated product of the frequencies of
all preceding symbols (A occurs once, B twice, D three times)
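The table arithmetic can be checked with a short script. This is a sketch of the worked example only, not a complete arithmetic coder; the function name `range_bounds` is made up, and the symbol values are taken to be cumulative frequencies in sorted symbol order, which reproduces A = 0, B = 1, D = 3.

```python
from collections import Counter

def range_bounds(message):
    """Return (lower, upper) for the radix-style example in the table."""
    freq = Counter(message)

    # Cumulative frequency of each symbol in sorted order: A=0, B=1, D=3.
    cumulative = {}
    running = 0
    for symbol in sorted(freq):
        cumulative[symbol] = running
        running += freq[symbol]

    radix = len(message)
    lower = 0
    product = 1  # product of the frequencies of all preceding symbols
    for i, symbol in enumerate(message):
        power = radix ** (len(message) - 1 - i)          # Part 1 weight
        lower += power * cumulative[symbol] * product    # Part 1 x Part 2
        product *= freq[symbol]

    # Upper bound = lower bound + product of all symbol frequencies.
    return lower, lower + product

lower, upper = range_bounds("DABDDB")
print(lower, upper)
```

Any integer in the half-open range from the lower to the upper bound identifies the string, given the symbol frequencies.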
Evaluation Procedure/Analysis
Classical Waveform Compression Study:
- Triangle Wave
- Trapezoid Wave
- Sine Wave
- Sawtooth Wave
- Hanning Window
- Harmonic Sine Waves
- Combined Sine Waves
- Gap Analysis
- White Noise
- Sine Wave with Noise
- Power Spectral Density
- Square Wave
- .wav File
Waveforms created manually in individual m-files for predictability of
vector arrangement in Matlab. Frequencies and signal durations are
easily modifiable.
Waveform Examples
[Figure: four example waveforms (Trapezoid Wave, White Noise, Gap Analysis, Sawtooth Wave). Each is plotted as three stacked panels, Original, Decompressed Waveform, and Difference, with Time (Seconds) on the horizontal axis and Amplitude on the vertical axis. The Difference panels use axis limits between ±2 x 10^-5 and ±0.02.]
Testing and Measurements
Implemented Analysis and Measurements:
- Input and output data array sizes
- Percentage accuracy of compression
- Compression ratio
- Relative computational time
- Percent difference: Max. and Min. values of original and
decompressed waveforms
- Percent difference: Standard deviation value of original and
decompressed waveforms
- Percent error: Max. and min. values of original and
decompressed waveforms
- Percent error: Standard deviation value of original and
decompressed waveforms
- Root Mean Square values of original and decompressed
waveforms
- Normal values of original and decompressed waveforms
- Difference in RMS values
- Difference in Normal values
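A few of the listed measurements can be sketched as follows. The slides do not give the exact formulas, so the definitions below (ratio as uncompressed size over compressed size, percent error relative to the original value, population standard deviation) are conventional assumptions, and the function names and test data are hypothetical.

```python
import math
import statistics

def rms(values):
    """Root mean square of a sequence of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def percent_error(reference, measured):
    """Absolute error relative to the reference value, in percent."""
    return abs(measured - reference) / abs(reference) * 100.0

def compression_ratio(original_bytes, compressed_bytes):
    """Uncompressed size divided by compressed size (1.0 means no compression)."""
    return original_bytes / compressed_bytes

def compare(original, decompressed):
    """Collect several statistics for one original/decompressed waveform pair."""
    return {
        "rms_original": rms(original),
        "rms_decompressed": rms(decompressed),
        "rms_difference": abs(rms(original) - rms(decompressed)),
        "max_percent_error": percent_error(max(original), max(decompressed)),
        "min_percent_error": percent_error(min(original), min(decompressed)),
        "std_percent_error": percent_error(
            statistics.pstdev(original), statistics.pstdev(decompressed)
        ),
    }

# Hypothetical data: a sine wave and a copy rounded to two decimals,
# standing in for a decompressed waveform.
original = [math.sin(2 * math.pi * n / 100) for n in range(100)]
decompressed = [round(v, 2) for v in original]

report = compare(original, decompressed)
for name, value in report.items():
    print(f"{name}: {value:.6f}")
```

Percent error as written is undefined when the reference value is zero, so a waveform whose minimum or standard deviation is zero would need a separate check.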
Compression/Decompression Example
Using Compression “4”, the
compression ratio of the file was
1.52 with an accuracy of 99.6078
percent.
M-file written to create this
.wav file for real-world
compression/decompression
testing.
Compressed output using
Compression “2” and “4” –
Turn up your volume, the
amplitude of the compressed
file is much lower.
Compressed data should
not resemble the original
data stream. This example
demonstrates the
inefficiency of
Compression “2”.
[Audio clips: Original Song; Compressed Song – Compression 2; Decompressed Song; Compressed Song – Compression 4]
Findings
Compression “2”:
- Generally, this algorithm produced a compression ratio of about 1 in most
cases. For simple waveforms like the square wave, compression did occur.
- Fastest compression algorithm of the three
- Inefficient compression – Compression ratio of 1 = No compression
Compression “3” and “4”:
- Compression ratio increases with increased data length/duration
- Increased data length/duration causes longer calculation times – Within limits
- Compression “4” produced a much higher compression ratio in comparison to
other algorithms
- Compression “4” is the slowest algorithm because it chains three compression methods (quantization, prediction, and arithmetic coding)
Special Cases:
- The square wave produces 100% accuracy and very high compression with
all three algorithms
- White Noise does not seem to compress much past a ratio of 1
- Code has been modified to handle gaps in the input data
- The accuracy of compression/decompression for all three algorithms has
proven to be above 99% in all cases
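The square-wave and white-noise special cases can be reproduced in spirit with any lossless compressor. The sketch below uses Python's standard `zlib` module as a stand-in, not the three algorithms evaluated here, and the byte-valued test signals are hypothetical.

```python
import random
import zlib

def ratio(data: bytes) -> float:
    """Uncompressed size divided by compressed size."""
    return len(data) / len(zlib.compress(data, 9))

n = 10_000

# Square wave: 50 samples low, 50 samples high, repeated.
square = bytes(255 if (i // 50) % 2 else 0 for i in range(n))

# White noise: uniformly random bytes from a seeded generator.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(n))

square_ratio = ratio(square)
noise_ratio = ratio(noise)

print(f"square wave ratio: {square_ratio:.1f}")
print(f"white noise ratio: {noise_ratio:.3f}")
```

The square wave is almost entirely redundant and shrinks dramatically, while the noise has no structure to exploit and stays near a ratio of 1, which matches the behavior reported above.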
Future Work
- Similar waveform analysis with the raw data files provided by Dr.
Sam Stearns
- Additional error or warning messages
- Noise
- Gaps
- Invalid array data
- Implementation of compression algorithms into Graflab database
- Investigate possibilities of real-time compression/decompression
Recommendations:
- Filter noise from data prior to compression
- Compress all data, disregarding size
- Continue implementation of replacing gaps
with zeros
Summer Work Applicability / Benefit
- Applicability to our organization - Meaningful work
- Storing new and legacy environmental test data from the
surveillance program
- Environmental Test lab data storage
- Opportunity to continue education
- Improved Matlab skills
- Introduction to Labview
- ORCAD familiarity
- Organizational and leadership skills – Management course
- Assimilation to Albuquerque, work environment at Sandia
National Laboratories, and Aircraft Compatibility
[9] [10]
Citations and Questions
[1] University of New Mexico – ECE, “Dr. Samuel D. Stearns,” 2010. [Online]. Available: http://www.ece.unm.edu/faculty/stearns/. [Accessed: July 2010].
[2] Plus Magazine, “Text, Bytes and Videotape,” January 1, 2003. [Online]. Available: http://plus.maths.org/issue23/features/data/data.jpg. [Accessed: August 2010].
[3] Wikipedia, “Data compression,” July 20, 2010. [Online]. Available: http://en.wikipedia.org/wiki/Data_compression. [Accessed: August 2010].
[4] Hoax-slayer.com, “Burning-hard-drive,” 2010. [Online]. Available: http://www.hoax-slayer.com/images/burning-hard-drive.jpg. [Accessed: August 2010].
[5] S. Stearns, Encoding and Decoding of Instrumentation and Telemetry Waveforms. Samuel D. Stearns: Sandia National Laboratories. January 25, 2008.
[6] Wikipedia, “Quantization (signal processing),” July 2, 2010. [Online]. Available: http://en.wikipedia.org/wiki/Quantization_(signal_processing). [Accessed: June 2010].
[7] Connexions, “Linear Prediction and Cross Synthesis,” March 18, 2008. [Online]. Available: http://cnx.org/content/m15478/latest/. [Accessed: June 2010].
[8] Wikipedia, “Arithmetic coding,” August 7, 2010. [Online]. Available: http://en.wikipedia.org/wiki/Arithmetic_coding. [Accessed: June 2010].
[9] Rice University, Home page, 2010. [Online]. Available: http://www.rice.edu. [Accessed: August 2010].
[10] Sandia National Laboratories, Home page, 2010. [Online]. Available: http://www.sandia.gov. [Accessed: August 2010].
[11] T. Skousen. (private communication). 2010.
[12] J. Cap. (private communication). 2010.

More Related Content

What's hot

What's hot (20)

image compression ppt
image compression pptimage compression ppt
image compression ppt
 
Fundamentals of Data compression
Fundamentals of Data compressionFundamentals of Data compression
Fundamentals of Data compression
 
Lzw coding technique for image compression
Lzw coding technique for image compressionLzw coding technique for image compression
Lzw coding technique for image compression
 
Sharpening using frequency Domain Filter
Sharpening using frequency Domain FilterSharpening using frequency Domain Filter
Sharpening using frequency Domain Filter
 
Audio and Video Compression
Audio and Video CompressionAudio and Video Compression
Audio and Video Compression
 
digital image processing
digital image processingdigital image processing
digital image processing
 
Multimedia systems
Multimedia systemsMultimedia systems
Multimedia systems
 
Image compression .
Image compression .Image compression .
Image compression .
 
Introduction to Image Compression
Introduction to Image CompressionIntroduction to Image Compression
Introduction to Image Compression
 
Distributed Multimedia Systems(DMMS)
Distributed Multimedia Systems(DMMS)Distributed Multimedia Systems(DMMS)
Distributed Multimedia Systems(DMMS)
 
Cloud computing system models for distributed and cloud computing
Cloud computing system models for distributed and cloud computingCloud computing system models for distributed and cloud computing
Cloud computing system models for distributed and cloud computing
 
Digital Image Processing: Image Segmentation
Digital Image Processing: Image SegmentationDigital Image Processing: Image Segmentation
Digital Image Processing: Image Segmentation
 
Data compression
Data  compressionData  compression
Data compression
 
Vector quantization
Vector quantizationVector quantization
Vector quantization
 
Video compression
Video compressionVideo compression
Video compression
 
Chapter 9 morphological image processing
Chapter 9   morphological image processingChapter 9   morphological image processing
Chapter 9 morphological image processing
 
cluster computing
cluster computingcluster computing
cluster computing
 
Data compression
Data compression Data compression
Data compression
 
JPEG
JPEGJPEG
JPEG
 
Homomorphic filtering
Homomorphic filteringHomomorphic filtering
Homomorphic filtering
 

Viewers also liked

Основи на public key infrastructure
Основи на   public key infrastructureОснови на   public key infrastructure
Основи на public key infrastructure
Tita Toneva
 
Data Compression for Multi-dimentional Data Warehouses
Data Compression for Multi-dimentional Data WarehousesData Compression for Multi-dimentional Data Warehouses
Data Compression for Multi-dimentional Data Warehouses
Mushfiqur Rahman
 
G zip compresser ppt
G zip compresser pptG zip compresser ppt
G zip compresser ppt
gaurav kumar
 
Mekanika Tanah - Triaxial shear test
Mekanika Tanah - Triaxial shear testMekanika Tanah - Triaxial shear test
Mekanika Tanah - Triaxial shear test
Reski Aprilia
 
Text compression in LZW and Flate
Text compression in LZW and FlateText compression in LZW and Flate
Text compression in LZW and Flate
Subeer Rangra
 
Buckling test engt110
Buckling test engt110Buckling test engt110
Buckling test engt110
asghar123456
 
Image compression using discrete wavelet transform
Image compression using discrete wavelet transformImage compression using discrete wavelet transform
Image compression using discrete wavelet transform
Harshal Ladhe
 
Data compression introduction
Data compression introductionData compression introduction
Data compression introduction
Rahul Khanwani
 

Viewers also liked (20)

Основи на public key infrastructure
Основи на   public key infrastructureОснови на   public key infrastructure
Основи на public key infrastructure
 
PREDICTION OF COMPRESSIVE STRENGTH OF CONCRETE FROM EARLY AGE TEST RESULT
PREDICTION OF COMPRESSIVE STRENGTH OF CONCRETE FROM EARLY AGE TEST RESULTPREDICTION OF COMPRESSIVE STRENGTH OF CONCRETE FROM EARLY AGE TEST RESULT
PREDICTION OF COMPRESSIVE STRENGTH OF CONCRETE FROM EARLY AGE TEST RESULT
 
Data Compression for Multi-dimentional Data Warehouses
Data Compression for Multi-dimentional Data WarehousesData Compression for Multi-dimentional Data Warehouses
Data Compression for Multi-dimentional Data Warehouses
 
How GZIP works... in 10 minutes
How GZIP works... in 10 minutesHow GZIP works... in 10 minutes
How GZIP works... in 10 minutes
 
Introduction for Data Compression
Introduction for Data Compression Introduction for Data Compression
Introduction for Data Compression
 
Sindrome de Compresion Vertebral, Medular y Radicular
Sindrome de Compresion Vertebral, Medular y RadicularSindrome de Compresion Vertebral, Medular y Radicular
Sindrome de Compresion Vertebral, Medular y Radicular
 
Compression
CompressionCompression
Compression
 
SHEAR STRENGTH THEORY
SHEAR STRENGTH THEORYSHEAR STRENGTH THEORY
SHEAR STRENGTH THEORY
 
Keystone summer school_2015_miguel_antonio_ldcompression_4-joined
Keystone summer school_2015_miguel_antonio_ldcompression_4-joinedKeystone summer school_2015_miguel_antonio_ldcompression_4-joined
Keystone summer school_2015_miguel_antonio_ldcompression_4-joined
 
G zip compresser ppt
G zip compresser pptG zip compresser ppt
G zip compresser ppt
 
Data compression techniques
Data compression techniquesData compression techniques
Data compression techniques
 
Mekanika Tanah - Triaxial shear test
Mekanika Tanah - Triaxial shear testMekanika Tanah - Triaxial shear test
Mekanika Tanah - Triaxial shear test
 
Vane shear test
Vane shear testVane shear test
Vane shear test
 
Text compression in LZW and Flate
Text compression in LZW and FlateText compression in LZW and Flate
Text compression in LZW and Flate
 
Buckling test engt110
Buckling test engt110Buckling test engt110
Buckling test engt110
 
Data compression
Data compressionData compression
Data compression
 
Image compression using discrete wavelet transform
Image compression using discrete wavelet transformImage compression using discrete wavelet transform
Image compression using discrete wavelet transform
 
Data compression introduction
Data compression introductionData compression introduction
Data compression introduction
 
Compression techniques
Compression techniquesCompression techniques
Compression techniques
 
data compression technique
data compression techniquedata compression technique
data compression technique
 

Similar to Data Compression Project Presentation

IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
ijceronline
 
JOINT IMAGE WATERMARKING, COMPRESSION AND ENCRYPTION BASED ON COMPRESSED SENS...
JOINT IMAGE WATERMARKING, COMPRESSION AND ENCRYPTION BASED ON COMPRESSED SENS...JOINT IMAGE WATERMARKING, COMPRESSION AND ENCRYPTION BASED ON COMPRESSED SENS...
JOINT IMAGE WATERMARKING, COMPRESSION AND ENCRYPTION BASED ON COMPRESSED SENS...
ijma
 
The study on mining temporal patterns and related applications in dynamic soc...
The study on mining temporal patterns and related applications in dynamic soc...The study on mining temporal patterns and related applications in dynamic soc...
The study on mining temporal patterns and related applications in dynamic soc...
Thanh Hieu
 
Compression of digital voice and video
Compression of digital voice and videoCompression of digital voice and video
Compression of digital voice and video
sangusajjan
 

Similar to Data Compression Project Presentation (20)

A new algorithm for data compression technique using vlsi
A new algorithm for data compression technique using vlsiA new algorithm for data compression technique using vlsi
A new algorithm for data compression technique using vlsi
 
Data compression, data security, and machine learning
Data compression, data security, and machine learningData compression, data security, and machine learning
Data compression, data security, and machine learning
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
 
Data compression
Data compressionData compression
Data compression
 
Image Compression Through Combination Advantages From Existing Techniques
Image Compression Through Combination Advantages From Existing TechniquesImage Compression Through Combination Advantages From Existing Techniques
Image Compression Through Combination Advantages From Existing Techniques
 
Design and Implementation of A Data Stream Management System
Design and Implementation of A Data Stream Management SystemDesign and Implementation of A Data Stream Management System
Design and Implementation of A Data Stream Management System
 
Teknik Pengkodean (2).pptx
Teknik Pengkodean (2).pptxTeknik Pengkodean (2).pptx
Teknik Pengkodean (2).pptx
 
Lossless Data Compression Using Rice Algorithm Based On Curve Fitting Technique
Lossless Data Compression Using Rice Algorithm Based On Curve Fitting TechniqueLossless Data Compression Using Rice Algorithm Based On Curve Fitting Technique
Lossless Data Compression Using Rice Algorithm Based On Curve Fitting Technique
 
Compression technologies
Compression technologiesCompression technologies
Compression technologies
 
Gene's law
Gene's lawGene's law
Gene's law
 
Aa31163168
Aa31163168Aa31163168
Aa31163168
 
Affable Compression through Lossless Column-Oriented Huffman Coding Technique
Affable Compression through Lossless Column-Oriented Huffman Coding TechniqueAffable Compression through Lossless Column-Oriented Huffman Coding Technique
Affable Compression through Lossless Column-Oriented Huffman Coding Technique
 
Wireless Ad Hoc Networks
Wireless Ad Hoc NetworksWireless Ad Hoc Networks
Wireless Ad Hoc Networks
 
THE WORLD IN A NUTSHELL
THE WORLD IN A  NUTSHELLTHE WORLD IN A  NUTSHELL
THE WORLD IN A NUTSHELL
 
An Optics Life
An Optics LifeAn Optics Life
An Optics Life
 
JOINT IMAGE WATERMARKING, COMPRESSION AND ENCRYPTION BASED ON COMPRESSED SENS...
JOINT IMAGE WATERMARKING, COMPRESSION AND ENCRYPTION BASED ON COMPRESSED SENS...JOINT IMAGE WATERMARKING, COMPRESSION AND ENCRYPTION BASED ON COMPRESSED SENS...
JOINT IMAGE WATERMARKING, COMPRESSION AND ENCRYPTION BASED ON COMPRESSED SENS...
 
The study on mining temporal patterns and related applications in dynamic soc...
The study on mining temporal patterns and related applications in dynamic soc...The study on mining temporal patterns and related applications in dynamic soc...
The study on mining temporal patterns and related applications in dynamic soc...
 
Compression of digital voice and video
Compression of digital voice and videoCompression of digital voice and video
Compression of digital voice and video
 
F5242832
F5242832F5242832
F5242832
 
Multivariate dimensionality reduction in cross-correlation analysis
Multivariate dimensionality reduction in cross-correlation analysis Multivariate dimensionality reduction in cross-correlation analysis
Multivariate dimensionality reduction in cross-correlation analysis
 

More from Myuran Kanga, MS, MBA (6)

Operations Management & Production Efficiency Analysis
Operations Management & Production Efficiency AnalysisOperations Management & Production Efficiency Analysis
Operations Management & Production Efficiency Analysis
 
Customer and Corporate Vision Alignment Analysis Sample – Field & Retail Venu...
Customer and Corporate Vision Alignment Analysis Sample – Field & Retail Venu...Customer and Corporate Vision Alignment Analysis Sample – Field & Retail Venu...
Customer and Corporate Vision Alignment Analysis Sample – Field & Retail Venu...
 
Audi Brand Strategy Evaluation
Audi Brand Strategy EvaluationAudi Brand Strategy Evaluation
Audi Brand Strategy Evaluation
 
ICS Careline Final Presentation_2
ICS Careline Final Presentation_2ICS Careline Final Presentation_2
ICS Careline Final Presentation_2
 
Audi Brand Awareness Study, Customer Experience Identification, and Analysis
Audi Brand Awareness Study, Customer Experience Identification, and AnalysisAudi Brand Awareness Study, Customer Experience Identification, and Analysis
Audi Brand Awareness Study, Customer Experience Identification, and Analysis
 
Holy Hand Grenade
Holy Hand GrenadeHoly Hand Grenade
Holy Hand Grenade
 

Data Compression Project Presentation

  • 1. Stockpile Resource Center – Aircraft Compatibility Summer Work Presentation: Graflab Data Compression Study Myuran Kanga August 12, 2010 Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.
  • 2. Presentation Outline Introduction Project Overview – Sam Sterns Data Compression Uses for Data Compression Types of Data Compression Three Algorithms Testing Procedure Compression/Decompression Example Findings Conclusion Page ii
  • 3. Presentation Outline Introduction Project Overview – Sam Sterns Data Compression Uses for Data Compression Types of Data Compression Three Algorithms Testing Procedure Compression/Decompression Example Findings Conclusion Page 1
  • 4. Introduction • Myuran Kanga – Bachelors Degree: Oklahoma State University – Electrical Engineering – Master’s Fellowship Program: Rice University – Electrical Engineering (Communications Specialization) – Sandia: Meaningful Work/Projects: - Team Assimilation - Shaker Testing - Cadence ORCAD – Electronic Design Software familiarization - ORCAD Installation/licensing procedure documentation - Courses – Quality for Project Management, Engineering Excellence, Labview Core I, and Labview Core II - Graflab Data Compression Study/Evaluation Page 2
  • 5. Presentation Outline Introduction Project Overview – Sam Sterns Data Compression Uses for Data Compression Types of Data Compression Three Algorithms Testing Procedure Compression/Decompression Example Findings Conclusion Page 3
  • 6. Project Overview – Graflab Data Compression Study Page 4 Summary: Evaluation of three Data Compression Algorithms created by Dr. Samuel D. Sterns. Primary Investigator/Technical Project Lead: Myuran Kanga Key Personnel: Jerry Cap and Troy Skousen Biography: Author – Compression Algorithms: Dr. Sam Sterns [1] - Electrical Engineer specializing in digital signal processing and adaptive signal processing - Distinguished Member of the Technical Staff at Sandia National Laboratories for 27 years. Retired in 1996. - Author/Co-author of 7 signal processing textbooks - Professor Emeritus at the University of New Mexico, involved with teaching/research at the university since 1960.
  • 7. Project Overview – Graflab Data Compression Study cont. Page 5 Project: Evaluation and interpretation of three data compression algorithms. - Algorithms labeled “2”, “3”, and “4” - Code written in Matlab - Each similar in nature - Algorithms implement additional and more sophisticated methods of compression - More complex algorithms said to require longer computational time but greater accuracy - Hope to utilize compression with GRAFLAB - GRAFLAB is a database, analysis, and plotting package used for data reduction, analysis, and archival purposes at Sandia.
  • 8. Presentation Outline Introduction Project Overview – Sam Sterns Data Compression Uses for Data Compression Types of Data Compression Three Algorithms Testing Procedure Compression/Decompression Example Findings Conclusion Page 6
  • 10. Data Compression Definition: The process of encoding information using fewer units of storage than an un-encoded representation of data, through the use of specific encoding schemes. [3] Data compression, or sometimes called source coding, is the process of converting input data into another data stream that has a smaller size, but retains the essential information contained within the original data stream. Page 8
  • 11. Presentation Outline Introduction Project Overview – Sam Sterns Data Compression Uses for Data Compression Types of Data Compression Three Algorithms Testing Procedure Compression/Decompression Example Findings Conclusion Page 9
  • 12. Data Compression Implementations Page 10 - Compression is useful because it helps reduce the consumption of resources, such as hard disk space or transmission bandwidth. - With the interest and surge in environmental test data for the Surveillance Program, significant strains on computer storage resources will occur. - Archiving of environmental test data from legacy systems, including data for the Environment Test lab. - Familiar examples of data compressed files include .zip, .rar, .tar file extensions. [4]
  • 14. Lossless vs. Lossy Compression
    Two forms of compression: lossless and lossy.
    Lossless compression:
    - These algorithms usually exploit statistical redundancy to represent the user’s data more concisely, without error.
    - Most real-world data has statistical redundancy.
    - Example: in English text, the letter ‘e’ is much more common than the letter ‘z’. Similarly, the probability that the letter ‘q’ will be followed by the letter ‘z’ is very small.
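The statistical redundancy mentioned on this slide can be made concrete with a short sketch. It measures the Shannon entropy of the letter frequencies in a sample sentence and compares it with the 4.70 bits per letter that 26 equally likely letters would need. The sample text and the function name `entropy_bits_per_symbol` are illustrative assumptions, not part of the study.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(symbols):
    """Shannon entropy of the symbol frequencies, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical sample text; any ordinary English sentence behaves similarly.
sample = "the quick brown fox jumps over the lazy dog and then sleeps beneath the tree"
letters = [ch for ch in sample if ch.isalpha()]

measured = entropy_bits_per_symbol(letters)
uniform = math.log2(26)  # bits per letter if all 26 letters were equally likely

print(f"measured entropy: {measured:.2f} bits/letter")
print(f"uniform alphabet: {uniform:.2f} bits/letter")
```

The gap between the two numbers is the redundancy a lossless coder can exploit; this sketch only counts single-letter frequencies and ignores letter-pair effects such as ‘q’ followed by ‘z’.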
  • 15. Lossless vs. Lossy Compression
    Lossy compression:
    - Guided by research on how people perceive the data in question.
    - Used when some loss of fidelity is acceptable.
    - Example: the human eye is more sensitive to subtle variations in luminance than to variations in color, so color detail can be reduced with little visible effect on the image.
    - JPEG image compression works in part by “rounding off” some of this less important information.
    - Lossy compression provides a way to obtain the best fidelity for a given amount of compression.
  • 17. Compression Algorithms
    - Compression “2”: Quantizes the data signal and packs the result into a sequence of bytes.
    - Compression “3”: Predicts the quantized data and packs the prediction error into a sequence of bytes.
    - Compression “4”: Encodes the prediction error into a sequence of bytes using adaptive arithmetic coding. Said to provide the maximum compression. [5]
  • 18. Compression Algorithms (cont.)
    Quantization [6]
    - The process of approximating a continuous range of values by a relatively small set of discrete symbols or integer values.
    - Sampling occurs on a periodic basis to convert the continuous signal to discrete values.
    - Can be viewed as accumulating data in bins.
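A minimal sketch of uniform quantization, assuming a fixed bin width. The names `quantize` and `dequantize`, the sample rate, and the step size are hypothetical choices for illustration; the step size actually used by the algorithms under study is not stated in the slides.

```python
import math


def quantize(samples, step):
    """Map each real-valued sample to the integer index of its nearest bin."""
    return [int(round(x / step)) for x in samples]


def dequantize(indices, step):
    """Recover the bin-center value for each integer index."""
    return [k * step for k in indices]


# Hypothetical test signal: one second of a 5 Hz sine sampled at 100 Hz.
sample_rate = 100
signal = [math.sin(2 * math.pi * 5 * n / sample_rate) for n in range(sample_rate)]

step = 0.01  # assumed bin width
codes = quantize(signal, step)
restored = dequantize(codes, step)

# The reconstruction error of a uniform quantizer is at most half a bin.
max_err = max(abs(a - b) for a, b in zip(signal, restored))
print(f"distinct codes: {len(set(codes))}, max error: {max_err:.5f}")
```

A smaller step gives better fidelity but more distinct integer codes to store, which is the basic trade-off behind the "bins" picture.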
  • 19. Compression Algorithms (cont.)
    Linear Prediction [7]
    - A signal processing tool in which future values of a digital signal are estimated as a linear function of previous samples in the data.
    - Model: a time-varying digital filter driven by an excitation function x(n) produces the desired output y(n):
      \[ y(n) = \sum_{j=1}^{N} a_j \, y(n-j) + \sum_{j=0}^{M} b_j \, x(n-j) \]
    - The task is to find the excitation function and filter coefficients that minimize the error between the predicted y(n) and the original y(n):
      \[ y(n) = \sum_{j=1}^{N} a_j \, y(n-j) + e(n) \]
      \[ \hat{y}(n) = \sum_{j=1}^{N} a_j \, y(n-j) \]
      \[ e(n) = y(n) - \hat{y}(n) \]
    - Also called Linear Predictive Coding
    - Common application: speech compression
      - Transmit only the filter coefficients (Hk) and the excitation sequence x(n)
      - For extreme compression, transmit only the filter coefficients and use a fixed-frequency excitation (a voice coder)
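The idea of storing the prediction error e(n) instead of the signal y(n) can be sketched as follows. This is only an illustration: it uses a fixed second-order predictor with assumed coefficients a = [2, -1], whereas the algorithms in the study are not described in enough detail here to reproduce their coefficients. The function names are hypothetical.

```python
import math


def predict_residual(y, a):
    """Return e(n) = y(n) - sum_j a[j-1] * y(n-j), treating y(n) as 0 for n < 0."""
    order = len(a)
    residual = []
    for n in range(len(y)):
        y_hat = sum(a[j - 1] * y[n - j] for j in range(1, order + 1) if n - j >= 0)
        residual.append(y[n] - y_hat)
    return residual


def reconstruct(residual, a):
    """Invert predict_residual: rebuild y(n) from e(n) using the same coefficients."""
    order = len(a)
    y = []
    for n in range(len(residual)):
        y_hat = sum(a[j - 1] * y[n - j] for j in range(1, order + 1) if n - j >= 0)
        y.append(y_hat + residual[n])
    return y


# Hypothetical quantized signal: a slow sine scaled to integer amplitudes.
y = [round(1000 * math.sin(2 * math.pi * n / 200)) for n in range(400)]
a = [2, -1]  # assumed fixed predictor: y_hat(n) = 2*y(n-1) - y(n-2)

e = predict_residual(y, a)
peak_signal = max(abs(v) for v in y)
peak_residual = max(abs(v) for v in e[2:])  # skip the two start-up samples

print(f"peak signal: {peak_signal}, peak residual: {peak_residual}")
```

Because the residual spans a much smaller range than the signal, it can be packed into fewer bits, and with integer data the reconstruction is exact.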
  • 20. Compression Algorithms (cont.)
    Arithmetic Coding [8]
    - A long data string is represented by a single number, obtained by repeatedly partitioning the range of possible values in proportion to the probabilities of the symbols in the string.
    - Example string: DABDDB (symbol frequencies A: 1, B: 2, D: 3)

      Symbol   Part 1     Part 2 (freq. product)   Total
      D        6^5 x 3    (none)                   23328
      A        6^4 x 0    3                            0
      B        6^3 x 1    3 x 1                      648
      D        6^2 x 3    3 x 1 x 2                  648
      D        6^1 x 3    3 x 1 x 2 x 3              324
      B        6^0 x 1    3 x 1 x 2 x 3 x 3           54
      Sum                                          25002

      Coded data + total frequencies: 25002 + (3 x 1 x 2 x 3 x 3 x 2) = 25110, so any value in the range [25002, 25110) identifies the string.
    - Part 1: the string has 6 symbols, so the radix is 6; each power of 6 is multiplied by the index of the letter (A = 0 to D = 3).
    - Part 2: each term is then multiplied by the accumulated product of the frequencies of the preceding symbols.
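The DABDDB arithmetic on this slide can be checked with a short sketch. It follows the same two-part recipe (power of the radix times the symbol's cumulative frequency, times the running product of preceding frequencies). The function name `radix_bounds` is hypothetical, and this is the static textbook example, not the adaptive arithmetic coder used in Compression “4”.

```python
from collections import Counter


def radix_bounds(message):
    """Return (lower, upper) for the radix-style arithmetic coding example.

    lower = sum over positions of radix^(n-1-i) * cumulative_freq * product
            of the frequencies of all preceding symbols
    upper = lower + product of the frequencies of all symbols
    """
    freq = Counter(message)

    # Cumulative frequency of each symbol in sorted order: A=0, B=1, D=3 here.
    cumulative = {}
    running = 0
    for symbol in sorted(freq):
        cumulative[symbol] = running
        running += freq[symbol]

    n = len(message)  # the radix equals the string length
    lower = 0
    freq_product = 1
    for i, symbol in enumerate(message):
        lower += n ** (n - 1 - i) * cumulative[symbol] * freq_product
        freq_product *= freq[symbol]
    return lower, lower + freq_product


lower, upper = radix_bounds("DABDDB")
print(lower, upper)  # 25002 25110
```

Any integer in the half-open range from `lower` up to `upper` decodes back to the same string, which is why a single number can stand in for the whole message.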
  • 22. Evaluation Procedure/Analysis
    Classical waveform compression study:
    - Triangle Wave, Trapezoid Wave, Sine Wave, Sawtooth Wave, Square Wave
    - Hanning Window, Harmonic Sine Waves, Combined Sine Waves
    - Gap Analysis, White Noise, Sine Wave with Noise
    - Power Spectral Density
    - .wav File
    Waveforms were created manually in individual m-files so that the vector arrangement in Matlab is predictable. Frequencies and signal durations are easily modifiable.
  • 23. Waveform Examples
    [Figure: four example waveforms (Trapezoid Wave, White Noise, Gap Analysis, Sawtooth Wave). Each is shown as three plots of Amplitude versus Time (Seconds): Original, Decompressed Waveform, and Difference. The Difference axes span between ±2 x 10^-5 and ±0.02, depending on the waveform.]
  • 24. Testing and Measurements
    Implemented analysis and measurements:
    - Input and output data array sizes
    - Percentage accuracy of compression
    - Compression ratio
    - Relative computational time
    - Percent difference: max. and min. values of original and decompressed waveforms
    - Percent difference: standard deviation of original and decompressed waveforms
    - Percent error: max. and min. values of original and decompressed waveforms
    - Percent error: standard deviation of original and decompressed waveforms
    - Root Mean Square (RMS) values of original and decompressed waveforms
    - Normal values of original and decompressed waveforms
    - Difference in RMS values
    - Difference in Normal values
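Several of the measurements listed above can be sketched in a few lines. The slides do not give the exact formulas used in the study, so the definitions below (compression ratio as original size over compressed size, percent error relative to the original value, percent difference relative to the mean of the two values) are common conventions assumed for illustration, and all function names and sample data are hypothetical.

```python
import math


def compression_ratio(original_bytes, compressed_bytes):
    """Assumed definition: original size divided by compressed size."""
    return original_bytes / compressed_bytes


def rms(x):
    """Root mean square of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def std_dev(x):
    """Population standard deviation of a sequence of samples."""
    mean = sum(x) / len(x)
    return math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))


def percent_error(reference, measured):
    """Assumed definition: error relative to the reference (original) value."""
    return abs(measured - reference) / abs(reference) * 100.0


def percent_difference(a, b):
    """Assumed definition: difference relative to the mean of the two magnitudes."""
    return abs(a - b) / ((abs(a) + abs(b)) / 2.0) * 100.0


# Hypothetical original and decompressed waveforms.
original = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, -2.0, -1.0]
decompressed = [0.0, 0.99, 2.01, 1.0, 0.0, -1.01, -1.99, -1.0]

print("max percent error:", percent_error(max(original), max(decompressed)))
print("std percent difference:", percent_difference(std_dev(original), std_dev(decompressed)))
print("RMS difference:", abs(rms(original) - rms(decompressed)))
```

Under the assumed ratio definition, a 152-byte input that compresses to 100 bytes gives `compression_ratio(152, 100) == 1.52`, and a ratio of 1 means no compression.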
  • 26. Compression/Decompression Example
    - An m-file was written to create this .wav file for real-world compression/decompression testing.
    - Using Compression “4”, the compression ratio of the file was 1.52, with an accuracy of 99.6078 percent.
    - Compressed output using Compression “2” and “4”: turn up your volume, because the amplitude of the compressed file is much lower.
    - Compressed data should not resemble the original data. This example demonstrates the inefficiency of Compression “2”.
    Audio clips: Original Song; Compressed Song (Compression 2); Decompressed Song; Compressed Song (Compression 4)
  • 28. Findings
    Compression “2”:
    - This algorithm generally produced a compression ratio of about 1. For simple waveforms such as the square wave, compression did occur.
    - Fastest compression algorithm of the three
    - Inefficient compression: a compression ratio of 1 means no compression
    Compression “3” and “4”:
    - Compression ratio increases with increased data length/duration
    - Increased data length/duration causes longer calculation times (within limits)
    - Compression “4” produced a much higher compression ratio than the other algorithms
    - Compression “4” is the slowest algorithm (it applies three compression methods)
    Special cases:
    - The square wave produces 100% accuracy and very high compression with all three algorithms
    - White noise does not seem to compress much past a ratio of 1
    - Code has been modified to handle gaps in the input data
    - The accuracy of compression/decompression for all three algorithms was above 99% in all cases
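The contrast between the square wave and white noise is not specific to the three algorithms studied. As a rough illustration, the sketch below uses Python's standard `zlib` compressor (a general-purpose substitute, not one of the algorithms evaluated here) on 8-bit samples of each signal. The signal lengths, amplitudes, and random seed are arbitrary assumptions.

```python
import random
import zlib


def zlib_ratio(data):
    """Compression ratio (original size / compressed size) using zlib at level 9."""
    return len(data) / len(zlib.compress(data, 9))


n = 20000  # number of 8-bit samples in each hypothetical test signal

# Square wave: alternates between two levels every 50 samples.
square = bytes(200 if (i // 50) % 2 == 0 else 50 for i in range(n))

# White noise: uniformly random bytes from a seeded generator for repeatability.
rng = random.Random(0)
noise = bytes(rng.randrange(256) for _ in range(n))

square_ratio = zlib_ratio(square)
noise_ratio = zlib_ratio(noise)

print(f"square wave ratio: {square_ratio:.1f}")
print(f"white noise ratio: {noise_ratio:.3f}")
```

The highly repetitive square wave shrinks dramatically, while the noise stays at a ratio of about 1 because it has no statistical redundancy for a lossless coder to exploit.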
  • 30. Future Work
    - Similar waveform analysis with the raw data files provided by Dr. Sam Stearns
    - Additional error or warning messages for noise, gaps, and invalid array data
    - Implementation of the compression algorithms in the Graflab database
    - Investigate possibilities of real-time compression/decompression
    Recommendations:
    - Filter noise from data prior to compression
    - Compress all data, regardless of size
    - Continue replacing gaps with zeros
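The recommendation to replace gaps with zeros might look like the sketch below. How gaps are represented in the real data files is not stated in the slides; this example assumes they appear as `None` or NaN entries, and the function name `fill_gaps` is hypothetical.

```python
import math


def fill_gaps(samples, fill_value=0.0):
    """Replace missing samples (assumed to be None or NaN) with fill_value."""
    cleaned = []
    for v in samples:
        is_gap = v is None or (isinstance(v, float) and math.isnan(v))
        cleaned.append(fill_value if is_gap else v)
    return cleaned


# Hypothetical record with two gaps.
record = [1.0, float("nan"), None, 2.0]
print(fill_gaps(record))  # [1.0, 0.0, 0.0, 2.0]
```

Filling with a constant keeps the array length unchanged, so sample timing is preserved, and long runs of identical values are cheap for a compressor to encode.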
  • 31. Summer Work Applicability / Benefit
    - Applicability to our organization
      - Meaningful work
      - Storing new and legacy environmental test data from the surveillance program
      - Environmental Test lab data storage
    - Opportunity to continue education
      - Improved Matlab skills
      - Introduction to Labview
      - ORCAD familiarity
      - Organizational and leadership skills (management course)
    - Assimilation to Albuquerque, the work environment at Sandia National Laboratories, and Aircraft Compatibility [9] [10]
  • 32. Citations and Questions (Appendix I)
    [1] University of New Mexico – ECE, “Dr. Samuel D. Stearns,” 2010. [Online]. Available: http://www.ece.unm.edu/faculty/stearns/. [Accessed: July 2010].
    [2] Plus Magazine, “Text, Bytes and Videotape,” January 1, 2003. [Online]. Available: http://plus.maths.org/issue23/features/data/data.jpg. [Accessed: August 2010].
    [3] Wikipedia, “Data compression,” July 20, 2010. [Online]. Available: http://en.wikipedia.org/wiki/Data_compression. [Accessed: August 2010].
    [4] Hoax-Slayer.com, “Burning-hard-drive,” 2010. [Online]. Available: http://www.hoax-slayer.com/images/burning-hard-drive.jpg. [Accessed: August 2010].
    [5] S. D. Stearns, Encoding and Decoding of Instrumentation and Telemetry Waveforms. Sandia National Laboratories, January 25, 2008.
    [6] Wikipedia, “Quantization (signal processing),” July 2, 2010. [Online]. Available: http://en.wikipedia.org/wiki/Quantization_(signal_processing). [Accessed: June 2010].
    [7] Connexions, “Linear Prediction and Cross Synthesis,” March 18, 2008. [Online]. Available: http://cnx.org/content/m15478/latest/. [Accessed: June 2010].
    [8] Wikipedia, “Arithmetic coding,” August 7, 2010. [Online]. Available: http://en.wikipedia.org/wiki/Arithmetic_coding. [Accessed: June 2010].
    [9] Rice University, Home page, 2010. [Online]. Available: http://www.rice.edu. [Accessed: August 2010].
  • 33. Citations and Questions (Appendix II)
    [10] Sandia National Laboratories, Home page, 2010. [Online]. Available: http://www.sandia.gov. [Accessed: August 2010].
    [11] T. Skousen. (private communication). 2010.
    [12] J. Cap. (private communication). 2010.