1. Lecture 5, Wednesday 17th September 2014
DEPARTMENT OF GEOGRAPHY AND ENVIRONMENT
UNIVERSITY OF DHAKA
2. According to the NCDCDS (the US National Committee for Digital Cartographic Data Standards), there are five dimensions of geographic data quality. In addition, the ICA (International Cartographic Association) proposed two more dimensions.
1. Lineage of Geographic data
2. Positional Accuracy of Geographic data
3. Attribute Accuracy of Geographic data
4. Logical consistency
5. Completeness of Geographic data
6. Temporal accuracy
7. Semantic accuracy
3. Lineage refers to the sources of materials from which a specific set of geographic data was derived
Lineage answers the following questions for a user of the data:
1. Who collected the data?
2. When were the data collected?
3. How were the data collected?
4. How were the data converted?
5. What algorithms were used to process the data?
6. What was the precision of computation?
4. Positional accuracy is the “closeness” of coordinate values to the “true” positions in the real world
Generally, maps are accurate to roughly one line width, or 0.5 mm. This is known as the minimum mapping unit. A 0.5 mm resolution is equivalent to 5 m on the ground on 1:10000 scale maps and 125 m on 1:250000 scale maps.
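The scale arithmetic above can be checked with a short sketch. The function name `ground_resolution_m` and its parameters are illustrative assumptions, not part of the lecture; the only inputs taken from the slide are the 0.5 mm line width and the two map scales.

```python
def ground_resolution_m(scale_denominator, line_width_mm=0.5):
    """Ground distance (in metres) covered by a line of the given width on the map.

    scale_denominator is the N in a 1:N map scale (hypothetical parameter name).
    """
    # Map millimetres times the scale denominator gives ground millimetres;
    # dividing by 1000 converts to metres.
    return line_width_mm * scale_denominator / 1000.0


for denominator in (10_000, 250_000):
    metres = ground_resolution_m(denominator)
    print(f"1:{denominator} map -> {metres:g} m on the ground")
```

The same function can be called with a different `line_width_mm` if a map series uses a finer or coarser line.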
Positional accuracy of data can be measured in two ways:
1. Planimetric accuracy
2. Height accuracy
6. Attribute accuracy is defined as the “closeness” of the descriptive data in the geographic database to the true or assumed values of the real-world features that they represent
Attribute accuracy is measured in different ways:
For metric attributes (e.g. DEM, TIN), accuracy can be expressed simply as measurement error
For categorical attributes (e.g. land use classification), accuracy is very difficult to measure directly. In such cases, attribute accuracy is usually evaluated in terms of other factors, such as:
1. The classification scheme
2. The amount of gross error
3. The degree of heterogeneity of the polygons
7. The error matrix is defined as a square array of values, denoted C, which cross-tabulates the number of sample spatial data units assigned to a particular category relative to the actual category as verified by the reference data
It is constructed to show the frequency of discrepancies between encoded values and their corresponding reference values for a sample
In the error matrix, rows represent the categories of the classification in the database obtained by the user
Columns represent the classification of the reference data, obtained from source data or a field visit
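A minimal sketch of how such a matrix could be tallied from paired labels, following the row and column convention just described (rows: database classification, columns: reference data). The function name `build_error_matrix` and the five sample locations are hypothetical and serve only as illustration.

```python
def build_error_matrix(classified, reference, categories):
    """Cross-tabulate database labels (rows) against reference labels (columns)."""
    index = {name: i for i, name in enumerate(categories)}
    size = len(categories)
    matrix = [[0] * size for _ in range(size)]
    for database_label, reference_label in zip(classified, reference):
        matrix[index[database_label]][index[reference_label]] += 1
    return matrix


# Hypothetical sample of five locations (not data from the lecture).
categories = ["Cropland", "Forest", "Water"]
classified = ["Cropland", "Cropland", "Forest", "Water", "Forest"]
reference = ["Cropland", "Forest", "Forest", "Water", "Forest"]

C = build_error_matrix(classified, reference, categories)
for name, row in zip(categories, C):
    print(f"{name:10s} {row}")
```

In this toy sample one location encoded as Cropland is Forest in the reference data, so it lands off the diagonal in the Cropland row.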
8. Diagonal elements represent correctly classified spatial data
Off-diagonal elements represent the frequencies of misclassification of various categories
If in a particular error matrix, all the non-zero entries lie on the diagonal, it indicates that no misclassification at the sample locations has occurred and an overall accuracy of 100% is obtained
When misclassifications occur, each can be described either as an error of commission (error of inclusion), which relates to user's accuracy, or as an error of omission (error of exclusion), which relates to producer's accuracy
9. Overall Accuracy
Computed by dividing the total number of correctly classified pixels by the total number of reference pixels
The maximum value of the overall accuracy is 100%, when there is perfect agreement between the database and the reference data. The minimum value is 0%.
10. OA can also be termed PCC (Percent Correctly Classified). The following equation can be used:
PCC or OA = (Sd / n) * 100%
Where,
Sd = sum of values along the diagonal
n = total number of sample locations
11. Error matrix: Sample Data (rows) against Reference Data (columns)

Sample Data       Exposed soil   Cropland   Range   Sparse woodland   Forest   Water   Total
Exposed soil                 1          2       0                 0        0       0       3
Cropland                     0          5       0                 2        3       0      10
Range                        0          3       5                 1        0       0       9
Sparse woodland              0          0       4                 4        0       0       8
Forest                       0          0       0                 0        4       0       4
Water                        0          0       0                 0        0       1       1
Total                        1         10       9                 7        7       1      35
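Applying the PCC formula to the sample error matrix can be sketched as follows. The cell values are copied from the matrix above; the names `error_matrix` and `overall_accuracy` are illustrative assumptions.

```python
# Rows: sample (database) data; columns: reference data.
# Category order: exposed soil, cropland, range, sparse woodland, forest, water.
error_matrix = [
    [1, 2, 0, 0, 0, 0],
    [0, 5, 0, 2, 3, 0],
    [0, 3, 5, 1, 0, 0],
    [0, 0, 4, 4, 0, 0],
    [0, 0, 0, 0, 4, 0],
    [0, 0, 0, 0, 0, 1],
]


def overall_accuracy(matrix):
    """Return OA (PCC) in percent: diagonal sum divided by the grand total."""
    diagonal_sum = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * diagonal_sum / total


oa = overall_accuracy(error_matrix)
print(f"Overall accuracy: {oa:.1f}%")
```

Here the diagonal sums to 20 out of 35 sample locations, so the overall accuracy is about 57.1%.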
12. Producer's accuracy can be computed by dividing the number of correctly classified pixels in each category (on the major diagonal) by the number of training set pixels used for that category (the column total)
Producer's accuracy = (Ci / Ct) * 100%
Where,
Ci = correctly classified sample locations in the column
Ct = total number of sample locations in the column
EO (error of omission) = 100% - producer's accuracy
13. Calculation of PA
Exposed soil = 1/1 = 100%
Cropland = 5/10 = 50%
Range = 5/9 = 55.6%
Sparse woodland = 4/7 = 57.1%
Forest = 4/7 = 57.1%
Water = 1/1 = 100%
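The producer's accuracy figures above can be reproduced with a short sketch. The matrix values come from the sample error matrix; the function name `producers_accuracy` is an assumption, and the sketch assumes every column total is non-zero.

```python
# Rows: sample (database) data; columns: reference data.
categories = ["Exposed soil", "Cropland", "Range",
              "Sparse woodland", "Forest", "Water"]
error_matrix = [
    [1, 2, 0, 0, 0, 0],
    [0, 5, 0, 2, 3, 0],
    [0, 3, 5, 1, 0, 0],
    [0, 0, 4, 4, 0, 0],
    [0, 0, 0, 0, 4, 0],
    [0, 0, 0, 0, 0, 1],
]


def producers_accuracy(matrix):
    """Return per-category PA in percent: diagonal cell / column total."""
    size = len(matrix)
    result = []
    for j in range(size):
        column_total = sum(matrix[i][j] for i in range(size))
        result.append(100.0 * matrix[j][j] / column_total)
    return result


pa = producers_accuracy(error_matrix)
for name, value in zip(categories, pa):
    # Error of omission is the complement of producer's accuracy.
    print(f"{name:16s} PA = {value:5.1f}%  EO = {100.0 - value:5.1f}%")
```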
14. User's accuracy is computed by dividing the number of correctly classified pixels in each category by the total number of pixels that were classified in that category (the row total)
This figure is a measure of commission error and indicates the probability that a pixel classified into a given category actually represents that category on the ground
UA = (Ri / Rt) * 100%
Where,
Ri = correctly classified sample locations in the row
Rt = total number of sample locations in the row
Error of commission = 100% - user's accuracy
15. Calculation of UA
Exposed soil = 1/3 = 33.3%
Cropland = 5/10 = 50%
Range = 5/9 = 55.6%
Sparse woodland = 4/8 = 50%
Forest = 4/4 = 100%
Water = 1/1 = 100%
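The user's accuracy figures above follow the same pattern, this time dividing by row totals. As before, the matrix values come from the sample error matrix, the function name `users_accuracy` is an assumption, and every row total is assumed to be non-zero.

```python
# Rows: sample (database) data; columns: reference data.
categories = ["Exposed soil", "Cropland", "Range",
              "Sparse woodland", "Forest", "Water"]
error_matrix = [
    [1, 2, 0, 0, 0, 0],
    [0, 5, 0, 2, 3, 0],
    [0, 3, 5, 1, 0, 0],
    [0, 0, 4, 4, 0, 0],
    [0, 0, 0, 0, 4, 0],
    [0, 0, 0, 0, 0, 1],
]


def users_accuracy(matrix):
    """Return per-category UA in percent: diagonal cell / row total."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]


ua = users_accuracy(error_matrix)
for name, value in zip(categories, ua):
    # Error of commission is the complement of user's accuracy.
    print(f"{name:16s} UA = {value:5.1f}%  EC = {100.0 - value:5.1f}%")
```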
16. 4. Logical consistency
Description of the fidelity of the relationships between the real world and encoded geographic data
In GIS, the topological model is an example of a means of ensuring logical consistency
>> consistency of the data model
>> consistency of the positional and attribute data
>> consistency between data files
17. 5. Completeness of Geographic data
Are all possible objects included within the database?
A. Spatial completeness
B. Thematic completeness
18. 6. Temporal accuracy
Measure of data quality with respect to the representation of time in a geographic database
A. World time
B. Database time
19. 7. Semantic accuracy
>> how correctly spatial objects are labeled or named
>> correct encoding in accordance with a set of features
20. Datum
A geodetic datum (plural datums, not data) is a reference from which measurements are made.
In surveying and geodesy, a datum is a set of reference points on the Earth's surface against which position measurements are made.
21. Horizontal datums are used for describing a point on the earth's surface, in latitude and longitude or another coordinate system.
Vertical datums are used to measure elevations or underwater depths.
23. A coordinate system defines the location of a point on a planar or spherical surface.
Types of coordinate systems
A. Based on Nature
B. Based on Extent
24. A. Based on Nature
1. Plane coordinate system
2. Geographic coordinate system
B. Based on Extent
1. Global coordinate system
2. Local coordinate system
27. 3. WGS 84
The World Geodetic System 1984 (WGS84) is the datum used by the Global Positioning System (GPS). The datum is defined and maintained by the United States National Geospatial-Intelligence Agency (NGA).
Coordinates computed from GPS receivers are likely to be provided in terms of the WGS84 datum and the heights in terms of the WGS84 ellipsoid.
28. 4. Everest 1830
India, like other countries of the world, made its own measurements and defined a reference surface to serve as the datum for mapping.
In India the reference surface was defined by Sir George Everest, who was Surveyor General of India from 1830 to 1843.
It has served as the reference for all mapping in India. The Indian system can be called the Indian Geodetic System, as all coordinates are referred to it. The reference surface is called the Everest Spheroid.
29. Geoid
An imaginary surface that coincides with mean sea level in the ocean and its extension through the continents.
A hypothetical surface that corresponds to mean sea level and extends at the same level under the continents.
The geoid is used as a reference surface for astronomical measurements and for the accurate measurement of elevation on the Earth's surface.
Ellipsoid
A geometric surface, symmetrical about the three coordinate axes, whose plane sections are ellipses or circles