The document discusses XML document structure and validation. It introduces well-formed and valid XML documents, and the DTD and XML Schema used to define the structure and elements of valid XML documents. It provides examples of DTDs defining the elements and attributes of sample XML documents.
02 well formed and valid documents
1. UNIT I INTRODUCTION TO XML
XML document structure – Well formed and valid
documents – Namespaces – DTD – XML Schema
– X-Files.
2. Well formed and valid documents
• A well-structured XML document can easily be
transported between systems and devices
3. Introduction
• XML documents can be well formed, and they can also be valid.
• Validity implies “well-formedness,” but not vice versa.
• A valid XML document is a strict form of a well-formed XML document.
• It’s like saying that a square is a rectangle, but not vice versa.
4. Well-Formed Documents
• An XML document is well formed
– If it follows all syntax rules of XML.
– If it does not include inappropriate markup or characters that cannot be
processed by XML parsers
• An XML document can’t be partially well formed.
• And, by definition, if a document is not well formed, it is not XML.
5. Valid Documents
• Property of “well-formedness”
– A matter of making sure the XML document complies with the syntax rules
• A well-formed XML document is considered valid only
– If it contains a proper Document Type Declaration and
– If the document obeys the constraints of that declaration
• Constraints of the declaration will be expressed as
– A DTD or
– An XML Schema.
6. Valid Vs Well-formed XML Document
• Well-formed XML documents are designed for use without any constraints
– Valid XML documents explicitly require these constraint mechanisms
• Valid XML documents can take advantage of certain advanced features of
XML
– Those advanced features of XML are not available to merely well-formed
documents due to their lack of a DTD or XML Schema.
• Creation of well-formed XML is a simple process,
– But the use of valid XML documents can greatly improve the quality of
document processes
7. Well Formed Document
• Use an XML validator to syntax-check your XML.
• An XML document with correct syntax is called "Well Formed".
• Important Syntax rules :
1. XML documents must have a root element
2. XML elements must have a closing tag
3. XML tags are case sensitive
4. XML elements must be properly nested
5. XML attribute values must be quoted
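The five rules above can be checked with any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are illustrative assumptions, not part of the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One conforming sample, then one sample per syntax rule being broken.
samples = {
    "well formed": '<note id="1"><to>Tove</to></note>',
    "two root elements": "<to>Tove</to><from>Jani</from>",
    "missing closing tag": "<note><to>Tove</note>",
    "case mismatch": "<message>text</Message>",
    "bad nesting": "<b><i>text</b></i>",
    "unquoted attribute": "<note id=1></note>",
}

for label, xml_text in samples.items():
    print(f"{label}: {is_well_formed(xml_text)}")
```

Only the first sample should be accepted; each of the others violates exactly one rule from the list.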
8. XML Errors Will Stop You
• Errors in XML documents will stop your XML applications.
• W3C XML specification states that
– A program should stop processing an XML document if it finds an error.
• The reason is that XML software should be small, fast, and compatible.
• HTML browsers are allowed to display HTML documents with errors
(like missing end tags).
• With XML, errors are not allowed.
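The difference in error handling can be seen by feeding the same broken markup to a lenient HTML parser and to a strict XML parser. This is a rough sketch using Python's standard library; the class name `TagCollector` and the sample markup are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records every start tag the lenient HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


broken = "<p>first<p>second"  # both end tags are missing

# The HTML parser keeps going and reports both paragraphs.
collector = TagCollector()
collector.feed(broken)
collector.close()
print("HTML parser saw:", collector.tags)

# The XML parser stops at the first error instead of guessing.
try:
    ET.fromstring(broken)
    xml_error = None
except ET.ParseError as err:
    xml_error = err
print("XML parser reported:", xml_error)
```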
9. Valid XML Documents
• A "well formed" XML document is not the same as a "valid" XML
document.
• A "valid" XML document must be well formed.
– In addition, it must conform to a document type definition.
• Two different document type definitions that can be used with XML:
1. DTD (Original Document Type Definition)
2. XML Schema (An XML-based alternative to DTD)
• A document type definition
– Defines the rules and the legal elements and attributes for an XML document.
10. Difference Between XML and HTML
• XML is not a replacement for HTML.
• XML and HTML were designed with different goals:
– XML was designed to describe data, with focus on what data is
– HTML was designed to display data, with focus on how data looks
• HTML is about displaying information,
– While XML is about carrying information.
11. Use of XML
• XML can be used to encode any structured information
• XML is good at representing
– Information that has an extensible, hierarchical format and requires
encoding of metadata.
12. XML Does Not DO Anything
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
• The note above has sender and receiver information, a heading, and a message body.
• But an XML document does not DO anything.
• It is just information wrapped in tags.
– Someone must write a piece of software to send, receive, or display it.
13. How Can XML be Used?
• XML Separates Data from HTML
• XML Simplifies Data Sharing
• XML Simplifies Platform Changes
• Internet Languages Written in XML
14. XML Delimiter Characters
• XML language has 4 special delimiters
Character Meaning
< Start of an XML markup tag
> End of an XML markup tag
& Start of an XML entity
; End of an XML entity
15. XML Syntax Rules
• All XML Elements Must Have a Closing Tag
• XML Tags are Case Sensitive
Example:
<message>This is incorrect</Message>
<message>This is correct</message>
• XML Elements Must be Properly Nested
• XML Documents Must Have a Root Element
• XML Attribute Values Must be Quoted
16. Empty XML Elements
• An element with no content is said to be empty.
• <element></element>
OR
• <element />
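Both spellings describe the same empty element. A small sketch with Python's `xml.etree.ElementTree` (the variable names are illustrative) shows that a parser reports them identically:

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<element></element>")
short_form = ET.fromstring("<element />")

# Neither form has text content or child elements.
for elem in (long_form, short_form):
    print(elem.tag, repr(elem.text), len(elem))
```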
17. XML Attributes
• XML elements can have attributes, just like HTML.
• Attributes provide additional information about an element.
• Attributes often provide information that is not a part of the
data.
• XML Attribute value must be Quoted (single or double quotes)
<person gender="female">
18. XML Tree
• XML documents form a tree structure
– That starts at "the root" and branches to "the leaves".
<?xml version="1.0" encoding="UTF-8"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!
</body>
</note>
• The first line is the XML declaration. It defines the XML version.
• The next line describes the root element of the document: <note>
• The next lines describe the 4 child elements of the root (to, from, heading, body)
• The last line defines the end of the root element: </note>
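The tree structure can be walked programmatically. The sketch below uses Python's `xml.etree.ElementTree` on the same note document; the variable names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

document = b"""<?xml version="1.0" encoding="UTF-8"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!
</body>
</note>"""

root = ET.fromstring(document)  # the root of the tree
print("root:", root.tag)

# Iterating over an element yields its direct children (the branches).
child_tags = [child.tag for child in root]
for child in root:
    print("  child:", child.tag, "->", child.text.strip())
```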
20. XML Elements vs. Attributes
<person gender="female">
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
<person>
<gender>female</gender>
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
• There are no rules about when to use attributes and when to use elements.
• A common recommendation in XML is to avoid attributes for data
and use elements instead.
21. XML Elements are Extensible
• XML elements can be extended to carry more information.
<note>
<to>Tove</to>
<from>Jani</from>
<body>Don't forget me this weekend!</body>
</note>
MESSAGE
To: Tove
From: Jani
Don't forget me this weekend!
22. Extended without breaking applications
• Suppose the author of the previous XML document added some extra information to it:
<note>
<date>2008-01-10</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Should the application break or crash?
No.
The application should still be able to find the <to>, <from>, and <body> elements
in the XML document and produce the same output.
One of the beauties of XML is that it can be extended without breaking applications.
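A hedged sketch of such an application, written in Python with `xml.etree.ElementTree`: the function name `render_message` is an assumption, and it simply looks up the three elements by name, so extra elements are ignored.

```python
import xml.etree.ElementTree as ET


def render_message(xml_text: str) -> str:
    """Build the MESSAGE output from the <to>, <from>, and <body> elements."""
    note = ET.fromstring(xml_text)
    return "\n".join([
        "MESSAGE",
        f"To: {note.findtext('to')}",
        f"From: {note.findtext('from')}",
        note.findtext("body"),
    ])


original = """<note>
<to>Tove</to>
<from>Jani</from>
<body>Don't forget me this weekend!</body>
</note>"""

extended = """<note>
<date>2008-01-10</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""

# The added <date> and <heading> elements do not change the result.
print(render_message(original))
print(render_message(extended) == render_message(original))
```

Because the lookup is by element name rather than by position, the extended document produces the same output as the original.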
23. Best way of using XML
• A date attribute is used in the first example:
<note date="2008-01-10">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
• A date element is used in the second example:
<note>
<date>2008-01-10</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
24. • An expanded date element is used in the third example:
<note>
<date>
<year>2008</year>
<month>01</month>
<day>10</day>
</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
25. Problems With Using Attributes
1. Attributes cannot contain multiple values (elements can)
2. Attributes cannot contain tree structures (elements can)
3. Attributes are not easily expandable (for future changes)
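The practical effect of these limitations can be sketched in Python. With a date attribute the application receives one flat string and must know its format; with the expanded date element the structure is already in the tree. The shortened note documents and variable names below are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

as_attribute = ET.fromstring('<note date="2008-01-10"><to>Tove</to></note>')
as_elements = ET.fromstring(
    "<note><date><year>2008</year><month>01</month><day>10</day></date>"
    "<to>Tove</to></note>"
)

# Attribute: a single string that the application has to split itself.
from_attribute = tuple(as_attribute.get("date").split("-"))

# Elements: each part of the date is a separate node in the tree.
date = as_elements.find("date")
from_elements = (
    date.findtext("year"),
    date.findtext("month"),
    date.findtext("day"),
)

print(from_attribute, from_elements)
```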
27. XML Namespaces
• XML Namespaces provide a method to avoid element name conflicts.
• This XML carries HTML table information:
<table>
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>
</table>
• This XML carries information about a table (a piece of furniture):
<table>
<name>African Coffee Table</name>
<width>80</width>
<length>120</length>
</table>
28. Solving the Name Conflict Using a Prefix
• If these XML fragments were added together,
– There would be a name conflict.
• Both contain a <table> element,
– But the elements have different content and meaning.
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
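In this example, there will be no conflict because the two <table>
elements have different names.
In a complete document, each prefix must also be bound to a namespace
with an xmlns:prefix attribute.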
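A sketch of how a namespace-aware parser treats the two prefixed tables, using Python's `xml.etree.ElementTree`. The namespace URIs and the wrapping `<root>` element are placeholders assumed for this example; the slides do not give them.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, chosen only for this illustration.
HTML_NS = "http://example.com/html"
FURNITURE_NS = "http://example.com/furniture"

document = f"""<root xmlns:h="{HTML_NS}" xmlns:f="{FURNITURE_NS}">
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>"""

ns = {"h": HTML_NS, "f": FURNITURE_NS}
root = ET.fromstring(document)

# The same local name "table" resolves to two distinct qualified names.
html_table = root.find("h:table", ns)
furniture_table = root.find("f:table", ns)

cells = [td.text for td in html_table.findall("h:tr/h:td", ns)]
furniture_name = furniture_table.findtext("f:name", namespaces=ns)
print(html_table.tag, cells)
print(furniture_table.tag, furniture_name)
```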
29. Document Type Definition (DTD)
• An XML document with correct syntax is called "Well Formed".
• An XML document validated against a DTD is both "Well
Formed" and "Valid".
30. Document Type Definition (DTD)
• A DTD defines the structure and the legal elements and
attributes of an XML document.
• A DTD can be declared
– Inside an XML document or
– In an external file.
31. Example of DTD inside an XML File
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>
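Python's standard `xml.etree.ElementTree` parses a document with an internal DTD but does not validate against it. To show what the content model `note (to,from,heading,body)` requires, the sketch below hand-codes that one rule; `matches_note_model` is a hypothetical helper written for this example, not a DTD validator.

```python
import xml.etree.ElementTree as ET

# Hand-coded from <!ELEMENT note (to,from,heading,body)>.
EXPECTED_CHILDREN = ["to", "from", "heading", "body"]


def matches_note_model(xml_text: str) -> bool:
    """Check the child order of <note> and that each child holds text only."""
    root = ET.fromstring(xml_text)
    if root.tag != "note":
        return False
    if [child.tag for child in root] != EXPECTED_CHILDREN:
        return False
    # (#PCDATA) children may contain text but no nested elements.
    return all(len(child) == 0 for child in root)


valid_note = """<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>"""

# Well formed, but the <heading> element required by the DTD is missing.
invalid_note = "<note><to>Tove</to><from>Jani</from><body>Hi</body></note>"

print(matches_note_model(valid_note))
print(matches_note_model(invalid_note))
```

The second document parses without error, which illustrates that well-formedness alone does not imply validity.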
32. Building Blocks of XML Documents
• Elements
• Attributes
• Entities
• PCDATA
• CDATA
• Elements are the main building blocks of both XML and HTML
documents.
<body>some text</body>
<message>some text</message>
33. Attributes
• Attributes provide extra information about elements.
<img src="computer.gif" />
• The name of the element is "img".
• The name of the attribute is "src".
• The value of the attribute is "computer.gif".
• Since the element itself is empty it is closed by a " /".
34. Entities
• Some characters have a special meaning in XML,
– Like, the less than sign (<) that defines the start of an XML tag.
Entity Reference Character
&lt; <
&gt; >
&amp; &
&quot; "
&apos; '
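As a sketch of how entity references are used in practice, the Python snippet below escapes text before placing it in an element and lets the parser expand the references again. It uses `xml.sax.saxutils.escape` from the standard library; the sample text is an assumption.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "if salary < 1000 & bonus > 0"

# escape() replaces &, <, and > with their entity references.
escaped = escape(raw)
print(escaped)

# Quotes are only replaced when an extra mapping is supplied.
quoted = escape('say "hi"', {'"': "&quot;"})
print(quoted)

# The parser expands the references back into the original characters.
element = ET.fromstring(f"<message>{escaped}</message>")
print(element.text)
```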
35. PCDATA - Parsed Character DATA
• PCDATA is text that WILL be parsed by a parser.
• Text will be examined by the parser for entities and markup.
CDATA - Character DATA.
• CDATA means character data.
• CDATA is text that will NOT be parsed by a parser.
• Tags inside the text will NOT be treated as markup and
entities will not be expanded.
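One place where unparsed character data appears in a document is a CDATA section (`<![CDATA[ ... ]]>`). The sketch below, using Python's `xml.etree.ElementTree` with illustrative sample strings, contrasts it with ordinary parsed text:

```python
import xml.etree.ElementTree as ET

# Parsed character data: the entity reference is expanded.
parsed = ET.fromstring("<code>a &lt; b</code>")

# CDATA section: markup and entity references are kept as literal text.
literal = ET.fromstring("<code><![CDATA[a &lt; b <b>bold</b>]]></code>")

print(parsed.text)
print(literal.text)
print(len(literal))  # no child elements were created from <b>...</b>
```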
36. Declaring Elements
• General form
<!ELEMENT element-name category>
or
<!ELEMENT element-name (element-content)>
• Empty Elements
<!ELEMENT element-name EMPTY>
Example:
<!ELEMENT br EMPTY>
38. Attribute Types
The attribute-type can be one of the following:
Type Description
CDATA The value is character data
(en1|en2|..) The value must be one from an enumerated list
ID The value is a unique id
IDREF The value is the id of another element
IDREFS The value is a list of other ids
NMTOKEN The value is a valid XML name
NMTOKENS The value is a list of valid XML names
ENTITY The value is an entity
ENTITIES The value is a list of entities
NOTATION The value is a name of a notation
xml: The value is a predefined xml value
39. The attribute-value can be one of the following:
Value Explanation
value The default value of the attribute
#REQUIRED The attribute is required
#IMPLIED The attribute is optional
#FIXED value The attribute value is fixed
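The effect of a default attribute value can be observed even with a non-validating parser. The sketch below uses Python's `xml.parsers.expat`, which by default reports attributes defaulted from the internal DTD subset; the `payment` element and its attributes are assumed for this example and do not come from the slides.

```python
from xml.parsers import expat

document = """<!DOCTYPE payment [
<!ELEMENT payment (#PCDATA)>
<!ATTLIST payment type CDATA "check">
<!ATTLIST payment currency CDATA #IMPLIED>
]>
<payment>100</payment>"""

seen = {}


def on_start(name, attrs):
    # attrs includes attributes supplied from ATTLIST defaults.
    seen[name] = dict(attrs)


parser = expat.ParserCreate()
parser.StartElementHandler = on_start
parser.Parse(document, True)

# "type" is filled in from the DTD; the optional "currency" stays absent.
print(seen["payment"])
```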
41. Why Use a DTD?
• Your application can use a standard DTD
– To verify that the data you receive from the outside world is valid.
• You can also use a DTD to verify your own data.
42. Example of DTD in an external DTD file
If the DTD is declared in an external file, the <!DOCTYPE>
definition must contain a reference to the DTD file:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
The file "note.dtd" contains the DTD:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
43. Example : TV Schedule DTD
<!DOCTYPE TVSCHEDULE [
<!ELEMENT TVSCHEDULE (CHANNEL+)>
<!ELEMENT CHANNEL (BANNER,DAY+)>
<!ELEMENT BANNER (#PCDATA)>
<!ELEMENT DAY (DATE,(HOLIDAY|PROGRAMSLOT+)+)>
<!ELEMENT HOLIDAY (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT PROGRAMSLOT
(TIME,TITLE,DESCRIPTION?)>
<!ELEMENT TIME (#PCDATA)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT DESCRIPTION (#PCDATA)>
<!ATTLIST TVSCHEDULE NAME CDATA #REQUIRED>
<!ATTLIST CHANNEL CHAN CDATA #REQUIRED>
<!ATTLIST PROGRAMSLOT VTR CDATA #IMPLIED>
<!ATTLIST TITLE RATING CDATA #IMPLIED>
<!ATTLIST TITLE LANGUAGE CDATA #IMPLIED>
]>
46. XML Schema
Also referred to as XML Schema Definition (XSD)
An XML Schema describes the structure of an XML
document.
47. What You Should Already Know
• Before you continue, have a basic understanding of the
following:
– HTML
– XML
– A basic understanding of DTD
48. What is an XML Schema?
• An XML Schema
– Describes the structure of an XML document, just like a DTD.
• An XML document with correct syntax is called "Well Formed".
• An XML document validated against an XML Schema is both
"Well Formed" and "Valid".
49. What Does an XML Schema Define?
• An XML Schema:
– Defines elements that can appear in a document
– Defines attributes that can appear in a document
– Defines which elements are child elements
– Defines the order of child elements
– Defines the number of child elements
– Defines whether an element is empty or can include text
– Defines data types for elements and attributes
– Defines default and fixed values for elements and attributes
50. Why Use XML Schemas?
• XML Schemas Support Data Types
• XML Schemas use XML Syntax
• XML Schemas Secure Data Communication
• XML Schemas are Extensible
51. Why Use XML Schemas?
• Well-Formed XML document is not Enough
– Even if documents are well-formed they can still contain errors, and
those errors can have serious consequences.
• Think of the following situation:
– You order 5 gross of laser printers, instead of 5 laser printers.
– With XML Schemas,
• Most of these errors can be caught by your validating software.
52. XSD - The <schema> Element
General Form :
<?xml version="1.0"?>
<xs:schema>
...
...
</xs:schema>
The <schema> element may contain some attributes.
53. Example
A Simple XML Document called "note.xml":
<?xml version="1.0"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
DTD file called "note.dtd"
That defines the elements of the above XML document
<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
54. An XML Schema
• An XML Schema file "note.xsd"
– That defines the elements of the XML document above ("note.xml")
<?xml version="1.0"?>
<xs:schema xmlns:xs = "http://www.w3.org/2001/XMLSchema"
targetNamespace = "http://www.w3schools.com"
xmlns = "http://www.w3schools.com"
elementFormDefault = "qualified">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
The note element is a complex type because it contains other elements.
The other elements (to, from, heading, body) are simple types because
they do not contain other elements.
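Because an XML Schema is itself an XML document, it can be read with an ordinary XML parser. The sketch below uses Python's `xml.etree.ElementTree` to list the elements declared in "note.xsd" and whether each is simple or complex; the `declared` dictionary is an assumption for illustration, and the code only inspects the schema without validating anything.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

schema_text = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>"""

schema = ET.fromstring(schema_text)

declared = {}
for element in schema.iter(f"{{{XS}}}element"):
    # An element with an <xs:complexType> child contains other elements.
    is_complex = element.find(f"{{{XS}}}complexType") is not None
    declared[element.get("name")] = "complex" if is_complex else "simple"

print(declared)
```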
55. • xmlns:xs="http://www.w3.org/2001/XMLSchema"
– Indicates that the elements and data types used in the schema
come from the "http://www.w3.org/2001/XMLSchema" namespace.
• targetNamespace="http://www.w3schools.com"
– Indicates that the elements defined by this schema (note, to, from,
heading, body) come from the "http://www.w3schools.com"
namespace.
56. • xmlns="http://www.w3schools.com"
– Indicates that the default namespace is "http://www.w3schools.com".
• elementFormDefault="qualified"
– Indicates that any elements used by the XML instance document which
were declared in this schema must be namespace qualified.
57. What is a Simple Element?
• A simple element is an XML element that can contain only text
– It cannot contain any other elements or attributes.
• Text can be of many different types.
– It can be one of the types included in XML Schema definition
(boolean, string, date, etc.), or
– It can be a custom type that you can define yourself.
• Syntax for defining a simple element is:
<xs:element name="xxx" type="yyy"/>
58. Most Common Types
• XML Schema has a lot of built-in data types.
• Most common types are:
xs:string
xs:decimal
xs:integer
xs:boolean
xs:date (YYYY-MM-DD)
xs:time (HH:MM:SS)
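To give a feel for what a data type constraint means, the sketch below approximates the lexical checks for these types with Python's own parsers. It is a rough stand-in, not an XSD validator: Python accepts some forms that XML Schema rejects (and the reverse), and the names `PARSERS` and `conforms` are assumptions for this example.

```python
from datetime import date, time
from decimal import Decimal, InvalidOperation


def parse_boolean(text: str) -> bool:
    """xs:boolean allows the literals true, false, 1, and 0."""
    if text in ("true", "1"):
        return True
    if text in ("false", "0"):
        return False
    raise ValueError(f"not a boolean: {text!r}")


# Approximate mapping from schema type names to Python parsers.
PARSERS = {
    "xs:string": str,
    "xs:decimal": Decimal,
    "xs:integer": int,
    "xs:boolean": parse_boolean,
    "xs:date": date.fromisoformat,   # YYYY-MM-DD
    "xs:time": time.fromisoformat,   # HH:MM:SS
}


def conforms(type_name: str, text: str) -> bool:
    """Return True if the text can be parsed as the given type."""
    try:
        PARSERS[type_name](text)
        return True
    except (ValueError, InvalidOperation):
        return False


print(conforms("xs:integer", "5"))
print(conforms("xs:integer", "5 gross"))
print(conforms("xs:date", "2008-01-10"))
print(conforms("xs:date", "10/01/2008"))
```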
59. Fixed Values for Simple Elements
• A fixed value is automatically assigned to the element,
– You cannot specify another value.
• In the following example the fixed value is "red":
<xs:element name="color"
type="xs:string" fixed="red"/>
60. Default Values for Simple Elements
• A default value is automatically assigned to the element
– When no other value is specified.
• In the following example the default value is "red":
<xs:element name="color"
type="xs:string" default="red"/>
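The difference between the two attributes can be summarized in a few lines of Python. This is a hand-written approximation of the behaviour described above, not the output of a schema processor; `effective_value` is a hypothetical helper.

```python
def effective_value(text, default=None, fixed=None):
    """Approximate the value a simple element ends up with."""
    if fixed is not None:
        # A fixed element may be empty or repeat the fixed value, nothing else.
        if text not in (None, "", fixed):
            raise ValueError(f"value must be {fixed!r}, got {text!r}")
        return fixed
    if text in (None, "") and default is not None:
        # The default only applies when no other value is specified.
        return default
    return text


print(effective_value("", default="red"))      # falls back to the default
print(effective_value("blue", default="red"))  # explicit value wins
print(effective_value("", fixed="red"))        # fixed value is assigned
```

Passing a different value together with `fixed="red"` raises a `ValueError`, mirroring the rule that another value cannot be specified.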
61. XSD Attributes
• Simple elements cannot have attributes.
• If an element has attributes,
– It is considered to be of a complex type.
• But the attribute itself is always declared as a simple type.
<xs:attribute name="xxx" type="yyy"/>
xxx is the name of the attribute
yyy specifies the data type of the attribute.
63. XML Document Structure
• An XML document consists of a number of discrete components
• Not all the sections of an XML document may be necessary,
– But their inclusion helps to make for a well-structured XML document
• A well-structured XML document can
– Easily be transported between systems and devices
64. Major portions of an XML document
• The major portions of an XML document include the following:
– The XML declaration
– The Document Type Declaration (DOCTYPE)
– The element data
– The attribute data
– The character data or XML content
65. XML Declaration
• XML Declaration is a definite way of stating exactly
– What the document contains.
• XML document can optionally have an XML declaration
– It must be the first statement of the XML document
• XML declaration is a processing instruction of the form
<?xml ...?>
66. Components of XML Declaration
Component Meaning
<?xml Starts the processing instruction
version="xxx" Describes the specific version of XML being used
standalone="xxx" Defines whether the document depends on
external markup declarations ("yes" means it does not)
encoding="xxx" Indicates the character encoding that the document uses.
The default is UTF-8 (or UTF-16 with a byte order mark), but another encoding can be declared
Example :
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
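The components of a declaration can be read back with a parser. The sketch below uses the `XmlDeclHandler` callback of Python's `xml.parsers.expat`; the one-line sample document and the `declaration` dictionary are assumptions for illustration.

```python
from xml.parsers import expat

declaration = {}


def on_xml_decl(version, encoding, standalone):
    # standalone is 1 for "yes", 0 for "no", and -1 when it is omitted.
    declaration.update(
        version=version, encoding=encoding, standalone=standalone
    )


parser = expat.ParserCreate()
parser.XmlDeclHandler = on_xml_decl
parser.Parse(
    b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><note/>', True
)

print(declaration)
```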
67. Document Type Definition (DTD) and Document Type Declaration (DOCTYPE)
• A DTD defines the structure and the legal elements and
attributes of an XML document.
• An application can use a DTD to verify that XML data is valid.
• If the DTD is declared inside the XML file, it must be wrapped
inside the <!DOCTYPE> definition:
• Document Type Declaration (DOCTYPE) gives a name to the
XML content
• A Document Type Declaration (DOCTYPE)
– Names the document type and
– Identifies the internal content
69. OOA
• OOA emphasizes
– Finding and describing the objects (or concepts) in the problem
domain
• OOA focuses on
– What the system does (its static structure and
behavior),
• OOD emphasizes defining the software objects and focuses on
– How the system does it (its run-time implementation).
70. OOA
• During OOA, the most important purpose is
– To identify objects and describe them in a proper way.
• Objects should be identified with responsibilities.
• Responsibilities are the functions performed by the object.
– Every object has some responsibilities to perform.
• When objects collaborate to carry out these responsibilities, the purpose of the system is
fulfilled.
71. OOD
• Second phase is OOD. During this phase
– Emphasis is given upon the requirements and their fulfillment.
• In this stage,
– The objects are connected according to their intended associations.
• After the association is complete the design is also complete.
• Third phase is object oriented implementation.
– In this phase the design is implemented using object oriented languages like
Java, C++ etc.
72. UML
• The OO design is transformed into UML diagrams according to the
requirements.
• The output of OO analysis and design is the input to the UML diagrams.
• UML is a modeling language used to model software and non-software systems.
• Although UML can be used for non-software systems,
– The emphasis is on modeling object-oriented software applications.
73. Conceptual model of UML
• A conceptual model can be defined as
– A model which is made of concepts and their relationships
• A conceptual model is the first step before drawing a UML diagram.
• It helps to
– Understand the entities in the real world and how they interact with each other.
• Conceptual model of UML can be mastered by learning the 3 major
elements:
1. UML building blocks
2. Rules to connect the building blocks
3. Common mechanisms of UML
74. UML building blocks
• Three building blocks of UML are:
– Things
– Relationships
– Diagrams
76. Object
• Object is a term used to represent a real world entity within
the system being modeled
• An object has its own attributes and operations (methods).
– Consider the real world object “Car”
make = ford
model = escape
year = 2002
color = green
maximum speed = 130 mph
current speed = 50 mph
accelerate ()
decelerate ()
refuel ()
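The "Car" object can be sketched as a Python class whose fields are the attributes and whose methods are the operations. The `fuel_level` field, the method parameters, and the speed-clamping behaviour are assumptions added so the operations have something to act on; the slide only names them.

```python
from dataclasses import dataclass


@dataclass
class Car:
    # Attributes from the slide (speeds in mph).
    make: str
    model: str
    year: int
    color: str
    maximum_speed: int
    current_speed: int = 0
    fuel_level: float = 1.0  # assumed attribute: fraction of a full tank

    def accelerate(self, amount: int) -> None:
        """Increase speed, never exceeding the maximum speed."""
        self.current_speed = min(self.maximum_speed, self.current_speed + amount)

    def decelerate(self, amount: int) -> None:
        """Decrease speed, never going below zero."""
        self.current_speed = max(0, self.current_speed - amount)

    def refuel(self) -> None:
        """Fill the tank completely."""
        self.fuel_level = 1.0


car = Car("ford", "escape", 2002, "green", maximum_speed=130, current_speed=50)
car.accelerate(100)  # capped at the maximum speed
print(car.current_speed)
car.decelerate(200)  # capped at zero
print(car.current_speed)
```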
77. Network
• During the OOA phase, object-modeling techniques are used
– To reflect the structure and behaviour of the system
• A range of models can be created using such objects