STR | Structured Data
Structured data refers to information organised according to well-defined constraints. While historically this largely meant using relational databases, the growth of markup languages, document databases and other 'NoSQL' technologies has led to a range of approaches for specifying more flexible structures. This course covers the two leading technologies for data interchange in this space: XML and JSON. It will benefit those wishing to gain an in-depth understanding of the relative merits of these technologies, and of best practices for their use.
The Extensible Markup Language (XML) is a language designed for the definition of document structures, and the production of structured documents. It can be used to define application-specific representations that are easy to process and transform. JSON (JavaScript Object Notation) is a more lightweight data-interchange format arising from the web programming community. It uses syntax familiar from mainstream programming languages, and is easy both for humans to read and write and for machines to parse and generate. Both approaches facilitate the interchange of information between different systems and components.
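As a rough illustration of the two formats, the sketch below encodes the same small record in XML and in JSON and reads both using only the Python standard library. The record, its field names and the helper functions `from_xml` and `from_json` are invented for this example and are not taken from the course materials.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record, invented purely for illustration.
XML_TEXT = """
<course code="STR">
  <title>Structured Data</title>
  <topic>XML</topic>
  <topic>JSON</topic>
</course>
"""

JSON_TEXT = """
{
  "code": "STR",
  "title": "Structured Data",
  "topics": ["XML", "JSON"]
}
"""


def from_xml(text: str) -> dict:
    """Read the XML form into a plain dictionary."""
    root = ET.fromstring(text.strip())
    return {
        "code": root.get("code"),              # attribute on the root element
        "title": root.findtext("title"),       # text of the first <title> child
        "topics": [t.text for t in root.findall("topic")],  # repeated elements
    }


def from_json(text: str) -> dict:
    """Read the JSON form; it maps directly onto dicts and lists."""
    return json.loads(text)


if __name__ == "__main__":
    # Both encodings carry the same information.
    print(from_xml(XML_TEXT) == from_json(JSON_TEXT))
```

Note the design choice each format forces: in XML the modeller decides between attributes, child elements and repeated elements, whereas JSON maps straight onto objects and arrays.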
Frequency
This course normally runs once a year.
Objectives
By the end of the course, students will understand how data can be structured using XML and JSON. They will be able to:
- select an appropriate approach for different situations;
- design XML vocabularies and JSON structures;
- create schema documents to specify and validate XML documents and JSON data;
- use JSON-LD for representing Linked Data;
- query an XML document using XPath;
- transform an XML description into other language representations, such as HTML or alternative XML representations;
- work with XML databases using XQuery.
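To give a flavour of one of these objectives, querying an XML document using XPath, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. That module supports only a limited subset of XPath, so this is an approximation of what a full XPath processor offers. The `library` document and the helper `titles_in_language` are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented purely for illustration.
LIBRARY = """
<library>
  <book lang="en"><title>First Book</title></book>
  <book lang="fr"><title>Second Book</title></book>
  <book lang="en"><title>Third Book</title></book>
</library>
"""


def titles_in_language(root: ET.Element, lang: str) -> list:
    """Return the titles of all books whose lang attribute equals `lang`.

    The path selects <book> children of the root, filters them with an
    attribute predicate, then steps down to each <title> child.
    """
    path = f"./book[@lang='{lang}']/title"
    return [title.text for title in root.findall(path)]


if __name__ == "__main__":
    root = ET.fromstring(LIBRARY.strip())
    print(titles_in_language(root, "en"))
```

Results are returned in document order, so the English titles appear in the order in which they occur in the source.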
Contents
- Introduction
  - Motivation for and history of XML & JSON
- Data modelling
  - Representing data in XML & JSON; common vocabularies
- Validating XML
  - Defining the structure and content of a document; a type system for XML; contrasting XML Schema and RELAX NG
- Describing JSON structures
  - JSON Schema; JSON-LD; comparing XML & JSON validation approaches
- XPath
  - Locating content within an XML document; computation
- XSLT
  - Transforming XML documents for presentation and for processing; functional programming in XSLT
- XQuery
  - Querying and updating XML databases
- Structured data in context
  - Programming language support & libraries for processing XML & JSON; advanced validation with Schematron; other related standards
Each topic will be introduced with a lecture explaining the key concepts, followed by small practical exercises that let students get to grips with the topic. An extended case study running throughout the week will allow students to see how these technologies operate in a more realistic scenario. More than 50% of the week will be spent on practical work.
Requirements
There are no particular requirements for this course.