Chapter 2 - Data and Data Processing

Chapter 2 Part 3 Data formats 







Data format - File format

Data format

Describes the structure and organization of the data, regardless of how it is stored or transmitted. It determines how the data is to be interpreted.

Examples:

  • JSON: A text-based data structure for data exchange.
  • CSV: Table structure with values separated by commas.
  • XML: Hierarchical structure with tags to describe data.

Properties:

  • Can be used both in files and directly in data transfer protocols.
  • Independent of the physical storage location of the data.
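The point that a data format is independent of any file can be illustrated with a minimal sketch. The sample records are invented for illustration. The same data is expressed once as JSON and once as CSV, entirely in memory, using only Python's standard library.

```python
import csv
import io
import json

# Hypothetical sample records, invented for this illustration.
records = [
    {"name": "Anna", "age": 31},
    {"name": "Ben", "age": 27},
]

# JSON: the structure is produced as a plain string, no file is involved.
json_text = json.dumps(records)

# CSV: the same records as comma-separated rows in an in-memory buffer.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "age"], lineterminator="\n")
writer.writeheader()
writer.writerows(records)
csv_text = buffer.getvalue()

# Interpreting the JSON text again restores the original structure.
restored = json.loads(json_text)

print(json_text)
print(csv_text)
```

Either string could be written to a file or sent over a network connection. The format only determines how the receiver has to interpret it.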

File format

Describes how data is stored in a file and specifies the programs or systems with which this file can be opened or processed.

Examples:

  • .docx: File format for Microsoft Word documents.
  • .jpg: Format for image files.
  • .mp4: File format for video.

Properties:

  • Identified by the file extension (e.g. .txt, .pdf) or by specific header information within the file.
  • Focuses on the storage of data on a storage medium.
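Identification by header information can be sketched as follows. The function name `guess_file_format` and the selection of signatures are assumptions made for this example; real tools know many more formats.

```python
# Leading bytes ("magic numbers") of a few well-known file formats.
SIGNATURES = {
    b"\xff\xd8\xff": "JPEG image",
    b"%PDF-": "PDF document",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"PK\x03\x04": "ZIP container (also used by .docx)",
}


def guess_file_format(header: bytes) -> str:
    """Return a format name based on the first bytes of a file (hypothetical helper)."""
    for signature, name in SIGNATURES.items():
        if header.startswith(signature):
            return name
    return "unknown"


print(guess_file_format(b"%PDF-1.7\n"))
```

For a real file, the header could be obtained by opening the file in binary mode and reading its first eight bytes. Unlike the file extension, the header does not change when the file is renamed.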

Data format and data processing

Within data processing, the term data format defines how data is structured and displayed and how it is to be interpreted during processing.

Examples:

  • Characters and digits
  • Numbers in a wide variety of formats
  • Logical values (true or false)

For data fields, the data format specifies:

  • the length of the data field,
  • the number of decimal places,
  • the type of presentation,
  • which values a field can assume,
  • and other specific information.
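The field properties listed above can be sketched as a small validation routine. The names `FieldSpec`, `is_valid` and `price_field` and the chosen limits are hypothetical; this is one possible way to check a value against a field definition, not a standard API.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation


@dataclass
class FieldSpec:
    """Hypothetical description of one numeric data field."""
    max_length: int       # maximum number of characters in the field
    decimal_places: int   # permitted digits after the decimal point
    min_value: Decimal    # smallest value the field can assume
    max_value: Decimal    # largest value the field can assume


def is_valid(raw: str, spec: FieldSpec) -> bool:
    """Check whether the text `raw` fits the field specification."""
    if len(raw) > spec.max_length:
        return False
    try:
        value = Decimal(raw)
    except InvalidOperation:
        return False
    if not value.is_finite():
        return False
    # A negative exponent gives the number of decimal places in the input.
    places = max(0, -value.as_tuple().exponent)
    if places > spec.decimal_places:
        return False
    return spec.min_value <= value <= spec.max_value


# Assumed example: a price field with 7 characters and 2 decimal places.
price_field = FieldSpec(7, 2, Decimal("0"), Decimal("9999.99"))

print(is_valid("12.50", price_field))
print(is_valid("12.505", price_field))
```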


Digit as a number: 1 has a numerical value. It is the smallest natural number and the basis for many mathematical operations. For example: 1+1=2.

Digit as a character: In texts and documents, the 1 can be used as a character to represent information without having a numerical value. For example, in a telephone number: "09913615199" or in a password: "abs1xr!y".
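The difference between the two interpretations shows up as soon as an operation is applied. A minimal sketch, using the telephone number from the example above:

```python
number = 1        # digit as a number
character = "1"   # digit as a character

print(number + number)        # arithmetic: 2
print(character + character)  # concatenation: 11

# A telephone number is a sequence of characters, not a numerical value.
phone = "09913615199"
as_number = int(phone)
print(as_number)  # the leading zero is lost: 9913615199
```

This is why telephone numbers, postal codes and similar identifiers are usually stored as characters even though they consist of digits.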

Numbers and their representation in computer systems

Number systems have developed historically. Roman numeral writing, for example, is an addition system.

The decimal system, a place value system in which 10 digits are assigned a value via their position within a number, is now common worldwide. The place values of the individual digits are 10^0, 10^1, 10^2, 10^3, … or 1, 10, 100, 1000, ...

In the computer world, the dual or binary system is mainly used. Here, only 2 digits (0 and 1) are used, resulting in the place values 2^0, 2^1, 2^2, 2^3, 2^4, … or 1, 2, 4, 8, 16, ...

For simplicity, the octal system (digits 0-7) or the hexadecimal system (digits 0-9 and A-F) is also used in some cases.
The binary number 01011, for example, corresponds to the decimal value 11 (place values read from right to left: 1+2+0+8+0=11).

The dual system in computer technology is based on the principle of circuits that can be off (0) or on (1).
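As a brief illustration, Python's built-in functions can show the same value in the number systems mentioned above. The variable names are chosen for this example only.

```python
# Place values of the first five binary positions.
place_values = [2 ** i for i in range(5)]
print(place_values)  # [1, 2, 4, 8, 16]

# Interpret the digit sequence 01011 as a binary number.
value = int("01011", 2)
print(value)  # 11

# The same value written in other number systems.
binary_text = format(value, "b")  # dual / binary
octal_text = format(value, "o")   # octal
hex_text = format(value, "X")     # hexadecimal
print(binary_text, octal_text, hex_text)  # 1011 13 B
```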

Working with dual values

  • Decimal system
    1011 = 1*10^3+0*10^2+1*10^1+1*10^0 = 1000+0+10+1 = 1011
  • Dual system
    1011 = 1*2^3+0*2^2+1*2^1+1*2^0 = 8+0+2+1 = 11 (decimal)
  • Exercise: What is decimal 123 in the dual system?
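The place value expansion and the reverse direction (decimal to dual) can be sketched in a few lines. The function names `positional_value` and `to_binary` are hypothetical. The conversion uses repeated division by 2, which is one common method among several, and assumes a non-negative integer.

```python
def positional_value(digits: str, base: int) -> int:
    """Sum digit * base^position, counting positions from the right."""
    total = 0
    for position, digit in enumerate(reversed(digits)):
        total += int(digit, base) * base ** position
    return total


def to_binary(n: int) -> str:
    """Convert a non-negative integer to dual notation by repeated division by 2."""
    if n < 0:
        raise ValueError("only non-negative integers are supported in this sketch")
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        bits.append(str(n % 2))  # the remainder is the next digit from the right
        n //= 2
    return "".join(reversed(bits))


print(positional_value("1011", 10))  # 1011
print(positional_value("1011", 2))   # 11
print(to_binary(123))                # 1111011
```

The last result can be checked by hand: 64+32+16+8+0+2+1 = 123.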

Screenshot of Office programs


Characters and their representation in computer systems

  • Like numbers, characters in computer systems are represented in dual form. A character code such as ASCII or EBCDIC is used for this.
  • Classifying a value as a character or character string (STRING) means that the corresponding dual code is interpreted not as a number but as a character.
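A short sketch shows how the same bit pattern can be read either as a number or as a character. The codec `cp037` is used here as one EBCDIC variant available in Python's standard library; the choice of this particular variant is an assumption for illustration.

```python
char = "A"

# ASCII: the character A has the code 65, written in 8 binary digits.
code = ord(char)
bits = format(code, "08b")
print(code, bits)  # 65 01000001

# The same character has a different code in EBCDIC (variant cp037).
ascii_byte = char.encode("ascii")
ebcdic_byte = char.encode("cp037")
print(ascii_byte.hex(), ebcdic_byte.hex())  # 41 c1

# One bit pattern, two interpretations.
pattern = "00110001"
as_number = int(pattern, 2)  # interpreted as a number
as_char = chr(as_number)     # interpreted as an ASCII character
print(as_number, as_char)    # 49 1
```

The bit pattern alone does not say which reading is intended. That information comes from the data format assigned to the field.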

Screenshot of Office programs




