Chapter 2 - Data and Data Processing

Part 1: Concepts of data, information and knowledge

Structure of the knowledge pyramid


The concept of data

The term data refers to facts or values that are available in raw or processed form and serve as the basis for decisions, calculations or analyses. Data can take the form of numbers, text, images, audio files or other formats. Placed in context, it becomes information, which in turn is the building block of knowledge.

Characteristics of data

  • Neutrality: Data on its own is neutral; it carries no interpretation or context.
    • Example: "25" is a data value that has no meaning without context.
  • Formats: Data can be available in various forms, e.g. as numbers, letters, symbols, images, videos or sounds.
  • Use: Data is the basis for analyses, calculations, reports and decisions.
  • Storage: Data can be stored digitally (e.g. in databases, files) or in analog form (e.g. on paper).

Data can be divided into different categories. These classifications help to understand, organize and use data in a targeted manner.
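
The neutrality characteristic above (the value "25" has no meaning without context) can be illustrated with a minimal Python sketch. The `Measurement` class and its field names are hypothetical and only serve this illustration; they are not part of the chapter.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """A raw value plus the context that turns it into information."""
    value: float   # the raw data value
    quantity: str  # what was measured (context)
    unit: str      # unit of the value (context)

    def describe(self) -> str:
        return f"{self.quantity}: {self.value} {self.unit}"


raw_value = 25  # on its own, this is just a number

# The same raw value means different things in different contexts.
age = Measurement(raw_value, "Age", "years")
temperature = Measurement(raw_value, "Temperature", "°C")

print(age.describe())
print(temperature.describe())
```

The raw value is identical in both cases; only the attached context decides whether it is read as an age or as a temperature.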

Data classification

By structure

Structured data

  • Data that follows a fixed format or schema, e.g. tables with columns and rows.
  • Examples:
    • Tables in a database
    • Excel tables
    • Sensor values (e.g. temperature: 22°C)

Unstructured data

  • Data without a standardized format or schema.
  • Examples:
    • Texts in documents
    • Pictures, videos, audio files
    • Emails

Semi-structured data

  • Data with a partially fixed structure, often organized by tags or key-value pairs.
  • Examples:
    • XML or JSON files
    • Log files
    • HTML pages
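
As a rough illustration of the three structure classes, the following Python sketch reads the same temperature value from a structured, a semi-structured and an unstructured source. The sample strings, field names (`sensor_id`, `temperature_c`) and the regular expression are assumptions made up for this example.

```python
import csv
import io
import json
import re

# Structured: fixed columns, every row follows the same schema.
structured = "sensor_id,temperature_c\nS1,22.0\n"
rows = list(csv.DictReader(io.StringIO(structured)))
temp_structured = float(rows[0]["temperature_c"])

# Semi-structured: key-value pairs, nesting may vary between records.
semi_structured = '{"sensor": {"id": "S1", "readings": [{"temperature_c": 22.0}]}}'
doc = json.loads(semi_structured)
temp_semi = doc["sensor"]["readings"][0]["temperature_c"]

# Unstructured: free text, the value has to be searched for.
unstructured = "Sensor S1 reported about 22.0 °C this morning."
match = re.search(r"(-?\d+(?:\.\d+)?)\s*°C", unstructured)
temp_unstructured = float(match.group(1)) if match else None

print(temp_structured, temp_semi, temp_unstructured)
```

The less structure the source has, the more effort and guesswork is needed to extract the value: a column lookup for the table, a path through nested keys for JSON, and pattern matching for free text.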



By origin

Primary data

  • Data collected directly from a source or observation.
  • Examples:
    • Results of surveys
    • Raw data from sensors
    • Sales data of a POS system

Secondary data

  • Data that has already been collected or processed by someone else and is taken from existing sources.
  • Examples:
    • Reports based on primary data
    • Data from public statistics
    • Summarized research results




By representation

Numerical data

  • Data represented by numbers.
  • Examples:
    • Age (e.g. 25 years)
    • Temperature (e.g. 22.5°C)

Categorical data

  • Data that can be divided into groups or categories.
  • Examples:
    • Gender (male, female, diverse)
    • Colors (red, blue, green)

Textual data

  • Data in the form of written language, including transcribed speech.
  • Examples:
    • Blog posts
    • Customer ratings
    • Books

Multimedia data

  • Data in the form of images, videos or audio files.
  • Examples:
    • Photos
    • Movies
    • Music files
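
The representation of data determines which operations make sense on it. The sketch below, using made-up sample values, shows one typical operation each for numerical, categorical and textual data; it is an illustration, not an exhaustive treatment.

```python
from collections import Counter
from statistics import mean

ages = [25, 31, 25, 40]                    # numerical data
colors = ["red", "blue", "red", "green"]   # categorical data
review = "Fast delivery, great product"    # textual data

# Numerical data supports arithmetic, e.g. an average.
average_age = mean(ages)

# Categorical data is counted per category rather than averaged.
color_counts = Counter(colors)
most_common_color = color_counts.most_common(1)[0][0]

# Textual data usually has to be split or parsed before analysis.
word_count = len(review.split())

print(average_age, most_common_color, word_count)
```

Averaging the colors would be meaningless, while counting distinct ages is possible but rarely the main interest; choosing the operation starts with knowing the kind of data.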




By access method

Static data

  • Data that does not change or rarely changes.
  • Examples:
    • Archive data
    • Legal texts

Dynamic data

  • Data that changes frequently or is updated in real time.
  • Examples:
    • Share prices
    • Traffic data
    • Weather forecasts




By use

Transactional data

  • Data that is used in operational processes.
  • Examples:
    • Orders in an online store
    • Bank transactions

Analytical data

  • Data used for reports and analyses.
  • Examples:
    • Turnover statistics
    • Market analyses
    • User behavior
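
Analytical data is often derived from transactional data by aggregation. The following sketch, with invented order records and hypothetical field names (`order_id`, `date`, `amount`), sums individual orders into a daily turnover statistic.

```python
from collections import defaultdict

# Transactional data: one record per order (sample values are made up).
orders = [
    {"order_id": 1, "date": "2024-01-01", "amount": 20.0},
    {"order_id": 2, "date": "2024-01-01", "amount": 35.5},
    {"order_id": 3, "date": "2024-01-02", "amount": 10.0},
]

# Analytical data: turnover aggregated per day.
turnover_by_day = defaultdict(float)
for order in orders:
    turnover_by_day[order["date"]] += order["amount"]

for day, turnover in sorted(turnover_by_day.items()):
    print(day, turnover)
```

The individual orders are needed to run the shop; the aggregated figures are what a report or market analysis would typically work with.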




By sensitivity

Public data

  • Data that is accessible to everyone.
  • Examples:
    • Weather data
    • Wikipedia content

Confidential data

  • Data with restricted access.
  • Examples:
    • Company reports
    • Customer lists

Personal data

  • Data relating to individuals.
  • Examples:
    • Name, address, telephone number
    • Health data

Sensitive data

  • Data that requires special protection.
  • Examples:
    • Passwords
    • Bank details
    • Military secrets




By storage form

Local data

  • Data stored on a single device or server.
  • Examples:
    • Files on a hard disk
    • Local database

Cloud data

  • Data stored in a cloud environment.
  • Examples:
    • Google Drive documents
    • Data in Amazon Web Services (AWS)




