신생공 NEWBORN SPACE

PROJECT OVERVIEW

Rapid advances in media technology have blurred the boundary between physical reality and virtual space, fundamentally reshaping how we experience and enjoy space. People today, accustomed to the infinite variations of digital space, demand new stimuli and experiences even within physical environments that are essentially fixed. Fixed physical forms, however, struggle to keep pace with how quickly sensory experience now changes. Against this backdrop, 'Newborn' seeks sustainable values that allow physical space to be enjoyed continuously.

Project 'Newborn Space' is an experiment that anticipates and implements future cultural patterns through contemporary technology. It attempts to give a new form, one suited to current and future technological environments, to things that technology would otherwise cause to disappear, so that they can continue to exist. In this context, 'Newborn Space' moves away from vision, which has long dominated spatial perception, and brings the marginalized sense of hearing to the center. While vision clearly separates and defines objects, sound flows without boundaries, permeating and filling the gaps of a space as an invisible medium. Auditory information, which we perceive but often do not consciously register, indirectly conveys information about a space. This project brings these auditory layers to the forefront and attempts a 'Spatial Upcycling' that endows familiar physical spaces with invisible value. Instead of physically reconstructing a location, it collects and reinterprets the sound information inherent in it, overlaying layers of invisible experience onto the space. This is a paradigm shift that subverts entrenched ocularcentrism and allows space to be sensed anew through audible vibrations rather than visible forms.

'Newborn Space' is not a fixed entity but an organic landscape that continually forms and dissolves in response to sound. An artificial intelligence trained on 360-degree audiovisual data collected from various locations constructs a virtual space from input sound alone, with all visual information removed. The AI in the work is more than a computing device; it acts as a 'Synesthetic Narrator' that senses forests, oceans, and unknown spaces within the collected data. As cold urban noise is rendered as images of dense forest and the flowing sounds of nature turn into dry mechanical forms, the audience experiences an intense synesthetic expansion. Paradoxically, this allows people accustomed to the infinite variations of digital space to experience the most immersive form of 'the Real' within a physical space.

Through a landscape of sound that rejects fixed forms and flows endlessly, 'Newborn Space' leads the audience to perceive familiar everyday spaces as unfamiliar and to rediscover the possibilities inherent within them. Standing before a space that hearing interprets and AI recreates, beyond the physical reality defined by vision, the audience comes to face the other side of the physical world we stand on. This is an artistic practice that goes beyond the aesthetic appreciation of space: it proposes a new existential meaning for physical space in response to the expanding digital realm and fundamentally rethinks how humans relate to space.

TECHNICAL PROCESS

'Newborn Space' transforms the auditory information inherent in physical space into visual space using artificial intelligence. The AI, trained on 360-degree audiovisual data collected from various locations, learns correlations between sound and image and generates a virtual space from input sound. The generated space is delivered to the audience through various media, forming a fluid landscape that changes constantly in reaction to ambient sound.

The project is versioned systematically according to data collection method, model structure, and input audio characteristics. Each version number has three parts: Major (structural transitions), Minor (gradual improvements), and Patch (detailed modifications). As data accumulates and the model improves through repeated training, 'Newborn Space' takes on a more concrete form, and each version records this process of technical evolution.
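
The Major, Minor, and Patch scheme can be illustrated with a minimal Python sketch. It assumes that labels such as '2.1.x' use 'x' as a wildcard for "any value"; the names Version, parse_version, and matches are hypothetical and not part of the project.

```python
from typing import NamedTuple, Optional


class Version(NamedTuple):
    major: int
    minor: Optional[int]  # None stands for a wildcard ("x" or omitted)
    patch: Optional[int]


def parse_version(label: str) -> Version:
    """Parse labels such as '1.x', '2.0.x', or '2.1.3'."""
    parts = label.strip().split(".")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unexpected version label: {label!r}")
    fields = [None if p.lower() == "x" else int(p) for p in parts]
    fields += [None] * (3 - len(fields))  # pad omitted parts as wildcards
    if fields[0] is None:
        raise ValueError("major version must be a number")
    return Version(*fields)


def matches(label: str, concrete: str) -> bool:
    """True if a concrete version such as '2.1.3' falls under a label such as '2.1.x'."""
    pattern, version = parse_version(label), parse_version(concrete)
    return all(p is None or p == v for p, v in zip(pattern, version))
```

For example, matches("2.1.x", "2.1.3") is true, while matches("2.0.x", "2.1.3") is false because the Minor part differs.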

2023

Version 1.x

Data Collection

  • 4x Action Cameras (Panoramic Setup)
  • Ambisonic Audio Recorder

AI Model

  • Base : pix2pix
  • Dataset : Small Scale

In the project's initial iteration, audiovisual data was collected using four action cameras arranged in a panoramic configuration alongside an ambisonic spatial audio recorder. A pix2pix-based model was trained on a limited dataset to test the feasibility of sound-to-image translation.

2024

Version 2.0.x

Data Collection

  • 360° Camera
  • Ambisonic Audio Recorder

AI Model

  • Base : pix2pix
  • Dataset : Expanded

Using a 360-degree camera made it possible to capture spatial data in which image and sound are more seamlessly integrated. This version was trained on a significantly larger dataset than the previous iteration.

2025 - Now

Version 2.1.x

Data Collection

  • 360° Camera
  • Ambisonic Audio Recorder

AI Model

  • Base : Modified pix2pix
  • Dataset : Expanded

The pix2pix model was adapted for training on 360-degree equirectangular images, and the input data format was fundamentally restructured. By converting audio Mel spectrograms into the equirectangular format for training, this version establishes a method that maps the auditory characteristics of sound directly onto spatial information.

In Development

Version 3.x

Focus

  • Spatial Dimensionality

Approach

  • 3D Scanning & 2D to 3D Conversion

With spatial dimensionality as the core objective, future work will explore ways to convey the sound-generated 'Newborn Space' with three-dimensional depth and volumetric presence.

CURRENT WORKFLOW

01. DATA COLLECTION

Audiovisual data for AI training is collected using 360-degree cameras and spatial audio recorders.

TOTAL LOCATIONS: 194
REGIONS: KR, JP, FR, DE, NL

02. DATA PRE-PROCESS

The collected 360-degree Ambisonic audio is separated by direction, converted into Mel spectrograms, and processed into tensor formats suited to AI training. Spherical Coordinate Mapping is applied to preserve spatial acoustic information. This design lets the AI learn the correspondence between sound characteristics across the entire frequency spectrum and their specific visual locations and elements.

[Figure: before/after comparison; frequency axis from 20 Hz to 20 kHz]

Current Approach (v2.1.x)

Ambisonics Processing

  • Ambisonics A-format → B-format Conversion
  • Audio Normalization
  • Channel Alignment
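
The conversion and normalization steps above can be sketched as follows. This is a minimal illustration and not the project's code: it assumes a tetrahedral microphone with capsules ordered front-left-up, front-right-down, back-left-down, back-right-up, uses the textbook sum-and-difference matrix with an assumed 0.5 scaling, and omits the capsule-spacing equalization filters that real converters apply. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Signal = List[float]


def a_to_b_format(flu: Sequence[float], frd: Sequence[float],
                  bld: Sequence[float], bru: Sequence[float]
                  ) -> Tuple[Signal, Signal, Signal, Signal]:
    """Convert four A-format capsule signals to first-order B-format (W, X, Y, Z)."""
    w, x, y, z = [], [], [], []
    for a, b, c, d in zip(flu, frd, bld, bru):
        w.append(0.5 * (a + b + c + d))  # omnidirectional pressure
        x.append(0.5 * (a + b - c - d))  # front minus back
        y.append(0.5 * (a - b + c - d))  # left minus right
        z.append(0.5 * (a - b - c + d))  # up minus down
    return w, x, y, z


def normalize_peak(channels: Sequence[Sequence[float]],
                   target_peak: float = 0.9) -> List[Signal]:
    """Scale all channels by one shared gain so level ratios between channels survive."""
    peak = max((abs(s) for ch in channels for s in ch), default=0.0)
    if peak == 0.0:
        return [list(ch) for ch in channels]
    gain = target_peak / peak
    return [[s * gain for s in ch] for ch in channels]
```

A shared gain is used rather than per-channel normalization because the relative levels of W, X, Y, and Z encode the direction of arrival.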

Directional Audio Extraction

  • Extract Audio by Direction (0°–360°)
  • Spatial Audio Decomposition
  • Generate Directional Audio Signal
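
One common way to obtain a directional signal from first-order B-format is a virtual microphone steered toward each direction. The sketch below is an assumption about how such an extraction could look, not a description of the project's decoder. It assumes W carries no extra -3 dB scaling (as in the SN3D convention), azimuth measured counterclockwise from the front, and a 45° step; steer and extract_directions are hypothetical names.

```python
import math
from typing import Dict, List, Sequence


def steer(w: Sequence[float], x: Sequence[float], y: Sequence[float],
          z: Sequence[float], azimuth_deg: float,
          elevation_deg: float = 0.0, pattern: float = 0.5) -> List[float]:
    """Virtual microphone pointing at (azimuth, elevation).

    pattern = 0.5 gives a cardioid, 1.0 an omni, 0.0 a figure-of-eight.
    """
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    dx = math.cos(az) * math.cos(el)
    dy = math.sin(az) * math.cos(el)
    dz = math.sin(el)
    return [pattern * wi + (1.0 - pattern) * (dx * xi + dy * yi + dz * zi)
            for wi, xi, yi, zi in zip(w, x, y, z)]


def extract_directions(w, x, y, z, step_deg: int = 45) -> Dict[int, List[float]]:
    """One horizontal virtual-microphone signal per azimuth from 0 to 360 degrees."""
    return {az: steer(w, x, y, z, az) for az in range(0, 360, step_deg)}
```

With a cardioid pattern, a source arriving from 90° is passed at full level by the microphone steered to 90° and cancelled by the one steered to 270°.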

Mel-Spectrogram Conversion

  • Time-Frequency Transformation per Direction
  • Mel-scale Frequency Mapping
  • Generated Spectrograms for Each Direction
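
The Mel-scale mapping step can be sketched with the widely used formula mel = 2595 · log10(1 + f / 700). The sketch assumes the power spectrum of each frame has already been computed by a short-time Fourier transform, and takes the 20 Hz to 20 kHz range from the frequency axis shown in this section. The filter count, FFT size, and sample rate are illustrative assumptions, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def hz_to_mel(f: float) -> float:
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m: float) -> float:
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_points(n_mels: int, f_min: float = 20.0, f_max: float = 20000.0) -> List[float]:
    """n_mels + 2 band edges in Hz, evenly spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    return [mel_to_hz(lo + (hi - lo) * i / (n_mels + 1)) for i in range(n_mels + 2)]


def mel_filterbank(n_mels: int, n_fft: int, sample_rate: int,
                   f_min: float = 20.0, f_max: float = 20000.0) -> List[List[float]]:
    """Triangular filters with shape n_mels x (n_fft // 2 + 1)."""
    edges = mel_points(n_mels, f_min, f_max)
    bin_freqs = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(n_mels):
        left, center, right = edges[m], edges[m + 1], edges[m + 2]
        row = []
        for f in bin_freqs:
            if left < f <= center:
                row.append((f - left) / (center - left))    # rising slope
            elif center < f < right:
                row.append((right - f) / (right - center))  # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def apply_filterbank(power_frame: Sequence[float],
                     bank: Sequence[Sequence[float]]) -> List[float]:
    """Project one power-spectrum frame onto the mel bands."""
    return [sum(wgt * p for wgt, p in zip(row, power_frame)) for row in bank]
```

Applying apply_filterbank to every frame of every directional signal yields one Mel spectrogram per direction.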

Spherical Coordinate Mapping

  • Map audio energy to spherical grid
  • Coordinates: (θ: Azimuth, φ: Elevation)
  • Energy distribution per direction
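
A minimal sketch of mapping directional energy onto a spherical grid laid out as an equirectangular image. The layout is an assumption: azimuth 0° to 360° runs left to right and elevation +90° to -90° runs top to bottom. The grid size and the names direction_to_pixel and energy_map are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def direction_to_pixel(azimuth_deg: float, elevation_deg: float,
                       width: int, height: int) -> Tuple[int, int]:
    """Map (azimuth θ, elevation φ) to (column, row) of an equirectangular grid."""
    col = int((azimuth_deg % 360.0) / 360.0 * width)
    row = int((90.0 - elevation_deg) / 180.0 * height)
    # Clamp so that the bottom pole and rounding at the seam stay inside the grid.
    return min(max(col, 0), width - 1), min(max(row, 0), height - 1)


def energy_map(directional_signals: Dict[Tuple[float, float], Sequence[float]],
               width: int = 8, height: int = 4) -> List[List[float]]:
    """Accumulate the mean-square energy of each directional signal into its grid cell."""
    grid = [[0.0] * width for _ in range(height)]
    for (az, el), samples in directional_signals.items():
        energy = sum(s * s for s in samples) / len(samples) if samples else 0.0
        col, row = direction_to_pixel(az, el, width, height)
        grid[row][col] += energy
    return grid
```

For example, a signal keyed by (180, 0) lands in the middle column on the horizon row of the grid.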

Frequency Layer Stacking

  • Divide into multiple frequency bands
  • Stack as multi-layer tensor
  • Each layer = Energy at one frequency band
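
The stacking step can be sketched as grouping Mel bins into a small number of bands and writing each band's energy into its own layer. The equal-size grouping rule and the tensor[band][row][col] layout are assumptions made for illustration; split_into_bands and stack_layers are hypothetical names.

```python
from typing import Dict, List, Sequence, Tuple


def split_into_bands(mel_energies: Sequence[float], n_bands: int) -> List[float]:
    """Sum consecutive mel bins into n_bands groups of near-equal size."""
    n = len(mel_energies)
    if n_bands < 1 or n < n_bands:
        raise ValueError("need at least one mel bin per band")
    bounds = [round(i * n / n_bands) for i in range(n_bands + 1)]
    return [sum(mel_energies[bounds[i]:bounds[i + 1]]) for i in range(n_bands)]


def stack_layers(mel_by_cell: Dict[Tuple[int, int], Sequence[float]],
                 height: int, width: int, n_bands: int) -> List[List[List[float]]]:
    """Build tensor[band][row][col]; each layer holds the energy of one frequency band."""
    tensor = [[[0.0] * width for _ in range(height)] for _ in range(n_bands)]
    for (row, col), mel_energies in mel_by_cell.items():
        for band, energy in enumerate(split_into_bands(mel_energies, n_bands)):
            tensor[band][row][col] = energy
    return tensor
```

With six Mel bins [1, 2, 3, 4, 5, 6] and three bands, a cell receives the energies 3, 7, and 11 in layers 0, 1, and 2.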

03. AI TRAINING

During the training process, the AI iteratively learns from preprocessed audio-image pairs to identify correlations between auditory and visual information. The model establishes a translation system that interprets frequency distribution, spatial directionality, and temporal changes as spatial form, color, and texture. This methodology is continuously refined through improvements in data structures and model architecture.
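
For reference, the objective of the original pix2pix method, on which the project's model is based, combines an adversarial term with an L1 reconstruction term weighted by λ. The sketch below uses the defaults from the pix2pix paper (λ = 100, discriminator loss halved); whether the modified model keeps these settings is not stated, so treat them as assumptions. Images are represented as flat lists of pixel values, and all names are hypothetical.

```python
import math
from typing import Sequence


def bce(prob: float, target: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy for one discriminator output in (0, 1)."""
    p = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def l1_distance(generated: Sequence[float], real: Sequence[float]) -> float:
    """Mean absolute difference between generated and real pixels."""
    return sum(abs(g - r) for g, r in zip(generated, real)) / len(real)


def generator_loss(disc_prob_on_fake: float, generated: Sequence[float],
                   real: Sequence[float], lambda_l1: float = 100.0) -> float:
    """The generator tries to fool the discriminator while staying close to the real image."""
    return bce(disc_prob_on_fake, 1.0) + lambda_l1 * l1_distance(generated, real)


def discriminator_loss(disc_prob_on_real: float, disc_prob_on_fake: float) -> float:
    """The discriminator labels real pairs as 1 and generated pairs as 0."""
    return 0.5 * (bce(disc_prob_on_real, 1.0) + bce(disc_prob_on_fake, 0.0))
```

In this setting the conditioning input would be the preprocessed audio tensor and the target the corresponding 360-degree image; the L1 term keeps the output close to the recorded scene, while the adversarial term pushes it toward realistic texture.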

[Figure: before/after comparison]

ARTWORKS