by Simone C Niquille / Technoflesh
Format: HD Video, 3D Animation
Duration: 12 min 45 sec
Audio: 4 Channel Surround Sound
Commissioned by Housing the Human with support from Swissnex San Francisco and Fotomuseum Winterthur.
The limits of my language mean the limits of my world.
The limits of my categories mean the limits of my world.
The limits of my data mean the limits of my world.
And I am left with a million questions.
Language Fails Me.
Household robots rely on computer vision to navigate their domestic environment. However, a camera alone does not know what it is looking at. In order to recognise and understand the spaces, objects and things that it encounters, the robot needs to learn about its future environment. For this purpose, large datasets of 3D files are virtually assembled into model homes. These training datasets pose a set of challenges: Where to find representative data on tens of thousands of homes, people's most private spaces? How to sort objects and spaces into categories: is this a vase or a bowl or a cup? Each category asks for a different behaviour and context, information that is largely banal to the human being but incredibly complex to teach a robot. Limited to the available data, a training dataset will hardly be able to represent all of ‘reality’ and will ultimately reduce or omit everything that is not recorded. This confusion between models of reality and reality itself has been addressed by scientist and philosopher Alfred Korzybski as ‘the map is not the territory’. If the map is not the territory, is the database the home? If the database is not the home, what then are the architectural and bodily consequences of cohabiting with computer vision?
Set within a scenography assembled from the contents of one of the largest training datasets, SceneNet RGB-D, the short film HOMESCHOOL makes visible the training data that is usually sealed within the technology. This virtual environment is explored by an unknown first-person narrator, learning by seeing, struggling to understand, dwelling in ambiguities.
Visually, the film explores computational vision. The images rendered in a grey or orange gradient use what is referred to as Z-depth in three-dimensional imaging. Z-depth is the rendering of distance in virtual space, graphically represented by a receding gradient. To the human eye the gradient reads as an effect; to the computational eye it contains vital distance data. These images are generally not visible: they are processed internally by computer vision mechanisms to facilitate orientation.
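As a rough illustration of how distance can become a gradient, the sketch below maps a small grid of distances to 8-bit grey values. It is not the film's render setup: the function names, the `near` and `far` limits and the choice to draw near surfaces bright and far surfaces dark are all assumptions made for this example.

```python
def depth_to_grey(depth, near, far):
    """Map one distance to a grey value from 0 to 255.

    Assumed convention for this sketch: near is bright, far is dark.
    """
    if far <= near:
        raise ValueError("far must be greater than near")
    # Clamp the distance so values outside the range do not overflow.
    clamped = min(max(depth, near), far)
    # Normalise to 0.0 (at near) .. 1.0 (at far).
    t = (clamped - near) / (far - near)
    return round(255 * (1.0 - t))


def render_depth_map(depths, near, far):
    """Convert a 2D grid of distances into a 2D grid of grey values."""
    return [[depth_to_grey(d, near, far) for d in row] for row in depths]


# Hypothetical distances in metres from the camera to the visible surfaces.
depths = [
    [1.0, 2.0, 5.0],
    [2.0, 5.0, 9.0],
]
image = render_depth_map(depths, near=1.0, far=5.0)
```

The grey values look like a visual effect, but each one can be converted back into a distance, which is what makes such an image useful for orientation.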
Another set of images appears blurry, smeared in places, sharp in others. They are rendered with 3D software using one pass only, with an artificially intelligent denoising filter applied afterwards. A higher number of render passes usually results in a clearer, less noisy image. Generally it also leads to longer render times. Denoising is the process of reducing noise in 3D rendered images. Before AI technology was introduced to denoising, various mechanisms of blurring and colour matching pixels were used. In HOMESCHOOL a denoising filter has been applied to noisy renders that were barely readable to the human eye. The denoising filter used is based on NVIDIA’s OptiX AI-Accelerated Denoiser, which has been trained on tens of thousands of images, many of which depict domestic scenes similar to those proposed by the SceneNet RGB-D dataset. As such, the denoiser approximates the information of the noisy rendered images with the information learned from its training dataset. The resulting image is the noisy render as seen through the 'eyes' of the denoising filter: an amalgam of training datasets optimizing computer vision at the cost of reducing information.
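To make the older, non-AI approach concrete, here is a minimal sketch of denoising by blurring: each pixel is replaced by the average of itself and its direct neighbours. This is a plain box blur chosen for illustration, not the OptiX denoiser and not the filter used in the film; the function name and the sample values are hypothetical.

```python
def box_blur(image):
    """Average every pixel with its neighbours in a 3x3 window.

    Pixels at the border use only the neighbours that exist.
    """
    height = len(image)
    width = len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(sum(neighbours) / len(neighbours))
        result.append(row)
    return result


# A dark image with a single bright noise speck in the centre.
noisy = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
smoothed = box_blur(noisy)
```

The speck is flattened, but its brightness is smeared over the surrounding pixels: the blur cannot tell noise from detail, so it reduces both. A trained denoiser instead fills in what it expects to see, based on the images it has learned from.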
Training datasets render a world in which only what can be named and has been captured exists. Everything unknown is non-existent. Language in this case is not a tool for searching or describing but rather a tool for excluding everything that does not have a name. HOMESCHOOL makes visible the training data sealed within the resulting technology, posing questions on categorisation, cultural bias and the assumptions designed into these digital domestic environments. Are domestic spaces defined by the objects they hold or by the people, rituals and behaviours that live in them?
• Regarding the Pain of Spotmini
Machine Landscapes: Architectures of the Post Anthropocene
ed. Liam Young. AD Wiley 2018
• Research, Script & Animation: Simone C Niquille
• Music: Jeff Witscher
(after Pink Floyd 'Come in Number 51, Your Time Is Up', Zabriskie Point Soundtrack)
• Voice Over: Kiara K
• Interior & Furniture Assets: SceneNet-RGBD, Dyson Robotics Lab at Imperial College London
→ Featuring paraphrased excerpts of Ludwig Wittgenstein's 1922 writing on language and the limits of thought in ‘Tractatus Logico-Philosophicus’, reinterpreted for the age of computer vision systems.