REFRAMING AUTHORSHIP
The Evolving Role of Architects in the Age of Generative AI
In collaboration with Lee-Su Huang
In recent months, the architectural field has witnessed a significant shift towards the mass production of images. This trend might suggest that architects have assumed a more passive role, acting primarily as curators rather than creators of these images1. These images are not original creations but iterations of potential projects that fuel inspiration for future works (Figures 01-02). The authorship of these images is a collective effort involving not only the architects but also the researchers who develop the algorithms, the individuals who compile the training data, and the designers who execute the image generation2. Each of these roles is crucial and contributes significantly to the final product. This collaborative nature underlines the importance of not attributing all design tasks to a single algorithm. It is crucial to ensure that both the data used for training and the algorithm’s architecture are sufficiently robust. Architecture transcends the mere formal aspects that an image can capture; it is the culmination of efforts from all parties involved in the creation process.
DIFFUSION MODELS
To use a diffusion model (DM) effectively in this context, it is essential to understand what a DM is. A DM is a generative machine learning model that progressively refines a sample drawn from a noise distribution to produce complex data, such as images or text. The process begins with a noisy input that is incrementally cleaned up over a series of steps to yield a clear, coherent output. This progression is guided by a neural network trained to predict the reverse diffusion steps from noise back to data. Understanding this process empowers architects and researchers to make informed decisions about applying DMs to architectural image creation.
It is equally crucial to understand how the input provided influences the algorithm’s output. To examine this input-output relationship, we can study the qualitative effects of the two primary inputs, the prompt and the control image3, according to their Level of Detail (LOD). Figures 3 and 4 arrange the results in a grid, with Prompt LOD on the x-axis and Image LOD on the y-axis, which allows for an easy comparison across both dimensions. By modifying the number of steps while sequentially denoising the input, we can test the effect of the text input’s LOD. To keep these comparisons consistent, we also fix the random generator seed of the initial noise input in the latent image.
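The comparison grid described above can be sketched as a small run plan. This is a minimal illustration, not the authors' actual setup: the prompts, sketch file names, seed value, and step count are all hypothetical placeholders, and no diffusion model is called. It only shows how each cell pairs one prompt LOD with one image LOD while sharing the same seed.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical prompts, ordered from low to high level of detail (LOD).
PROMPTS_BY_LOD = [
    "a building",
    "a mid-rise residential building, street view",
    "a mid-rise residential building with brick facade, recessed balconies, "
    "late-afternoon light, street view",
]

# Hypothetical control sketches, ordered from low to high LOD.
SKETCHES_BY_LOD = [
    "massing_outline.png",
    "massing_with_floors.png",
    "facade_with_fenestration.png",
]

SEED = 1234  # assumed value; fixed so every cell starts from the same initial noise
STEPS = 30   # assumed number of denoising steps; can be varied between grids


@dataclass(frozen=True)
class Run:
    col: int            # x-axis: prompt LOD index
    row: int            # y-axis: image LOD index
    prompt: str
    control_image: str
    seed: int
    steps: int


def build_grid(prompts, sketches, seed=SEED, steps=STEPS):
    """Return one Run per (image LOD, prompt LOD) cell, row by row."""
    return [
        Run(col, row, prompt, sketch, seed, steps)
        for (row, sketch), (col, prompt) in product(
            enumerate(sketches), enumerate(prompts)
        )
    ]


grid = build_grid(PROMPTS_BY_LOD, SKETCHES_BY_LOD)
for run in grid:
    print(f"row={run.row} col={run.col} seed={run.seed} image={run.control_image}")
```

Because every run shares the seed, differences between cells can be attributed to the prompt and control image rather than to the initial noise.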
IMAGE AND TEXT PROMPTS: IDENTIFYING LEVELS OF DETAIL
As each axis increases towards a higher LOD, the DM’s ability to infer context and detail is also enhanced. For example, higher prompt specificity brings out contextual shading, while a high-detail input sketch yields more accurate renderings of intricate features like fenestration. Given the DM’s rapid output capabilities, regulating the number of generation cycles for each image in this initial experiment was necessary. Further exploring image inputs with varying line weights—from light to heavy—using the same text prompt reveals a clear relationship between stroke weight and LOD. The algorithm interprets heavier line weights as volume indicators, introducing shadows and giving the image a sense of depth and perspective (Figure 5). This exploration underscores the importance of the detail level and line weight used in control images, highlighting their influence on the perceived LOD in the final output.
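One way to prepare such light-to-heavy variants of the same control sketch is to thicken its strokes programmatically. The sketch below is only an assumption about how this could be done, using a simple square dilation on a binary drawing. The function names and the toy 9x9 drawing are hypothetical, and the essay does not state how its line weights were produced.

```python
def thicken(drawing, weight):
    """Dilate a binary line drawing with a square brush of radius `weight`.

    drawing: list of rows, 1 = ink, 0 = paper. weight=0 returns a copy.
    """
    height, width = len(drawing), len(drawing[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not drawing[y][x]:
                continue
            # Stamp the brush around every inked pixel, clipped to the sheet.
            for dy in range(-weight, weight + 1):
                for dx in range(-weight, weight + 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        out[ny][nx] = 1
    return out


def ink_pixels(drawing):
    """Count inked pixels as a crude measure of line weight."""
    return sum(map(sum, drawing))


# Toy sheet: a single one-pixel vertical stroke in the middle column.
sketch = [[1 if x == 4 else 0 for x in range(9)] for _ in range(9)]

variants = {
    name: thicken(sketch, weight)
    for name, weight in [("light", 0), ("medium", 1), ("heavy", 2)]
}
for name, drawing in variants.items():
    print(name, ink_pixels(drawing))
```

Each variant would then be paired with the same text prompt so that stroke weight is the only input that changes.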
APPLICATION
The prior investigation helps architects understand the underlying principles and the relationship between text prompt LOD, image LOD, the number of diffusion steps, and denoising strength. The next step is to understand how one might use DMs generatively as a natural part of the design process. To do so, we return to the beginnings of architecture education in the Ecole des Beaux-Arts and, more specifically, to the ‘analytique’ (Figure 6). Popular during the late 19th and early 20th centuries, these were cross-scalar composite drawings that combined different forms of representation of a project: plans, sections, details, ornament, and perspectives4. As an analog drawing, this multi-modal representation has a certain efficiency that allows for the simultaneous interrogation of part-to-whole relationships, proportion, composition, tectonics, geometry, texture, and shadow. Just as others have reconsidered the analytique more recently, we can reconceptualize what its equivalent might be in the context of Generative AI and DMs5.
As an early-stage schematic design exercise, these “AI analytique” image grids bring together conceptual parti sketches, 3D massing, site axonometric views, elevations, and perspectives through a combination of line drawings and DM reinterpretations (Figures 7-8). This is achieved by leveraging algorithms and AI models, notably depth maps and edge detection paired with ControlNet models, to constrain the DM to the given image inputs and to control how closely it follows the design intent6-7. The result is a range of cross-scalar images, spanning from conceptual and diagrammatic views to detailed street perspectives, that helps in understanding the potential of a massing scheme from only very simple massing model inputs. Producing multiple iterations at this level of visual detail and clarity would not be feasible time-wise without the recent advances in DMs, which makes the approach a particularly potent early-phase schematic design tool for students and practitioners alike. More importantly, it echoes the major goals of the Beaux-Arts analytique as a combinatorial document that works across multiple scales, design options, and viewpoints, and that captures the fluid nature of design exploration and conceptualization not only as a singular representation but also as a multitude of possibilities to be evaluated and curated.
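To make the edge-detection step concrete, the following sketch turns a simple massing view into a binary edge map of the kind that could serve as a conditioning image. It is an illustration under stated assumptions: a thresholded Sobel gradient stands in for the Canny detector cited above (Canny adds smoothing, non-maximum suppression, and hysteresis), the synthetic 10x10 "massing" image and the threshold are hypothetical, and no ControlNet model is invoked.

```python
def sobel_edges(image, threshold=1.0):
    """Return a binary edge map from a grayscale image (list of rows of floats).

    Applies 3x3 Sobel kernels and a single threshold on gradient magnitude.
    Border pixels are left at 0.
    """
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient: right column minus left column.
            gx = (
                image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1]
            )
            # Vertical gradient: bottom row minus top row.
            gy = (
                image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1]
            )
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 1
    return edges


# Synthetic "massing" view: a bright 4x4 block on a dark 10x10 ground.
SIZE = 10
massing = [
    [1.0 if 3 <= x <= 6 and 3 <= y <= 6 else 0.0 for x in range(SIZE)]
    for y in range(SIZE)
]

control = sobel_edges(massing)
for row in control:
    print("".join("#" if value else "." for value in row))
```

The outline of the block survives while its flat interior and the empty ground drop out, which is the property that lets an edge map constrain the DM to the massing without dictating surface detail.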
CONCLUSIONS
Integrating DMs into architecture has prompted a reevaluation of the architect’s role, from creator to curator. While these technological advances offer unprecedented capabilities in image generation and design iteration, they challenge traditional notions of authorship and creative control8. Architects, algorithm developers, and data curators now share the responsibility for creating these powerful tools. This collaborative environment somewhat democratizes design and foregrounds the ethical considerations of using such technologies, particularly regarding data transparency and algorithmic accountability9. As we move forward, it is crucial for the architectural community to engage actively with these tools, ensuring that they enhance the design process in an informed and equitable manner.
ACKNOWLEDGEMENTS
The authors would like to thank students from Institution Name Redacted who enrolled in the Course Name Redacted course.
Karla Saldaña Ochoa is a Tenure-track Assistant Professor in the School of Architecture at the University of Florida. She leads the SHARE Lab, a research group focused on developing human-centered AI projects on design practices. Her teaching and research investigate the interplay of Artificial and Human Intelligence to empower creativity and solve problems for the social good. www.ai-share-lab.com/
Lee-Su Huang received his Bachelor of Architecture from Feng-Chia University in Taiwan and his Master in Architecture degree from Harvard University’s Graduate School of Design. He has practiced in Taiwan, in the United States with Preston Scott Cohen Inc., and with LASSA Architects in Seoul. As co-founder and principal of SHO, his research and practice centers on digital design+fabrication methodology, parametric design optimization strategies, as well as kinetic/interactive architectural prototypes. Lee-Su is currently an Instructional Associate Professor at the University of Florida’s School of Architecture, teaching design studios, digital media, and parametric modeling courses. www.sh-o.us/
ENDNOTES
1. Ar Mohesh Radhakrishnan, “Is midjourney-AI the new anti-hero of architectural imagery & creativity?,” Global Scientific Journals Volume 11.1 (2023): 94-104.
2. Sandra Manninger and Matias del Campo, "Deep Mining Authorship," In Phygital Intelligence. CDRF 2023. Computational Design and Robotic Fabrication. Yan, C., Chai, H., Sun, T., Yuan, P.F. (eds) Springer, Singapore, 2024. https://doi.org/10.1007/978-981-99-8405-3_1
3. Lvmin Zhang, Anyi Rao, and Maneesh Agrawala. "Adding conditional control to text-to-image diffusion models." In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3836-3847. 2023. https://arxiv.org/abs/2302.05543
4. John F. Harbeson, The Study of Architectural Design : With Special Reference to the Program of the Beaux-Arts Institute of Design. New York: Pencil Points Press, 1926.
5. Katie Kingery-Page, “The Post-Modern Analytique.” in Proceedings: CELA 2008-2009 Teaching + Learning Landscape, Council of Educators in Landscape Architecture, 2009. p. 155-165.
6. John Canny, "A Computational Approach To Edge Detection," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-8. 679 - 698. Nov. 1986, doi:10.1109/TPAMI.1986.4767851.
7. Zhang, Rao and Agrawala, "Adding conditional control", 3813-3824.
8. Matias del Campo, Alexandra Carlson, and Sandra Manninger, "Towards Hallucinating Machines - Designing with Computational Vision," In International Journal of Architectural Computing. 2021;19(1):88-103. doi:10.1177/1478077120963366
9. Tatiana Lau, Scott Carter, Francine Chen, Brandon Huynh, Everlyne Kimani, Matthew L Lee, and Kate A Sieck, "Democratizing Design through Generative AI," In Companion Publication of the 2024 ACM Designing Interactive Systems Conference (DIS '24 Companion). Association for Computing Machinery, New York, NY, USA, 2024. 239–244. https://doi.org/10.1145/3656156.3663703